Collaborative Projects
Category-based explanation: What makes a good explanation?
Michael Weisberg,
UPenn Department of Philosophy, with Tania Lombrozo of Harvard's Psychology Department.
Chinese Treebank
The Chinese Treebank project, initiated in 1998, aims to build a large-scale Chinese
corpus annotated with syntactic structures. So far, 500,000 words of Chinese
text have been fully annotated. The data in this corpus are drawn mostly from Xinhua
newswire and Sinorama (a Taiwan news magazine), with a small portion from
Hong Kong News. Most of the data has English translations, and a parallel
Chinese/English treebank is under development. This project is funded in part by
the Department of Defense and in part by DARPA/TIDES.
Chinese PropBank
The Chinese PropBank project started in 2002; its goal is to add predicate-argument
structure to the Chinese Treebank. The project is ongoing, and the first installment
of 250,000 words is near completion. The project is funded by the US Department of Defense.
Human Brain Evolution Library
Directed by P. Thomas Schoeneman, Department of Anthropology. A collaboration among Penn's
Departments of Anthropology and Radiology, and Columbia University.
Korean NLP research at UPenn
Korean NLP research at UPenn consists of several projects that focus on building
Korean natural language resources and developing applications. The Korean Treebank
is a collection of syntactically annotated Korean text, and the Korean PropBank is
a database of verb-argument relations found in the Korean Treebank. The Korean XTAG
project aims to develop a grammar and related tools for Korean in the XTAG
formalism. The NLP applications under development are Korean/English machine
translation, Korean morphological analysis with a tagging tool, and a Korean
syntactic parser. These projects have been funded in the past by the Army Research
Lab and DARPA/TIDES and are currently funded by the Army Research Office.
Penn Discourse TreeBank Project
The goal of the Penn Discourse TreeBank project is to annotate the argument structure
and semantics of low-level discourse relations in texts of the Wall Street Journal
Corpus. The discourse relations are anchored by discourse connectives. The discourse
annotations will be linked with the sentence-level syntactic and semantic annotation
from the Penn TreeBank and PropBank.
Please note: If you want to share your Penn project with the Cognitive Science Community at large, please contact Laurel Sweeney at laurels@ircs.upenn.edu.