Bonnie Webber

University of Edinburgh/University of Pennsylvania
Friday, May 4, 2007, 12-2 p.m.

Discourse Structure and its Complexity

The complexity of dependencies in syntactic structure has been studied for many years and in many languages as the basis for assessing the formal complexity of human language. Less studied is the complexity of dependencies in discourse structure -- that is, the complexity of dependencies that span more than a single sentence or clause. Do they have something to contribute to our understanding of the formal complexity of language?

The question of what phenomena should be considered as evidence of discourse structure and its complexity is crucial to this enterprise. In this talk, I will first review several opposing views of discourse structure and the phenomena that each subsumes. Each view is associated with a textual corpus that has been manually annotated for the phenomena it considers relevant, which can provide an empirical basis for its claims. I will then draw parallels with syntax in order to distinguish which of these phenomena should, and which should not, be considered evidence for discourse structure. Since the Penn Discourse TreeBank (PDTB) is now the largest corpus in the world manually annotated with discourse relations, it promises to be a good source of evidence for claims about discourse structure and its complexity. Alan Lee will illustrate the range of dependency patterns found in the PDTB between pairs of discourse relations, and we will relate them to my earlier distinctions.

Mark Liberman, in his 6 November 2003 Language Log post commenting on the emergence of two comparable corpora annotated for discourse relations in different ways, noted that this made it a great time to be working on discourse. We hope that the release of the PDTB will make it even better.