Information in the Wild
Friday, September 14, 2001, 12-2 p.m.

Web propagandists have claimed that the current Web, or XML, or the so-called "Semantic Web" will finally create an unambiguous means of communication. I will argue that they are as misguided as previous believers in the perfectibility of human communication. Web documents not only exhibit many of the same kinds of ambiguity that we know from natural languages, but also create new ones that will keep computational linguists busy for the foreseeable future. I will illustrate the talk with examples drawn from work on Web information extraction at WhizBang! Labs.