Regina Barzilay

Computer Science & Artificial Intelligence Lab

Massachusetts Institute of Technology

 

Friday, April 24

12:00-2:00pm

 

 

Embracing Language Diversity: Unsupervised Multilingual Learning
 
For centuries, the deep connection between human languages has
fascinated scholars and driven many important discoveries in
anthropology and historical linguistics. In this talk, I will show
that this connection can empower unsupervised methods for language
analysis. The key insight is that joint learning from several languages
reduces uncertainty about the linguistic structure of each individual
language.
 
I will present multilingual generative unsupervised models for
morphological segmentation, part-of-speech tagging, and parsing. In each
of these tasks, we model the multilingual data as arising through a
combination of language-independent and language-specific probabilistic
processes. This feature allows the model to identify and learn from
recurring cross-lingual patterns to improve prediction accuracy in each
language.
 
This is joint work with Benjamin Snyder, Tahira Naseem, and Jacob
Eisenstein.