UTCS Colloquium/AI-Dan Klein/University of California at Berkeley: "Phylogenetic Models for Natural Language," ACES 2.402, Friday, March 12, 2010, 11:00 a.m.
A sign-up schedule for this event is available at
http://www.cs.utexas.edu/department/webevent/utcs/events/cgi/list_events.cgi
Type of Talk: UTCS Colloquium/AI
Speaker/Affiliation: Dan Klein/University of California at Berkeley
Date/Time: Friday, March 12, 2010, 11:00 a.m.
Location: ACES 2.402
Host: Ray Mooney
Talk Title: Phylogenetic Models for Natural Language
Talk Abstract:
Languages descend in a roughly tree-structured evolutionary process. In historical linguistics, this process is manually analyzed by comparing and contrasting modern languages. Many questions arise: What does the tree of languages look like? What are the ancestral forms of modern words? What functional pressures shape language change? In this talk, I'll describe our work on bringing large-scale computational methods to bear on these problems.
In the task of proto-word reconstruction, we infer ancestral words from their modern forms. I'll present a statistical model in which each word's history is traced down a phylogeny. Along each branch, words mutate according to regular, learned sound changes. Experiments in the Romance and Oceanic families show that accurate automated reconstruction is possible; using more languages leads to better results.
Standard reconstruction models assume that one already knows which words are cognate, i.e., are descended from the same ancestral word. However, cognate detection is its own challenge. I'll describe models which can automatically detect cognates (in similar languages) and translations (in divergent languages). Typical translation-learning approaches require virtual Rosetta stones -- collections of bilingual texts. In contrast, I'll discuss models which operate on monolingual texts alone.
Finally, I'll present work on multilingual grammar induction, where many languages' grammars are simultaneously induced. By assuming that grammar parameters vary slowly, again along a phylogenetic tree, we can obtain substantial increases in grammar quality across the board.
Speaker Bio:
Dan Klein is an associate professor of computer science at the University of California, Berkeley (PhD Stanford, MSt Oxford, BA Cornell). His research focuses on statistical natural language processing, including unsupervised learning methods, syntactic analysis, information extraction, and machine translation. Academic honors include a Marshall Fellowship, a Microsoft New Faculty Fellowship, a Sloan Fellowship, an NSF CAREER award, the ACM Grace Murray Hopper award, and best paper awards at the ACL, NAACL, and EMNLP conferences.