AI Forum: Dr. Mark Johnson, Bayesian Inference of Grammars, ACES 2.402

Contact Name: Jenna Whitney
Feb 15, 2007 2:00pm - 3:00pm

There is a signup schedule for this event.

Speaker Name/Affiliation: Dr. Mark Johnson/Brown University

Date/Time: February 15, 2007, 2:00 p.m.-3:00 p.m.

Location: ACES 2.402

Host: Dr. Ray Mooney

Talk Title: Bayesian Inference of Grammars


Talk Abstract:
Even though Maximum Likelihood Estimation (MLE) of Probabilistic Context-Free Grammars (PCFGs) is well understood (the Inside-Outside algorithm can do this efficiently from the terminal strings alone), the inferred grammars are usually linguistically inaccurate. In order to better understand why maximum likelihood finds poor grammars, this talk examines two simple natural language induction problems: morphological segmentation and word segmentation. We identify several problems with the MLE PCFG models of these problems and propose Hierarchical Dirichlet Process (HDP) models to overcome them. In order to test these HDP models, we develop MCMC algorithms for Bayesian inference of these models from strings alone. Finally, we discuss to what extent the lessons learnt from these examples can be put into a unified framework and applied to the general problem of grammar induction. Joint work with Sharon Goldwater and Tom Griffiths.


Speaker Bio:
Mark Johnson is a Professor of Cognitive and Linguistic Science and Computer Science at Brown University and a Visiting Researcher in the Natural Language group at Microsoft Research for 2006-2007. He was awarded a BSc (Hons) in 1979 from the University of Sydney, an MA in 1984 from the University of California, San Diego, and a PhD in 1987 from Stanford University. He held a postdoctoral fellowship at MIT from 1987 until 1988 and has been a visiting researcher at the University of Stuttgart, the Xerox Research Centre in Grenoble, and CSAIL at MIT. He has worked on a wide range of topics in computational linguistics, but his main research area is parsing and its applications to text and speech processing. He was President of the Association for Computational Linguistics in 2003.