UTCS AI Colloquia - David Chiang, Information Sciences Institute at USC, "Learning Syntax and Semantics for Natural Language Translation"

Contact Name: Karl Pichotta
Location: GDC 6.302
Date: Oct 4, 2013 11:00am - 12:00pm

Signup Schedule: http://apps.cs.utexas.edu/talkschedules/cgi/list_events.cgi

Talk Audience: UTCS Faculty, Grads, Undergrads, Other Interested Parties

Host: Ray Mooney

Talk Abstract: Automatic translation of human languages is one of the oldest problems in computer science. Two general approaches have been taken: one that relies heavily on knowledge of linguistic structure and meaning, and another that relies on statistics from large amounts of data. For years, these two approaches seemed at odds with each other, but recent developments have made great progress towards building translation systems according to the maxim, "Linguistics tells us what to count, and statistics tells us how to count it" (Joshi). I will give an overview of three such developments from ISI. The first is the introduction of formal grammars (namely, synchronous context-free grammars) to model the syntax of human languages, first successfully demonstrated by my system Hiero. The second is ongoing work at ISI to incorporate knowledge of formal semantics. I will describe the formalism we are currently working with (synchronous hyperedge replacement grammars) and the efficient algorithms we have developed for processing semantic structures. Finally, I will discuss initial results on learning word meanings using neural networks, and prospects for learning them across languages.

Speaker Bio: David Chiang is a Research Assistant Professor in the USC Department of Computer Science and a Project Leader at the USC Information Sciences Institute. He earned his PhD from the University of Pennsylvania in 2004 under Aravind Joshi. His research focuses on computational models for learning human languages, particularly how to translate from one language to another. His work on applying formal grammars and machine learning to translation has been recognized with two best paper awards (at ACL 2005 and NAACL HLT 2009) and has transformed the field of machine translation. He has received research grants from DARPA, NSF, and Google, has served on the executive board of NAACL and the editorial board of Computational Linguistics, and is currently on the editorial board of Transactions of the ACL.