Forum for AI: Jason Baldridge/Department of Linguistics UT-Austin Data-Driven Discourse Parsing

Jenna Whitney
Mar 31, 2006 3:00pm - 4:00pm

Speaker Name/Affiliation: Jason Baldridge/Department of Linguistics UT-Austin

Talk Title: Data-Driven Discourse Parsing


Date/Time: March 31, 2006 at 3:00 p.m.

Location: ACES


Host: Ray Mooney

Talk Abstract:
Computing the structure of discourse is both representationally
and computationally challenging. It is largely agreed that
discourses consist of segments that are related to one
another through rhetorical relations and the goals and intentions
of the speaker(s). While some theories postulate a context-free
tree representation of discourse structure, there are strong
arguments that quite general acyclic graphs are representationally
necessary for adequately capturing the rhetorical connections
of discourse segments within a text or dialog. This leads
to an explosion of alternative potential analyses that is
difficult to rein in even with very sophisticated machine
learning models. Another challenge is that there are many
sources of information (e.g., sentence moods, discourse
phrases, goals and intentions, and domain-specific information)
that go into the determination of segmentation and rhetorical
relations. This information can be difficult to utilize
effectively, especially in the face of data sparsity.

In this talk, I will discuss data and a statistical parser
for analyzing appointment scheduling dialogs. The parser,
which is based on the sentence parsing models of Collins,
builds discourse structures of Segmented Discourse Representation
Theory. I will highlight some of the adequacies and inadequacies
of this approach for this task and then present a new approach
based on recent developments in sentence-level dependency
parsing. Though this approach brings with it new representational
challenges, it promises to greatly improve both the process
of annotation and accuracy in the automatic recovery of
discourse structures.

Speaker Bio:
Jason Baldridge is an assistant professor in the Department
of Linguistics at the University of Texas at Austin. He
completed his dissertation on categorial grammars at the
University of Edinburgh in 2002, advised by Mark Steedman.
From 2002 to 2005, he held a post-doctoral position at Edinburgh,
working with Alex Lascarides and Miles Osborne. His current
work includes research on probabilistic parsing for Portuguese,
discriminative parse ranking models, probabilistic discourse
parsing for Segmented Discourse Representation Theory, active
learning, and formal syntax using categorial grammars and
other constraint-based formalisms. With Nicholas Asher,
he recently began an NSF-funded project to investigate the
integration of discourse structure and coreference resolution
using machine learning. He has been active for many years
in the creation and promotion of open source software for
natural language processing.