Forum for AI: Jason Baldridge/Department of Linguistics UT-Austin Data-Driven Discourse Parsing in ACES 2.302
Speaker Name/Affiliation: Jason Baldridge/Department of Linguistics, UT-Austin
Talk Title: Data-Driven Discourse Parsing
Date/Time: March 31, 2006 at 3:00 p.m.
Location: ACES 2.302
Host: Ray Mooney
Talk Abstract:
Computing the structure of discourse is both representationally
and computationally challenging. It is largely agreed that
discourses consist of segments that are related to one another
through rhetorical relations and the goals and intentions of the
speaker(s). While some theories postulate a context-free tree
representation of discourse structure, there are strong arguments
that quite general acyclic graphs are representationally necessary
for adequately capturing the rhetorical connections of discourse
segments within a text or dialog. This leads to an explosion of
alternative potential analyses that is difficult to rein in even
with very sophisticated machine learning models. Another challenge
is that there are many sources of information (e.g., sentence
moods, discourse cue phrases, goals and intentions, and
domain-specific information) that go into the determination of
segmentation and rhetorical relationships. This information can be
difficult to utilize effectively, especially in the face of data
sparsity.
In this talk, I will discuss data and a statistical parser for
analyzing appointment scheduling dialogs. The parser, which is
based on the sentence parsing models of Collins, builds discourse
structures of Segmented Discourse Representation Theory. I will
highlight some of the adequacies and inadequacies of this approach
for this task and then present a new approach based on recent
developments in sentence-level dependency parsing. Though this
approach brings with it new representational challenges, it
promises to greatly improve both the process of annotation and
accuracy in the automatic recovery of discourse structures.
Speaker Bio:
Jason Baldridge is an assistant professor in the Department of
Linguistics at the University of Texas at Austin. He completed his
dissertation on categorial grammars at the University of Edinburgh
in 2002, advised by Mark Steedman. From 2002 to 2005, he held a
post-doctoral position at Edinburgh, working with Alex Lascarides
and Miles Osborne. His current work includes research on
probabilistic parsing for Portuguese, discriminative parse ranking
models, probabilistic discourse parsing for Segmented Discourse
Representation Theory, active learning, and formal syntax using
categorial grammars and other constraint-based formalisms. With
Nicholas Asher, he recently began an NSF-funded project to
investigate the integration of discourse structure and coreference
resolution using machine learning. He has been active for many
years in the creation and promotion of open source software for
natural language processing.