Forum for AI: Jason Baldridge/Department of Linguistics UT-Austin Data-Driven Discourse Parsing in ACES 2.302

Contact Name: Jenna Whitney
Date: Mar 31, 2006 3:00pm - 4:00pm

Speaker Name/Affiliation: Jason Baldridge/Department of Linguistics UT-Austin

Talk Title: Data-Driven Discourse Parsing

Date/Time: March 31, 2006 at 3:00 p.m.

Location: ACES 2.302

Host: Ray Mooney

Talk Abstract:
Computing the structure of discourse is both representationally
and computationally challenging. It is largely agreed that
discourses consist of segments that are related to one
another through rhetorical relations and the goals and intentions
of the speaker(s). While some theories postulate a context-free
tree representation of discourse structure, there are strong
arguments that quite general acyclic graphs are representationally
necessary for adequately capturing the rhetorical connections
of discourse segments within a text or dialog. This leads
to an explosion of alternative potential analyses that is
difficult to rein in, even with very sophisticated machine
learning models. Another challenge is that there are many
sources of information (e.g., sentence moods, discourse cue
phrases, goals and intentions, and domain-specific information)
that go into the determination of segmentation and rhetorical
relationships. This information can be difficult to utilize
effectively, especially in the face of data sparsity.

In this talk, I will discuss data and a statistical parser
for analyzing appointment scheduling dialogs. The parser,
which is based on the sentence parsing models of Collins,
builds discourse structures of Segmented Discourse Representation
Theory. I will highlight some of the adequacies and inadequacies
of this approach for this task and then present a new approach
based on recent developments in sentence-level dependency
parsing. Though this approach brings with it new representational
challenges, it promises to greatly improve both the process
of annotation and accuracy in the automatic recovery of
discourse structures.

Speaker Bio:
Jason Baldridge is an assistant professor in the Department
of Linguistics at the University of Texas at Austin. He
completed his dissertation on categorial grammars at the
University of Edinburgh in 2002, advised by Mark Steedman.
From 2002 to 2005, he held a post-doctoral position at Edinburgh,
working with Alex Lascarides and Miles Osborne. His current
work includes research on probabilistic parsing for Portuguese,
discriminative parse ranking models, probabilistic discourse
parsing for Segmented Discourse Representation Theory, active
learning, and formal syntax using categorial grammars and
other constraint-based formalisms. With Nicholas Asher,
he recently began an NSF-funded project to investigate the
integration of discourse structure and coreference resolution
using machine learning. He has been active for many years
in the creation and promotion of open source software for
natural language processing.