UTCS Colloquium/AI - L. Venkata Subramaniam/IBM Research India: "Real World Text Analytics"

Jenna Whitney
Oct 21, 2010 1:00pm - 2:00pm

Type of Talk: UTCS Colloquium/AI

Speaker/Affiliation: L. Venkata Subramaniam/IBM Research India

Date/Time: Thursday, October 21, 2010, 1:00 p.m.

Location: ACES 2.402

Host: Raymond Mooney

Talk Title: Real World Text Analytics

Talk Abstract:
In the real world, noise is ubiquitous in text documents. Text produced by processing signals intended for human use is often noisy for automated computer processing. Techniques like Automatic Speech Recognition, Optical Character Recognition and Machine Translation introduce processing noise. Similarly, digital text produced in informal settings such as online chat, SMS, emails, message boards, newsgroups, blogs, wikis and web pages contains considerable noise.

In this talk we will present our work in dealing with real world noisy text and extracting useful information from it.

Speaker Bio:
L. Venkata Subramaniam manages the information processing and analytics group at IBM Research India. He received his PhD from IIT Delhi in 1999. His research focuses on unstructured information management, statistical natural language processing, noisy text analytics, text and data mining, information theory, and speech and image processing. His work on Data Cleansing and Entity Resolution has been deployed in the field in scenarios involving cleansing of millions of data records.

He co-founded the AND (Analytics for Noisy Unstructured Text Data) workshop series and co-chaired the first four workshops, 2007-2010. He was guest co-editor of two special issues on Noisy Text Analytics in the International Journal of Document Analysis and Recognition in 2007 and 2009.