UTCS Colloquium/AI- Andrew McCallum/University of Massachusetts Amherst: "Probabilistic Programming and Probabilistic Databases with Imperatively-defined Factor Graphs", ACES 2.402

Contact Name: Jenna Whitney
Oct 22, 2010 11:00am - 12:00pm

There is a sign-up schedule for this event that can be found at



Type of Talk: UTCS Colloquium/AI

Speaker/Affiliation: Andrew McCallum/University of Massachusetts Amherst

Date/Time: Friday, October 22, 2010, 11:00 a.m.

Location: ACES 2.402

Host: Dana Ballard


Talk Title: Probabilistic Programming and Probabilistic Databases with Imperatively-defined Factor Graphs

Talk Abstract:

Researchers in natural language processing, information integration, computer vision and other areas have achieved great empirical success using graphical models with repeated, relational structure. But as researchers explore increasingly complex structures, there has been growing interest in new programming languages or toolkits that make it easier to implement such models in a flexible, yet scalable way. A key issue in these toolkits is how to define the templates of these repeated structures and tied parameters. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an imperative language to express various aspects of model structure, inference, and learning. By combining the traditional, declarative statistical semantics of factor graphs with imperative definitions of their construction and operation, we allow the user to mix declarative and procedural domain knowledge, and also gain significant efficiencies. We have implemented such imperatively defined factor graphs in a system we call FACTORIE, a software library for an object-oriented, strongly-typed, functional language called Scala. I will introduce FACTORIE, give several examples of its use, explain how it supports a new style of probabilistic databases, and describe its application to schema alignment and lightly-supervised extraction of FreeBase-defined relations from several years' worth of NY Times articles.

Speaker Bio:
Andrew McCallum is a Professor and Director of the Information Extraction and Synthesis Laboratory in the Computer Science Department at the University of Massachusetts Amherst. He has published over 150 papers in many areas of AI, including natural language processing, machine learning, data mining and reinforcement learning, and his work has received over 15,000 citations. He obtained his PhD from the University of Rochester in 1995 with Dana Ballard and held a postdoctoral fellowship at CMU with Tom Mitchell and Sebastian Thrun. In the early 2000s he was Vice President of Research and Development at WhizBang Labs, a 170-person start-up company that used machine learning for information extraction from the Web. He is a AAAI Fellow, the recipient of the UMass NSM Distinguished Research Award, the UMass Lilly Teaching Fellowship, and the IBM Faculty Partnership Award. He was the Program Co-chair for the International Conference on Machine Learning (ICML) 2008, and a member of the board of the International Machine Learning Society and the editorial board of the Journal of Machine Learning Research. For the past ten years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, co-reference, document classification, finite-state models, semi-supervised learning, and social network analysis. Work on search and bibliometric analysis of open-access research literature can be found at rexa.info. McCallum's web page is www.cs.umass.edu/~mccallum