Ultra-Fine Entity Typing (ACL 2018)

Eunsol Choi, Omer Levy, Yejin Choi and Luke Zettlemoyer

Abstract

We introduce a new entity typing task: given a sentence with an entity mention, the goal is to predict a set of free-form phrases (e.g. skyscraper, songwriter, or criminal) that describe appropriate types for the target entity. Entity mentions are not limited to named entities; they also include pronouns and common noun expressions. Here are some examples from our dataset:

Example (with entity bolded) | Crowdsourced type labels
Young 's mother says , when she sees her son , she plans on hugging him for a good solid half hour. | person, son, relative, child, man, male
`` There is a wealth of good news in this report , and I 'm particularly encouraged by the progress we are making against AIDS , '' HHS Secretary Donna Shalala said in a statement . | group, organization, government, hospital, administration, socialist
In contrast to the male way of thinking , in which priority has always been given to considerations of political and economic power, Annette Lu has emphasized “soft national power.” | person, officeholder, president, official, leader, incumbent
For starters , it 's not an ordinary sun but a Cepheid variable - a giant , pulsating star shining with the light of at least a thousand suns . | object, celestial body, sun, star

Our crowdsourced evaluation sets are much more diverse and fine-grained than existing benchmarks: 429 types are needed to cover 80% of the data, whereas FIGER needs only its top 7 types and OntoNotes only 4.
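The coverage figure above can be reproduced in spirit with a short script. This is a minimal sketch, assuming coverage is measured over label occurrences (each type assigned to each example counts once); the function name and the toy data are hypothetical and are not taken from the released code.

```python
from collections import Counter


def types_needed_for_coverage(label_sets, target=0.8):
    """Count how many of the most frequent types cover `target` of all label occurrences."""
    # Flatten the per-example label sets into one frequency table.
    counts = Counter(label for labels in label_sets for label in labels)
    total = sum(counts.values())
    covered = 0
    # Walk types from most to least frequent until the target share is reached.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return rank
    return len(counts)


# Toy annotations (hypothetical, not real dataset statistics).
toy_annotations = [
    ["person", "man"],
    ["person", "leader"],
    ["person", "star"],
    ["object", "star"],
    ["person"],
]
needed = types_needed_for_coverage(toy_annotations, target=0.8)
```

A small return value means a handful of coarse types dominate the labels; a large one, as reported for our evaluation sets, indicates a long tail of fine-grained types.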

This formulation allows us to use a new type of distant supervision at large scale: head words, which indicate the type of the noun phrases they appear in. This supervision is contextualized, unlike prior supervision from entity linking, which lists all types of the entity regardless of the sentence. Here are examples of the distant supervision used in our task:

Example (with entity bolded) | Distant supervision labels | Source
Alexis Kaniaris, CEO of the organizing company Europartners, explained, speaking in a radio program in national radio station NET. | radio, station, radio station | Headword
Toyota recalled more than 8 million vehicles globally over sticky pedals that can become entrapped in floor mats. | manufacturer | Entity Linking to Wikipedia
Iced Earth’s musical style is influenced by many traditional heavy metal groups such as Black Sabbath. | person, artist, actor, author, musician | Entity Linking to Knowledge Base
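To make the head-word idea concrete, here is a minimal sketch of how a head word might be read off a tokenized noun phrase. This page does not describe the extraction procedure, so the heuristic below (take the last token before the first post-modifier) is an assumption for illustration only; a syntactic parser would be far more reliable. The names `POST_MODIFIER_STARTS` and `head_word` are hypothetical.

```python
# Tokens that usually open a post-modifier in English, so the head precedes them.
# This list is an illustrative assumption, not an exhaustive inventory.
POST_MODIFIER_STARTS = {
    "of", "in", "on", "at", "for", "with", "from", "by",
    "that", "which", "who", ",",
}


def head_word(noun_phrase_tokens):
    """Return a lowercased guess at the head word of a tokenized noun phrase."""
    head = None
    for token in noun_phrase_tokens:
        # Stop at the first post-modifier once at least one token has been seen.
        if head is not None and token.lower() in POST_MODIFIER_STARTS:
            break
        head = token
    return head.lower() if head is not None else None


# The head word then serves as a (noisy) type label for the whole mention.
label_a = head_word("the national radio station".split())
label_b = head_word("the CEO of the organizing company".split())
```

Because the label comes from the mention itself, it reflects how the entity is described in this particular sentence, which is what makes the supervision contextualized.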

We present a neural model that can predict ultra-fine types and is trained using a multitask objective that pools our new head-word supervision with prior supervision from entity linking. Experimental results demonstrate that our model is effective in predicting entity types at varying granularity; it achieves state-of-the-art performance on an existing OntoNotes fine-grained entity typing benchmark, and sets baselines for our newly introduced datasets.
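Predicting a set of types is a multi-label problem, and one common way to set it up is sketched below. This is not the model described in the paper: it assumes one independent logit per type, a fixed 0.5 decision threshold, and a binary cross-entropy loss with a mask marking which types a given supervision source can speak to. All names (`predict_types`, `masked_bce`) and the toy numbers are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_types(logits, type_vocab, threshold=0.5):
    """Return every type whose independent probability exceeds the threshold."""
    return {t for t, z in zip(type_vocab, logits) if sigmoid(z) > threshold}


def bce_with_logit(z, y):
    """Numerically stable binary cross-entropy for one logit z and gold label y."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def masked_bce(logits, gold, mask):
    """Mean binary cross-entropy over the positions where mask is 1.

    The mask is an assumed device for pooling supervision sources: a source
    that cannot label some types simply contributes no loss for them.
    """
    terms = [bce_with_logit(z, y) for z, y, m in zip(logits, gold, mask) if m]
    return sum(terms) / len(terms) if terms else 0.0


type_vocab = ["person", "location", "musician"]
predicted = predict_types([2.0, -1.0, 0.3], type_vocab)

# Only the first type is supervised for this (hypothetical) training example.
loss = masked_bce([0.0, 5.0], gold=[1, 0], mask=[1, 0])
```

Treating each type independently lets the model output any number of types per mention, from a single coarse label to several ultra-fine ones.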

Data and code

Data (760MB)

Pretrained model / outputs (212MB)

Wikipedia Entity -> Type mapping (414MB)

Code (GitHub repo; the evaluation script is in the repo as scorer.py)

Paper (PDF)

Supplementary Material

Slides

Sample model outputs

Bibtex

@InProceedings{Choi:2018:ACL,
  author    = {Choi, Eunsol and Levy, Omer and Choi, Yejin and Zettlemoyer, Luke},
  title     = {Ultra-Fine Entity Typing},
  booktitle = {Proceedings of the ACL},
  year      = {2018},
  publisher = {Association for Computational Linguistics}
}