Publications: Lexical Semantics
Lexical semantics concerns the representation and use of word meanings
in natural language processing. Our work in the area has focused on
learning word meanings for use in semantic parsing and, more recently,
improved distributional (vector space)
models of word meaning. Lexical semantics is part of our research on
natural language learning.
- Cross-Cutting Models of Lexical Semantics
Joseph Reisinger and Raymond Mooney
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), 1405-1415, July 2011.
Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superficial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM finds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.
ML ID: 262
- A Mixture Model with Sharing for Lexical Semantics
Joseph Reisinger and Raymond J. Mooney
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2010), 1173-1182, MIT, Massachusetts, USA, October 9-11, 2010.
We introduce tiered clustering, a mixture model capable of accounting for varying degrees of shared (context-independent) feature structure, and demonstrate its applicability to inferring distributed representations of word meaning. Common tasks in lexical semantics such as word relatedness or selectional preference can benefit from modeling such structure: Polysemous word usage is often governed by some common background metaphoric usage (e.g. the senses of line or run), and likewise modeling the selectional preference of verbs relies on identifying commonalities shared by their typical arguments. Tiered clustering can also be viewed as a form of soft feature selection, where features that do not contribute meaningfully to the clustering can be excluded. We demonstrate the applicability of tiered clustering, highlighting particular cases where modeling shared structure is beneficial and where it can be detrimental.
ML ID: 252
- Cross-cutting Models of Distributional Lexical Semantics
Joseph S. Reisinger
June 2010. Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin.
In order to respond to increasing demand for natural language interfaces, and to provide meaningful insight into user query intent, fast, scalable lexical semantic models with flexible representations are needed. Human concept organization is a rich epiphenomenon that has yet to be accounted for by a single coherent psychological framework: Concept generalization is captured by a mixture of prototype and exemplar models, and local taxonomic information is available through multiple overlapping organizational systems. Previous work in computational linguistics on extracting lexical semantic information from the Web does not provide adequate representational flexibility and hence fails to capture the full extent of human conceptual knowledge. In this proposal I will outline a family of probabilistic models capable of accounting for the rich organizational structure found in human language that can predict contextual variation, selectional preference and feature-saliency norms to a much higher degree of accuracy than previous approaches. These models account for the cross-cutting structure of concept organization (i.e., the notion that humans make use of different categorization systems for different kinds of generalization tasks) and can be applied to Web-scale corpora. Using these models, natural language systems will be able to infer more comprehensive semantic relations, in turn improving question answering, text classification, machine translation, and information retrieval.
ML ID: 249
- Multi-Prototype Vector-Space Models of Word Meaning
Joseph Reisinger, Raymond J. Mooney
In Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-2010), 109-117, 2010.
Current vector-space models of lexical semantics create a single “prototype” vector to represent the meaning of a word. However, due to lexical ambiguity, encoding word meaning with a single vector is problematic. This paper presents a method that uses clustering to produce multiple “sense-specific” vectors for each word. This approach provides a context-dependent vector representation of word meaning that naturally accommodates homonymy and polysemy. Experimental comparisons to human judgements of semantic similarity for both isolated words as well as words in sentential contexts demonstrate the superiority of this approach over both prototype and exemplar based vector-space models.
ML ID: 241
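The clustering idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the k-means loop, the cosine measure, the best-pair scoring rule, and the toy four-dimensional context vectors (hypothetical features: money, loan, river, water) are all assumptions chosen for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sense_prototypes(contexts, k, iters=20, seed=0):
    """Cluster one word's context vectors into k prototype vectors.

    A plain k-means loop with cosine similarity; the clustering
    algorithm is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    centroids = rng.sample(contexts, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for vec in contexts:
            best = max(range(k), key=lambda i: cosine(vec, centroids[i]))
            clusters[best].append(vec)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def max_sim(protos_a, protos_b):
    """Word similarity as the best match over all prototype pairs."""
    return max(cosine(p, q) for p in protos_a for q in protos_b)


# Hypothetical context counts over the features (money, loan, river, water).
bank_contexts = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]
river_contexts = [[0, 0, 4, 3], [0, 0, 3, 4]]

bank_protos = sense_prototypes(bank_contexts, k=2)
river_protos = sense_prototypes(river_contexts, k=1)
single_proto = [mean(bank_contexts)]  # the single-prototype baseline
```

On this toy data, comparing "bank" with "river" through the two sense-specific prototypes gives a higher similarity than comparing through the single averaged vector, because the averaged vector blends the financial and riverside contexts.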
- Acquiring Word-Meaning Mappings for Natural Language Interfaces
Cynthia A. Thompson and Raymond J. Mooney
Journal of Artificial Intelligence Research, 18:1-44, 2003.
This paper focuses on a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. Wolfie is part of an integrated system that learns to parse sentences into representations such as logical database queries.
Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by Wolfie is compared to that of lexicons acquired by a similar system developed by Siskind (1996), with results favorable to Wolfie. A second set of experiments demonstrates Wolfie's ability to scale to larger and more difficult, albeit artificially generated, corpora.
In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods (Cohn, Atlas, & Ladner, 1994) attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.
ML ID: 121
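The selection step that active learning relies on can be sketched generically. The following is a minimal illustration rather than the procedure used in the paper: the confidence measure (share of a sentence's words already seen in training), the `annotate` stand-in, and the toy sentences are all hypothetical.

```python
def active_learning(pool, annotate, train, confidence, batch_size, rounds):
    """Repeatedly annotate the examples the current model is least sure of."""
    labeled, pool = [], list(pool)
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        # Least confident examples first; ties keep their original order.
        pool.sort(key=lambda x: confidence(model, x))
        batch, pool = pool[:batch_size], pool[batch_size:]
        labeled.extend((x, annotate(x)) for x in batch)
        model = train(labeled)
    return model, labeled


def train(labeled):
    """Toy 'model': the set of words seen in annotated sentences."""
    return {w for sentence, _ in labeled for w in sentence.split()}


def confidence(model, sentence):
    """Hypothetical confidence: fraction of the sentence's words already known."""
    words = sentence.split()
    return sum(w in model for w in words) / len(words)


pool = [
    "what is the capital of texas",
    "what is the capital of ohio",
    "which rivers run through ohio",
]
# Upper-casing stands in for a human-supplied meaning representation.
model, labeled = active_learning(pool, str.upper, train, confidence,
                                 batch_size=1, rounds=2)
chosen = [sentence for sentence, _ in labeled]
```

After the first sentence is annotated, the second one is mostly covered by known words, so the loop spends its second annotation on the rivers question instead.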
- Automatic Construction of Semantic Lexicons for Learning Natural Language Interfaces
Cynthia A. Thompson and Raymond J. Mooney
In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99), 487-493, Orlando, FL, July 1999.
This paper describes a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of words paired with meaning representations. Wolfie is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by Wolfie are compared to those acquired by a competing system developed by Siskind.
ML ID: 95
- Semantic Lexicon Acquisition for Learning Natural Language Interfaces
Cynthia Ann Thompson
PhD Thesis, Department of Computer Sciences, University of Texas at Austin, Austin, TX, December 1998. 101 pages. Also appears as Technical Report AI 99-278, Artificial Intelligence Lab, University of Texas at Austin.
A long-standing goal for the field of artificial intelligence is to enable computer understanding of human languages. A core requirement in reaching this goal is the ability to transform individual sentences into a form better suited for computer manipulation. This ability, called semantic parsing, requires several knowledge sources, such as a grammar, lexicon, and parsing mechanism.
Building natural language parsing systems by hand is a tedious, error-prone undertaking. We build on previous research in automating the construction of such systems using machine learning techniques. The result is a combined system that learns semantic lexicons and semantic parsers from one common set of training examples. The input required is a corpus of sentence/representation pairs, where the representations are in the output format desired. A new system, Wolfie, learns semantic lexicons to be used as background knowledge by a previously developed parser acquisition system, Chill. The combined system is tested on a real world domain of answering database queries. We also compare this combination to a combination of Chill with a previously developed lexicon learner, demonstrating superior performance with our system. In addition, we show the ability of the system to learn to process natural languages other than English. Finally, we test the system on an alternate sentence representation, and on a set of large, artificial corpora with varying levels of ambiguity and synonymy.
One difficulty in using machine learning methods for building natural language interfaces is building the required annotated corpus. Therefore, we also address this issue by using active learning to reduce the number of training examples required by both Wolfie and Chill. Experimental results show that the number of examples needed to reach a given level of performance can be significantly reduced with this method.
ML ID: 90
- Semantic Lexicon Acquisition for Learning Natural Language Interfaces
Cynthia A. Thompson and Raymond J. Mooney
In Proceedings of the Sixth Workshop on Very Large Corpora, Montreal, Quebec, Canada, August 1998. Also available as TR AI 98-273, Artificial Intelligence Lab, University of Texas at Austin, May 1998.
This paper describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with representations of their meaning. The lexicon learned consists of words paired with meaning representations. WOLFIE is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by WOLFIE are compared to those acquired by a competing system developed by Siskind (1996).
ML ID: 89
- Comparative Experiments on Disambiguating Word Senses: An Illustration of the Role of Bias in Machine Learning
Raymond J. Mooney
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-96), 82-91, Philadelphia, PA, 1996.
This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word "line" using the words in the current and preceding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.
ML ID: 62
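As an illustration of a statistical approach to this task, the sketch below trains a bag-of-words naive Bayes classifier with add-one smoothing on context words. The abstract does not say which statistical method was evaluated, so the choice of naive Bayes, the sense labels, and the tiny training set are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Build count tables from (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def predict(model, words):
    """Return the sense with the highest smoothed log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical training contexts for two senses of "line".
examples = [
    ("phone", ["call", "busy", "telephone"]),
    ("phone", ["telephone", "dial", "operator"]),
    ("queue", ["wait", "people", "stood"]),
    ("queue", ["customers", "wait", "store"]),
]
model = train(examples)
phone_guess = predict(model, ["the", "telephone", "was", "busy"])
queue_guess = predict(model, ["people", "wait"])
```

The classifier treats context words as independent evidence for each sense, which is a strong bias; whether that bias suits a given problem is the kind of question the paper examines.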
- Corpus-Based Lexical Acquisition For Semantic Parsing
Cynthia Thompson
February 1996. Ph.D. proposal.
Building accurate and efficient natural language processing (NLP) systems is an important and difficult problem. There has been increasing interest in automating this process. The lexicon, or the mapping from words to meanings, is one component that is typically difficult to update and that changes from one domain to the next. Therefore, automating the acquisition of the lexicon is an important task in automating the acquisition of NLP systems. This proposal describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that learns a lexicon from input consisting of sentences paired with representations of their meanings. Preliminary experimental results show that this system can learn correct and useful mappings. The correctness is evaluated by comparing a known lexicon to one learned from the training input. The usefulness is evaluated by examining the effect of using the lexicon learned by WOLFIE to assist a parser acquisition system, where previously this lexicon had to be hand-built. Future work in the form of extensions to the algorithm, further evaluation, and possible applications is discussed.
ML ID: 57
- Lexical Acquisition: A Novel Machine Learning Problem
Cynthia A. Thompson and Raymond J. Mooney
Technical Report, Artificial Intelligence Lab, University of Texas at Austin, January 1996.
This paper defines a new machine learning problem to which standard machine learning algorithms cannot easily be applied. The problem occurs in the domain of lexical acquisition. The ambiguous and synonymous nature of words makes it difficult to use standard induction techniques to learn a lexicon. Additionally, negative examples are typically unavailable or difficult to construct in this domain. One approach to solving the lexical acquisition problem is presented, along with preliminary experimental results on an artificial corpus. Future work includes extending the algorithm and performing tests on a more realistic corpus.
ML ID: 56
- Acquisition of a Lexicon from Semantic Representations of Sentences
Cynthia A. Thompson
In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL-95), 335-337, Cambridge, MA, 1995.
A system, WOLFIE, that acquires a mapping of words to their semantic representation is presented and a preliminary evaluation is performed. Tree least general generalizations (TLGGs) of the representations of input sentences are performed to assist in determining the representations of individual words in the sentences. The best guess for a meaning of a word is the TLGG which overlaps with the highest percentage of sentence representations in which that word appears. Some promising experimental results on a non-artificial data set are presented.
ML ID: 45
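The notion of a least general generalization can be illustrated with standard anti-unification of two terms. This is a simplified sketch, not WOLFIE's TLGG procedure, which operates over tree-structured sentence representations and may differ in detail; the tuple encoding of terms, the `X0`, `X1`, ... variable names, and the example terms are assumptions.

```python
def lgg(s, t, table=None):
    """Least general generalization of two terms (anti-unification).

    A compound term is a tuple (functor, arg1, ..., argN); a constant is a
    string. Each distinct pair of mismatched subterms is replaced by the
    same fresh variable wherever it occurs. Assumes no constant is itself
    named like a generated variable (X0, X1, ...).
    """
    if table is None:
        table = {}
    if s == t:
        return s
    same_shape = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_shape:
        args = tuple(lgg(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]


# Hypothetical meaning representations for two sentences.
rep_texas = ("capital", ("state", "texas"))
rep_ohio = ("capital", ("state", "ohio"))
shared = lgg(rep_texas, rep_ohio)
```

Here `shared` keeps the structure common to both representations and abstracts the differing state name into a variable, which is the kind of common fragment a lexicon learner could propose as a candidate word meaning.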
- Integrated Learning of Words and their Underlying Concepts
Raymond J. Mooney
In Proceedings of the Ninth Annual Conference of the Cognitive Science Society, 947-978, Seattle, WA, July 1987.
Models of learning word meanings have generally assumed prior knowledge of the concepts to which the words refer. However, novel natural language text or discourse often presents both unknown concepts and words which refer to these concepts. Also, developmental data suggests that the learning of words and their concepts frequently occurs concurrently instead of concept learning preceding word learning. This paper presents an integrated computational model for acquiring both word meanings and their underlying concepts concurrently. This model is implemented as a word learning component added to the GENESIS explanation-based learning schema acquisition system for narrative understanding. A detailed example is described in which GENESIS learns provisional definitions for the words "kidnap", "kidnapper", and "ransom" as well as a kidnapping schema from a single narrative.
ML ID: 208