Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superﬁcial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM ﬁnds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.
ML ID: 262
We introduce tiered clustering, a mixture model capable of accounting for varying degrees of shared (context-independent) feature structure, and demonstrate its applicability to inferring distributed representations of word meaning. Common tasks in lexical semantics such as word relatedness or selectional preference can benefit from modeling such structure: Polysemous word usage is often governed by some common background metaphoric usage (e.g. the senses of line or run), and likewise modeling the selectional preference of verbs relies on identifying commonalities shared by their typical arguments. Tiered clustering can also be viewed as a form of soft feature selection, where features that do not contribute meaningfully to the clustering can be excluded. We demonstrate the applicability of tiered clustering, highlighting particular cases where modeling shared structure is beneficial and where it can be detrimental.
ML ID: 252
In order to respond to increasing demand for natural language interfaces—and provide meaningful insight into user query intent—fast, scalable lexical semantic models with flexible representations are needed. Human concept organization is a rich epiphenomenon that has yet to be accounted for by a single coherent psychological framework: Concept generalization is captured by a mixture of prototype and exemplar models, and local taxonomic information is available through multiple overlapping organizational systems. Previous work in computational linguistics on extracting lexical semantic information from the Web does not provide adequate representational flexibility and hence fails to capture the full extent of human conceptual knowledge. In this proposal I will outline a family of probabilistic models capable of accounting for the rich organizational structure found in human language that can predict contextual variation, selectional preference and feature-saliency norms to a much higher degree of accuracy than previous approaches. These models account for cross-cutting structure of concept organization—i.e. the notion that humans make use of different categorization systems for different kinds of generalization tasks—and can be applied to Web-scale corpora. Using these models, natural language systems will be able to infer a more comprehensive semantic relations, in turn improving question answering, text classification, machine translation, and information retrieval.
ML ID: 249
Current vector-space models of lexical semantics create a single “prototype” vector to represent the meaning of a word. However, due to lexical ambiguity, encoding word meaning with a single vector is problematic. This paper presents a method that uses clustering to produce multiple “sense-specific&rdquo vectors for each word. This approach provides a context-dependent vector representation of word meaning that naturally accommodates homonymy and polysemy. Experimental comparisons to human judgements of semantic similarity for both isolated words as well as words in sentential contexts demonstrate the superiority of this approach over both prototype and exemplar based vector-space models.
ML ID: 241
This paper focuses on a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. Wolfie is part of an integrated system that learns to parse representations such as logical database queries.
Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by Wolfie are compared to those acquired by a similar system developed by Siskind (1996), with results favorable to Wolfie. A second set of experiments demonstrates Wolfie's ability to scale to larger and more difficult, albeit artificially generated, corpora.
In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods (Cohn, Atlas, & Ladner, 1994) attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.
ML ID: 121
This paper describes a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of words paired with meaning representations. Wolfie is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating Wolfie's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by Wolfie are compared to those acquired by a competing system developed by Siskind.
ML ID: 95
A long-standing goal for the field of artificial intelligence is to enable computer understanding of human languages. A core requirement in reaching this goal is the ability to transform individual sentences into a form better suited for computer manipulation. This ability, called semantic parsing, requires several knowledge sources, such as a grammar, lexicon, and parsing mechanism.
Building natural language parsing systems by hand is a tedious, error-prone undertaking. We build on previous research in automating the construction of such systems using machine learning techniques. The result is a combined system that learns semantic lexicons and semantic parsers from one common set of training examples. The input required is a corpus of sentence/representation pairs, where the representations are in the output format desired. A new system, Wolfie, learns semantic lexicons to be used as background knowledge by a previously developed parser acquisition system, Chill. The combined system is tested on a real world domain of answering database queries. We also compare this combination to a combination of Chill with a previously developed lexicon learner, demonstrating superior performance with our system. In addition, we show the ability of the system to learn to process natural languages other than English. Finally, we test the system on an alternate sentence representation, and on a set of large, artificial corpora with varying levels of ambiguity and synonymy.
One difficulty in using machine learning methods for building natural language interfaces is building the required annotated corpus. Therefore, we also address this issue by using active learning to reduce the number of training examples required by both Wolfie and Chill. Experimental results show that the number of examples needed to reach a given level of performance can be significantly reduced with this method.
ML ID: 90
This paper describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with representations of their meaning. The lexicon learned consists of words paired with meaning representations. WOLFIE is part of an integrated system that learns to parse novel sentences into semantic representations, such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The lexicons learned by WOLFIE are compared to those acquired by a competing system developed by Siskind (1996).
ML ID: 89
This paper describes an experimental comparison of seven different learning algorithms on the problem of learning to disambiguate the meaning of a word from context. The algorithms tested include statistical, neural-network, decision-tree, rule-based, and case-based classification techniques. The specific problem tested involves disambiguating six senses of the word ``line'' using the words in the current and proceeding sentence as context. The statistical and neural-network methods perform the best on this particular problem and we discuss a potential reason for this observed difference. We also discuss the role of bias in machine learning and its importance in explaining performance differences observed on specific problems.
ML ID: 62
Building accurate and efficient natural language processing (NLP) systems is an important and difficult problem. There has been increasing interest in automating this process. The lexicon, or the mapping from words to meanings, is one component that is typically difficult to update and that changes from one domain to the next. Therefore, automating the acquisition of the lexicon is an important task in automating the acquisition of NLP systems. This proposal describes a system, WOLFIE (WOrd Learning From Interpreted Examples), that learns a lexicon from input consisting of sentences paired with representations of their meanings. Preliminary experimental results show that this system can learn correct and useful mappings. The correctness is evaluated by comparing a known lexicon to one learned from the training input. The usefulness is evaluated by examining the effect of using the lexicon learned by WOLFIE to assist a parser acquisition system, where previously this lexicon had to be hand-built. Future work in the form of extensions to the algorithm, further evaluation, and possible applications is discussed.
ML ID: 57
This paper defines a new machine learning problem to which standard machine learning algorithms cannot easily be applied. The problem occurs in the domain of lexical acquisition. The ambiguous and synonymous nature of words causes the difficulty of using standard induction techniques to learn a lexicon. Additionally, negative examples are typically unavailable or difficult to construct in this domain. One approach to solve the lexical acquisition problem is presented, along with preliminary experimental results on an artificial corpus. Future work includes extending the algorithm and performing tests on a more realistic corpus.
ML ID: 56
A system, WOLFIE, that acquires a mapping of words to their semantic representation is presented and a preliminary evaluation is performed. Tree least general generalizations (TLGGs) of the representations of input sentences are performed to assist in determining the representations of individual words in the sentences. The best guess for a meaning of a word is the TLGG which overlaps with the highest percentage of sentence representations in which that word appears. Some promising experimental results on a non-artificial data set are presented.
ML ID: 45
Models of learning word meanings have generally assumed prior knowledge of the concepts to which the words refer. However, novel natural language text or discourse often presents both unknown concepts and words which refer to these concepts. Also, developmental data suggests that the learning of words and their concepts frequently occurs concurrently instead of concept learning proceeding word learning. This paper presents an integrated computational model for acquiring both word meanings and their underlying concepts concurrently. This model is implemented as a word learning component added to the GENESIS explanation-based learning schema acquisition system for narrative understanding. A detailed example is described in which GENESIS learns provisional definitions for the words "kidnap", "kidnapper", and "ransom" as well as a kidnapping schema from a single narrative.
ML ID: 208