Machine Learning Research Group, Department of Computer Science, The University of Texas at Austin

Publications: Combining Logical and Distributional Semantics

The field of distributional semantics provides techniques for reasoning about the meanings of words and phrases based on usage statistics in large text corpora; however, these techniques typically fail to capture much of the logical structure of natural language. Purely logic-based approaches to natural language semantics, on the other hand, fail to capture many graded notions of word meaning. We are investigating methods to combine these two approaches to natural language semantics.
  1. Natural Language Semantics Using Probabilistic Logic
    [Details] [PDF] [Slides (PPT)] [Slides (PDF)]
    I. Beltagy
    PhD Thesis, Department of Computer Science, The University of Texas at Austin, December 2016.
    With better natural language semantic representations, computers can support more applications more effectively because they understand natural text better. However, no single semantic representation currently fulfills all the requirements of a satisfactory representation. Logic-based representations like first-order logic capture many linguistic phenomena using logical constructs and come with standardized inference mechanisms, but standard first-order logic fails to capture the "graded" aspect of meaning in language. Other approaches to semantics, like distributional models, focus on capturing the "graded" semantic similarity of words and phrases but do not capture sentence structure in the same detail as logic-based approaches. Yet both aspects of semantics, structure and gradedness, are important for an accurate representation of language meaning.

    In this work, we propose a natural language semantics representation that uses probabilistic logic (PL) to integrate logical representations with weighted, uncertain knowledge. It combines the expressivity and automated inference of logic with the ability to reason with uncertainty. To demonstrate the effectiveness of our semantic representation, we implement and evaluate it on three tasks: recognizing textual entailment (RTE), semantic textual similarity (STS), and open-domain question answering (QA). These tasks can exploit the strengths of our representation, namely the integration of logical structure with uncertain knowledge. Our semantic representation has three components, Logical Form, Knowledge Base, and Inference, all of which present interesting challenges, and we make new contributions to each of them.

    The first component is the Logical Form, which is the primary meaning representation. We address two points: how to translate input sentences to logical form, and how to adapt the resulting logical form to PL. First, we use Boxer, a CCG-based semantic analysis tool, to translate sentences to logical form. We also explore translating dependency trees to logical form. Then, we adapt the logical forms to ensure that universal quantifiers and negations behave as expected.
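
    As an illustration (ours, not copied from the thesis), Boxer-style analyses yield neo-Davidsonian first-order forms; a sentence such as "A man is driving a car" comes out roughly as:

        \exists x\, y\, z.\; \mathit{man}(x) \land \mathit{drive}(y) \land \mathit{agent}(y, x) \land \mathit{car}(z) \land \mathit{patient}(y, z)

    with an event variable for the verb and thematic roles linking it to its arguments.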

    The second component is the Knowledge Base, which contains the "uncertain" background knowledge required for a given problem. We collect the "relevant" lexical information from different linguistic resources, encode it as weighted logical rules, and add the rules to the knowledge base. We add rules from existing databases, in particular WordNet and the Paraphrase Database (PPDB). Since these are incomplete, we generate additional on-the-fly rules that could be useful. We use alignment techniques to propose rules that are relevant to a particular problem, and explore two alignment methods, one based on Robinson's resolution and the other based on graph matching. We automatically annotate the proposed rules and use them to learn weights for unseen rules.
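
    As a sketch of what an on-the-fly distributional rule can look like (a minimal illustration; the function names and the cosine-to-weight log-odds mapping below are our assumptions, not the weighting scheme learned in the thesis):

        import math
        import numpy as np

        def cosine(u, v):
            """Cosine similarity between two word vectors."""
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

        def lexical_rule(word1, word2, vectors, eps=1e-6):
            """Propose a weighted lexical rule 'word1(x) => word2(x)'.

            The log-odds mapping from similarity to weight is an illustrative
            choice, not the thesis's learned weighting.
            """
            sim = max(eps, min(1.0 - eps, cosine(vectors[word1], vectors[word2])))
            weight = math.log(sim / (1.0 - sim))          # log-odds of the similarity
            rule = f"forall x. {word1}(x) -> {word2}(x)"  # weighted first-order rule
            return rule, weight

        # Toy vectors; in practice these come from a distributional model.
        vectors = {"guy": np.array([0.9, 0.1, 0.3]), "man": np.array([0.8, 0.2, 0.25])}
        print(lexical_rule("guy", "man", vectors))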

    The third component is Inference, which is implemented for each task separately. We use the logical form and the knowledge base constructed in the previous two steps to formulate the task as a PL inference problem, and then develop a PL inference algorithm that is optimized for that particular task. We explore the use of two PL frameworks, Markov Logic Networks (MLNs) and Probabilistic Soft Logic (PSL). We discuss which framework works best for a particular task, and present new inference algorithms for each framework.
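
    In MLN terms (the standard log-linear formulation, simplified relative to the thesis's task-specific algorithms), inference reasons over a distribution on possible worlds, and RTE can be cast as a conditional probability query:

        P(x) = \frac{1}{Z} \exp\Big( \sum_i w_i \, n_i(x) \Big), \qquad \mathrm{score}_{\mathrm{RTE}} \approx P(H \mid T, \mathit{KB})

    where n_i(x) is the number of true groundings of weighted formula i in world x, w_i is its weight, H is the hypothesis, T the text, and KB the weighted knowledge base.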

    ML ID: 337
  2. Representing Meaning with a Combination of Logical and Distributional Models
    [Details] [PDF]
    I. Beltagy and Stephen Roller and Pengxiang Cheng and Katrin Erk and Raymond J. Mooney
    Computational Linguistics, special issue on Formal Distributional Semantics, 42(4), 2016.
    NLP tasks differ in the semantic information they require, and at this time no single semantic representation fulfills all requirements. Logic-based representations characterize sentence structure, but do not capture the graded aspect of meaning. Distributional models give graded similarity ratings for words and phrases, but do not capture sentence structure in the same detail as logic-based approaches. So it has been argued that the two are complementary. We adopt a hybrid approach that combines logical and distributional semantics using probabilistic logic, specifically Markov Logic Networks (MLNs). In this paper, we focus on the three components of a practical system: 1) Logical representation focuses on representing the input problems in probabilistic logic. 2) Knowledge base construction creates weighted inference rules by integrating distributional information with other sources. 3) Probabilistic inference involves solving the resulting MLN inference problems efficiently. To evaluate our approach, we use the task of textual entailment (RTE), which can utilize the strengths of both logic-based and distributional representations. In particular, we focus on the SICK dataset, where we achieve state-of-the-art results. We also release a lexical entailment dataset of 10,213 rules extracted from the SICK dataset, which is a valuable resource for evaluating lexical entailment systems.
    ML ID: 316
  3. On the Proper Treatment of Quantifiers in Probabilistic Logic Semantics
    [Details] [PDF] [Slides (PPT)]
    I. Beltagy and Katrin Erk
    In Proceedings of the 11th International Conference on Computational Semantics (IWCS-2015), London, UK, April 2015.
    As a format for describing the meaning of natural language sentences, probabilistic logic combines the expressivity of first-order logic with the ability to handle graded information in a principled fashion. But practical probabilistic logic frameworks usually assume a finite domain in which each entity corresponds to a constant in the logic (the domain closure assumption). They also assume a closed world in which everything has a very low prior probability. These assumptions lead to problems in the inferences that such systems make. In this paper, we show how to formulate Textual Entailment (RTE) inference problems in probabilistic logic in a way that takes the domain closure and closed-world assumptions into account. We evaluate our proposed technique on three RTE datasets: a synthetic dataset focused on complex forms of quantification, FraCaS, and a more natural dataset. We show that our technique leads to improvements on the more natural dataset, and achieves 100% accuracy on the synthetic dataset and on the relevant part of FraCaS.
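
    To see why these assumptions can be problematic, consider an illustrative case (ours, not taken verbatim from the paper). Under domain closure with domain {a, b}, a universally quantified hypothesis reduces to a finite conjunction:

        \forall x.\, (\mathit{dog}(x) \rightarrow \mathit{bark}(x)) \;\equiv\; (\mathit{dog}(a) \rightarrow \mathit{bark}(a)) \land (\mathit{dog}(b) \rightarrow \mathit{bark}(b))

    and if the closed world gives dog(a) and dog(b) near-zero prior probability, both implications are satisfied vacuously, so the hypothesis receives high probability regardless of the premise; this is the kind of spurious inference the paper's reformulation is meant to avoid.
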
    ML ID: 311
  4. Natural Language Semantics using Probabilistic Logic
    [Details] [PDF] [Slides (PPT)]
    I. Beltagy
    PhD proposal, Department of Computer Science, The University of Texas at Austin, October 2014.
    With better natural language semantic representations, computers can support more applications more effectively because they understand natural text better. However, no single semantic representation currently fulfills all the requirements of a satisfactory representation. Logic-based representations like first-order logic capture many linguistic phenomena using logical constructs and come with standardized inference mechanisms, but standard first-order logic fails to capture the "graded" aspect of meaning in language. Distributional models use contextual similarity to predict the "graded" semantic similarity of words and phrases, but they do not adequately capture logical structure. In addition, there have been a few recent attempts to combine the two representations, either on the logic side (which still does not yield a graded representation) or on the distributional side (which does not provide full logic).

    We propose using probabilistic logic to represent natural language semantics, combining the expressivity and automated inference of logic with the gradedness of distributional representations. We evaluate this semantic representation on two tasks, Recognizing Textual Entailment (RTE) and Semantic Textual Similarity (STS). Doing better on RTE and STS is an indication of better semantic understanding.

    Our system has three main components: 1. Parsing and Task Representation, 2. Knowledge Base Construction, and 3. Inference. The input natural-language sentences of the RTE/STS task are mapped to logical form using Boxer, a rule-based system built on top of a CCG parser, and are then used to formulate the RTE/STS problem in probabilistic logic. Next, a knowledge base is constructed as a set of weighted inference rules collected from different sources, such as WordNet and on-the-fly lexical rules from distributional semantics. An advantage of using probabilistic logic is that rules from additional resources can easily be added by mapping them to logical rules and weighting them appropriately. The last component is inference, where we solve the probabilistic logic inference problem using an appropriate probabilistic logic tool such as Markov Logic Networks (MLNs) or Probabilistic Soft Logic (PSL). We show how to solve the inference problems in MLNs efficiently for RTE using a modified closed-world assumption and a new inference algorithm, and how to adapt MLNs and PSL for STS by relaxing conjunctions. Experiments show that our semantic representation can handle RTE and STS reasonably well.

    For future work, our short-term goals are: 1. better RTE task representation and finite-domain handling, 2. adding more inference rules, both precompiled and on-the-fly, 3. generalizing the modified closed-world assumption, 4. enhancing our inference algorithm for MLNs, and 5. adding a weight-learning step to better adapt the weights. In the longer term, we would like to apply our semantic representation to the question answering task, support generalized quantifiers, contextualize the WordNet rules we use, apply our semantic representation to languages other than English, and implement a probabilistic logic Inference Inspector that can visualize the proof structure.

    ML ID: 308
  5. UTexas: Natural Language Semantics using Distributional Semantics and Probabilistic Logic
    [Details] [PDF]
    I. Beltagy and Stephen Roller and Gemma Boleda and Katrin Erk and Raymond J. Mooney
    In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014), 796--801, Dublin, Ireland, August 2014.
    We represent natural language semantics by combining logical and distributional information in probabilistic logic. We use Markov Logic Networks (MLNs) for the RTE task and Probabilistic Soft Logic (PSL) for the STS task. The system is evaluated on the SICK dataset. Our best system achieves 73% accuracy on the RTE task and a Pearson's correlation of 0.71 on the STS task.
    ML ID: 305
  6. Efficient Markov Logic Inference for Natural Language Semantics
    [Details] [PDF] [Poster]
    I. Beltagy and Raymond J. Mooney
    In Proceedings of the Fourth International Workshop on Statistical Relational AI at AAAI (StarAI-2014), 9--14, Quebec City, Canada, July 2014.
    Using Markov logic to integrate logical and distributional information in natural-language semantics results in complex inference problems involving long, complicated formulae. Current inference methods for Markov logic are ineffective on such problems. To address this, we propose a new inference algorithm based on SampleSearch that computes probabilities of complete formulae rather than of ground atoms. We also introduce a modified closed-world assumption that significantly reduces the size of the ground network, thereby making inference feasible. Our approach is evaluated on the recognizing textual entailment task, and experiments demonstrate its dramatic impact on the efficiency of inference.
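
    A brute-force sketch of the first idea, computing the probability of a complete query formula (rather than per-atom marginals) by enumerating worlds of a tiny, made-up MLN; the actual algorithm approximates these sums with SampleSearch:

        from itertools import product
        from math import exp

        # Ground atoms of a tiny, made-up MLN (not from the paper's experiments).
        atoms = ["man(A)", "guy(A)", "walk(A)"]

        # Weighted ground formulas: (weight, test on a world).
        formulas = [
            (1.5, lambda w: (not w["man(A)"]) or w["guy(A)"]),   # man(A) -> guy(A)
            (0.8, lambda w: (not w["guy(A)"]) or w["walk(A)"]),  # guy(A) -> walk(A)
        ]

        def world_weight(world):
            """exp of the summed weights of the satisfied ground formulas."""
            return exp(sum(wt for wt, f in formulas if f(world)))

        def formula_probability(query, evidence):
            """P(query | evidence), where 'query' is a whole formula over the world."""
            num = den = 0.0
            for values in product([True, False], repeat=len(atoms)):
                world = dict(zip(atoms, values))
                if any(world[a] != v for a, v in evidence.items()):
                    continue  # world inconsistent with the evidence
                w = world_weight(world)
                den += w
                if query(world):
                    num += w
            return num / den

        # Probability of the complete hypothesis "guy(A) and walk(A)" given "man(A)".
        print(formula_probability(lambda w: w["guy(A)"] and w["walk(A)"],
                                  {"man(A)": True}))
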
    ML ID: 303
  7. Semantic Parsing using Distributional Semantics and Probabilistic Logic
    [Details] [PDF] [Poster]
    I. Beltagy and Katrin Erk and Raymond Mooney
    In Proceedings of ACL 2014 Workshop on Semantic Parsing (SP-2014), 7--11, Baltimore, MD, June 2014.
    We propose a new approach to semantic parsing that is not constrained by a fixed formal ontology and purely logical inference. Instead, we use distributional semantics to generate only the relevant part of an on-the-fly ontology. Sentences and the on-the-fly ontology are represented in probabilistic logic. For inference, we use probabilistic logic frameworks like Markov Logic Networks (MLNs) and Probabilistic Soft Logic (PSL). This semantic parsing approach is evaluated on two tasks, Recognizing Textual Entailment (RTE) and Semantic Textual Similarity (STS), both accomplished using inference in probabilistic logic. Experiments show the potential of the approach.
    ML ID: 301
  8. Probabilistic Soft Logic for Semantic Textual Similarity
    [Details] [PDF] [Poster]
    I. Beltagy and Katrin Erk and Raymond J. Mooney
    In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL-14), 1210--1219, Baltimore, MD, 2014.
    Probabilistic Soft Logic (PSL) is a recently developed framework for probabilistic logic. We use PSL to combine logical and distributional representations of natural-language meaning, where distributional information is represented in the form of weighted inference rules. We apply this framework to the task of Semantic Textual Similarity (STS) (i.e. judging the semantic similarity of natural-language sentences), and show that PSL gives improved results compared to a previous approach based on Markov Logic Networks (MLNs) and a purely distributional approach.
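
    For reference, PSL relaxes the Boolean connectives to soft truth values in [0, 1] using the Lukasiewicz operators; a minimal sketch of just those relaxations (helper names are ours, and this omits the full PSL optimization):

        def l_and(a, b):
            """Lukasiewicz t-norm: soft conjunction."""
            return max(0.0, a + b - 1.0)

        def l_or(a, b):
            """Lukasiewicz t-co-norm: soft disjunction."""
            return min(1.0, a + b)

        def l_not(a):
            """Soft negation."""
            return 1.0 - a

        def l_implies(a, b):
            """Soft implication, equivalent to l_or(l_not(a), b)."""
            return min(1.0, 1.0 - a + b)

        # Graded truth of "man(x) and walk(x)" from soft atom values.
        print(l_and(0.9, 0.7))      # 0.6
        print(l_implies(0.9, 0.7))  # 0.8
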
    ML ID: 300
  9. Montague Meets Markov: Deep Semantics with Probabilistic Logical Form
    [Details] [PDF] [Slides (PPT)]
    I. Beltagy, Cuong Chau, Gemma Boleda, Dan Garrette, Katrin Erk, Raymond Mooney
    In Proceedings of the Second Joint Conference on Lexical and Computational Semantics (*Sem-2013), 11--21, Atlanta, GA, June 2013.
    We combine logical and distributional representations of natural language meaning by transforming distributional similarity judgments into weighted inference rules using Markov Logic Networks (MLNs). We show that this framework supports both judging sentence similarity and recognizing textual entailment by appropriately adapting the MLN implementation of logical connectives. We also show that distributional phrase similarity, used in the form of textual inference rules created on the fly, improves the system's performance.
    ML ID: 285
  10. Integrating Logical Representations with Probabilistic Information using Markov Logic
    [Details] [PDF] [Slides (PDF)]
    Dan Garrette, Katrin Erk, Raymond Mooney
    In Proceedings of the International Conference on Computational Semantics, 105--114, Oxford, England, January 2011.
    First-order logic provides a powerful and flexible mechanism for representing natural language semantics. However, it is an open question how best to integrate it with uncertain, probabilistic knowledge, for example regarding word meaning. This paper describes the first steps of an approach to recasting first-order semantics into the probabilistic models that are part of Statistical Relational AI. Specifically, we show how Discourse Representation Structures can be combined with distributional models for word meaning inside a Markov Logic Network and used to successfully perform inferences that take advantage of logical concepts such as factivity as well as probabilistic information on word meaning in context.
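
    As a schematic illustration of one such logical concept (our example, not the paper's exact encoding), factivity can be captured by a hard rule that lets the embedded proposition be inferred for a factive verb like "know", while no analogous rule is added for a non-factive verb like "believe":

        \forall x, p.\; \mathit{know}(x, p) \rightarrow \mathit{hold}(p)

    where hold(p) stands in for the truth of the embedded proposition p.
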
    ML ID: 253