Statistical relational learning (SRL) is an emerging area of research that addresses the problem of learning from noisy structured/relational data. Markov logic networks (MLNs), sets of weighted clauses, are a simple but powerful SRL formalism that combines the expressivity of first-order logic with the flexibility of probabilistic reasoning. Most existing learning algorithms for MLNs are generative: they try to learn a model that maximizes the likelihood of the training data. However, most learning problems in relational data are discriminative. To utilize the power of MLNs, we therefore need discriminative learning methods that match these tasks well.
In this proposal, we present two new discriminative learning algorithms for MLNs. The first is a discriminative structure and weight learner for MLNs with non-recursive clauses. We use a variant of Aleph, an off-the-shelf Inductive Logic Programming (ILP) system, to learn a large set of Horn clauses from the training data, and then apply an L1-regularized weight learner to select a small set of clauses with non-zero weights that maximizes the conditional log-likelihood (CLL) of the training data. The experimental results show that our proposed algorithm outperforms existing learning methods for MLNs and traditional ILP systems in terms of predictive accuracy, and its performance is comparable to state-of-the-art results on some ILP benchmarks. The second algorithm is a max-margin weight learner for MLNs. Instead of maximizing the CLL of the data like all existing discriminative weight learners for MLNs, the new weight learner tries to maximize the ratio between the probability of the correct label (the observable data) and that of the closest incorrect label (the wrong label with the highest probability). This objective can be formulated as an optimization problem called the 1-slack structural SVM, which can be solved efficiently by an algorithm based on the cutting plane method. However, the cutting plane algorithm requires an efficient inference method as a subroutine, and exact inference in MLNs is intractable. We therefore develop a new approximate inference method for MLNs based on Linear Programming relaxation. Extensive experiments in two real-world MLN applications demonstrate that the proposed max-margin weight learner generally achieves higher F1 scores than the current best discriminative weight learner for MLNs.
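As a rough sketch of the max-margin criterion described above, the following writes it out for a single training example, assuming the usual log-linear form of an MLN's conditional distribution. The symbols are notation introduced here for illustration and are not taken from the text: $\mathbf{w}$ is the clause weight vector, $\mathbf{n}(x,y)$ the vector of true-grounding counts of each clause, $\Delta$ a loss between labelings, $C$ a trade-off constant, and $\xi$ the single slack variable.

```latex
% Assumed log-linear form of the MLN conditional distribution
P(y \mid x) = \frac{\exp\!\big(\mathbf{w}^{\top}\mathbf{n}(x,y)\big)}{Z(x)}

% The partition function Z(x) cancels in the log-ratio between the
% correct label y and any incorrect label \hat{y}
\log\frac{P(y \mid x)}{P(\hat{y} \mid x)}
  = \mathbf{w}^{\top}\big[\mathbf{n}(x,y) - \mathbf{n}(x,\hat{y})\big]

% 1-slack structural SVM: one slack variable shared by all constraints
\min_{\mathbf{w},\,\xi \ge 0}\ \tfrac{1}{2}\,\mathbf{w}^{\top}\mathbf{w} + C\,\xi
\quad \text{s.t.} \quad
\forall \hat{y}:\
\mathbf{w}^{\top}\big[\mathbf{n}(x,y) - \mathbf{n}(x,\hat{y})\big]
  \ \ge\ \Delta(y,\hat{y}) - \xi

% Most violated constraint, found at each cutting plane iteration
\hat{y}^{*} = \arg\max_{\hat{y}}\ \Big[\Delta(y,\hat{y}) + \mathbf{w}^{\top}\mathbf{n}(x,\hat{y})\Big]
```

Under these assumptions, the last line is where the inference subroutine enters: each cutting plane iteration must find the most violated labeling, which is an inference problem over the MLN and is the step the Linear Programming relaxation is meant to approximate.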
For future work, our short-term goal is to develop a more efficient inference algorithm and to test our max-margin weight learner on more complex problems that involve complicated relationships between the input and output variables and among the outputs. In the longer term, we plan to develop more efficient learning algorithms through online learning, as well as algorithms that revise both the clauses and their weights to improve predictive performance.
Unpublished Ph.D. proposal, Department of Computer Sciences, University of Texas at Austin.