Efficient prediction of relational structure and its application to natural language processing
Abstract
Many tasks in Natural Language Processing (NLP) require us to predict a relational
structure over entities. For example, in Semantic Role Labelling we try to predict the
'semantic role' relation between a predicate verb and its argument constituents. NLP
tasks often involve not only related entities but also relations that are stochastically
correlated. For instance, in Semantic Role Labelling the roles of different constituents
are correlated: we cannot assign the agent role to one constituent if we have already
assigned this role to another.
Statistical Relational Learning (also known as First Order Probabilistic Logic) allows
us to capture the aforementioned nature of NLP tasks because it is based on the
notions of entities, relations and stochastic correlations between relationships. It is
therefore often straightforward to formulate an NLP task using a First Order probabilistic
language such as Markov Logic. However, the generality of this approach
comes at a price: the process of finding the relational structure with highest probability,
also known as maximum a posteriori (MAP) inference, is often inefficient, if not
intractable.
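As an illustration of such a formulation, the agent-uniqueness correlation mentioned above can be expressed in Markov Logic as a hard first-order formula, roughly role(p, a1, Agent) ^ a1 != a2 => !role(p, a2, Agent). A minimal sketch of checking the ground instances of this constraint over a candidate labelling might look as follows; the predicate and role names are illustrative, not tied to any particular Markov Logic engine:

```python
from itertools import permutations

def violates_agent_uniqueness(assignment):
    """Check all ground instances of the agent-uniqueness constraint.

    `assignment` maps (predicate, argument) pairs to role labels; the
    constraint is violated if one predicate has two Agent arguments.
    """
    for (p1, a1), (p2, a2) in permutations(assignment, 2):
        if p1 == p2 and assignment[(p1, a1)] == "Agent" \
                and assignment[(p2, a2)] == "Agent":
            return True
    return False

# A labelling with one Agent per predicate satisfies the constraint;
# assigning Agent to two arguments of the same predicate violates it.
```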
In this work we seek to improve the efficiency of MAP inference for Statistical
Relational Learning. We propose a meta-algorithm, namely Cutting Plane Inference
(CPI), that iteratively solves small subproblems of the original problem using any
existing MAP technique and inspects parts of the problem that are not yet included in
the current subproblem but could potentially lead to an improved solution. Our hypothesis
is that this algorithm can dramatically improve the efficiency of existing methods
while remaining at least as accurate.
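The iterative scheme described above can be sketched as a generic loop; the constraint representation, the separation test, and the base solver below are hypothetical stand-ins, since CPI is agnostic to the underlying MAP technique:

```python
def cutting_plane_inference(all_constraints, base_map_solver, violated,
                            max_iters=100):
    """Meta-algorithm sketch: solve growing subproblems until convergence."""
    active = set()                      # constraints instantiated so far
    solution = base_map_solver(active)  # MAP solution of the initial subproblem
    for _ in range(max_iters):
        # Separation step: find constraints outside the current subproblem
        # that the current solution violates.
        new = {c for c in all_constraints
               if c not in active and violated(c, solution)}
        if not new:                     # converged: nothing left to add
            break
        active |= new                   # enlarge the subproblem and re-solve
        solution = base_map_solver(active)
    return solution
```

At convergence no remaining constraint is violated, which is why the returned solution is at least as good as one computed on the full problem by the same base solver.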
We frame the algorithm in Markov Logic, a language that combines First Order
Logic and Markov Networks. Our hypothesis is evaluated using two tasks: Semantic
Role Labelling and Entity Resolution. It is shown that the proposed algorithm improves
the efficiency of two existing methods by two orders of magnitude and leads an
approximate method to more probable solutions. We also show that CPI, at convergence,
is guaranteed to be at least as accurate as the method used within its inner
loop.
Another core contribution of this work is a theoretic and empirical analysis of the
boundary conditions of Cutting Plane Inference. We describe cases when Cutting Plane
Inference will definitely be difficult (because it instantiates large networks or needs
many iterations) and when it will be easy (because it instantiates small networks and
needs only a few iterations).