Learning representations of entities and relations
Authors
Balažević, Ivana
Abstract
Learning to represent factual knowledge about the world in a succinct and accessible manner is a fundamental machine learning problem. Encoding facts as representations of entities and of the binary relations between them, as learned by knowledge graph representation models, is useful for a variety of tasks, including predicting new facts (i.e. link prediction), question answering, fact checking and information retrieval. The focus of this thesis is on (i) improving knowledge graph representation with the aim of tackling the link prediction task; and (ii) devising a theory of how semantics can be captured in the geometry of relation representations.
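To make the setting concrete, the following minimal Python sketch represents a knowledge graph as (subject, relation, object) triples and frames link prediction as ranking candidate objects under a scoring function. The toy facts, the embedding size and the generic bilinear scorer are illustrative assumptions, not methods from the thesis.

    import numpy as np

    # A knowledge graph as a set of (subject, relation, object) triples;
    # the facts below are toy examples for illustration only.
    triples = [
        ("edinburgh", "capital_of", "scotland"),
        ("scotland", "part_of", "uk"),
    ]

    entities = sorted({e for s, _, o in triples for e in (s, o)})
    relations = sorted({r for _, r, _ in triples})

    rng = np.random.default_rng(0)
    d = 50  # assumed embedding dimension
    emb = {e: rng.standard_normal(d) for e in entities}        # entity vectors
    rel = {r: rng.standard_normal((d, d)) for r in relations}  # relation matrices

    def score(s, r, o):
        # A generic bilinear scorer e_s^T W_r e_o, standing in for the
        # thesis models (HypER, TuckER, MuRP), each of which defines its
        # own scoring function.
        return emb[s] @ rel[r] @ emb[o]

    # Link prediction: rank every entity as a candidate object for a query
    ranked = sorted(entities, key=lambda o: -score("edinburgh", "capital_of", o))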
Most knowledge graphs are highly incomplete and manually adding new facts is costly, which drives the development of methods that can automatically infer missing facts. This thesis introduces three knowledge graph representation methods, each applied to the link prediction task. The first contribution is HypER, a convolutional model that simplifies the existing state-of-the-art convolutional model ConvE, improves upon its link prediction performance, and can be explained mathematically in terms of constrained tensor factorisation. Drawing inspiration from this tensor factorisation view of HypER, the second contribution is TuckER, a relatively straightforward linear knowledge graph representation model which, at the time of its introduction, achieved state-of-the-art link prediction performance across standard datasets. With a specific focus on representing hierarchical knowledge graph relations, the third contribution is MuRP, the first multi-relational graph representation model embedded in hyperbolic space. MuRP outperforms existing models, including its Euclidean counterpart MuRE, in link prediction on hierarchical knowledge graph relations whilst requiring far fewer dimensions. Since their publication, all of the above models have influenced a range of subsequent developments in the knowledge graph representation field.
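As a concrete illustration of the tensor factorisation view, the sketch below computes a TuckER-style score, i.e. the Tucker decomposition product of a shared core tensor with the subject, relation and object embeddings, followed by a sigmoid to give the probability that the triple holds. The embedding sizes and random initialisation are placeholder assumptions; in the thesis these parameters are learned from data.

    import numpy as np

    rng = np.random.default_rng(0)
    d_e, d_r = 200, 30  # assumed entity / relation embedding dimensions

    W = rng.standard_normal((d_e, d_r, d_e))  # shared core tensor
    e_s = rng.standard_normal(d_e)            # subject entity embedding
    w_r = rng.standard_normal(d_r)            # relation embedding
    e_o = rng.standard_normal(d_e)            # object entity embedding

    # phi(e_s, w_r, e_o) = W x_1 e_s x_2 w_r x_3 e_o: contract the core
    # tensor with the three embeddings, one along each of its modes
    score = np.einsum('ijk,i,j,k->', W, e_s, w_r, e_o)

    # A sigmoid turns the score into a probability that the triple is true
    prob = 1.0 / (1.0 + np.exp(-score))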
Despite the development of a large number of knowledge graph representation models with steadily increasing predictive performance, relatively little is known about the latent structure they learn. We generalise recent theoretical understanding of how the semantic relations of similarity, paraphrase and analogy are encoded in the geometric interactions of word embeddings to how more general relations, such as those found in knowledge graphs, can be encoded in their representations. This improved theoretical understanding can be used to aid the design of future knowledge graph representation models, as well as to improve models that incorporate logical rules between relations into their representations or that learn jointly from multiple data sources (e.g. knowledge graphs and text).
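For intuition about the word-embedding results being generalised, the classic analogy example below shows a semantic relation realised as a shared vector offset between embedding pairs; the thesis extends this kind of geometric account to the more general relations found in knowledge graphs. The three-dimensional vectors are invented purely for illustration.

    import numpy as np

    # Toy embeddings chosen so that the "male -> female" relation is a
    # constant offset; real word embeddings satisfy this only approximately.
    emb = {
        "man":   np.array([1.0, 0.0, 0.0]),
        "woman": np.array([1.0, 1.0, 0.0]),
        "king":  np.array([1.0, 0.0, 1.0]),
        "queen": np.array([1.0, 1.0, 1.0]),
    }

    offset = emb["woman"] - emb["man"]           # the relation as a vector offset
    predicted = emb["king"] + offset             # apply it to a new subject
    assert np.allclose(predicted, emb["queen"])  # recovers the analogy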