Graph-based broad-coverage semantic parsing
Many broad-coverage meaning representations can be characterized as directed graphs, where nodes represent semantic concepts and directed edges represent semantic relations among the concepts. The task of semantic parsing is to generate such a meaning representation from a sentence. It is quite natural to adopt a graph-based approach for parsing, where nodes are identified conditioning on the individual words, and edges are labeled conditioning on the pairs of nodes. However, there are two issues with applying this simple and interpretable graph-based approach for semantic parsing: first, the anchoring of nodes to words can be implicit and non-injective in several formalisms (Oepen et al., 2019, 2020). This means we do not know which nodes should be generated from which individual word and how many of them. Consequently, it makes a probabilistic formulation of the training objective problematical; second, graph-based parsers typically predict edge labels independent from each other. Such an independence assumption, while being sensible from an algorithmic point of view, could limit the expressiveness of statistical modeling. Consequently, it might fail to capture the true distribution of semantic graphs. In this thesis, instead of a pipeline approach to obtain the anchoring, we propose to model the implicit anchoring as a latent variable in a probabilistic model. We induce such a latent variable jointly with the graph-based parser in an end-to-end differentiable training. In particular, we test our method on Abstract Meaning Representation (AMR) parsing (Banarescu et al., 2013). AMR represents sentence meaning with a directed acyclic graph, where the anchoring of nodes to words is implicit and could be many-to-one. Initially, we propose a rule-based system that circumvents the many-to-one anchoring by combing nodes in some pre-specified subgraphs in AMR and treats the alignment as a latent variable. Next, we remove the need for such a rule-based system by treating both graph segmentation and alignment as latent variables. Still, our graph-based parsers are parameterized by neural modules that require gradient-based optimization. Consequently, training graph-based parsers with our discrete latent variables can be challenging. By combing deep variational inference and differentiable sampling, our models can be trained end-to-end. To overcome the limitation of graph-based parsing and capture interdependency in the output, we further adopt iterative refinement. Starting with an output whose parts are independently predicted, we iteratively refine it conditioning on the previous prediction. We test this method on semantic role labeling (Gildea and Jurafsky, 2000). Semantic role labeling is the task of predicting the predicate-argument structure. In particular, semantic roles between the predicate and its arguments need to be labeled, and those semantic roles are interdependent. Overall, our refinement strategy results in an effective model, outperforming strong factorized baseline models.