Predicting the content of peer-to-peer interactions
Software agents interact to solve tasks, the details of which need to be described in a language understandable by all the actors involved. Ontologies provide a formalism for defining both the domain of the task and the terminology used to describe it. However, finding a shared ontology has proved difficult: different institutions and developers have different needs and formalise them in different ontologies. In a closed environment it is possible to force all the participants to share the same ontology, while in open and distributed environments ontology mapping can provide interoperability between heterogeneous interacting actors. However, conventional mapping systems focus on acquiring static information and on mapping whole ontologies, which is infeasible in open systems. This thesis presents a different approach to the problem of heterogeneity. It starts from the intuitive idea that when similar situations arise, similar interactions are performed. If the interactions between actors are specified in formal scripts, shared by all the participants, then when the same situation arises, the same script is used. The main hypothesis that this thesis aims to demonstrate is that, by analysing different runs of these scripts, it is possible to create a statistical model of the interactions that reflects the frequency of terms in messages and of ontological relations between terms in different messages. The model is then used during a run of a known interaction to compute the probability distribution for terms in received messages. The probability distribution provides additional information, contextual to the interaction, that a traditional ontology matcher can use to improve efficiency, by restricting the comparisons to the most likely ones given the context, and possibly also recall and precision, in particular by helping disambiguation.
The ability to create a model that reflects real phenomena in this sort of environment is evaluated by analysing the quality of the predictions, in particular by verifying how various features of the interactions, such as their non-stationarity, affect the predictions. The actual improvements that the predictor brings to a matcher we developed are also evaluated. The overall results are promising: using the predictor can reduce the overall computation time for matching by a factor of ten, while maintaining or in some cases improving recall and precision.
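As a rough illustration of the frequency-based part of the idea, the following Python sketch counts which terms appeared in a given message of a given script over past runs, turns the counts into a probability distribution, and returns the few most likely terms a matcher might compare first. It is a minimal sketch under stated assumptions, not the model developed in the thesis: the names `TermPredictor`, `observe`, `distribution`, `likely_terms`, `script_id`, `slot` and `mass` are all hypothetical, and the sketch ignores ontological relations between terms in different messages as well as non-stationarity.

```python
from collections import Counter, defaultdict


class TermPredictor:
    """Hypothetical frequency model over terms seen in past runs of shared scripts."""

    def __init__(self):
        # (script_id, slot) -> Counter of terms observed in that message slot.
        self._counts = defaultdict(Counter)

    def observe(self, script_id, slot, term):
        """Record that `term` filled `slot` in one run of the script `script_id`."""
        self._counts[(script_id, slot)][term] += 1

    def distribution(self, script_id, slot):
        """Return relative frequencies of the terms seen so far in this slot."""
        counts = self._counts.get((script_id, slot))
        if not counts:
            return {}
        total = sum(counts.values())
        return {term: n / total for term, n in counts.items()}

    def likely_terms(self, script_id, slot, mass=0.9):
        """Return the most frequent terms, stopping once they cover `mass` probability."""
        ranked = sorted(
            self.distribution(script_id, slot).items(),
            key=lambda item: item[1],
            reverse=True,
        )
        chosen, covered = [], 0.0
        for term, probability in ranked:
            if covered >= mass:
                break
            chosen.append(term)
            covered += probability
        return chosen
```

In this sketch, a matcher would call `likely_terms` when a message arrives and compare the received term only against the returned candidates, falling back to a full comparison when the list is empty or no candidate matches.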