Edinburgh Research Archive

Automatic construction and updating of knowledge base from log data

dc.contributor.advisor
Bundy, Alan
dc.contributor.advisor
Pan, Jeff
dc.contributor.advisor
Nuamah, Kwabena
dc.contributor.author
Zhu, Ricky
dc.contributor.sponsor
other
dc.date.accessioned
2024-08-07T14:52:09Z
dc.date.available
2024-08-07T14:52:09Z
dc.date.issued
2024-08-07
dc.description.abstract
Large software systems can be very complex, and they become even more complex under the recently popular microservice architecture, where more components interact in larger systems. Plain model-based and data-driven diagnosis approaches can be used for fault detection, but they are usually opaque and demand massive computing power. Knowledge-based methods, on the other hand, have been shown to be not only effective but also explainable and human-friendly for tasks such as fault analysis, but they depend on having a knowledge base. Constructing and maintaining knowledge bases is not a trivial problem; it is referred to as the knowledge bottleneck. Software system logs are the primary and most readily available data, and sometimes the only available data, that record system runtime information, which is critical for software system Operation and Maintenance (O&M). I proposed the TREAT framework, which automates the construction and updating of a knowledge base from a continual stream of logs. The knowledge base aims to reflect, as faithfully as possible, the latest states of the assisted software system and to facilitate downstream tasks, typically fault localisation. To the best of our knowledge, this is the first effort to construct a fully automated, ever-updating knowledge base from logs that aims to reflect the changing internal states of a software system. To evaluate the TREAT framework, I devised a knowledge-based solution to fault localisation, involving logic programming and inductive logic programming, that makes use of a TREAT-powered knowledge base, and I conducted empirical experiments with this solution on a real-life 5G network test bed system.
Since evaluating the TREAT framework through fault localisation is indirect and involves many confounding factors, e.g., the specific solution to fault localisation, I developed a novel method called LP-Measure that directly assesses the quality of a given knowledge base, in particular the robustness and redundancy of a knowledge graph. In addition, although the extracted knowledge is of high quality in general, the knowledge extraction process also introduces errors. I surveyed ways to quantify the uncertainty in the knowledge extraction process and to assign a probability of correct extraction to every piece of knowledge. This led to a deep investigation into probability calibration and knowledge graph embeddings, specifically testing and confirming the phenomenon of uncalibrated probabilities in knowledge graph embeddings and examining how to choose specific calibration models from the existing toolbox.
dc.identifier.uri
https://hdl.handle.net/1842/42068
dc.identifier.uri
http://dx.doi.org/10.7488/era/4790
dc.language.iso
en
dc.publisher
The University of Edinburgh
dc.relation.hasversion
Xue Li, Alan Bundy, Ruiqi Zhu, Fangrong Wang, Stefano Mauceri, Lei Xu, and Jeff Z. Pan. ABC in root cause analysis: Discovering missing information and repairing system failures. In International Conference on Machine Learning, Optimization, and Data Science, pages 346–359. Springer, 2022.
dc.relation.hasversion
Fangrong Wang, Alan Bundy, Xue Li, Ruiqi Zhu, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. LEKG: A system for constructing knowledge graphs from log extraction. In The 10th International Joint Conference on Knowledge Graphs, IJCKG’21, pages 181–185, New York, NY, USA, 2021. Association for Computing Machinery.
dc.relation.hasversion
Ricky Zhu, Alan Bundy, Fangrong Wang, Xue Li, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. Assessing the quality of a knowledge graph via link prediction tasks. In 7th International Conference on Natural Language Processing and Information Retrieval, pages 1–10. Association for Computing Machinery, 2023.
dc.relation.hasversion
Ruiqi Zhu, Fangrong Wang, Alan Bundy, Xue Li, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. A closer look at probability calibration of knowledge graph embedding. In Proceedings of the 11th International Joint Conference on Knowledge Graphs, pages 104–109, 2022.
dc.subject
knowledge bases
dc.subject
Software system logs
dc.subject
Operation and Maintenance (O&M)
dc.subject
TREAT framework
dc.title
Automatic construction and updating of knowledge base from log data
dc.type
Thesis or Dissertation
dc.type.qualificationlevel
Doctoral
dc.type.qualificationname
PhD Doctor of Philosophy

Files

Original bundle

Name:
ZhuR_2024.pdf
Size:
3.92 MB
Format:
Adobe Portable Document Format