Automatic construction and updating of knowledge base from log data

Zhu, Ricky


dc.contributor.advisor

Bundy, Alan

dc.contributor.advisor

Pan, Jeff

dc.contributor.advisor

Nuamah, Kwabena

dc.contributor.author

Zhu, Ricky

dc.contributor.sponsor

other


dc.date.accessioned

2024-08-07T14:52:09Z

dc.date.available

2024-08-07T14:52:09Z

dc.date.issued

2024-08-07

dc.description.abstract

Large software systems can be very complex, and they become more complex still under the recently popular Microservice Architecture, where larger systems involve more interactions among more components. Plain model-based and data-driven diagnosis approaches can be used for fault detection, but they are usually opaque and demand massive computing power. Knowledge-based methods, on the other hand, have been shown to be not only effective but also explainable and human-friendly for tasks such as Fault Analysis; however, they depend on having a knowledge base. Constructing and maintaining knowledge bases is not a trivial problem, and is referred to as the knowledge bottleneck. Software system logs are the primary and most widely available, and sometimes the only available, data recording system runtime information, which is critical for software system Operation and Maintenance (O&M). I proposed the TREAT framework, which automates the construction and updating of a knowledge base from a continual stream of logs. The knowledge base aims to reflect, as faithfully as possible, the latest states of the assisted software system and to facilitate downstream tasks, typically fault localisation. To the best of my knowledge, this is the first effort to construct a fully automated, ever-updating knowledge base from logs that aims to reflect the changing internal states of a software system. To evaluate the TREAT framework, I devised a knowledge-based solution to fault localisation, involving logic programming and inductive logic programming, that makes use of a TREAT-powered knowledge base, and conducted empirical experiments with this solution on a real-life 5G network test bed system.
Since evaluating the TREAT framework by fault localisation is indirect and involves many confounding factors, e.g., the specific solution to fault localisation, I explored and devised a novel method called LP-Measure that directly assesses the quality of a given knowledge base, in particular the robustness and redundancy of a knowledge graph. In addition, it was observed that although the extracted knowledge is of high quality in general, the knowledge extraction process also introduces errors. I surveyed ways to quantify the uncertainty in the knowledge extraction process and to assign a probability of correct extraction to every piece of knowledge. This led to a deep investigation into probability calibration and knowledge graph embeddings, specifically testing and confirming the phenomenon of uncalibrated probabilities in knowledge graph embeddings and examining how to choose specific calibration models from the existing toolbox.


dc.identifier.uri

https://hdl.handle.net/1842/42068

dc.identifier.uri

http://dx.doi.org/10.7488/era/4790

dc.language.iso

en

dc.publisher

The University of Edinburgh


dc.relation.hasversion

Xue Li, Alan Bundy, Ruiqi Zhu, Fangrong Wang, Stefano Mauceri, Lei Xu, and Jeff Z. Pan. ABC in root cause analysis: Discovering missing information and repairing system failures. In International Conference on Machine Learning, Optimization, and Data Science, pages 346–359. Springer, 2022


dc.relation.hasversion

Fangrong Wang, Alan Bundy, Xue Li, Ruiqi Zhu, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. LEKG: A system for constructing knowledge graphs from log extraction. In The 10th International Joint Conference on Knowledge Graphs, IJCKG’21, pages 181–185, New York, NY, USA, 2021. Association for Computing Machinery


dc.relation.hasversion

Ricky Zhu, Alan Bundy, Fangrong Wang, Xue Li, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. Assessing the quality of a knowledge graph via link prediction tasks. In 7th International Conference on Natural Language Processing and Information Retrieval, pages 1–10. Association for Computing Machinery, 2023


dc.relation.hasversion

Ruiqi Zhu, Fangrong Wang, Alan Bundy, Xue Li, Kwabena Nuamah, Lei Xu, Stefano Mauceri, and Jeff Z. Pan. A closer look at probability calibration of knowledge graph embedding. In Proceedings of the 11th International Joint Conference on Knowledge Graphs, pages 104–109, 2022


dc.subject

knowledge bases


dc.subject

Software system logs


dc.subject

Operation and Maintenance (O&M)


dc.subject

TREAT framework


dc.title

Automatic construction and updating of knowledge base from log data


dc.type

Thesis or Dissertation


dc.type.qualificationlevel

Doctoral


dc.type.qualificationname

PhD Doctor of Philosophy


Files

Original bundle


Name: ZhuR_2024.pdf
Size: 3.92 MB
Format: Adobe Portable Document Format

This item appears in the following Collection(s)

Informatics thesis and dissertation collection