Knowledge Acquisition from Data Bases
Knowledge acquisition from databases is a research frontier for both database technology and machine learning (ML) techniques, and has seen sustained research over recent years. It also acts as a link between the two fields, thus offering a dual benefit. Firstly, since database technology has already found wide application in many fields, ML research stands to gain from this greater exposure and established technological foundation. Secondly, ML techniques can augment the ability of existing database systems to represent, acquire, and process a collection of expertise such as that which forms part of the semantics of many advanced applications (e.g. CAD/CAM). The major contribution of this thesis is the introduction of an efficient induction algorithm to facilitate the acquisition of such knowledge from databases. There are three typical families of inductive algorithms: the generalisation-specialisation based AQ11-like family, the decision tree based ID3-like family, and the extension matrix based family. A heuristic induction algorithm, HCV, based on the newly-developed extension matrix approach, is described in this thesis. By dividing the positive examples (PE) of a specific class in a given example set into intersecting groups and adopting a set of strategies to find a heuristic conjunctive rule in each group which covers all the group's positive examples and none of the negative examples (NE), HCV can find rules in the form of variable-valued logic for PE against NE in low-order polynomial time. The rules generated by HCV are shown empirically to be more compact than the rules produced by AQ11-like algorithms and the decision trees produced by ID3-like algorithms. KEshell2, an intelligent learning database system, which makes use of the HCV algorithm and couples ML techniques with database and knowledge base technology, is also described.
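The grouping-and-covering idea behind the extension matrix approach can be illustrated with a small sketch. This is not the HCV algorithm itself: the abstract does not spell out its strategies, so the code below substitutes a plain greedy choice, and every name in it (`usable`, `has_common_path`, `partition`, `greedy_rule`, `covers`) and the toy data are hypothetical. It assumes discrete attribute values and that no example appears in both PE and NE. A rule is held as a mapping from attribute index to the set of values that attribute must not take, which mirrors a conjunction of variable-valued logic selectors of the form [X_j ≠ {...}].

```python
from typing import Dict, List, Sequence, Set, Tuple

Example = Tuple[str, ...]
Rule = Dict[int, Set[str]]  # attribute index -> values the attribute must NOT take


def usable(group: Sequence[Example], neg: Example, j: int) -> bool:
    """True if attribute j distinguishes every positive in the group from neg."""
    return all(pos[j] != neg[j] for pos in group)


def has_common_path(group: Sequence[Example], negatives: Sequence[Example]) -> bool:
    """True if each negative can be excluded by some attribute shared by the whole group."""
    return all(any(usable(group, neg, j) for j in range(len(neg))) for neg in negatives)


def partition(positives: Sequence[Example], negatives: Sequence[Example]) -> List[List[Example]]:
    """Greedily place each positive into the first group it is compatible with."""
    groups: List[List[Example]] = []
    for pos in positives:
        for group in groups:
            if has_common_path(group + [pos], negatives):
                group.append(pos)
                break
        else:
            groups.append([pos])
    return groups


def greedy_rule(group: Sequence[Example], negatives: Sequence[Example]) -> Rule:
    """Pick attributes one at a time, each excluding as many remaining negatives as possible."""
    uncovered = set(range(len(negatives)))
    rule: Rule = {}
    while uncovered:
        best = max(
            range(len(group[0])),
            key=lambda j: sum(usable(group, negatives[i], j) for i in uncovered),
        )
        hit = {i for i in uncovered if usable(group, negatives[i], best)}
        if not hit:  # only possible if the data are inconsistent
            raise ValueError("a negative example cannot be separated from the group")
        rule.setdefault(best, set()).update(negatives[i][best] for i in hit)
        uncovered -= hit
    return rule


def covers(rule: Rule, example: Example) -> bool:
    """An example satisfies the rule if it avoids every excluded value."""
    return all(example[j] not in banned for j, banned in rule.items())


# Toy data (assumed for illustration): attributes are (outlook, windy).
PE = [("sunny", "no"), ("overcast", "no"), ("overcast", "yes")]
NE = [("rain", "yes"), ("sunny", "yes")]

groups = partition(PE, NE)
rules = [greedy_rule(group, NE) for group in groups]
```

Each rule in `rules` covers every positive example in its group and none of the negative examples, so the disjunction of the rules describes PE against NE. The partition step and each rule search make a bounded number of passes over the examples and attributes, which gives a rough feel for why a polynomial bound is plausible for this style of algorithm; the thesis's own complexity result applies to HCV, not to this sketch.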