dc.contributor.advisor | Atkinson, Malcolm | |
dc.contributor.advisor | Papapanagiotou, Petros | |
dc.contributor.advisor | Fleuriot, Jacques | |
dc.contributor.author | Zhao, Rui | |
dc.date.accessioned | 2022-12-15T12:06:52Z | |
dc.date.available | 2022-12-15T12:06:52Z | |
dc.date.issued | 2022-12-15 | |
dc.identifier.uri | https://hdl.handle.net/1842/39615 | |
dc.identifier.uri | http://dx.doi.org/10.7488/era/2864 | |
dc.description.abstract | Data “Terms of Use” (ToU) are widespread and go by different names, such as “Privacy Policy” or “Data Consent”; everyone handling data has to deal with them. They are, in general, a form of data-governance rules, which include not only access controls but also more general terms such as obligations. At present, the design and handling of data-governance rules is often polarized: either open data with almost no governance rules, or restricted data with tight governance rules accompanied by applications, training, supervision, etc. This poses challenges for researchers, especially when they combine data from different sources and share their results with others. Existing research on automated compliance handling falls into two major categories: single-infrastructural and data-flow tracking. The two have different properties and features, but both normally target only access-control policies, and both fall short in supporting rule combination for multi-input-multi-output (MIMO) processes and therefore for arbitrary directed acyclic graphs (DAGs).
In this thesis, a novel extensible language is introduced, designed for MIMO processes and the DAGs composed from them. It contains two parts, the data rule and the flow rule, for writing data terms of use and for writing how processes affect those terms of use, respectively. In addition to the policy derivation expected of the data-flow tracking category, it supports obligations, which can also mimic access controls, to demonstrate the language features. The language is formalised using situation calculus, and the reasoning process is explained. Relevant proofs are shown to demonstrate the correctness of the whole-graph reasoning for any DAG, enabling further optimisation of the reasoning. Dr.Aid (an abbreviation of Data Rule Aid), the prototype system implementation, is then introduced and discussed. It takes provenance as the source of data-flow graphs, uses Golog as the situation calculus reasoner, and supports rule identification through the recognizer component. Two provenance schemas, CWLProv and S-Prov, are supported, demonstrating generality across the two main types of workflow management systems: file-oriented and data-streaming. After that, relevant evaluations of the language and the system are presented. Apart from the already-introduced proofs of correctness of the reasoning, the evaluation covers how the proposed language meets all five principles used to evaluate related research, the capacity of the language to encode real-world data ToU, and the capability of the system to support real-world data-use activities in different scientific communities. The evaluation shows that our language model can encode a substantial proportion of real-life data ToU (90% for actioning rules and 74% for all rules), and that our framework has the potential to be used in a wide range of applications.
Limitations and future work are then discussed, along with our vision of future data activities using technologies similar to those proposed in this thesis. We believe the work presented pioneers a productive direction for research in this domain. | en |
dc.language.iso | en | en |
dc.publisher | The University of Edinburgh | en |
dc.relation.hasversion | Computer-supported ethical rules for collaboratively sharing data Zhao, R., Atkinson, M. P., Papapanagiotou, P., Fleuriot, J. & Pagé, C., 17 Oct 2020. 6 p. Research output: Contribution to conference › Paper › peer-review | en |
dc.relation.hasversion | Towards a computer-interpretable actionable formal model to encode data governance rules Zhao, R. & Atkinson, M., 19 Mar 2020, Proceedings of the IEEE eScience 2019 proceedings. San Diego, CA, USA: Institute of Electrical and Electronics Engineers (IEEE), p. 594-603 10 p. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution | en |
dc.relation.hasversion | An Automated Framework for Supporting Data-Governance Rule Compliance in Decentralized MIMO Contexts Zhao, R., 19 Aug 2021, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21. Zhou, Z-H. (ed.). International Joint Conferences on Artificial Intelligence Organization, p. 4929-4930 2 p. Research output: Chapter in Book/Report/Conference proceeding › Conference contribution | en |
dc.relation.hasversion | Dr.Aid: Supporting Data-Governance Rule Compliance for Decentralized Collaboration in an Automated Way Zhao, R., Atkinson, M., Papapanagiotou, P., Magnoni, F. & Fleuriot, J., 18 Oct 2021, In: Proceedings of the ACM on Human-Computer Interaction. 5, CSCW2, 43 p., 460. Research output: Contribution to journal › Article › peer-review | en |
dc.subject | Dr.Aid | en |
dc.subject | data governance | en |
dc.subject | data governance rules | en |
dc.subject | automated compliance handling | en |
dc.subject | single-infrastructural | en |
dc.subject | data-flow tracking | en |
dc.subject | multi-input-multi-output processes | en |
dc.subject | MIMO | en |
dc.subject | directed acyclic graphs | en |
dc.subject | DAGs | en |
dc.subject | Data Rule Aid | en |
dc.subject | CWLProv | en |
dc.subject | S-Prov | en |
dc.title | Dr.Aid: a formal framework assisting compliance with data governance rules | en |
dc.type | Thesis or Dissertation | en |
dc.type.qualificationlevel | Doctoral | en |
dc.type.qualificationname | PhD Doctor of Philosophy | en |
dc.rights.embargodate | 2023-12-15 | en |
dcterms.accessRights | Restricted Access | en |