Show simple item record

dc.contributor.advisor | Sanguinetti, Guido | en
dc.contributor.advisor | Cohen, Shay | en
dc.contributor.author | Scher, Emily Alice | en
dc.date.accessioned | 2021-03-15T15:51:17Z
dc.date.available | 2021-03-15T15:51:17Z
dc.date.issued | 2020-11-30
dc.identifier.uri | https://hdl.handle.net/1842/37528
dc.identifier.uri | http://dx.doi.org/10.7488/era/812
dc.description.abstract | Since the turn of the century, the scope and scale of Synthetic Biology projects have grown dramatically. Instead of limiting themselves to simple genetic circuits, researchers aim for genome-scale organism redesigns, revolutionary gene therapies, and high throughput, industrial scale natural product syntheses. However, the engineering principles adopted by the founders of the field have been applied to Biology in a way that does not fit many modern experiments. This has limited the usefulness of common sequence design paradigms. As experiments have become more complex, the sequence design process has taken up more and more intellectual bandwidth, partially because software tools for DNA design have remained largely unchanged. This thesis will explore software engineering, social science, and machine learning projects aiming to improve the ways in which researchers design novel DNA sequences for Synthetic Biology experiments. Popular DNA design tools will be reviewed, alongside an analysis of the key conceptual metaphors that underlie their workflows. Flaws in the ubiquitous parts-based design model will be demonstrated, and several alternatives will be explored. A tool called Part Crafter (partcrafter.com) will be presented, which aggregates sequence and annotation data from a variety of data sources to allow for rational search over genomic features, as well as the automated production of biological parts for Synthetic Biology experiments. However, Part Crafter’s mode of part creation is more flexible than traditional implementations of parts-based design in the field. Parts are abstracted away from specific manufacturing standards, and as much contextual information as possible is presented alongside parts of interest. Additionally, various types of machine learning models will be presented which predict histone modification occupancy in novel sequences. Current Synthetic Biology design paradigms largely ignore the epigenetic context of designed sequences. A gradient of increasingly complex models will be analysed in order to characterise the complexity of the combinatorial patterns of sequences of these epigenetic proteins. This work was exploratory, serving as a proof of concept for using a variety of increasingly complex models to represent genomic elements, and demonstrating that the parts-based design model is not the only option available to us. The aims of the field of Synthetic Biology become more ambitious every year. In order for the goals of the field to be accomplished, we must be able to better understand the sequences we are designing. The projects presented in this thesis were all completed with the aim of assisting Synthetic Biologists in designing sequences deliberately. By taking into account as much contextual information as possible, including epigenetic factors, researchers will be able to design sequences more quickly and reliably, increasing their chances of achieving the moon shot goals of the field. | en
dc.language.iso | en
dc.publisher | The University of Edinburgh | en
dc.relation.hasversion | Emily Scher, Yisha Luo, Aaron Berliner, Jacqueline Quinn, Carlos Olguin, and Yizhi Cai. GenomeCarver: harvesting genetic parts from genomes to support biological design automation. In 6th International Workshop on Bio-Design Automation, Seattle, WA, 2014. | en
dc.relation.hasversion | Emily Scher, Shay B Cohen, and Guido Sanguinetti. PartCrafter: find, generate and analyze BioParts. Synthetic Biology, 4(1):ysz014, 2019. | en
dc.relation.hasversion | Erika Szymanski and Emily Scher. Models for DNA Design Tools: The Trouble with Metaphors Is That They Don’t Go Away. ACS Synthetic Biology, 8(12):2635–2641, 2019. | en
dc.subject | Synthetic Biology | en
dc.subject | machine learning | en
dc.subject | novel DNA sequence design | en
dc.subject | DNA design software review | en
dc.subject | Part Crafter | en
dc.subject | epigenetic protein prediction | en
dc.title | Human genome interaction: models for designing DNA sequences | en
dc.type | Thesis or Dissertation | en
dc.type.qualificationlevel | Doctoral | en
dc.type.qualificationname | PhD Doctor of Philosophy | en

