Visualising Plasmodium falciparum functional genomic data in MaGnET: malaria genome exploration tool
Sharman, Joanna Louise
Malaria affects the lives of 500 million people around the world each year. The disease is caused by protozoan parasites of the genus Plasmodium, whose ability to evade the immune system and quickly evolve resistance to drugs poses a major challenge for disease control. The results of several Plasmodium genome sequencing projects have revealed how little is known about the function of their genes (over half of the approximately 5400 genes in Plasmodium falciparum, the most deadly human parasite, are annotated as "hypothetical"). Recently, several large-scale studies have attempted to shed light on the processes in which genes are involved; for example, DNA microarrays have been used to profile the parasite's gene expression. With the emergence of varied types of functional genomic data comes a need for effective tools that allow biologists (and bioinformaticians) to explore these data. The goal of exploration or browsing-style analyses will typically be to derive clues towards the function of thus far uncharacterised gene products, and to formulate experimentally testable hypotheses. Graphical interfaces to individual data sets are beneficial in this endeavour. However, effective visual data exploration also requires that interfaces to different functional genomic data are integrated and that the user can carry forward a selected group of genes (not merely one at a time) across a variety of data sets. Non-expert users especially benefit from workbench-like tools offering access to the data in this way. Still, very few of the currently available public software tools implement such functionality. This work introduces a novel software tool for the integrated visualisation of functional genomic data relating to P. falciparum: the Malaria Genome Exploration Tool (MaGnET). MaGnET consists of a light-weight Java program for effective visualisation linked to a MySQL database for data storage.
In order to maximise accessibility, the program is publicly available over the World Wide Web (http://www.malariagenomeexplorer.org/). MaGnET incorporates a Genome Viewer for visualising the location of genomic features, a Protein-Protein Interaction Viewer for visualising networks of experimentally determined interactions and an Expression Data Viewer for displaying mRNA and protein expression data. Complex database queries can easily be constructed in the Data Analysis Viewer. An advantage over most other tools is that all sections are fully integrated, allowing users to carry selected groups of genes across different data sets. Furthermore, MaGnET provides advanced visualisation features, including mapping of expression data onto genomic locations or protein-protein interaction networks. The inclusion of available third-party Java software has expanded the visualisation capability of MaGnET; for example, the Jmol viewer has been incorporated for viewing 3-D protein structures. An effort has been made to include in MaGnET only data of at least reasonable quality. The MaGnET database collates experimental data from various public Plasmodium resources (e.g. PlasmoDB) and from published functional genomic studies, such as DNA microarray experiments. In addition, through careful filtering and labelling we have been able to include some predicted annotation that has not been experimentally confirmed, such as Gene Ontology and InterPro functional assignments and modelled protein structures. The application of MaGnET to malaria biology is demonstrated through a series of small studies. Initial examples show how MaGnET can be used to effectively demonstrate results from previously published analyses. MaGnET is then used to make a set of predictions about the possible functions of selected uncharacterised genes and to suggest follow-up experiments.