Edinburgh Research Archive

Proteogenomics for precision medicine: overcoming challenges in specificity, reproducibility, and data integration

Abstract

The integration of proteomics with genomics, termed proteogenomics, offers an unprecedented opportunity to deepen our understanding of human biology, disease mechanisms, and biomarker discovery. Proteogenomics leverages protein abundance data to inform genomic analyses, refining our ability to detect and interpret variations in protein levels that underlie health and disease. Despite its promise, proteogenomics faces several methodological and technological challenges, particularly in the precision, scalability, and reproducibility of proteomic measurements. Multiplex proteomics platforms have evolved significantly over the last decade and now provide extensive coverage of the human proteome. Traditional techniques such as mass spectrometry have been challenged by affinity-based proteomics, including SomaLogic's SomaScan and Olink's Proximity Extension Assay, which have vastly expanded the depth of protein quantification to thousands of proteins. A fundamental challenge in proteogenomics today is target specificity and standardization: while genetic studies rely on the stability of DNA, proteins are highly dynamic, influenced by post-translational modifications, protein-protein interactions, and environmental conditions. Nonetheless, proteogenomics holds transformative potential in clinical and translational research. Large proteomic datasets are already revealing new pathways in health and disease and demonstrating predictive power in disease-onset modelling, outperforming traditional clinical biomarkers. The increasing integration of machine learning into proteogenomics workflows promises to enhance biomarker discovery and clinical translation. This thesis explores the application of state-of-the-art proteomics technologies in proteogenomics, emphasizing their strengths, limitations, and implications for biomarker discovery and disease association studies. The work presented herein aims to address key challenges in proteomic specificity, cross-platform reproducibility, and the integration of proteomic data with genomic insights in small cohorts to advance precision medicine.
