Genomic insights into cardiovascular traits from deep phenotyping of large-scale population cohorts

PhD defence of M. Yeung

Cardiovascular diseases (CVDs) represent a significant global health burden: they affect over 500 million individuals worldwide and are the leading cause of premature death and disability. Metabolic diseases such as diabetes, hypertension, and hyperlipidaemia are well-established risk factors for the development of cardiovascular disease. Human genetics plays a pivotal role in understanding the multifactorial nature of cardiometabolic diseases, which are influenced by complex interactions between genetic and environmental factors. Large-scale population-based biobanks, such as the UK Biobank, have accelerated genetic discoveries in cardiometabolic diseases by providing extensive phenotypic data, including biochemical measurements and imaging data relevant to the cardiovascular system.

This dissertation by Ming Yeung describes analyses of large-scale biobank datasets using state-of-the-art machine learning methods, such as deep learning. We developed analytic pipelines that enable efficient processing of high-dimensional medical data, including millions of health records, measurements, and medical images, to expand the phenomic space. We then illustrate the biological relevance of these emerging biomarkers and phenotypes by conducting genome-wide association studies, which led to the identification of hundreds of genetic loci. Genetic components shared between these phenotypes and multiple cardiovascular diseases, such as coronary artery disease, peripheral artery disease, aortic aneurysm, and arrhythmias, support the utility of these enhanced phenotypes in studying the different facets of the cardiovascular system. By combining additional -omic datasets and external knowledge bases, we further provide evidence for candidate genes and pathways underlying the associations.
In summary, this dissertation highlights the value of deep phenotyping and machine learning in advancing our understanding of the genetic basis of cardiovascular diseases.