Deep Learning predicts 3D structure of genome

Predicting the impact of non-coding genetic variation requires interpreting it in the context of 3D genome architecture. We have developed deepC, a transfer-learning-based deep neural network that accurately predicts genome folding from megabase-scale DNA sequence. DeepC predicts domain boundaries at high resolution, learns the sequence determinants of genome folding, and predicts the impact of both large-scale structural variations and single-base-pair variations.

Our genome is not stretched out in a straight line; it is folded in complicated ways in order to fit inside the nucleus of a cell. How it is folded influences which parts of the genome can interact with which others. As a result, folding is actively regulated by signals in the genome sequence itself.

In our paper, "DeepC: predicting 3D genome folding using megabase-scale transfer learning", we show that deep neural networks can predict maps of these interactions directly from the DNA sequence. By interrogating the neural network, we can learn which sequence signals drive the interactions. Having an "in silico" model of this process also allows us to predict the impact of mutations on these interactions, and therefore to assess their possible impact on downstream phenotypes.
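
To make the idea of an "in silico" mutation experiment concrete, here is a minimal Python sketch. It is not deepC's code or API. The function names (`one_hot`, `mutate`, `variant_effect`) are hypothetical, and `toy_predict` is a made-up stand-in for a trained network that simply reports GC content per window. The sketch only shows the general pattern: predict for the reference sequence, predict again for the mutated sequence, and compare the two.

```python
from typing import Callable, List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode DNA as rows of 4 values (A, C, G, T); other symbols become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def mutate(seq: str, pos: int, alt: str) -> str:
    """Return a copy of seq with the base at 0-based position pos replaced by alt."""
    if alt not in BASES:
        raise ValueError(f"alt must be one of {BASES}, got {alt!r}")
    if not 0 <= pos < len(seq):
        raise IndexError("pos is outside the sequence")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(
    predict: Callable[[List[List[float]]], List[float]],
    seq: str,
    pos: int,
    alt: str,
) -> float:
    """Mean absolute difference between reference and mutant predictions."""
    ref_pred = predict(one_hot(seq))
    alt_pred = predict(one_hot(mutate(seq, pos, alt)))
    return sum(abs(r - a) for r, a in zip(ref_pred, alt_pred)) / len(ref_pred)


def toy_predict(encoded: List[List[float]]) -> List[float]:
    """Hypothetical stand-in for a trained model: GC fraction per 4-base window.

    A real model would return predicted interaction values instead.
    """
    window = 4
    out = []
    for start in range(0, len(encoded) - window + 1, window):
        rows = encoded[start:start + window]
        out.append(sum(row[1] + row[2] for row in rows) / window)
    return out


reference = "ACGTACGTAAAA"
effect = variant_effect(toy_predict, reference, pos=8, alt="G")
print(f"Predicted effect of A>G at position 8: {effect:.4f}")
```

Replacing `toy_predict` with a trained sequence-to-interaction model would give the same reference-versus-mutant comparison on real predictions. How the difference is best summarised (mean absolute change here) is a design choice made for this sketch, not something taken from the paper.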

Interested? Read: DeepC: predicting 3D genome folding using megabase-scale transfer learning