New deep learning neural network model predicts physical interactions of protein complexes

From the muscle fibers that move us to the enzymes that replicate our DNA, proteins are the molecular machinery that makes life possible.

The function of proteins is highly dependent on their three-dimensional structure, and researchers around the world have long struggled to answer a seemingly simple question that links function and form: if you know the building blocks of these molecular machines, can you predict how they are assembled into their functional form?

It is not so easy to answer this question. With complex structures dependent on complex physical interactions, researchers turned to artificial neural network models — mathematical frameworks that convert complex patterns into digital representations — to predict and “see” the shape of proteins in 3D.

In a new article published in Nature Communications, researchers at Georgia Tech and Oak Ridge National Laboratory are using one such model, AlphaFold 2, to predict not only the biologically active conformation of individual proteins, but also functional protein pairings called complexes.

The work could help researchers bypass long experiments to study the structure and interactions of large-scale protein complexes, said Jeffrey Skolnick, Regents Professor and Mary and Maisie Gibson Chair in the School of Biological Sciences and one of the study’s corresponding authors, adding that computer models like these could mean big things for the field.

If these new computer models are successful, Skolnick said, “it could fundamentally change the way biological molecular systems are studied.”

Prepared for protein prediction

Created by London-based artificial intelligence lab DeepMind, AlphaFold 2 is a deep learning neural network model designed to predict the three-dimensional structure of a single protein based on its amino acid sequence.

Skolnick and fellow corresponding author Mu Gao, a principal investigator in the School of Biological Sciences, shared that AlphaFold 2 performed very well in blind tests conducted at the 14th iteration of the Critical Assessment of Techniques for Protein Structure Prediction, or CASP14, a biennial community experiment in which researchers from around the world test their computational models.

“For us, what is striking about AlphaFold 2 is that it not only makes excellent predictions about individual protein domains (the basic structural or functional modules of a protein sequence), but it also works very well on protein sequences composed of multiple domains,” Skolnick shared. And so, with the ability to predict the structure of these complex, multi-domain proteins, the research team set out to determine if the program could go one step further.

“The physical interactions between the different [protein] domains of the same sequence are essentially the same as the interactions that glue different proteins together. It quickly became apparent that relatively simple modifications of AlphaFold 2 could allow it to predict the structural patterns of a protein complex.”

Mu Gao, Corresponding Author and Principal Investigator, School of Biological Sciences, Georgia Institute of Technology

To explore different strategies, Davi Nakajima An, a fourth-year student in the School of Computer Science, was recruited to join the team’s effort.

Instead of feeding the features of a single protein sequence into AlphaFold 2, as its original design intended, the researchers joined the input features of multiple protein sequences together. They combined this with new metrics for assessing the strength of interactions between the proteins being probed, and the result was their new program, AF2Complex.
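The idea of joining the input features of several sequences can be illustrated with a minimal sketch. This is not the AF2Complex code: the function names, the feature keys, the one-hot encoding and the chain_gap offset are all assumptions made for illustration, and a real AlphaFold 2 input contains many more features (for example, multiple sequence alignments).

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as an (L, 20) one-hot matrix."""
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(sequence):
        encoded[pos, AA_INDEX[aa]] = 1.0
    return encoded


def join_chain_features(sequences, chain_gap=200):
    """Concatenate per-chain features into a single multi-chain input.

    chain_gap is a hypothetical offset added to the residue numbering
    between chains, so that a model reading the numbering does not
    treat separate proteins as one continuous chain.
    """
    aatype, residue_index, chain_id = [], [], []
    offset = 0
    for chain_number, seq in enumerate(sequences):
        aatype.append(one_hot(seq))
        residue_index.append(np.arange(len(seq)) + offset)
        chain_id.append(np.full(len(seq), chain_number))
        offset += len(seq) + chain_gap
    return {
        "aatype": np.concatenate(aatype, axis=0),
        "residue_index": np.concatenate(residue_index),
        "chain_id": np.concatenate(chain_id),
    }


# Two short, made-up sequences standing in for two proteins.
features = join_chain_features(["MKTAYIAK", "GSHMLE"])
print(features["aatype"].shape)
print(features["residue_index"])
```

The joined arrays describe both proteins as one input, while the chain labels and the numbering gap keep track of where one protein ends and the next begins.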

Charting new territory

To put AF2Complex to the test, the researchers teamed up with the high-performance computing center, Partnership for an Advanced Computing Environment (PACE), at Georgia Tech, and tasked the model with predicting the structures of protein complexes it had never seen before. The modified program was able to correctly predict the structure of more than twice as many protein complexes as a more traditional method called docking. While AF2Complex only needs protein sequences as input, docking relies on prior knowledge of individual protein structures to predict their combined structure based on complementary shapes.

“Encouraged by these promising results, we extended this idea to an even larger problem, which is to predict the interactions between several arbitrarily chosen proteins, for example, in a simple case, two arbitrary proteins,” shared Skolnick.

In addition to predicting the structure of protein complexes, AF2Complex was tasked with identifying which of more than 500 pairs of proteins were able to form a complex. Using newly designed metrics, AF2Complex outperformed conventional docking methods and AlphaFold 2 in identifying arbitrary pairs that are known from experiments to interact.

To test AF2Complex at the scale of the proteome, which encompasses an organism’s entire library of expressible proteins, the researchers turned to the Summit supercomputer at the Oak Ridge Leadership Computing Facility, the second-largest supercomputing center in the world. “Thanks to this resource, we were able to apply AF2Complex on approximately 7,000 protein pairs of the bacterium E. coli,” shared Gao.

In this test, the team’s new model not only identified many pairs of proteins known to form complexes, but was also able to provide information on “suspected but never observed experimentally” interactions, said Gao.

Digging deeper into these interactions has revealed a potential molecular mechanism for protein complexes particularly important for energy transport. These protein complexes are known to carry hemes, essential metabolites that give blood a dark red color.

Using the structural models predicted by AF2Complex, Jerry M. Parks, senior research and development scientist at Oak Ridge National Laboratory and collaborator on the study, was able to place hemes at their suspected reaction sites within the structure. “These computer models now provide insights into the molecular mechanisms of how this biomolecular system works,” Gao said.

“Deep learning is changing the way you study a biological system,” Skolnick added. “We envision that methods like AF2Complex will become powerful tools for any biologist who wishes to understand the molecular mechanisms of a biosystem involving protein interactions.”


Georgia Institute of Technology

Journal reference:

Gao, M. et al. (2022) AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nature Communications.
