Shock AI Discovery Suggests We Haven’t Even Discovered Half of What’s Inside Our Cells
Inside every cell in the human body is a constellation of proteins, millions of them. They all jostle each other, being quickly assembled, folded, packaged, shipped, cut, and recycled in a hive of activity that works at a breakneck pace to keep us alive and going.
But without a complete inventory of the universe of proteins inside our cells, scientists find it difficult to appreciate, at the molecular level, what goes wrong in our bodies to cause disease.
Now, researchers have developed a new technique that uses artificial intelligence to assimilate data from single-cell microscopy images and biochemical analyses, in order to create a “unified map” of subcellular components, half of which, it turns out, have never been seen before.
“Scientists have long understood that there is more that we don’t know than we know, but now we finally have a way to dig deeper,” says computer scientist and network biologist Trey Ideker of the University of California (UC) San Diego.
Microscopes, as powerful as they are, allow scientists to peer inside individual cells, down to the level of organelles such as mitochondria, the powerhouses of cells, and ribosomes, protein factories. We can even add fluorescent dyes to easily label and track proteins.
Biochemistry techniques can go one step further, focusing on single proteins using, for example, targeted antibodies that bind to the protein and pull it out of the cell so that researchers can see what else is attached to it.
The integration of these two approaches is a challenge for cell biologists.
“How do you bridge this gap between the nanometer and micrometer scales? This has long been a major obstacle in the biological sciences,” explains Ideker.
“Turns out you can do it with artificial intelligence – by looking at data from multiple sources and having the system put it together into a cell model.”
The result: Ideker and his colleagues turned textbook diagrams of cells, which give us a bird’s-eye view of candy-colored organelles, into a complex web of protein-protein interactions, organized by the tiny distances between them.
Merging image data from a library called the Human Protein Atlas with existing maps of protein interactions, the machine learning algorithm was tasked with calculating the distances between pairs of proteins.
The aim was to identify communities of proteins, called assemblies, which coexist in cells at different scales, from very small (less than 50 nm) to very “large” (more than 1 μm).
Just shy of 70 protein communities were classified by the algorithm, which was trained using a reference library of proteins with known or estimated diameters, and validated by further experiments.
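To make the idea of communities at different scales concrete, here is a minimal sketch in Python. It is not the researchers’ method: the protein names (`P1` to `P5`), the distances, and the function `communities_at_scale` are all invented for illustration, and a simple distance cutoff stands in for the machine learning the study actually used. It only shows how the same table of pairwise distances can yield small, tight groups at a 50 nm cutoff and larger, looser groups at wider cutoffs.

```python
from itertools import combinations

# Hypothetical proteins and made-up pairwise distances in nanometers.
# None of these values come from the study.
PROTEINS = ["P1", "P2", "P3", "P4", "P5"]
_RAW_DISTANCES_NM = {
    ("P1", "P2"): 20, ("P3", "P4"): 30,
    ("P1", "P3"): 400, ("P1", "P4"): 420,
    ("P2", "P3"): 410, ("P2", "P4"): 430,
    ("P1", "P5"): 2000, ("P2", "P5"): 2100,
    ("P3", "P5"): 2200, ("P4", "P5"): 2300,
}
# frozenset keys make lookups independent of pair order.
DISTANCE_NM = {frozenset(pair): d for pair, d in _RAW_DISTANCES_NM.items()}


def communities_at_scale(proteins, distance_nm, threshold_nm):
    """Group proteins linked by chains of pairs no farther apart than threshold_nm.

    This is plain single-linkage grouping with a union-find structure,
    used here as a stand-in for a real community detection method.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(proteins, 2):
        if distance_nm[frozenset((a, b))] <= threshold_nm:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    # Sort members and groups so the result is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    for threshold in (50, 500, 5000):
        print(threshold, "nm:", communities_at_scale(PROTEINS, DISTANCE_NM, threshold))
```

With these invented numbers, the tight pairs found at 50 nm merge into a larger group at 500 nm, and everything falls into a single community at 5,000 nm, giving a nested picture loosely analogous to the small-to-“large” assemblies described above.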
About half of the protein components identified are apparently unknown to science, never documented in published literature, the researchers suggest.
In the mix was a group of proteins forming a previously unknown structure, which the researchers determined was probably responsible for splicing and dicing the newly created transcripts of the genetic code that are used to make proteins.
Other proteins mapped included transmembrane transport systems that pump supplies in and out of cells, families of proteins that help organize large chromosomes, and protein complexes whose job is to make, well, more protein.
A considerable effort, but this is not the first time that scientists have tried to map the inner workings of human cells.
Other efforts to create reference maps of protein interactions have yielded equally mind-boggling numbers, and others have attempted to measure protein levels across the tissues of the human body.
Researchers have also developed techniques to visualize and track the interaction and movement of proteins in cells.
This pilot study goes one step further by applying machine learning both to cell microscopy images, which locate proteins relative to large cellular landmarks such as the nucleus, and to data from protein interaction studies, which identify a protein’s nearest neighbors at the nanoscale.
“The combination of these technologies is unique and powerful because it is the first time that measurements at very different scales have come together,” says bioinformaticist Yue Qin, also of UC San Diego.
In doing so, the Multi-Scale Integrated Cell, or MuSIC, technique “increases the resolution of imaging while giving protein interactions a spatial dimension, paving the way for the incorporation of various types of data into cellular scale maps of the proteome,” write Qin, Ideker and their colleagues.
To be clear, this research is very preliminary: the team focused on validating their method and only looked at the available data for 661 proteins in one cell type, a kidney cell line that scientists have grown in the lab for five decades.
The researchers plan to apply their new technique to other types of cells, explains Ideker.
But in the meantime, we will have to humbly accept that we are mere intruders inside our own cells, capable of understanding only a small fraction of the total proteome.
“Ultimately, we may be able to better understand the molecular basis of many diseases by comparing the differences between healthy cells and diseased cells,” says Ideker.
The study was published in Nature.