Almost a decade after the U.S. Human Genome Project was completed, scientists say they have mapped the underlying regulatory system that switches DNA on and off, potentially spurring a wave of new research into the molecular basis of complex diseases such as Type 1 diabetes.
Many parts of DNA previously termed “junk” by scientists are, instead, levers that control the genetic activity that can lead alternately to health or illness, according to reports published simultaneously in the journals Science and Nature by the Encode international consortium.
Scientists previously thought that only genes, small pieces of DNA that make up about 1 percent of the genome, have a function. The new findings show that an underlying circuitry exists in which 80 percent of the DNA code within each human cell can contribute to disease. This may be why large studies targeting gene variants haven’t identified treatable causes for many complex maladies, the scientists said. The circuitry can be disrupted at several individual waypoints.
“This takes us from a concentration on individual genes to the whole genome,” said Eric Topol, professor of translational genomics at the Scripps Research Institute in La Jolla, California, in a telephone interview. “This series of articles is amazing, it’s a blitz of information.”
The science consortium identified about 4 million genetic switches, though the researchers expect the number will rise as more discoveries are made, said Ewan Birney, the associate director of the Cambridge-based EMBL-European Bioinformatics Institute in the U.K. The circuitry identified by the group regulates about 20,000 genes, he said in a conference call.
Encode, short for The Encyclopedia of DNA Elements, was started in September 2003, just five months after the U.S. Human Genome Project was declared over. Its broad goal was to identify all elements in the genome that had a function.
The $288 million project, funded by a unit of the U.S. National Institutes of Health, eventually gathered 443 scientists from more than 30 institutions worldwide into the consortium that made the announcements. More than 1,600 experiments on 147 types of tissue were performed.
“It was an extraordinary group response right from the start,” said Tim Hubbard, who leads the Cambridge-based Vertebrate Genome Analysis Project at the Wellcome Trust Sanger Institute in the U.K. “We had a map, but we needed insight into the function of each part of the genome.
“I can’t say there was one person who drove it,” he said. “I think the model provided by the Human Genome Project and the role of the Internet intersected at a certain point to bring many voices together to say this is what we need to do next. It was unusual then, but becoming less so now.”
Six of the studies published yesterday appeared in Nature and two in the journal Science. Several more appeared in Genome Research and Genome Biology, showing the extraordinary range of the material being presented.
The Encode results demonstrate the importance of DNA feedback mechanisms that the genome uses to control itself, said John Stamatoyannopoulos, a study author and associate professor of genome sciences at the University of Washington in Seattle.
While the vast majority of human DNA doesn’t make cellular proteins, the new results suggest it may create the RNA molecules that help regulate when a gene turns on or off and produces specific types of proteins. The non-coding DNA may also boost or muffle a gene’s expression.
“It’s like a brain in every cell,” Stamatoyannopoulos said in a telephone interview.
Scientists studying individual genes and proteins will be able to use the Encode data to gain more insight into regulatory mechanisms in their individual areas of research, said Stephen Elledge, a geneticist at Harvard Medical School in Boston. The data will help researchers better understand how regulatory changes underlying genetic activity might affect people’s risk or severity of disease, he said.
Genome-wide association studies are done by scanning the genomes of many people to find variations linked to disease. About 93 percent of the variants found in this research haven’t involved genes that code for proteins, and few explain the bulk of most complex diseases, Stamatoyannopoulos said.
His study, published in Science, found that 76 percent of these disease-associated variants existed within or near regulatory DNA, suggesting a more complex cause may exist. His group also determined that many complex diseases share some genomic switches, including autoimmune diseases such as asthma, multiple sclerosis, rheumatoid arthritis, Type 1 diabetes and lupus.
“We knew that hidden out there were instructions for turning things off and on and understanding that process was necessary for understanding disease,” Stamatoyannopoulos said.
In a paper in Nature, Job Dekker, a professor of biochemistry at the University of Massachusetts, and his team used three-dimensional models to demonstrate that many regulatory regions work by directly touching genes when the DNA is folded.
The Human Genome Project was a 13-year research effort to identify the approximately 20,000 genes in human DNA and to determine the sequences of the chemical base pairs that make up DNA. The research allowed scientists to understand the sets of genetic instructions found in human cells. In people, the genome is organized into 23 pairs of chromosomes.
The newest results take that road map further.
“This is a story of comprehensiveness,” said Thomas Gingeras, one of the study authors and the head of functional genomics at Cold Spring Harbor Laboratory on New York’s Long Island. “Now we have a large number of these regulatory regions and a sense of when they’re activated.”