A five-year project involving 442 scientists from around the globe has now described for the first time the architecture of the human genome and how it works. It largely dispels the notion of junk DNA and gives a general description of how disease occurs.
The Encyclopedia of DNA Elements, known as ENCODE and funded by the National Institutes of Health, provides the first insight into how genetic switches turn genes on and off in complex processes that involve a major portion of the genome, if not all of it.
The National Human Genome Research Institute held a teleconference Wednesday to announce results that provide a blueprint of human biology.
"What it turned out to be is nothing short of breathtaking," said Eric Green, NHGRI director, describing ENCODE as providing "a functional landscape of the genome.
"This illuminates the blueprint where switches make sure an incredibly complicated choreography works optimally."
When the human genome was sequenced in 2003, it revealed 3 billion chemical base pairs of DNA. These combinations of the four letters representing compounds in DNA's double helix structure make up a huge book that was much like an ancient language yet to be translated.
ENCODE offers the first general translation of those letters, including a catalog of 50,000 genes and 4 million previously unknown switches, along with new information about the complexities of human biology. It provides a foundation for future research, including a focus on specific genes and genetic processes to describe bodily processes and disease.
But the key finding disproved the notion that 98 percent of the human genome was junk DNA -- DNA that serves no apparent biological purpose. Instead, Dr. Green said, ENCODE shows that 80 percent of the genome has some biological function, with expectations that, in time, the entire genome might be proven to play a biological role.
What previously was known as junk DNA actually serves as genetic switches, or regulatory sequences of DNA, that use proteins known as "transcription factors" to turn genes on and off. A more thorough description of how genes, which control all biological processes, are turned on and off represents a breakthrough in the field.
The research project got under way in 2003 with a four-year, $40 million pilot project funded by NIH, followed by the five-year ENCODE project, which NIH also funded at a cost of $123 million. Further funding is expected for the next round of research on the human genome.
The major analysis of ENCODE was published online Wednesday in the journal Nature. A first set of 30 studies and analyses will be published, in a collaborative effort among major scientific journals, in Nature, Science, Genome Research, Genome Biology and the Journal of Biological Chemistry. The ENCODE database is available online.
A team of Penn State University researchers, led by Ross Hardison, a professor of biochemistry and molecular biology, played a major role in the ENCODE project by describing how genetic switches -- sometimes dozens of them at once -- activate genes involved in diseases. The project is expected to lay groundwork to explain major common diseases including 17 cancers that have different triggers but seem to be associated with a common DNA sequence.
"Our genetic studies have found scores to hundreds of places where individual variations in the DNA play a role in determining whether someone likely will get diabetes, cardiovascular disease, cancers, Crohn's disease -- a lot of the diseases that every family has someone affected by," he said. "These are common diseases. These genetic components determine a person's likelihood to get the diseases."
Mr. Hardison, 61, who holds a doctoral degree, said ENCODE is just the start. While researchers still are "a long way from final answers," ENCODE represents a good foundation. "We don't know the full answer, but we just made a tremendous advance," he said. "This should lead to exciting advances that lead us to better health care, and hopefully that will occur in my lifetime."
Understanding these processes, Mr. Hardison said, eventually could lead to new treatment therapies.
The human genome, he said, no longer can be considered a DNA wasteland. He said the project has prompted him to view it, instead, as a beautiful vista.
"You now want to think of the genome as a landscape where you would want to go on vacation -- like the Rocky Mountains -- where everywhere looks gorgeous with light and glaciers and mountains and ridges, with fascinating stuff everywhere," he said. "You can be proud of the genome."
ENCODE program director Elise Feingold and Mike Pazin, NHGRI program director in functional genomics, said research before ENCODE focused largely on individual genes or small areas of the genome. But ENCODE focused on the entire system, much the way Google Maps can show the entire nation, with a means to zoom into specific areas.
Mr. Hardison, a key member of the ENCODE team, has had a distinguished career in genetics with expertise in gene biology and expression, Ms. Feingold said. He helped analyze ENCODE data and wrote an important review about using ENCODE to study genetic diseases.
"A big finding -- and Ross was a big part of it -- was finding that in the ENCODE data you can turn to information on how genetic variations are important in human disease," Mr. Pazin said.