In 2001, the Human Genome Project produced the first map of the human genome, although researchers knew it was neither complete nor fully accurate. Scientists have now completed the most comprehensive human genome sequence to date, filling in gaps and fixing errors found in the previous version.
The sequence is also the most comprehensive mammalian reference genome to date. The findings, reported in six new papers published in Science, should lead to a better understanding of human evolution and the discovery of novel targets for treating a variety of disorders.
A more precise human genome
"The Human Genome Project used DNA from blood samples since that was the technology at the time," explains Adam Phillippy, chief of genome informatics at the National Human Genome Research Institute (NHGRI) and senior author of one of the new publications. "The methodologies used at the time introduced faults and gaps that have lasted all these years. It's good to be able to fill in the blanks and repair the errors now."
"We always knew pieces were missing, but I don't believe any of us realized how extensive they were, or how interesting they were," says Michael Schatz, a Johns Hopkins University professor of computer science and biology and another senior author of the same research.
This research comes from the Telomere to Telomere consortium, which is financed by NHGRI and includes genomic and computational biology experts from dozens of institutes throughout the world. Geneticists have been working to fill in the gaps in the first draft sequence one by one ever since it was published, and the team concentrated on the 8% of the human genome that was still missing. The most recent research uncovers nearly a full chromosome's worth of new sequences, totaling 200 million additional base pairs (the letters that make up the genome) and 1,956 new genes.
"We've declared triumph a few times over the previous two decades since the Human Genome Project [in 2001]," says Evan Eichler, a professor of genome sciences at the University of Washington and another senior author of one of the publications. According to Eichler, who was also involved in mapping that original sequence, the emphasis of what has been sequenced this time around is different. "While the original goal of the Human Genome Project was to arrange and orient every base pair, the technology wasn't mature enough to accomplish this. As a result, we completed the components that we were able to complete."
The promise of the new findings
The newly sequenced parts include previously inaccessible regions such as centromeres, the tightly coiled central portions of chromosomes that keep the lengthy double strands of DNA orderly as they unwind, bit by bit, to copy themselves when a single cell divides into two. These regions are important for normal human development and also play a role in brain growth and neurodegenerative disorders. "The fact that all eukaryotes (all plants, animals, people, trees, flowers, and higher organisms) have centromeres has long been one of biology's great mysteries. It's an essential component of how DNA replicates, chromosomes arrange, and cells divide. However, despite the fact that its role has existed for billions of years, it was nearly impossible to examine because we didn't have a centromere sequence to look at," Schatz explains. "At long last, we do."
Scientists were also able to sequence lengthy segments of DNA made up of repeating sequences, which genetic specialists had previously dismissed as "junk DNA" because they looked like copying errors. These repetitive sequences, however, may play a role in the development of some human disorders. "Just because something is repetitive doesn't imply it's garbage," Eichler argues. He points out that critical genes are embedded in these repeated regions, including genes that contribute to protein-making machinery, genes that control how cells divide and split their DNA evenly between their two daughter cells, and human-specific genes that may help distinguish us from our closest evolutionary relatives, the other primates. In one of the articles, researchers report that other primates have different numbers of copies of these repetitive sections than humans do, and that the copies appear in different locations in the genome.
"These are some of the most critical functions for living and for making us human," Eichler explains. "Obviously, if you don't have these genes, you won't live. To me, that's not garbage."
Deciphering what these repeated sections mean, if anything, and how the sequences of previously unsequenced regions like the centromeres will translate to new therapies or a better understanding of human disease is just getting started, according to Deanna Church, a vice president at the genome engineering company Inscripta, who wrote a commentary accompanying the scientific articles. Sequencing a human genome is not the same as decoding it; she points out that today, only about half of patients with suspected genetic illnesses whose genomes have been sequenced can have their conditions linked to particular mutations in their DNA. Much of what the human genome does is still unknown.
Future research
There is still room for improvement. The new sequence represents only about half of a human's genetic content, that is, half of the genetic information ordinarily contained in a person's DNA. A person's chromosomes come in two sets: maternal and paternal. Each set carries slightly different copies of genes, in effect creating two genomes. Assembling those two genomes is a difficult process, and such difficulties impeded the original Human Genome Project, resulting in missing pieces. Because the scientists couldn't easily distinguish between maternal and paternal copies of DNA at the time, they might try to match up certain sections believing they were working with the maternal chromosome, only to hit areas that wouldn't fit because the sections actually came from the paternal chromosome. "It's like having two puzzles in the same box," Phillippy explains. "You must figure out what the differences are and then rebuild both."
To create the new version, the scientists took advantage of a fertilization error in which the resulting embryo had only paternal chromosomes. The resulting tumor was excised, and the cell line was kept alive in the lab in the early 2000s despite its aberrant chromosomal composition. Because the teams were effectively working with a single genetic puzzle, it was easier for them to put the genome together.
Researchers will eventually require a more complete human genome, one that includes the entire sequences of both maternal and paternal chromosomes. That will happen shortly. Phillippy and his colleagues are working with trios of DNA samples, from volunteers and their mothers and fathers, in order to separate maternal from paternal sequences and build the two genomes independently. By the end of the year, the researchers hope to have finished the so-called diploid human genome sequence.
"The new genome assembly is paying dividends because it offers a more precise map for understanding what the data we had before meant," says Winston Timp, associate professor of biomedical engineering at Johns Hopkins and a co-author on one of the articles. One example is finding novel variants that may distinguish healthy people from those affected by disease, as well as variants that may place people at a higher risk of developing specific diseases.
"We uncovered millions of previously unknown genetic variations across samples of thousands of individuals whose genomes have already been sequenced," says Rajiv McCoy, a Johns Hopkins associate professor of biology and another co-author. "We'll have to wait for additional research to understand more about their links to disease, but for now, a large focus of research will be on attempting to find new genetic variants that haven't been studied before."
Although the new version of the human genome is more complete, scientists are unlikely to rush to replace the previous version, gaps and flaws and all. That's because decades of research in human genetics have left the older version far more thoroughly annotated than the new one, much like the difference between your favorite edition of a book, complete with handwritten notes and underlining in the margins, and a brand-new copy from the bookshop. "A genome is only as good as its annotation," Eichler explains. "Based on the previous, gap-ridden genome, all clinical and scientific facilities have amassed decades of data. It would be a nightmare to duplicate all of that work for each individual lab." Many laboratories will gradually transition to working with the new genome, he says, by first comparing smaller datasets in a test run to determine how much richer and more complete the data they get from the new genome is. The new human genome, like the original, is available on a public database for any scientist to use. "For the time being, both genomes will be maintained up to date," he says, adding that there will be no replacement.
Researchers will begin to build more complete genomes in the coming years, utilizing both maternal and paternal DNA, to help scientists identify the best targets for novel medicines and better understand human development and evolution. The more genomes they have, the more potentially relevant patterns will emerge, leading to new insights into human disease and novel therapies. The ultimate objective is for every individual to have their entire genome sequenced as part of their medical record, allowing clinicians to compare their sequences to reference sequences and discover which mutations may be contributing to certain disorders.
"This is providing the world with a whole other chromosome that we have never seen before," says Karen Miga, an associate professor of biomolecular engineering at the University of California, Santa Cruz, and a senior author of one of the articles. "We have new landscapes, new sequences, and the possibility and promise of fresh discoveries."
In the genomics and medical communities, there is a distinct sense of anticipation. "Hallelujah, we finally finished one human genome," Eichler remarked at a press conference. "But the best is yet to come. This should not be seen as the finish, but rather as the start of a shift not just in genetic research but also in clinical treatment."