
The X chromosome is the first human chromosome to be completely sequenced from end to end, with no gaps in the sequence and an unprecedented level of accuracy.
Although the current human reference genome is the most accurate and complete vertebrate genome ever produced, there are still gaps in the DNA sequence, even after two decades of improvements.
Now, for the first time, scientists have determined the complete sequence of a human chromosome from one end to the other ('telomere to telomere') with no gaps and an unprecedented level of accuracy.
The publication of the telomere-to-telomere assembly of a complete human X chromosome July 14 in Nature is a landmark achievement for genomics researchers. Lead author Karen Miga, a research scientist at the UC Santa Cruz Genomics Institute, said the project was made possible by new sequencing technologies that enable "ultra-long reads," such as the nanopore sequencing technology pioneered at UC Santa Cruz.
Repetitive DNA sequences are common throughout the genome and have always posed a challenge for sequencing because most technologies produce relatively short "reads" of the sequence, which then have to be pieced together like a jigsaw puzzle to assemble the genome. Repetitive sequences yield lots of short reads that look almost identical, like a large expanse of blue sky in a puzzle, with no clues to how the pieces fit together or how many repeats there are.
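The jigsaw problem can be made concrete with a toy sketch in Python. It is an illustration only, not the method used for the X chromosome assembly: the sequences, the `genome` and `reads` helpers, and the read lengths are all invented for the example, and it ignores read depth and sequencing errors, which real assemblers have to deal with. Two toy sequences differ only in how many copies of a 7-base repeat sit between the same unique flanks, and the sketch compares the sets of reads each one yields at a short and a long read length.

```python
# Toy illustration (hypothetical sequences, not real genomic data).
LEFT = "ACGTACGGTCAA"    # unique sequence before the repeat array
UNIT = "GATTACA"         # repeat unit
RIGHT = "TTGCCGATAGCT"   # unique sequence after the repeat array


def genome(copies: int) -> str:
    """Build a toy sequence with `copies` tandem copies of UNIT."""
    return LEFT + UNIT * copies + RIGHT


def reads(seq: str, length: int) -> set[str]:
    """Return every distinct error-free read of the given length.

    Simplification: read depth (how often each read is seen) is ignored.
    """
    return {seq[i:i + length] for i in range(len(seq) - length + 1)}


three_copies = genome(3)
five_copies = genome(5)

# Short reads: shorter than the repeat array, so no read links both flanks.
print(reads(three_copies, 10) == reads(five_copies, 10))

# Long reads: long enough to span the three-copy array plus both flanks.
print(reads(three_copies, 45) != reads(five_copies, 45))
```

Run as written, both lines print True. At 10 bases the two sequences produce exactly the same set of reads, so nothing in the reads says whether there are three copies or five. At 45 bases a single read reaches from one unique flank, across the three-copy array, to the other flank, and the two sets of reads no longer match.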
"These repeat-rich sequences were once deemed intractable, but now we've made leaps and bounds in sequencing technology," Miga said. "With nanopore sequencing, we get ultra-long reads of hundreds of thousands of base pairs that can span an entire repeat region, so that bypasses some of the challenges."
Filling in the remaining gaps in the human genome sequence opens up new regions of the genome where researchers can search for associations between sequence variations and disease and for other clues to important questions about human biology and evolution.
"We're starting to find that some of these regions where there were gaps in the reference sequence are actually among the richest for variation in human populations, so we've been missing a lot of information that could be important to understanding human biology and disease," Miga said.