Extreme 2004: Exploring the Deep Frontier

Genome Sequencing

A Celera lab technician works with a DNA sequencing machine. Photos courtesy of www.GenomeNewsNetwork.org/J. Craig Venter Institute.

It's All in the Code!

Genes make each of us what we are. Each cell in our bodies contains our genome, a code of more than 3 billion letters contained in the molecule DNA. Sequencing DNA means figuring out what order the letters appear in -- their sequence. Knowing the sequence helps scientists figure out what kind of genetic information is carried in a particular section of DNA. Some sections contain genes; other sections don't. Some sections may show changes in sequence, called mutations, that can cause disease.

In the past, geneticists studied genes one at a time. Today, scientists study the whole code, a situation comparable to getting a complete story, rather than using individual words to try to figure out a plot.

How do you sequence DNA when such gigantic numbers are involved? If you figured out one letter per second, it would take you longer than a century to sequence a human's DNA, says J. Craig Venter. He's the scientist who led the first sequencing of the genome of a free-living organism, the bacterium Haemophilus influenzae, which can cause ear infections. For that project, published in 1995, his team used automated sequencing machines and computers to do the job of identifying each letter in the code. He later went on to lead one of the two efforts that produced the first draft sequences of the human genome.
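The one-letter-per-second estimate can be checked with quick arithmetic. The short Python sketch below is only an illustration: the function name is made up for this example, and the figure of 3.2 billion letters is an assumed round number standing in for "more than 3 billion."

```python
# Seconds in an average year (365.25 days), about 31.6 million.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_read(num_letters, letters_per_second=1.0):
    """Years needed to read a genome at a steady pace, with no breaks."""
    return num_letters / letters_per_second / SECONDS_PER_YEAR


# Exactly 3 billion letters: a little under a century.
print(f"{years_to_read(3_000_000_000):.1f} years")

# Assumed 3.2 billion letters ("more than 3 billion"): just over a century.
print(f"{years_to_read(3_200_000_000):.1f} years")
```

Either way, reading by hand without ever stopping to sleep would take about a lifetime, which is why machines and computers are needed.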

The DNA code is made up of four "bases," the letters of the genetic alphabet: A is for adenine, G is for guanine, C is for cytosine, and T is for thymine. When an organism's genome is sequenced, the result is thousands to billions of these letters. A virus of the E. coli bacterium has around 5,000 base pairs, while the Pompeii worm's genome is estimated to have about 800 million. The human genome has over 3 billion. Yet the human genome is far from the largest. The record for the largest known genome currently is held by a tiny, single-celled organism, an amoeba (Amoeba dubia), which has some 670 billion base pairs.

The first DNA sequencing methods were developed in the mid-1970s. Back then, scientists could only sequence a few base pairs of DNA per year -- not nearly enough to sequence a single gene. When the Human Genome Project began in 1990, only a few labs could sequence even 100,000 base pairs per year. Today, the latest production-scale sequencer can analyze up to 2 million base pairs of DNA in a 24-hour period.
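To put the 2-million-base-pairs-per-day figure in perspective, here is a rough Python calculation of how long a single machine would need to read a genome once. The function name and the idea of simply adding machines in parallel are assumptions made for this illustration; the genome sizes are the round numbers quoted in this article.

```python
def days_to_sequence(genome_bp, bp_per_day=2_000_000, machines=1):
    """Days to read every base pair once at a steady daily rate."""
    return genome_bp / (bp_per_day * machines)


human_days = days_to_sequence(3_000_000_000)      # 1500.0 days, about 4 years
worm_days = days_to_sequence(800_000_000)         # 400.0 days
lab_days = days_to_sequence(3_000_000_000, machines=100)  # 15.0 days

print(human_days, worm_days, lab_days)
```

Even at this speed, one machine working alone would need years for a human-sized genome, so large projects run many sequencers side by side.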

Since most genomes are too large for any machine to sequence all at once, scientists have to chop up the genome into manageable chunks. These pieces are sequenced and then fit together much like a giant jigsaw puzzle to form the complete genome.
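The chopping step can be pictured with a toy example. In the Python sketch below, the 18-letter "genome" is invented, and the chunks are cut at regular, overlapping intervals to keep things simple; real sequencing projects break DNA at random places. The function name `chop` is made up for this illustration.

```python
def chop(genome, chunk_len=8, step=5):
    """Cut a sequence into overlapping chunks of chunk_len letters.

    A new chunk starts every `step` letters; because step < chunk_len,
    neighboring chunks share a few letters.
    """
    last_start = max(len(genome) - chunk_len, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the end of the genome is covered
    return [genome[s:s + chunk_len] for s in starts]


genome = "ATGCGTACCTGAAGTCCA"  # invented sequence, not from a real organism
chunks = chop(genome)
print(chunks)  # ['ATGCGTAC', 'TACCTGAA', 'GAAGTCCA']
```

Notice that each chunk ends with the same three letters that begin the next one. Those shared letters are what make it possible to fit the pieces back together.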

But where does one chunk of the genome end and another begin? Computer programs called assemblers look for places where the end of one piece has the same sequence as the start of another. These overlaps let them put the genome back together in the proper order. As careful as scientists might be, errors can occur at many different points in the sequencing process. So how do scientists know if they've got the sequence right? To make sure, scientists typically will sequence a genome 6 to 10 times.
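Here is a minimal Python sketch of the overlap idea. It is a simplified "greedy" approach that always joins the two pieces with the longest overlap; it is not how any particular assembler program works, and real assemblers must also cope with sequencing errors and repeated stretches of DNA. The pieces and function names are invented for this illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest end of `a` that matches the start of `b`.

    Returns 0 if the match is shorter than min_len letters.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(pieces, min_len=3):
    """Repeatedly merge the two pieces with the longest overlap."""
    pieces = list(pieces)
    while len(pieces) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left; the remaining pieces stay separate
        merged = pieces[best_i] + pieces[best_j][best_n:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Three invented pieces, given out of order like a shuffled jigsaw puzzle.
pieces = ["GAAGTCCA", "ATGCGTAC", "TACCTGAA"]
print(assemble(pieces))  # ['ATGCGTACCTGAAGTCCA']
```

Sequencing the genome several times over gives the assembler many more overlapping pieces to work with, which helps catch mistakes in any single piece.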

While automatic sequencing machines have sped up the task of revealing the genetic code, they do not tell scientists an organism's genetic secrets. That takes much more work, although the genome sequence can provide scientists with clues about where certain genes are. Genome maps also can help scientists navigate their way to locations of interest.

Scientists are constantly adding to a catalog of genomes that have been sequenced. Our Extreme 2004 scientists will "blast" the microbes they find at the hydrothermal vents. This means that they will try to match the microbes' DNA to sequences already in the catalog on the way to sequencing their genomes. This is important not just "because it's there" -- for the sake of discovering something new -- but because new bacteria have the potential to help us find new medicines, fuel sources, and even foods.
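The matching idea can be sketched in a few lines of Python. This is a deliberately simple stand-in that counts short "words" of DNA shared between a sample and each catalog entry; it is not the method real search tools use, which is far more sophisticated. The catalog entries, their names, and the sample sequence are all invented for this illustration.

```python
def kmers(seq, k=4):
    """All the k-letter 'words' that appear in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, catalog, k=4):
    """Score each catalog entry by how many k-letter words it shares with the query."""
    query_words = kmers(query, k)
    scores = {name: len(query_words & kmers(seq, k)) for name, seq in catalog.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical catalog of already-sequenced microbes.
catalog = {
    "microbe_A": "ATGCGTACCTGAAGTCCA",
    "microbe_B": "TTTTGGGGCCCCAAAATT",
}
query = "GTACCTGAAG"  # invented snippet from a newly collected microbe

name, scores = best_match(query, catalog)
print(name, scores)  # microbe_A {'microbe_A': 7, 'microbe_B': 0}
```

A strong match suggests the new microbe is related to one already known; a weak match everywhere hints that the scientists may have found something new.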


Sources:
Canadian Museum of Nature: The Geee! in Genome
Genome News Network: What's a Genome
National Human Genome Research Institute, National Institutes of Health
U.S. Department of Energy Human Genome Project Information Web Site

University of Delaware
Copyright University of Delaware, November 2004