A Celera lab technician works with a DNA sequencing machine.
Photos courtesy of www.GenomeNewsNetwork.org/J. Craig Venter Institute.

It's All in the Code!

Genes make each of us what we are. Nearly every cell in our bodies contains
our genome, a code of more than 3 billion letters contained
in the molecule DNA. Sequencing DNA means figuring out what order
the letters appear in -- their sequence. Knowing
the sequence helps scientists figure out what kind of genetic
information is carried in a particular section of DNA. Some
sections contain genes; other sections don't. Some sections
may show changes in sequence, called mutations, that can cause
disease.

In the past, geneticists studied genes one at a time. Today, scientists
study the whole code, a situation comparable to getting a complete story
rather than using individual words to try to figure out a plot.

Dr. J. Craig Venter.

How do you sequence DNA when such gigantic numbers are involved?
If you figured out one letter per second, it would take you
roughly a century to sequence a human's DNA, says J. Craig Venter.
He's the scientist who led the first sequencing of the genome of a
free-living organism, the Haemophilus influenzae bacterium, which can
cause ear infections. In 1994, he used a new machine and computers to
do the job of identifying each letter in the code. He later went on to
lead one of the first efforts to sequence the human genome.
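
To see where the "century" figure comes from, here is a minimal sketch of the arithmetic. The genome size of 3.2 billion letters is an assumption chosen to match "more than 3 billion," and the function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, including leap days

def years_to_read(letters, letters_per_second=1.0):
    """Return how many years it takes to read `letters` at a steady rate."""
    return letters / letters_per_second / SECONDS_PER_YEAR

# Assumed genome size: 3.2 billion letters ("more than 3 billion").
print(round(years_to_read(3_200_000_000), 1))  # → 101.4
```

With exactly 3 billion letters the answer is about 95 years, so the result lands close to a century either way, and that is without stopping to eat or sleep.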

The DNA code is made up of four "bases," the letters of the genetic
alphabet: A is for adenine, G is for guanine, C is for cytosine,
and T is for thymine. Because the bases pair up across DNA's two strands,
genome size is counted in base pairs. When an organism's genome is sequenced,
the result is thousands to billions of these letters. A virus
that infects the E. coli bacterium has around 5,000 base pairs,
while the Pompeii worm's genome is estimated to have about 800 million.
The human genome has over 3 billion. Yet humans are far from having
the largest genome. The record for the largest known genome is currently
held by a tiny, single-celled organism, an amoeba (Amoeba dubia),
which has some 670 billion base pairs.
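
Since a sequenced genome is just a long string of four letters, a computer can treat it like text. The sketch below counts the bases in a short fragment; the fragment is invented for illustration and the function name is hypothetical.

```python
from collections import Counter

def base_counts(sequence):
    """Count how often each base letter appears in a DNA sequence."""
    counts = Counter(sequence.upper())
    unexpected = set(counts) - set("ACGT")
    if unexpected:
        raise ValueError(f"not DNA bases: {sorted(unexpected)}")
    # Report all four bases, even ones that never appear.
    return {base: counts.get(base, 0) for base in "ACGT"}

fragment = "GATTACAGATTACA"  # made-up sequence, not from a real organism
print(len(fragment), base_counts(fragment))
# → 14 {'A': 6, 'C': 2, 'G': 2, 'T': 4}
```

A real genome file works the same way, only with millions or billions of letters instead of 14.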

The first DNA sequencing methods were developed in the mid-1970s.
Back then, scientists could sequence only a few base pairs of
DNA per year -- not nearly enough to sequence a single gene.
When the Human Genome Project began in 1990, only a few labs
could sequence even 100,000 base pairs per year. Today, the
latest production-scale sequencer can analyze up to 2 million
base pairs of DNA in a 24-hour period.
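
To get a feel for how much those rates matter, this rough sketch works out how long a single pass over the human genome would take at each speed. The two rates come from the paragraph above; the genome size is rounded to 3 billion base pairs, and the names are illustrative.

```python
HUMAN_GENOME_BP = 3_000_000_000   # "over 3 billion" base pairs, rounded
LAB_1990_BP_PER_YEAR = 100_000    # yearly output of a top lab in 1990
SEQUENCER_BP_PER_DAY = 2_000_000  # daily output of one production-scale machine

def time_for_one_pass(genome_bp, bp_per_unit_time):
    """Return how many time units one pass over the genome would take."""
    return genome_bp / bp_per_unit_time

years_at_1990_rate = time_for_one_pass(HUMAN_GENOME_BP, LAB_1990_BP_PER_YEAR)
days_on_one_machine = time_for_one_pass(HUMAN_GENOME_BP, SEQUENCER_BP_PER_DAY)

print(f"1990 lab rate: {years_at_1990_rate:,.0f} years")
print(f"One 2-million-per-day machine: {days_on_one_machine:,.0f} days "
      f"(about {days_on_one_machine / 365.25:.1f} years)")
```

That works out to about 30,000 years at the 1990 rate versus 1,500 days (roughly four years) on one machine, which is why large projects run many machines side by side.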

Since most genomes are too large for any machine to sequence all at
once, scientists have to chop up the genome into manageable
chunks. These pieces are sequenced and then fit together, much like
a giant jigsaw puzzle, to form the complete genome.

But where does one chunk of the genome end and another begin? Computer
programs called assemblers look for identical stretches of sequence where
chunks overlap so that they can put the genome back together in the proper
order. As careful as scientists might be, errors can occur at many different
points in the sequencing process. So how do scientists know if they've got
the sequence right? To make sure, scientists typically will sequence a genome
6 to 10 times.
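
The jigsaw idea can be sketched with a toy "greedy" assembler that keeps gluing together the two chunks with the longest overlap. This is a simplified illustration, not how any particular production assembler works: it assumes error-free chunks from one strand, and the chunks and function names are invented.

```python
def overlap(left, right, min_len=3):
    """Length of the longest end of `left` that matches the start of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads that share the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more; leftover pieces stay separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Three made-up chunks, deliberately listed out of order.
chunks = ["ACGTTAGC", "ATGGCGTA", "CGTACGTT"]
print(greedy_assemble(chunks))  # → ['ATGGCGTACGTTAGC']
```

Real chunks contain occasional misread letters and repeated stretches, which is one reason each region is sequenced several times: the extra copies let the software check the pieces against one another.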

While automatic sequencing machines have sped up the task of revealing
the genetic code, they do not tell scientists an organism's
genetic secrets. That takes much more work, although the genome
sequence can provide scientists with clues about where certain
genes are. Genome maps also can help scientists navigate their way
to locations of interest.

Scientists are constantly adding to a catalog
of genomes that have been sequenced. Our Extreme 2004 scientists
will "blast" the microbes they find at the hydrothermal
vents. This means that they will try to match the microbes'
DNA to sequences in the catalog on the way to sequencing their genomes.
This is important not just "because it's there" --
for the sake of discovering something new -- but because new
bacteria have the potential to help us find new medicines, fuel
sources, and even foods.
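
Matching a new piece of DNA against a catalog can be pictured with a much-simplified stand-in for a real search tool. The sketch below is not the actual BLAST algorithm: it just breaks sequences into short "words" and ranks catalog entries by how many of the query's words they share. The catalog entries, names, and sequences are all made up.

```python
def kmers(sequence, k=4):
    """Return the set of all length-k 'words' found in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def best_match(query, catalog, k=4):
    """Rank catalog entries by the fraction of query words they share."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        raise ValueError("query is shorter than the word length k")
    scores = {
        name: len(query_kmers & kmers(seq, k)) / len(query_kmers)
        for name, seq in catalog.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical catalog of already-sequenced microbes.
catalog = {
    "microbe_A": "ATGGCGTACGTTAGC",
    "microbe_B": "TTTTCCCCGGGGAAAA",
}
name, scores = best_match("GCGTACGT", catalog)
print(name, scores)  # → microbe_A {'microbe_A': 1.0, 'microbe_B': 0.0}
```

A strong match suggests the new microbe is related to something already known; a weak match everywhere hints that it may be new to science.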

Sources:
Canadian Museum of Nature: The Geee! in Genome
Genome News Network: What's a Genome
National Human Genome Research Institute, National Institutes of Health
U.S. Department of Energy Human Genome Project Information Web Site