Cancer Genetics: A Quick Course
We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.
-James D. Watson and Francis H. Crick, Nature, April 25, 1953
With that masterly stroke of understatement, Watson and Crick, two gawky young scientists racing their colleagues in a Cambridge laboratory, ushered in the era of molecular biology. In 1968, when he published The Double Helix, Watson pondered how they unraveled the twisted strand of life, the winding DNA molecule. During those feverish years in the early 50s, wrote Watson, DNA was a "mystery up for grabs."
The solution to that mystery has led, in rapid order, all the way to gene therapy. Scientists can now reach down into the smallest unit of life-giving instructions, and rewrite the code. It won't be long before they perfect this technique and apply the fix to all manner of diseases. At least that's the hope.
The Book of Life
For some, it may rob life of its romance to know we boil down to a handful of chemicals that direct everything we are. But those few chemicals-often called "letters" in the Book of Life-have a great deal to say when combined and strung together just so. Adenine, thymine, cytosine and guanine (A, T, C, G) make up our alphabet, building blocks of the original "body language." These bases form in pairs: A and T together forever, C and G, likewise. They are the most important part of DNA.
The Genetic Code
This is how the four chemical bases in mRNA combine into codons to produce the 20 amino acids. A = adenine, C = cytosine, U = uracil, G = guanine.
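UUU, UUC translate into phenylalanine
UCU, UCC, UCA, UCG, AGU, AGC translate into serine
UAU, UAC translate into tyrosine
UGU, UGC translate into cysteine
UGG translates into tryptophan
UUA, UUG, CUU, CUC, CUA, CUG translate into leucine
CCU, CCC, CCA, CCG translate into proline
CAU, CAC translate into histidine
GGU, GGC, GGA, GGG translate into glycine
CGU, CGC, CGA, CGG, AGA, AGG translate into arginine
AUU, AUC, AUA translate into isoleucine
AUG translates into methionine
ACU, ACC, ACA, ACG translate into threonine
AAU, AAC translate into asparagine
AAA, AAG translate into lysine
GUU, GUC, GUA, GUG translate into valine
GCU, GCC, GCA, GCG translate into alanine
GAU, GAC translate into aspartic acid
CAA, CAG translate into glutamine
GAA, GAG translate into glutamic acid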
It took a physicist, George Gamow, to suggest that these four chemical bases could combine in groups of three to specify amino acids. The three-letter groups, or codons, serve as the "words" in the Book of Life. In the 1960s the Indian biochemist Har Gobind Khorana helped work out which amino acid each of the 64 possible codons specifies. For instance, ACG codes for threonine; GAG codes for glutamic acid. Strung together in the order the codons dictate, the amino acids form the proteins built by the body's cells. Hormones, enzymes, and antibodies-these are all examples of proteins vital to the structure, function, and regulation of the body.
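Codons make up our genes-"sentences" in the DNA Book of Life. (Early estimates put the number of human genes at 100,000; the count of protein-coding genes is now thought to be closer to 20,000.) To the untrained eye, the string of codons looks like a long line of babble. Where do the sentences start and end? It turns out that three "stop codons" signal the end of each protein-coding message: UAA, UGA, and UAG.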
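To make the reading of codons concrete, here is a minimal Python sketch. It assumes the standard genetic code and a message that is already lined up on its first codon; the function name translate and the sample message are illustrative, not drawn from any particular software package. Amino acids are written as their usual one-letter abbreviations (M for methionine, T for threonine, W for tryptophan).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per amino acid, listed in UCAG order
# (first base varies slowest, third base fastest); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Read an mRNA string codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: end of the message
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGACGUGGUAA"))  # MTW
```

The table holds all 64 codons: 61 of them spell out the 20 amino acids, and the remaining three are the stop signals.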
As in any book, the sentences fit into chapters-in this case, our 46 chromosomes, dwelling in the nucleus of every somatic cell in the body-altogether 22 pairs of autosomes and 2 sex chromosomes (XX in women; XY in men). Sex cells-sperm and egg-contain 23 chromosomes; when the two halves combine at fertilization, the full complement of chromosomes comes together. The word chromosome comes from the Greek words "chroma," meaning color, and "soma," meaning body. In the 19th century, scientists found that if they dyed dividing cells, they could stain the threadlike chromosomes within.
Each person's 23 pairs of human chromosomes can be distinguished by size and by unique banding patterns.
Normal male 46,XY
Normal female 46,XX
When stained and viewed under a powerful light microscope today, chromosomes reveal a pattern of light and dark bands. These bands indicate variations in the amounts of nucleotides-that is, the four bases plus their attached sugars and phosphates. (In DNA, thymine takes the place of the uracil found in RNA.) The differences allow researchers to distinguish the chromosomes from one another in an analysis called a karyotype. It's important to categorize the chromosomes in this way when searching for the exact location of specific genes, or for mutations-missing, broken or extra copies of a chromosome. People with Down syndrome, for instance, have a third copy of chromosome 21.
Double Helix-DNA, which carries the instructions that allow cells to make proteins, is made up of four chemical bases. Tightly coiled strands of DNA are packaged in our chromosomes, housed in the cell's nucleus. Genes are the working subunits of DNA.
The Double Helix
When Watson and Crick offered up their twisting ladder of life, they dryly observed, "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material." In fact, you can't help but feel awed by the elegance of this information-storing, self-replicating molecule tightly coiled in our cells. It means that like can beget like; the world can keep turning.
How does it happen? Let's look first at the makeup of DNA. We've already talked about the four most important components: the bases adenine and thymine, cytosine and guanine. They form the rungs of the ladder, paired and held together by weak hydrogen bonds. Sugar (deoxyribose) and phosphate serve as the ladder's uprights. When a cell prepares to divide, the two uprights pull apart, "unzipping" the molecule and leaving each base unpaired. Watson and Crick's postulated "specific pairing" dictates that each partner-less base will always attract its complementary opposite: A draws T, and C draws G. That means that at the end of DNA's day, each old strand has acquired a new partner, and two double-stranded molecules, each identical to the original, will have formed.
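The copying logic can be sketched in a few lines of Python. This is a simplified illustration, not a model of the real chemistry: it ignores the fact that the two strands run in opposite directions, and the names complement and replicate are invented for the example.

```python
# Watson-Crick pairing: A with T, C with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the partner strand dictated by specific base pairing."""
    return "".join(PAIR[base] for base in strand)

def replicate(strand_a, strand_b):
    """Unzip a double helix and give each old strand a new partner."""
    daughter_1 = (strand_a, complement(strand_a))
    daughter_2 = (complement(strand_b), strand_b)
    return daughter_1, daughter_2

original = ("ATCGGA", complement("ATCGGA"))
first, second = replicate(*original)
print(first == original and second == original)  # True
```

Each daughter molecule keeps one old strand and gains one new one, yet both carry the same sequence as the parent.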
For a cell to make a protein, the information from a gene is copied, base by base, from DNA into new strands of messenger RNA (mRNA). Then mRNA travels out of the nucleus into the cytoplasm, to cell organelles called ribosomes. There mRNA directs the assembly of amino acids that fold into a completed protein molecule.
Enzymes help move this process along. One enzyme unravels the double helix, another holds the strands apart, and another-DNA polymerase-plays a key role in replication. Like a good referee, polymerase makes sure everyone plays by the rules, seeing that adenine, for instance, doesn't wander into guanine's berth or cytosine hasn't encroached on thymine's turf. Polymerase corrects such mistakes, shooing off the interlopers and waving in the proper base. If polymerase fails to blow the whistle, mutations occur, and that could lead to genetic diseases.
When a gene contains a mutation, the protein encoded by that gene will be abnormal. Some protein changes are insignificant, others are disabling.
Once new DNA has formed in the nucleus, the cell's head office, protein synthesis can take place. But someone has to go out into the cytoplasm, the cell's workfloor, and inform the machine shop workers, the ribosomes, that it's time to assemble the amino acids and proteins. That "someone" is RNA: ribonucleic acid, a single-strand molecule chemically similar to DNA, except for its ribose sugar and its uracil base (U), which can stand in for thymine. As the executive assistant, RNA must perform two duties: transcribe and translate DNA's instructions.
To transcribe, RNA calls into action its own polymerase enzyme, which hitches on to a DNA site at the beginning of a gene. The RNA polymerase then pulls a section of the DNA strand apart to expose unattached DNA bases. One of these loose DNA strands acts as a template for the messenger or "mRNA." Before the nucleus deems this transcribed message mature enough for release, the head office gives mRNA a once-over. Nuclear enzymes snip out noncoding sections called introns (part of the enigmatic "junk DNA") and splice together other sections called exons, the working sequences that code for proteins. (There's room for error here: if the gene splicing goes awry, mutations can appear.) Now ready to convey, or translate, DNA's instructions, the mRNA bustles out of the nucleus into the cytoplasm.
A summary of the steps leading from DNA to proteins. Replication and transcription occur in the cell nucleus, after which mRNA is transported to the cytoplasm, where translation of the mRNA into the amino acid sequence of a protein occurs.
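The transcription and splicing steps can be illustrated with a short Python sketch. The template sequence and the exon positions are made up for the example, and the code ignores strand direction and the many enzymes involved; transcribe and splice are illustrative names only.

```python
# Each DNA template base attracts its RNA partner; uracil (U) replaces thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template):
    """Build the raw RNA message from a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)

def splice(pre_mrna, exons):
    """Keep only the exon stretches (start, end) and join them in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)

template = "TACTGCAAAAAAACCATT"    # hypothetical gene with one intron
exons = [(0, 6), (12, 18)]         # assumed exon positions

pre_mrna = transcribe(template)
print(pre_mrna)                    # AUGACGUUUUUUUGGUAA
print(splice(pre_mrna, exons))     # AUGACGUGGUAA
```

The spliced message begins with AUG (methionine) and ends with the stop codon UAA, ready for the ribosomes to read three bases at a time.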
Inherit the Blend
Genes come in pairs, one from each parent. Gregor Mendel, the 19th century Austrian monk so fascinated by peas, came up with the notion of paired factors or "elements" as he called them. Today we call these elements alleles, one of two or more forms of the same gene. Of each pair, one is often dominant, meaning that it masks the other. Mendel found this out by observing what happened when he crossed tall pea plants with short plants, and plants with different colored flowers. The masked, or recessive, allele doesn't disappear, he discovered; it can show up in a later generation.
In the case of human eye color, one allele produces brown eyes, another makes blue eyes. Your gene pair for eye color may be:
Blue/Blue, making you homozygous with respect to that gene, and blue-eyed; Brown/Brown making you homozygous and brown-eyed; or Brown/Blue which means you are heterozygous but still brown-eyed, because the brown allele is dominant.
This explains why two parents with brown eyes, if they are both heterozygous, can produce blue-eyed children. (Note that modifier genes can alter these two colors to produce hazel, green, gray or even-improbably-violet eyes.)
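The arithmetic behind that outcome can be checked with a small Python sketch of a Punnett square. It uses the simplified two-allele picture described here, writing "B" for the dominant brown allele and "b" for the recessive blue one; the function names are invented for the example, and modifier genes are ignored.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Count the genotypes produced by every pairing of one allele per parent."""
    return Counter("".join(sorted(pair)) for pair in product(parent1, parent2))

def blue_eyed_fraction(parent1, parent2):
    """Only the double-recessive genotype 'bb' gives blue eyes."""
    offspring = cross(parent1, parent2)
    return Fraction(offspring["bb"], sum(offspring.values()))

print(cross("Bb", "Bb"))               # one BB, two Bb, one bb
print(blue_eyed_fraction("Bb", "Bb"))  # 1/4
print(blue_eyed_fraction("BB", "Bb"))  # 0
```

Two heterozygous brown-eyed parents thus have one chance in four of a blue-eyed child, while a homozygous brown-eyed parent rules it out.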
While the system of inheritance works well to pass along such traits, it also can perpetuate mutations that lead to trouble. Considering that 3 billion DNA base pairs replicate in each cell division, the process is amazingly accurate. Scientists estimate that the proofreading and repair team-several dozen enzymes-mops up 99.9% of errors. Yet misspellings do slip through. And the simplest misprint can have the most drastic consequences. For example, the beta chain of our oxygen-toting protein, hemoglobin, consists of a string of 146 amino acids. If even one amino acid in that chain-valine-seizes the rightful spot of another-glutamic acid-the entire protein malfunctions. It's just a small typo: GAG changed to GUG, a point mutation. But it's enough to cause sickle cell anemia. Other types of mutation include deletions and insertions. They can produce extra or missing amino acids in a protein and hence defective proteins. Many cases of cystic fibrosis, for instance, result from a three-base-pair deletion.
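A point mutation of this kind is easy to mimic in Python. The sketch below uses only the four codons relevant to the glutamic acid to valine swap (both glutamic acid codons and the valine codons they become when the middle base changes from A to U); the function name point_mutate is illustrative.

```python
# Just the slice of the genetic code needed for this example.
CODONS = {
    "GAA": "glutamic acid",
    "GAG": "glutamic acid",
    "GUA": "valine",
    "GUG": "valine",
}

def point_mutate(codon, position, new_base):
    """Swap a single base at the given position (0, 1 or 2)."""
    return codon[:position] + new_base + codon[position + 1:]

for normal in ("GAA", "GAG"):
    mutant = point_mutate(normal, 1, "U")
    print(normal, CODONS[normal], "->", mutant, CODONS[mutant])
```

Either way, one changed letter out of three replaces glutamic acid with valine.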
Different mutations in the same gene can produce a wide range of effects. In cystic fibrosis, for instance, the gene that controls mucus production can have more than 300 different mutations; some cause severe symptoms; some, mild symptoms; and some, no symptoms at all.
Inherited genetic diseases result from DNA flawed in the sex, or germline, cells. These blunders can pass from one generation to the next in three ways:
Autosomal dominant, in which the defective gene need be present in only one (dominant) allele in order for the disease to show outwardly.
In dominant genetic disorders, if one affected parent has a disease-causing allele that dominates its normal counterpart, each child in the family has a 50% chance of inheriting the disease allele and the disorder.
Autosomal recessive, in which the defective (recessive) gene must be inherited in a double dose to cause abnormality. Parents can carry masked copies of the recessive gene without having the disease themselves. But when two such parents mate and pass along two copies of the masked gene, the disease reveals itself in their child.
In diseases associated with altered recessive genes, both parents-though disease-free themselves-carry one normal allele and one altered allele. Each child has one chance in four of inheriting two normal alleles; two chances in four of inheriting one normal and one altered allele, and being a carrier like both parents; and one chance in four of inheriting two altered alleles and developing the disease. Examples of recessive disorders, albeit rare, which predispose to breast cancer are ataxia telangiectasia and Bloom's syndrome.
X-linked recessive, in which a disease that is caused by a defect in the X chromosome usually leads to illness in males in whom the defect cannot be masked by a second, normal X chromosome.
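The X-linked pattern can be worked through with a short Python sketch. It is a simplified illustration: "X" stands for a normal X chromosome, lowercase "x" for one carrying the defect, and the function name x_linked_cross is invented for the example.

```python
from fractions import Fraction
from itertools import product

def x_linked_cross(mother, father):
    """Pair each of the mother's chromosomes with each of the father's."""
    children = list(product(mother, father))
    sons = [child for child in children if "Y" in child]
    daughters = [child for child in children if "Y" not in child]

    def affected(child):
        # With no normal X to mask it, the defect shows.
        return "X" not in child

    def carrier(child):
        return "x" in child and "X" in child

    return {
        "affected sons": Fraction(sum(map(affected, sons)), len(sons)),
        "affected daughters": Fraction(sum(map(affected, daughters)), len(daughters)),
        "carrier daughters": Fraction(sum(map(carrier, daughters)), len(daughters)),
    }

# Carrier mother, unaffected father.
print(x_linked_cross(("X", "x"), ("X", "Y")))
```

For a carrier mother and an unaffected father, half the sons are affected, while no daughters are affected and half of them are carriers. The autosomal dominant and recessive odds follow from the same kind of enumeration.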
Nature Plus Nurture
While errors in somatic cells may cause disease, including some cancers, they don't carry forward to the next generation. That makes these diseases genetic but not inherited events. In fact, most cancers arise in this way. Tripped perhaps by too much sun exposure, a close encounter with toxic chemicals, or faulty DNA repair, sometimes genes stumble, forcing somatic mutations.
Over time genetic mistakes accumulate in the body's tissues and random or "sporadic" cancers can take root with cells proliferating unchecked. Oncologists tag this "deregulation." In revolt, the cells have chucked out the rule book on when to stop growing and when to start differentiating into specialized units. Essentially, there are two types of genes that jump-start this revolt:
Tumor-suppressor genes. Under normal conditions these genes act as brakes on cell growth. When missing or inactivated, the brakes fail and a malignant bloom can veer out of control. Among the most notorious of the tumor-suppressor genes is p53, involved in transcription and cell cycle regulation. When p53 turns bad, it turns very bad, contributing to breast and colon cancers, leukemia, and soft tissue sarcomas, among others. Mutant versions of p53 have been found in DNA samples from more than half of human tumors, making it the most common gene linked with cancer. Other examples of tumor suppressors, each linked to cancer when mutated or lost: APC, linked to colon, pancreatic and stomach cancers; RB, linked to retinoblastoma, osteosarcoma, breast, lung, prostate and bladder cancers; and WT1, linked to nephroblastoma (Wilms' tumor).
Oncogenes. Normally these genes accelerate cell growth (in a controlled way). When mutated or "over"-activated, the oncogene accelerator gets stuck to the floor, flooding cells with signals that shout: "Keep on dividing!" The end result: a high-speed, wild cellular ride-essentially the same outcome as if the tumor-suppressor brakes had failed. Some examples of oncogenes: RAF, which leads to stomach cancer; MYC, which leads to lymphomas; and TRK which leads to thyroid tumors. In a number of colon cancers, pathologists have found activated RAS oncogenes living side-by-side with inactivated tumor-suppressor genes such as p53. This suggests that creating cancer may require both types of genetic changes.
Cancer usually arises in a single cell. The cell's progress from normal to malignant to metastatic appears to follow a series of distinct steps, each controlled by a different gene or set of genes. Persons with hereditary cancer already have the first mutation.
So far, we've been referring to these mutations in somatic cells. But they can take place in germline cells as well, which means that cancer-causing genes can pass to offspring, producing families with a large number of breast, ovarian, or colon cancers, for example. Such "cancer families" are rare, but that's cold comfort to those who inherit the mutant allele that leads to a tumor. However, we don't inherit cancer so much as a predisposition to cancer. It takes a confluence of events from within the body and without to produce pathology. Nature (genes) and nurture (diet, mutagenic environmental exposure) each play a role. This means that even though genes may dictate what we are, by changing a harmful lifestyle we can better control what we become.
An oft-cited example: In Japan, the lifetime risk of colon cancer was ten times lower than in the United States: 0.5% versus 5%. Epidemiologists who studied Japanese immigrants to the US found that among the first-generation Japanese living in Hawaii, the frequency of colon cancer rose several fold: not as high as on the US mainland, but higher than in Japan. By the time second-generation Japanese had lived on the mainland, their colon cancer rates equaled those of other Americans. All this implicated diet and lifestyle as important factors in the development of disease.
But that doesn't mean genetic factors play no role. Some North Americans still contract colon cancers while others do not. How does research account for that? Differences within the environment such as varied diet, and differences in genetic predisposition. (For instance, when a first degree relative has colon cancer, an individual's risk rises several fold.) And how did scientists account for the difference in colon cancer frequency between Japanese living in their native country and immigrants to the US? One theory: something in the Asian environment rendered the predisposing genes less penetrant-that is, less likely to result in disease.
In both real estate and genetics, it seems, location is everything. Knowing where on the chromosomes human genes reside will forever change medical practice and biomedical research. With the genetic blueprint in hand, we can better understand how humans develop from a single cell, how genes govern the functions of tissues and organs, how the disease process devolves. And that will lead to better diagnosis, treatment, and even prevention of disease.
Toward these ends, the federal government established the National Center for Human Genome Research (NCHGR) in 1989. This center directs the United States' role in the Human Genome Project, a gargantuan effort to decipher our DNA-the Book of Life, the genome. On 25 October 1996, the journal Science published a partially complete gene map compiled by an international team of 100 scientists. The map, which you can explore at the interactive Internet site http://www.ncbi.nlm.nih.gov/SCIENCE96/, nails down the location of most of our genes.
Francis S. Collins, MD, PhD, who heads the NCHGR, pioneered a powerful gene-finding method known as "positional cloning" that has given an enormous boost to genome mapping. The technique has isolated dozens of disease genes including those for retinoblastoma, Wilms' tumor, Von Hippel-Lindau disease, breast and ovarian cancers. To isolate and clone such target genes, the Collins technique relies on other potent tools that sift through the genome haystack of 3 billion base pairs. As scientists hone these tools and tie them together, they drive the gene hunt forward faster and cheaper. At the beginning of the gene pursuit it took months-sometimes years-and about $5 to pin down each base pair; now it takes only weeks and costs as little as 50¢ per nucleotide. The improved methods include:
Genetic (or Linkage) Mapping. This is the first step in isolating a gene, and offers firm evidence that a disease or trait is linked to a culprit gene (or genes) passed from parent to child. Genetic mapping also provides clues about which chromosome contains the gene and where on that chromosome the gene lies. Using blood or tissue samples from members of a family in which a disease occurs, scientists first isolate DNA from the samples. They then look for markers-characteristic molecular patterns inherited along with the disease, piggyback fashion. The more markers on the map, the more likely one of them will relate to a disease gene, making it much easier for researchers to zero in on the gene.
Markers consist of slight spelling differences in the genetic alphabet A, T, C, and G. Called "polymorphisms," these differences usually occur in the so-called junk DNA and normally do not affect a person's health. But they can tell a researcher from whom the DNA came, making them useful in tracking inheritance through generations. Police departments and coroners also use such DNA "fingerprints" to identify victims and perpetrators.
In 1994 an international group of investigators published a genetic map with nearly 6,000 markers spaced, on average, less than a million bases apart. Leaders of the Human Genome Project announced that the map contains more details than originally hoped for, and was completed a full year ahead of schedule. Since then, scientists have continued to fine-tune this map.
Maps of DNA can have several levels of detail: from the banding patterns of the chromosomes, to clones of overlapping segments of DNA, and ultimately to the base-by-base sequence of DNA.
Physical Mapping. Once scientists use genetic mapping to assign a gene to a small area on a chromosome, they must then examine that region more closely to find the gene's exact location for a physical map. To construct such a map, gene hunters use restriction enzymes-nature's scissors-to slice apart a chromosome into smaller, workable pieces. They then copy, or clone, the pieces, matching them up in their starting order so they can trace the origin and genetic content of each one. The data go into a computer, and the DNA snippets into a freezer. When a genetic linkage map shows that a gene lies in a certain region, it narrows the search for the gene in question. Next step: defrost and examine the appropriate copied pieces. They should lead to another gene and a new entry on the physical map.
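How a restriction enzyme slices DNA can be sketched in Python. The example assumes the well-known enzyme EcoRI, which recognizes the sequence GAATTC and cuts just after the G; it models only one strand, and the sample sequence and the function name digest are made up for illustration.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut a DNA string at every occurrence of the recognition site.

    cut_offset is how many bases into the site the cut falls
    (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        fragments.append(dna[start:position + cut_offset])
        start = position + cut_offset
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments

pieces = digest("AAGAATTCCCGAATTCTT")
print(pieces)                                   # ['AAG', 'AATTCCCG', 'AATTCTT']
print("".join(pieces) == "AAGAATTCCCGAATTCTT")  # True
```

Rejoining the pieces in their starting order restores the original sequence, which is why keeping track of that order matters so much.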
Because it's essential to keep track of the chromosome pieces in their proper order, scientists had to develop a system of markers much like the mile posts on a highway. In this case the markers connect one section of chromosome "road" with the next. Cloned pieces of DNA road overlap in places that share the same marker (called a "sequence-tagged site"). These mile post markers let researchers know how far they have driven along the chromosome to their destination-the disease gene they seek. With few markers to guide them in the past, scientists used to spend as many as 10 years traveling lonely stretches of chromosome highway searching in vain for a gene.
The original goal for sets of overlapping DNA pieces, or "contigs," was that they measure 2 million bases in length by 1995. By 1996 the contigs ranged from 20 million to 50 million bases. Already, researchers have pieced together enough sets to complete chromosomes 21, 22, and Y, and nearly all of chromosomes 3, 4, 7, 11, 12, 16, 19, and X. The ultimate goal: every chromosome in the human genome.
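The idea of lining up cloned pieces by the markers they share can be sketched in Python. This is a toy version under strong assumptions: there are at least two clones, they form a single unbranched chain, and each clone overlaps only its immediate neighbors. The clone and marker names are hypothetical, as is the function name order_clones.

```python
def order_clones(clones):
    """Arrange clones into a chain by following shared markers.

    clones maps a clone name to the set of markers found on it.
    """
    neighbors = {
        name: [other for other in clones
               if other != name and clones[name] & clones[other]]
        for name in clones
    }
    # A clone at either end of the chain overlaps exactly one other clone.
    ends = sorted(name for name, near in neighbors.items() if len(near) == 1)
    path = [ends[0]]
    while True:
        unvisited = [name for name in neighbors[path[-1]] if name not in path]
        if not unvisited:
            return path
        path.append(unvisited[0])

clones = {
    "cloneC": {"sts4", "sts5"},
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3", "sts4"},
}
print(order_clones(clones))  # ['cloneA', 'cloneB', 'cloneC']
```

Real contig assembly has to cope with missing markers, repeated sequences, and laboratory errors, which this sketch leaves out entirely.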
DNA Sequencing. Physical maps provide the raw material for understanding the sequence of bases-the four essential letters-A, T, C and G-in the Book of Life. By knowing the correct sequence of these letters and the genes that they compose, scientists can spot the specific aberrations that may cause disease.
To help decipher the human genome, scientists find it useful to sequence other organisms used in research as models for human disease: E. coli, the gut bacterium; C. elegans, a microscopic see-through roundworm; Drosophila (fruit flies); and mice. In the spring of 1996 researchers completed the sequence for baker's yeast, significant because yeast cells resemble human cells more closely than bacteria do. Because of their relative simplicity, these organisms make an ideal testing ground for new sequencing technology.
For example, labs working on the roundworm C. elegans have increased their annual production rate to 15 million base pairs. (With 100 million base pairs to sequence in the worm, researchers expect to finish by 1998.) During the 1970s, labs could barely churn out a few base pairs per year, hardly up to the task of tackling 3 billion. When the Human Genome Project got underway in 1990, few labs had sequenced even 100,000 bases. Since then, improved technology and automation have increased speed and lowered cost considerably.
Delivering the Promise
Now that the Human Genome Project has crossed the finish line after 15 years and billions of dollars, this Big Science effort will signal more of a beginning than an end. The genome "provides grist for the next generations of effort to figure out how the genes work," says Dr. Collins. But patients want more than just the promise of the human genome-they want their disease treated, cured. And eventually that will come.
For now, says Dr. Collins, gene testing affords doctors the chance to practice "individualized preventive medicine in which they focus medical surveillance and lifestyle management on the people who need it most." It's unrealistic to think that gene therapies will be on the market in the next few years, he points out. But finding disease genes puts us that much closer.