DNA profiling (also called DNA testing, DNA typing, or genetic fingerprinting) is a technique employed by forensic scientists to assist in the identification of individuals on the basis of their respective DNA profiles. DNA profiles are encoded sets of numbers that reflect a person's DNA makeup, and they can also be used as the person's identifier. DNA profiling should not be confused with full genome sequencing.
Although 99.9% of human DNA sequences are the same in every person, enough of the DNA is different to distinguish one individual from another. DNA profiling uses repetitive ("repeat") sequences that are highly variable, called variable number tandem repeats (VNTRs). VNTR loci are very similar between closely related humans, but so variable that unrelated individuals are extremely unlikely to have the same VNTRs.
DNA profiling process
The process begins with a sample of an individual's DNA (typically called a "reference sample"). The most desirable method of collecting a reference sample is the use of a buccal swab, as this reduces the possibility of contamination. When this is not available (e.g. because a court order is needed and cannot be obtained), other methods may be needed to collect a sample of blood, saliva, semen, or other appropriate fluid or tissue from personal items (e.g. a toothbrush or razor) or from stored samples (e.g. banked sperm or biopsy tissue). Samples obtained from blood relatives can provide an indication of an individual's profile, as can human remains that have previously been profiled.
A reference sample is then analyzed to create the individual's DNA profile using one of a number of techniques, discussed below. The DNA profile is then compared against another sample to determine whether there is a genetic match.
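The comparison step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a profile is stored as a mapping from locus name to a pair of repeat counts; the function name `compare_profiles` and the allele values are hypothetical, and real casework software also has to handle mixtures, partial profiles and statistical weighting, none of which is shown here.

```python
def normalize(genotype):
    """Order the two alleles so that (17, 15) and (15, 17) compare equal."""
    return tuple(sorted(genotype))


def compare_profiles(reference, sample):
    """Compare two profiles over the loci typed in both.

    Returns (matching_loci, mismatching_loci), each sorted by locus name.
    """
    shared = sorted(set(reference) & set(sample))
    matches = [
        locus for locus in shared
        if normalize(reference[locus]) == normalize(sample[locus])
    ]
    mismatches = [locus for locus in shared if locus not in matches]
    return matches, mismatches


# Illustrative allele values only; they are not taken from any real case.
reference = {"D3S1358": (15, 17), "vWA": (16, 16), "FGA": (21, 24)}
evidence = {"D3S1358": (17, 15), "vWA": (16, 16), "FGA": (21, 24)}
other = {"D3S1358": (14, 17), "vWA": (16, 18), "FGA": (21, 24)}

matches, mismatches = compare_profiles(reference, evidence)
```

In this sketch, a sample is consistent with the reference only if the list of mismatching loci is empty; a single mismatch, as with `other`, is enough to distinguish the two profiles.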
Variations of VNTR allele lengths in 6 individuals.
The first methods used for DNA profiling involved restriction enzyme digestion followed by Southern blot analysis, an approach known as restriction fragment length polymorphism (RFLP) analysis. Although polymorphisms can exist in the restriction enzyme cleavage sites, more commonly the enzymes and DNA probes were used to analyze VNTR loci. However, the Southern blot technique is laborious and requires large amounts of undegraded sample DNA. Also, Alec Jeffreys's original technique looked at many minisatellite loci at the same time, which increased the observed variability but made it hard to discern individual alleles (and thereby precluded parental testing). These early techniques have been supplanted by PCR-based assays.
With the invention of the polymerase chain reaction (PCR) technique, DNA profiling took huge strides forward in both discriminating power and the ability to recover information from very small (or degraded) starting samples. PCR greatly amplifies the amount of a specific region of DNA, using oligonucleotide primers and a thermostable DNA polymerase. Early assays such as the HLA-DQ alpha reverse dot blot strips became very popular because of their ease of use and the speed with which a result could be obtained. However, they were not as discriminating as RFLP analysis. It was also difficult to determine a DNA profile for mixed samples, such as a vaginal swab from a sexual assault victim.
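The scale of PCR amplification can be shown with a short calculation. The sketch below uses the standard idealized model in which each cycle multiplies the number of target copies by (1 + efficiency), so a perfectly efficient reaction doubles the target every cycle. The function name and the example figures are assumptions for illustration; real reactions lose efficiency and plateau in later cycles, which this model ignores.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling). Plateau effects are not modeled.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting copies after 30 ideal cycles: 10 * 2**30, roughly 1e10 copies.
ideal = pcr_copies(10, 30)

# The same reaction at 90% per-cycle efficiency yields noticeably fewer copies.
realistic = pcr_copies(10, 30, efficiency=0.9)
```

The exponential growth is what allows a profile to be recovered from very small starting samples.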
Fortunately, the PCR method is readily adaptable for analyzing VNTR loci. In the United States, the FBI has standardized a set of 13 VNTR assays for DNA typing and has organized the CODIS database for forensic identification in criminal cases. Similar assays and databases have been set up in other countries. In addition, commercial kits are available that analyze single nucleotide polymorphisms (SNPs). These kits use PCR to amplify specific regions with known variations and hybridize them to probes anchored on cards, which results in a colored spot corresponding to the particular sequence variation.
The method of DNA profiling used today is based on PCR and uses short tandem repeats (STRs). This method uses highly polymorphic regions that have short repeated sequences of DNA (the most common repeat unit is 4 bases long, but other lengths are in use, including 3 and 5 bases). Because unrelated people have different numbers of repeat units, these regions of DNA can be used to discriminate between unrelated individuals. These STR loci (locations) are targeted with sequence-specific primers and amplified using PCR. The resulting DNA fragments are then separated and detected using electrophoresis. There are two common methods of separation and detection: capillary electrophoresis (CE) and gel electrophoresis.
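The idea of an STR allele as a count of repeat units can be sketched in Python. In practice the repeat number is inferred from the length of the amplified fragment during electrophoresis, as described above; the toy function below instead reads the count directly from a sequence string. The function name, the GATA motif and the flanking bases are illustrative assumptions.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    motif = motif.upper()
    # Match one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Two hypothetical alleles of the same locus, differing only in repeat number.
allele_a = "ACGT" + "GATA" * 7 + "TTCA"
allele_b = "ACGT" + "GATA" * 9 + "TTCA"

genotype = (longest_repeat_run(allele_a, "GATA"),
            longest_repeat_run(allele_b, "GATA"))  # (7, 9)
```

Because each extra repeat adds a fixed number of bases, the two alleles here differ in length by 8 bases, which is the kind of difference that electrophoresis resolves.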
The polymorphisms displayed at each STR region are by themselves very common; typically, each polymorphism is shared by around 5–20% of individuals. When multiple loci are examined, it is the combination of these polymorphisms, unique to an individual, that makes this method discriminating as an identification tool. The more STR regions that are tested in an individual, the more discriminating the test becomes.
Different STR-based DNA-profiling systems are in use from country to country. In North America, systems that amplify the CODIS 13 core loci are almost universal, while in the UK the SGM+ system, which is compatible with the National DNA Database, is in use. Whichever system is used, many of the STR regions under test are the same. These DNA-profiling systems are based on multiplex reactions, in which many STR regions are tested at the same time.
The true power of STR analysis is in its statistical power of discrimination. Because the 13 loci currently used for discrimination in CODIS assort independently (having a certain number of repeats at one locus does not change the likelihood of having any number of repeats at any other locus), the product rule for probabilities can be applied. This means that if someone has the DNA type ABC, where the three loci are independent, the probability of having that DNA type is the probability of having type A times the probability of having type B times the probability of having type C. This has resulted in the ability to generate match probabilities of 1 in a quintillion (1 with 18 zeros after it) or more. However, since there are about 12 million monozygotic twins on Earth, who share the same DNA profile, that theoretical probability does not hold in practice. For example, the actual probability that 2 random persons have the same DNA is only 1 in 3 trillion.
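The product rule can be made concrete with a short Python sketch. The per-locus frequencies below are invented for illustration and fall within the 5–20% range typical of individual STR polymorphisms; the function name is hypothetical, and the calculation assumes that the loci are independent, which is the condition the product rule requires.

```python
from math import prod


def random_match_probability(locus_frequencies):
    """Multiply per-locus frequencies, assuming the loci are independent."""
    freqs = list(locus_frequencies)
    if any(not 0.0 < f <= 1.0 for f in freqs):
        raise ValueError("each frequency must be in the interval (0, 1]")
    return prod(freqs)


# DNA type ABC with illustrative frequencies for types A, B and C.
abc = random_match_probability([0.10, 0.05, 0.20])  # about 1 in 1,000

# Thirteen independent loci, each with an assumed frequency of 10%.
thirteen_loci = random_match_probability([0.10] * 13)  # about 1 in 10 trillion
```

Each additional independent locus multiplies the match probability by another small factor, which is why testing more STR regions makes the method more discriminating.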
Another technique, amplified fragment length polymorphism (AmpFLP), was also put into practice during the early 1990s. This technique was faster than RFLP analysis and used PCR to amplify DNA samples. It relied on variable number tandem repeat (VNTR) polymorphisms to distinguish various alleles, which were separated on a polyacrylamide gel using an allelic ladder (as opposed to a molecular weight ladder). Bands could be visualized by silver staining the gel. One popular locus for fingerprinting was the D1S80 locus. As with all PCR-based methods, highly degraded DNA or very small amounts of DNA may cause allelic dropout (in which a heterozygote is mistaken for a homozygote) or other stochastic effects. In addition, because the analysis is done on a gel, alleles with very high repeat numbers may bunch together at the top of the gel, making them difficult to resolve. AmpFLP analysis can be highly automated, and allows for easy creation of phylogenetic trees based on comparing individual samples of DNA. Due to its relatively low cost and ease of set-up and operation, AmpFLP remains popular in lower-income countries.