The Roots of Genomic Individuality
It came as a surprise: the genomes of two individuals differ remarkably in regions where certain DNA segments occur in variable copy numbers.

"Genetic variation in the human genome takes many forms, ranging from large, microscopically visible chromosome anomalies to single-nucleotide changes." An international consortium of scientists opens with this statement in its article on the identification of another marker of individuality that was published this month in
Nature. (Vol. 444, 444-54).
The latest buzzword in research on individual genetic variety is "copy number variants", or CNVs. These are defined as DNA segments of more than 500 base pairs that are present at a variable copy number in comparison with a reference genome. Deletions, insertions and duplications have left large numbers of CNVs in the genome, and their number varies strongly between individuals. One person may carry one copy of a CNV; the next may have two, three or more - or no copy at all. And sometimes the segments are rearranged in different ways. In their paper the authors now report the first comprehensive map of CNVs and in this way contribute significantly to our understanding of the variation in everybody's "book of life".
A similar approach had already been taken within the international HapMap Project. There the focus was on single nucleotide polymorphisms (SNPs), single base pair changes that distinguish any two unrelated copies of the genome. Using DNA samples from 270 individuals who were part of the HapMap Project, the researchers now screened for the much more complex genetic variation in the form of CNVs. The collection of DNA samples comprised four populations: individuals of European descent from Utah, USA, the Yoruba from Nigeria, Japanese from Tokyo and Han Chinese from Beijing.
The consortium applied two complementary technologies, described in more detail in two further papers published in Genome Research (Epub, November 22). The first was a genotyping approach in which some 500,000 SNPs were assayed, looking for stretches of adjacent SNPs whose signals deviated from the expected ratios. An atypical ratio of the two versions (alleles) of a given SNP pointed to the sought-after variations. The second technology compared each sample with a reference standard, looking for differences in copy number. For this clone-based comparative genomic hybridisation (CGH) the team used more than 26,000 large-insert clones representing 93.7% of the euchromatic portion of the human genome. Together, the two approaches were able to detect most forms of CNVs, and in the end the scientists identified a total of 1,447 CNVs - an unexpectedly high number.
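The logic of the second, CGH-based approach can be illustrated with a minimal sketch. It is not the consortium's actual pipeline: the function names, the toy intensity ratios and the log2 threshold of 0.3 are assumptions chosen purely for illustration. Each clone gets a sample-to-reference intensity ratio; clones whose log2 ratio exceeds the threshold are called a gain or a loss, and runs of adjacent clones with the same call are merged into one candidate region.

```python
import math


def call_clones(ratios, threshold=0.3):
    """Label each clone from its sample/reference intensity ratio.

    The threshold (in log2 units) is a hypothetical value, not one
    taken from the published study.
    """
    calls = []
    for ratio in ratios:
        log2_ratio = math.log2(ratio)
        if log2_ratio >= threshold:
            calls.append("gain")
        elif log2_ratio <= -threshold:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


def merge_calls(calls):
    """Merge runs of adjacent clones sharing a non-normal call.

    Returns (first_clone_index, last_clone_index, call) tuples.
    """
    segments = []
    start = None
    for i, call in enumerate(calls):
        # Close the open segment when the call changes.
        if start is not None and call != calls[start]:
            segments.append((start, i - 1, calls[start]))
            start = None
        # Open a new segment on any gain or loss.
        if start is None and call != "normal":
            start = i
    if start is not None:
        segments.append((start, len(calls) - 1, calls[start]))
    return segments


# Toy ratios for nine neighbouring clones (invented data).
ratios = [1.0, 1.05, 1.5, 1.6, 0.98, 0.5, 0.55, 0.52, 1.0]
calls = call_clones(ratios)
segments = merge_calls(calls)
print(segments)
```

With these toy values, clones 2 to 3 are merged into one gain and clones 5 to 7 into one loss. Real data are far noisier, so a fixed threshold like this would need to be replaced by a proper statistical segmentation.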
Thus, the "book of life" varies not only in single letters but also in whole sentences and paragraphs. "Each one of us has a unique pattern of gains and losses of complete sections of DNA," said Matthew Hurles, one of the project leaders at the Wellcome Trust Sanger Institute, Cambridge, UK, in a press release by the Institute. But the real surprise was the amount of DNA that varies in copy number. "We estimate this to be at least 12% of the genome, similar in extent to SNPs." This represents some 5- to 10-fold more variation between any two randomly chosen genomes than previously suggested by studying SNPs alone. Invaluable as the SNP maps produced by HapMap and other approaches are, they actually miss most CNVs. Some CNVs may have become associated with their neighbouring SNPs over time, but others cannot be tagged that easily.
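How a figure such as "12% of the genome" can be derived from a list of variable regions is easy to sketch: merge overlapping regions so that no base is counted twice, then divide the covered length by the genome length. The example below uses invented coordinates on a single toy sequence of 1,000 bases; the function name and the data are assumptions for illustration, not the study's data or method.

```python
def covered_length(regions):
    """Number of bases covered by half-open (start, end) intervals.

    Overlapping intervals are merged so that shared bases count once.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(regions):
        if current_end is None or start > current_end:
            # The previous merged block is finished; add its length.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current block; extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Invented example: three variable regions, two of which overlap.
toy_genome_length = 1_000
toy_cnv_regions = [(100, 150), (140, 180), (400, 440)]
fraction = covered_length(toy_cnv_regions) / toy_genome_length
print(f"{fraction:.0%} of the toy genome lies in variable regions")
```

In a real genome the same calculation would be done per chromosome, since coordinates on different chromosomes cannot overlap.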
More than half of the CNVs identified overlap known annotated genes. A comparison with a database of disease-related genes showed an association with CNVs for 10% of these genes. It therefore seems likely that variations in copy number not only make each one of us unique but also play a role in several diseases. Where the amount of a functional gene product is critical, CNVs might underlie the variation in susceptibility to the disease. The effect of the copy number of the CCL3L1 chemokine gene on susceptibility to AIDS is just one example.
Besides this, CNVs could make the hunt for genes underlying diseases more efficient and help in the study of familial genetic conditions. Furthermore, the CNV map will help to narrow down those regions of chromosomal rearrangement that are involved in developmental effects, by excluding variations found in unaffected individuals. The analysis of the underlying mechanisms themselves could also benefit from the CNV map.
Finally, the map sheds light on human history. Because of our recent common origin in Africa, around 89% of the CNVs are shared among the diverse human populations studied. Even so, an individual's pattern of CNVs reflects ancestry and can be used to infer, for instance, in which of the three continental populations that person's recent ancestry lies. Typical differences will help to define variants that played a role during adaptation to the environment.
Nevertheless, the 1,447 CNVs reported by Redon et al. are probably only the tip of the iceberg, due to the limited set of reference samples. "We think that we are at the stage where we can confidently detect CNVs of 50 kilobases or more, but ... the overall aim must be to increase resolution by two orders of magnitude such that we can detect CNVs of 500 base pairs," Hurles told The Scientist.
As reported in another paper in Nature Genetics (Epub, November 22), an alternative route to the complete map of DNA variations might be the direct comparison of whole genomes. Using this approach, the consortium confirmed more than 1.5 million SNPs and 240 variable regions of different kinds.
So it looks like genomic individuality won't be completely captured for a long time yet ... but that, at least, doesn't come as a big surprise, does it?