What are microsatellites?

Microsatellites consist of a 1-6 base pair unit repeated in tandem to form an array. Over 1 million exist in the human genome, often embedded in gene introns, gene exons, and regulatory regions. Interestingly, the length of a microsatellite array frequently changes due to strand-slippage replication and heterozygote instability. These changes can influence gene expression by inducing Z-DNA and H-DNA folding, altering nucleosome positioning, and changing the spacing of DNA binding sites. For these reasons, microsatellites have been called the tuning knobs of gene expression.
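To make the definition concrete, here is a minimal Python sketch that scans a DNA string for tandem arrays of a 1-6 base pair unit. It is an illustration only, not the method used to build CAGm: the function name `find_microsatellites`, the `min_repeats=3` threshold, and the example sequence are all assumptions chosen for the demonstration.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, repeat_count) tuples for tandem repeats.

    min_repeats is a hypothetical threshold for this sketch, not a value
    taken from the CAGm database.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (e.g. "AA" or "CAGCAG"); the shorter motif reports them.
            is_compound = any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if not is_compound:
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Made-up example sequence containing a (CAG)4 array and an (A)5 array.
example = "TTGACAGCAGCAGCAGTTAAAAAGC"
result = find_microsatellites(example)
print(result)
```

Each tuple gives the 0-based start position, the repeat unit, and the number of tandem copies. A change in the copy count between individuals is the kind of array length variation described above.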

The 1000 Genomes Project (KGP) was launched in 2008 with the aim of creating the world's largest public catalogue of human genetic variation. Now finished, the complete data collection includes exome and whole genome sequencing data from over 2,500 individuals. These individuals in turn come from 26 worldwide populations belonging to 5 ethnicities (superpopulations).

We introduce the Comparative Analysis of Germline Microsatellites (CAGm) database, designed to assist future studies of germline microsatellites and enhance our understanding of human genetic variation. It includes analysis of all germline microsatellites in whole exome sequencing data from 2,529 individuals in the 1000 Genomes Project. Users can query genotypes, view multiple sequence alignments, and download data for further analysis. The database has a wide range of additional capabilities; see our forthcoming manuscript for details.




Nicholas Kinney, PhD
Kyle Titus-Glover
Mike Liao
Arichanah Pulenthiran
Robin Varghese, PhD
Ramu Anandakrishnan, PhD
Harold Garner, PhD

Did you know?

CAGm also stands for an important class of poly-glutamine microsatellites. At least nine known poly-glutamine disorders are characterized by expansion of the CAG motif, which encodes glutamine.

Version and change log