What are microsatellites?
Microsatellites consist of a 1-6 base pair unit repeated in tandem to form an array. Over 1 million exist in the human genome, often embedded in gene introns, gene exons, and regulatory regions. The length of a microsatellite array frequently changes due to strand slippage during replication and heterozygote instability. These changes can influence gene expression by inducing Z-DNA and H-DNA folding, altering nucleosome positioning, and changing the spacing of DNA binding sites. For these reasons microsatellites have been called the tuning knobs of gene expression.
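As a rough illustration of the definition above, the following Python sketch scans a DNA string for tandem arrays of a 1-6 bp unit. The function name `find_microsatellites`, the `min_repeats` threshold of 3, and the example sequence are assumptions made for illustration; this is not the method used to build CAGm.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a shorter unit repeated (e.g. 'AA' is not)."""
    return (unit + unit).find(unit, 1) == len(unit)


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return a list of (start, unit, copies) for tandem arrays in seq.

    start is 0-based. min_repeats is an illustrative threshold, not a CAGm setting.
    """
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured unit followed by at least (min_repeats - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # already reported under its shorter unit
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)


# Toy sequence containing four copies of the CAG unit.
print(find_microsatellites("TTCAGCAGCAGCAGTT"))
```

A change in array length then corresponds to a change in the `copies` value at the same position between two sequences.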
The 1000 Genomes Project (KGP) was launched in 2008 with the aim of creating the world's largest public catalogue of human genetic variation. Now complete, the data collection includes exome and whole genome sequencing data from over 2,500 individuals. These individuals come from 26 worldwide populations, which are grouped into 5 superpopulations.
We introduce the Comparative Analysis of Germline Microsatellites (CAGm) database. It is designed to assist with future studies of germline microsatellites and enhance our understanding of human genetic variation. The database includes analysis of all germline microsatellites covered by whole exome sequencing of 2,529 individuals in the 1000 Genomes Project. Users can query genotypes, view multiple sequence alignments, and download data for further analysis. The database has a wide range of additional capabilities; see our forthcoming manuscript for more information.
Usage
- Search samples - browse the samples in the database
- a. Use the details link to list information about each sample
- b. The genotypes link will display genetic information about each sample
- i. The alignments link will allow you to verify each genotype
- Search micros - browse the microsatellites in the database
- a. The details link will list information about each microsatellite in the database
- Tables - show genetic tables for a microsatellite
- a. The details link will list information about each microsatellite
- b. Use the radio buttons to view genetic information about each microsatellite across all samples
- i. Control the grouping of genetic information with the additional radio buttons
- ii. When viewing a genetic table use the "show table details" button to list the individual samples
- iii. From there you can view the alignments to verify genotypes. Note: all genotypes in the tables require a minimum of 6 reads
- Search motifs - browse the microsatellites in coding regions based on their amino acid sequence, unit, or annotation
- a. Use the show links to list microsatellites having a motif of interest
- i. From here you have the option to view the genetic tables. Use the radio buttons to select microsatellites and their groupings
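The 6-read minimum mentioned above can be reproduced on downloaded data with a simple filter. This is a minimal sketch that assumes each genotype call is a dictionary with hypothetical `sample`, `alleles`, and `reads` fields; the actual download format may differ, and the records below are placeholders rather than real CAGm data.

```python
MIN_READS = 6  # threshold stated in the usage notes


def filter_genotypes(calls, min_reads=MIN_READS):
    """Keep only genotype calls supported by at least min_reads reads."""
    return [call for call in calls if call["reads"] >= min_reads]


# Placeholder records with made-up sample names and read counts.
example_calls = [
    {"sample": "S1", "alleles": (21, 24), "reads": 12},
    {"sample": "S2", "alleles": (21, 21), "reads": 5},
    {"sample": "S3", "alleles": (24, 24), "reads": 6},
]

for call in filter_genotypes(example_calls):
    print(call["sample"], call["alleles"], call["reads"])
```

Calls with exactly 6 reads are kept, since the note describes 6 as the minimum.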
Example
- The KAT6B gene encodes a histone acetyltransferase and lies on chromosome 10 at positions 74824927-75032623
- a. Search for this gene on the "search micros" page: link
- b. Results show 48 microsatellites in this region
- c. Look up details using the "Details" button
- d. We find at least one of these in a coding region: link
- Now we look up the genetic tables for these microsatellites
- a. Search for this region on the "tables" page: link
- b. Use the radio buttons in the "pick" column to view genetic tables
- c. For the microsatellite at 75022148 we find that 21bp and 24bp alleles are both common: link
- i. Or view by population: link
- ii. Now view the genotypes: link
- d. The 18bp allele is only found in a few African individuals: link
- e. Find those individuals with the "show table details" button
- f. View the alignments for one of these individuals: link
- Let's look for all the coding microsatellites in this region
- a. Search for this region on the "search motifs" page: link
- b. We find two coding microsatellites in this region
- i. Click the show button to view them: link
- c. From here we can view the genetic tables, microsatellite details, and multiple sequence alignments.
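The genetic tables in this example group allele lengths by population. For readers working with downloaded genotypes, the same kind of tally can be sketched in a few lines of Python. The input layout (a group label plus a pair of allele lengths in bp) and all values below are assumptions for illustration only; they are not results from the database.

```python
from collections import Counter, defaultdict


def allele_counts_by_group(genotypes):
    """Count allele lengths per group from (group, (allele1, allele2)) records."""
    counts = defaultdict(Counter)
    for group, alleles in genotypes:
        counts[group].update(alleles)
    return counts


def allele_frequencies(counter):
    """Convert allele counts into frequencies that sum to 1."""
    total = sum(counter.values())
    return {allele: n / total for allele, n in counter.items()}


# Placeholder genotypes: superpopulation code and two allele lengths in bp.
toy_genotypes = [
    ("AFR", (18, 21)),
    ("AFR", (21, 24)),
    ("EUR", (21, 24)),
    ("EUR", (24, 24)),
]

counts = allele_counts_by_group(toy_genotypes)
for group in sorted(counts):
    print(group, allele_frequencies(counts[group]))
```

An allele that appears in only one group, such as the 18bp entry in the toy data, shows up as a key in that group's counter and is absent from the others.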
Contributors
Nicholas Kinney, PhD
Kyle Titus-Glover
Mike Liao
Arichanah Pulenthiran
Robin Varghese, PhD
Ramu Anandakrishnan, PhD
Harold Garner, PhD
Did you know?
CAGm also stands for the CAG motif, an important class of poly-glutamine microsatellites. At least nine known poly-glutamine disorders are characterized by expansion of the CAG motif.
Version and change log
- Version 1.2 (current)
- a. added likelihood filter for genotypes 9/6/18
- Version 1.1
- a. added gene searching on all pages 7/4/18
- b. smart sorting on table details page 7/4/18
- c. annotations added to view microsatellites page 7/4/18
- d. improved chromosome searching on all pages 7/4/18
- Version 1.0
- a. added page jumping on view_micros.php 6/19/18
Code acknowledgements
Copyright © Plain and Simple 2007
Designed by edg3.co.uk
Sponsored by Open Designs