Snip

Hacking on 23&Me and DbSNP

What is in DbSNP that we care about?

  • Single-nucleotide Polymorphisms

    A change (deletion, insertion or replacement) in a single nucleotide base pair, which is found in at least 1% of the population.

  • GWAS Allele Frequencies

    Genome-wide association studies report how often each allele of a SNP occurs in the studied population, i.e. how rare it is.

  • Clinical Significance of SNP

    How likely the SNP is to cause a disease, based on a universal standard.
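
To make the allele-frequency bullet concrete, here is a minimal sketch of how a frequency can be derived from genotype counts. It assumes a diploid locus with one reference and one alternate allele; the function names and the counts are made up for illustration and do not come from DbSNP. It also shows that the fraction of people carrying an allele is not the same number as the allele frequency.

```python
def allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Frequency of the alternate allele among all chromosomes sampled.

    Each person carries two copies, so a heterozygote contributes one
    alternate allele and an alternate homozygote contributes two.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotypes were counted")
    return (n_het + 2 * n_hom_alt) / total_alleles


def carrier_fraction(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Fraction of people with at least one copy of the alternate allele."""
    total_people = n_hom_ref + n_het + n_hom_alt
    if total_people == 0:
        raise ValueError("no genotypes were counted")
    return (n_het + n_hom_alt) / total_people


# Hypothetical study of 1000 people (invented numbers).
freq = allele_frequency(n_hom_ref=810, n_het=180, n_hom_alt=10)
carriers = carrier_fraction(n_hom_ref=810, n_het=180, n_hom_alt=10)
print(f"allele frequency: {freq:.3f}, carrier fraction: {carriers:.3f}")
```
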

What does 23&Me provide us with?

Our genotype information at various 'ref SNPs': specific locations on the genome that someone has deemed significant.

rsid    chromosome  position    genotype
# Typical 'rsid' ID that starts with 'rs'
rs12564807  1   734462  AA
rs3131972   1   752721  AG
rs148828841 1   760998  CC
rs12124819  1   776546  AA
rs115093905 1   787173  --
rs11240777  1   798959  GG
rs7538305   1   824398  --
rs4970383   1   838555  CC
rs4475691   1   846808  CT
rs7537756   1   854250  AA
rs13302982  1   861808  AG
rs55678698  1   864490  CC
# 23&Me's internal, experimental ref_snps, starts with 'i'
i6019299    1   871267  CC
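
A file in this layout can be read with a few lines of Python. The following is only a sketch: the names `Genotype`, `parse_genotypes`, `is_called` and `is_internal` are hypothetical, columns are assumed to be whitespace-separated, and '--' is assumed to mean that no genotype was called at that position.

```python
from typing import Iterable, Iterator, NamedTuple


class Genotype(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str


def parse_genotypes(lines: Iterable[str]) -> Iterator[Genotype]:
    """Yield one Genotype per data row.

    Blank lines, '#' comments, the header row and any row that does not
    have exactly four columns are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 4 or fields[0] == "rsid":
            continue
        rsid, chromosome, position, genotype = fields
        yield Genotype(rsid, chromosome, int(position), genotype)


def is_called(g: Genotype) -> bool:
    """False for rows such as '--' (assumed here to mean 'no call')."""
    return "-" not in g.genotype


def is_internal(g: Genotype) -> bool:
    """True for 23&Me's internal IDs, which start with 'i' instead of 'rs'."""
    return g.rsid.startswith("i")


SAMPLE = """\
rsid    chromosome  position    genotype
# Typical 'rsid' ID that starts with 'rs'
rs12564807  1   734462  AA
rs115093905 1   787173  --
i6019299    1   871267  CC
"""

rows = list(parse_genotypes(SAMPLE.splitlines()))
for row in rows:
    print(row, "called" if is_called(row) else "no call")
```

Keeping the chromosome as a string rather than an integer is deliberate, since chromosome labels are not always numeric (X, Y, MT).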

                

Clinical Significance of Alleles

We probably only care about:

  • Pathogenic

  • Likely Pathogenic

  • Drug Response

  • Protective

  • Risk Factor

snip=# SELECT COUNT(*) AS cnt, clinical_significance_csv
FROM ref_snp_allele_clin_diseases
GROUP BY clinical_significance_csv
HAVING COUNT(*) > 25;
  cnt   |                 clinical_significance_csv
--------+-----------------------------------------------------------
    119 | affects
    181 | association
  53646 | benign
  14252 | conflicting-interpretations-of-pathogenicity
    543 | drug-response
 101029 | likely-benign
  24631 | likely-pathogenic
  13958 | not-provided
    293 | not-provided,conflicting-interpretations-of-pathogenicity
   2173 | other
  63192 | pathogenic
     66 | protective
    824 | risk-factor
 186296 | uncertain-significance
(14 rows)
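
Because clinical_significance_csv can hold several comma-separated terms (see the 'not-provided,conflicting-interpretations-of-pathogenicity' row), a filter for the categories we care about has to split the value first. The sketch below does that in Python over the counts printed above; `INTERESTING` and `interesting_terms` are hypothetical names, and the real project may do this filtering in SQL instead.

```python
from typing import Dict, List

# The five categories from the "We probably only care about" list,
# spelled the way they appear in clinical_significance_csv.
INTERESTING = frozenset({
    "pathogenic",
    "likely-pathogenic",
    "drug-response",
    "protective",
    "risk-factor",
})


def interesting_terms(clinical_significance_csv: str) -> List[str]:
    """Return the terms in a comma-separated value that we care about."""
    terms = [t.strip() for t in clinical_significance_csv.split(",")]
    return [t for t in terms if t in INTERESTING]


# Counts copied from the query output above.
counts: Dict[str, int] = {
    "affects": 119,
    "association": 181,
    "benign": 53646,
    "conflicting-interpretations-of-pathogenicity": 14252,
    "drug-response": 543,
    "likely-benign": 101029,
    "likely-pathogenic": 24631,
    "not-provided": 13958,
    "not-provided,conflicting-interpretations-of-pathogenicity": 293,
    "other": 2173,
    "pathogenic": 63192,
    "protective": 66,
    "risk-factor": 824,
    "uncertain-significance": 186296,
}

kept = {value: cnt for value, cnt in counts.items() if interesting_terms(value)}
total_kept = sum(kept.values())
print(sorted(kept), total_kept)
```

Matching on whole terms rather than substrings matters here: a substring test for 'pathogenic' would also match 'conflicting-interpretations-of-pathogenicity'. The query only lists values with more than 25 rows, so rarer combinations are not covered by these counts.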
                

What's Next?

  • OpenSpace tomorrow (Sat.) at 4:00PM, Room 19

    Come check out the actual implementation! Walk away with a JSON file with your results. Looking for contributors.

  • JSON API Service

    Offer a JSON API service that returns research and information pertaining to your individual genome.

  • Single-page App Interface

    We need a front-end to interface with this data.