Table 2 Data sources

From: Seq4SNPs: new software for retrieval of multiple, accurately annotated DNA sequences, ready formatted for SNP assay design

| Datum | Source | Notes | Step |
| --- | --- | --- | --- |
| SNP rs or ss number* | User input | File or text input | 1 |
| Trivial name | User input | Same file as above | 1 |
| Size of assay sequence | User input | e.g. 200 specifies 200 nucleotides each side of assay SNP (401 altogether) | 1 |
| New rs number | NCBI dbSNP cluster page* | New rs retrieved when rs no longer in use** or if ss number submitted*** | 2 |
| Fasta sequence, allele | ditto | Fasta output with allele in header (major allele first) | 2 |
| Major allele, validation of assay, heterozygosity | ditto | 'Allele' report. | |
| Fasta sequence (second attempt) | NCBI contig fasta sequence**** | If sequence in cluster page too short: contig reference from cluster page* | 2 |
| Gene, chromosome | NCBI cluster page* | 'Gene' report | 2 |
| Masked sequences | RepeatMasker (see text) | Takes fasta output above and produces fasta for next step. | 3 |
| Platform | User input | Choose TaqMan, SNPstream or Sequenom | 3 |
| Chromosome position, adjacent SNP list, with 21 nucleotide sequence etc. | MySQL local database with dbSNP data | Annotation of assay sequence using Seq4SNP algorithm | 4 |
| Validation, heterozygosity | Ditto | Part of Adjacent SNP Report (Fig 3) detailing each SNP and flagging placement mismatches | 4 |
| SNP assay sequences | | Final output compatible with assay designers | 4 |
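
The "Size of assay sequence" row implies simple window arithmetic: a flank of 200 gives 2 × 200 + 1 = 401 nucleotides, and step 4 annotates any other known SNPs that fall inside that window. The following is a minimal Python sketch of that idea, not the Seq4SNPs implementation. It assumes 1-based inclusive coordinates on a single chromosome; the function names `assay_window` and `adjacent_snps` and the SNP identifiers in the example are hypothetical.

```python
from typing import Iterable, List, Tuple


def assay_window(snp_pos: int, flank: int) -> Tuple[int, int, int]:
    """Return (start, end, length) of the window around the assay SNP.

    Assumes 1-based, inclusive coordinates (an assumption of this sketch).
    """
    if snp_pos < 1 or flank < 0:
        raise ValueError("snp_pos must be >= 1 and flank must be >= 0")
    start = max(1, snp_pos - flank)  # clip at the start of the sequence
    end = snp_pos + flank
    return start, end, end - start + 1


def adjacent_snps(
    snp_pos: int, flank: int, known: Iterable[Tuple[str, int]]
) -> List[Tuple[str, int]]:
    """List known SNPs inside the window, excluding the assay SNP itself.

    `known` holds (identifier, position) pairs for the same chromosome.
    """
    start, end, _ = assay_window(snp_pos, flank)
    return [
        (name, pos)
        for name, pos in known
        if start <= pos <= end and pos != snp_pos
    ]


if __name__ == "__main__":
    # Hypothetical positions, chosen only to exercise the window edges.
    known = [("snpA", 790), ("snpB", 850), ("snpC", 1000),
             ("snpD", 1200), ("snpE", 1201)]
    print(assay_window(1000, 200))          # (800, 1200, 401)
    print(adjacent_snps(1000, 200, known))  # [('snpB', 850), ('snpD', 1200)]
```

In the tool itself the candidate SNPs come from the local MySQL copy of dbSNP rather than an in-memory list; the list here only keeps the sketch self-contained.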
  1. Data used by Seq4SNPs are drawn from various sources, listed here: Seq4SNPs inputs (italics), outputs (bold). Some items are taken from web pages accessed by uniform resource locator (URL), or from FTP download sites, shown below.
  2. Example URLs:
  3. *dbSNP rs cluster page: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=123
  4. **New rs number: http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=rs840 (if cluster page not available)
  5. ***rs number for ss: http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=ss19333593
  6. ****NCBI contig download: http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucletide&val=NT_007819.16&dopt=fasta&from=24454804&to=24456004
  7. dbSNP downloads (human): fasta sequences and chromosome positions, respectively, from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/rs_fasta and ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/chr_rpts
  8. Note: to extend the Seq4SNPs adjacent SNP annotation to other species, data for those species may be downloaded from the organisms folder and loaded into the MySQL database alongside the human SNPs.
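
The example URLs above follow two simple patterns: a cluster page keyed by the numeric part of the rs number, and an Entrez search keyed by the full rs or ss identifier. As a hedged illustration, the Python sketch below rebuilds those two example URLs from an identifier. The function names are hypothetical, the URL patterns are copied from the examples in this table, and nothing here checks whether those addresses still resolve.

```python
from urllib.parse import urlencode

# Base addresses copied from the example URLs in the table footnotes.
CLUSTER_URL = "http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi"
ENTREZ_URL = "http://www.ncbi.nlm.nih.gov/sites/entrez"


def cluster_page_url(rs_id) -> str:
    """Build the dbSNP cluster page URL for an rs number such as 'rs123' or 123."""
    text = str(rs_id).strip().lower()
    if text.startswith("rs"):
        text = text[2:]  # the cluster page takes only the numeric part
    if not text.isdigit():
        raise ValueError(f"not an rs number: {rs_id!r}")
    return f"{CLUSTER_URL}?rs={text}"


def entrez_search_url(snp_id: str) -> str:
    """Build the Entrez SNP search URL for an rs or ss identifier."""
    text = snp_id.strip().lower()
    if text[:2] not in ("rs", "ss") or not text[2:].isdigit():
        raise ValueError(f"not an rs or ss identifier: {snp_id!r}")
    # dict order is preserved, so the query matches the example URLs exactly
    query = urlencode({"db": "snp", "cmd": "search", "term": text})
    return f"{ENTREZ_URL}?{query}"


if __name__ == "__main__":
    print(cluster_page_url("rs123"))
    print(entrez_search_url("rs840"))
    print(entrez_search_url("ss19333593"))
```

Validating the identifier before building the URL keeps malformed input (for example a trivial name pasted into the rs column) from producing a request that silently returns an unrelated page.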