
Table 2 Data sources

From: Seq4SNPs: new software for retrieval of multiple, accurately annotated DNA sequences, ready formatted for SNP assay design

| Datum | Source | Notes | Step |
| --- | --- | --- | --- |
| SNP rs or ss number* | User input | File or text input | 1 |
| Trivial name | User input | Same file as above | 1 |
| Size of assay sequence | User input | e.g. 200 specifies 200 nucleotides each side of assay SNP (401 altogether) | 1 |
| New rs number | NCBI dbSNP cluster page* | New rs retrieved when rs no longer in use** or if ss number submitted*** | 2 |
| Fasta sequence, allele | Ditto | Fasta output with allele in header (major allele first) | 2 |
| Major allele, validation of assay, heterozygosity | Ditto | 'Allele' report |  |
| Fasta sequence (second attempt) | NCBI contig fasta sequence**** | If sequence in cluster page too short: contig reference from cluster page* | 2 |
| Gene, chromosome | NCBI cluster page* | 'Gene' report | 2 |
| Masked sequences | RepeatMasker (see text) | Takes fasta output above and produces fasta for next step | 3 |
| Platform | User input | Choose TaqMan, SNPstream or Sequenom | 3 |
| Chromosome position, adjacent SNP list, with 21 nucleotide sequence etc. | Local MySQL database with dbSNP data | Annotation of assay sequence using Seq4SNPs algorithm | 4 |
| Validation, heterozygosity | Ditto | Part of Adjacent SNP Report (Fig 3) detailing each SNP and flagging placement mismatches | 4 |
| SNP assay sequences |  | Final output compatible with assay designers | 4 |

  1. Data used by Seq4SNPs is drawn from the sources listed here: Seq4SNPs inputs (italics), outputs (bold). Some items are taken from web pages accessed by uniform resource locator (URL) or from FTP download sites, as shown below.
  2. Example URLs:
  3. *dbSNP rs cluster page: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=123
  4. **New rs number (if cluster page not available): http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=rs840
  5. ***rs number for ss: http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=ss19333593
  6. ****NCBI contig download: http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucletide&val=NT_007819.16&dopt=fasta&from=24454804&to=24456004
  7. dbSNP downloads (human): fasta sequences from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/rs_fasta and chromosome positions from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/chr_rpts
  8. Note that, to extend the Seq4SNPs addition of adjacent SNPs to other species, data for those species may be downloaded from the organisms folder and added to the MySQL database alongside the human SNPs.
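
The window arithmetic and URL patterns in the table and notes can be sketched in Python as follows. This is an illustrative sketch only, not the Seq4SNPs code: the function names are hypothetical, the URL stems are copied from the examples above and may no longer be served by NCBI, and the `db` value "nucleotide" is an assumption (the example URL spells it "nucletide"). The SNP position 24455404 used in the test below is likewise an assumption, taken as the midpoint of the example contig range.

```python
from urllib.parse import urlencode

# URL stems copied from the example URLs in the table notes.
CLUSTER_PAGE = "http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi"
ENTREZ_SEARCH = "http://www.ncbi.nlm.nih.gov/sites/entrez"
CONTIG_VIEWER = "http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi"


def assay_window(snp_position, flank):
    """Return (start, end, length) for `flank` nucleotides each side of the SNP."""
    if flank < 0:
        raise ValueError("flank must be non-negative")
    start = snp_position - flank
    if start < 1:
        raise ValueError("window would start before position 1")
    end = snp_position + flank
    # Both flanks plus the SNP itself, e.g. 200 + 1 + 200 = 401.
    return start, end, end - start + 1


def cluster_page_url(rs_id):
    """Build a dbSNP cluster page URL from 'rs123' or '123' (hypothetical helper)."""
    number = rs_id[2:] if rs_id.lower().startswith("rs") else rs_id
    return CLUSTER_PAGE + "?" + urlencode({"rs": number})


def search_url(snp_id):
    """Build the Entrez search URL used to look up a retired rs or an ss number."""
    query = urlencode({"db": "snp", "cmd": "search", "term": snp_id})
    return ENTREZ_SEARCH + "?" + query


def contig_fasta_url(contig, snp_position, flank, db="nucleotide"):
    """Build a contig fasta URL covering the assay window (db value is an assumption)."""
    start, end, _ = assay_window(snp_position, flank)
    query = urlencode(
        {"db": db, "val": contig, "dopt": "fasta", "from": start, "to": end}
    )
    return CONTIG_VIEWER + "?" + query
```

For instance, `assay_window(201, 200)` gives a 401-nucleotide window, matching the "401 altogether" example in the table, and `search_url("ss19333593")` reproduces the ss lookup URL listed in the notes.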