
Table 1 A brief summary of existing tools for simulating DNA sequencing data

From: SimuSCoP: reliably simulate Illumina sequencing data based on position and context dependent profiles

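| Simulator | Layout (b) | Output | Language | Genomic variation: SNV | Genomic variation: CNV | Genomic variation: Indel | Tumor sample: Impurity | Tumor sample: Aneuploidy | Tumor sample: Intra-tumor heterogeneity | GC bias | Profiles: Position dependent | Profiles: Context dependent | Sequencing strategy (c) | Ref |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ART | SE, PE | FQ, SAM | C++, Perl | | | | | | | | X | | G | [7] |
| Grinder | SE, PE | FQ, FA | Perl | | X | | | | | | X | | G | [8] |
| pIRS | PE | FQ | C++, Perl | X | X | X | | | | X | X | | G | [9] |
| GemSIM | SE, PE | FQ, SAM | Python | X | | | | | | | X | X | G | [10] |
| Wessim (a) | SE, PE | FQ, SAM | Python | | | | | | | X | X | X | E | [11] |
| NeSSM | SE, PE | FQ | C, Perl | | | | | | | X | X | | G | [12] |
| BEAR | SE, PE | FQ | Perl, Python | | | | | | | | X | | G | [13] |
| FASTQSim | SE | FQ | Python | | | | | | | | X | | G | [14] |
| SInC | PE | FQ | C | X | X | X | | | | | X | | G | [15] |
| SCNVSim (a) | SE, PE | FQ | Java | X | X | X | X | X | X | | X | | G | [16] |
| NEAT | SE, PE | FQ | Python | X | X | X | | | | X | X | | G, E | [17] |
| IntSIM | SE, PE | FQ | C++, Perl, R | X | X | X | X | X | X | X | X | | G | [18] |
| Pysim-sv (a) | SE, PE | FQ | Python | X | X | X | X | X | X | X | X | | G | [19] |
| InSilicoSeq | PE | FQ | Python | | | | | | | X | X | | G | [20] |
| SimuSCoP | SE, PE | FQ | C++ | X | X | X | X | X | X | X | X | X | G, E | |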
X: the simulator supports the given capability. a: these tools depend on a third-party NGS read simulator to generate reads. b: SE denotes single-end and PE denotes paired-end. c: G denotes whole-genome sequencing and E denotes targeted or exome sequencing.