
Table 2 Comparison between Spark SQL and existing software methods used to retrieve genomic data. Complex queries that today need more than a hundred lines of code to implement take an order of magnitude fewer lines of code in Spark SQL, without sacrificing performance

From: GenAp: a distributed SQL interface for genomic data

Software tool                  Lines of code    Runtime (min)
Spark SQL                      11               16
State-of-the-art software
  BEDtools                     1                26
  samtools API-based code      130              1
  Total                        131              27