Table 1 RNA-Seq datasets and the computing resources used for each dataset

From: K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity

| Organism | # of reads | # of unique k-mers | Computing resource (MR-Inchworm) | Computing resource (Original Inchworm) | Data source |
|---|---|---|---|---|---|
| mouse | 105,290,476 | 746,811,557 | iDataplex-nextscale | iDataplex-nextscale: single node (64 GB mem) | [22] |
| sugarbeet | 129,832,549 | 2,213,519,875 | iDataplex-nextscale | iDataplex: single node (256 GB mem) | unpublished data |
| wheat | 1,468,701,119 | 5,775,799,648 | iDataplex-nextscale | iDataplex: single vSMP node (4 TB mem) created by ScaleMP | unpublished data |
  1. All datasets are paired-end, and only the mouse dataset is strand-specific. The iDataplex-nextscale cluster, known as BlueWonder-NextScale, consists of 360 nodes, each with 2 × 12-core Intel Xeon processors (E5-2697v2, 2.7 GHz) and 64 GB RAM, for 8640 cores in total. The iDataplex cluster, known as “BlueWonder”, consists of 512 nodes, each with 2 × 8-core Intel SandyBridge processors (2.6 GHz), for 8192 cores in total. Original Inchworm was run on the sugarbeet dataset using a single iDataplex node with 256 GB memory, and on the wheat dataset using a single vSMP node with 4 TB memory created by ScaleMP software (http://www.scalemp.com) on iDataplex. ScaleMP creates a virtual symmetric multiprocessing (vSMP) node with shared memory by aggregating multiple compute nodes.