
Table 2 Sizes of inputs and outputs for three different cell lines, and execution times (in minutes) for the TICA query over four cluster configurations

From: PyGMQL: scalable data extraction and analysis for heterogeneous genomic datasets

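| | GM12878 | HepG2 | K562 |
| --- | --- | --- | --- |
| Input samples | 164 | 224 | 347 |
| Distinct TFs | 116 | 192 | 268 |
| Input regions | 3,003,121 | 4,384,181 | 6,101,933 |
| Output samples | 13,454 | 36,330 | 71,612 |
| Output regions | 109,858,355 | 213,499,617 | 381,255,507 |
| Output size (MB) | 3,122 | 6,064 | 10,921 |
| Execution time, 1 node (min) | 26.73 | 73.05 | 246.85 |
| Execution time, 3 nodes (min) | 10.40 | 26.28 | 91.27 |
| Execution time, 5 nodes (min) | 7.21 | 16.67 | 59.12 |
| Execution time, 10 nodes (min) | 4.75 | 9.67 | 32.92 |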
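
The execution times above can be turned into speedup and parallel efficiency figures, which make the scaling behaviour easier to compare across cell lines. The table does not report these quantities; the sketch below derives them from the listed times only, taking the 1-node run as the baseline. The names `NODES`, `TIMES`, and `scaling` are illustrative and do not come from the paper or from PyGMQL.

```python
# Execution times in minutes, copied from the table (TICA query).
NODES = [1, 3, 5, 10]
TIMES = {
    "GM12878": [26.73, 10.40, 7.21, 4.75],
    "HepG2": [73.05, 26.28, 16.67, 9.67],
    "K562": [246.85, 91.27, 59.12, 32.92],
}


def scaling(times, nodes=NODES):
    """Return (nodes, speedup, efficiency) tuples.

    Speedup is the first (1-node) time divided by the time on n nodes;
    efficiency is speedup divided by n. This assumes nodes[0] == 1.
    """
    base = times[0]
    return [(n, base / t, base / t / n) for n, t in zip(nodes, times)]


for cell_line, times in TIMES.items():
    for n, speedup, efficiency in scaling(times):
        print(f"{cell_line:8s} {n:2d} nodes  "
              f"speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
```

Computed this way, the 10-node run is roughly 5.6 times faster than the 1-node run for GM12878 and roughly 7.5 times faster for HepG2 and K562.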