Table 1 Genometric RGMQL functions with their extension over already existing R functions and mapping to corresponding GMQL operators

From: RGMQL: scalable and interoperable computing of heterogeneous omics big data and metadata in R/Bioconductor

| R package of origin | RGMQL function | GMQL operator | Brief description |
| --- | --- | --- | --- |
| dplyr | arrange() | ORDER | It orders samples/sample regions based on metadata/region attributes |
| dplyr | collect() | MATERIALIZE | It persistently saves the content of any dataset obtained after query completion |
| dplyr | filter() | SELECT | It extracts a subset of samples/sample regions using region/metadata predicates |
| dplyr | group_by() | GROUP | It groups samples/sample regions based on region/metadata attributes with the same value |
| dplyr | select() | PROJECT | It selects the region/metadata attributes to be kept and can update/create metadata/region attributes |
| dplyr | setdiff() | DIFFERENCE | It discards the regions of the first dataset that intersect regions of the second one |
| dplyr | union() | UNION | It puts together the samples of two datasets, keeping as region attributes those of the first one |
| base | merge() | JOIN | It returns a dataset by joining the regions of two datasets based on distance/region predicates |
| stats | aggregate() | MERGE | It combines all the samples of a dataset into a single sample |
|  | cover() | COVER | It collapses the samples of a dataset into a single sample based on specified rules |
|  | execute() |  | It launches the query execution |
|  | extend() | EXTEND | It generates new metadata attributes for each sample from aggregations applied to region attributes |
|  | map() | MAP | It computes aggregated values from overlapping regions of two datasets |
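
To make the region-level descriptions of DIFFERENCE and MAP more concrete, the following is a minimal plain-Python sketch of their semantics on toy data. It is not RGMQL or GMQL code: the `Region` tuple layout, the half-open interval convention, the helper names (`overlaps`, `difference`, `map_count`), and the choice of a simple count as the aggregate are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical region representation: (chromosome, start, end),
# treated here as a half-open interval [start, end).
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two regions lie on the same chromosome and intersect."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def difference(first: List[Region], second: List[Region]) -> List[Region]:
    """DIFFERENCE-like step: drop regions of `first` that intersect any region of `second`."""
    return [r for r in first if not any(overlaps(r, s) for s in second)]


def map_count(reference: List[Region], experiment: List[Region]) -> List[Tuple[Region, int]]:
    """MAP-like step: for each reference region, count the overlapping experiment regions."""
    return [(r, sum(overlaps(r, e) for e in experiment)) for r in reference]


# Toy data, invented for this sketch.
reference = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
experiment = [("chr1", 150, 250), ("chr1", 180, 190), ("chr2", 500, 600)]

kept = difference(reference, experiment)
counts = map_count(reference, experiment)
```

In this toy case only the first reference region intersects the experiment regions, so `difference` drops it and `map_count` assigns it a count of 2 while the other two regions get 0. The quadratic scan is chosen for readability; a scalable engine would use indexed or sorted interval structures instead.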