Table 1 Genometric RGMQL functions with their extension over already existing R functions and mapping to corresponding GMQL operators

From: RGMQL: scalable and interoperable computing of heterogeneous omics big data and metadata in R/Bioconductor

| R package of origin | RGMQL function | GMQL operator | Brief description |
| --- | --- | --- | --- |
| dplyr | arrange() | ORDER | It orders samples or sample regions based on metadata or region attributes |
| dplyr | collect() | MATERIALIZE | It persistently saves the content of any dataset obtained after query completion |
| dplyr | filter() | SELECT | It extracts a subset of samples or sample regions using region or metadata predicates |
| dplyr | group_by() | GROUP | It groups samples or sample regions based on region or metadata attributes with the same value |
| dplyr | select() | PROJECT | It selects the region or metadata attributes to be kept and can update or create metadata or region attributes |
| dplyr | setdiff() | DIFFERENCE | It discards the regions of the first dataset that intersect regions of the second one |
| dplyr | union() | UNION | It puts together the samples of two datasets, keeping as region attributes those of the first one |
| base | merge() | JOIN | It returns a dataset by joining the regions of two datasets based on distance or region predicates |
| stats | aggregate() | MERGE | It combines all the samples of a dataset into a single sample |
| – | cover() | COVER | It collapses the samples of a dataset into a single sample based on specified rules |
| – | execute() | – | It launches the query execution |
| – | extend() | EXTEND | It generates new metadata attributes for each sample from aggregations applied to region attributes |
| – | map() | MAP | It computes aggregated values from overlapping regions of two datasets |
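
To make the last row more concrete, the following is a minimal sketch of the kind of computation the MAP description refers to, using a simple count as the aggregate. It is plain Python and does not call RGMQL or GMQL; the names `Region`, `overlaps` and `map_count` are hypothetical, and the half-open `[start, stop)` coordinate convention is an assumption made only for this illustration.

```python
from collections import namedtuple

# Hypothetical region record: chromosome plus half-open [start, stop) coordinates.
Region = namedtuple("Region", ["chrom", "start", "stop"])


def overlaps(a, b):
    """Return True when two regions on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start < b.stop and b.start < a.stop


def map_count(reference, experiment):
    """For each reference region, count the experiment regions that overlap it."""
    return [
        (ref, sum(1 for exp in experiment if overlaps(ref, exp)))
        for ref in reference
    ]


# Toy data: two reference regions and three experiment regions.
exons = [Region("chr1", 100, 200), Region("chr1", 300, 400)]
mutations = [
    Region("chr1", 150, 151),
    Region("chr1", 199, 250),
    Region("chr2", 150, 151),
]

for region, count in map_count(exons, mutations):
    print(region.chrom, region.start, region.stop, count)
```

In this toy case the first reference region overlaps two experiment regions and the second overlaps none. The table describes the operator as working on two datasets and producing aggregated values in general, so counting is only one possible aggregate and the sketch covers a single pair of region lists.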