Table 2 Additional RGMQL functions to handle initialization, remote data exploration, processing and result conversions

From: RGMQL: scalable and interoperable computing of heterogeneous omics big data and metadata in R/Bioconductor

Function type RGMQL function Brief description Input dataset Output dataset Remote processing required
FUNCTIONS TO HANDLE, READ AND ANALYZE LOCAL AND REMOTE DATASETS, PROVIDING ALSO USEFUL CONVERSIONS
delete_dataset() It deletes a private dataset from remote repository Remote dataset YES
download_dataset() It downloads a private dataset from remote repository to local path Remote dataset Local dataset YES
download_as_GRangesList() It downloads a private dataset into R environment as a GRangesList Remote dataset GRangesList YES
export_gmql() It creates a GDM-like dataset from a GRangesList GRangesList Local dataset NO
filter_and_extract() It filters based on metadata predicates and generates a new GRanges with a chosen list of region attributes. It works if samples have their region coordinates (chr, ranges, strand) in the same order Local dataset/ GRangesList GRanges NO
import_gmql() It creates a GRangesList from a GDM-like dataset Local dataset GRangesList NO
read_gmql() It reads a GMQLDataset from a dataset (with a valid format) on disk, or from the remote repository in case of remote processing Local/Remote dataset GMQLDataset YES, if is_local = FALSE
read_GRangesList() It reads a GMQLDataset from a GRangesList GRangesList GMQLDataset NO
sample_metadata() It retrieves metadata of a specific sample in a dataset Remote dataset YES
sample_region() It retrieves region data of a specific sample in a dataset Remote dataset YES
semijoin() It supports the filter method defining semijoin conditions on metadata NO
show_datasets_list() It shows all GMQL datasets in remote repository, both public and privately stored by the user YES
show_all_metadata() It shows all metadata of a given GMQL dataset either locally or in the remote repository NO
show_samples_list() It shows all samples of a GMQL dataset on the remote repository YES
show_schema() It shows the region attribute schema of a GMQL dataset on the remote repository YES
take() It saves as a GRangesList any dataset resulting from local processing. If invoked after collect(), the dataset is also materialized in the local file system GMQLDataset GRangesList NO, only for local processing
upload_dataset() It uploads a dataset (GDM or not), and a corresponding GMQL dataset is created on the remote repository Local dataset Remote dataset YES
FUNCTIONS TO HANDLE GMQL SERVER AND MONITOR REMOTE JOBS, IF NEEDED
init_gmql() It initializes and runs GMQL server to execute any processing, and also performs a login to GMQL REST services suite, if needed NO
login_gmql() It logs in to GMQL REST services suite as a registered user, specifying username and password, or as guest YES
logout_gmql() It logs out from GMQL REST services suite YES
register_gmql() It registers a new user to GMQL REST services suite YES
remote_processing() It enables or disables remote processing YES
show_jobs_log() It shows the log of a specific job YES
trace_job() It traces a specific job YES
show_job_list() It shows all jobs (running, succeeded or failed) invoked by the user on the remote GMQL server YES
show_queries_list() It shows all the GMQL queries saved by the user on the remote repository YES
stop_gmql() It stops the GMQL server processing NO
stop_job() It stops a specific job YES
FUNCTIONS USING QUERIES IN GMQL SYNTAX
compile_query() It compiles a GMQL query inserted as a text string YES
compile_query_fromfile() It compiles a GMQL query taken from a file YES
run_query() It runs a GMQL query inserted as a text string YES
run_query_fromfile() It runs a GMQL query taken from a file YES
save_query() It saves into the remote repository a GMQL query, inserted as a text string YES
save_query_fromfile() It saves into the remote repository a GMQL query, taken from a file YES
  1. For each function, we report if it requires remote resources and processing, as well as the formats of its input and output data
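
As an illustration of the local conversion functions listed in the table (those marked NO for remote processing), the following is a minimal sketch of a round trip between a GDM-like dataset on disk and a GRangesList. It rests on several assumptions that are not stated in the table: that the RGMQL package is installed and ships an example dataset under the "example/DATASET" subdirectory, that this example is stored in GTF format, and that the argument order is (dataset_path, is_gtf) for import_gmql() and (samples, dir_out, is_gtf) for export_gmql(). Please check the package help pages before relying on these details.

```r
# Minimal sketch, assuming RGMQL is installed from Bioconductor.
library(RGMQL)

# ASSUMPTION: the package bundles an example dataset in this subdirectory,
# and it is stored in GTF format (hence is_gtf = TRUE below).
dataset_path <- system.file("example", "DATASET", package = "RGMQL")

# import_gmql(): Local dataset -> GRangesList, no remote processing.
# ASSUMPTION: second argument is the is_gtf flag.
grl <- import_gmql(dataset_path, TRUE)

# Each element of the list should correspond to one sample of the dataset.
print(length(grl))

# export_gmql(): GRangesList -> Local dataset, no remote processing.
# out_dir is a hypothetical scratch folder used only for this illustration.
out_dir <- file.path(tempdir(), "rgmql_export")
dir.create(out_dir, showWarnings = FALSE)

# ASSUMPTION: arguments are (samples, dir_out, is_gtf).
export_gmql(grl, out_dir, TRUE)
```

Neither call contacts the remote repository, which is consistent with the NO entries in the "Remote processing required" column for these two functions.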