From: PyGMQL: scalable data extraction and analysis for heterogeneous genomic datasets
PyGMQL function | Description | GMQL operator |
---|---|---|
load_from_path | UTIL, loads a dataset from local repository | SELECT |
load_from_remote | UTIL, loads a dataset from remote repository | SELECT |
load_from_file | UTIL, loads a bed file from local repository | |
select, reg_select, meta_select | UNOP, filters samples using region and/or metadata predicates | SELECT |
project, reg_project, meta_project | UNOP, projects (in/out) attributes of regions or metadata. Creates new attributes by means of expressions | PROJECT |
extend | UNOP, creates a new metadata attribute by aggregation of region data | EXTEND |
cover, normal_cover, flat_cover, summit_cover, histogram_cover | UNOP, collapses regions from several samples into regions of a single sample, based on min/max accumulation indexes | COVER |
order | UNOP, orders the samples of a dataset based on regions and/or metadata attributes | ORDER |
merge | UNOP, merges all the samples of a dataset into a single one | MERGE |
group, meta_group, reg_group | UNOP, groups regions and/or metadata with the same values | GROUP |
join | BINOP, joins the regions of two datasets based on distance-based predicates | JOIN |
map | BINOP, computes aggregate values from overlapping regions of two datasets | MAP |
union | BINOP, builds the union of regions and metadata of two datasets | UNION |
difference | BINOP, keeps the regions of a dataset not intersecting with regions of another one | DIFFERENCE |
materialize | UTIL, triggers the query execution for the specified dataset and stores the result after query completion | MATERIALIZE |
head | UTIL, shows the first lines of a dataset | |
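
To make the region semantics of the two binary operators `map` and `difference` concrete, the following is a minimal pure-Python sketch, not PyGMQL code. The names `Region`, `overlaps`, `map_count` and `difference` are hypothetical helpers introduced only for illustration. It assumes half-open coordinates, uses a simple count as the aggregate, and treats each dataset as a single sample, whereas GMQL operates on multi-sample datasets with metadata.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """A genomic region; coordinates are assumed half-open: [start, stop)."""
    chrom: str
    start: int
    stop: int


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one position on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.stop and b.start < a.stop


def map_count(reference: List[Region], experiment: List[Region]) -> List[Tuple[Region, int]]:
    """MAP-like step: for each reference region, count the overlapping experiment regions."""
    return [(ref, sum(overlaps(ref, exp) for exp in experiment)) for ref in reference]


def difference(left: List[Region], right: List[Region]) -> List[Region]:
    """DIFFERENCE-like step: keep the left regions that intersect no right region."""
    return [reg for reg in left if not any(overlaps(reg, other) for other in right)]


# Illustrative toy data (made up for this sketch).
genes = [Region("chr1", 100, 200), Region("chr1", 500, 600), Region("chr2", 100, 200)]
peaks = [Region("chr1", 150, 160), Region("chr1", 190, 250), Region("chr2", 300, 400)]

counts = map_count(genes, peaks)
uncovered = difference(genes, peaks)

for region, n in counts:
    print(region, n)
print(uncovered)
```

In this toy case the first gene overlaps two peaks and the other two overlap none, so `difference` keeps only the last two genes. The nested loops are quadratic and serve only to show the semantics; they say nothing about how GMQL executes these operators.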