TY - JOUR
AU - Giancarlo, R.
AU - Scaturro, D.
AU - Utro, F.
PY - 2009
DA - 2009//
TI - Textual data compression in computational biology: a synopsis
JO - Bioinformatics
VL - 25
UR - https://doi.org/10.1093/bioinformatics/btp117
DO - 10.1093/bioinformatics/btp117
ID - Giancarlo2009
ER -

TY - JOUR
AU - Hsi-Yang Fritz, M.
AU - Leinonen, R.
AU - Cochrane, G.
AU - Birney, E.
PY - 2011
DA - 2011//
TI - Efficient storage of high throughput DNA sequencing data using reference-based compression
JO - Genome Res
VL - 21
UR - https://doi.org/10.1101/gr.114819.110
DO - 10.1101/gr.114819.110
ID - Hsi-Yang Fritz2011
ER -

TY - JOUR
AU - Pavlichin, D.
AU - Weissman, T.
AU - Mably, G.
PY - 2018
DA - 2018//
TI - The quest to save genomics: Unless researchers solve the looming data compression problem, biomedical science could stagnate
JO - IEEE Spectr
VL - 55
UR - https://doi.org/10.1109/MSPEC.2018.8449046
DO - 10.1109/MSPEC.2018.8449046
ID - Pavlichin2018
ER -

TY - JOUR
AU - Hernaez, M.
AU - Pavlichin, D.
AU - Weissman, T.
AU - Ochoa, I.
PY - 2019
DA - 2019//
TI - Genomic data compression
JO - Annu Rev Biomed Data Sci
VL - 2
UR - https://doi.org/10.1146/annurev-biodatasci-072018-021229
DO - 10.1146/annurev-biodatasci-072018-021229
ID - Hernaez2019
ER -

TY - STD
TI - Collet Yann. LZ4; 2011 (Available from: https://github.com/lz4/lz4).
UR - https://github.com/lz4/lz4
ID - ref5
ER -

TY - STD
TI - Seward Julian. BZIP2; 1996 (Available from: http://www.bzip.org/).
UR - http://www.bzip.org/
ID - ref6
ER -

TY - JOUR
AU - Giancarlo, R.
AU - Rombo, S. E.
AU - Utro, F.
PY - 2014
DA - 2014//
TI - Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies
JO - Brief Bioinform
VL - 15
UR - https://doi.org/10.1093/bib/bbt088
DO - 10.1093/bib/bbt088
ID - Giancarlo2014
ER -

TY - JOUR
AU - Numanagić, I.
AU - Bonfield, J. K.
AU - Hach, F.
AU - Voges, J.
AU - Ostermann, J.
AU - Alberti, C.
PY - 2016
DA - 2016//
TI - Comparison of high-throughput sequencing data compression tools
JO - Nat Methods
VL - 13
UR - https://doi.org/10.1038/nmeth.4037
DO - 10.1038/nmeth.4037
ID - Numanagić2016
ER -

TY - JOUR
AU - Kahn, S. D.
PY - 2011
DA - 2011//
TI - On the future of genomic data
JO - Science
VL - 331
UR - https://doi.org/10.1126/science.1197891
DO - 10.1126/science.1197891
ID - Kahn2011
ER -

TY - JOUR
AU - Dean, J.
AU - Ghemawat, S.
PY - 2008
DA - 2008//
TI - MapReduce: simplified data processing on large clusters
JO - Commun ACM
VL - 51
UR - https://doi.org/10.1145/1327452.1327492
DO - 10.1145/1327452.1327492
ID - Dean2008
ER -

TY - BOOK
AU - White, T.
PY - 2015
DA - 2015//
TI - Hadoop: the definitive guide
PB - O’Reilly
CY - Beijing
ID - White2015
ER -

TY - STD
TI - Chambers B, Zaharia M. Spark: The definitive guide: Big data processing made simple. “O’Reilly Media, Inc.”; 2018.
ID - ref12
ER -

TY - STD
TI - Cattaneo G, Giancarlo R, Ferraro Petrillo U, Roscigno G. MapReduce in Computational Biology Via Hadoop and Spark. In: Ranganathan, S N, K SC, Gribskov M, editors. Encyclopedia of Bioinformatics and Computational Biology. vol. 1. Oxford: Elsevier; 2019, p. 221–229.
ID - ref13
ER -

TY - STD
TI - Shi H, Zhu Y, Samsudin J. Reference-based data compression for genome in cloud. In: Proceedings of the 2nd International Conference on Communication and Information Processing; 2016. p. 55–59.
ID - ref14
ER -

TY - JOUR
AU - Chandak, S.
AU - Tatwawadi, K.
AU - Ochoa, I.
AU - Hernaez, M.
AU - Weissman, T.
PY - 2019
DA - 2019//
TI - SPRING: a next-generation compressor for FASTQ data
JO - Bioinformatics
VL - 35
UR - https://doi.org/10.1093/bioinformatics/bty1015
DO - 10.1093/bioinformatics/bty1015
ID - Chandak2019
ER -

TY - JOUR
AU - Roguski, Ł.
AU - Deorowicz, S.
PY - 2014
DA - 2014//
TI - DSRC 2: Industry-oriented compression of FASTQ files
JO - Bioinformatics
VL - 30
UR - https://doi.org/10.1093/bioinformatics/btu208
DO - 10.1093/bioinformatics/btu208
ID - Roguski2014
ER -

TY - STD
TI - Collet Yann. ZSTD; 2015. (Available from: https://github.com/facebook/zstd).
UR - https://github.com/facebook/zstd
ID - ref17
ER -

TY - STD
TI - Bonfield JK, Mahoney MV. Compression of FASTQ and SAM format sequencing data. PloS one. 2013;8(3).
ID - ref18
ER -

TY - JOUR
AU - Pinho, A. J.
AU - Pratas, D.
PY - 2013
DA - 2013//
TI - MFCompress: a compression tool for FASTA and multi-FASTA data
JO - Bioinformatics
VL - 30
UR - https://doi.org/10.1093/bioinformatics/btt594
DO - 10.1093/bioinformatics/btt594
ID - Pinho2013
ER -

TY - STD
TI - Wegrzyn JL, Lin BY, Zieve JJ, Dougherty WM, Martinez-Garcia PJ, Koriabine M, et al. Insights into the loblolly pine genome: characterization of BAC and fosmid sequences. PLoS One. 2013;8(9).
ID - ref20
ER -

TY - JOUR
AU - Bentley, D. R.
AU - Balasubramanian, S.
AU - Swerdlow, H. P.
AU - Smith, G. P.
AU - Milton, J.
AU - Brown, C. G.
PY - 2008
DA - 2008//
TI - Accurate whole human genome sequencing using reversible terminator chemistry
JO - Nature
VL - 456
UR - https://doi.org/10.1038/nature07517
DO - 10.1038/nature07517
ID - Bentley2008
ER -

TY - JOUR
AU - Ferraro Petrillo, U.
AU - Roscigno, G.
AU - Cattaneo, G.
AU - Giancarlo, R.
PY - 2017
DA - 2017//
TI - FASTdoop: a versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications
JO - Bioinformatics (Oxford, Engl)
VL - 33
ID - Ferraro Petrillo2017
ER -

TY - STD
TI - Burrows M, Wheeler DJ. A block-sorting lossless data compression algorithm; 1994.
ID - ref23
ER -