TY  - STD
TI  - NCBI: NCBI-GenBank Flat File Release 159 Release Notes. [ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb159.release.notes]
UR  - ftp://ftp.ncbi.nih.gov/genbank/release.notes/gb159.release.notes
ID  - ref1
ER  -

TY  - STD
TI  - NCBI News: GenBank Passes the 100 Gigabase Mark. In NCBI News Edited by: Benson D and Wheeler D. [http://www.ncbi.nlm.nih.gov/Web/Newsltr/V14N2/100gig.html]
UR  - http://www.ncbi.nlm.nih.gov/Web/Newsltr/V14N2/100gig.html
ID  - ref2
ER  -

TY  - JOUR
AU  - Ziv, J.
AU  - Lempel, A.
PY  - 1977
DA  - 1977//
TI  - Universal Algorithm for Sequential Data Compression
JO  - IEEE Transactions on Information Theory
VL  - 23
UR  - https://doi.org/10.1109/TIT.1977.1055714
DO  - 10.1109/TIT.1977.1055714
ID  - Ziv1977
ER  -

TY  - STD
TI  - Gailly J, Adler M: gzip (GNU zip) compression utility. [http://www.gnu.org/software/gzip/]
UR  - http://www.gnu.org/software/gzip/
ID  - ref4
ER  -

TY  - CHAP
AU  - Matsumoto, T.
AU  - Sadakane, K.
AU  - Imai, H.
PY  - 2000
DA  - 2000//
BT  - Biological sequence compression algorithms: December 18-19; Tokyo.
ID  - Matsumoto2000
ER  -

TY  - CHAP
AU  - Grumbach, S.
AU  - Tahi, F.
ED  - Storer, J. A.
ED  - Cohn, M.
PY  - 1993
DA  - 1993//
BT  - Compression of DNA sequences: 30 March-2 April; Snowbird, Utah.
ID  - Grumbach1993
ER  -

TY  - JOUR
AU  - Grumbach, S.
AU  - Tahi, F.
PY  - 1994
DA  - 1994//
TI  - A New Challenge for Compression Algorithms - Genetic Sequences
JO  - Inf Process Manage
VL  - 30
UR  - https://doi.org/10.1016/0306-4573(94)90014-0
DO  - 10.1016/0306-4573(94)90014-0
ID  - Grumbach1994
ER  -

TY  - JOUR
AU  - Chen, X.
AU  - Kwong, S.
AU  - Li, M.
PY  - 2001
DA  - 2001//
TI  - A compression algorithm for DNA sequences
JO  - IEEE Engineering in Medicine and Biology Magazine
VL  - 20
UR  - https://doi.org/10.1109/51.940049
DO  - 10.1109/51.940049
ID  - Chen2001
ER  -

TY  - JOUR
AU  - Chen, X.
AU  - Li, M.
AU  - Ma, B.
AU  - Tromp, J.
PY  - 2002
DA  - 2002//
TI  - DNACompress: fast and effective DNA sequence compression
JO  - Bioinformatics
VL  - 18
UR  - https://doi.org/10.1093/bioinformatics/18.12.1696
DO  - 10.1093/bioinformatics/18.12.1696
ID  - Chen2002
ER  -

TY  - JOUR
AU  - Li, M.
AU  - Badger, J. H.
AU  - Chen, X.
AU  - Kwong, S.
AU  - Kearney, P.
AU  - Zhang, H. Y.
PY  - 2001
DA  - 2001//
TI  - An information-based sequence distance and its application to whole mitochondrial genome phylogeny
JO  - Bioinformatics
VL  - 17
UR  - https://doi.org/10.1093/bioinformatics/17.2.149
DO  - 10.1093/bioinformatics/17.2.149
ID  - Li2001
ER  -

TY  - JOUR
AU  - Kocsor, A.
AU  - Kertesz-Farkas, A.
AU  - Kajan, L.
AU  - Pongor, S.
PY  - 2006
DA  - 2006//
TI  - Application of compression-based distance measures to protein sequence classification: a methodological study
JO  - Bioinformatics
VL  - 22
UR  - https://doi.org/10.1093/bioinformatics/bti806
DO  - 10.1093/bioinformatics/bti806
ID  - Kocsor2006
ER  -

TY  - JOUR
AU  - Ma, B.
AU  - Tromp, J.
AU  - Li, M.
PY  - 2002
DA  - 2002//
TI  - PatternHunter: faster and more sensitive homology search
JO  - Bioinformatics
VL  - 18
UR  - https://doi.org/10.1093/bioinformatics/18.3.440
DO  - 10.1093/bioinformatics/18.3.440
ID  - Ma2002
ER  -

TY  - JOUR
AU  - Strelets, V. B.
AU  - Lim, H. A.
PY  - 1995
DA  - 1995//
TI  - Compression of Protein-Sequence Databases
JO  - Comput Appl Biosci
VL  - 11
ID  - Strelets1995
ER  -

TY  - JOUR
AU  - Wu, C. H.
AU  - Yeh, L. S. L.
AU  - Huang, H. Z.
AU  - Arminski, L.
AU  - Castro-Alvear, J.
AU  - Chen, Y. X.
AU  - Hu, Z. Z.
AU  - Kourtesis, P.
AU  - Ledley, R. S.
AU  - Suzek, B. E.
AU  - Vinayaka, C. R.
AU  - Zhang, J.
AU  - Barker, W. C.
PY  - 2003
DA  - 2003//
TI  - The Protein Information Resource
JO  - Nucleic Acids Res
VL  - 31
UR  - https://doi.org/10.1093/nar/gkg040
DO  - 10.1093/nar/gkg040
ID  - Wu2003
ER  -

TY  - CHAP
AU  - Katz, P.
PY  - 1990
DA  - 1990//
BT  - PKZIP
PB  - PKWARE, Inc.
CY  - Milwaukee, WI, USA
ID  - Katz1990
ER  -

TY  - JOUR
AU  - Li, W. Z.
AU  - Jaroszewski, L.
AU  - Godzik, A.
PY  - 2001
DA  - 2001//
TI  - Clustering of highly homologous sequences to reduce the size of large protein databases
JO  - Bioinformatics
VL  - 17
UR  - https://doi.org/10.1093/bioinformatics/17.3.282
DO  - 10.1093/bioinformatics/17.3.282
ID  - Li2001
ER  -

TY  - JOUR
AU  - Li, W. Z.
AU  - Jaroszewski, L.
AU  - Godzik, A.
PY  - 2002
DA  - 2002//
TI  - Tolerating some redundancy significantly speeds up clustering of large protein databases
JO  - Bioinformatics
VL  - 18
UR  - https://doi.org/10.1093/bioinformatics/18.1.77
DO  - 10.1093/bioinformatics/18.1.77
ID  - Li2002
ER  -

TY  - JOUR
AU  - Li, W. Z.
AU  - Godzik, A.
PY  - 2006
DA  - 2006//
TI  - Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
JO  - Bioinformatics
VL  - 22
UR  - https://doi.org/10.1093/bioinformatics/btl158
DO  - 10.1093/bioinformatics/btl158
ID  - Li2006
ER  -

TY  - STD
TI  - nrdb [http://blast.wustl.edu/pub/nrdb/]
UR  - http://blast.wustl.edu/pub/nrdb/
ID  - ref19
ER  -

TY  - JOUR
AU  - Thompson, J. D.
AU  - Higgins, D. G.
AU  - Gibson, T. J.
PY  - 1994
DA  - 1994//
TI  - Clustal-W - Improving the Sensitivity of Progressive Multiple Sequence Alignment through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice
JO  - Nucleic Acids Res
VL  - 22
UR  - https://doi.org/10.1093/nar/22.22.4673
DO  - 10.1093/nar/22.22.4673
ID  - Thompson1994
ER  -

TY  - JOUR
AU  - Foulds, L. R.
AU  - Graham, R. L.
PY  - 1982
DA  - 1982//
TI  - The Steiner problem in phylogeny is NP-complete
JO  - Advances in Applied Mathematics
VL  - 3
UR  - https://doi.org/10.1016/S0196-8858(82)80004-3
DO  - 10.1016/S0196-8858(82)80004-3
ID  - Foulds1982
ER  -

TY  - JOUR
AU  - Chazelle, B.
PY  - 2000
DA  - 2000//
TI  - A minimum spanning tree algorithm with Inverse Ackermann type complexity
JO  - Journal of the ACM
VL  - 47
UR  - https://doi.org/10.1145/355541.355562
DO  - 10.1145/355541.355562
ID  - Chazelle2000
ER  -

TY  - JOUR
AU  - Ferragina, P.
AU  - Manzini, G.
PY  - 2005
DA  - 2005//
TI  - Indexing compressed text
JO  - J ACM
VL  - 52
UR  - https://doi.org/10.1145/1082036.1082039
DO  - 10.1145/1082036.1082039
ID  - Ferragina2005
ER  -

TY  - CHAP
AU  - Russo, L. M. S.
AU  - Oliveira, A. L.
PY  - 2006
DA  - 2006//
TI  - A compressed self-index using a Ziv-Lempel dictionary
BT  - String Processing and Information Retrieval, Proceedings
PB  - Springer-Verlag Berlin
CY  - Berlin
UR  - https://doi.org/10.1007/11880561_14
DO  - 10.1007/11880561_14
ID  - Russo2006
ER  -

TY  - JOUR
AU  - Foschini, L.
AU  - Grossi, R.
AU  - Gupta, A.
AU  - Vitter, J. S.
PY  - 2006
DA  - 2006//
TI  - When indexing equals compression: Experiments with compressing suffix arrays and applications
JO  - ACM Trans Algorithms
VL  - 2
UR  - https://doi.org/10.1145/1198513.1198521
DO  - 10.1145/1198513.1198521
ID  - Foschini2006
ER  -

TY  - JOUR
AU  - Lipman, D. J.
AU  - Pearson, W. R.
PY  - 1985
DA  - 1985//
TI  - Rapid and Sensitive Protein Similarity Searches
JO  - Science
VL  - 227
UR  - https://doi.org/10.1126/science.2983426
DO  - 10.1126/science.2983426
ID  - Lipman1985
ER  -

TY  - CHAP
AU  - Seward, J.
PY  - 1997
DA  - 1997//
BT  - bzip2 and libbzip2 - A program and library for data compression
ID  - Seward1997
ER  -

TY  - JOUR
AU  - Hunt, J. W.
AU  - Szymanski, T. G.
PY  - 1977
DA  - 1977//
TI  - A Fast Algorithm for Computing Longest Common Subsequences
JO  - Communications of the ACM
VL  - 20
UR  - https://doi.org/10.1145/359581.359603
DO  - 10.1145/359581.359603
ID  - Hunt1977
ER  -

TY  - JOUR
AU  - Ning, Z. M.
AU  - Cox, A. J.
AU  - Mullikin, J. C.
PY  - 2001
DA  - 2001//
TI  - SSAHA: A fast search method for large DNA databases
JO  - Genome Res
VL  - 11
UR  - https://doi.org/10.1101/gr.194201
DO  - 10.1101/gr.194201
ID  - Ning2001
ER  -

TY  - CHAP
AU  - Burkhardt, S.
AU  - Karkkainen, J.
PY  - 2002
DA  - 2002//
TI  - One-gapped q-gram filters for Levenshtein distance
BT  - Combinatorial Pattern Matching
PB  - Springer-Verlag Berlin
CY  - Berlin
UR  - https://doi.org/10.1007/3-540-45452-7_19
DO  - 10.1007/3-540-45452-7_19
ID  - Burkhardt2002
ER  -

TY  - JOUR
AU  - Kruskal, J. B.
PY  - 1956
DA  - 1956//
TI  - On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem
JO  - Proceedings of the American Mathematical Society
VL  - 7
UR  - https://doi.org/10.1090/S0002-9939-1956-0078686-7
DO  - 10.1090/S0002-9939-1956-0078686-7
ID  - Kruskal1956
ER  -

TY  - JOUR
AU  - Prim, R. C.
PY  - 1957
DA  - 1957//
TI  - Shortest Connection Networks and Some Generalizations
JO  - Bell System Technical Journal
VL  - 36
UR  - https://doi.org/10.1002/j.1538-7305.1957.tb01515.x
DO  - 10.1002/j.1538-7305.1957.tb01515.x
ID  - Prim1957
ER  -

TY  - CHAP
AU  - Moret, B.
AU  - Shapiro, H.
PY  - 1991
DA  - 1991//
BT  - Algorithms from P to NP: Design and Efficiency
PB  - Benjamin/Cummings
CY  - Redwood City, CA
ID  - Moret1991
ER  -

TY  - JOUR
AU  - Tarjan, R. E.
PY  - 1975
DA  - 1975//
TI  - Efficiency of a Good but Not Linear Set Union Algorithm
JO  - J ACM
VL  - 22
UR  - https://doi.org/10.1145/321879.321884
DO  - 10.1145/321879.321884
ID  - Tarjan1975
ER  -

TY  - JOUR
AU  - Myers, E. W.
PY  - 1986
DA  - 1986//
TI  - An O(ND) Difference Algorithm and its Variations
JO  - Algorithmica
VL  - 1
UR  - https://doi.org/10.1007/BF01840446
DO  - 10.1007/BF01840446
ID  - Myers1986
ER  -

TY  - STD
TI  - GenBank Sequence Database [http://www.ncbi.nlm.nih.gov/Genbank/index.html]
UR  - http://www.ncbi.nlm.nih.gov/Genbank/index.html
ID  - ref36
ER  -

TY  - CHAP
AU  - Shkarin, D.
PY  - 2002
DA  - 2002//
BT  - PPM: One Step to Practicality
ID  - Shkarin2002
ER  -

TY  - STD
TI  - 7-Zip [http://www.7-zip.org]
UR  - http://www.7-zip.org
ID  - ref38
ER  -