Use this identifier to cite or link to this item: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1102528
Title: Relative scalability of NoSQL databases for genotype data manipulation.
Authors: ALMEIDA, A. L.
SCHETTINO, V. J.
BARBOSA, T. J. R.
FREITAS, P. F.
GUIMARÃES, P. G. S.
ARBEX, W. A.
Affiliation: WAGNER ANTONIO ARBEX, CNPGL.
Publication year: 2018
Reference: Revista de Informática Teórica e Aplicada, v. 25, n. 2, p. 93-100, 2018.
Content: Abstract. Genotype data manipulation is one of the greatest challenges in bioinformatics and genomics, mainly because of the high dimensionality and unbalanced characteristics of the data. These peculiarities explain why Relational Database Management Systems (RDBMSs), the "de facto" standard storage solution, have not been presented as the best tools for this kind of data. However, Big Data has been pushing the development of modern database systems that might be able to overcome the deficiencies of RDBMSs. In this context, we extended our previous works on the evaluation of relative performance among NoSQL engines from different families, adapting the schema design based on their conclusions in order to achieve better performance and thus store more SNP markers for each individual. Using the Yahoo! Cloud Serving Benchmark (YCSB) framework, we assessed each database system over hypothetical SNP sequences. Results indicate that although Tarantool has the best overall throughput, MongoDB is less impacted by the increase of SNP markers per individual.
NAL Thesaurus: Bioinformatics
Genotype
Keywords: Database
NoSQL
Data Science
SNP
Digital Object Identifier: 10.22456/2175-2745.79334
Material type: Journal article
Access: openAccess
Appears in collections: Artigo em periódico indexado (CNPGL)

Files associated with this item:
File: ArtigoRevInfTeorAplArbexRelative.pdf
Size: 251.44 kB
Format: Adobe PDF