Machado: open source genomics data integration framework.

MUDADU, M. de A.; ZERLOTINI NETO, A.

Please use this identifier to cite or link to this item: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125289

Full metadata record

DC Field	Value	Language
dc.contributor.author	MUDADU, M. de A.
dc.contributor.author	ZERLOTINI NETO, A.
dc.date.accessioned	2020-10-06T09:14:12Z	-
dc.date.available	2020-10-06T09:14:12Z	-
dc.date.created	2020-10-05
dc.date.issued	2020
dc.identifier.citation	GigaScience, v. 9, n. 9, p. 1-16, Sept. 2020.
dc.identifier.uri	http://www.alice.cnptia.embrapa.br/alice/handle/doc/1125289	-
dc.description	Abstract. Background: Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is supposed to be accessible and, if possible, browsable afterwards. Computational biologists have been dealing with this scenario for more than a decade and have been implementing software and databases to meet this challenge. The GMOD (Generic Model Organism Database) biological relational database schema, known as Chado, is one of the few successful open source initiatives; it is widely adopted and many software packages are able to connect to it. Findings: We have been developing an open source software package named Machado, a genomics data integration framework implemented in Python, to enable research groups to both store and visualize genomics data. The framework relies on the Chado database schema and, therefore, should be intuitive for current developers to adopt or to run on top of existing databases. It has several data-loading tools for genomics and transcriptomics data and also for annotation results from tools such as BLAST, InterProScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web visualization tool is implemented using Django Views and Templates. The Haystack library integrated with the ElasticSearch engine was used to implement a Google-like search, i.e., a single auto-complete search box that provides fast results and filters. Conclusion: Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.
dc.language.iso	eng
dc.rights	openAccess	eng
dc.subject	Genomic data
dc.subject	Multiomics
dc.subject	Chado
dc.title	Machado: open source genomics data integration framework.
dc.type	Journal article
dc.subject.thesagro	Database
dc.subject.nalthesaurus	Python
dc.subject.nalthesaurus	Genomics
dc.description.notes	In the publication: Adhemar Zerlotini.
riaa.ainfo.id	1125289
riaa.ainfo.lastupdate	2020-10-06 -03:00:00
dc.identifier.doi	10.1093/gigascience/giaa097
dc.contributor.institution	MAURICIO DE ALVARENGA MUDADU, CNPTIA; ADHEMAR ZERLOTINI NETO, CNPTIA.
Appears in Collections:	Article in indexed journal (CNPTIA)

Files in This Item:

File	Description	Size	Format
AP-Machado-2020.pdf		1.47 MB	Adobe PDF
