Please use this identifier to cite or link to this item: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1176510
Title: SNP discovery of baru tree (Dipteryx alata Vogel) accessions enriches its genomic toolkit for species conservation, domestication, and breeding.
Authors: PESSOA FILHO, M. A. C. de P.
DE CAMPOS TELLES, M. P.
COELHO, A. S. G.
CHAVES, L. J.
SOARES, T. N.
ANDRÉ, T.
Affiliation: MARCO AURÉLIO CALDAS DE PINHO PESSO, CENARGEN; MARIANA P. DE CAMPOS TELLES, UNIVERSIDADE FEDERAL DE GOIÁS; ALEXANDRE S. G. COELHO, UNIVERSIDADE FEDERAL DE GOIÁS; LÁZARO J. CHAVES, UNIVERSIDADE FEDERAL DE GOIÁS; THANNYA N. SOARES, UNIVERSIDADE FEDERAL DE GOIÁS; THIAGO ANDRÉ, UNIVERSIDADE FEDERAL DE GOIÁS.
Date Issued: 2024
Citation: In: CONGRESSO NACIONAL DE BOTÂNICA, 74., 2024, Brasília, DF. Botânica Brasileira celebrando a diversidade: livro de resumos. Brasília, DF: Sociedade Botânica do Brasil, 2024
Description: Baru seeds are protein-rich sources of nutrients with a supply chain primarily based on extractivism. The baru tree (Dipteryx alata Vogel, Fabaceae) is a neotropical species native to Latin American savannas, listed as vulnerable in the IUCN Red List of Threatened Species. Conservation, domestication, and breeding efforts would benefit from a rich set of genomic resources, which already includes a draft genome assembly. We used whole-genome sequencing (WGS) of individual baru trees to discover and genotype single-nucleotide polymorphisms (SNPs) on a genome-wide scale. Trees belonging to 24 accessions of the baru germplasm collection at Universidade Federal de Goiás were selected for WGS. Young leaf tissue was collected for DNA extraction, and an Illumina DNA library was prepared with Nextera DNA Flex Indexes. The library was paired-end sequenced (2 x 300 bp) on a NextSeq 1000. A customized pipeline with successive iterations of analyses in GATK and FreeBayes was used to build a high-quality SNP database. Adapter-marked reads were mapped to the draft assembly with BWA, followed by the marking of duplicates. The HaplotypeCaller and GenotypeGVCFs tools of GATK were used in a first round of genotyping and hard filtering, generating a dataset for base quality score recalibration (BQSR). Recalibrated BAMs were used to call SNPs with FreeBayes, followed by hard filtering based on QUAL and DP annotations. Variants common to the GATK and FreeBayes call sets were intersected, followed by selection of biallelic variants with MAF > 0.05 and pruning to one variant every 150 bp. This variant resource was used for a first round of Variant Quality Score Recalibration (VQSR) to obtain a sensitive variant dataset with a truth sensitivity threshold of 95.0.
This dataset was used in a second round of BQSR, after which the pipeline was rerun with the re-recalibrated BAMs, including a second round of VQSR and selection of variants in the truth sensitivity tranche 90.0. The final dataset includes 14,428,145 variants, of which 8,509,545 are biallelic with MAF > 0.05, representing one variant every 94 bp of the baru tree genome. Genome-wide linkage disequilibrium (LD) estimates from a pruned dataset of 2 million SNPs showed a mean r² of 0.30 and an LD decay of 5.4 kbp. Ongoing efforts of chromosome-scale scaffolding of the draft assembly and prediction and annotation of gene models will allow further selection of SNPs for germplasm characterization and breeding.
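The variant-selection step described above (intersecting the two call sets, keeping biallelic SNPs with MAF > 0.05, and thinning to one variant every 150 bp) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` record, the function names, the toy coordinates, and the greedy reading of "one variant every 150 bp" (keep a variant only if it lies at least 150 bp after the last kept variant on the same contig) are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a real VCF parser)."""
    contig: str
    pos: int
    ref: str
    alts: Tuple[str, ...]
    alt_freq: float  # frequency of the first alternate allele


def key(v: Variant) -> Tuple[str, int, str, Tuple[str, ...]]:
    """Site identity used to match calls between two callers."""
    return (v.contig, v.pos, v.ref, v.alts)


def intersect(a: List[Variant], b: List[Variant]) -> List[Variant]:
    """Keep variants from `a` that are also present in `b`."""
    shared = {key(v) for v in b}
    return [v for v in a if key(v) in shared]


def is_biallelic_snp(v: Variant) -> bool:
    """One alternate allele, and both alleles are single bases."""
    return len(v.alts) == 1 and len(v.ref) == 1 and len(v.alts[0]) == 1


def maf(v: Variant) -> float:
    """Minor allele frequency of a biallelic site."""
    return min(v.alt_freq, 1.0 - v.alt_freq)


def prune_by_distance(variants: List[Variant], min_gap: int = 150) -> List[Variant]:
    """Greedy thinning (assumed interpretation): kept variants on the
    same contig are at least `min_gap` bp apart."""
    kept: List[Variant] = []
    last_pos: Dict[str, int] = {}
    for v in sorted(variants, key=lambda x: (x.contig, x.pos)):
        if v.contig not in last_pos or v.pos - last_pos[v.contig] >= min_gap:
            kept.append(v)
            last_pos[v.contig] = v.pos
    return kept


# Toy call sets standing in for the two callers' outputs (invented data).
caller_a = [
    Variant("scaf1", 100, "A", ("G",), 0.30),
    Variant("scaf1", 180, "C", ("T",), 0.50),
    Variant("scaf1", 260, "G", ("A",), 0.02),      # fails the MAF filter
    Variant("scaf1", 300, "T", ("C", "G"), 0.40),  # multiallelic
    Variant("scaf1", 400, "A", ("T",), 0.90),      # MAF = 0.10
    Variant("scaf2", 50, "C", ("G",), 0.10),       # missing from caller_b
]
caller_b = caller_a[:5] + [Variant("scaf2", 75, "G", ("A",), 0.20)]

common = intersect(caller_a, caller_b)
filtered = [v for v in common if is_biallelic_snp(v) and maf(v) > 0.05]
selected = prune_by_distance(filtered, min_gap=150)

print([(v.contig, v.pos) for v in selected])
```

On real data these steps would normally be run on VCF files with dedicated tools; the sketch only makes the order of operations and the filtering criteria explicit.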
NAL Thesaurus: Plant genetic resources
Keywords: Plant genomics
Notes: In the publication: Pessoa Filho, Marco.
Type of Material: Abstract in annals and proceedings
Access: openAccess
Appears in Collections: Abstract in conference proceedings (CENARGEN)

Files in This Item:
File: resumo-cnbot-baru.pdf (106.15 kB, Adobe PDF)
