Use this identifier to cite or link to this item: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1185675
Title: Protein family membership governs exosite predictability across the structural proteome.
Authors: OMAGE, F. B.
MAZONI, I.
YANO, I. H.
NESHICH, G.
Affiliation: FOLORUNSHO BRIGHT OMAGE, UNIVERSIDADE ESTADUAL DE CAMPINAS; IVAN MAZONI, CNPTIA; INACIO HENRIQUE YANO, CNPTIA; GORAN NESIC, CNPTIA.
Publication year: 2026
Reference: Artificial Intelligence in the Life Sciences, v. 9, 100166, June 2026.
Abstract: Exosites, defined as protein surface regions that mediate macromolecular recognition at sites distinct from catalytic centers, represent emerging targets for selective drug design, yet their structural diversity has precluded systematic computational identification. Here we demonstrate that exosite prediction performance varies substantially across protein families, ranging from a Matthews correlation coefficient (MCC) of 0.47 for coagulation factors to 0.14 for kinases. Using ExositeDB, we developed STINGExoFind, a gradient boosting framework leveraging 87 structural descriptors from the STINGRDB2 database, and evaluated 180 proteins under leave-one-protein-out cross-validation (LOPO-CV). Coagulation proteases achieved 50% success rates at the MCC ≥ 0.5 threshold, whereas kinases and caspases remained largely unpredictable. Ten structures spanning six families exceeded MCC ≥ 0.7, including MAPK/ERK2 (MCC = 0.86) within the otherwise challenging kinase family, indicating that high-confidence predictions remain achievable for specific proteins even in poorly performing families. These results establish exosite prediction as a family-specific rather than universal challenge: computational approaches can meaningfully guide experimental validation for coagulation factors and similarly consistent protein families, while structurally diverse families require experimental characterization. STINGExoFind is provided as a community resource to support future method development and exosite-targeting drug discovery.
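Note: the evaluation protocol named in the abstract (LOPO-CV scored by per-protein MCC, then a success rate at an MCC threshold) can be illustrated with a minimal Python sketch. This is not the STINGExoFind code. The function names, the binary residue labels (1 = exosite, 0 = other), and the convention of returning 0.0 when the MCC denominator is zero are all assumptions made for illustration.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: an empty row or column in the matrix scores 0.0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def per_protein_mcc(labels, preds):
    """MCC over the residues of one protein (hypothetical 0/1 labels)."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return mcc(tp, tn, fp, fn)


def lopo_splits(protein_ids):
    """Yield (held_out, train_idx, test_idx), one fold per distinct protein.

    protein_ids holds one protein identifier per residue row, so every
    residue of the held-out protein is kept out of training.
    """
    for held_out in sorted(set(protein_ids)):
        train = [i for i, p in enumerate(protein_ids) if p != held_out]
        test = [i for i, p in enumerate(protein_ids) if p == held_out]
        yield held_out, train, test


def success_rate(scores, threshold=0.5):
    """Fraction of proteins whose MCC meets or exceeds the threshold."""
    return sum(1 for s in scores if s >= threshold) / len(scores)
```

In such a setup, a classifier would be fitted on the `train` rows of each fold and scored with `per_protein_mcc` on the `test` rows; grouping the resulting scores by protein family before calling `success_rate` would give family-level figures of the kind the abstract reports.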
NAL Thesaurus: Protein structure
Keywords: Exosite prediction
Machine learning
Protein structure
Gradient boosting
Nanoenvironment descriptors
Drug discovery
ISSN: 2667-3185
Digital Object Identifier: https://doi.org/10.1016/j.ailsci.2026.100166
Material type: Journal article
Access: openAccess
Appears in collections: Indexed journal article (CNPTIA)

Files associated with this item:
File                          Size        Format
AA-Protein-family-2026.pdf    925.33 kB   Adobe PDF
