Please use this identifier to cite or link to this item: http://www.alice.cnptia.embrapa.br/alice/handle/doc/1185675
Title: Protein family membership governs exosite predictability across the structural proteome.
Author: OMAGE, F. B.
MAZONI, I.
YANO, I. H.
NESHICH, G.
Affiliation: FOLORUNSHO BRIGHT OMAGE, UNIVERSIDADE ESTADUAL DE CAMPINAS; IVAN MAZONI, CNPTIA; INACIO HENRIQUE YANO, CNPTIA; GORAN NESIC, CNPTIA.
Year: 2026
Reference: Artificial Intelligence in the Life Sciences, v. 9, 100166, June 2026.
Description: Exosites, defined as protein surface regions that mediate macromolecular recognition at sites distinct from catalytic centers, represent emerging targets for selective drug design, yet their structural diversity has precluded systematic computational identification. Here we demonstrate that exosite prediction performance varies substantially across protein families, ranging from Matthews correlation coefficient (MCC) of 0.47 for coagulation factors to 0.14 for kinases. Using ExositeDB, we developed STINGExoFind, a gradient boosting framework leveraging 87 structural descriptors from the STINGRDB2 database, and evaluated 180 proteins under leave-one-protein-out cross-validation (LOPO-CV). Coagulation proteases achieved 50% success rates at the MCC ≥ 0.5 threshold, whereas kinases and caspases remained largely unpredictable. Ten structures spanning six families exceeded MCC ≥ 0.7, including MAPK/ERK2 (MCC = 0.86) within the otherwise challenging kinase family, indicating that high-confidence predictions remain achievable for specific proteins even in poorly-performing families. These results establish exosite prediction as a family-specific rather than universal challenge: computational approaches can meaningfully guide experimental validation for coagulation factors and similarly consistent protein families, while structurally diverse families require experimental characterization. STINGExoFind is provided as a community resource to support future method development and exosite-targeting drug discovery.
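Illustrative note: the evaluation terms used in the description (MCC, leave-one-protein-out cross-validation, and success rate at an MCC threshold) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the STINGExoFind implementation: it assumes binary per-residue labels (1 = exosite residue, 0 = other), and the function names `mcc`, `lopo_splits`, and `success_rate` are hypothetical names chosen for this example.

```python
import math
from collections import defaultdict


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the
    # confusion matrix is empty and the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def lopo_splits(protein_ids):
    """Yield (held_out_protein, train_indices, test_indices).

    protein_ids[i] is the protein that sample i (assumed here to be a
    residue) belongs to. Each protein is held out exactly once, so no
    sample from the test protein appears in the training set.
    """
    by_protein = defaultdict(list)
    for i, pid in enumerate(protein_ids):
        by_protein[pid].append(i)
    for pid, test_idx in by_protein.items():
        train_idx = [i for i, other in enumerate(protein_ids) if other != pid]
        yield pid, train_idx, test_idx


def success_rate(per_protein_mcc, threshold=0.5):
    """Fraction of proteins whose MCC is at or above the threshold."""
    values = list(per_protein_mcc.values())
    return sum(v >= threshold for v in values) / len(values)
```

In a workflow of this kind, a model would be fitted on `train_indices` for each fold, `mcc` would be computed on the held-out protein, and the per-protein scores could then be grouped by family before calling `success_rate`.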
NAL Thesaurus: Protein structure
Keywords: Machine learning
Protein structure
Gradient boosting
Nanoenvironment descriptors
Drug discovery
Exosite prediction
ISSN: 2667-3185
DOI: https://doi.org/10.1016/j.ailsci.2026.100166
Material type: Journal article
Access: openAccess
Appears in collections: Artigo em periódico indexado (CNPTIA)

Files in this item:
File: AA-Protein-family-2026.pdf; Size: 925.33 kB; Format: Adobe PDF
