Please use this identifier to cite or link to this item: https://elib.vku.udn.vn/handle/123456789/2745
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tran, Thi Xuan | -
dc.contributor.author | Nguyen, Van Nui | -
dc.contributor.author | Le, Nguyen Quoc Khanh | -
dc.date.accessioned | 2023-09-26T02:20:55Z | -
dc.date.available | 2023-09-26T02:20:55Z | -
dc.date.issued | 2023-07 | -
dc.identifier.isbn | 978-3-031-36886-8 | -
dc.identifier.uri | https://link.springer.com/chapter/10.1007/978-3-031-36886-8_7 | -
dc.identifier.uri | http://elib.vku.udn.vn/handle/123456789/2745 | -
dc.description | Lecture Notes in Networks and Systems (LNNS, volume 734); CITA: Conference on Information Technology and its Applications; pp: 74-88. | vi_VN
dc.description.abstract | The incidence of thyroid cancer and breast cancer is increasing every year, and the specific pathogenesis is unclear. Post-translational modifications are an important regulatory mechanism that affects the function of almost all proteins. They are essential for a diverse and well-functioning proteome and can integrate metabolism with physiological and pathological processes. In recent years, post-translational modifications have become a research hotspot, with methylation, phosphorylation, acetylation and succinylation being the main focus. SUMOylated proteins are predominantly localized in the nucleus, and SUMO regulates nuclear processes, including cell cycle control and DNA repair. SUMOylation has been increasingly implicated in cancer, Alzheimer’s, and Parkinson’s diseases. Therefore, the identification and characterization of SUMOylation sites are essential for modification-specific proteomics. This study proposes a novel schema for predicting protein SUMOylation sites that combines natural language features (Word2Vec) with sequence-based features. A novel model, called RSX_SUMO, is proposed for this prediction task. Our experiments show that RSX_SUMO achieves the highest performance in both five-fold cross-validation and independent testing, reaching an accuracy of 88.6% and an MCC of 0.743 on the independent test set. In addition, a comparison with several existing prediction models shows that the proposed model outperforms them. We hope that our findings will provide useful suggestions and be of great help to researchers in related studies. | vi_VN
dc.language.iso | en | vi_VN
dc.publisher | Springer Nature | vi_VN
dc.subject | SUMOylation sites prediction | vi_VN
dc.subject | Machine Learning | vi_VN
dc.subject | Word2Vec | vi_VN
dc.subject | Random forest | vi_VN
dc.subject | XGBoost | vi_VN
dc.subject | SVM | vi_VN
dc.title | Incorporating Natural Language-Based and Sequence-Based Features to Predict Protein Sumoylation Sites | vi_VN
dc.type | Working Paper | vi_VN
Appears in Collections:CITA 2023 (International)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.