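Title: Advancing a Unified Deep Learning Framework for Peptide and Molecule Mass Spectrometry
Speaker: Professor Xuefeng Cui
Host: Yuanyuan Zhang, Qingdao University of Technology
Time: January 19, 2024, 10:00
Venue: Room 208, Xinkong Building (信控楼)
Abstract: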
Tandem mass spectrometry (MS/MS) is a highly effective analytical technique for identifying and characterizing molecules, such as proteins and other large biological molecules. Research in this area currently faces two major difficulties. First, the ever-growing volume of peptide MS/MS spectra calls for new computational techniques that can search these libraries rapidly. We introduce MS2VEC, a fingerprint embedding model for large-scale peptide MS/MS spectral library retrieval. The model captures relationships between distant peaks and incorporates position-aware fingerprint features: dilated convolutions capture the long-range associations, and a position-aware multi-head attention pooling mechanism abstracts the fingerprint features. Second, because small-molecule MS/MS spectra data are limited, traditional methods that rely on database comparison are not suitable for newly discovered molecules that have not yet been added to a database. To address this issue, we introduce MS2SMILES, a novel approach that treats hydrogen atoms as implicitly linked to heavy atoms. The method is specifically designed to accurately predict the hydrogen atoms in a chemical structure, which are not explicitly represented in SMILES.
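Biography:
Xuefeng Cui is a professor in the School of Computer Science and Technology at Shandong University. He received his bachelor's, master's, and doctoral degrees from the David R. Cheriton School of Computer Science at the University of Waterloo, Canada. He joined the Institute for Interdisciplinary Information Sciences at Tsinghua University in 2016 and became a full professor at Shandong University in 2019. He also received the ACM SIGBIO Rising Star Award in 2019.
Professor Cui develops machine learning and parallel algorithms, with a primary research focus on bioinformatics, to address biological problems closely tied to human life. He has proposed a number of deep-learning-based algorithms for retrieval and matching over large-scale biological data, such as an algorithm for homologous protein retrieval based on three-dimensional structure, and algorithms for virtual screening and docking of small-molecule drugs against a target protein structure. His work has appeared three times at "Intelligent Systems for Molecular Biology" (ISMB, a top conference in bioinformatics that accepts only about 40 papers per year) and multiple times in well-known international journals including "Bioinformatics", "Genome Medicine", "Nucleic Acids Research" (NAR), and "ACS Synthetic Biology". His research has also been covered once by the international media outlet "Bio-Techniques" and twice by "Science X".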