
Activity prediction of anti-cancer drug candidate ERα inhibitor

Xia Yulan (夏玉兰) 1   Xie Jiming (谢济铭) 1   Wang Yajing (王雅婧) 2   Lu Mengyuan (卢梦媛) 1   Wang Jinrui (王锦锐) 1   Qin Yaqin (秦雅琴) 1 *
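Abstract Breast cancer is currently the most common malignant tumor threatening women's health worldwide. Studies have shown that the estrogen receptor alpha subtype (ERα) plays an important role in breast development and is regarded as an important target for breast cancer treatment, so compounds that antagonize ERα activity are candidate drugs for breast cancer treatment. To predict the bioactivity of compounds against the breast cancer target ERα under small-sample, many-feature conditions, an ensemble machine learning model for the quantitative structure-activity relationship of anti-breast-cancer drugs is proposed, called Mul-BHO-Bi-LSTM (multivariate-Bayesian hyperparametric optimization bi-directional long short-term memory). Descriptive statistics and multicollinearity diagnosis are performed on 729 molecular descriptors of 1,974 compounds, and the random forest method is used to screen 20 significant variables whose importance scores exceed 0.01. A two-dimensional feature matrix based on a convolutional neural network (CNN) is constructed, and Bayesian hyperparameter optimization is used to tune the hyperparameters of the bi-directional long short-term memory (Bi-LSTM) model. The predictive performance of the model is then analyzed and evaluated. The results show that, compared with the gradient boosting decision tree (GBDT) ensemble learning method, Mul-BHO-Bi-LSTM predicts better: the error indicators (mean squared error, normalized root mean squared error, error mean, and error standard deviation) are all below 0.15, and the correlation indicators R² and r exceed 0.99, indicating that the model has good robustness and generalization. The model can provide a method for the screening and design of anti-breast-cancer drugs.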
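The random forest screening step described in the abstract can be illustrated with a minimal Python sketch. It assumes that importance scores have already been computed (for example, from a fitted random forest); the function name `screen_by_importance`, the descriptor names, and the score values are all hypothetical. Reading the rule as "importance above 0.01, at most 20 variables" is an interpretation of the abstract, not the authors' code.

```python
def screen_by_importance(importances, threshold=0.01, top_k=20):
    """Keep descriptors whose importance exceeds `threshold`.

    Results are ordered from most to least important (ties broken by name)
    and truncated to at most `top_k` entries.
    """
    kept = [(name, score) for name, score in importances.items() if score > threshold]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept[:top_k]


# Hypothetical importance scores for illustration only (not values from the paper).
example_scores = {
    "descriptor_a": 0.120,
    "descriptor_b": 0.034,
    "descriptor_c": 0.010,  # equal to the threshold, so it is dropped
    "descriptor_d": 0.002,
}

selected = screen_by_importance(example_scores)
for name, score in selected:
    print(f"{name}: {score:.3f}")
```

The strict comparison (`>`) follows the abstract's wording "greater than 0.01"; a descriptor sitting exactly on the threshold is excluded.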
Abstract in other language Breast cancer is the most common malignancy threatening women's health worldwide. Studies have revealed that the estrogen receptor alpha subtype (ERα) plays an important role in breast development and is considered an important target for breast cancer treatment. Compounds that antagonize ERα activity may be candidates for breast cancer treatment. A quantitative structure-activity relationship prediction model is proposed to predict the bioactivity of candidate anti-breast-cancer compounds under small-sample, multi-feature conditions. First, descriptive statistics and multicollinearity diagnosis are performed on 729 molecular descriptors of 1,974 compounds, and the random forest method is used to screen 20 significant variables whose variable importance measure is greater than 0.01. Then, a CNN-based two-dimensional feature matrix is constructed, and a Bayesian hyperparametric optimization (BHO) method is used to optimize the hyperparameters of the Bi-LSTM model. Finally, the predictive performance of the model is analyzed and evaluated. The results show that, compared with the GBDT ensemble learning method, the Mul-BHO-Bi-LSTM ensemble machine learning model predicts better: the error indicators MSE, NRMSE, error mean, and error standard deviation are all less than 0.15, and the correlation indicators R² and r are above 0.99, indicating that the model has good robustness and generalization. The model can provide a method for the screening and design of anti-breast-cancer drugs.
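The six evaluation indicators named in the abstract (MSE, NRMSE, error mean, error standard deviation, R², and r) can be computed as in the following Python sketch. The abstract does not define them, so several choices here are assumptions: the error is taken as prediction minus observation, NRMSE is normalized by the range of the observed values, and the standard deviation is the population form. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, NRMSE, error mean, error std, R^2 and Pearson r."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    # Assumption: error = prediction - observation.
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    if ss_tot == 0 or ss_pred == 0:
        raise ValueError("observations and predictions must not be constant")

    # Assumption: RMSE is normalized by the observed range of y_true.
    nrmse = math.sqrt(mse) / (max(y_true) - min(y_true))

    err_mean = sum(errors) / n
    # Assumption: population standard deviation (divide by n).
    err_std = math.sqrt(sum((e - err_mean) ** 2 for e in errors) / n)

    # Coefficient of determination: 1 - SS_res / SS_tot.
    r2 = 1.0 - (mse * n) / ss_tot

    # Pearson correlation coefficient between observations and predictions.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    r = cov / math.sqrt(ss_tot * ss_pred)

    return {"mse": mse, "nrmse": nrmse, "err_mean": err_mean,
            "err_std": err_std, "r2": r2, "r": r}


# Toy values for illustration only (not data from the paper).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
for key, value in metrics.items():
    print(f"{key}: {value:.4f}")
```

Other NRMSE conventions (for example, dividing by the mean of the observations) would give different values, so thresholds such as 0.15 are only comparable under the same definition.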
Source Journal of Shenzhen University Science and Engineering (深圳大学学报. 理工版), 2022, 39(5): 529-537 [CSCD core collection]
DOI 10.3724/SP.J.1249.2022.05529
Keywords computer applications; ensemble learning; bioactivity prediction; feature screening; hyperparameter optimization; random forest
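Addresses

1. Faculty of Transportation Engineering, Kunming University of Science and Technology, Kunming, Yunnan 650504

2. First School of Clinical Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325006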

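Language Chinese
Document type Research article
ISSN 1000-2618
Subject Pharmacy; automation technology, computer technology
Funding National Natural Science Foundation of China
Accession number CSCD:7302259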

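Citing documents 1

1 Qin Yaqin (秦雅琴). ADMET property prediction model for anti-breast-cancer active compounds. Journal of Yunnan University (Natural Sciences Edition), 2022, 44(6): 1127-1134
CSCD citations: 0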
