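A Survey on Algorithms for Protein Contact Prediction
Original title: 蛋白质中残基远程相互作用预测算法研究综述 (A Survey of Algorithms for Predicting Long-Range Residue Interactions in Proteins)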
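Abstract
Proteins are long chains formed by the sequential linkage of amino acid residues. In its native state, a protein is not a random, unstructured chain; it spontaneously folds into a specific spatial structure to carry out its biological function. The main driving force behind this folding is non-covalent interaction between residues, including hydrophobic interactions, electrostatic interactions, and van der Waals forces. Accurate prediction of long-range interactions between residues therefore aids the prediction of protein spatial structure and, in turn, the understanding of protein function. During protein evolution, interacting residue pairs exhibit a "co-evolution" pattern: when one residue mutates, the residue it interacts with must mutate correspondingly to maintain the interaction, and thereby the overall spatial structure and biological function. Building on this biological observation, researchers have developed a number of statistical models and algorithms for predicting interactions between residue pairs. This review: 1) outlines the two basic classes of algorithms for predicting long-range residue interactions, namely unsupervised and supervised learning methods; 2) uses results from the CASP protein structure prediction competition to compare the performance of these algorithms objectively and to analyze the characteristics and strengths of each; 3) summarizes future trends from the two perspectives of biological observation and statistical modeling.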
Abstract in other languages
Proteins are large molecules consisting of a linear sequence of amino acids. In its natural environment, a protein spontaneously folds into a specific tertiary structure to perform its biological function. The main factors that drive proteins to fold are interactions between residues, including hydrophobic interactions, van der Waals forces, and electrostatic interactions. These interactions usually lead to residue-residue contacts, so predicting residue-residue contacts should greatly facilitate the understanding of protein structure and function. A great variety of techniques have been proposed for residue-residue contact prediction, including machine learning, statistical models, and linear programming. Most of these techniques are based on the biological insight of co-evolution, i.e., during the evolutionary history of proteins, a mutation at one residue usually leads its contacting partner to mutate accordingly. In this review, we summarize the state-of-the-art algorithms in this field, with emphasis on the construction of statistical models based on biological insights. We also present an evaluation of these algorithms using CASP (Critical Assessment of Techniques for Protein Structure Prediction) targets as well as popular benchmark datasets, and describe the trends in the field of protein contact prediction.
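The co-evolution idea described in the abstract can be made concrete with a minimal sketch. It assumes a tiny, hypothetical multiple sequence alignment and uses mutual information between alignment columns as the covariation score. This is only one simple information-theoretic measure, not necessarily the model the survey favors, and the names `column_mutual_information`, `rank_column_pairs`, and `TOY_MSA` are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    # Single-column and joint residue counts over all aligned sequences.
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_column_pairs(msa):
    """Score every column pair and sort from most to least covarying."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy alignment (an assumption, not data from the survey):
# columns 0 and 3 change together (A with D, K with E), column 1 is
# fully conserved, and column 2 varies independently of the others.
TOY_MSA = ["AGLD", "AGVD", "KGLE", "KGVE"]

if __name__ == "__main__":
    for (i, j), score in rank_column_pairs(TOY_MSA):
        print(f"columns {i} and {j}: MI = {score:.3f} bits")
```

In this toy case the pair of columns that mutate together receives the highest score, which is the signal a contact predictor would look for. Raw mutual information also picks up indirect (transitive) correlations, which is one reason more elaborate statistical models are used in practice.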
Source
Journal of Computer Research and Development (计算机研究与发展), 2017, 54(1): 1-19 [CSCD Core Collection]
DOI
10.7544/issn1000-1239.2017.20151076
Keywords
prediction of long-range residue interactions; protein tertiary structure prediction; graphical model; co-evolution; machine learning
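Addresses
1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
2. Center for Quantitative Biology, Peking University, Beijing 100871
3. Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190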
Language
Chinese
Document type
Review
ISSN
1000-1239
Subject
Automation technology, computer technology
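Funding
National 973 Program of China; National Natural Science Foundation of China; Open Project of the State Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences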
Accession number
CSCD:5902929