An improved algorithm for prediction of protein loop structure based on position specificity of amino acids

Yuan Fei, Zhang Chuan-Biao, Zhou Xin, Li Ming
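  • Predicting the structure of protein loop regions is an important part of understanding protein function, and structure prediction for long loops remains a difficult problem in bioinformatics. A variety of loop-structure prediction algorithms have been developed. LEAP is among the most accurate, but its sampling of initial backbone conformations for long loops still leaves considerable room for improvement. In this paper we combine the protein secondary structure prediction algorithm SPINE X with LEAP to construct new backbone torsion-angle distribution maps (Ramachandran plots), introducing position-specific information about amino acids in the protein sequence into the initial backbone sampling so that the sampling becomes more targeted. Analysis of a loop test set taken from CASP10 single-chain proteins shows that, for long loops of 10, 11, and 12 residues, the improved algorithm predicts significantly more accurately than the original LEAP. The idea of introducing amino-acid position specificity to improve accuracy may extend to other loop-structure prediction algorithms.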
    Loop regions are necessary structural elements of protein molecules and play significant roles in protein function, e.g., in signaling and ligand recognition. Unlike the well-defined secondary structures (i.e., helices and sheets), however, loop regions vary in structure, and some of them cannot even be resolved by ordinary experimental methods. For these reasons, computer-aided prediction of loop structure has become a hotspot in bioinformatics and biophysics, and various algorithms have been developed for this purpose. So far, however, the prediction of long loops remains a challenge. Among the common algorithms, LEAP achieves the highest precision on long-loop prediction. Our investigation of a test data set with LEAP reveals that the final loop structure predicted by LEAP is almost entirely determined by the initial sampling of the loop backbone conformation. If all the backbone conformations in the initial sampling are far from the real (native) conformation, the final predicted structure is also far from the native conformation, and the prediction accuracy cannot be noticeably improved merely by increasing the computation time. In the original LEAP, the initial sampling is based on a rough distribution of the backbone torsion angles (Ramachandran plot, R-plot) that does not take the sequence information of the loop region into account, so many conformations far from the native one are likely to be generated. This raises an open question: can the initial sampling be made more targeted toward the native conformation? In this paper, we suggest an approach that introduces position-specific amino-acid sequence information into the initial sampling of the backbone conformation, which may generate more targeted initial decoys.
    A protein secondary structure prediction algorithm, SPINE X, is used to generate rough but reasonable sequence-dependent estimates of the torsion angles of each amino acid in the loop backbone. We then combine these values with the original R-plot to construct a new R-plot for each amino acid in the loop, and the initial sampling is performed according to the new R-plot. We applied this new algorithm to a test set of loops (generated from single-chain proteins in CASP10) and found that the medians/means of the RMSDs are reduced by about 0.12 Å/0.13 Å, 0.25 Å/0.27 Å, and 0.47 Å/0.27 Å for loop sets of length 10, 11, and 12, respectively. Compared with the original LEAP algorithm, the probability of making more accurate predictions is almost doubled with the refined algorithm. The logic of our approach is not limited to LEAP and can be extended to other algorithms that also depend significantly on initial sampling.
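    The abstract does not say how the SPINE X estimates are combined with the original R-plot. As an illustration only, the sketch below assumes one simple choice: multiply a generic binned R-plot by a periodic Gaussian centred on the predicted (phi, psi) of each residue, renormalise, and sample from the result. The bin width, the width `sigma`, the uniform placeholder R-plot, and all function names are hypothetical, and nothing here is taken from the LEAP or SPINE X code.

```python
import math
import random

BIN = 10            # assumed bin width in degrees
N = 360 // BIN      # number of bins along each torsion axis


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def bin_center(i):
    """Centre (in degrees) of bin i on an axis that starts at -180."""
    return -180.0 + (i + 0.5) * BIN


def reweight_rplot(base, phi_pred, psi_pred, sigma=40.0):
    """Build a residue-specific R-plot (assumed scheme, not the paper's).

    Each bin of the generic plot `base` (N x N probabilities indexed as
    [phi_bin][psi_bin]) is multiplied by a periodic Gaussian centred on
    the predicted torsion angles, then the grid is renormalised to 1.
    """
    weighted = []
    total = 0.0
    for i in range(N):
        dphi = angle_diff(bin_center(i), phi_pred)
        row = []
        for j in range(N):
            dpsi = angle_diff(bin_center(j), psi_pred)
            w = math.exp(-(dphi * dphi + dpsi * dpsi) / (2.0 * sigma * sigma))
            value = base[i][j] * w
            row.append(value)
            total += value
        weighted.append(row)
    if total <= 0.0:
        raise ValueError("reweighted R-plot has no probability mass")
    return [[value / total for value in row] for row in weighted]


def sample_torsions(rplot, rng):
    """Draw one (phi, psi): pick a bin by probability, then a uniform point in it."""
    flat = [value for row in rplot for value in row]
    k = rng.choices(range(N * N), weights=flat, k=1)[0]
    i, j = divmod(k, N)
    phi = -180.0 + (i + rng.random()) * BIN
    psi = -180.0 + (j + rng.random()) * BIN
    return phi, psi


# Usage with placeholder inputs: a uniform generic R-plot and made-up
# per-residue predictions for a three-residue stretch.
rng = random.Random(0)
uniform = [[1.0 / (N * N)] * N for _ in range(N)]
predicted = [(-65.0, 145.0), (-90.0, 0.0), (60.0, 45.0)]
decoy = [sample_torsions(reweight_rplot(uniform, phi, psi), rng)
         for phi, psi in predicted]
print(decoy)
```

    A multiplicative reweighting keeps regions the generic R-plot forbids at zero probability while concentrating samples near the predicted angles. A larger `sigma` falls back toward the generic distribution when the prediction is less trustworthy.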
    • Corresponding author: Li Ming, liming@ucas.ac.cn
    • Funds: Project supported by the National Natural Science Foundation of China (Grant Nos. 11105218, 11347614).
Publication history
  • Received: 2016-04-22
  • Revised: 2016-05-13
  • Published: 2016-08-05
