
Exploring protein’s conformational space by using encoding layer supervised auto-encoder

Chen Guang-Lin, Zhang Zhi-Yong

  • Protein function is closely tied to protein structure and dynamics. Molecular dynamics (MD) simulation is an important tool for studying protein dynamics by exploring conformational space, but conformational sampling is nontrivial because key conformations can be missed. In recent years, deep learning methods such as auto-encoders have been coupled with MD to explore the conformational space of proteins. Once trained on MD trajectories, an auto-encoder can quickly generate new conformations from random numbers supplied in its low-dimensional space. Several problems remain, however: the approach depends on the quality of the training set, the explorable region is limited, and the sampling direction is undefined. In this work, we build a supervised auto-encoder in which reaction coordinates supervise the encoding layer and guide conformational exploration along chosen directions. We also try to expand the explorable region by training the model on data that it generates itself. Two multi-domain proteins, bacteriophage T4 lysozyme and adenylate kinase, are used to illustrate the method. Even when the training set consists only of under-sampled simulation trajectories, the supervised auto-encoder can still explore along the given reaction coordinates. The explored conformational space covers all the experimental structures of both proteins and extends to regions far from the training sets. Checks by molecular dynamics and secondary structure calculations indicate that most of the explored conformations are plausible. The supervised auto-encoder thus offers a way to expand the conformational space of a protein efficiently with limited computational resources, although suitable reaction coordinates are required. By integrating appropriate reaction coordinates or experimental data, it may serve as an efficient tool for exploring the conformational space of proteins.
      Corresponding author: Zhang Zhi-Yong, zzyzhang@ustc.edu.cn
    • Funds: Project supported by the National Key Research and Development Program of China (Grant No. 2021YFA1301504), the National Natural Science Foundation of China (Grant No. 91953101), and the Strategic Priority Research Program (B) of the Chinese Academy of Sciences (Grant No. XDB37040202).
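
    The idea described in the abstract, supervising the encoding layer with a reaction coordinate and then decoding chosen low-dimensional points, can be sketched as follows. This is only a minimal illustration under strong assumptions, not the authors' model: the layers are linear, the "trajectory" is synthetic data, plain gradient descent is used, and all names (`X`, `s`, `W_enc`, `W_dec`, `weight`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for MD frames (assumption): 500 "conformations" with
# 30 coordinates, driven by one collective motion s plus small noise.
n_frames, n_coords, n_latent = 500, 30, 2
s = rng.uniform(-1.0, 1.0, size=(n_frames, 1))   # hypothetical reaction coordinate
direction = rng.normal(size=(1, n_coords))
direction /= np.linalg.norm(direction)           # unit-length displacement pattern
X = s @ direction + 0.05 * rng.normal(size=(n_frames, n_coords))

# Linear encoder and decoder (assumption; a real model would be nonlinear).
W_enc = 0.1 * rng.normal(size=(n_coords, n_latent))
W_dec = 0.1 * rng.normal(size=(n_latent, n_coords))


def losses(W_enc, W_dec):
    """Return (reconstruction loss, supervision loss) on the training data."""
    Z = X @ W_enc
    rec = np.mean(np.sum((Z @ W_dec - X) ** 2, axis=1))
    sup = np.mean((Z[:, :1] - s) ** 2)           # first latent unit should match s
    return rec, sup


initial_rec, initial_sup = losses(W_enc, W_dec)

lr, weight = 0.05, 1.0                           # step size, supervision weight
for _ in range(3000):
    Z = X @ W_enc
    X_hat = Z @ W_dec
    d_rec = 2.0 * (X_hat - X) / n_frames         # d(rec)/d(X_hat)
    d_sup = np.zeros_like(Z)
    d_sup[:, :1] = 2.0 * weight * (Z[:, :1] - s) / n_frames
    grad_dec = Z.T @ d_rec                       # gradient w.r.t. decoder weights
    grad_enc = X.T @ (d_rec @ W_dec.T + d_sup)   # gradient w.r.t. encoder weights
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

final_rec, final_sup = losses(W_enc, W_dec)

# Generate new "conformations" along the supervised direction, including
# latent values outside the training range [-1, 1].
z_new = np.zeros((7, n_latent))
z_new[:, 0] = np.linspace(-1.5, 1.5, 7)
generated = z_new @ W_dec

print(f"reconstruction loss: {initial_rec:.3f} -> {final_rec:.3f}")
print(f"supervision loss:    {initial_sup:.3f} -> {final_sup:.3f}")
print("generated shape:", generated.shape)
```

    Because the first latent unit is tied to the reaction coordinate, stepping it moves the decoded structure along a known direction instead of an arbitrary one. The self-training step mentioned in the abstract would roughly correspond to adding `generated` back into the training data and repeating the loop; that step is not shown here, and whether such decoded structures are physically plausible would still need the kind of checks the abstract describes.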

  • Figure 1.  Schematic of the supervised auto-encoder (supervised-AE), in which the middle (encoding) layer is supervised.

    Figure 2.  Different structures of the two proteins used in this work. (a) The closed (opaque) and open (transparent) states of T4L; α-helices are colored purple and β-sheets yellow. (b) The closed (opaque) and open (transparent) states of AdK; different domains are shown in different colors.

    Figure 3.  Results of conformational space exploration of T4L: (a) with the AMBER99SB force field and OPC water model; (b) with the CHARMM36m force field and TIP3P water model.

    Figure 4.  Different T4L conformations explored: (a) the crystal structure PDB:173L (opaque) and a similar explored structure (transparent); (b) two conformations with different degrees of opening and closing; (c) two conformations with different degrees of twisting. α-helices are colored purple and β-sheets yellow.

    Figure 5.  Plausibility check of the T4L conformational exploration results: (a) with the AMBER99SB force field and OPC water model; (b) with the CHARMM36m force field and TIP3P water model; (c) secondary structure content of each representative conformation after repair, with the average over the simulated trajectory as the reference.

    Figure 6.  Results of T4L conformational exploration starting from the open state only.

    Figure 7.  Results of conformational space exploration of AdK: (a) with the AMBER99SB force field and OPC water model; (b) with the CHARMM36m force field and TIP3P water model.

    Figure 8.  Different AdK conformations explored.

    Figure 9.  Plausibility check of the AdK conformational exploration results: (a) with the AMBER99SB force field and OPC water model; (b) with the CHARMM36m force field and TIP3P water model; (c) secondary structure content of each representative conformation after repair, with the average over the simulated trajectory as the reference.

    Figure 10.  Exploring the conformational space of AdK with an ordinary (unsupervised) auto-encoder.

  • [1] Song Rui, Liu Xue-Mei, Wang Hai-Bin, Lü Hao, Song Xiao-Yan. Hardness prediction of WC-Co cemented carbide based on machine learning model. Acta Physica Sinica, 2024, 73(12): 126201. doi: 10.7498/aps.73.20240284
    [2] Zhang Xu, Ding Jin-Min, Hou Chen-Yang, Zhao Yi-Ming, Liu Hong-Wei, Liang Sheng. Research on laser homogenization method based on machine learning. Acta Physica Sinica, 2024, 0(0): . doi: 10.7498/aps.73.20240747
    [3] Ouyang Xin-Jian, Zhang Yan-Xing, Wang Zhi-Long, Zhang Feng, Chen Wei-Jia, Zhuang Yuan, Jie Xiao, Liu Lai-Jun, Wang Da-Wei. Modeling ferroelectric phase transitions with graph convolutional neural networks. Acta Physica Sinica, 2024, 73(8): 086301. doi: 10.7498/aps.73.20240156
    [4] Zhang Jia-Hui. Machine learning for in silico protein research. Acta Physica Sinica, 2024, 73(6): 069301. doi: 10.7498/aps.73.20231618
    [5] Zhang Yi-Fan, Ren Wei, Wang Wei-Li, Ding Shu-Jian, Li Nan, Chang Liang, Zhou Qian. Machine learning combined with solid solution strengthening model for predicting hardness of high entropy alloys. Acta Physica Sinica, 2023, 72(18): 180701. doi: 10.7498/aps.72.20230646
    [6] Guo Wei-Chen, Ai Bao-Quan, He Liang. Reveal flocking phase transition of self-propelled active particles by machine learning regression uncertainty. Acta Physica Sinica, 2023, 72(20): 200701. doi: 10.7498/aps.72.20230896
    [7] Liu Ye, Niu He-Ran, Li Bing-Bing, Ma Xin-Hua, Cui Shu-Wang. Application of machine learning in cosmic ray particle identification. Acta Physica Sinica, 2023, 72(14): 140202. doi: 10.7498/aps.72.20230334
    [8] Zhang Yu-Hang, Xue Zhen-Yong, Sun Hao, Zhang Zhu-Wei, Chen Hu. Single molecule magnetic tweezers for unfolding dynamics of Acyl-CoA binding protein. Acta Physica Sinica, 2023, 72(15): 158702. doi: 10.7498/aps.72.20230533
    [9] Luo Fang-Fang, Cai Zhi-Tao, Huang Yan-Dong. Progress in protein pKa prediction. Acta Physica Sinica, 2023, 72(24): 248704. doi: 10.7498/aps.72.20231356
    [10] Luo Qi-Rui, Shen Yi-Fan, Luo Meng-Bo. Computer simulation and machine learning of polymer collapse and critical adsorption phase transitions. Acta Physica Sinica, 2023, 72(24): 240502. doi: 10.7498/aps.72.20231058
    [11] Guan Xing-Yue, Huang Heng-Yan, Peng Hua-Qi, Liu Yan-Hang, Li Wen-Fei, Wang Wei. Machine learning in molecular simulations of biomolecules. Acta Physica Sinica, 2023, 72(24): 248708. doi: 10.7498/aps.72.20231624
    [12] Lin Kai-Dong, Lin Xiao-Qian, Lin Xu-Bo. Virtual screening of drugs targeting PD-L1 protein. Acta Physica Sinica, 2023, 72(24): 240501. doi: 10.7498/aps.72.20231068
    [13] Li Wei, Long Lian-Chun, Liu Jing-Yi, Yang Yang. Classification of magnetic ground states and prediction of magnetic moments of inorganic magnetic materials based on machine learning. Acta Physica Sinica, 2022, 71(6): 060202. doi: 10.7498/aps.71.20211625
    [14] Kang Jun-Feng, Feng Song-Jiang, Zou Qian, Li Yan-Jie, Ding Rui-Qiang, Zhong Quan-Jia. Machine learning based method of correcting nonlinear local Lyapunov vectors ensemble forecasting. Acta Physica Sinica, 2022, 71(8): 080503. doi: 10.7498/aps.71.20212260
    [15] Zhang Jia-Wei, Yao Hong-Bo, Zhang Yuan-Zheng, Jiang Wei-Bo, Wu Yong-Hui, Zhang Ya-Ju, Ao Tian-Yong, Zheng Hai-Wu. Self-powered sensing based on triboelectric nanogenerator through machine learning and its application. Acta Physica Sinica, 2022, 71(7): 078702. doi: 10.7498/aps.71.20211632
    [16] Ai Fei, Liu Zhi-Bing, Zhang Yuan-Tao. Numerical study of discharge characteristics of atmospheric dielectric barrier discharges by integrating machine learning. Acta Physica Sinica, 2022, 71(24): 245201. doi: 10.7498/aps.71.20221555
    [17] Lin Jian, Ye Meng, Zhu Jia-Wei, Li Xiao-Peng. Machine learning assisted quantum adiabatic algorithm design. Acta Physica Sinica, 2021, 70(14): 140306. doi: 10.7498/aps.70.20210831
    [18] Chen Jiang-Zhi, Yang Chen-Wen, Ren Jie. Machine learning based on wave and diffusion physical systems. Acta Physica Sinica, 2021, 70(14): 144204. doi: 10.7498/aps.70.20210879
    [19] Liu Wu, Zhu Cheng-Wan, Li Hao-Tian, Zhao Su-Ling, Qiao Bo, Xu Zheng, Song Dan-Dan. Optimization of Ga content gradient in Cu(In,Ga)Se2 solar cells through machine learning and device simulation. Acta Physica Sinica, 2021, 70(23): 238802. doi: 10.7498/aps.70.20211234
    [20] Yang Zi-Xin, Gao Zhang-Ran, Sun Xiao-Fan, Cai Hong-Ling, Zhang Feng-Ming, Wu Xiao-Shan. High critical transition temperature of lead-based perovskite ferroelectric crystals: A machine learning study. Acta Physica Sinica, 2019, 68(21): 210502. doi: 10.7498/aps.68.20190942
Publishing process
  • Received Date:  28 June 2023
  • Accepted Date:  29 July 2023
  • Available Online:  12 September 2023
  • Published Online:  20 December 2023
