搜索

x

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于最大熵模型的微博传播网络中的链路预测

李勇军 尹超 于会 刘尊

引用本文:
Citation:

基于最大熵模型的微博传播网络中的链路预测

李勇军, 尹超, 于会, 刘尊

Link prediction in microblog retweet network based on maximum entropy model

Li Yong-Jun, Yin Chao, Yu Hui, Liu Zun
PDF
导出引用
  • 微博是基于用户关注关系建立的具有媒体特性的实时信息分享社交平台. 微博上的信息扩散具有快速性、爆发性和时效性. 理解信息的传播机理, 预测信息转发行为, 对研究微博上舆论的形成、产品的推广等具有重要意义. 本文通过解析微博转发记录来研究影响信息转发的因素或特征, 把微博信息转发预测问题抽象为链路预测问题, 并提出基于最大熵模型的链路预测算法. 实例验证的结果表明: 1)基于最大熵模型的算法在运行时间上具有明显的优势; 2)在预测结果方面, 最大熵模型比同类其他算法表现优异; 3) 当训练集大小和特征数量变化时, 基于最大熵模型的预测结果表现稳定. 该方法在预测链路时避免了特征之间相互独立的约束, 准确率优于其他同类方法, 对解决复杂网络中其他类型的预测问题具有借鉴意义.
    Microblog is a social media platform, based on the follower-followee relationship, that enables users to share real-time information, by which the information propagation is characterized as rapid, explosive, and immediate. The research on the information propagation and retweet prediction is very important for public sentiment analysis and product promotion. A majority of existing works adopt several traditional prediction methods to predict the future information retweet based on the features extracted from existing retweet behaviors, which are hard to reconcile accuracy, complexity, robustness and feature extensiveness. To overcome the above mentioned shortcomings in existing works, we propose in this paper a link prediction algorithm based on maximum entropy model to predict retweet behavior on microblog. In our proposed approach, firstly we abstract the retweet prediction problem to a link prediction problem. Then we analyze the retweet behaviors on microblog and determine the factors influencing the retweet behavior. We extract the features from the retweet behaviors based on these factors in the next step. Now based on these features, the retweet behavior could be predicted by the proposed approach. However, information redundancy and other issues may exist among these features. These issues will cause an increase in computational complexity or a decrease in computational accuracy. To solve the above problems, we selecte the features dominating the retweet behavior with feature selection methods such as Information Gain, IG-CHI. The proposed model requires no further independent assumption in features or intrinsic constraints, and omits the processing in relation to features, which is usually the prerequisite of other prediction methods. We take the Sina Weibo retweet records in a time span from 2009 to 2012 as an example to test the effectiveness and efficiency of our link prediction algorithm. Results show that: 1) the proposed algorithm has incomparable advantages in running time; 2) as for the predicted result, the proposed algorithm is better than other algorithms in performance evaluations; 3) the proposed algorithm runs stably for different sizes of training sets and feature sets; 4) the accuracy of the predicted results remains stable based on the selected features. The proposed approach avoids the independent restriction among features and shows better accuracy than other similar methods, thus it has reference values for resolving other prediction problems in complex networks.
      通信作者: 李勇军, lyj@nwpu.edu.cn
    • 基金项目: 陕西省自然科学基础研究计划(批准号: 2014JM2-6104, 2015JM6290)资助的课题.
      Corresponding author: Li Yong-Jun, lyj@nwpu.edu.cn
    • Funds: Project supported by the Shaanxi Provincial Natural Science Foundation, China (Grant Nos. 2014JM2-6104, 2015JM6290).
    [1]

    Watts D J, Strogatz S H 1998 Nature 393 440

    [2]

    Barabsi A L, Albert R 1999 Science 286 509

    [3]

    Pastor S R, Vespignani A 2001 Phys. Rev. Lett. 86 3200

    [4]

    Wu T F, Zhou C L, Wang X H, Huang X X, Chen Z Q, Wang R B 2014 Acta Phys. Sin. 63 240501 (in Chinese) [吴腾飞, 周昌乐, 王小华, 黄孝喜, 谌志群, 王荣波 2014 物理学报 63 240501]

    [5]

    Wang J L, Liu F A, Zhu Z F 2015 Acta Phys. Sin. 64 050501 (in Chinese) [王金龙, 刘方爱, 朱振方 2015 物理学报 64 050501]

    [6]

    Wang Y Z, Zheng B H 2014 Proceedings of 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Beijing, China, Aug. 17-20, 2014, p285

    [7]

    Zhao X Q, Tajima K 2014 Proceedings of 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies Warsaw, Poland, Aug. 11-14, 2014, p282

    [8]

    Ding H Y, Wu J 2015 Proceedings of 2015 IEEE International Conference on Multimedia Big Data Beijing, China, Apr. 20-22, 2015, p56

    [9]

    Luo Z L, Wang Y, Wu X T 2012 Proceedings of the 13th International Conference on Web Information System Engineering Paphos, Cyprus, Nov. 28-30, 2012, p777

    [10]

    Yang Z, Guo J Y, Cai K K, Tang J, Li J Z, Zhang L, Su Z 2010 Proceedings of the 19th ACM conference on information and knowledge management Toronto, Canada, Oct. 26-30, 2010, p1633

    [11]

    Peng H K, Zhu J, Piao D Z, Yan R, Zhang Y 2011 Proceedings of IEEE 11th International Conference on Data Mining Workshops Vancouver, Canada, Dec. 11, 2011, p336

    [12]

    Zhao H D, Liu G, Shi C, Wu B 2014 Proceedings of 2014 IEEE International Conference on Data Mining Workshop Shenzhen, China, Dec. 14, 2014, p952

    [13]

    Hou W, Huang Y, Zhang K 2015 Proceedings of IEEE 14th International Conference on Cognitive Informatics Cognitive Computing Beijing, China, Jul. 6-8, 2015, p255

    [14]

    Huang D X, Zhou J, Mu D J, Yang F S 2014 Proceedings of 7th International Symposium on Computational Intelligence and Design Hangzhou, China, Dec. 13-14, 2014, p30

    [15]

    Wu Y, Hu Y, He X H, Deng K 2014 Chin. Phys. B 23 060101

    [16]

    Wang F, Wang H Y, Xu K 2012 Proceedings of IEEE ICDCS Workshop on Peer-to-Peer Computing and Online Social Networking Macau, China, Jun. 18-21 2012, p133

    [17]

    Zhang L M, Pei J, Jia Y, Zhou B, Wang X 2014 Proceedings of 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Beijing, China, Aug. 17-20, 2014, p208

    [18]

    Wu Z X, Liao J X, Zhang L J 2013 Proceedings of 5m th IEEE Conference on Broadband Network $ Multimedia Technology Guilin, China, Nov. 17-19, 2013, p119

    [19]

    Suh B, Hong L C, Pirolli P, Chi E D H 2010 Proceedings of The 2010 IEEE International Conference on Privacy, Security, Risk and Trust Minneapolis, USA, Aug. 20-22, 2010, p177

    [20]

    Xu Z H, Yang Q 2012 Proceedings of 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Istanbul, Turkey, Aug. 26-29, 2012, p46

    [21]

    Leicht E A, Holme P, Newman M E J 2006 Phys. Rev. E 73 026120

    [22]

    Liben-Nowell D, Kleinberg J 2007 J. Am. Soc. Inf. Sci. Tec. 58 1019

    [23]

    L L Y, Zhou T 2011 Physica A 390 1150

    [24]

    Bai M, Hu K, Tang Y 2011 Chin. Phys. B 20 128902

    [25]

    Brin S, Page L 1998 Comput. Netw. ISDN Syst. 30 107

    [26]

    Lorrain F, White H C 1971 J. Math. Soc. 1 49

    [27]

    Liu W P, L L Y 2010 Europhys. Lett. 89 58007

    [28]

    Berger A L, Pietra S, Pietra V 1996 Comput. Linguist. 22 39

    [29]

    Byrd R H, Nocedal J, Schnabel R B 1994 Math. Program.: Series A and B 63 4

  • [1]

    Watts D J, Strogatz S H 1998 Nature 393 440

    [2]

    Barabsi A L, Albert R 1999 Science 286 509

    [3]

    Pastor S R, Vespignani A 2001 Phys. Rev. Lett. 86 3200

    [4]

    Wu T F, Zhou C L, Wang X H, Huang X X, Chen Z Q, Wang R B 2014 Acta Phys. Sin. 63 240501 (in Chinese) [吴腾飞, 周昌乐, 王小华, 黄孝喜, 谌志群, 王荣波 2014 物理学报 63 240501]

    [5]

    Wang J L, Liu F A, Zhu Z F 2015 Acta Phys. Sin. 64 050501 (in Chinese) [王金龙, 刘方爱, 朱振方 2015 物理学报 64 050501]

    [6]

    Wang Y Z, Zheng B H 2014 Proceedings of 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Beijing, China, Aug. 17-20, 2014, p285

    [7]

    Zhao X Q, Tajima K 2014 Proceedings of 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies Warsaw, Poland, Aug. 11-14, 2014, p282

    [8]

    Ding H Y, Wu J 2015 Proceedings of 2015 IEEE International Conference on Multimedia Big Data Beijing, China, Apr. 20-22, 2015, p56

    [9]

    Luo Z L, Wang Y, Wu X T 2012 Proceedings of the 13th International Conference on Web Information System Engineering Paphos, Cyprus, Nov. 28-30, 2012, p777

    [10]

    Yang Z, Guo J Y, Cai K K, Tang J, Li J Z, Zhang L, Su Z 2010 Proceedings of the 19th ACM conference on information and knowledge management Toronto, Canada, Oct. 26-30, 2010, p1633

    [11]

    Peng H K, Zhu J, Piao D Z, Yan R, Zhang Y 2011 Proceedings of IEEE 11th International Conference on Data Mining Workshops Vancouver, Canada, Dec. 11, 2011, p336

    [12]

    Zhao H D, Liu G, Shi C, Wu B 2014 Proceedings of 2014 IEEE International Conference on Data Mining Workshop Shenzhen, China, Dec. 14, 2014, p952

    [13]

    Hou W, Huang Y, Zhang K 2015 Proceedings of IEEE 14th International Conference on Cognitive Informatics Cognitive Computing Beijing, China, Jul. 6-8, 2015, p255

    [14]

    Huang D X, Zhou J, Mu D J, Yang F S 2014 Proceedings of 7th International Symposium on Computational Intelligence and Design Hangzhou, China, Dec. 13-14, 2014, p30

    [15]

    Wu Y, Hu Y, He X H, Deng K 2014 Chin. Phys. B 23 060101

    [16]

    Wang F, Wang H Y, Xu K 2012 Proceedings of IEEE ICDCS Workshop on Peer-to-Peer Computing and Online Social Networking Macau, China, Jun. 18-21 2012, p133

    [17]

    Zhang L M, Pei J, Jia Y, Zhou B, Wang X 2014 Proceedings of 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Beijing, China, Aug. 17-20, 2014, p208

    [18]

    Wu Z X, Liao J X, Zhang L J 2013 Proceedings of 5m th IEEE Conference on Broadband Network $ Multimedia Technology Guilin, China, Nov. 17-19, 2013, p119

    [19]

    Suh B, Hong L C, Pirolli P, Chi E D H 2010 Proceedings of The 2010 IEEE International Conference on Privacy, Security, Risk and Trust Minneapolis, USA, Aug. 20-22, 2010, p177

    [20]

    Xu Z H, Yang Q 2012 Proceedings of 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining Istanbul, Turkey, Aug. 26-29, 2012, p46

    [21]

    Leicht E A, Holme P, Newman M E J 2006 Phys. Rev. E 73 026120

    [22]

    Liben-Nowell D, Kleinberg J 2007 J. Am. Soc. Inf. Sci. Tec. 58 1019

    [23]

    L L Y, Zhou T 2011 Physica A 390 1150

    [24]

    Bai M, Hu K, Tang Y 2011 Chin. Phys. B 20 128902

    [25]

    Brin S, Page L 1998 Comput. Netw. ISDN Syst. 30 107

    [26]

    Lorrain F, White H C 1971 J. Math. Soc. 1 49

    [27]

    Liu W P, L L Y 2010 Europhys. Lett. 89 58007

    [28]

    Berger A L, Pietra S, Pietra V 1996 Comput. Linguist. 22 39

    [29]

    Byrd R H, Nocedal J, Schnabel R B 1994 Math. Program.: Series A and B 63 4

  • [1] 马金龙, 张俊峰, 张冬雯, 张红斌. 基于通信序列熵的复杂网络传输容量. 物理学报, 2021, 70(7): 078902. doi: 10.7498/aps.70.20201300
    [2] 谭索怡, 祁明泽, 吴俊, 吕欣. 复杂网络链路可预测性: 基于特征谱视角. 物理学报, 2020, 69(8): 088901. doi: 10.7498/aps.69.20191817
    [3] 陈单, 石丹丹, 潘贵军. 复杂网络电输运性能与通信序列熵之间的关联. 物理学报, 2019, 68(11): 118901. doi: 10.7498/aps.68.20190230
    [4] 孔江涛, 黄健, 龚建兴, 李尔玉. 基于复杂网络动力学模型的无向加权网络节点重要性评估. 物理学报, 2018, 67(9): 098901. doi: 10.7498/aps.67.20172295
    [5] 韩忠明, 陈炎, 李梦琪, 刘雯, 杨伟杰. 一种有效的基于三角结构的复杂网络节点影响力度量模型. 物理学报, 2016, 65(16): 168901. doi: 10.7498/aps.65.168901
    [6] 胡庆成, 张勇, 许信辉, 邢春晓, 陈池, 陈信欢. 一种新的复杂网络影响力最大化发现方法. 物理学报, 2015, 64(19): 190101. doi: 10.7498/aps.64.190101
    [7] 蔡萌, 杜海峰, 费尔德曼. 一种基于最大流的网络结构熵. 物理学报, 2014, 63(6): 060504. doi: 10.7498/aps.63.060504
    [8] 刘树新, 季新生, 刘彩霞, 郭虹. 一种信息传播促进网络增长的网络演化模型. 物理学报, 2014, 63(15): 158902. doi: 10.7498/aps.63.158902
    [9] 袁铭. 带有层级结构的复杂网络级联失效模型. 物理学报, 2014, 63(22): 220501. doi: 10.7498/aps.63.220501
    [10] 王亚奇, 王静, 杨海滨. 基于复杂网络理论的微博用户关系网络演化模型研究. 物理学报, 2014, 63(20): 208902. doi: 10.7498/aps.63.208902
    [11] 吴腾飞, 周昌乐, 王小华, 黄孝喜, 谌志群, 王荣波. 基于平均场理论的微博传播网络模型. 物理学报, 2014, 63(24): 240501. doi: 10.7498/aps.63.240501
    [12] 邓奇湘, 贾贞, 谢梦舒, 陈彦飞. 基于有向网络的Email病毒传播模型及其震荡吸引子研究. 物理学报, 2013, 62(2): 020203. doi: 10.7498/aps.62.020203
    [13] 李勇军, 刘尊, 于会. 基于最大熵模型的导师-学生关系推测. 物理学报, 2013, 62(16): 168902. doi: 10.7498/aps.62.168902
    [14] 梁义, 王兴元. 基于低阶矩阵最大特征值的复杂网络牵制混沌同步. 物理学报, 2012, 61(3): 038901. doi: 10.7498/aps.61.038901
    [15] 宋玉蓉, 蒋国平. 具有非均匀传输和抗攻击差异的网络病毒传播模型. 物理学报, 2010, 59(11): 7546-7551. doi: 10.7498/aps.59.7546
    [16] 邢长明, 刘方爱. 基于Sierpinski分形垫的确定性复杂网络演化模型研究. 物理学报, 2010, 59(3): 1608-1614. doi: 10.7498/aps.59.1608
    [17] 宋青松, 冯祖仁, 李人厚. 用于混沌时间序列预测的多簇回响状态网络. 物理学报, 2009, 58(7): 5057-5064. doi: 10.7498/aps.58.5057
    [18] 欧阳敏, 费 奇, 余明晖. 基于复杂网络的灾害蔓延模型评价及改进. 物理学报, 2008, 57(11): 6763-6770. doi: 10.7498/aps.57.6763
    [19] 郭进利. 供应链型网络中双幂律分布模型. 物理学报, 2006, 55(8): 3916-3921. doi: 10.7498/aps.55.3916
    [20] 李 季, 汪秉宏, 蒋品群, 周 涛, 王文旭. 节点数加速增长的复杂网络生长模型. 物理学报, 2006, 55(8): 4051-4057. doi: 10.7498/aps.55.4051
计量
  • 文章访问数:  4152
  • PDF下载量:  520
  • 被引次数: 0
出版历程
  • 收稿日期:  2015-06-23
  • 修回日期:  2015-10-20
  • 刊出日期:  2016-01-20

基于最大熵模型的微博传播网络中的链路预测

  • 1. 西北工业大学计算机学院, 西安 710072
  • 通信作者: 李勇军, lyj@nwpu.edu.cn
    基金项目: 陕西省自然科学基础研究计划(批准号: 2014JM2-6104, 2015JM6290)资助的课题.

摘要: 微博是基于用户关注关系建立的具有媒体特性的实时信息分享社交平台. 微博上的信息扩散具有快速性、爆发性和时效性. 理解信息的传播机理, 预测信息转发行为, 对研究微博上舆论的形成、产品的推广等具有重要意义. 本文通过解析微博转发记录来研究影响信息转发的因素或特征, 把微博信息转发预测问题抽象为链路预测问题, 并提出基于最大熵模型的链路预测算法. 实例验证的结果表明: 1)基于最大熵模型的算法在运行时间上具有明显的优势; 2)在预测结果方面, 最大熵模型比同类其他算法表现优异; 3) 当训练集大小和特征数量变化时, 基于最大熵模型的预测结果表现稳定. 该方法在预测链路时避免了特征之间相互独立的约束, 准确率优于其他同类方法, 对解决复杂网络中其他类型的预测问题具有借鉴意义.

English Abstract

参考文献 (29)

目录

    /

    返回文章
    返回