Community detection is an important problem in social network analysis. Its goal is to partition a network into densely connected regions of the graph. Such dense regions typically correspond to entities that are closely related to one another and can therefore be said to belong to a community. Detecting communities is of great importance in computational biology and sociology, and many detection methods have been proposed. When detecting communities in social media networks, two sources of information are available: the link structure of the network, and the features and attributes of the nodes. Nodes in social media networks carry rich attribute information, which offers unprecedented opportunities and flexibility for the community detection process. Some community detection algorithms use only the links between nodes to determine the dense regions of the graph; such methods rely purely on the linkage structure of the underlying social media network. Other algorithms use node attributes to cluster the nodes, so that nodes with the same attributes are placed in the same cluster. Because traditional methods either use only one of the two sources or simply combine, linearly, the detection results obtained from each source, they cannot exploit node attributes effectively. In recent years, matrix factorization (MF) has received considerable interest in the data mining and information retrieval fields, and it has been applied successfully to document clustering, image representation, and other domains. In this paper, we use node attributes as supervision for the community detection process and propose a community detection algorithm based on joint matrix factorization (CDJMF).
Our method is based on the assumption that the two information sources, linkage and node attributes, yield an identical node affiliation matrix. This assumption is reasonable and captures the inner relationship between the two sources, which allows the performance of community detection to be greatly improved. We conduct experiments on three real social networks; theoretical analysis and numerical simulation results show that our approach outperforms several classical algorithms, so it is an effective way to explore the community structure of social networks.
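The shared-affiliation assumption can be illustrated with a minimal sketch. It is not the CDJMF algorithm itself, whose objective and update rules are not given in this abstract. It assumes a generic joint nonnegative factorization in which the adjacency matrix `A` and the node-attribute matrix `X` are approximated as `A ≈ U Vᵀ` and `X ≈ U Wᵀ` with a common nonnegative affiliation matrix `U`, fitted with standard multiplicative updates. The function name `joint_nmf`, the weight `alpha`, and the toy data are all hypothetical.

```python
import numpy as np

def joint_nmf(A, X, k, alpha=1.0, n_iter=300, eps=1e-9, seed=0):
    """Sketch: minimize ||A - U V^T||_F^2 + alpha * ||X - U W^T||_F^2
    over nonnegative U (n x k), V (n x k), W (m x k).
    U is shared by both terms (the assumed common affiliation matrix)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k))
    V = rng.random((n, k))
    W = rng.random((m, k))
    for _ in range(n_iter):
        # Multiplicative updates keep every factor nonnegative.
        U *= (A @ V + alpha * (X @ W)) / (U @ (V.T @ V + alpha * (W.T @ W)) + eps)
        V *= (A.T @ U) / (V @ (U.T @ U) + eps)
        W *= (X.T @ U) / (W @ (U.T @ U) + eps)
    return U, V, W

# Hypothetical toy network: two 3-node cliques joined by one edge (2-3).
A = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            A[i, j] = 1.0          # includes self-loops for a clean block shape
A[2, 3] = A[3, 2] = 1.0            # single bridge between the communities

# Hypothetical binary attributes: nodes 0-2 share attributes 0 and 1,
# nodes 3-5 share attributes 2 and 3.
X = np.zeros((6, 4))
X[:3, :2] = 1.0
X[3:, 2:] = 1.0

U, V, W = joint_nmf(A, X, k=2)

# Normalize columns so their arbitrary scales do not bias the assignment,
# then assign each node to its strongest affiliation.
U_norm = U / (U.sum(axis=0, keepdims=True) + 1e-12)
labels = U_norm.argmax(axis=1)
print(labels)
```

Here `alpha` controls how strongly the attribute term pulls on the shared `U`. Setting `alpha` to 0 reduces the sketch to a link-only factorization, which is one way to see what the attribute term adds.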
Keywords:
- matrix factorization
- node attributes
- community detection