Detecting opinion spammer groups and spam targets through community discovery and sentiment analysis

Choo, Euijin; Yu, Ting; Chi, Min

doi:10.3233/JCS-16941

Journal of Computer Security

Published online: 5 September 2023
Type: Research Article

Abstract

In this paper, we investigate the detection of opinion spammer groups by analyzing how users interact with each other. More specifically, our approach is based on 1) discovering strong vs. weak implicit communities by mining user interaction patterns, and 2) revealing positive vs. negative communities through sentiment analysis of user interactions. Through extensive experiments over various datasets collected from Amazon, we found that the discovered strong, positive communities are significantly more likely to be opinion spammer groups than other communities. Interestingly, although our approach focuses mainly on the characteristics of user interactions, it is comparable to the state-of-the-art content-based classifier, which relies mainly on content-based features extracted from user reviews. More importantly, we argue that our approach can be more robust than the latter: if spammers superficially alter their review contents, our approach can still reliably identify them, whereas content-based approaches may fail.
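
To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes each interaction is already labeled with a sentiment of +1 or -1, treats a user pair as "strongly" linked once it reaches a hypothetical `min_strength` count of positive interactions, and takes connected components of the resulting graph as candidate strong, positive communities. The function name, the threshold, and the use of connected components are all assumptions made for illustration.

```python
from collections import defaultdict


def positive_communities(interactions, min_strength=2):
    """Group users linked by repeated positive interactions.

    interactions: iterable of (user_a, user_b, sentiment) tuples, where
                  sentiment is +1 (positive) or -1 (negative).
    min_strength: hypothetical threshold; a pair needs at least this many
                  positive interactions for the link to count as strong.
    Returns a list of sets of user ids.
    """
    # Count positive interactions per unordered user pair.
    pair_counts = defaultdict(int)
    for a, b, sentiment in interactions:
        if sentiment > 0 and a != b:
            pair_counts[frozenset((a, b))] += 1

    # Keep only strong links and build an undirected adjacency list.
    adjacency = defaultdict(set)
    for pair, count in pair_counts.items():
        if count >= min_strength:
            a, b = tuple(pair)
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components of the strong, positive graph (iterative DFS).
    seen, communities = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            user = stack.pop()
            if user in seen:
                continue
            seen.add(user)
            component.add(user)
            stack.extend(adjacency[user] - seen)
        communities.append(component)
    return communities
```

Calling `positive_communities(records)` on a list of labeled interaction tuples returns the user groups joined by strong positive links. Pairs with only negative or infrequent positive interactions are left out, which mirrors the strong vs. weak and positive vs. negative distinctions only in a simplified form.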

Keywords

Opinion spammer groups, sentiment analysis, community discovery
