Over the past several years, with the development of the Internet, social media websites such as Twitter and Weibo have received much attention because of their enormous user bases. Much research has addressed sentiment analysis and opinion mining on these websites. However, few studies have used social media data to predict stock market price movements. Behavioral economics and behavioral finance hold that public mood is correlated with economic indicators and that financial decisions are significantly driven by emotions. This paper first presents a Chinese emotion mining approach and then discusses whether the public emotions or opinions expressed on Chinese social media websites can be used to predict stock market prices in China. The experimental results demonstrate that the emotions automatically extracted from large-scale Weibo posts represent real public opinion about certain stock market topics in China. According to Granger causality analysis, some extracted public mood states, such as “Happiness” and “Disgust”, are highly correlated with changes in stock price. Finally, a nonlinear autoregressive model with exogenous sentiment inputs is proposed to predict stock price movement.
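To illustrate what a Granger causality analysis between a mood series and a price series involves, the following is a minimal sketch rather than the procedure used in the paper. It compares a restricted autoregression of the price series on its own lags with an unrestricted one that also includes lagged mood values, and reports the resulting F statistic. The function names `ols_rss` and `granger_f`, the lag of 1, and the synthetic `mood` and `price_change` series are all assumptions made for illustration; no real Weibo or market data is used.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that lagged x helps predict y."""
    n = len(y)
    targets = y[lag:]
    restricted, unrestricted = [], []
    for t in range(lag, n):
        own_lags = [1.0] + [y[t - i] for i in range(1, lag + 1)]
        restricted.append(own_lags)
        unrestricted.append(own_lags + [x[t - i] for i in range(1, lag + 1)])
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic example (assumption): price changes follow yesterday's mood plus noise.
rng = random.Random(0)
mood = [rng.gauss(0, 1) for _ in range(200)]
price_change = [0.0] + [0.8 * mood[t - 1] + rng.gauss(0, 0.5) for t in range(1, 200)]

print("F (mood -> price):", granger_f(mood, price_change))
print("F (price -> mood):", granger_f(price_change, mood))
```

A large F statistic in the first direction and a small one in the reverse direction is the pattern such an analysis looks for. In practice the statistic would be compared against an F distribution to obtain a p-value, and several lags would be tested.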
In this paper, we investigate detecting opinion spammer groups by analyzing how users interact with each other. More specifically, our approaches are based on 1) discovering strong vs. weak implicit communities by mining user interaction patterns, and 2) revealing positive vs. negative communities through sentiment analysis of user interactions. Through extensive experiments over various datasets collected from Amazon, we found that the discovered strong, positive communities are significantly more likely to be opinion spammer groups than other communities. Interestingly, although our approach focuses mainly on the characteristics of user interactions, its performance is comparable to that of a state-of-the-art content-based classifier that relies on features extracted from user reviews. More importantly, we argue that our approach can be more robust than the latter: if spammers superficially alter their review contents, our approach can still reliably identify them, whereas content-based approaches may fail.
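The idea of combining interaction strength with interaction sentiment can be sketched as follows. This is an illustrative simplification under stated assumptions, not the method of the paper: the abstract does not define "strong" or "positive" precisely, so here a pair of users is treated as strongly connected when they interact at least `min_count` times, and as positive when the mean sentiment of those interactions exceeds `min_sentiment`. The function name `strong_positive_communities`, both thresholds, and the toy interaction records are hypothetical.

```python
from collections import defaultdict


def strong_positive_communities(interactions, min_count=2, min_sentiment=0.0):
    """Group users linked by frequent, positive interactions.

    interactions: iterable of (user_a, user_b, sentiment_score) tuples,
    where sentiment_score is assumed to lie in [-1, 1].
    """
    # Aggregate the interaction count and sentiment sum per unordered user pair.
    stats = defaultdict(lambda: [0, 0.0])
    for a, b, score in interactions:
        pair = tuple(sorted((a, b)))
        stats[pair][0] += 1
        stats[pair][1] += score

    # Keep only edges that are both strong (frequent) and positive.
    graph = defaultdict(set)
    for (a, b), (count, total) in stats.items():
        if count >= min_count and total / count > min_sentiment:
            graph[a].add(b)
            graph[b].add(a)

    # Connected components of the filtered graph are the candidate communities.
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            user = stack.pop()
            if user in members:
                continue
            members.add(user)
            stack.extend(graph[user] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


# Toy data (assumption): u1-u3 praise each other repeatedly, u4-u5 argue,
# and u6 interacts with u1 only once.
toy = [
    ("u1", "u2", 0.9), ("u2", "u1", 0.8),
    ("u2", "u3", 0.7), ("u3", "u2", 0.6),
    ("u4", "u5", -0.5), ("u4", "u5", -0.7),
    ("u6", "u1", 0.9),
]
print(strong_positive_communities(toy))
```

Because the grouping depends only on who interacts with whom and in what tone, rewording the review text itself would leave the resulting communities unchanged, which is the intuition behind the robustness argument above.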