Public Mood and Consumption Choices: Evidence from Sales of Sony Cameras on Taobao

  • Qingguo Ma,

    maqingguo3669@zju.edu.cn

    Affiliations School of Management, Zhejiang University, Hangzhou, China, Neuromanagement Lab, Zhejiang University, Hangzhou, China

  • Wuke Zhang

    Affiliations School of Management, Zhejiang University, Hangzhou, China, Neuromanagement Lab, Zhejiang University, Hangzhou, China

Abstract

Previous researchers have tried to predict social and economic phenomena using indicators of public mood extracted from online data. This approach has proved feasible in many areas, such as financial markets, economic operations and even national suicide numbers. However, few previous studies have examined the relationship between public mood and consumption choices at the societal level. The present study focused on the “Diaoyu Island” event and extracted data on Chinese public mood toward Japan from Sina Microblog (the biggest social media platform in China). The public mood variable showed a significant cross-correlation with sales of Sony cameras on Taobao (the biggest Chinese e-business company). Several candidate predictors of sales were then examined, and three significant stepwise regression models were obtained. Model estimation showed that significance (F-statistic), R-square and predictive accuracy (MAPE) all improved with the inclusion of the public mood variable. These results indicate that public mood is significantly associated with consumption choices and may be of value in sales forecasting for particular products.

Introduction

Traditionally, behavioral economics has emphasized the effect of emotion on individual behavior and decision-making. When enough people share the same mood, that mood may have a considerable influence on related events. For instance, public mood has been shown to influence presidential elections [1], economic indexes [2], fluctuations of financial markets [3–5] and even the number of national suicides [6]. Moreover, public mood information is valuable in predicting related events and indexes [7–11]. However, few prior studies have examined the relationship between public mood and consumption choices at the societal level, let alone applied public mood information to make predictions.

In previous studies, public mood information was extracted from news titles, survey data [5], Google search data [2] and social media [1, 3–6, 10]. Among these sources, social media data have been widely applied in forecasting financial indexes [3, 4], economic indexes [2] and suicide numbers [6]. Moreover, compared with offline data, social media data have been shown to predict financial indexes more accurately and with a greater lead time [5].

The territorial dispute between China and Japan is long-standing, and the conflict intensified in July 2012, when the Japanese government announced its ownership of Diaoyu Island. This led to a sharp fluctuation in Chinese public mood and subsequently changed Chinese consumption choices regarding Japanese products.

With the rapid development of IT and social media in China, it has become possible to extract public mood information from social media and thus to study the relationship between public mood and consumption choices. Because Sony cameras are a familiar Japanese product widely consumed in China, they were chosen to study the influence of Chinese public mood on the sales of Japanese products during the “Diaoyu Island” event. The key aims of this study were to extract public mood data from social media and to explore their relationship with the sales of Japanese products. This relationship would contribute to building forecasting models with public mood as the independent variable and sales data as the dependent variable.

Materials and Methods

Sales data

Without official authorization, it would be impossible to collect complete daily or weekly sales data for Sony cameras in China. Because Taobao (www.taobao.com) is the biggest Chinese e-business company, with more than 370 million members and tens of millions of daily deals, we chose Taobao as the source of sales data. On 2012/10/10, we collected the daily sales of Sony cameras (St) from 2012/8/1 to 2012/10/8 from shu.taobao.com. These data are available in S1 Dataset.

Social Media Data

Sina Microblog is one of the leading social media platforms in China. Its users include sports and movie stars, enterprise managers, media practitioners, government officials and people from nearly all other industries. Thus, Sina Microblog was chosen as the data source for public mood information.

As mentioned above, the Japanese government announced its ownership of Diaoyu Island in July 2012, which led to a fierce diplomatic conflict between China and Japan. In the present study, we defined daily original blogs (Bt) as the number of original blogs posted each day that simultaneously mentioned the Chinese words for “Diaoyu Island”, “territory” and “sovereignty” together with at least one of the following terms: “boycott of Japanese goods”, “defending Diaoyu Islands”, “defending sovereignty”, “defending territory”, “fighting for sovereignty”, “fighting for territory”, “protesting against the Japanese government”, “disdaining Japan” and “disdaining Japanese government”. Bt might therefore reflect the negative mood of the Chinese public toward Japan and Japanese products. A similar method has been adopted in previous studies [5, 6]. For example, one study used the tweet volumes of financial search terms (TV-FST) as a negative mood variable for the financial market [5], and another used the daily frequency of documents mentioning particular words as a negative public mood variable to forecast national suicide numbers [6]. All the media data used in this study can be collected from www.weibo.com; we collected them on 2012/10/10. These data are available in S1 Dataset.
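The counting rule for Bt can be illustrated with a minimal sketch. It assumes that all three base words must appear together with at least one mood term, uses the English glosses from the text as stand-ins for the original Chinese search terms, and takes posts as hypothetical (date, text) pairs; none of these details is confirmed by the study, and the function names are illustrative only.

```python
from collections import Counter

# English glosses used as stand-ins for the Chinese search terms (assumption).
REQUIRED_TERMS = ("Diaoyu Island", "territory", "sovereignty")
MOOD_TERMS = (
    "boycott of Japanese goods",
    "defending Diaoyu Islands",
    "defending sovereignty",
    "defending territory",
    "fighting for sovereignty",
    "fighting for territory",
    "protesting against the Japanese government",
    "disdaining Japan",
    "disdaining Japanese government",
)


def matches(text):
    """True if the post mentions every required term and at least one mood term."""
    has_required = all(term in text for term in REQUIRED_TERMS)
    has_mood = any(term in text for term in MOOD_TERMS)
    return has_required and has_mood


def daily_counts(posts):
    """Count matching posts per day.

    `posts` is an iterable of (date_string, text) pairs; this layout is a
    hypothetical one chosen for illustration.
    """
    counts = Counter()
    for day, text in posts:
        if matches(text):
            counts[day] += 1
    return dict(counts)
```

Each value in the returned dictionary plays the role of one daily observation of Bt.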

Ethics Statement

This study collected existing data that were publicly available on the Internet. No individual and personal details were identified. Therefore, ethics approval was deemed unnecessary.

Statistical analysis

Cross-correlation analysis is the basic method for forecasting one time series with another, and cross-correlation coefficients can be very helpful in building prediction models. However, this method does not always work on the raw series. A better approach is to find suitable functions to transform the time series so that the transformed series are significantly cross-correlated. Based on the new cross-correlation, forecasting models can be built that also include the autocorrelation of the dependent variable. Likewise, the key to exploring correlations in big data is to find appropriate functions that transform the variables and make them significantly correlated.

Based on these principles, cross-correlation analysis between daily original blogs (Bt) and daily sales of Sony cameras (St) was conducted first to explore their direct cross-correlation. We then explored appropriate functions to transform the two time series and obtain a better cross-correlation. Next, partial autocorrelation analysis of the daily sales of Sony cameras (St) was conducted to study the effect of earlier sales on later ones. Finally, three stepwise regression models were built from the variables identified by the two analyses (cross-correlation and partial autocorrelation) and were evaluated using the Mean Absolute Percentage Error (MAPE). All statistical analyses, including variable selection and model construction, were performed using SPSS 19.0.
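The lagged cross-correlation used as the first step above can be sketched as follows. This is a minimal illustration of the usual sample cross-correlation, not the SPSS routine used in the study; the function names are illustrative, and the 2/sqrt(n) bound is a common rule of thumb rather than the test the authors applied.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t - lag] and y[t].

    A positive lag means x leads y by `lag` steps. Means and sums of squares
    are taken over the full series, the usual convention for a sample CCF.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length")
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    return cov / (sx * sy)


def significance_bound(n):
    """Rule-of-thumb 95% bound for a zero cross-correlation: 2 / sqrt(n)."""
    return 2 / math.sqrt(n)
```

Evaluating `cross_correlation(blogs, sales, k)` for k = 1, 2, ... and comparing each value with `significance_bound(len(sales))` gives a rough screen for lags at which the blog series leads the sales series.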

Results

Trends of daily original blogs (Bt) and daily sales of Sony camera (St) with cross-correlation analysis

Over the 70-day period of this study, both daily original blogs (Bt) and daily sales of Sony cameras (St) fluctuated markedly (Fig 1). Bt varied far more than St, surging from 2 to nearly 30000. Cross-correlation analysis between St and Bt was also conducted; the results are shown in Table 1.

Fig 1. Trends of daily original blogs (Bt) and daily sales of Sony camera (St).

https://doi.org/10.1371/journal.pone.0123129.g001

Table 1. Results of cross-correlation analysis between daily original blogs (Bt) and daily sales of Sony camera (St).

https://doi.org/10.1371/journal.pone.0123129.t001

No relationship between Bt and St was intuitively apparent from Fig 1, mainly because of the wide range of Bt. We speculated that there might nonetheless be a cross-correlation between the two time series, and the results (Table 1) supported this conjecture: at lags of 1, 5 and 6 days, Bt was significantly cross-correlated with St.

Exploring transformation functions for daily original blogs (Bt) and daily sales of Sony camera (St)

The first transformation chosen was the moving average, since it would reduce drastic fluctuations in the two time series (Bt and St). We then tried many different transformation functions, including moving averages, logarithms and combinations of them with different parameters. We ultimately selected the following combination of transformation functions because, among all the combinations we tried, it gave a higher significance level and better forecasting lags. The final transformation functions were:

The trends of the transformed time series (Xt and Yt) are shown in Fig 2. Visually, Xt and Yt appeared to be negatively correlated. According to the curves of Xt and Yt, the 70-day period could be further divided into six sub-periods: 2012/8/1 to 2012/8/13, 2012/8/14 to 2012/8/25, 2012/8/26 to 2012/9/10, 2012/9/11 to 2012/9/24, 2012/9/25 to 2012/9/30 and 2012/10/1 to 2012/10/8. Within each sub-period, the negative cross-correlation between Xt and Yt was more obvious.

Fig 2. Trends of public mood variable (Xt) and camera sales variable (Yt).

https://doi.org/10.1371/journal.pone.0123129.g002
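The kind of transformation explored in this section, a moving average combined with a logarithm, can be sketched as follows. This is only an illustration of the family of functions described: the window length, the order of operations and the function names are assumptions, and the sketch should not be read as the final functions that produced Xt and Yt.

```python
import math


def moving_average(series, window):
    """Trailing moving average; the first window - 1 points are dropped."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def smooth_and_log(series, window):
    """Moving average followed by a natural log; all values must be positive."""
    return [math.log(v) for v in moving_average(series, window)]
```

Smoothing first damps single-day spikes, and the logarithm then compresses a range as wide as that of Bt (from 2 to nearly 30000) onto a scale comparable with the sales series.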

Cross-correlation test of public mood variable (Xt) and camera sales variable (Yt)

In this section, a cross-correlation test between the public mood variable (Xt) and the camera sales variable (Yt) was conducted; the results are shown in Fig 3.

In Fig 3, there were ten significant cross-correlation coefficients (p<0.05): Xt-6 and Yt, Xt-5 and Yt, Xt-4 and Yt, Xt-3 and Yt, Xt-2 and Yt, Xt-1 and Yt, Xt and Yt, Xt+1 and Yt, Xt+2 and Yt, and Xt+3 and Yt. Of these, the last four were of no use for predicting camera sales from the public mood variable, either because the advance lag was 0 (Xt and Yt) or because the forecasting direction was reversed (the camera sales variable would be predicting the public mood variable). Had we included all six preceding public mood variables (Xt-6, Xt-5, Xt-4, Xt-3, Xt-2 and Xt-1), the regression model might have suffered from high multicollinearity.

Thus, to determine which public mood variables should be included in the final prediction models, we conducted a regression analysis of the camera sales variable (Yt) on each public mood variable (Xt-6, Xt-5, Xt-4, Xt-3, Xt-2 and Xt-1) separately. The results are shown in Table 2.

Table 2. Regression analysis for each public mood variable (Xt-6, Xt-5, Xt-4, Xt-3, Xt-2 and Xt-1) and camera sales variable (Yt).

https://doi.org/10.1371/journal.pone.0123129.t002

These statistics showed that Xt-3 had the highest t-value, the lowest p-value and the highest R-square, so the public mood variable Xt-3 was included in the final prediction models. Moreover, the results of a stepwise regression of the camera sales variable (Yt) on all public mood variables (Xt-6, Xt-5, Xt-4, Xt-3, Xt-2 and Xt-1) also supported this choice:

In this stepwise regression, all the other public mood variables (Xt-6, Xt-5, Xt-4, Xt-2 and Xt-1) were removed, leaving only Xt-3.

In practice, when a cross-correlation test is used to predict one time series from another, valuable information in the forecasted variable itself might be ignored. Therefore, the autocorrelation of the camera sales variable (Yt) is studied in the next section.
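The per-lag comparison described above, regressing Yt on one lagged mood variable at a time and keeping the lag with the highest R-square, can be sketched as follows. This is a simplified stand-in for the SPSS analysis, with illustrative function names; it reports only the fitted coefficients and R-square, not t-values or p-values.

```python
def simple_ols(x, y):
    """Fit y = a + b * x by least squares; return (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b, (sxy * sxy) / (sxx * syy)


def best_lag(x, y, lags):
    """Regress y[t] on x[t - k] for each positive k in `lags`.

    Returns (k_best, results), where results maps each lag to its R-square.
    """
    results = {}
    for k in lags:
        _, _, r2 = simple_ols(x[: len(x) - k], y[k:])
        results[k] = r2
    return max(results, key=results.get), results
```

Calling `best_lag(mood, sales, range(1, 7))` mirrors the comparison of Xt-1 through Xt-6; in the study, that comparison singled out the lag of 3 days.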

Autocorrelation and autoregression analyses of camera sales variable (Yt)

Since partial autocorrelation coefficients correspond to autoregression models of time series data, we conducted a partial autocorrelation analysis of Yt (Fig 4).

Fig 4. Partial autocorrelation results of camera sales variable (Yt).

https://doi.org/10.1371/journal.pone.0123129.g004

Fig 4 shows that the camera sales variable (Yt) was partially autocorrelated at advance lags of 1, 2 and 5 days. Therefore, the variables Yt-1, Yt-2 and Yt-5 were entered into a stepwise regression with Yt as the dependent variable. There were two significant models, as follows:

Yt-5 was removed in the stepwise regression because its p-value was greater than 0.1, while Yt-1 and Yt-2 were retained in the autoregression. Thus, Yt-1 and Yt-2 were selected for the final prediction models.
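The partial autocorrelations behind this step can be computed from the sample autocorrelations with the Durbin-Levinson recursion, sketched below. This is one standard way to obtain them and is offered as an illustration; it is not necessarily the algorithm SPSS applies, and the function names are illustrative.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def pacf_from_acf(r):
    """Partial autocorrelations for lags 1 .. len(r) - 1 (Durbin-Levinson).

    `r` is a list of autocorrelations with r[0] == 1.
    """
    pacf = []
    prev = []  # prev[j] holds the AR coefficient phi_{k-1, j+1}
    for k in range(1, len(r)):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf
```

The value at position k - 1 of the result is the partial autocorrelation at lag k, that is, the coefficient on the lag-k term once lags 1 to k - 1 are accounted for; lags with large values are the candidates for an autoregression.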

Multiple regression models for Sony camera sales

Using the selected variables (Xt-3, Yt-1 and Yt-2), we built prediction models for camera sales (Yt) by stepwise regression (Table 3).

Table 3. Multiple regression models for camera sales (Yt).

https://doi.org/10.1371/journal.pone.0123129.t003

Table 3 shows three significant models (all p-values <0.001) with different independent variables: Model 1 included only a sales variable (Yt-1); Model 2 included a sales variable (Yt-1) and a public mood variable (Xt-3); and Model 3 included two sales variables (Yt-2 and Yt-1) and a public mood variable (Xt-3). The equations of these models were as follows:

Only Model 2 and Model 3 had a public mood variable (Xt-3).
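The structure of these models, Yt regressed on its own lags and on a lagged mood variable, can be sketched with an ordinary least-squares fit. This is a minimal sketch that assumes NumPy is available; it fits a fixed set of lags rather than performing stepwise selection, the function name is illustrative, and the coefficients it produces are not those reported in Table 3.

```python
import numpy as np


def fit_lagged_model(y, x, y_lags=(1,), x_lags=(3,)):
    """Least-squares fit of y[t] on an intercept, lagged y and lagged x.

    Returns (coefficients, r_squared); coefficients are ordered as
    [intercept, y lags..., x lags...].
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Skip the first `start` points so that every lagged value exists.
    start = max(max(y_lags, default=0), max(x_lags, default=0))
    target = y[start:]
    columns = [np.ones(len(target))]
    columns += [y[start - k : len(y) - k] for k in y_lags]
    columns += [x[start - k : len(x) - k] for k in x_lags]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((target - target.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot
```

With the defaults, the call has the structure of Model 2 (Yt-1 and Xt-3). Passing `x_lags=()` gives the sales-only structure of Model 1, and `y_lags=(1, 2)` with `x_lags=(3,)` gives that of Model 3.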

Model estimation

To test the value of the public mood variable in predicting camera sales, we compared the F-statistic, R-square and prediction accuracy of Model 1 and Model 2. Forecasting accuracy was measured by the Mean Absolute Percentage Error (MAPE), defined as follows:

Where At was the actual value and Ft was the predicted value at the time point t.
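As an illustration, the MAPE can be computed as in the sketch below. It assumes the conventional definition, the mean of |At - Ft| / |At| over all time points expressed as a percentage, which is consistent with the symbols At and Ft used here; the function name is illustrative.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes the conventional definition; actual values must be non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, actual values of 100 and 200 with predictions of 110 and 180 are each off by 10%, so the MAPE is 10.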

The results are shown in Table 4. Inclusion of the public mood variable (Xt-3) (1) promoted significance (the F-statistic fell from 53.146 to 32.575), (2) increased R-square (from 0.454 to 0.508) and (3) reduced the MAPE prediction error (from 12.70 to 11.35). Therefore, we concluded that public mood could influence consumption choices and was an appropriate indicator for sales forecasting.

Discussion

In previous studies, scholars attempted to extract public mood indicators from large amounts of online data (e.g. search engine and social media data) and studied their predictive validity for presidential elections [1], economic operations [2], financial market indexes [3–5] and even the number of national suicides [6]. These studies [1–6] demonstrated that it is feasible to extract public mood indicators from online data to make predictions. However, few previous studies have examined the relationship, especially the forecasting relationship, between public mood and consumption choices at the societal level.

Focusing on the “Diaoyu Island” event, this study extracted information on Chinese public mood toward Japan and Japanese products from social media, and then analyzed the cross-correlation between the public mood variable and the sales variable after applying suitable transformation functions. Finally, three prediction models for Sony camera sales (Yt) were built, with public mood information and earlier sales as independent variables. The results showed that (1) the public mood variable was significantly cross-correlated with the sales variable of a particular product, and this correlation could be used to build prediction models; and (2) adding the public mood variable to the prediction models promoted significance (reducing the F-statistic), increased R-square and reduced the MAPE prediction error. These results indicate that public mood was significantly associated with consumption choices and might be of value in sales forecasting for particular products. This study extends and supplements previous data mining research on online big data.

The main contributions of this study are as follows: (1) it focused on the “Diaoyu Island” event between China and Japan and empirically studied the influence of public mood on related consumption, which had not been examined in previous research; (2) beyond the correlation between public mood and related consumption, it found that public mood might be valuable in forecasting sales of particular products; and (3) it discussed the approach of applying variable transformations to probe the correlation between time series, which might offer a new way to analyze online big data in the future.

Supporting Information

S1 Dataset. Data of daily original blogs (Bt) and daily sales of Sony camera (St).

https://doi.org/10.1371/journal.pone.0123129.s001

(XLSX)

Acknowledgments

We would like to express our gratitude to Liang Meng and Tianzhi Li for their help with this project.

Author Contributions

Conceived and designed the experiments: QM WZ. Performed the experiments: QM WZ. Analyzed the data: QM WZ. Contributed reagents/materials/analysis tools: QM WZ. Wrote the paper: QM WZ.

References

  1. Tumasjan A, Sprenger TO, Sandner PG, Welpe IM. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. ICWSM; 2010. p. 178–85.
  2. Choi H, Varian H. Predicting the present with Google Trends. Economic Record. 2012;88(s1):2–9.
  3. Bollen J, Mao H, Pepe A, editors. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. ICWSM; 2011.
  4. Bollen J, Mao H, Zeng X. Twitter mood predicts the stock market. Journal of Computational Science. 2011;2(1):1–8.
  5. Mao H, Counts S, Bollen J. Predicting financial markets: Comparing survey, news, twitter and search engine data. arXiv preprint arXiv:1112.1051. 2011.
  6. Won H-H, Myung W, Song G-Y, Lee W-H, Kim J-W, Carroll BJ, et al. Predicting national suicide numbers with social media data. PLoS ONE. 2013;8(4):e61809. pmid:23630615
  7. Cook S, Conrad C, Fowlkes AL, Mohebbi MH. Assessing Google flu trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLoS ONE. 2011;6(8):e23610. pmid:21886802
  8. Ortiz JR, Zhou H, Shay DK, Neuzil KM, Fowlkes AL, Goss CH. Monitoring influenza activity in the United States: a comparison of traditional surveillance systems with Google Flu Trends. PLoS ONE. 2011;6(4):e18687. pmid:21556151
  9. Dugas AF, Jalalpour M, Gel Y, Levin S, Torcaso F, Igusa T, et al. Influenza forecasting with Google flu trends. PLoS ONE. 2013;8(2):e56176. pmid:23457520
  10. Asur S, Huberman BA, editors. Predicting the future with social media. Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on; 2010: IEEE.
  11. McIver DJ, Brownstein JS. Wikipedia Usage Estimates Prevalence of Influenza-Like Illness in the United States in Near Real-Time. PLoS Computational Biology. 2014;10(4):e1003581. pmid:24743682