Social media has gained popularity among young people across the world. Developments in telecommunications and information technology have further advanced the use of social media for communication. This study is built around the Japanese dairy producer Yakult Honsha Company, which has produced the health drink Yakult since 1935. Yakult is marketed in a distinctive bottle with a foil top. It is a fermented milk drink containing a strain of “Lactobacillus casei”, which is considered good for overall health and well-being. A few of the key characters in the Netflix movie “To All the Boys I’ve Loved Before” (released on 17th August 2018 in the USA) are shown drinking “a yogurt smoothie”. As young people have always been influenced by movies, the product immediately generated excitement and acceptance among the target youth audience, which supposedly led to an increase in sales. The stock price of Yakult also rose sharply after the release date of the movie. Thus, there is a need to study whether social media such as Twitter can positively influence the sales of a product by creating a positive sentiment.
It is interesting to note that the movie does not mention “Yakult” at all. It is well understood that not all movie fans use Twitter and vice versa, yet the phenomenon was quite remarkable. It is believed that approximately 6000 tweets are made per second on Twitter, and it is estimated that about 500 million tweets are made per day. Still, “Yakult” could stand out in all the noise. Twitter, which originally limited a single tweet to 140 characters (raised to 280 characters in November 2017), is very popular as a micro-blogging site, especially amongst the youth audience.
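The two volume figures quoted above can be checked against each other with simple arithmetic. The sketch below is only an illustration; the function name `tweets_per_day` is introduced here and is not part of the study.

```python
# Back-of-the-envelope check: does ~6000 tweets per second agree with
# ~500 million tweets per day? (Illustrative helper, not from the study.)
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tweets_per_day(tweets_per_second: float) -> float:
    """Scale a per-second tweet rate to a per-day volume."""
    return tweets_per_second * SECONDS_PER_DAY


daily = tweets_per_day(6000)
print(f"{daily / 1e6:.1f} million tweets per day")  # prints "518.4 million tweets per day"
```

At 6000 tweets per second the daily volume works out to roughly 518 million, so the two estimates are consistent to within rounding.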
A unique feature of the study is its qualitative design, which was adopted because many perspectives of the phenomenon were yet to be discovered. Since the aim was to explore whether social media could positively impact the sales of a product, a grounded theory approach was the best methodological choice for theory generation and elaboration. Another contribution of the study is that it improves the present understanding of the Expectancy Valence (EV) theory in an emerging market context. The grounded theory process starts when the first set of data is received, and further data collection is influenced by the analysis of that first set. There is no fixed minimum sample size; rather, purposive sampling is used to focus on cases that are information rich with respect to the study. In this study “key informants” were intensively interviewed to gain insights into the phenomenon. The study found that social media, in this case tweets, reinforced the positive behavioral sentiment about the movie and the product and also led to an increase in sales of the product.
The paper is structured as follows. After the introduction in Section 1, conceptual development is discussed in Section 2, followed by the qualitative analysis of tweets and Yakult’s stock price movement in Section 3. Section 4 discusses the data analysis of the in-depth interviews and the grounded theory development. Section 5 presents the findings, limitations and discussion, followed by the conclusion in Section 6.
2. Conceptual Development
A “sentiment” has been defined as a positive or negative feeling (Go et al.); such feelings collectively impact the acceptance of products and services. In the past, scholars have studied blogs and social networks, which led to opinion mining. Other scholars such as Yang et al. used web pages as the data for their study. The concept of using tweets as qualitative inputs for grounded theory development remained largely unexplored.
The Efficient Market Hypothesis (EMH) suggests that stock prices fluctuate largely on the basis of news and current events, and thus an understanding of the collective sentiment expressed in tweets would be useful for predicting such movements. Prior works such as that of Asur and Huberman predicted box office collections for movies prior to release based on Twitter analysis. Bollen studied whether public sentiment, as measured from Twitter, was correlated with the Dow Jones Industrial Average. Bing et al. studied tweets to predict sector-based stock prices. Poddar et al. studied the impact of location-based tweets. Hence, there was promise in using tweets as raw data in the theory building exercise.
Sotiriadis and Van Zyl highlighted that social media, which is based on word of mouth, was more credible than paid forms of media. However, none of these prior works used a qualitative interview method after the initial social media-based data collection to go deep into the cause of the phenomenon and to develop a grounded theory of social media’s impact on sales. The present study fills this important gap. The fact that Twitter’s API is open to other applications and that most of the data is public further positions tweets as an interesting input for this study.
The present study does not aim to perform a sentiment analysis; rather, it presumes the sentiment to be present based on primary observation of tweets and the resultant stock price movement of Yakult. The objective of the study is to de-mystify the sentiment through a qualitative probe leading to an extension of the Expectancy Valence (EV) theory. The study also borrows from the pragmatic constructivism school, to which construct causality is central. In the context of this study, construct causality is evidenced by the integration of facts, human behavior and possible outcomes, and by the way social media influences the target audience.
Twitter users are understood to belong predominantly to younger age groups, and such age groups are understood to be more influenced by their own peer groups. The starting premise of the study was to evaluate the influence of the movie on its audience and the subsequent creation of positive sentiment on Twitter, which proved substantial in creating demand for Yakult. The above discussion leads to the following hypotheses:
H1: Tweets influence the creation of a positive sentiment.
H2: A positive sentiment leads to a favorable sales trend.
The tweet data was collected and analyzed using NVivo. To obtain deeper insights into the findings from the tweet data, personal in-depth interviews were subsequently conducted with the target audience.
3. Qualitative Analysis of Tweets and Yakult Stock Price Movements
A qualitative research design was chosen since the aim of the study was to build and elaborate a theory. In this study both “how” and “why” questions needed to be addressed, and the “sense making” exercise was undertaken by the author. The author used queries, visualisation techniques and memos in NVivo during the first stage of the data analysis.
The author also undertook an extensive coding exercise. Twitter data has many dimensions, and thus NVivo was used for coding rather than coding manually. NVivo also assists in organizing the data, analyzing it in depth to find patterns, and presenting the findings. Coding from a “ten-thousand-foot view”, with a broader picture of the phenomenon, led to some very interesting findings. A total of 1992 records (tweets) were studied by creating 844 codes with references totaling 8857. The data were collected for 21 days, from 17th August 2018 to 6th September 2018 (both days included). The 17th of August was the movie’s release date in the USA. Although pre-release sentiments might have had some impact, they were not considered for the purpose of this study. It should also be noted that on any social media platform 21 days is not a very short period of time, as the topics of discussion change very frequently. The data was carefully screened and all irrelevant tweets were eliminated to improve the quality of the data used for the study. An illustration of the code book is given in Figure 1.
The tweets were captured using the “NCapture” add-in without causing any obstruction or inconvenience to the tweeters. Searches by hashtag, keyword and username were undertaken. For instance, one keyword search was made for “Yakult Sales Netflix” and a second for “Yakult + To All the Boys I’ve Loved Before”.
Finally, two “NCapture” sets were made to collect the relevant tweets. The two data sets were eventually merged and duplicate tweets were eliminated. A purposive sampling process was undertaken, followed by an assessment of the topics/themes discussed in the tweets. The data had to be collected carefully, as the likelihood of misspellings and slang is high in tweets. Similar items were clustered together using the “Cluster Analysis” method, as shown in Figure 2.
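The merge-and-deduplicate step can be sketched outside NVivo as follows. This is a minimal illustration only, not the procedure NVivo applies internally; the record layout (a dictionary with `id` and `text` keys), the function name `merge_captures` and the sample records are all assumptions made for the example.

```python
# Hypothetical sketch: merge two capture sets and drop duplicate tweets.
# The record layout ("id", "text") is assumed for illustration.

def merge_captures(*captures):
    """Merge capture sets, keeping the first occurrence of each tweet ID."""
    seen_ids = set()
    merged = []
    for capture in captures:
        for tweet in capture:
            if tweet["id"] not in seen_ids:
                seen_ids.add(tweet["id"])
                merged.append(tweet)
    return merged


# Made-up records standing in for the two keyword searches.
capture_a = [{"id": 101, "text": "first example tweet"},
             {"id": 102, "text": "second example tweet"}]
capture_b = [{"id": 102, "text": "second example tweet"},
             {"id": 103, "text": "third example tweet"}]

merged = merge_captures(capture_a, capture_b)
print([tweet["id"] for tweet in merged])  # prints [101, 102, 103]
```

Deduplicating on a tweet identifier rather than on the text keeps two distinct tweets with identical wording as separate records, which matters when counting references.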
Items were clustered by coding similarity using Jaccard’s coefficient as the similarity metric (Figure 3).
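For readers unfamiliar with the metric, Jaccard’s coefficient between two items is the size of the intersection of their code sets divided by the size of the union. The sketch below uses the textbook definition; how NVivo computes it internally is not described in this paper, and the code sets shown are made up for illustration.

```python
# Textbook Jaccard coefficient between two sets of codes.
# The example code sets are invented; they are not taken from the study data.

def jaccard(codes_a: set, codes_b: set) -> float:
    """Return |A ∩ B| / |A ∪ B|; two empty sets are treated as 0.0 here."""
    union = codes_a | codes_b
    if not union:
        return 0.0  # convention chosen for this sketch
    return len(codes_a & codes_b) / len(union)


item_x = {"yakult", "netflix", "sales"}
item_y = {"yakult", "netflix", "movie", "drink"}

print(jaccard(item_x, item_y))  # 2 shared codes out of 5 distinct codes -> 0.4
```

A value of 1 means the two items were coded identically and 0 means they share no codes; a clustering routine would group items whose pairwise coefficients are high.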
Figure 1. The code book.
Figure 2. Items clustered by attribute value similarity.
As per the terms and conditions of Twitter, only the derivations and conclusions drawn from the processed data are discussed in the paper. Therefore, no raw data or individual tweeter is discussed for the purposes of the study, no matter how prolific or significant they may be. This research was conducted purely for academic purposes to support the theory building process. Again, as per Twitter’s terms of usage, only public tweets are used for this study. This was manageable since most Twitter accounts are public, and there was no need to purchase any data from Twitter. The tweeter always has the right to withdraw a tweet at any point in time. However, a major point of contention amongst scholars is whether Twitter data is primary or secondary data. The author does not resolve this conflict, as it is beyond the scope and objectives of this study. The Bloomberg terminal was accessed to ascertain the trend in the stock price movement of Yakult (Figure 4). The stock price shows a rising trend during the first 21 days after the release of the movie.
Figure 3. Items clustered by coding similarity.
Figure 4. Yakult stock price movement. Source: Bloomberg accessed on 11 Sep 2018.
4. Data Analysis of the In-Depth Interviews and the Grounded Theory Development
Multiple data sources are a must for such a study to ensure rigor in qualitative data analysis; therefore, apart from the tweets, personal in-depth interviews were also undertaken to build the grounded theory. The target respondents were between 18 and 25 years of age. The respondents were chosen so that males and females were nearly equally represented in the sample interviewed. The author was open to the idea that the research propositions could be refined, modified or extended beyond the expectancy valence (EV) theory as the study progressed. Judgment was intentionally suspended after the literature review so as not to prejudice the findings of the interviews with the researcher’s preconceived notions. Only the initial research questions were grounded in the existing literature. Many follow-up questions were asked of the respondents. The questions were predominantly open ended; however, a few closed-ended questions were also asked to ensure a clear understanding of the responses. For instance, one question asked was “Please share an instance when you got influenced to try a product based on the media frenzy surrounding a movie?”. The aim of this question was to understand and observe the respondent’s emotional intensity and the experiences behind the purchase decision. The respondents were interviewed in their natural setting, i.e. their colleges and institutions of higher learning. Each interview lasted between 20 and 30 minutes.
The interview method, which advocates “going native” and exploring deeply while keeping the author’s role to a minimum, was adopted, with the informants treated as the “knowledge agents”. The informants reported everything that was observed or discussed (Table 1). The interviewees of the semi-structured interviews were promised complete anonymity in order to obtain free and honest responses. The notes taken and observations made were shown back to the respondents and then transcribed. The interview protocol revision process was also undertaken.
Table 1. “Data Structure” of a select few demonstrative quotes from qualitative interviews leading to theory building.
As suggested in the Gioia Methodology, the author revised the interview questions in light of the progress made in the previous interviews and of the primary “Gestalt Analysis”, i.e. the whole is more comprehensive and meaningful and so should be interpreted collectively. A coding exercise was undertaken to maintain the integrity of the 1st Order Terms, followed by a comprehensive “compendium of 1st Order Terms”. The author then converted the 1st Order Terms into 2nd Order (Theory Centric) themes. First, a “bottom-up” inductive process was used to develop patterns in the data, followed by a “top-down” deductive logic to integrate the themes with the data captured. The final result of the process was the creation of a data structure through the identification of principles that are portable. Thus, for final reporting purposes, the data structure was created as follows:
• The 1st order concepts (informants’ words)
• The 2nd order themes (author’s key phrases as researcher)
• Aggregate Dimensions
• Finally, a second round of literature review was undertaken to confirm the “overall concept” and the “smaller and more specific construct” and so conclude the research process.
5. Findings, Limitations and Discussion
Social media analytics involves collecting data from social media to gain insights into users’ tastes and preferences. The research illustrates the use of what comes close to “tweet-related big data” for developing a primary and quick understanding of a phenomenon. It then extends that understanding with in-depth interviews in a theory development exercise that validates and extends the expectancy valence (EV) theory with the help of qualitative data.
The study had some limitations as well:
• The first limitation is that data was collected for 21 days only, so the observations are based on a short-term trend.
• The second limitation is that only English-language data was collected.
• Finally, a challenge of microblogging-based studies is the wide range of topics, and this study was no exception.
A word cloud was created to visually analyse the most prominent words in the tweets. As can be seen from Figure 5, the words Yakult, boys, loved, Netflix and Yog are very prominent in the word cloud. Although there are other prominent words as well, only these words are relevant for the purpose of this study.
The words Yakult, Netflix, sales, boys and loved are among the ten most frequently used words in the 1992 tweets (Figure 6). These words contribute to the positive sentiment for Yakult and the increase in sales, eventually giving support to both H1 and H2.
Figure 5. The word cloud.
Figure 6. Ten most frequently used words in the tweets.
The word frequency (Table 2) clearly highlights that the 6-letter word “Yakult” had a weighted percentage of 6.3% with an actual count of 5239. This was the highest frequency among the 30 most frequently used words. The other prominently used words relevant for this study are yogurt (word count of 712; weighted percentage of 0.86%), drink (word count of 598; weighted percentage of 0.72%), shelves (word count of 365; weighted percentage of 0.44%), and fly (word count of 322; weighted percentage of 0.39%).
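A word-frequency table of this kind can be approximated with a few lines of code. The sketch below is an assumption-laden illustration: it treats the weighted percentage simply as a word’s count divided by the total number of counted words, which may differ from NVivo’s own weighting (for example when stemmed or similar words are grouped). The function name `word_frequencies`, the minimum word length and the sample texts are invented for the example.

```python
import re
from collections import Counter

# Illustrative word-frequency query. "Percentage" here is count / total counted
# words, an approximation that may not match NVivo's weighted percentage.

def word_frequencies(texts, min_length=3, top_n=10):
    """Return (word, count, percentage) rows for the most frequent words."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(token for token in tokens if len(token) >= min_length)
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, count, 100.0 * count / total)
            for word, count in counts.most_common(top_n)]


# Made-up texts; these are not tweets from the study data.
sample_texts = [
    "Yakult is flying off the shelves",
    "Netflix made me buy Yakult",
    "that yogurt drink is Yakult",
]

rows = word_frequencies(sample_texts)
word, count, percentage = rows[0]
print(word, count, round(percentage, 2))
```

Under the same approximation, the reported figures are mutually consistent: a count of 5239 at 6.3% and a count of 712 at 0.86% both imply a total of roughly 83,000 counted words.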
The coding by Twitter user (Figure 7) highlights the presence of a couple of dominant tweeters. However, the individual tweeters’ identities, profiles and objectives are beyond the scope of this study and thus were not probed further.
A detailed “geo assessment” of tweets was undertaken to ascertain the impact of location-based tweets, as it was relevant to know the origin of the tweets. The map based on tweets (Figure 8) clearly shows that the tweets were a global phenomenon and were not restricted to a single city or country.
Figure 7. Coding by Twitter user on the basis of number of tweets.
Table 2. Word frequency of the top 30 words.
Figure 8. Map based on tweets.
A cluster analysis based on word similarity (Figure 9) shows that the words used in the tweets are quite intermixed in the different tweets. As per Figure 9, the words make a complex web, deeply interwoven, and thus the words are used repeatedly in the tweets in different combinations.
As advocated by Gioia et al., the use of codes based on existing theory (in this case the EV theory) should be deliberately delayed to the last stage of data analysis. They suggest that the information obtained from the respondents should not be diluted by the burden of conforming to existing theories. This way the ideas are grounded in data rather than in prior research studies. Accordingly, the author used the existing theory only in the last stage of data analysis, in conformity with the “Gioia Methodology”, which sees value in semi-ignorance of the existing literature.
As per the EV theory, individuals make behavioral choices on the basis of anticipated outcomes (valences) and their estimates of the likelihood (expectancies) of attaining those outcomes. This was evident from both the Twitter data and the interview data. The respondents wanted to identify with the characters of the movie, and the drink helped them achieve this goal. Thus, companies can position their products where the target audience wishes to identify itself. As per the EV theory, higher motivation is associated with a higher perceived value of the rewards and a higher perceived probability that efforts will lead to rewards. As observed in the analysis of the tweets and of the interviews, the movie did motivate the target audience to use the product. Griffin and Harell stated that the additive form of the expectancy model accurately predicts the level of motivational force acting on an individual. This was also the case here. From the data it can be concluded that tweets impact the sales of products by initiating positive behavioral sentiments. As highlighted by Feather and O’Brien in the context of expectancy valence analysis, there is also a need to take into consideration the dynamic processes that influence behaviour over time. However, this can only be explored by future researchers.
Figure 9. Cluster on the basis of word similarity.
Parker and Roffey suggested that grounded theory aims to integrate the researcher’s understanding into an explanation of the structures and processes observed, thereby interpreting the data instead of simply reporting it. In the context of the present study, the “dynamic relationship” of cause and effect between positive sentiment reinforced by tweets and an increase in sales can be expected in future as well. To this extent, this theory of tweet-based sales increase is generalizable and predictable. There is also significant potential for further discovery about this theory in other contexts, which can be taken up by future researchers.
Twitter is a popular website for microblogging where users create status messages (called “tweets”). Tweets are, however, limited in length: originally to 140 characters of text, raised to 280 characters in November 2017. The Netflix-Twitter-Yakult triad is a good case illustration of the power of social media in positively influencing the intended audience. This finding has significant ramifications for strategic planning as well. For the short run, social media buy-in is a good strategy. It also appears that, since firms innovate to create competitive advantage and managers take risks even in highly controlled environments to drive such innovation, the innovative influence of social media extends to the strategic planning domain for long-run decisions as well. Social media is an impactful and innovative method of positively influencing the target audience, especially since mobile technologies have come to be used for both social and professional networking by individuals and professionals alike. The advancements in mobile technology have made the use of social media very convenient.
The consumption of information generated on social media, and the resultant actions, are not limited to the youth. For instance, management accountants have started appreciating big data for decision making, and entrepreneurs have adopted new technologies such as mobile banking for their financial needs. Thus there seems to be a deep integration of communication technology, mobile technology and information technology which is continuously re-shaping the world we live in. The study highlights the growing significance of social media in shaping sentiments by influencing perception and opinion (valence), which eventually influences the sales of products (expectation). The study found that the reason for the immediate sales-linked stock price increase was the positive behavioral sentiment created by social media.