PsyFi

Thursday 5 January 2012

Noise, Sentiment and StockTwits

Don't be Sentimental

As we saw in Idiot Noise Traders, it very much looks like there are people out there trading on the random oscillations in markets. Those oscillations themselves make predicting the markets extremely difficult, particularly at times when irrational noise traders are dominating proceedings by synchronising their behavior. If this hypothesis is true then increasing ease of access to real-time internet trading data and opinions ought to be making markets less efficient, rather than more.

This implies that a contrarian investor should be looking to bet against the noise traders, rather than against the performance of stocks, so it’s of significant interest to figure out what the current sentiment of day traders is. Some recent research on the behavior of investors using the microblogging site StockTwits offers some interesting clues to whether this might work.

Predictive Sentiment

‘Sentiment’ is one of those concepts that’s quite difficult to pin down, but the general idea is that it’s whether investors are bullish or bearish on a stock. In the context of noise traders the idea is that they hold beliefs that aren’t justified by the normal pricing metrics – stuff like cashflow or risks. There really isn’t much doubt that investor sentiment can significantly affect markets – it’s odd that there ever was – so the question becomes how to measure this, and how to exploit it.

Amongst the earliest research into the noise trader phenomenon was Noise Trader Risk in Financial Markets, by De Long and colleagues, which posits that:
“Much of the behavior of professional arbitrageurs can be seen as a response to noise trading rather than as trading on fundamentals. Many professional arbitrageurs spend their resources examining and predicting the pseudo-signals noise traders follow in order to bet against them more successfully. These pseudo-signals include volume and price patterns, sentiment indices, and the forecasts of Wall Street gurus.”
Of course, the rise of the internet has led to a whole new stream of information for and from noise traders and this has received plentiful attention from the research community. In Is All That Talk Just Noise? Werner Antweiler and Murray Z. Frank looked at the chat on internet stock bulletin boards and determined that sentiment didn’t predict price movements. However, they did find that posting was correlated with both trading volume and volatility – which was a surprise, because it indicates that all of this apparently idle wittering actually contains some real information.

Price Predictions on StockTwits

The conclusion that the postings don’t predict price direction has been repeated a number of times, so it was a bit of a shock to find that a recent paper looking at the StockTwits microblogging platform did find such a relationship. In Investigating Predictive Power of Stock Micro Blog Sentiment in Forecasting Future Stock Price Directional Movement the authors, Chong Oh and Olivia R. Liu Sheng, find that:
“We establish that stock micro blog with its succinctness, high volume and real-time features do have predictive power over future stock price movements.”
Which is interesting but, as we saw in Twits, Butter and the Super Bowl Effect, the automated analysis of data from Twitter is not entirely straightforward. The first part of this requires something called sentiment analysis, which is a set of computerised algorithms to extract subjective information from the postings. This isn’t uncontroversial, as this article by Bing Liu records:
“All the sentiment analysis tasks are very challenging. Our understanding and knowledge of the problem and its solution are still limited. The main reason is that it is a natural language processing task, and natural language processing has no easy problems.”
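
To make the idea of sentiment analysis a little more concrete, here is a minimal sketch of the crudest possible approach: counting bullish and bearish words in a post. It is not the method used in the StockTwits paper or in any of the research mentioned here; the word lists, the function names and the example posts are all invented for illustration, and real systems rely on much larger vocabularies or trained classifiers.

```python
import re

# Hypothetical mini-lexicons, invented for this sketch only.
BULLISH = {"buy", "long", "bullish", "breakout", "upgrade", "calls"}
BEARISH = {"sell", "short", "bearish", "breakdown", "downgrade", "puts"}


def score_post(text):
    """Return (number of bullish words) minus (number of bearish words)."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in BULLISH for w in words) - sum(w in BEARISH for w in words)


def label_post(text):
    """Classify a single post as 'bullish', 'bearish' or 'neutral'."""
    score = score_post(text)
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


def bullishness(posts):
    """Share of bullish posts among all non-neutral posts (None if there are none)."""
    labels = [label_post(p) for p in posts]
    bull = labels.count("bullish")
    bear = labels.count("bearish")
    if bull + bear == 0:
        return None
    return bull / (bull + bear)


# Made-up example posts, not real StockTwits data.
posts = [
    "Going long $XYZ, breakout coming",
    "Time to sell, looks bearish",
    "Oh great, another upgrade. Sure.",
]
for post in posts:
    print(label_post(post), "|", post)
print("bullishness:", bullishness(posts))
```

Note that the third post is labelled bullish because it contains the word "upgrade", even though a human reader would probably hear sarcasm. That is exactly the kind of difficulty that makes the task harder than it looks.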

Research Puzzler

Setting this concern aside the actual research paper itself is both interesting and puzzling. The researchers assert that the nature of the StockTwits platform lends itself to improved information flow:
“These three distinct characteristics, succinctness, high volume and real-time, greatly facilitate the diffusion of investing information in the community … Since users may only send short updates or postings up to 140 characters long, postings are brief and succinct by design. This may reduce noise and transmits a more relevant message to the recipients.”
The idea, therefore, seems to be that the very conciseness of tweets improves clarity and information, and reduces the noise. The implication, you’d think, is that the postings predict future prices because their information content is high; in other words, this is not noise trading. However, the paper then states:
“Noise traders have the ability to affect stock prices whenever information can be shared among investors and spread quickly through web channels … Microblogging is one such channel where sentiments of irrational investors spread quickly with the continuous streaming of information. Specifically sentiment, represented by opinions, plays a vital role in this diffusion.”
Which seems to state categorically that the predictions succeed because sentiment-driven noise traders are synchronising their trades. The only reasonable explanation I can come up with that fits both points is that this is sentiment-based, and therefore “irrational”, but that the Twitter format makes the sentiment extremely obvious: there are few awkward questions of the ilk “I wonder what they meant by that”. This may translate into more reliable sentiment analysis as well, with fewer of those tricky sarcastic comments to trip up the algorithms.

Social Media Predictions

The other confirmed predictions are that bearish postings are more predictive of future price movements than bullish ones and that investors tend to under-react to new information. These findings are in accordance with other behavioral research, providing some level of confirmation of the analysis. So what to make of this? Well, firstly note that this is all about relatively short-term trading, over the ten days following postings. Secondly, it appears to cover some quite large-cap stocks: if tweets were only influencing the movements of thinly traded, rarely analysed firms then this would be less surprising.

It’ll be interesting to see the wider reaction to this research, but the analysis going on around social media platforms is rapidly becoming quite intriguing. See, for instance, Predicting the Future with Social Media, which has come out of HP Labs. Using sentiment analysis they showed that:
“The rate at which movie tweets are generated can be used to build a powerful model for predicting movie box-office revenue. Moreover our predictions are consistently better than those produced by an information market such as the Hollywood Stock Exchange, the gold standard in the industry.”
I don’t know if StockTwits sentiment analysis can really predict market movements over any time period, but I will make a prediction. Someone, somewhere, is building a system to trade on the possibility. It's an arms race, and in arms races it's usually the side with the deepest pockets and the smartest people that wins. Only, in a world of globally networked social media it's not so obvious who that's going to be.


  1. All this noise is good for value investors. Anytime people are trading based on things that have nothing to do with fundamentals, there is a real chance for prices to become, shall we say, detached from reality.

    Another interesting thought is that when you trade counter to general sentiment (like buying when more people want to sell), then you are providing a service to the market by supplying liquidity. It is rational to expect to (in general) be compensated for this service.

  2. Actually I don't find it that surprising that, if it works, it works for large caps as well. Whatever platform you're looking at, you are looking at a sample of that noisy activity. If the noise on StockTwits is a representative sample (and on balance is it not likely to be? why would it differ?) then the sample on one channel should be representative of all channels, just as a poll of a good sample of 1000 people gives you (broadly) a good idea of what millions think.