Filtering Social Media To Find Signal Out Of Noise

Yesterday's WSJ had an interesting piece about Alacra's new Pulse Pro offering. For those that don't know, I invested in Alacra in 1999 via Flatiron Partners and have been on its board ever since.

Alacra has been developing and selling information services to the banking, brokerage, accounting, and consulting businesses for almost 15 years. They use the web, sophisticated data aggregation, filtering, and packaging approaches to deliver powerful information products to the most demanding knowledge professionals in the world.

And so their take on social media is worth looking at. Their Pulse product starts with media available on the open web, from blogs to news articles, and then applies a set of filters to produce useful insights. As they explained to the WSJ:

Alacra's PulsePro tries to tackle the issue in several ways. First, it
only looks at blogs the company deems credible. The blogs are combined
with articles from traditional media companies for a total of about
3,000 sources. Rather than trying to codify all the text within each
source, it focuses on specific items such as quotes from well-reputed
Street analysts and C-level executives. Sentiment ratings are assigned
based on the language used.
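
The steps the WSJ describes (screen for credible sources, pull out quotes, rate sentiment from the language) can be sketched roughly as below. This is a toy illustration under stated assumptions, not Alacra's implementation: the source whitelist, the word lists, and every function name here are hypothetical.

```python
import re

# Hypothetical whitelist of sources deemed credible (an assumption, not Alacra's list).
CREDIBLE_SOURCES = {"wsj.com", "example-analyst-blog.com"}

# Toy sentiment word lists (an assumption); a real system would be far richer.
POSITIVE = {"strong", "beat", "upgrade", "growth"}
NEGATIVE = {"weak", "miss", "downgrade", "decline"}


def extract_quotes(text):
    """Return the passages that appear inside straight double quotes."""
    return re.findall(r'"([^"]+)"', text)


def score_sentiment(quote):
    """Count positive words minus negative words in the quote."""
    words = re.findall(r"[a-z]+", quote.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def rate_articles(articles):
    """Drop non-credible sources, then score every quote that remains."""
    ratings = []
    for article in articles:
        if article["source"] not in CREDIBLE_SOURCES:
            continue  # the credibility screen comes first
        for quote in extract_quotes(article["text"]):
            ratings.append((article["source"], quote, score_sentiment(quote)))
    return ratings
```

The point of the ordering is that the sentiment step only ever sees a few thousand screened sources, rather than the whole open web.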

What's interesting is that this data set is apparently producing enough signal that Wall Street traders are using it to predict stock price movements. More from the WSJ:

Through backtesting, Alacra has found the ratings generated by its
product can lead movements in stock prices by about one to three weeks
for large-capitalization stocks. In turn, hedge funds and proprietary
traders are interested in the feed despite that it won't work anywhere
near the lightning-fast speeds they've been achieving for much of their
other computer-based trading.
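
To make "ratings lead prices by one to three weeks" concrete, here is a minimal sketch of a lead-lag check: correlate the rating in one period with the return some number of periods later, and see which lag fits best. It is an illustration only, not Alacra's backtesting method; the function names are hypothetical and the two series are assumed to be equal-length and sampled at the same interval (weekly, say).

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(ratings, returns, lag):
    """Correlate the rating in period t with the return in period t + lag."""
    if lag < 0 or lag >= len(ratings):
        raise ValueError("lag must be between 0 and len(ratings) - 1")
    return pearson(ratings[: len(ratings) - lag], returns[lag:])


def best_lead(ratings, returns, max_lag):
    """Return the lag (in periods) at which ratings correlate best with later returns."""
    return max(range(max_lag + 1), key=lambda lag: lagged_correlation(ratings, returns, lag))
```

A high correlation at a lag of one to three weekly periods, and a weak one at lag zero, is the kind of pattern the quoted claim implies. A real backtest would also need out-of-sample data and transaction costs.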

Alacra Pulse is available as a feed for those who want to run it through proprietary algorithms. It's also available as a web service for us mortals. And it's available as a free 30-day trial for everyone. So check it out and see if you can use it to find signal in what we all know is a noisy world out there.

#stocks #Web/Tech

Comments (Archived):

  1. Dmitry Shapiro

    Very cool. How do you think this compares to Howard Lindzon’s http://StockTwits.com ?

    1. fredwilson

      i think they are very different services. stocktwits is aggregating stock conversation and providing some governance and filtering to avoid becoming yahoo stock boards 2.0. alacra and others (because there are others working on this) are aggregating media more than conversations

    2. GraemeHein

      Alacra pulls general content; people have to actively put their content into stocktwits. Goes to my comments on the problems of user intention – people using opt-in discussion mechanisms are different than the overall population and likely have ulterior motives.

      1. fredwilson

        well everyone has ulterior motives but alacra is trying to put a credibility screen in place to minimize that

        1. GraemeHein

          Exactly. I wasn’t very clear – ulterior motives was a downside of Stocktwits, since anyone can tag. Alacra’s credibility screen improves its utility.

          1. Dave Pinsen

            I’d be interested in learning more about Alacra’s credibility screen, because, as Fred notes, everyone has ulterior motives. It’s not as if, if you limit yourself to C-suite commentators and professional stock analysts, you won’t have ulterior motives. IMO, motives are less important than quality and detail of information. You can assume that anyone who’s long XYZ and is commenting on it on Yahoo! Finance or StockTwits wants the stock to go up. But occasionally, you see a comment that has an unusually high level of detailed analysis and insight. For example, the Yahoo! message board for a particular oil royalty trust included one commenter who offered detailed, accurate estimates of distributions, taking into account all of the variables, clearly someone with experience in the industry. The challenge is to highlight those sorts of comments, and that’s where some sort of curation comes in.

          2. fredwilson

            I’d be curious what you think of pulse dave

          3. Dave Pinsen

            OK, I’ll check it out and let you know.

  2. GraemeHein

    I really wanted to use their product a few lifetimes ago. The firm probably had purchased it but there were political problems in finding/using licences. Love the concept and think there is serious value here. It does give me a feeling of what it’s like to be an auto worker or a 1700s weaver – I’m being replaced by a machine!

    1. fredwilson

      alacra has made a number of big changes to their products and services in the past year. if you would like an update on them, i can get someone to talk to you

    2. graubart

      Graeme – Barry from Alacra here. Drop me a note at barry-dot-graubart-at-Alacra-dot-com if there’s anything you need. And all we’re doing is making access easier – still need the human int to make good decisions.

      1. GraemeHein

        Thanks for the replies. I haven’t been in a role to use the product for a number of years. It was back in my global firm days.

  3. fnazeeri

    If it really works, then Alacra should immediately fire everyone not involved in maintaining the solution, stop selling the data feed and convert into a quant hedge fund to exploit this information arbitrage.

    1. andyswan

      The Goldman model.

    2. Alex

      @fnazeeri – you’re exactly correct, that’s what they would do. I’m so sick of these stories, rehashed every 6 months, about news mining adding an edge to black box trading. It’s total BS. The leading vs. lagging indicator problem has NOT been figured out, I repeat it has NOT been figured out.

  4. andyswan

    New product idea that seems MUCH more viable long-term: Identify the sentiment of the home-gamers on the message boards and twitter… and do the exact opposite. Position size in direct proportion to their volume.

    1. graubart

      Andy – we’ve had informal discussions around similar topics – we call it the Cramer model 😉

      1. andyswan

        Perhaps a partnership with covestor is in order. I doubt anyone has requested a data-feed of their worst performers yet….

      2. Seenator

        Barry, you should also try connecting with Trefis.com. Their service is very cool and perhaps your feed can be passed on to them so that they can see product/division level impacts on the stock price.

        1. graubart

          Interesting idea, Nik. I’ve played with the Trefis models; they’re pretty cool. There’ve been some attempts in the past to really nail down the whole supply chains of companies, so you can assess secondary and tertiary impact of events, but typically the taxonomy isn’t deep enough to do it effectively. It’s certainly something we could look at though. Thanks for the idea.

    2. Dave Pinsen

      Like the old odd lot theory.

  5. ppearlman

    At StockTwits, not only are we ‘governing’ but, more and more, we are curating and it is good. Take for example http://www.abnormalreturns…. in which we human filter through links that are tweeted on StockTwits, links submitted directly to AbnormalReturns and links from across the web to bring the best finance-related stories in a very timely manner. As well, on the StockTwits Chartly stream, we are curating social charting as we pull out the very best charts submitted to Chart.ly and tweet them. You can peruse this curated stream here: http://stocktwits.com/chartly . This is only the very beginning for us in the area of social finance curation and there is much great work to be done!

  6. Tradestreaming

    Trading sentiment: curious how this would compare to what Piqqem is doing with its sentiment indicator. Crowdsourcing sentiment vs pure sentiment trading. Kinda like finding music artists via American Idol vs. iTunes

    1. graubart

      Nice analogy; and I find that sometimes I discover great music through Pandora, other times HypeMachine, sometimes Last.fm and occasionally from my guitar instructor, Jim. I think algo models work the same way. Sell-side analyst sentiment could be one input; crowd sentiment may be another, while a third could be something completely unrelated. It’s that combination of data points that will trigger a signal in the data.

      1. fredwilson

        triangulation ftw

  7. falicon

    Filtering the noise is what the next huge breakthrough company (whatever it is) will be known for…it really is the next huge problem that needs solved (and that tons of us are all of course working on).

  8. Siminoff

    I feel as though I might have failed you :) You are 100% correct in that the cloud based VOIP PBX people have taken the traditional PBX and just thrown it into the cloud without really looking at how to leverage the cloud or the power of it. The closest thing to what you are talking about now is Skype, in that you can buy any Skype enabled device, put in your credentials and be up and running. Check out things like http://bit.ly/2oha. There is a space that is forming around this and as you know my opinion is that it will not only be about VOIP but the integration of the “traditional phone” with both video and voip. While video is exploding on the computer, having a separate purpose built device like the Tandberg E20, http://bit.ly/dkqtuH, is something that I think will be valuable long into the future. There are definitely some people working towards this goal, including myself, and one of them you invested in, Twilio.

  9. Michael F. Martin

    Eric Schmidt recently commented on how Google toyed with the idea of trading on its information, but then didn’t because that would be illegal. I wish he would have elaborated.

  10. Michael F. Martin

    It would be more useful if they expanded to cover more small caps.

    1. graubart

      Thanks for the feedback, Michael. Our company universe is quite broad (300k public & privates globally) but the number of companies with at least one event is only around 35,000. For Street Pulse, where we pick up the analyst comments, we pull comments from industry analysts and key bloggers, as well as the sell-side firms and ratings agencies. That gives us a wider universe than just what the sell-side analysts are covering, but I acknowledge that small cap coverage, particularly outside of tech, can be a bit thin.

  11. karen_e

    This is so interesting. I have a talk on social media tomorrow inside a large international architecture firm. When I get to the emerging trends section, it will be great to dip into this. I love exploring what is coming down the pike … guess I have some reading to do tonight! Thanks to all the commenters.

    1. graubart

      Karen – if I can share any info on Alacra for your talk – just reach out – barry-dot-graubart-at-alacra-dot-com

      1. karen_e

        Thanks, Barry, will let you know. KE

  12. howardlindzon

    # Hashtags are mostly noise, especially in finance. They are already yahoo message boards for main topics. It is hard to focus a business on signal because it’s not sexy and who really decides what a signal is? Aggregation, reputation, discovery and curation all mixed in together, shaken up, over different timeframes. That’s what has to keep happening to create signals. There are MANY niches that Twitter is leaving to the signal entrepreneurs. Fascinating opportunities, especially in finance. The Alacra peeps are good peeps. Go Barry.

    1. William Mougayar

      You nailed it Howard: “Aggregation, reputation, discovery and curation all mixed in together, shaken up, over different timeframes. That’s what has to keep happening to create signals”. I would add mining+semantics.

  13. Guest

    Brilliant. But if this is true everyone will get similar information and then it will be pretty worthless to actually use it to predict market swings. That’s the vaporware nature of low-risk, high-return profit opportunities.

  14. wsmco

    The ONLY thing that matters is how much money they have made from trading the information. I agree with the earlier comment: if it actually has any value, they should immediately start a hedge fund and trade it. Stocks are about profits, how much profit you make (versus risk you take), nothing else. Anything but profits is noise, akin to analyst downgrades AFTER a stock plummets 30%, worthless. StockTwits has been the most valuable tool I’ve found to help boost my trading profits, which are utterly transparent and inching close to 300% since launched last April on Covestor Investment Management: http://cv.im/models/profile… Stocks are about making profits, everything else that doesn’t lead to transparent gains is empty noise. I have started a hedge fund with a person I met on StockTwits.

    1. ShanaC

      wow, now that’s different. A hedge fund?

  15. ShanaC

    I find this interesting in the ways we process information. It changes the way we look at analysis: clearly we are going to need to go deeper and look for better curation and pattern recognition tools for all sorts of businesses; otherwise why would Alacra work in the first place? We’re swamped with too much, and the boiling process probably is showing patterns that are there but are otherwise hard to discover without tossing the excess. I figure we are in a difficult period of this kind of production: how do you know what is baby and what is bathwater? Different fields must have different requirements with this sort of curation… Now I want to figure out how to test this for a different purpose…

  16. Eric Falcao

    The social media stream definitely needs a filter. I’m currently bootstrapping a company called TweetRiver and our goal is to moderate/curate raw twitter content down to the most useful tweets. Our customers set up any rules they like (there is lots of variation amongst what customers see as valuable: one customer explicitly filters out retweets while another might score retweets up). The result is a score for each tweet, which could either be published or unpublished and sent to different streams. Here is a link to one of our customers’ pages (retweets, profanity and links filtered out, with the option to manually unpublish individual tweets): http://abc.go.com/shows/dan

    1. fredwilson

      This has to happen. Twitter is also working on this. As is google

  17. Mark Essel

    Heyo, that’s fantastic news! My dataminer sense is tingling ;). Sounds like a causal trend could be a huge gold mine for various traders/hedge funds. As you, Fred, and many other AVC’ers are aware, I’m working on analyzing large social streams for signal relevant to each user. My co-founder Tyler and I just recently took the first step to becoming a web service, instead of a twitter app, by moving to a feed based architecture. We’re still working out the structural details to focus on Relevancy (to the user, and to their subscriptions). Identity is a big deal so we’re starting by asking users if a particular feed they want is them or something they’re following. The framework is loose enough to allow mixed streams and identities.

  18. Ethan Bauley

    Signed up for that deal. Blog star demand gen FTW. I think I need a download from Barry, too. “Transparency is the new alpha”… love the Lindzon soundbites. This shit is so interesting!

  19. leigh

    semantic tagging. anyone want to ‘splain this in more detail beyond contextual tagging = insightful context?

  20. markslater

    i’ll stick to virtual trading thanks. i’ll grow some weed in FarmVille then sell it to MafiaWars who’ll sell it in YoVille then YoVille will get the munchies & go to CafeWorld located in FishTown. job done.

    1. markslater

      plagiarized of course.

  21. William Mougayar

    There is a lot to be said for having fewer, credible sources and taking a deep vertical approach rather than boiling the ocean with streams of commoditized info. Alacra’s approach is a good validation for Eqentia’s, as well. Business professionals have different use cases than regular consumers of info.

  22. otoemlak.com

    Thanks for sharing

  23. infoarbitrage

    tom, the difference is that it is from a much more constrained data set and, therefore, the engineering challenges are much different than those undertaken by monitor110. we tried to bite off way, way too much, focusing on having our algorithms do cleansing, curation and sentiment at internet scale. alacra has chosen to focus on curation first and to then apply sentiment on a far more manageable scale (a few thousand sources versus tens of millions). we boiled the ocean. they are taking a much more targeted approach. they won’t find “needle in the haystack,” but may well be more effective in areas such as trending topics, etc.

  24. fredwilson

    the expert on the sector joins the conversation!!! big win for the AVC community. thanks roger

  25. Michael F. Martin

    I like the hits I see. It’s like a list of the top topics of conversation relating to a given company.

  26. ShanaC

    Do you think there is a risk of missing something because of the curation, a black swan of trends? It doesn’t even have to be for stocks; there’s just a question in my mind of whether, if I curate too much, I will look too inward and miss something big because it came just past the corner of my eyes. How would you balance away from that tendency if it does exist?

  27. Carl Rahn Griffith

    Welcome, Roger. Looking forward to reading your thoughts – heard a lot about you! ;-) The targeted approach makes perfect sense, indeed. There’s an awful lot of noise (dare I say cr*p in fact!?) out there and the most important aspect before decanting anything into the algorithm is choosing which feeds/etc to suck in. Being pre-filtered makes life easier, for sure. Was very interested in/inspired by Monitor110 when we were first drafting the ensembli business plan. Cool.

  28. andyswan

    Predicting stock prices with a black-box system has been a “promising space” since the Dutch East India Company put out its IPO in 1602.

  29. fredwilson

    That’s the idea

  30. fredwilson

    I gotta give it a whirl charlie