Wall Street, Hedge Funds Add Social Media to Research Menu

Wall Street is taking another step toward making social media as core to investment research as quarterly sales reports.

M Science LLC, which sells alternative data research to big hedge funds, said last week that it is acquiring TickerTags Inc., a social-media tracking startup based in Dallas.

Investment managers are increasingly looking to use technology to generate new trading ideas. Hedge funds in the U.S. and Europe now spend more than $170 million annually on so-called alternative data, according to a survey by Greenwich Associates.

Though small, the deal announced Friday is the latest in a string of consolidations among new financial data providers. Advance Publications Inc. acquired 1010data, an alternative data provider, for $500 million in 2015. Kensho Technologies, which applies artificial intelligence to stock research, announced this year that it would be bought by S&P Global Inc. for $550 million.

M Science, which didn’t disclose terms of its latest purchase, is among the largest and oldest providers of alternative data reports to Wall Street, publishing company-specific notes based on information such as purchase data, mobile app usage and web surveys of prices.

The business began as Majestic Research in 2002, before being acquired by Investment Technology Group Inc., and then by Jefferies Financial Group Inc. in 2016. M Science now has 115 employees, more than double the number in 2016.

So far, alternative data has mostly been based on things directly relevant to forecasting sales, including credit-card receipts and tracking the number of customers visiting a business, such as with satellite images of parking lots.

More recently, a number of startups, such as TickerTags, Dataminr and iSentium, have emerged to mine Twitter Inc. and other social media for a variety of signals, including news about public companies. Some firms and investors also measure market sentiment through social media.

Now forms of social monitoring are being directly integrated into research products already consumed by Wall Street. Goldman Sachs Group Inc., for example, sometimes includes measures of Twitter mentions in stock research reports.

TickerTags has compiled more than a million key words and phrases and mapped them to public companies for which they are relevant. It then monitors online discussions, including Google search terms, for those key words.

TickerTags hasn’t yet taken off as a product sold directly to funds. But Michael Marrale, chief executive at M Science, said that clients had asked him about trying to find a way to integrate the TickerTags system into M Science’s own data and reports on public and private companies.

Data is just a small part of the equation, said Mr. Marrale. “It’s how you map that data to public and private companies in ways you didn’t anticipate.”

For example, Newell Brands Inc.’s shares soared in May 2017 after it reported unusually strong sales of Elmer’s glue. The product is a key ingredient in slime, a gooey homemade toy that surged in social-media popularity last year.

TickerTags co-founder Chris Camillo said that the company’s system noticed a surge in the mention of “slime” in April, because it was often paired with mention of “Elmer’s,” and also “sold out,” all of which had been tagged as relevant to Newell stock.
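The tagging idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag map, the `NWL` ticker label and the function names `tag_hits` and `co_mentions` are hypothetical, and nothing here reflects how TickerTags actually stores or matches its tags.

```python
from collections import Counter

# Hypothetical tag map (phrase -> ticker); not the real TickerTags data.
TAGS = {"slime": "NWL", "elmer's": "NWL", "sold out": "NWL"}

def tag_hits(message, tags=TAGS):
    """Return the tagged phrases found in one message (case-insensitive substring match)."""
    text = message.lower()
    return {phrase for phrase in tags if phrase in text}

def co_mentions(messages, tags=TAGS):
    """Count, per ticker, the messages containing two or more distinct tagged phrases."""
    counts = Counter()
    for message in messages:
        per_ticker = Counter(tags[phrase] for phrase in tag_hits(message, tags))
        for ticker, n_phrases in per_ticker.items():
            if n_phrases >= 2:
                counts[ticker] += 1
    return counts

sample = [
    "Making slime with Elmer's glue tonight",
    "Elmer's is sold out everywhere because of slime",
    "Nice weather today",
]
print(co_mentions(sample))
```

Counting only messages where several tags appear together is one simple way to separate a product-specific surge from a word that is merely popular.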

Originally published June 4, 2018 by THE WALL STREET JOURNAL.

An ex-Morgan Stanley trader resurfaced at an AI alternative data firm

iSentium, an AI fintech firm that uses machine-learning algorithms to turn social media sentiment into tradeable alternative data, has got a new president and COO. It recently hired Raymond “Ray” Tierney III, a former managing director and the global head of trading and execution in the investment management division of Morgan Stanley and the CEO of trading solutions at Bloomberg Tradebook.

Tierney is a big name. He ranked first in Institutional Investor’s 2016 Trading Technology 40 and spent 15 years as a trader at Morgan Stanley during a 36-year career in financial markets. He left Bloomberg Tradebook following an apparent restructuring ahead of MiFID II in March 2017 and has been advising on the board of various start-ups since. Subsequent to Tierney’s exit, Tradebook announced that it was outsourcing all equities execution services to Goldman Sachs.

At iSentium, Tierney will report to CEO Gautham Sastri. iSentium uses patented Natural Language Processing (NLP) to extract sentiment from unstructured social content like Twitter feeds and Stocktwits.

Originally published May 3, 2018 by eFinancialCareers.

Ray Tierney joins iSentium as President and COO

iSentium, the global leader in extracting value from social sentiment feeds like Twitter and Stocktwits, has hired Financial Services veteran Ray Tierney as President and COO. Mr. Tierney will report to iSentium CEO Gautham Sastri.

Tierney’s resume includes lengthy tenures at Morgan Stanley & Co., Morgan Stanley Investment Management and Bloomberg spanning his 36 years in Capital Markets. Ranked #1 on Institutional Investor’s 2016 Trading Technology 40, Tierney has long been at the forefront in the emerging world of FINTECH and the expansive new horizon it presents financial services.

Ray Tierney, iSentium President and COO.

In addition to his drive to effect change and build businesses, Tierney is also well known for his leadership in Diversity and Inclusion initiatives, as well as his presence in the community. In 2017, Tierney was awarded the inaugural Ken Heath Award by the STA for the leadership he has shown in empowering women and underrepresented groups in finance. He also sits on several Boards, including The Ronald McDonald House as well as National Organization of Investment Professionals (NOIP).

Tierney stated, “I set out to position myself in a private FINTECH organization that has a strategic focus in capital markets.” He continued, “iSentium, with their strong management team, raw processing power and data science IP separates them from the pack of providers in Alternative data, specifically social media and social driven news feeds. The continued acceptance and adoption of alternative data in capital markets is very much what the reimagination of financial services is all about and the perfect complement to where I could leverage my experience, passion, and drive to scale and accelerate our growth forward.”

Founded in 2008, iSentium, which has offices in the US and Canada, uses patented Natural Language Processing (NLP) to extract sentiment from unstructured social content, then instantly transforms it into highly actionable indicators in Finance, Brand Management and Politics. iSentium’s management team, led by Gautham Sastri and CTO Dr. Anna Maria Di Sciullo, brings a wealth of experience from both industry and academia, with skills spanning Big Data, finance, linguistics and signal processing. Our world-class team, comprised of linguists, quants, and computer scientists, has collectively published over 200 papers and 18 books.

Sastri is a seasoned technology entrepreneur and investor who previously founded Terrascale Technologies and Maximum Throughput. Over the past three decades, he has worked on Big Data problems in diverse areas such as seismic processing, weather forecasting and large-scale simulations/data analysis for government agencies. He has co-authored two granted patents, in the fields of cloud storage and sentiment analysis, and has three additional patents pending. On April 10th of this year, iSentium was granted its 3rd patent, for generating data from social media messages for the real-time evaluation of publicly traded assets.

Sastri stated: “Ray’s decision to join iSentium is a powerful endorsement of our strategic vision, as well as our ability to operate a multi-disciplinary team staffed with renowned experts in disparate areas such as AI, NLP, Big Data and Signal Processing. When I got involved with FINTECH in 2010, Internet traffic was about 10 Gigabytes/sec – this has now grown to over 60,000 Gigabytes/sec. Simply put, Alternative data is now Too Big to Ignore. I sought out Ray because he sets a very high benchmark in everything that he does, be it professionally, charitably or athletically. I am truly excited about the prospect of working with Ray to execute on our shared vision.”

###

To schedule a capabilities demo, call (212) 858-9694 or send us an email at inquiries@isentium.com.

Startups are setting up funds based on what is trending on Twitter

Startups that have built businesses parsing Twitter data to find buy and sell signals on stocks are looking to move into the investment game.

Miami-based startup iSentium and New York-based Market Prophit are among the companies looking to bring market sentiment into real-time trading for the ordinary investor. 

They are also now looking to launch exchange-traded funds. 

Gautham Sastri, CEO of iSentium, told Business Insider he’s working with an investment bank to launch an ETF of his own using social data.

ETFs typically track capitalization-weighted benchmarks such as the S&P 500. These capitalization-based benchmarks have received some criticism in the investing world as they are moved by the biggest, and potentially most overvalued, stocks, and have less exposure to smaller companies which might be undervalued.

Sameer Gupta, a former JPMorgan trading executive who is now chief operating officer at iSentium, told Business Insider that an iSentium ETF would likely “be a smart beta product that delivers enhanced performance relative to a benchmark.”

The phrase ‘smart beta’ refers to an alternative form of index investing where the index in question is based on factors other than capitalization-weightings. 
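To make the contrast concrete, here is a minimal sketch comparing capitalization weights with weights driven by another factor. The tickers, market caps and sentiment scores are invented for illustration, and using a sentiment score as the factor is an assumption; real smart beta products follow more involved rules.

```python
def normalize(scores):
    """Scale positive scores so they sum to 1, giving index weights."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}

# Hypothetical constituents: market caps in $bn and made-up sentiment scores.
market_caps = {"AAA": 800.0, "BBB": 150.0, "CCC": 50.0}
sentiment = {"AAA": 0.2, "BBB": 0.5, "CCC": 0.3}

cap_weights = normalize(market_caps)    # dominated by the largest company
factor_weights = normalize(sentiment)   # driven by the alternative factor

for name in market_caps:
    print(name, round(cap_weights[name], 2), round(factor_weights[name], 2))
```

In this toy universe the largest company carries 80% of the capitalization-weighted index but only 20% of the factor-weighted one, which is the kind of shift the criticism of cap-weighted benchmarks is about.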

Market Prophit meanwhile has already developed an index tracking the 25 most-mentioned companies on Twitter in a partnership with S&P Dow Jones Indices. 

Igor Gonta, chief executive at the company, told Business Insider that the company will one day launch an ETF that allows people to trade based largely on the sentiment expressed on Twitter. 

Twitter has created an entirely new Wall Street ecosystem — here are the companies leading the way

Traders are turning to Twitter to get in front of big market-moving trends.

That in turn is creating an eco-system of companies looking to make sense of Twitter data and pull the signal from the noise.

According to a TABB Group report issued last week, the industry is growing at a rapid pace.

“There’s not going to be one firm on top,” said TABB’s Valerie Bogard, a research analyst. “There’s going to be multiple firms.”

Some of the startups have already been snapped up by bigger corporates looking to get an early edge on analytics.

Twitter itself spent upwards of $US130 million just a year ago to buy Gnip, now used to help disseminate data to startups that in turn relay that information to hedge funds, among other clients.

Elaine Ellis, a marketing manager at Gnip, told Business Insider that Twitter also sells data to banks and hedge funds directly.

She said: “We know Tweets move markets. Our public, real-time nature positions us perfectly to be a source for the financial industry. We believe that in the future, this will evolve from a nice to have to a must have for all industry participants.”

Bogard told Business Insider that everyone from investors in munis to traditional funds with long-only strategies is trying to turn real-time commentary into critical analysis ahead of the tape.

Not all social signals are created equal, of course. In fact, some are created to throw Wall Street’s brightest minds off the scent of what’s actually happening.

Here are some of the startups looking to turn Tweets into trading signals:

iSentium has backing from Goldman Sachs alums

iSentium

From iSentium’s site: ‘The sentiment bars (green and red bars) in the chart represent iSENTIUM’s Sentiment Strength Indicator or SSI which is a hybrid of the Relative Strength Index (RSI) modified using iSENTIUM’s patented sentiment scores in place of price.’

iSentium is one of the older players in the social sifting game. That’s part of the reason it’s already profitable, having raised cash from Goldman alum David Heller and Marc Spilker, former president at Apollo Global.
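The site's description of the SSI suggests an RSI-style calculation fed with sentiment scores rather than prices. The sketch below is an assumption about what that could look like, not iSentium's patented method: it uses the simple-average form of RSI (not Wilder's smoothed version), and the function name `rsi` and the 14-period default are illustrative choices.

```python
def rsi(values, period=14):
    """Simple-average RSI over the last `period` changes in `values`.

    Pass sentiment scores instead of prices to get an SSI-like reading
    (an assumption for illustration, not the vendor's actual formula).
    """
    if len(values) < period + 1:
        raise ValueError("need at least period + 1 values")
    window = values[-(period + 1):]
    changes = [later - earlier for earlier, later in zip(window, window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves: 100 if anything rose, neutral 50 if the series is flat.
        return 100.0 if gains > 0 else 50.0
    relative_strength = gains / losses
    return 100.0 - 100.0 / (1.0 + relative_strength)

# Hypothetical sentiment scores for four consecutive intervals.
print(rsi([1, 3, 2, 5], period=3))
```

As with price-based RSI, readings near 100 mean the series has mostly risen over the window and readings near 0 mean it has mostly fallen.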

The Miami-based startup is even working with a yet-unnamed investment bank to launch a Twitter sentiment ETF product, although CEO Gautham Sastri says it won’t launch until next year, thanks to regulatory approval issues. He says social analytics is here to stay on Wall Street. ‘If you don’t look at social media, you’re blind,’ he told Business Insider.

Read more at https://www.businessinsider.com.au/twitter-has-created-an-entirely-new-wall-street-ecosystem-here-are-the-companies-leading-the-way-2015-9#sVb0y2uxVcEudKxk.99

iSentium Uses AI for Sentiment Analysis of Social Media [Interview]

This is part of a series on machine intelligence companies. We interviewed Beagle, Mariana, Beyond Verbal, Preteckt, Affectiva, Eigen Innovations, and ClearMetal. Now we’re featuring iSentium.

iSentium, which has offices in the US and Canada, harnesses applied artificial intelligence to extract sentiment from unstructured social media content and transform it into actionable insights in verticals such as finance, politics, and brand management.

Founded in 2008, iSentium’s expert team hails from both industry and academia and has collectively published more than 200 papers and 18 books.

We recently chatted with iSentium’s chief executive officer Gautham Sastri about the inspiration behind the company, its patented NLP technology, the power of social media content, and more.

[Editor’s note: The interview questions and responses below have been edited for clarity and length.]

What’s your role and what does that entail on a daily basis?

I am the CEO of iSentium. Given the small size of our team, I am deeply involved in the day-to-day running of the firm.

I focus heavily on:

A) product development, leveraging my data science skills developed over a 30-year career in signal processing, seismic acoustics, and weather forecasting; and

B) sales and market development, given that we are in early innings with respect to applied artificial intelligence.

What did you do prior to joining iSentium?

I attended the University of Houston where I studied electrical engineering and history.

I started my career working for the US Navy writing signal processing algorithms to look for Russian submarines. Subsequently, I spent a few years at NEC in their supercomputing division before embarking on the entrepreneurial route.

My first startup, Maximum Throughput, supplied racks of servers to clients, including the Los Alamos Laboratory. While at Maximum Throughput, I was invited to join Intel’s Server Advisory Council, which was tasked with providing key inputs to their server technology roadmap.

Upon acquisition of Maximum Throughput by Avid Technology (Nasdaq: AVID), I started Terrascale, which provided cloud storage solutions (in 2003) for large scale text analysis.

Terrascale was acquired by Rackable (now SGI Corp). I became COO of the new entity and managed a $500M business.

What was the deciding factor that led you to join iSentium?

With the advent of social media, I saw an opportunity to commercialize text analysis for actionable insights across verticals and domains.

iSentium had a talented team of linguists led by Anna Maria Di Sciullo, a professor at the University of Quebec who had studied under Noam Chomsky at MIT.

I got involved initially as an investor and then joined the company as CEO in 2010.

What’s the most challenging aspect of your role?

Educating the market and proving the value of social media data has been a key challenge that we have slowly been overcoming.

The nature of social data is very different from traditional data sets and hence requires new methods of analysis.

That being said, we have done a fairly good job proving the value proposition in capital markets and trading, where you have a binary outcome.

What was the inspiration behind iSentium?

The big opportunity to analyze and interpret a galactic amount of social content was a prime consideration.

Relatively speaking, platforms’ (Google, Facebook) lack of focus on the content made this opportunity even more interesting and attractive.

Early focus and research under the leadership of one of the top linguistics experts provided the right platform to embark on our vision to “Decode Social Sentiment.”

How do you utilize big data and NLP to analyze sentiment in social media content?

We are connected to Twitter and have access to both historical and real-time social media content.

Our NLP technology can assign a sentiment score to each message in about four milliseconds and can scale horizontally to process millions of messages per second.

To what verticals are you currently applying your technology?

Our three verticals are Financial Indicators, Political Insights, and Brand Analytics. Finance came first.

How has your focus shifted over time, if at all?

We are being approached about leveraging our NLP capabilities to extract consumer sentiment by leading advertising agencies, management consulting firms, and brands themselves.

Where do you see the biggest opportunity in the long run?

While we will continue to deepen our penetration in financial services, we believe the bigger opportunity is in consumer sentiment that applies across industries.

We have already proven our efficacy and ability to mine sentiment cross-vertical, having run multiple proofs of concept on consumer sentiment on fast food and restaurant chains.

How would you define your key value proposition for customers?

We produce the most accurate interpretation of natural language given a particular context.

For example, in finance, we have reached an 80% accuracy rate where a human agrees with the machine in its understanding of a given piece of text related to the stock market.

How would you describe your typical customer?

Within financial services, our clients range from leading quantitative hedge funds and systematic global macro funds to high frequency trading firms and investment banks.

Our new product suite for financial services is a real-time dashboard that will enable asset managers and long/short hedge funds to quickly sift through social media data and develop insights for fundamental analysis.

Was the product fully designed and developed in-house?

Our NLP capabilities have been built from the ground up since 2008. At any point, we have had up to six PhDs in linguistics working on the product.

We filed a family of patents in the early part of the decade and have already been granted two patents. We did not use any open source tools.

How much training data do you typically require for a new customer?

We don’t train our models, and there is no machine learning involved.

For stock sentiment, we have a lexicon of about 15,000 words that is used for analyzing text and assigning sentiment scores.
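A lexicon-driven scorer can be sketched as follows. This is a deliberately simplified illustration: the six-word lexicon, the weights and the one-token negation rule are all assumptions, and the interview describes a structure-dependent technology that goes well beyond a plain word lookup.

```python
import re

# Hypothetical mini-lexicon; the interview mentions about 15,000 entries.
LEXICON = {
    "beat": 1.0, "strong": 1.0, "surge": 1.5,
    "miss": -1.0, "weak": -1.0, "plunge": -1.5,
}
NEGATORS = {"not", "no", "never"}

def score(text):
    """Sum lexicon weights over the tokens, flipping the sign of a word
    that immediately follows a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = True
            continue
        if token in LEXICON:
            total += -LEXICON[token] if negate else LEXICON[token]
        negate = False  # negation only reaches the next token
    return total

print(score("Earnings beat, strong guidance"))
print(score("Sales were not weak"))
```

Because nothing is trained, the behaviour of a scorer like this is fully determined by the lexicon and the rules, which is consistent with the answer above that no training data is required.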

How quickly can your system react to news on social media and adapt your signals?

Depending on signal logic, we can react instantaneously to a burst of social chatter, or the signal can be much slower moving and rely more on continuous sentiment extracted over a much longer time period.

What’s the most exciting trend in machine learning from iSentium’s perspective?

In our view, machine learning does not simulate natural language learning by humans.

The universal properties of natural languages are not learned by humans, who may make mistakes with vocabulary items but not with the structure-dependent properties of natural language.

Whatever language they are exposed to, humans (particularly children) are capable of inducing a grammar for that language without formal or algorithmic instructions.

In essence, humans are able to learn language deterministically. On the other hand, machine learning algorithms tend to learn from scratch and after extensive training periods.

What advances in machine learning have benefitted iSentium the most?

Instead of machine learning, iSentium relies on an innovative artificial intelligence structure-dependent technology.

This patented technology makes correct predictions on the sentiment of short texts, such as tweets, where natural language constituents are missing, as well as longer texts, which may include non-relevant topical information.

Both covert and irrelevant topical information cannot be dealt with by natural language technologies based on machine learning. For example, in the case of short messages, the covert information is not learnable because it is not explicitly included in the message.

Are there any limitations on machine learning iSentium would like to see removed?

Machine learning is not a simulation of learning expressions in any natural language by human brains. It is a matter of modifying the algorithm’s parameters at each step to reduce the error value.

Neural networks are assumed to be non-deterministic algorithms. Non-determinism is one of the limitations of machine learning preventing it from simulating human intelligence.

This is particularly the case for machine learning based NLP applications, ranging from information retrieval to information extraction, including sentiment mining and question answering systems. 

The compelling evidence comes from recent discoveries related to the granular modularity of the human brain, which is the biological basis of the human capacity to express simple and complex thoughts using natural language.

Does iSentium consider itself a machine intelligence company?

iSentium is an artificial intelligence company that simulates natural language processing by humans.

Its universal core can be parameterized to any natural language, as well as to any domain of interpretation, including finance, politics, and brands. Effectively, it mimics human natural language intelligence.

We have packaged our underlying sentiment data through a few applications already, including finance, where we have built a significant presence and brand on Wall Street, and are now providing highly actionable solutions to advertising and other verticals.

Big data experts talk about text, Twitter and turning quantamental

Tom Doris of OTAS; Peter Hafez, RavenPack; and Gautham Sastri, iSentium discuss contexts around NLP.

Using machines to read text as a way to enhance understanding of market movements is a topic of intense polarisation and debate.

Back in the 90s, work on natural language processing (NLP) involved teams of linguists and computer scientists attempting to code up rules of grammar. Recent work has focused on techniques like word embedding, the underlying idea that a word is characterised by the company it keeps; semantic similarities between words are based on their distribution in large samples of data.

The “bag of words” approach has been applied commercially in finance for more than 10 years. But it can depend on the source of information being analysed: a rule-based approach can work pretty well for news articles that follow certain editorial processes, while social media proves much more challenging.
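A short sketch shows what a bag-of-words representation keeps and what it throws away. The example sentences and the helper names `bag` and `cosine` are illustrative assumptions, not taken from any vendor's system.

```python
import math
from collections import Counter

def bag(text):
    """Bag of words: token counts with word order discarded."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

first = bag("acme beats rival")
second = bag("rival beats acme")

print(first == second)        # identical bags despite opposite meanings
print(cosine(first, second))  # close to 1.0
```

The two sentences say opposite things about the hypothetical company, yet their bags are identical. Edited news copy tends to offer enough redundancy to work around this, while short and informal social media messages often do not.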

Tom Doris, CEO at OTAS Technologies, takes a longer view of the technology’s potential; he thinks asking an AI how to get better prediction of which stock is going to outperform may be the wrong question. In terms of efficiency, it certainly can be used to quickly reveal things that would take many years’ experience and a carefully curated set of information sources to put together.

He said: “What’s exciting is we finally have the techniques that can look at entire economies and start to understand where the ebbs and flows are, and anticipate where the potential downturns or the potential resource constraints are going to be in future.

“That’s quite different to the hype around social media, which has really been focused on the low latency play and being the first to identify when Carl Icahn tweets something, or when the CEO of a company says something stupid – or the President for that matter.”

Doris, who holds a PhD in computer science, believes Twitter can be useful, but in a more limited domain. He said it doesn’t contain the information to answer a lot of the questions that are interesting to traders, and it has proven extremely difficult to extract information from Twitter that eliminates enough of the noise for traders to be interested in it.

“Nothing really works very well with Twitter because basically there just isn’t that much information in Twitter,” said Doris. “I think where a lot of this stuff falls down – whether it’s natural language processing or AI in general – people expect it to be able to extract information where there just fundamentally isn’t that much information.”

One approach to event trading with Twitter is using a so-called white list, where only verified company or influencer accounts have been used. Taking on the entire Twitter firehose may offer the wisdom of the crowd, but it’s extremely noisy and can easily become very expensive from a trading point of view.

However, some experts can demonstrate that Twitter-based indicators substantially outperform the market over extended periods of time. Sentiment analytics company iSentium extracts actionable indicators from large amounts of unstructured social content. It points to independent research commissioned by Nasdaq and carried out by Lucena Research.

Gautham Sastri, CEO of iSentium, emphasised that his company does not use a bag of words approach. “Our team of linguists, led by Dr Anna Maria di Sciullo [post-doc from MIT, Fellow of the Royal Society] all have PhDs and have been working since 2008 to build a system that seeks to understand social media messages in a human-like way.

“If short messages are less valuable because they are brief and to the point, then how does one explain the extensive references in Churchill’s History of the Second World War to telegrams that he sent and received? And why were so many resources expended on building the Enigma machine at Bletchley Park?

“I would argue the point that brevity is indeed the soul of wit; and low latency, combined with volume, can provide substantial edge when properly exploited.”

Sastri went on to say that his company processes more volume of social content in a second than the entire New York Stock Exchange produces in a day. “For each tweet that we process, we generate 24 different fields that can provide deep insights regarding demography, geolocation, contagion, etc.”

Even with replies and hashtags, Twitter is very disparate and lacks a strong sense of topic threading, says Doris. “You’d have much richer content if you looked at the archives of email within a company,” he said. “From that you build up a much richer picture about how the different parts of the organisation interact and where the connectivity is and what topics people are discussing.

“For instance, analysis of thread length would tell you whether things are languishing – you just don’t have that richness in Twitter. It’s been very successful in part because it’s so brief and to the point, but that makes it less valuable from an analysis point of view.”

Peter Hafez, chief data scientist of big data analytics firm RavenPack, said investors are finding value in Twitter, but not as much as some people might think, and not as much as you might find looking at more traditional sources like news. “That said, I believe Twitter may get a second life as the technology knowledge graphs that are applied are becoming more advanced.

“Twitter is more about tracking consumer sentiment than about following the views of prophets. People tweet about the products they like or dislike, what they wish to buy, observed side effects caused by a given drug, etc.

“They don’t necessarily tweet about the companies that own the products or a given subsidiary. For example, I might say that I love the new Q5, but I might leave out Audi, the owner of the product; and surely I would leave out Volkswagen, the owner of Audi, which would be the stock that I would have to buy if I wanted to trade the equities markets. You could of course have traded Audi’s corporate bonds, skipping the link back to VW; tracking products in a point-in-time fashion is the hard part.”

Today internal content is driving the discussion and many financial firms are looking towards internal content such as email and instant messages like Slack, Instant Bloomberg, Symphony, Skype and more for a competitive advantage.

Hafez added: “More traditional firms have started turning their emails and internal investment notes into actionable data points that can be used more directly within their investment process, basically making internal content more easily accessible within the organisation. It allows deeper understanding of where an organisation has a true competitive advantage over publicly available information.”

Returning to a more macro picture of the world, Doris sees NLP technology dovetailing with human discretionary analysis, where the ability to quickly surface information algorithmically will be a fundamental sweet spot.

“It’s about what will be the ultimate truths of the macro environment; are interest rates going to change significantly; is the money supply going to change significantly; do we think that there is going to be a significant change in the socio-economic or the global geo-political environment?” he said.

“That’s the kind of stuff that you do need a human level of perception to understand whether it’s a fad or actually the establishment of a long term trend.”

And regarding firms publishing backtests of their signals, Doris pointed out that it’s really important to know whether trading costs have been taken into account. “If you don’t take into account trading costs, it is trivial to create a ‘signal’ that always makes money, e.g. by assuming you can instantly trade large volumes of stock whenever the main index future price moves and before the stocks catch up.”

Is Twitter Any Good at Predicting Stock Prices?

Using machines to read text as a way to enhance understanding of market movements is a topic of intense polarisation and debate.

Back in the ’90s, work on natural language processing (NLP) involved teams of linguists and computer scientists attempting to code up rules of grammar. Recent work has focused on techniques like word embedding, which rests on the idea that a word is characterised by the company it keeps: semantic similarities between words are inferred from their distribution in large samples of data.
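As a rough illustration of "the company it keeps", the sketch below builds simple count-based context vectors from a toy corpus and compares them with cosine similarity. This is a simplified stand-in for learned word embeddings, not any vendor's method; the corpus, the window size and the function names are assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy corpus (invented): "bank" and "lender" appear in identical contexts.
corpus = [
    "shares of the bank rose after earnings",
    "shares of the lender rose after earnings",
    "the drug caused mild side effects",
]
vecs = cooccurrence_vectors(corpus)
sim_bank_lender = cosine(vecs["bank"], vecs["lender"])
sim_bank_drug = cosine(vecs["bank"], vecs["drug"])
print(sim_bank_lender, sim_bank_drug)
```

In this toy corpus "bank" and "lender" score as more similar than "bank" and "drug", purely because they share neighbouring words; no grammar rules are involved.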

The “bag of words” approach has been applied commercially in finance for more than 10 years. But it can depend on the source of information being analysed: a rule-based approach can work pretty well for news articles that follow certain editorial processes, while social media proves much more challenging.
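A minimal sketch of a bag-of-words sentiment score may help show why the source of the text matters. The word lists, the example texts and the scoring formula are all invented for illustration; commercial systems are far more elaborate.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, invented for this example.
POSITIVE = {"beat", "upgrade", "strong", "growth"}
NEGATIVE = {"miss", "downgrade", "weak", "recall"}


def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def sentiment_score(text):
    """Return (positive - negative) / (positive + negative), or 0.0 if no lexicon hits."""
    bag = bag_of_words(text)
    pos = sum(bag[w] for w in POSITIVE)
    neg = sum(bag[w] for w in NEGATIVE)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


headline = "Analysts upgrade retailer after strong quarter"
tweet = "not a strong quarter lol, total miss"

print(sentiment_score(headline))
print(sentiment_score(tweet))
```

The edited headline scores as clearly positive, while the tweet nets out to neutral: the bag counts "strong" as positive even though it is negated, which is the kind of failure that makes informal social media text harder than news copy.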

This technology can be used to reveal things that would take many years’ experience and a carefully curated set of information sources to put together. Tom Doris, CEO at OTAS Technologies, believes in a longer view; he thinks asking an AI how to get a better prediction of which stock is going to outperform may be the wrong question.

He said: “What’s exciting is we finally have the techniques that can look at entire economies and start to understand where the ebbs and flows are, and anticipate where the potential downturns or the potential resource constraints are going to be in future.

“That’s quite different to the hype around social media, which has really been focused on the low latency play and being the first to identify when Carl Icahn tweets something, or when the CEO of a company says something stupid—or the President for that matter.”

Doris, who holds a PhD in computer science, believes Twitter can be useful, but in a more limited domain than the hype would suggest. He said it doesn’t contain the information to answer a lot of the questions that are interesting to traders, and it has proven extremely difficult to extract information from Twitter that eliminates enough of the noise for traders to be interested in it.

To a large extent Twitter has been used for low latency event trading, and one approach is to use a so-called white list, where only verified company or influencer accounts are followed. Taking on the entire Twitter firehose may offer the wisdom of the crowd, but it’s extremely noisy and can easily become very expensive from a trading point of view.

“Nothing really works very well with Twitter because basically there just isn’t that much information in Twitter,” said Doris. “I think where a lot of this stuff falls down—whether it’s natural language processing or AI in general—people expect it to be able to extract information where there just fundamentally isn’t that much information.”

However, some experts can demonstrate that Twitter-based indicators substantially outperform the market over extended periods of time. Sentiment analytics company iSentium extracts actionable indicators from large amounts of unstructured social content. It points to independent research commissioned by Nasdaq and carried out by Lucena Research.

Gautham Sastri, CEO of iSentium, emphasised that his company does not use a bag of words approach. “Our team of linguists, led by Dr Anna Maria di Sciullo [post-doc from MIT, Fellow of the Royal Society] all have PhDs and have been working since 2008 to build a system that seeks to understand social media messages in a human-like way.

“If short messages are less valuable because they are brief and to the point, then how does one explain the extensive references in Churchill’s History of the Second World War to telegrams that he sent and received? And why were so many resources expended on building the Enigma machine at Bletchley Park?

“I would argue the point that brevity is indeed the soul of wit; and low latency, combined with volume, can provide substantial edge when properly exploited.”

Sastri went on to say that his company processes more volume of social content in a second than the entire New York Stock Exchange produces in a day. “For each tweet that we process, we generate 24 different fields that can provide deep insights regarding demography, geolocation, contagion, etc.”

Even with replies and hashtags, Twitter is very disparate and lacks a strong sense of topic threading, says Doris. “You’d have much richer content if you looked at the archives of email within a company,” he said. “From that you build up a much richer picture about how the different parts of the organisation interact and where the connectivity is and what topics people are discussing.

“For instance, analysis of thread length would tell you whether things are languishing—you just don’t have that richness in Twitter. It’s been very successful in part because it’s so brief and to the point, but that makes it less valuable from an analysis point of view.”

Peter Hafez, chief data scientist of big data analytics firm RavenPack, said investors are finding value in Twitter, but not as much as some people might think, and not as much as you might find looking at more traditional sources like news. “That said, I believe Twitter may get a second life as the technology knowledge graphs that are applied are becoming more advanced.

“Twitter is more about tracking consumer sentiment than about following the views of prophets. People tweet about the products they like or dislike, what they wish to buy, observed side effects caused by a given drug, etc.

“They don’t necessarily tweet about the companies that own the products or a given subsidiary. For example, I might say that I love the new Q5, but I might leave out Audi, the owner of the product; and surely I would leave out Volkswagen, the owner of Audi, which would be the stock that I would have to buy if I wanted to trade the equities markets. You could of course have traded Audi’s corporate bonds, skipping the link back to VW; tracking products in a point-in-time fashion is the hard part.”
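The product-to-issuer problem Hafez describes can be sketched as a small point-in-time lookup: each ownership link carries a validity window, and the chain is walked using only links that were valid on the date of the tweet. The record layout, the function names and the dates below are illustrative placeholders, not real reference data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class OwnershipRecord:
    child: str                       # product, brand or subsidiary
    parent: str                      # owner during the validity window
    valid_from: date
    valid_to: Optional[date] = None  # None means the link is still valid


# Placeholder dates for illustration only; real data would come from a curated source.
RECORDS = [
    OwnershipRecord("Q5", "Audi", date(2008, 1, 1)),
    OwnershipRecord("Audi", "Volkswagen", date(1970, 1, 1)),
]


def parent_as_of(records: List[OwnershipRecord], entity: str, as_of: date) -> Optional[str]:
    """Return the owner of `entity` on `as_of`, using only links valid on that date."""
    for r in records:
        starts_ok = r.valid_from <= as_of
        ends_ok = r.valid_to is None or as_of < r.valid_to
        if r.child == entity and starts_ok and ends_ok:
            return r.parent
    return None


def ultimate_owner(records: List[OwnershipRecord], entity: str, as_of: date) -> str:
    """Walk up the ownership chain as it stood on `as_of`; return the top-most entity."""
    seen = {entity}
    current = entity
    while True:
        parent = parent_as_of(records, current, as_of)
        if parent is None or parent in seen:  # stop at the top or on a cycle
            return current
        seen.add(parent)
        current = parent


print(ultimate_owner(RECORDS, "Q5", date(2017, 6, 1)))
```

Querying with a date before a link's `valid_from` leaves the mention unresolved (the function returns the entity itself), which is the behaviour a backtest needs to avoid using ownership information that was not yet true.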

Today internal content is driving the discussion, and many financial firms are looking to sources such as email and instant messaging platforms like Slack, Instant Bloomberg, Symphony and Skype for a competitive advantage.

Hafez added: “More traditional firms have started turning their emails and internal investment notes into actionable data points that can be used more directly within their investment process, basically making internal content more easily accessible within the organisation. It allows deeper understanding of where an organisation has a true competitive advantage over publicly available information.”

Returning to a more macro picture of the world, Doris sees NLP technology dovetailing with human discretionary analysis, where the ability to quickly surface information algorithmically will be a fundamental sweet spot.

“It’s about what will be the ultimate truths of the macro environment; are interest rates going to change significantly; is the money supply going to change significantly; do we think that there is going to be a significant change in the socio-economic or the global geo-political environment?” he said.

“That’s the kind of stuff that you do need a human level of perception to understand whether it’s a fad or actually the establishment of a long term trend.”

And regarding firms publishing backtests of their signals, Doris pointed out that it’s really important to know whether trading costs have been taken into account. “If you don’t take into account trading costs, it is trivial to create a ‘signal’ that always makes money, e.g. by assuming you can instantly trade large volumes of stock whenever the main index future price moves and before the stocks catch up.”
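Doris’s point about trading costs can be illustrated with a toy backtest. The return series, the positions and the cost of 10 basis points per unit of turnover are all made up; the sketch only shows how a signal that looks profitable before costs can lose money once each position change is charged.

```python
def backtest(positions, returns, cost_per_unit_turnover=0.0):
    """Cumulative return from holding positions[t] over period t.

    A cost proportional to the change in position is deducted each period.
    """
    equity = 1.0
    prev_position = 0.0
    for position, ret in zip(positions, returns):
        turnover = abs(position - prev_position)
        equity *= 1.0 + position * ret - cost_per_unit_turnover * turnover
        prev_position = position
    return equity - 1.0


# Hypothetical market that moves +/-0.1% in alternating periods.
returns = [0.001, -0.001] * 50
# Idealised "signal" that is always on the right side, flipping every period.
positions = [1.0, -1.0] * 50

gross = backtest(positions, returns)
net = backtest(positions, returns, cost_per_unit_turnover=0.001)
print(f"gross: {gross:.4f}, net of costs: {net:.4f}")
```

With these invented numbers the gross result is positive and the net result is negative, because flipping from long to short every period pays more in costs than the signal earns.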

 

Win with the wisdom of the crowd

One need look no further than the latest uproar in politics, or the quick spread of a brand’s ill-advised move, to know the flash-flood speed of social media’s swell. Twitter is the modern town crier, multiplied, amplified and accelerated by the power of the Internet. But how influential is it on a company’s reputation?

US-based iSentium has proven that the crowd makes a measurable impact on stock prices – the ultimate measure of a brand’s value. While social sentiment is among many factors that affect asset value, its role increases as the shouting on Twitter grows increasingly central to public discourse. Take for instance when E. coli was discovered at Chipotle. What would once have been isolated to a local news story became the outrage of a crowd, and the company took a serious hit.

In this climate, companies with access to accurate, real-time sentiment data can use this insight to grow profits, protect their brand and stay aware.

The problem may be clear, but the solution is challenging

Both Brexit and the American presidential election prove the fallibility of traditional methods and the need to understand the voice of the crowd. But finding meaning in a tidal wave of human expression is hugely difficult. While many tech companies are throwing incredible resources at measuring volume and sentiment in social media, they rely on the binary predictability of computer language rather than capturing the actual voice of the people.

This poses two enormous challenges. The first is the operating expense of processing such a huge volume of data. Think of the $3b in cloud-hosting commitments disclosed in Snap’s filings and you get a sense. iSentium has cracked that with extreme speed computing that processes the Twitter firehose on the equivalent of a laptop.

The other is that this ‘data’ is in fact language: human expression at its most casual.

iSentium’s engine is driven by a team of Noam Chomsky-trained linguists to handle the complexity and creativity of real human language, as we talk, tweet and text. For example, the number of times a word is mentioned is irrelevant without context. Also, the way we refer to a brand varies dramatically from person to person, especially in short format. People don’t communicate with the goal of being measured by AI. Think of Coke, Coca-Cola, or its stock ticker KO.

Platforms like Twitter, Facebook, and Slack force brevity, which obscures the grammar and other structural cues binary natural language processing (NLP) relies on to make sense of words. Each of the more than 3 billion Tweets each month is written in a personal shorthand, and iSentium has done a decade of work to figure out how to find the meaning in each, at the incredible speed and volume they’re being pushed.

Too risky to ignore, and equally rewarding to get right

Perhaps this social outcry is a flash in the pan to be ignored? More accurately, ignore the sentiment of the crowds and see your brand’s value irrevocably damaged.

Think about it. This personal expression is where people’s intentions are most clear. The olden days of surveys and polling are no longer sufficient. While iSentium doesn’t replace traditional market research, it shows that real-time understanding of the wisdom of the crowd is essential for smart traders, brand managers and other businesses who rely on understanding market dynamics.

And it works. Well. The proof is in the numbers.

The unique, immediate data stream delivers edge directly to financial institutions that have struggled to find it in recent years, in a way that arrives well before other intelligence, is actionable and is immediately measured in money made. For example, a dedicated JPM/iSentium index (Bloomberg:JPUSISEN), applying iSentium’s daily intelligence on the underlying assets of SPY, generated cumulative returns of 67.23% vs. 23.33% using the traditional buy and hold approach.

Contagion Future of AI and Social Media

Nasdaq’s Brad Smith interviews Gautham Sastri, CEO of iSentium — a Nasdaq Analytics Hub data provider — on the future and evolution of artificial intelligence and social media. Here are a few highlights from the exchange:

  • Sastri notes that AI is currently being used at a lower level than we can even imagine, mainly in self-driving cars and products like Amazon’s Alexa. However, he points out that more complex tasks and functionality are not far off.
  • When it comes to financial markets, AI will be used to provide traders with a “heads-up” on events and activities that might be considered contagion: notable events that catch fire and affect the value of an asset.
  • Continuing further, Sastri reviews the requirement for a combination of the right sources, data processing and AI to get a full picture of which contagion might affect valuations.
  • Combining data processing with artificial intelligence is the key to giving markets and traders the right data to make better decisions and better trades.
  • He explains how companies are currently working to harness all of the data that is available to create new products and solutions; the key is figuring out what is “stuff,” data, or actionable intelligence.
  • The iSentium CEO also elaborates that in the future it will be necessary to have a combination of really smart AI, intelligent math, very fast chips, and energy efficiency in order to utilize the amount of data that will continue to multiply, currently at 25 terabytes per second and growing exponentially at a rate of 38%.

 

This broadcast is prepared by Nasdaq, Inc. At the time of broadcasting, the information herein was believed to be accurate, however, such information is subject to change without notice and Nasdaq makes no representation or warranty as to the correctness or completeness of the information set forth herein. Nothing herein shall constitute a recommendation, solicitation or offer by Nasdaq for the purchase or sale of any investment product, nor shall this material be construed in any way as investment, legal, or tax advice or as a recommendation, reference or endorsement by Nasdaq or its affiliates.