[HN Gopher] Selling Data to Hedge Funds
       ___________________________________________________________________
        
       Selling Data to Hedge Funds
        
       Author : r_singh
       Score  : 71 points
       Date   : 2020-11-01 09:11 UTC (1 day ago)
        
 (HTM) web link (alternativedata.org)
 (TXT) w3m dump (alternativedata.org)
        
       | threedots wrote:
       | I run a business in this space. Realistically, your chances of
       | having data that is useful to a HF (of any kind) are pretty low,
       | so I wouldn't bank on it as a revenue source unless you have a
       | strong reason to believe your data (1) is predictive of
       | something an investor cares about and (2) isn't already covered
       | by other data.
        
         | dhairya wrote:
         | Out of curiosity, what resolution and time scale are useful? Is
         | it fair to assume that most hedge funds are relatively good at
         | tracking recent information and the value is in older archives
         | that are hard to collect?
         | 
         | Also, are event streams and large connected event graphs, like
         | what Forge.AI sells, actually useful?
         | 
         | Curious as my PhD research potentially has applications for
         | information extraction and event linking. But I'm not entirely
         | sure if those applications are actually valuable.
        
         | ed25519FUUU wrote:
         | I'm curious about the types of data that _are_ useful, what goes
         | into them, and what the curators of that data might make for
         | them.
        
         | fractionalhare wrote:
         | There are also ways to reliably find and curate this data for
         | trading firms and asset managers. But that usually requires a
         | somewhat uncommon blend of skills, like statistics, web
         | scraping, reverse engineering, or some domain expertise that
         | gives you an edge in (legally) finding and using nonpublic
         | data. Scrappiness also helps a lot.
        
         | crb002 wrote:
         | Trying to get my credit union to allow users to opt in to
         | selling bulk anonymized checking account transactions to a hedge
         | fund. Having that much point of sale data would be huge.
        
         | r_singh wrote:
         | Could I send you a sample?
         | 
         | I'm curious about certain data I'm scraping for a project.
        
           | fractionalhare wrote:
           | If you shoot me an email I can tell you very quickly if it's
           | viable, and direct you to specific people at firms who would
           | buy it. I probably don't need a sample if you give an honest
           | description of it.
        
             | neolog wrote:
             | Is it better to provide the raw data, or to instead provide
             | some interesting statistics from aggregating or running a
             | model on the data?
        
       | logicslave wrote:
       | Honestly, if you have a predictive data set, most likely it is
       | more profitable to use it to invest yourself than it is to sell
       | it. It's like a successful startup raising VC funding: the good
       | ones don't need the investors, the bad ones do.
       | 
       | Source is I used to work in this space. Clearly there are
       | operational difficulties to investing, and it takes serious
       | domain knowledge, but if you have data with alpha, it's worth
       | insane amounts of money (seven figures a month).
        
         | elevenoh wrote:
         | True.
         | 
         | I'd ask: "If this data is truly useful in a financial market
         | context, why are you selling it instead of investing with it
         | yourself? What data are you collecting that you're not
         | selling?"
        
           | ed25519FUUU wrote:
           | If the data made a few dollars per thousand dollars traded,
           | it would take you a long time to make meaningful income from
           | it, as opposed to a market maker.
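To make the arithmetic in the comment above concrete, here is a rough sketch. Every number is a hypothetical assumption chosen for illustration (a $3 edge per $1,000 traded, 100 full turns of capital per year), the function name is made up, and costs, slippage, and capacity limits are ignored:

```python
def annual_profit(capital: float, edge_per_thousand: float, turns_per_year: float) -> float:
    """Gross profit per year for a given edge, ignoring all costs.

    capital           -- dollars available to trade (assumed)
    edge_per_thousand -- dollars earned per $1,000 traded (assumed)
    turns_per_year    -- how many times the capital is fully traded per year (assumed)
    """
    dollars_traded = capital * turns_per_year
    return dollars_traded * edge_per_thousand / 1000.0


# Hypothetical individual with $50,000 of capital.
individual = annual_profit(capital=50_000, edge_per_thousand=3, turns_per_year=100)

# Hypothetical firm with $50,000,000 trading on the same edge.
firm = annual_profit(capital=50_000_000, edge_per_thousand=3, turns_per_year=100)

print(f"individual: ${individual:,.0f} per year")
print(f"firm:       ${firm:,.0f} per year")
```

Under these assumptions the edge is identical in both cases; only the amount of capital it is applied to differs, which is the point being made about scale.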
        
           | AznHisoka wrote:
           | Because sometimes the data is only useful when combined with
           | other data you might not have yourself. Example: you have
           | data on the number of Netflix subs. That's great. It would be
           | even more useful if you combined that with Netflix churn data.
           | 
           | You can think of company analysis as a jigsaw puzzle. You may
           | have a few pieces. That by itself isn't useful unless you have
           | the other pieces.
        
             | whatok wrote:
             | What you're saying is correct but also puts a non-
             | sophisticated seller in a poor negotiating position. They
             | do not know what existing data the buyer uses or how their
             | data can be complementary to anything.
        
               | ethbr0 wrote:
               | That's why start-high and walk away is a fair opening
               | gambit in negotiations of this sort.
               | 
               | You can always come back around, but if they pursue you,
               | that tells you something about their appetite.
        
               | whatok wrote:
               | I mean, I guess that's something that you can do but it's
               | likely to get you laughed out of the room and greatly
               | encourage them to reverse engineer whatever you were
               | trying to sell.
        
           | fractionalhare wrote:
           | This is a good heuristic for the sale of trading strategies.
           | If someone has a viable trading strategy, they should be
           | raising capital or joining a prop shop instead of selling it.
           | 
           | But it's not maximally rational to trade on data instead of
           | selling it if you don't have any experience trading.
        
             | whatok wrote:
             | And even if you're somehow able to figure out a trading
             | strategy on this with no expertise, there are extremely few
             | data sources that are actually proprietary so alpha decay
             | is a very real thing.
        
         | fractionalhare wrote:
         | I disagree. Sell-side research using so-called alternative data
         | is a different skillset from trading and buy-side research.
         | Some datasets contain sufficient alpha in and of themselves to
         | trade on; some datasets have some edge (i.e. are not yet priced
         | in) but require more sophisticated analysis. Good data is
         | necessary but insufficient for developing viable trading
         | strategies.
         | 
         | Source: Also used to work in this space. Still do, but not on
         | the sell-side anymore. I wouldn't give trading capital to
         | someone if their entire pitch was just possession of exclusive,
         | useful data.
        
         | Closi wrote:
         | Surely a hedge fund has three advantages compared to investing
         | yourself:
         | 
         | * Higher amounts of money to invest immediately, so they can
         | better capitalise on the data quickly.
         | 
         | * Balance investment risk over many different assets - they can
         | take a higher degree of risk/reward as any single failure in
         | investment is unlikely to topple them
         | 
         | * Combine multiple data sets with other trading strategies to
         | maximise returns.
        
       | dtwest wrote:
       | Does anyone else find it weird when a business like this uses the
       | .org domain? Not to distract from the main discussion, just
       | curious what people think on here.
        
         | altdatathrow wrote:
         | It's not a business. It's a stale content marketing effort run
         | by one of the incumbents in the space who leveraged it as a
         | means to promote themselves in front of their competitors.
        
           | maest wrote:
           | Indeed, you are correct (at least about it being run by the
           | incumbents):
           | 
           | > AlternativeData.org is supported and maintained by
           | YipitData and sources its content from hundreds of
           | contributing investors, data providers, and industry
           | professionals.
           | 
           | It would make sense for this to be aimed at data consumers
           | rather than data producers, considering how the advice is,
           | honestly, unrealistic.
        
       | andrenotgiant wrote:
       | Probably should add a "[2017]" to the title of this post.
        
       | paulgb wrote:
       | The article mentions having a long history of data, but I want to
       | stress a corollary of this: if you think you might eventually
       | want to sell data, either put your data in an append-only data
       | format _now_ so that history is preserved, or at least take
       | regular snapshots.
       | 
       | Otherwise, any time you update the data in place, you make it
       | impossible to reconstruct what the data "looked like at the time"
       | for a backtest.
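A minimal sketch of the append-only idea described above, assuming an in-memory store. The class and field names (`AppendOnlyLog`, `Observation`, `recorded_on`) are hypothetical and not from the article or any particular library. Revisions are appended as new rows instead of overwriting old ones, so a backtest can ask what value was known on a given date:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    key: str           # what is being measured (hypothetical identifier)
    value: float       # the value as it was known at the time
    recorded_on: date  # the date this value became known to us


class AppendOnlyLog:
    """Stores every version of a value; nothing is updated in place."""

    def __init__(self) -> None:
        self._rows: List[Observation] = []

    def append(self, key: str, value: float, recorded_on: date) -> None:
        # A correction is just another row with a later recorded_on date.
        self._rows.append(Observation(key, value, recorded_on))

    def as_of(self, key: str, when: date) -> Optional[float]:
        """Return the latest value for `key` known on or before `when`."""
        known = [r for r in self._rows if r.key == key and r.recorded_on <= when]
        if not known:
            return None
        # sorted() is stable, so for rows sharing a date the last one appended wins.
        return sorted(known, key=lambda r: r.recorded_on)[-1].value


log = AppendOnlyLog()
log.append("subs", 100.0, date(2020, 1, 1))
log.append("subs", 120.0, date(2020, 4, 1))  # later revision, original row kept

print(log.as_of("subs", date(2020, 2, 1)))  # value as it looked in February
print(log.as_of("subs", date(2020, 5, 1)))  # value after the revision
```

If the first row had been overwritten with the revised figure, the February query would return a number that nobody could have known in February, which is exactly what makes a backtest unreliable.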
        
         | robswc wrote:
         | Definitely agree. I'm not on the level of these hedge funds but
         | I do love using data for trading.
        
         | fractionalhare wrote:
         | Yep. You typically need to incubate your data for several
         | quarters to demonstrate a real correlation. And you have to
         | balance how you prove this against the fact that some firms
         | will try to reproduce the data collection internally once they
         | learn how you get it.
        
         | conformist wrote:
         | Absolutely. Ideally, you'd want both, "append only" and a
         | vintage of the corrected historical data at every point in
         | time. And good metadata.
        
       | scottydelta wrote:
       | I recently published a data dashboard(https://public.quantale.io/
       | dashboards/50778f4a-02bb-4d31-b6e...) using Quantale[1] to show
       | the correlation between Tweet activity of a stock vs the price
       | during the recent stock split of Tesla and Apple.
       | 
       | I believe there is software that can collect such data, and it
       | can definitely be useful to hedge funds. At the same time, we
       | should note that data available to the general public is most
       | probably something these hedge funds already have, since they
       | have spent years developing systems to gather and analyze data
       | to stay ahead of the curve.
       | 
       | [1] Quantale (https://quantale.io/) is a data collection and
       | analytics platform my company is developing.
        
       ___________________________________________________________________
       (page generated 2020-11-02 23:01 UTC)