Thanks for being here!
This is the Alternative Data Weekly for Friday, March 15, 2024.
Announcements:
Check out the Fordham Quant Conference on April 4th in NYC.
The theme that emerged in this week’s email is … there is an ever-evolving number of use cases for the massive amounts of new data.
QUOTES
“We don’t think AI, at least in our field, is as revolutionary as others do. It’s still just statistics. It’s still a whole bunch of data going in and a forecast coming out.” – Cliff Asness
News Articles
Podcasts
Cool Charts
Final Thoughts (A Picture is Worth 1,000 Words)
#1 – Adam Nahari & Dimitris Bertsimas published in HBR External Data and AI Are Making Each Other More Valuable. February 2024.
My Take: The article goes into detail about various PE & VC use cases, beyond deal discovery & due diligence. Of most interest to me was the idea that AI will more quickly point to what is important in the mountains of data. Filters are key when you have too much information. The real challenge will be training your AI to filter out the non-relevant information. No easy feat.
#2 – Neudata published Data scraping in 2024: What we are watching. March 2024.
My Take: This article does a nice job summarizing the current landscape. There is quite a bit of uncertainty as we learn how AI is using data. There have been some significant recent legal settlements (Meta vs Bright Data) and there will be more judgements in coming months, mostly around the rights data owners have over how their information is used by AI models. Let’s watch!
#3 – Qaisar Hasan & Stan Altshuller published Maiden Century’s Alternative Benefits of Alternative Data. March 2024.
My Take: The industry has come a long way from just trying to predict the quarterly sales comp with credit card data. There are limitless opportunities when using alternative sources of data. Platforms like Maiden Century’s help make that data much more accessible. The real defensible alpha will be in the combination of multiple datasets in proprietary ways … this is much tougher for competitors to replicate.
BONUS: Integrity’s Michael Mayhew published Alternative Data Vendors Offer New Perspectives to Private Equity Firms. March 2024. “As the private equity landscape continues to evolve, the use of alternative data is expected to play an increasingly pivotal role in driving successful outcomes and maintaining a competitive edge in the industry.”
What else I am reading:
Todd Harbour’s Data as an Unacknowledged Commodity: The Case for a Formal Data Exchange. February 2024.
Data Boutique’s How Data Boutique Works #1: Data Auctions. March 2024.
The Ocean Protocol Team published Ocean Protocol Update || 2024. February 2024.
Wired published How the Pentagon Learned to Use Targeted Ads to Find Its Targets—and Vladimir Putin. February 2024.
Thought this was interesting: “To assist researchers, the BIS offers a pre-compiled full text extract of the speeches delivered by central bankers since 1996”. Link here.
Matt Ober’s Wall Street Technology Predictions 2050. March 2024.
Bob Knorpp published on the Datos blog Energizing Your AI-Driven Marketing With Clickstream Data. March 2024.
Source: Ben Lorica of The Data Exchange Podcast published 2024 Themes and Trends in AI. February 2024.
My Take: This is a bit technical for some of the more business-oriented readers of ADW, but I thought it was interesting for the few tidbits I was able to collect. Of most interest was the general optimism Ben carries throughout, specifically around the creation of new data tools for GenAI. The idea of new industries being created around AI quality & risk mitigation makes a ton of sense (data labeling, etc.), and there is a lot of VC money chasing smart people solving that problem.
Highlights (27-minute run time):
Minute 01:00 - Topic #1 - democratization of hardware for GenAI
Minute 03:30 - Topic #2 - enterprises changing how they are building, fine-tuning, and deploying models
Minute 08:00 - discussion of unstructured data for LLMs
Minute 09:00 - Topic #3 - GenAI on the edge; no reason not to expect LLMs to be on the edge
Minute 12:50 - Topic #4 - how we engage with LLMs will change. Today we prompt an LLM with an input … mixture of experts architecture; the model will decide which LLM to pass your prompt to.
Minute 21:00 - Topic #5 - AI integration. Need more modularity
Minute 23:45 - real time speech synthesis
Minute 25:00 - new industry will emerge around AI quality control and risk mitigation
Source: Barr Moses of Monte Carlo. March 2024.
“Building a data platform from scratch? Here's a quick rubric data teams can use when assessing whether or not to build or buy.”
BONUS: Ocean Protocol Update || 2024. February 2024.
Source: Really liked this quote:
“The problem with drawing an owl like this is that the axiom that “a picture is worth a thousand words” has a corollary: It takes a thousand words to describe a picture. And even that’s probably an understatement.” - Benn Stancil
There is going to be a ton of value in “prompt engineering”. This is the ability to ask the right question at the right time. And then to train your model to ask the right question at the right time.
Matt Ober predicts that in 2050 investment PMs will have a team of AI agents that have replaced the analyst team. The real value will be training those AI analysts to filter out the garbage and filter in the relevant information … and in the investment world, what is relevant changes all the time. Perhaps communicating what is relevant to your AI analysts will be the PM’s most valuable skill.
Developing the prompt engineering of those AI analysts will take decades, so maybe 2050 is a good target. We can start with the small stuff, like how Twitter learns (in theory) to filter out the posts I don’t like & that aren’t relevant … or how to prep a follow-up email to a group with whom I’ve just had a call.
Need more than 1,000 words to describe Channing Tatum. We have a long way to go & there will be an incredible amount of opportunity along the way: