Thanks for being here!
This is the Alternative Data Weekly for Friday, February 16, 2024.
Announcement(s):
I’ll be at Neudata’s February 29 Boston Data Day for Investors. Let me know if you will be there!
The theme that emerged in this week’s email is … the importance of solving the right problem.
QUOTES
“But with alternative data, there’s a lot that one can do. We first analyzed numbers, and then words. Now we’re also looking at context. Previous approaches cannot account for that, but large language models can.” - Dacheng Xiu, professor of econometrics and statistics at Chicago Booth
News Articles
Podcasts
Cool Charts
Final Thoughts (Predictions)
#1 – Tristan Handy’s Is the "Modern Data Stack" Still a Useful Idea? February 2024.
My Take: Loyal ADW readers will know I enjoy articles like this one, which provide historical context and share recent experiences that then help form the always-difficult predictions about the future. Starting from 2016, Handy’s discussion of the evolution of the MDS (Modern Data Stack) is a valuable history for anyone in the space. Today the industry finds itself coming off a universally tough 2023 and heading into a period of likely consolidation and increased impact (see Carlota Perez’s framework chart below).
My belief is that as data tools (MDS) improve, more people will come to rely on data to improve their day-to-day lives. Value will be demonstrable & switching costs will be high … early movers will benefit.
#2 – Duncan Gilchrist & Jeremy Hermann published The costliest mistake your business can make in machine learning. February 2024.
My Take: Are we solving the right problem? I see this with product development. Developers build really cool products with no commercial use case. Get the problem framing right and you’ll get it right. Focus on what you are trying to predict, how you’ll score those predictions, and how those predictions will be used. Again, a theme of ADW is having domain experience to go along with the data skills.
#3 – Sven Balnojan’s Three Data Point Thursday published 3 Actionable Tactics To Create A Good Data Strategy. February 2024.
My Take: Fascinating read. There is a ton of great information. You should read the entire article and sign up for the weekly email. Using NFLX as the example, we go through the steps of creating a better data strategy. The key is to map out the pipeline and figure out which steps are most important. Start there. How can you make that step better? Find the points of leverage in your business & focus your attention.
BONUS: Snowflake’s Advertising, Media, and Entertainment Data + AI Predictions 2024. February 2024. “The generative AI era does not call for a fundamental shift in data strategy. It calls for an acceleration of the trend toward breaking down silos and opening access to data sources wherever they might be in the organization.”
What else I am reading:
Sanaea Daruwalla of The Web Scraping Club published What the court’s ruling in the Meta v Bright Data case really means for web scrapers. February 2024.
Jason Derise published Follow-up on Fashion Industry Pricing Trends: A Data Score Interview with Flywheel. February 2024.
Bonnie Waycott published in Global Seafood Alliance How computer science and artificial intelligence can enhance commercial fishing. January 2024.
Chicago Booth Review’s In Finance, Humans Were the First Machines. February 2024.
Lynne Schneider’s Generative AI and External Data Sources: A Recipe for Success? April 2023.
World of DaaS published Weekly DaaS Roundup for Feb 8, 2024. February 2024.
CFA Society of NY’s The State of WealthTech: How Generative AI is Changing the Game
Source: Lindsay Murphy of Women Lead Data Podcast interviews SeekAI’s Sarah Nagy. January 2024.
My Take: Sarah Nagy has a good story. In her days at Citadel, she was “like a Siri”: her role was pulling data for every ad hoc request. It was very inefficient, & she started SeekAI to solve the problem.
There is alpha in big data, particularly “alternative data” (07:30).
There is a good conversation around the fund-raising process. Her goal was to meet & pitch 100 people. She talks about the importance of having a thick skin to endure rejection & her complex feelings about being rejected as a woman. Is there value in ruminating over “the why”? Ultimately, it is a numbers game. You’ll get rejected but you have to keep asking until you get a “yes”.
Other learnings from the fund raise: you are not going to be able to change someone’s mind; it is either a fit or not, which is why it is a numbers game. Talk to as many investors as you can. Your first 20 pitches will be terrible, but you’ll get better.
Now that funds have been raised and the business is growing, her attention to hiring is key. Hire slow.
Highlights (36-minute run time):
Minute 01:45 – Sarah’s background
Minute 05:15 – the move from astrophysics to data
Minute 07:45 – most recent experience and how that led to starting Seek
Minute 11:30 – first steps into co-founder / CEO role
Minute 16:00 – it’s a numbers game; you’ll get told “no” a lot
Minute 20:00 – discussion of the funding process; pre-seed round + seed round (2 parts)
Minute 25:00 – thoughts about the team Sarah is building at Seek; time investment
Minute 34:00 – parting words for entrepreneurs (just do it)
Source: Carlota Perez’s framework cited in Tristan Handy’s article: Is the "Modern Data Stack" Still a Useful Idea? February 2024.
“Prediction is very difficult, especially if it’s about the future!” - Niels Bohr
People use data to narrow the range of potential outcomes. Predicting is still a difficult exercise, but the more quality inputs we are able to include, the narrower the range of outcomes, & the greater the conviction we can have in our conclusions.
Balancing the inclusion of ever-more inputs against the risk of analysis paralysis is difficult.
Those that do it well will be ahead of the game.