Thanks for being here!
Alt Data Weekly is Powered by Vertical Knowledge:
Announcement(s):
Let me know if you’ll be at the September 28th Neudata conference in San Francisco.
We also have a team in London for the BattleFin Conference Sept 26-27 & Eagle Alpha Conference October 12. Let me know if you will be in London for both/either.
The theme that emerged in this week’s email is … the value & importance of data (particularly public web-collection data) is becoming more apparent as AI takes off … regulations will follow, industry leadership needed.
QUOTES
“Every AI app starts with data and having a comprehensive data and analytics platform is more important than ever.” – Satya Nadella, CEO Microsoft Q4 2023 Earnings Call
News Articles
Podcasts
Cool Charts
Final Thoughts (simplicity)
#1 – James Cicalo & Vera Shulgina of Arcesium published Is Your Data Initiative Underperforming?. September 2023.
My Take: This is the first of a three-part series in which the authors examine data strategy challenges and considerations to evaluate. All the highlighted reasons for underperformance are common (lack of domain knowledge, etc). I assume most firms would answer “yes” to the question posed in the article’s title, in part because expectations are so high, but in part because this stuff is hard to do well.
#2 – Sven Balnojan’s Three Data Point Thursday published How To Evaluate Hot Data Trends. September 2023.
My Take: Data is growing exponentially (see chart in “charts” section below), and we are all just trying to keep up & figure out how to best position ourselves to add value to the world. Evaluate these hot new data trends carefully. Start by figuring out how many people can be impacted and how big a lift the project will be, then try it out in reality.
#3 – Andrea Squarito published 5 Reasons You Should Stop Web Scraping. September 2023.
My Take: The classic question of build vs buy. While some may say that collecting data from the web is “easy” … it is very difficult to do well. Technical challenges, compliance issues, quality issues, talent acquisition & retention … all can be solved (call me…this is what VK does better than anyone)
Andrea offers us five reasons to outsource web data collection.
Cost
Scalability
Talent allocation
De-risking
Strategic positioning
BONUS: Interface Magazine published an interview with West Monroe’s Innovation Fellow Doug Laney How to Monetise, Manage, and Measure Data As An Asset. August 2023. “Data is a non-rivalrous asset, meaning multi-purpose, non-depleting - you can use it over & over again, it doesn’t go away. And it’s a regenerative asset – we can use data to create more data.”
What else I am reading:
Sara Brown of MIT Sloan published 3 business problems data analytics can help solve. September 2023.
Jonathan Regenstein’s Snowflake for Macroeconomic Data Science: Cybersyn + Snowpark. September 2023.
Michael Mayhew of Integrity Research published Financial Industry Urges SEC to Withdraw AI Proposal. September 2023.
Progressive International published Cybersyn has much to teach us today. September 2023.
WSJ’s Private Equity Recruits Data-Science Talent as Industry Tackles Machine Learning. September 2023.
Source: Goldman Sachs published A conversation with Renaissance Technologies CEO Peter Brown. September 2023 (recorded in July 2023).
My Take: I enjoy conversations like this one that give historical perspective on investing, data, and business building. A few key takeaways. First, the importance of focus … they do one thing really well and do everything they can to keep their eye on the mothership.
Renaissance has core principles that Peter was able to share:
Science … they are scientists, not Wall Street people
Collaboration
Infrastructure
No interference with trading systems … we don't impose our own judgment on how the markets behave
Time & experience really matter
Highlights (40-minute run time):
Minute 00:30 – interview starts; Peter’s background (CMU, IBM, etc)
Minute 03:15 – early LLMs (“we had very little data”)
Minute 04:30 – sample of text generated from early LLM … spoiler alert … it’s not good
Minute 08:30 – moving from IBM to Renaissance
Minute 10:00 – what did Peter actually do at Renaissance?
Minute 12:00 – expanding beyond equities & making changes
Minute 14:00 – institutional funds discussion (mu vs sigma)
Minute 17:30 – relationship with Bob Mercer
Minute 19:30 – discussion of risk
Minute 25:45 – discussion of Renaissance’s response to Covid
Minute 27:00 – why has Renaissance been so successful?
Minute 32:00 – speed round questions
Minute 35:00 – how to assess & interview potential hires
Bonus: The Analytics Engineering Podcast published Ep 48: Bring your own data to LLMs (w/ Jerry Liu of LlamaIndex). August 2023.
Source: ChatGPT makes us better at our jobs. This is the finding of a paper published in conjunction with BCG.
Ethan Mollick’s One Useful Thing has a summary here.
“Our results demonstrate that AI capabilities cover an expanding, but uneven, set of knowledge work we call a ‘jagged technological frontier.’ Within this growing frontier, AI can complement or even displace human work; outside of the frontier, AI output is inaccurate, less useful, and degrades human performance.”
Full paper with a catchy title is available: Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality.
“…for 18 different tasks selected to be realistic samples of the kinds of work done at an elite consulting company, consultants using ChatGPT-4 outperformed those who did not, by a lot. On every dimension. Every way we measured performance.”
One more. This one from the Three Data Point Thursday article highlighted above.
Simplicity.
I’ve been thinking about this as it relates to the data world.
To expand the benefit of the massive amount of data being created, we need to make it easy to engage, ask questions, trust the answer…all in a timely, cheap way.
“Simplicity is the ultimate sophistication.” - Leonardo da Vinci
One of the challenges is communicating the potential value to prospective users of data. Engaging with data can be overwhelmingly complex, everyone has a different use case, sales cycles get extended … it is all too complex.
I'm thinking of a recent episode of The Analytics Engineering Podcast in which the hosts interviewed former Snowflake CEO Bob Muglia. They discuss how elegantly Apple’s iPhone does the most complex tasks.
This is true & we take it for granted.
Data has yet to have its “iPhone moment”.