Thanks for being here!
Welcome Alt Data Weekly’s Sponsor:
Augment Your Design & Development Team
Click here to learn more. See special offer below EXCLUSIVELY AVAILABLE for readers of Alt Data Weekly.
Announcement(s):
I will be in NY July 25-27. Let me know if you have time to connect.
The theme that emerged in this week’s email: proving you have the legal right to use the data you are using will be a key to growth.
QUOTES
“Data and compute have created a flywheel—driven by language models—that generates more digital information than ever before. This shifts where value sits in software ecosystems, and presents key opportunities for large incumbents and new startups.” – Abraham Thomas
News Articles
Podcasts
Cool Charts
A Word From Our Sponsor: Neutech
Final Thoughts (Be Discerning)
#1 – Abraham Thomas published Data and Compute Are The Ultimate Flywheel. July 2023.
My Take: I have learned a ton about the business of data from Thomas’s writings. This article is an add-on to his May 2023 article Data in the Age of AI. Bottom line, data is a critical competitive advantage. Data owners will benefit.
There is a TON of great information in this article, but the confidence chain is a key concept of growing importance. Basically, can you prove you have the rights to this data from the ultimate source through the entire chain (governance, provenance, etc.)? Clearly demonstrating this is key, especially when dealing with heavily regulated end markets like institutional investing.
A second takeaway is Thomas’s reframing of ‘human-in-the-loop’ as ‘software-in-the-loop’ to automate basic, repetitive practices.
Lastly, the question we should all be asking is “where is the scarcity?”
#2 – Alexandra Barr published Synthetic vs Real Data: Why do models perform worse when trained on synthetic data? July 2023.
My Take: Good overview of issues to consider when thinking about how to use synthetic data. Importantly, the author does a great job of explaining these issues in easy-to-understand language…a sure sign of strong understanding. Topics like over-representation, under-representation, and model collapse are reviewed.
“data produced by humans will become even more valuable in the future so we continue to have realistic distributions of tail data points.”
#3 – idaciti stories published Breaking Barriers: How Generative AI Transforms Data Collection Landscape. June 2023.
My Take: This article’s focus on data collection relates to both of the articles above: we need data confidence and data “produced by humans”. Generative AI can, in theory, do a much better job of gathering & organizing data from millions of places. The key will be (somehow) creating an audit process to prove provenance through the data chain. Five years from now, current decision-making processes will look like guessing.
BONUS: Matt Ober published AI agents - Data Buyers. July 2023. “The future I believe will be AI agents buying datasets for users. Buying data on demand, paying for snippets, delivering charts backed by these datasets, and handling the entire process for us.”
BONUS 2: Eric Avidon published Generative AI hype evolving into reality in data, analytics. July 2023. “Another way enterprises will use generative AI as it moves from hype to reality will be developing language models designed for specific purposes, according to Dalloul.”
What else I am reading:
Justin Pot from The Atlantic published Google’s New Search Tool Could Eat the Internet Alive. July 2023.
Andrea Squatrito published AI Depends Massively on Web Scraping. July 2023.
Andrea Squatrito published Scrape Or Buy? The Zalando Case Study. July 2023.
Boris Spiwak published Unleash AI for Lead Generation. July 2023.
Jason Derise published What Alt Data Should Learn from Alt Rock. July 2023.
Will Shanklin from Engadget published Massachusetts weighs outright ban on selling user location data. July 2023.
Coatue EMW deck: Click here
Source: Deborah Lorenzen of State Street interviews AI Product Manager Supreet Kaur of Morgan Stanley.
My Take: I enjoy conversations with the people running data initiatives at massive companies. They have the resources to evaluate many solutions and develop the processes that will eventually make terms like synthetic data, AI, and data quality common across organizations (see the reverse mentoring at the 12:30 mark of the interview). Some of the initial use cases of synthetic data include protecting privacy, filling out imbalanced datasets, and creating edge cases that might happen. Early keys to making this effort valuable to the organization include data quality frameworks/processes & adding the right people to the team! Lastly, the conversation about bias was interesting (~minute 07:00).
Highlights (16-minute run time):
Minute 00:30 – interview starts & Supreet shares her background
Minute 01:30 – what is synthetic data & where is it found in your organization
Minute 03:00 – sample business use cases
Minute 05:00 – what does it take to unlock this power & be successful
Minute 07:00 – how to get the right questions … avoiding biases
Minute 12:30 – reverse mentoring (the young mentor the more experienced)
Minute 13:00 – where are we on the maturity curve?
Source: Ben Lorica’s The New Era of Efficient LLM Deployment.
The rise of LLMs demands that we rethink our MLOps tools and processes, and necessitates a more holistic approach towards building and deploying these models.
Framework for deployment (can be tough to read below … please click here to see image more clearly):
Our valued sponsor Neutech can help you manage your team’s growth.
Now through August 1, email sales@neutech.co with subject “Alt Data Weekly” to get $2,000 off your first month.
Be Discerning.
This is a real skill. It is easier to be discerning when you have some resources (time, but also money). It is also easier to be discerning when you have fewer options. Too many options can be overwhelming. Too many options for college, too many options for majors, too many options for dinner. There is a balance to be struck.
In my thinking about this topic, I have enjoyed the writings of Naval Ravikant. I’ll highlight two articles: here & here. Plus his book The Almanack of Naval Ravikant: A Guide to Wealth and Happiness.
Professionally, I have made some non-optimal decisions in the past: not necessarily bad decisions, but not ideal ones, when choosing careers or areas of focus.
The best high-level example of the impact of discernment is that, over the last 25 years, an average-performing tech worker in Silicon Valley likely had a more successful career than a high-performing worker in a traditional manufacturing industry.
The best career decision one could have made in 1995 would have been to get into tech relative to manufacturing (spoiler: I ‘chose’ manufacturing in 1995 but pivoted!).
In my case, there was ZERO discernment regarding my first job out of college … I just went to the on-campus recruiting day & was happy to get & accept the first offer!
"You can save yourself a lot of time if you pick the right area to work in." - Naval Ravikant
Later in my career (early ’00s), I decided (or was it decided for me?) in my equity research sales position to focus on large mutual funds as clients rather than hedge funds (HFs). The thinking was that the mutual fund customers would be more stable & therefore better over the long term. Over the following decade, the HF customers were indeed less stable, but the growth of that market (the amount HFs were broadly paying for research services) far exceeded anything I anticipated.
Frankly, at the time, I did not give it much thought (discernment), as the first & easiest path presented to me was this lower-growth area. Had I been more discerning early in my career & pushed back, saying I would rather work with the potentially higher-growth market, perhaps I would have had a bigger positive impact on the firm (and made more money).
That said, knowing my personality, I would never have pushed back like that … but successful people have that trait of understanding, almost having a sixth sense, about where to focus their time.
When making these types of decisions, it is worth taking a step back and having good mentors to run through your thought process. The vast majority of people, me included, are just excited to get a job. This is especially true when you feel desperate for a job.
Having a framework of discernment puts you far ahead of peers in the long term.
Be Discerning.