Thanks for being here!
Announcements:
Big one: 90 West now offers our US consumer transaction card data at a more granular, row-level resolution. The data is clean, consistent, daily, and low-lag, with significant history and demographic insight.
Contact me to learn more.
The theme that emerged in this week’s email: the structural advantage of having big, robust data to analyze shrinks as AI tools improve and as big data becomes more widely available.
“Even if we have the tools that companies like Netflix, Google, AirBnB, and others have, and even if we copy their cultures and hire their employees, we’re still missing the third leg of their gold-plated analytical stool: Their data.” Benn Stancil
News Articles
Podcasts
Cool Charts
Final Thoughts (Data Commons)
Appendix
#1 – Harbr Data published Three ways to elevate your data business. September 2022.
My Take: The three concepts are (1) differentiation, (2) speed of innovation, and (3) unit economics. Monetizing data is a great business once it reaches scale, which is why companies are taking a closer look at their “data assets”. The problem is that it takes time and money to productize data and get it to market. Those willing to make the investment will see the payoff: I believe that within 3-5 years the market will reward the companies that can most clearly articulate their data strategy. Every company is a data company.
#2 – Benn Stancil published Data’s day of reckoning. November 2022.
My Take: I really enjoy the work that Benn publishes. It’s all about the data. Can it be that the “big guys” of the data world work with data so effectively in large part because they have such great data to work with? All this talk of tools, culture, teams … none of it matters if you don’t have good (great?) data that can be valuable to the marketplace.
#3 – Elad Gil published AI: Startup Vs Incumbent Value. October 2022.
My Take: This caught my eye because it relates to the article above (and the theme of the week). The big guys, the incumbents, have an advantage because they have such robust data to work with. This is a huge structural advantage for the incumbent over the startup, thus the title of the article. The author argues this is changing: “perhaps incumbents won due to a data advantage that is now going away as companies use the broader internet as an initial training set + are switching to models that work more robustly against smaller data set?”
BONUS: Spherical Insights published a report indicating that the global algorithmic trading market was valued at $13.02 billion in 2021 and is expected to reach $31.30 billion by 2030, growing at a CAGR of 13.6% during 2021-2030. The algos need high-quality, reliable data (90 West!).
BONUS 2: Seattle Data Guy published Data is the How, Business is the Why. November 2022. A reminder of the importance of domain knowledge.
#1 – Cindi Howson’s The Data Chief Podcast published Six Rules to Dominate the Decade of Data. November 2022.
My Take: Data & analytics is now a boardroom conversation. This podcast pulls highlights from a few of The Data Chief’s previous interviews. Of most interest to me was Rule #5 (minute 35:00), where Vandana Khanna and Deeksha Singh from Unilever discuss the use of third-party data (specifically geospatial satellite data).
Highlights (45-minute run time):
Minute 02:30 – Rule 1: Leverage the best-of-breed from the modern data stack
Minute 11:00 – Rule 2: Empower everyone with true self-service analytics
Minute 18:44 – Rule 3: Drive actions with operationalized insights
Minute 27:30 – Rule 4: Build a flexible data foundation
Minute 35:00 – Rule 5: Utilize third-party data to build a 360-degree view
Minute 38:30 – Rule 6: Deliver engaging data-driven UX
Source: Development Information published Data Engineer vs Data Scientist vs Data Analyst on Twitter. November 2022.
Source: Data Commons
I went down a rabbit hole this week on Data Commons (a Google-supported project). It relates to the idea highlighted in several of the articles above, which became the theme of the week: data is now more widely available. Now what?
Now that massive amounts of data are available and can be cheaply stored, we are stuck trying to solve the “oh sh*t, data is now everywhere” problem.
The good news is there seems to be an intuitive understanding that data has value. Better use of data can offer you a structural advantage over your competition.
But there is simply too much data, and if it is not organized, it is largely useless beyond curiosities (Data is Plural [1]) or very narrow use cases (hedge fund trading). No one has any idea where to start when it comes to organizing the data. Even the most heavily resourced firms struggle with the investment required just to get data into a position where it can be analyzed. Hence all the VC money that has flooded into data tooling companies over the past 2-3 years. Capitalism at work.
“Data Commons is an attempt to ameliorate some of this tedium by doing this (organizing) once, on a large scale and providing cloud accessible APIs to the cleaned, normalized and joined data.”
Every company is a data company, and Google is the preeminent data company. This effort from the Google-sponsored Data Commons is a net positive for the world, and in line with what they do.
Google has proven they are the best at organizing the internet. Therefore, it is likely Google will do the best job, relatively speaking, of making the world’s structured data more usable [2]. Control and centralization may be seen as negatives, but when it comes to making important data available, without a centralized structure the data is largely useless and transparency gets lost.
Is this purely altruistic? Being the go-to place for the world’s information will drive use of Google Cloud and/or their BigQuery tools. But more importantly, Google will have the final say in how this publicly available data is structured, joined, and centralized. This will help them maintain their massive data-driven advantage. Let’s just hope they remember when they said “don’t be evil.”
[1] This reminds me a bit of one of my favorite weeklies, Data is Plural, which always sends me down a path of finding interesting data-related curiosities. Check it out!
[2] This effort is really important, as this is the data that will be used to train all the AI models that will train all the robots that will take over the world.