Thanks for being here!
Announcement(s):
Enjoy the 4th of July weekend!
The theme that emerged in this week’s email: data tools are evolving rapidly.
QUOTES
“…the biggest obstacle firms face is gaining rapid access to good quality data, whether that’s coming from an internal or external source.” - Harbr Data
News Articles
Podcasts
Cool Charts
New Offerings … coming soon
Final Thoughts (Snowflake Summit Takeaways)
#1 – ZIRU published Exploring Key Data Architecture Trends Shaping the Future of Data Management. June 2023.
My Take: The author delves into key data architecture trends that will shape the priorities of data organizations this coming year. There are five:
Lakehouse architecture – store & manage data
Data Mesh – decentralized approach that promotes scalability, flexibility, innovation
Data Governance – security & compliance
Real time processing/streaming – faster decision-making
Data Architecture/Data modeling – solid foundations
#2 – Portable’s Ethan Aaron published what he would have said to the data world if given the platform at Snowflake’s Summit. June 2023.
My Take: Understand the value you are driving for your business. There are four ways to create value: 1) automate, 2) provide leverage to execs, 3) build saleable products, 4) mitigate quantifiable risks. In my opinion, it comes down to domain knowledge. If you understand the business, you will know how to best use the data.
#3 – Harbr Data’s Step one for AI deployment: Acquire the right data. June 2023.
My Take: AI was everywhere at this week’s Snowflake Summit. The input data for AI models and LLMs will be closely monitored. I’ve been a big proponent of value flowing toward the data owners. This should become more obvious as it becomes clear what “good” data looks like. It is important to note that the training data need not be enormous in quantity. Quality (clean, consistent, complete) is far more important than quantity.
BONUS: Data Axle’s interview with Kieran Kennedy of Snowflake Leveraging new data technologies to accelerate business growth. June 2023. “…what they’re not doing is fully leveraging those insights to really make business decisions based on data. They still use a lot of intuition and instinct. They may know there’s data out there to bring in new insights, but it’s been difficult to wrangle data and put it into a usable format – that’s where Snowflake comes in.”
BONUS 2: Gennie Gebhart & Josh Richman published in Scientific American Science Shouldn’t Give Data Brokers Cover for Stealing Your Privacy. June 2023. “Critical academic research must not become reliant on profit-driven data pipelines that endanger the safety, privacy and economic opportunities of millions of people without their meaningful consent.”
What else I am reading:
Matthew Bernath’s Data Unbound: The Art of Effective Data Sharing. June 2023.
IMF Deputy Managing Director Bo Li’s remarks, titled Digitally Driven Financial Innovation, including commentary about the IMF’s Data Gaps Initiative. June 2023.
Oxylabs’ Aleksandras Šulženko on How governments use alternative data for policymaking. June 2023.
Brookings Institution published Fighting poverty with synthetic data. June 2023.
Dr. Sven Balnojan’s Three Data Point Thursday Use Alternative Data, seriously. June 2023.
#1 – Drive by Data, The Podcast, interviews Micheline Casey, CDAO at Siemens Energy.
My Take: I’ve read & written a lot about the idea of “data products”. I was interested in hearing more from Micheline as she has had success building data products & solutions. She discusses “design-centric thinking”, which means iterating on products with a focus on understanding & addressing customer pain points. This framework for problem solving focuses on creating tangible business value (“hard benefits” & “soft benefits”).
I’ve seen a recurring theme of the importance of really understanding customer pain points when developing products (data products or otherwise)… applying domain-adjacent practices to the world of data products.
Lastly, I continue to hear the recurring theme of the importance of quick wins (but going slowly enough that you are doing it right) & the importance of domain knowledge.
Highlights (44-minute run time):
Minute 02:00 – interview starts & Micheline’s background (Federal Reserve, Ford, Maersk, Siemens)
Minute 04:30 – high-level overview of Siemens Energy & current role
Minute 08:00 – business goals at Siemens (more decentralized energy systems)
Minute 11:15 – “design-centric thinking”
Minute 17:15 – importance of talking the language of the business; laser focus on delivering quick wins
Minute 20:30 – aligning data strategy with larger business strategy
Minute 23:30 – examples of successful data products (understand customer needs, usability)
Minute 31:00 – balance “quick wins” vs “slowing down to speed up”
Minute 33:30 – thinking about team structure (clear roles, clear accountability chain)
Minute 36:15 – measuring value; economics evolve rapidly (“hard benefits”, “soft benefits”)
Source: The brain of Jason Derise of Datachorus.
…coming soon.
Reach out if interested in supporting this publication & reaching this audience.
Key takeaways from this week’s Snowflake Summit.
The energy was palpable; there is a lot of activity in the data space.
AI will solve problems that are too complex for humans given the number of inputs.
The key is to bring compute to data, not to bring data to compute (data is too big now).
Snowflake is putting a ton of focus on building its data marketplace.
Probably most interesting to me was the idea of “apps” on top of structured data (sitting in Snowflake, of course) that will allow users, customers, business users, data engineers, etc., to easily query the data and solve specific problems. The goal is for the Snowflake app environment to resemble the iOS App Store. There is a need for developers to start developing (only 30 apps listed at this point).
I am still processing this idea & am happy to brainstorm with people…
“Data products” will be how users will engage with your data. The apps will be a type of data product.
There are a lot of companies doing the same or similar things: observability, ETL/ELT, data scraping, data cleaning, process automation, etc. It is unclear what the finer points of differentiation are, but the market will play out over the next couple of years. These new tools & growing companies are making great strides in the data space.
“Ask yourself what is the most important database I have. If I then had a super smart person intimately familiar with that data, what would I ask that person about my data?” - Jensen Huang (CEO, Nvidia), on AI & LLMs