The Analytics Edge

Understanding Your Customer with Marketing Analytics with Jason Davis, Founder and CEO at Simon Data

Episode Summary

This episode of The Analytics Edge, sponsored by NetSpring, features Jason Davis, the Founder and CEO at Simon Data. Simon Data is a next-generation customer data platform (CDP) built on Snowflake. It offers the fastest and simplest way to provide a true 360 degree customer view for your marketing team. Jason describes the modern martech stack and how CDPs are moving to the data warehouse, the differences between packaged and composable CDPs, and how marketing analytics helps marketers reach customers more quickly and more effectively via personalization.

Episode Notes

With a PhD in machine learning, data mining, and statistics, Jason has helped Simon Data provide businesses the spending trends, behavioral trends, and direct customer feedback required for a comprehensive understanding of their customers. Jason was also the founder of Adtuitive, an adtech platform that was acquired by Etsy to power Etsy’s own internal ad systems. 

In this episode, Jason discusses strategies for centralizing data, the role of the data warehouse, and how GenAI and LLMs are impacting marketing.

Bio:

Jason Davis is the Founder and CEO at Simon Data. With a PhD in machine learning, data mining, and statistics, Jason was also the founder of the retail adtech platform, Adtuitive, which was acquired by Etsy in 2009 to power Etsy’s own internal ad systems. 

Key Quote:

“True centralization aggregates data from all channels, not just what someone clicked on the website – offline, IoT, support... Let's have shipping and returns data and everything else that is required to properly instrument a real business put in a single place, which is governed, secure, complete, and accurate.” - Jason Davis

Episode Timestamps

(1:22) Founding stories: Adtuitive & Simon Data

(5:58) The modern marketing data stack

(10:20) Data centralization strategies

(11:57) Security, privacy, and compliance

(13:39) GA4 and alternative strategies

(15:33) Composable CDPs vs packaged CDPs

(18:10) Breaking down data silos

(20:14) Data infrastructure and cloud migration 

(21:09) Why Simon chose Snowflake

(22:21) Real-time personalization

(25:17) The importance of Reverse ETL

(30:50) Customer 360: Where are we now?

(34:35) The impact of GenAI/LLMs on marketing

(35:20) Takeaways

Links

Jason Davis’ LinkedIn

Simon Data Website

Thomas Dong’s LinkedIn

Vijay Ganesan’s LinkedIn

NetSpring Website

Episode Transcription

Announcer: [00:00:00] Hello and welcome to the Analytics Edge, sponsored by NetSpring. 

Thomas Dong: The Analytics Edge is a podcast about real-world stories of innovation. We're here to explore how data-driven insights can help you make better business decisions. I'm your host, Thomas Dong, VP of Marketing at NetSpring. And for today's episode, my co-host is Vijay Ganesan, co-founder and CEO at NetSpring.

Thank you for joining me on the show today, Vijay. 

Vijay Ganesan: Great to be here, Thomas. Looking forward to the session with Jason.

Thomas Dong: Well, today's topic is all about marketing analytics. And our guest is Jason Davis, founder and CEO of Simon Data, a next-generation customer data platform built on Snowflake that is simple enough for marketers with zero ETL, yet powerful enough for data teams.

Jason, we're delighted to have you with us today. 

Jason Davis: Welcome. Thank you, Thomas. And thank you, Vijay. It's great to be on. 

Thomas Dong: Well, one of the primary drivers of big data is, of course, the explosion of customer and marketing data. Today's marketers are [00:01:00] capturing and gathering everything from transactional data to behavioral data and customer feedback.

So a CDP like Simon Data creates a comprehensive customer database for all that data and makes it accessible by other systems to activate, analyze, track, and manage customer interactions. So Jason, you're a data scientist with a PhD from UT Austin turned entrepreneur. We'd love to hear what led you to first found Adtuitive, an ad tech platform, and of course now Simon Data, a warehouse-native CDP.

Jason Davis: First, I've been working with my co-founder and CTO, Matt Walker, for almost 20 years. It'll be 20 years come September this year. Yes, that's a pretty big anniversary. But I always joke that it took me five years into my PhD in machine learning to realize the value in data isn't in the algorithms.

It's actually how the data is used. I really just saw a big opportunity 15 years ago in transforming the way business stakeholders and marketers can leverage their core business data. [00:02:00] My previous business was called Adtuitive. It used high-powered machine learning algorithms to leverage retailers' product catalogs to automatically create, syndicate, and optimize ads across the web.

That business was ultimately acquired by Etsy, and that technology now powers Etsy's internal ad systems. Fast forwarding to Simon, the opportunity behind what we're building here came from our experience building Etsy's data infrastructure. We spent about 18 months building out their core data infrastructure.

It was actually a requirement to properly implement some of the technologies we brought to market with them. This was circa 2010, in a world where cloud was still in its infancy and cloud data wasn't a thing. If you wanted to build a big data cluster, you had to build out a Hadoop cluster, you had to buy a hundred servers, and you had to invest two or three million bucks in hardware.

We had a network guy who'd show up at the office with a DeWalt jacket and [00:03:00] two drills. Then, five million bucks later, with software licenses from Cloudera and Vertica and a truckload of servers, they disappeared in Secaucus, New Jersey, with the servers racked, and suddenly we could go and build our big data cluster.

We did that, and it was incredibly powerful. It gave us perspective across the entire business, from browsing to search to customer support, across the buy side and sell side of a two-sided marketplace. It really provided an incredible data asset that the business had never seen before.

But the fundamental challenge that brought us to Simon was that data was just inaccessible outside of the experts on my team. We saw what transformation data could effect through what at the time were very, very simple marketing use cases.

And we had a thesis that really started with this notion: in an enterprise that has sophisticated first-party data, what would an [00:04:00] application look like that could enable marketing not just to drive analytics and insights, but actually to affect operations, to create segments, to personalize messaging, and to use data to optimize across the dozens and dozens of channels and the hundreds of technology systems that are used today throughout your typical enterprise customer?

That was the thesis going into it. When we went to market and first prototyped the business, the second realization came into play, which was that the cloud is going to change everything. Redshift appeared, and it suddenly was 8.25 an hour.

Suddenly, what took us six months and five million bucks and a few dozen drills, you could now do in an afternoon and get started. And really the bet that we saw at the time was: hey, data is going to be democratized. The access to high-power data infrastructure is going to change materially.

But what's not going to change is how non-technical stakeholders, business folks who aren't SQL experts, [00:05:00] need to use the data in ways for which SQL isn't sufficient. We just saw a huge, huge gap around the problem that we're solving today.

Thomas Dong: Right, so you've really targeted this business user, and marketing in particular, and I imagine that's because budget typically sits within the marketing organization, so they have money to spend. When it comes to marketing technology, the data shows that the average enterprise now has over 100 different tools and solutions in place.

But with this move towards really cheap storage and compute, the landscape is changing. A lot of new warehouse-native apps are emerging. I'm curious what your thoughts are on what the modern marketing data stack looks like today and what it may look like, let's say, four or five years from now.

Jason Davis: Yeah, I mean, look, historically, MarTech has consisted of data silos, led by the beautiful pitch that so many marketers in a previous generation were wooed by: just put the pixel on your website and [00:06:00] everything else will just flow through automatically.

The reality is that marketing technology has devolved to a state where each one of these 100 enterprise technologies has its own copy of your customer data. Listeners might say, oh, well, that seems expensive. But the worst part about it is that there are actually 100 different copies of your data that don't reconcile with each other.

They each represent a different and somewhat inaccurate perspective of the customer. So the first point of transformation around the modern marketing data stack, which is incredibly self-evident but is worth calling out in the context of where MarTech has come from, is actually enabling true centralization.

True centralization enables an enterprise focus on aggregating data from all channels, not just what someone clicked on the website: offline, IoT, support channels. Just because I bought something from a retailer doesn't mean it ever arrived at my house. So let's have shipping data and returns data and everything else [00:07:00] that is actually required to properly instrument a real business put in a single place, which is governed, which is secure, and which is complete and accurate.

This really is the foundation for this next generation of SaaS applications generally, and certainly for MarTech. With the customer 360 and customer data, it's natural that this is one of the primary areas of focus as well.

And so... 

Thomas Dong: That has led to an architectural change. So if we take this number, a hundred MarTech solutions, do you see potentially any consolidation of marketing capabilities because you have this new customer 360 perspective that avoids all this duplication of customer data?

Jason Davis: Yeah, I mean, there are a couple of things going on.

There's consolidation. I also think one of the big challenges around CDPs historically is that their activation model is completely broken. CDPs came along and said, oh, you have a hundred pieces of marketing technology and they all have different data.

Well, we'll fix that. We'll [00:08:00] take all your centralized data and we'll copy it out everywhere, and that will fix all the problems around incompleteness and inaccuracies. But suddenly you've increased your cost, because instead of having each of these vendors see 10 percent of your data, they're now seeing a bigger percentage of it.

And you have a massive compliance and security issue as well, because now you have 100 points of failure with fly-by-night vendors. Those 100 were chosen from, I think, the 11,000 MarTech vendors that Chiefmartec last categorized, not all of which are SOC 2 compliant.

And for many of them, who knows where the data is actually housed. But this really was the gold standard for CDPs, even up until now to an extent. The word activate has really adopted a fundamentally broken meaning, even today, when people still talk about what it means to activate your customer data into your 100 enterprise applications.

Vijay Ganesan: Jason, very interesting what you said about how traditional [00:09:00] CDPs told you, hey, you have 100 marketing systems, don't worry, we'll solve the problem, we'll just make 100 copies, right? But with what you're doing with Simon Data, you're flipping it. You're saying, no, there could be many systems, but everything is going to get centralized in the data warehouse.

It's sort of similar to the point of view that we have around product data, right, product instrumentation data and how you combine it with other data and business context in the data warehouse and so on. But in the marketing world, there are hundreds or thousands of different systems, right?

So centralization is tough, right? What should data leaders be thinking about if they want to embark on this journey of centralizing this data?

Jason Davis: It's not a two-quarter-long project. For data leaders, I think, peeling the onion back, everything starts with business value.

None of this matters if you don't have a complete and accurate view of the customer. And with that, complete and [00:10:00] accurate is only relevant relative to the end applications. So I think the first point of focus is saying:

Let's actually lean into our business stakeholders. Let's talk to our marketing teams. Let's think about what their strategies are for the back half of this year and into next year. Let's realize that we're in a new world. The previous world was one where marketing did its own thing in a vacuum and data did its own thing to service finance.

We're in this new world where data is a truly centralized function, not just for reporting, for instance, but to really drive a new set of applications across product, marketing, and beyond. So step one is that value creation needs to start with an alignment around use cases, then making sure that results in a clear articulation of what data needs to be available and collected, and then executing well.

Let's make sure it's complete. Let's make sure it's accurate. I mean, this stuff is obvious, but all too often data teams just find it easier not to have the conversations cross-functionally, and stick to what they know and believe. But [00:11:00] just like any other engineering project, in the absence of requirements, it goes down a road which is just not quite right.

Vijay Ganesan: You touched upon security and privacy, and how you've got SaaS services that probably don't even have SOC 2 compliance and so on. So how important is that? I mean, clearly it's important, but how much of it is driving this desire to centralize data?

Jason Davis: We saw GDPR shut entire companies down because they looked at their systems and they were like, you know what, we're barely profitable, and with the incremental cost it's going to take to actually make this work, I'm going to pack my bags and get a day job. And look, systems complexity and lack of data organization make GDPR intractable.

How do you even begin to think about deleting data? It's fundamentally hard, and it's something that so many systems weren't designed to do. Take a setup where you have centralized data syndicated out to 100 [00:12:00] enterprise marketing applications. There are plenty of businesses in the CDP category that attempt to go back to these channels and clean it up.

It's a nightmare, a total nightmare. Centralized data changes all that, because if you have one copy of the data, and it's governed, and you know where it is, then suddenly you can think and reason about it in a much more clarified and simplified way.

And then beyond that, look, all trends are pointing towards enterprises being more mature, more thoughtful, and more organized around their customer data.
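The point about a single governed copy making deletion tractable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for a cloud data warehouse, and the table names, customer IDs, and the `erase_customer` helper are hypothetical, not anything described in the episode or taken from Simon Data's product.

```python
import sqlite3

# Hypothetical tables standing in for a centralized customer store.
CUSTOMER_TABLES = ("profiles", "orders", "support_tickets")


def build_store() -> sqlite3.Connection:
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    for table in CUSTOMER_TABLES:
        conn.execute(f"CREATE TABLE {table} (customer_id TEXT, detail TEXT)")
    conn.executemany(
        "INSERT INTO profiles VALUES (?, ?)",
        [("c1", "alice@example.com"), ("c2", "bob@example.com")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("c1", "order-100"), ("c1", "order-101"), ("c2", "order-200")],
    )
    conn.executemany(
        "INSERT INTO support_tickets VALUES (?, ?)",
        [("c2", "ticket-7")],
    )
    conn.commit()
    return conn


def erase_customer(conn: sqlite3.Connection, customer_id: str) -> dict:
    """Delete every row for one customer in a single transaction.

    Returns the number of rows removed per table, which can serve as
    a simple audit record for the erasure request.
    """
    removed = {}
    with conn:  # commits on success, rolls back if any DELETE fails
        for table in CUSTOMER_TABLES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE customer_id = ?", (customer_id,)
            )
            removed[table] = cur.rowcount
    return removed


store = build_store()
removed = erase_customer(store, "c1")
print(removed)
```

With one store, the erasure is a single transaction against known tables. In the syndicated setup described above, the same request would have to be repeated, and verified, against each of the downstream applications holding a copy.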

Thomas Dong: So, top of mind when it comes to data privacy, of course, is GA4. So, marketer to marketer, I would love to get your thoughts on how data leaders should be thinking about first-party data and all these replacement strategies vendors are proposing. There's just a litany of vendors trying to take this as a moment in time and a trigger for new technologies.

Jason Davis: Yeah, I mean, I'll [00:13:00] stick at a high level. Look, I think the most important thing to remember with any analytics offering that Google builds is that it's fundamentally designed to sell AdWords.

I tell our customers this all the time. There's an incredible bias in tooling and analytics. There's a great reason why Google gives away this stuff for free. Obviously, some parts of it they're charging for, around BigQuery access and whatnot.

But it's still pennies on the dollar compared to what you pay elsewhere. I think it's really critical to understand that you need to, again, back to my previous point, roll up your sleeves. Talk with your marketing team, and get your data science and analytics resources involved.

Understand the attribution and the causation that's driving demand generation, conversion, and brand recognition for your business. And if it happens to align with Google's relatively digitally focused strategy, then GA4 is going to be a great path for you.

But if it [00:14:00] doesn't, then you're going to want to think about doing something more bespoke. There are many other folks in the market, our friends at Snowplow as an example, who are aggressively pursuing strategies that allow you to own this a little bit more easily.

And I'm sure there are plenty of other options beyond that as well.

Vijay Ganesan: Let's move on to the next segment. I want to kick it off with a discussion on composable CDPs versus packaged CDPs. This is a topic that's popular these days and heavily debated. We'd love to hear your opinion on that, and which category does Simon Data fit in?

Jason Davis: So there are actually two axes in the conversation that I think get lost in category definition. When we hear the word composable, what this means is: does it work with your cloud data warehouse? Does it natively integrate into the enterprise data investments you've built?

Does it orient to your Snowflake or your BigQuery data as a first-class citizen? In that sense, we are [00:15:00] 100 percent composable. On the packaged side, we would generally call packaged something more to the tune of being vertically integrated.

So take something like Segment, as an example: they collect your data, they aggregate your data, they'll put it into Snowflake and they can pull it out as well, but that's sort of a pipe towards the end of their broader data pipeline and their core infrastructure. So Segment is on the other side.

They're vertically integrated and fully integrated because you don't need a cloud data warehouse to make Segment work. There's another dimension, which is bundled versus unbundled, which I think is important to distinguish. Obviously Salesforce comes with a bundled solution.

You don't need to get anything else besides Salesforce. And then reverse ETL is an unbundled solution, because if you want CDP capabilities, you can use reverse ETL, but you need to build your own identity models. You need to think about any sort of predictive capabilities you might need.

You need to think [00:16:00] about how to leverage third-party data and third-party device graphs when needed. Some of these guys have done a good job of doing reverse ETL very well, but you need to fill in the gaps. On the other side, a more bundled approach will look at the CDP capabilities, which all get loosely defined, as the data requirements for marketing to interface with their cloud data infrastructure.

And our perspective is, look, if you're an enterprise with 100 MarTech tools, and you're trying to choose from 11,034 Chiefmartec-sanctioned vendors in the category, I'm not so sure that you also need to choose from 5,000 data tools to go with a fully unbundled approach to your CDP. So that's our perspective in the market.

Vijay Ganesan: We've talked about the data warehouse becoming the center of gravity of all data, marketing, customer data that historically sat in different silos.

And you've built Simon Data on top of Snowflake. [00:17:00] What have you seen that unlock? You started with a warehouse-first approach. It was fundamentally a different architecture from the ground up. What has that helped unlock in terms of business value?

Jason Davis: We can actually deploy in two different ways. We can deploy directly on top of the warehouse, or as a managed service, where we have more flexibility around data, but you lose some of the governance benefits on top of that.

But look, for our customers who have made bigger investments around their cloud data strategy, and who are starting to have relatively strong data capabilities that can assemble a pretty broad view of the customer, a fully connected approach where we deploy directly on top of the warehouse has big advantages. Data doesn't move unless you actually have to ship it out to a channel for execution of a campaign.

You don't have to pay for your data twice. Latency of data is reduced because there are [00:18:00] no ETL requirements. And then, of course, there are other benefits around security and privacy that we've discussed. It's also worth saying that we have many customers who build on top of our managed approach.

In this architecture, we either ETL all data into our platform or we can interface with a Snowflake share. Here we can enrich from other sources, and we can share that enriched data back to actually make our customers' data environments richer.

It's just a bit more flexible for customers who are maybe 25 percent of the way into the journey, as opposed to customers who are maybe 50, 75, or 100 percent.

Vijay Ganesan: Maybe the smaller customers that don't have mature data infrastructure would choose the managed service approach, and the larger enterprises would probably go for the warehouse-native approach.

Jason Davis: That's the general trend, but let's also not forget there's a reason why Snowflake [00:19:00] has 150 percent or 180 percent annualized net expansion. That's not just because businesses are growing; it's because they're aggressively migrating from on-prem systems. And it's not coming from small customers, it's coming from big customers.

So there are also plenty of big customers who don't actually have sufficient data today in their cloud data environment; it comes from on-prem systems. They can stage it into different areas where we can easily merge it in. But that's sort of another dimension to all this.

Thomas Dong: And one of the bets that Snowflake is making is around what they're calling these Snowflake connected apps, which of course, Simon is one, NetSpring is one as well. How did you decide to go all in on this approach with Snowflake? 

Jason Davis: We've been building on their technology for seven years now. We've developed a very close commercial partnership.

There's very tight alignment with their vision, both technically and commercially, around where they're pushing the business. Ultimately, Snowflake calls themselves a data [00:20:00] platform for a very good reason, even though everyone else who is not forward-thinking relative to their vision is just going to call them a data warehouse.

A platform is something you build upon. A platform is something that transcends reporting and insights. So for us, it's as much about the technology and what they've built today as it is the vision of where they're pushing their business in the future.

Vijay Ganesan: Jason, let's talk about personalization. Personalization is obviously critical for marketers, and there are numerous studies that have shown that brands that offer a more personalized touch with customers do much better. Some personalization has a real-time element to it. For example, you're personalizing what a user sees based on the interaction they just had in the app.

It's something that you have to react to immediately and then personalize the experience. And historically, data warehouses were not designed for real-time workloads. So how do you achieve real-time personalization [00:21:00] with a data-warehouse-centric CDP?

Jason Davis: For the past 20 minutes here, we've been focusing on the warehouse as the singular source of reality.

It's actually not the singular source. It's a primary source. It's critical, and it represents the entirety of data at rest in our platform. Yet for real-time use cases, while we do have customers who will route real-time data through the warehouse, a big part of our strategy, and a big part of meeting the requirements of the use cases of marketing teams today, is providing streaming support.

We look at this as data which is obviously not at rest, but data in transit. It has a different set of governance requirements. And look, it's an open question how the cloud data warehouse evolves to support real time.

There are technology questions. There are also just general abstraction questions. To what extent is SQL the right tool to do real-time query processing? Obviously there are many people solving this problem, and there are many different approaches, but there's no standard.

[00:22:00] SQL is beyond the standard now for how to query data at rest. But there are big, big open questions around where real time will be. Certainly we're very eager to see what real time looks like one, two, five years from now.

The other dimension, when we think about data sources, is data that's real time but isn't streamed in, which is a push model, but data that might be pulled in. So think about marketing applications where, Vijay, if I'm going to send you an offer to go on a flight, I want to make sure that flight is still available.

So you can think about ways of accessing APIs in real time to make sure that when an ad is displayed or a message is delivered, real-time pricing, real-time inventory, and any other context that may not be naturally represented in a stream or a warehouse still gets queried and surfaced.

So that's the third core data source, or data type, I should say, that we integrate into.

Thomas Dong: I [00:23:00] actually wanted to go back to an earlier comment you made about reverse ETL fitting into this unbundled CDP concept. We're seeing a lot of marketing applications emerging with their own native integrations with the data warehouse.

So I'm curious what your thoughts are on where you see reverse ETL evolving to. Will it still be as important as it seems to be right now?

Jason Davis: I think reverse ETL has been a great catalyst for this conversation we're having today. I think when we look back five years from now, we'll look at reverse ETL as the moment in time that really changed the orientation of the warehouse to be more than just a thing that data scientists would query and that CEOs would demand some charts and graphs from for their board meeting.

I think it really is an application that makes it incredibly easy to copy from A to B: to write SQL, which everyone knows how to use to interface with the warehouse, and to use that same [00:24:00] SQL to put data into your end channels. And so I think I'm forever grateful for the category, and I'm sure the three of us are, right?
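(Illustrative sketch, not from the conversation: the "copy from A to B with SQL" idea can be shown in a few lines of Python. Here an in-memory SQLite database stands in for the warehouse and a plain list stands in for an end channel; the table, columns, and the `reverse_etl` function are hypothetical and do not represent any vendor's product.)

```python
import sqlite3


def reverse_etl(conn, segment_sql, push):
    """Run a segment query against the warehouse and push each row to a channel."""
    cur = conn.execute(segment_sql)
    columns = [d[0] for d in cur.description]  # column names from the query
    count = 0
    for row in cur:
        push(dict(zip(columns, row)))  # one record per customer row
        count += 1
    return count


# Stand-in "warehouse" with a tiny customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT, lifetime_spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("a@example.com", 120.0), ("b@example.com", 15.0), ("c@example.com", 560.0)],
)

# Stand-in "end channel": a real one would be an email or ads API client.
channel_records = []
synced = reverse_etl(
    conn,
    "SELECT email, lifetime_spend FROM customers "
    "WHERE lifetime_spend >= 100 ORDER BY email",
    channel_records.append,
)
```

(Only the columns named in the query reach the channel, so any field that was left out has to be added back at the warehouse before it can be used downstream.)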

The challenge with reverse ETL is... there are a couple of things. One is, look, if you're investing a million bucks a year in Snowflake, five million bucks a year in Snowflake, and we have customers who are spending well over 50 million a year on Snowflake, you have a lot of data. Your data is miles wide, you have richness in your customer data, and you have data across online and offline contexts.

You have a marketing application that's built on maybe MongoDB, or maybe some Oracle database that was architected in 2003. And the fundamental challenge is that you have three miles of data and you're trying to fit it into three feet, or three inches, of column storage.

You need to make real design trade-offs, and at the end of the day, it's impossible to make the right trade-offs. The result is that you're still in a situation where, when the end marketer needs to actually [00:25:00] build a new segment, or execute against their Q3 campaign strategy, the fields will not be there.

And you still need to go back to the warehouse, you need to go and ingest new fields, and the whole idea of actually democratizing data access and enabling the business stakeholder becomes very hard to do in a way which is end to end.

So that's the number one key challenge there. The other thing, which is pragmatic to the category, is, look, Braze built their own integrated reverse ETL capabilities. I think they have a team of two engineers working on this stuff, and it's a SQL interface.

You plug it into Redshift or Snowflake and you write some queries and the data gets sucked in. So I do think, from a capabilities perspective as well, none of the end channels will want to get disintermediated in this way anyway.

Vijay Ganesan: So it's interesting what you're talking about, where in this new world where everything is plugging into the warehouse, you have apps pushing to the warehouse and other apps pulling from the warehouse.

And that's going to be the way these integrations [00:26:00] happen, and fewer of these intermediaries are necessary to manage that data going from one place to another. So, it's very interesting.

Jason Davis: Yeah, I mean, look, Snowflake's vision is for there to be a unified data layer for everything.

I think it's going to take a bit to get there, but...

Thomas Dong: All right, so in that answer, you actually mentioned something about democratizing data access, and just by definition, we're talking about two kinds of stakeholders here: the business users, and the data teams who need to support those business users.

So I thought you'd be a great resource here to share your thoughts on this necessary collaboration between data leaders and business people as they're evaluating new technologies. Have you come up with any successful tactics to enable this type of collaboration so that they're reaching consensus faster on new technology solutions?

Jason Davis: The problem with so many technologies and implementations today around MarTech [00:27:00] architectures is that teams come together for highly operational and tactical purposes. And that's basically the absolute worst way of doing cross-functional collaboration. They're so fatigued by the time they get through a quarter that no one even takes a step back to talk about the strategy and where you want to head.

So I think the first requirement is to ask the question: if you're trying to feed your kids, what food do you need to have in the fridge? And is it possible that, as they turn from four to five to six to seven to eight, you can get the right food in the fridge so that maybe they can start making their own lunch?

The analogy I like to make is, I have a four-year-old and an eight-year-old. The four-year-old, when he says, "I want food," I ask him what he wants and I make it for him. The eight-year-old, when he says he wants food, I say, "Dude, go make yourself a sandwich. You're eight years old. We've had the conversation around what kind of food you want to eat.

I no longer have to make you a sandwich." And I think so many marketing technology environments work like this: every single time anything comes up, it's "I want [00:28:00] a grape." "Okay, let me get you that. But it's in the fridge in the basement. I'll be back tomorrow."

And then everyone, and then the consumers are going to start from the top.

Thomas Dong: Yeah, that's great. Such a colorful example and analogy there, one I think everybody can understand. So let's close this off with maybe some forward-looking thoughts on where you see things headed. We talked a lot about the data warehouse revolutionizing the way we're storing and managing our data.

And so this concept of a single source of truth, Customer 360, has been tossed around for many, many years now, but have we hit an inflection point, perhaps, where enterprises have finally achieved that? I'm curious on your thoughts: do you agree that we're done and we've found a solution, or what, if anything, still stands in the way of this true dream of having a Customer 360 view?

Jason Davis: We're a lot closer to hitting that dream. But the problem is, the Customer [00:29:00] 360 is a moving target. Right now, we're at a 360, but it's a pretty small circle. And look, with the macro where it is, and especially with a lot of commoditization around e-commerce suppliers and commoditization of so many brands today, competition is just fiercer than ever.

So it's not just about developing the Customer 360, but about actually having a 360 which has that next level of granularity. It's about really picking apart the nuance in, and identifying, the relationships between online and offline behaviors in a way that goes beyond just having basic visibility around: has the person ever purchased in store, have they purchased online, when have they purchased online?

Those are the basics that, five years ago, we were saying are required for a Customer 360. Now it's multiple levels deeper. What was the interplay between a customer's online and offline shopping? And if their last purchase was offline, was there a digital experience that was contained in that? [00:30:00]

These are now the table stakes around this next degree of Customer 360. And I think for customers who have been able to build a basic, workable identity model and build the basics around what forms a Customer 360, staying competitive means building a hyper-accurate identity model which leverages third-party data and which really accounts for the real nuances of customers having multiple email addresses, householding, and beyond. Then you take that and overlay it with a next level of data, which can really only be stored in the warehouse, in these large-scale cloud data systems.

And I think that's your customer data, enterprise data strategy to a T: having a business that has a footprint, and being able to capture that data and deploy it into business processes.
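(Illustrative sketch, not from the conversation: one small piece of an identity model, the "multiple email addresses" problem, can be approached by linking any records that share an email address. The Python below uses a union-find structure for that; the record shape and the `resolve_identities` function are hypothetical, and householding and third-party data are not covered.)

```python
from collections import defaultdict


def resolve_identities(records):
    """Group record ids that are connected through any shared email address."""
    parent = {}

    def find(x):
        # Return the root of x's group, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for rec in records:
        rec_node = ("rec", rec["id"])
        find(rec_node)
        for email in rec["emails"]:
            # Emails are compared case-insensitively.
            union(rec_node, ("email", email.lower()))

    groups = defaultdict(set)
    for rec in records:
        groups[find(("rec", rec["id"]))].add(rec["id"])
    return sorted(sorted(g) for g in groups.values())


records = [
    {"id": 1, "emails": ["jo@example.com"]},
    {"id": 2, "emails": ["Jo@Example.com", "jo.work@example.com"]},
    {"id": 3, "emails": ["jo.work@example.com"]},
    {"id": 4, "emails": ["sam@example.com"]},
]
identities = resolve_identities(records)
```

(Records 1 and 3 never share an email directly, but both connect through record 2, which is the kind of transitive link a simple one-email-per-customer key would miss.)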

Vijay Ganesan: So you're saying, Jason, that enterprises may have reached the traditional way of thinking about Customer 360, but they really need to go [00:31:00] deeper.

Jason Davis: That's right. I mean, it's like evolving from the dash cam on your car to full LiDAR and bumper cameras everywhere.

Vijay Ganesan: That's a great analogy. Let's talk, Jason, about the topic that's catching everybody's attention these days, generative AI and LLMs. You have a data science background, and you've probably put a lot of thought into this. What do you see as the impact of generative AI and LLMs on the marketing landscape?

Are there things that we're going to be able to do easily today with these technologies that we couldn't do five years ago, ten years ago?

Jason Davis: Yeah, a hundred percent. I think, look, the hardest challenge with data access for business stakeholders is data literacy. And data literacy is rooted in two problems.

One is a semantic understanding of the data. And the second is the technical knowledge of how to leverage that semantic understanding of the data to actually answer the questions you want to answer. [00:32:00] And I think, look, there are many people solving the data dictionary problem.

I think there are big challenges in the category. I think the LLMs and generative AI capabilities will really open up a whole new set of opportunities for business stakeholders to access the, call it, data warehouse, and what would otherwise be full-code data sources.

Thomas Dong: Oh, fantastic, Jason. This has been very insightful. You've shared really deep knowledge and vast experience across now two companies that you've started. Again, we really appreciate your time. Thank you so much for joining us today, Jason.

Jason Davis: Well, thanks for having me on. I had a fantastic conversation with both of you.

Vijay Ganesan: I thought one thing he said was very interesting. He's obviously a very smart guy, PhD from UT Austin, and he said, "It took me five years into the PhD program, in data science and machine learning, to realize that it's not the algorithms, it's the [00:33:00] data and the use of the data."

And that's what matters more than the algorithm. So that's an interesting perspective coming from somebody like him.

Thomas Dong: Right. And I think that he provides tons of very applicable business knowledge and acumen in the conversation here. For me, it was a reminder that many of the business initiatives we talk about are moving targets. With Customer 360, we're constantly chasing it and coming up with new ideas on what Customer 360 is and what becomes table stakes.

And obviously technology continues to catch up. With Customer 360, for example, yes, we have that cross-channel view. We can collect the data, but it's getting to that omnichannel understanding and the deeper behaviors behind it. So there are new technologies that emerge. We talked about generative AI.

That's one, and other advanced [00:34:00] analytics techniques are going to continue to emerge for us to have that deeper understanding of customer behavior and of what Customer 360 really is: much more than visibility, but actually a true, deeper understanding.

Vijay Ganesan: And on the generative AI and LLM front, I think it's interesting what he said about data literacy.

At the end of the day, you want analytics to be accessible for the business user. They are the ones who are the decision makers. They're running the business. And it's historically been very hard to get them to do impactful analytics because of a lack of data literacy, and that's the biggest hurdle that AI could potentially solve.

Thomas Dong: That concludes today's show. Thank you for joining us, and feel free to reach out to Vijay or me on LinkedIn or Twitter with any questions or suggestions for future shows. So until next time, goodbye. [00:35:00]