The Analytics Edge

Warehouse-Native Data Architecture with Soumyadeb Mitra, Founder and CEO at RudderStack

Episode Summary

This episode of The Analytics Edge, sponsored by NetSpring, features an interview with Soumyadeb Mitra, the Founder and CEO of RudderStack, a warehouse-native Customer Data Platform (CDP) purpose-built for data teams. Soumyadeb shares the founding story behind RudderStack, and discusses the evolution of the CDP and the business benefits of a warehouse-centric CDP approach.

Episode Notes

This episode of The Analytics Edge, sponsored by NetSpring, features an interview with Soumyadeb Mitra, Founder and CEO at RudderStack, the leading warehouse Customer Data Platform that’s purpose-built for data teams. RudderStack is an open-source, enterprise-ready platform for collecting, storing, and routing customer event data to your data warehouse and dozens of other tools. 

After founding the company in 2019, Soumyadeb led RudderStack to 100+ employees and a $56 million Series B funding round in 2022. Prior to RudderStack, he co-founded Mariana, a VC-funded B2B martech startup, which was later acquired by 8x8 in 2018. Soumyadeb earned his PhD in Computer Science from the University of Illinois Urbana-Champaign.

In this episode, Soumyadeb talks about the founding stories behind RudderStack, the evolution of Customer Data Platforms, and the significant impact that a warehouse-centric CDP approach has on business.

Key Quotes

“We want to look at product funnels and customer journeys, but then combine that with Salesforce data, right? I mean, I want to look at funnels separately for enterprise customers and customers who closed versus customers where we are competing with a specific vendor and so on. And this is a very standard thing I would imagine. I mean, we see that across all our companies, but it was surprisingly hard to do with a lot of these cloud product analytics tools, right? They're amazing tools, but then they're only designed to ingest a specific kind of data. And if you want to combine other data sources, it becomes really fragile and complicated to set up those data pipelines, right? So yeah, I think warehouse-native enables that and kind of unlocks that set of use cases. Plus there are all these challenges around data privacy, which again, it's not so much for a company like us, but at scale, it becomes a problem, right? I mean, you're centralizing your data in a data warehouse. Why do you need to ship everything to another vendor to do specific parts of your analytics? It just does not make good sense.” - Soumyadeb Mitra

Episode Timestamps

(01:11) Founding story behind RudderStack

(02:50) The evolution of CDP

(06:50) Business challenges CDPs are trying to solve

(08:06) Packaged vs. composable debate

(10:55) Benefits of warehouse-native CDP

(17:14) Analytics on customer data

(18:47) Data activation and reverse ETL

(21:10) Real-time personalization

(26:05) Achieving customer 360 view

(28:08) Business impact with a warehouse-centric CDP approach

(30:17) The future of CDPs

(34:48) Takeaways

Links

Soumyadeb Mitra's LinkedIn

RudderStack Website

Thomas Dong’s LinkedIn

Vijay Ganesan’s LinkedIn

NetSpring Website

Episode Transcription

[00:00:00] Narrator: Hello and welcome to The Analytics Edge, sponsored by NetSpring.

Thomas Dong: The Analytics Edge is a podcast about real world stories of innovation. We are here to explore how data driven insights can help you make better business decisions. I'm your host, Thomas Dong, VP of Marketing at NetSpring. And for today's episode, my co-host is Vijay Ganesan, Co-founder and CEO at NetSpring. Thank you for joining me, Vijay. 

[00:00:24] Vijay Ganesan: Great to be here. Welcome, Soumyadeb. Great to have you on our show. I'm a great fan of what you're doing with RudderStack and really looking forward to this discussion today. 

[00:00:35] Soumyadeb Mitra: Thanks Vijay and Thomas. I'm really excited to be here. 

[00:00:39] Thomas Dong: All right, today's topic is Warehouse Native Data Architectures. And we're joined by Soumyadeb Mitra, Founder and CEO of RudderStack, a warehouse-first customer data platform built for developers, data analysts, and product teams. RudderStack is an open source, enterprise ready platform for collecting, storing, and routing customer event data to your data warehouse and dozens of other tools. Soumyadeb, we're delighted you're able to join us today. Welcome. 

[00:01:04] Soumyadeb Mitra: Thanks, Thomas. Super excited to be here.

[00:01:10] Thomas Dong: So the history of the CDP can be traced back over the last 30 years to the evolution of several key marketing technologies, including CRM systems, tag management systems, and data management platforms, or DMPs. However, 2013 is generally referred to as the year of the CDP, when the term was originally coined. Soumyadeb, you founded RudderStack in 2019. What were you seeing back then and what became the motivations for you to start RudderStack?

[00:01:36] Soumyadeb Mitra: The motivation for RudderStack came from the problems that I encountered in my previous roles and companies, right? So right before RudderStack, I was at a public telecom company, and as a data leader in that company, I was told to build interesting use cases on customer data, uh, not just for marketing, but also for support.

Support wanted to build, like, a churn model so that they could address the churn problem, which was, like, a big issue. Sales wanted to build, like, a lead scoring model, uh, to identify which leads are likely to convert so that they can dedicate their resources to those.

So all these use cases required the same architecture, like get all the customer data, uh, build interesting ML applications on top, and then activate that data back into these tools, right? A churn score, you want to get it back into, like, a support tool, like Gainsight or Zendesk, and so on. And none of the traditional CDP architectures, which are all built for marketing, could enable these use cases. So that was pretty much the genesis for starting, uh, RudderStack.

[00:02:48] Thomas Dong: So it sounds like you were, you're faced with a lot of technical challenges to meet the needs of the business and the applications you were trying to build. There were obviously innovations happening with many other vendors in the space.

What are some of the key developments in the evolution of all these various CDPs that have gotten us to the current state of CDPs?

[00:03:08] Soumyadeb Mitra: The space of CDPs is pretty interesting, right? I mean, like, they were almost like, to your point, first generation CDPs, right? The, the tag managers in some way, like, they eventually morphed into CDPs.

But even before that, there was customer data, right? You had a CRM, uh, like with the cloud CRMs, and then there were, like, on-prem CRMs. So they technically also had customer data, and so on. It was primarily used for analytics and driving specific business workflows, right? And then came the cloud CRMs and cloud marketing tools.

They all had customer data, and they were all designed for specific, uh, business workflows, right? I mean, like, sales had a CRM and marketing had a marketing automation system. I think the first true CDPs came, to your point, in the 2013-14 time frame, and it was really driven by two main things.

One is, like, the rise of data volumes, right? I mean, a lot of the applications came to the web, to mobile, and that just led to the explosion of customer data. And you needed almost, like, a new architecture that the traditional CRMs could not address. So that kind of was the first driver of, like, the CDP platforms.

The second was driven by the needs of the end consumer, like, because you had so much data about them, and people were interacting with the brands on so many different properties, right? So just sending, like, a mail coupon was not enough, right? I mean, you had to, like, truly personalize the experience, and the consumers were expecting that.

And you could do that because of this explosion of data. So these requirements, both the data volume but also the expectations of the consumer, led to these, like, CDPs and then a lot of innovation in that space, primarily driven by marketing teams. But then I think, like, because there was no other way to do that, I mean, there was no, like, collecting this data was hard.

Now, I think in the 2017 to 2019 timeframe, like, you saw the explosion of cloud data warehouses, and it became really cheap and cost effective to collect data and process data. Like, earlier you had to set up a Hadoop cluster and so on. So that's kind of the other big change that is happening recently.

The second thing is like, uh, people are realizing that the traditional CDPs did fall short on their promise, uh, again, because of, like, their architecture and the use cases that could be built on top, right? So I think we're almost seeing another evolution of CDPs where people are realizing that marketing is maybe not even the right buyer for, like, setting up this stack. Like, it should be the engineering teams and so on.

The cloud data warehouses are kind of adding to that. So that's what I would almost call, like, the second incarnation of CDPs. Or, like, it's really the third one. Like, there were the traditional CRMs.

Then came the CDPs, and then this is, like, a new CDP architecture being led by the engineering teams and so on.

[00:06:18] Thomas Dong: That's really fascinating. Um, so we had this big data explosion that led to massive volumes of data. Um, and you know, a new architecture has emerged. RudderStack has built a warehouse native approach to it.

So around this big data explosion, you know, we can talk about the technical challenges, business challenges here. Um, I want to deep dive a little bit in terms of, okay, if somebody is trying to set up a CDP, kind of divorcing it from kind of the technical challenges, what are like, like common business challenges that any of the CDPs are trying to solve regardless of their architecture?

[00:06:58] Soumyadeb Mitra: I think it, it kind of all goes back to, like, the right personalization experience for your end consumer, right? Whether it's a CDP or homegrown or whatever, the end use case is delivering a personalized experience to your consumer, right? So that's where it kind of all boils down to. And that experience could be on your website, could be on the emails that you send out, or it could be even the mails that you send, like, the postal mails that you send, right?

So that's the end outcome of, uh, any CDP architecture. I think traditionally, a lot of those touches were driven by marketing. So that's why marketing has been a buyer. But then what people are realizing is that consumers are not just touching a brand through its marketing. Like, when you call into a call center, that also requires delivering a personalized experience.

And then that's usually a separate team. So like all these teams need to get access to like true customer data. And then that's kind of what a CDP has to enable. 

[00:08:01] Vijay Ganesan: There's an ongoing debate in the community about packaged CDP versus composable CDP, and there's, you know, different definitions even of those terms.

How do you describe packaged versus composable, and how does RudderStack fit in that debate?

[00:08:16] Soumyadeb Mitra: Yeah, I think, so for the audience who are not familiar with the term, like, the whole idea of a packaged CDP is, like, an end to end black box SaaS system, right? I mean, uh, a lot of the second generation CDPs, if you will, are sold as a packaged CDP.

The idea is you send all your data into these, uh, into these SaaS tools, and then they do some internal magic. And they expose an interface for marketing, mostly, to come in and, like, create audiences and activate those audiences and so on. So they're, like, built for driving a very specific use case. It is a big use case, but it's primarily for marketing.

I think to what I was talking earlier, like people realize that that architecture can only go so much and you need like a new architecture of, uh, of like building this customer data stack, right? A warehouse native architecture where you centralize the data into some kind of a data warehouse or a data lake, build interesting applications on top and enable marketing, but also enable like the other business use cases.

And this new architecture, that's kind of one thing, it's a new architecture. And the second thing is, like, this is primarily being driven by the engineering teams, the data teams, uh, product teams, as opposed to, like, just marketing buying, like, a solution, right? So that's kind of the big split around packaged and composable, right?

Here, you're saying you're composing the CDP with, like, a bunch of tools, right? The data warehouse is a big piece of it. And then, like, you need other pieces of the puzzle, right? So I think now you can, like, draw a line on how many tools you need. Like, clearly you need some kind of a data warehouse.

Then, like, you need some kind of a data integration tool to bring in all the data. Uh, you need some kind of a transformation tool to transform the data. And you need some kind of, almost like, I would say, a workflow tool for different business users to come in and act on that data, right? So, whether it's analytics, where companies like NetSpring and so on play, or whether it's, like, activation, where we play. So, I think this composable idea at the highest level, where you have, like, your data warehouse and then some set of tools on top, makes sense for this new generation of architecture. Like, what set of tools, I think that will evolve over time.

[00:10:43] Vijay Ganesan: If you are pitching Warehouse Native CDP to a business person who maybe doesn't care much about data architectures and stuff, like a marketing person, right?

How would you describe the benefits? Like, why should a marketer care? What benefit would they get out of Warehouse Native CDP? 

[00:11:02] Soumyadeb Mitra: That's a good question. I don't know if I'm like, we have like figured this out. I mean, that's why we primarily sell to the data teams, right? Our pitch is like, this is like... You should be owning this, right?

And they understand that the new generation of data leaders, they understand that like they have to build this stack and enable marketing. So how do you pitch this to, to marketing saying, I think the main pitch has to be that the traditional CDP investments are not providing the ROI and enabling the use cases they truly want to enable, right?

And I think it is only getting worse, right? I mean, with, like, generative AI and all the stuff, I mean, it's a buzzword, but I think, like, personalization will become very deep, uh, but then you can only do that when you have, like, the right data architecture. The traditional things are not able to solve that.

[00:11:53] Vijay Ganesan: You know, one of the things we're seeing in the product analytics space, and we subscribe to the same philosophy of warehouse centricity and so on, is, you know, to a business person, they can get much richer, context rich analytics. It's not just about a few streams of data. The data warehouse has got data from so many other sources that your personalization can become much richer, right?

Your analytics can get much richer. There's of course the governance and security and single source of truth and so on that the data engineering teams care about. But you made a great point earlier about how this is not just for marketing anymore. This is for every group in the company that has anything to do with the customer, you know, they care, right?

So that's why it makes so much sense. Double clicking a little bit on some of the key capabilities of CDPs, right? So if you look at it, maybe let's pick a couple of things. You know, you've got connectors, right? You have to bring data from so many different systems into a single place. So there's this whole connectivity ecosystem.

But after you bring the data, data often is in so many shapes and forms in all these systems. You've got to rationalize it, conform it, and do identity resolution. This is one of the biggest things for marketers, really, this identity resolution. How does being warehouse native help with those things?

[00:13:08] Soumyadeb Mitra: I think in our conversations, right, ID resolution often comes up as the first, uh, motivation for doing, like, a warehouse native implementation, and again, it probably points back to, uh, the limitations of traditional CDPs.

A good example is, let's say you have, like, your data coming from your website and data coming from your apps, and you have some data coming from your CRM system, and all have some identities, and then you have to finally stitch all of them into one single record, right? And sometimes, like, you have, uh, fields like addresses which cannot be matched deterministically, right?

So the traditional CDPs offered some black box ID resolution. And I mean, they'll do some magic internally, but it could only go so far. I mean, each business has custom needs and complexities of their own data. That customizability was not available in a traditional black box CDP. Versus when you have the raw data in your data warehouse, like, it's kind of your data team's responsibility to stitch them, but it also gives you that power to put in your own custom logic and tune that identity resolution.

So in fact, that is often, at least in our conversations, almost an entry point, to say that, like, okay, are you truly being able to match customers into one record in these cloud CDPs? Versus now you have the flexibility with the data.

[00:14:46] Vijay Ganesan: So there's the flexibility aspect, but I'd imagine there's also the aspect of transparency.

I know exactly how this thing is happening, right? Instead of this black box is doing something, I have no idea how this identity resolution has happened. So there's the lack of transparency, I'd imagine would be an issue too. 

[00:15:03] Soumyadeb Mitra: 100 percent and transparency. And then the kind of a follow up to that is like being able to rectify things, right?

I mean, if this black box merged certain things which, by looking at it, you don't think is the right thing, there is no way you could, like, go and rectify that. I mean, it's for them to, like, ship an update to the software, and till then, you're kind of stuck with this, right? Versus if you're stitching, you're building your ID logic, like, you're writing code for doing that, right?

Yes, it does require some work. And that's where vendors like RudderStack can also help, but that flexibility in being able to, like, detect mistakes and, like, override them is also extremely important.
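[Editor's illustration] The deterministic stitching and the ability to override a bad merge, as discussed above, can be sketched roughly as follows. This is a minimal sketch, not RudderStack's implementation: the `IdentityGraph` class, the `stitch` function, the `blocked_pairs` override, and the sample identifiers are all hypothetical, and real identity resolution in a warehouse would typically be written in SQL and run incrementally.

```python
from collections import defaultdict


class IdentityGraph:
    """Union-find over identifier strings such as 'email:kim@example.com'."""

    def __init__(self):
        self.parent = {}

    def find(self, node):
        # Register unseen nodes as their own root, then walk up with path halving.
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Deterministic choice of root so results are reproducible.
            self.parent[max(root_a, root_b)] = min(root_a, root_b)


def stitch(records, blocked_pairs=frozenset()):
    """Merge identifiers that co-occur in a record into one customer cluster.

    blocked_pairs is a hypothetical manual override: a set of frozensets naming
    identifier pairs that must not be linked directly. It only blocks that
    direct link; the two identifiers can still end up together via other records.
    """
    graph = IdentityGraph()
    for record in records:
        ids = [f"{key}:{value}" for key, value in record.items() if value]
        for identifier in ids:
            graph.find(identifier)  # make sure every identifier is registered
        for a, b in zip(ids, ids[1:]):
            if frozenset((a, b)) not in blocked_pairs:
                graph.union(a, b)

    clusters = defaultdict(set)
    for node in list(graph.parent):
        clusters[graph.find(node)].add(node)
    return sorted(sorted(cluster) for cluster in clusters.values())


records = [
    {"anonymous_id": "a1", "email": "kim@example.com"},   # website event
    {"email": "kim@example.com", "crm_id": "c9"},         # CRM row
    {"anonymous_id": "a2", "email": "lee@example.com"},   # app event
]

merged = stitch(records)

# Suppose the data team decides the CRM link was a mistake and overrides it.
override = {frozenset(("email:kim@example.com", "crm_id:c9"))}
corrected = stitch(records, blocked_pairs=override)
```

Because the matching rules live in the data team's own code, a wrong merge can be corrected by changing a rule or adding an override and re-running, rather than waiting for a vendor update.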

[00:15:37] Vijay Ganesan: dbt has become very popular, it's almost standard now in most companies and data architectures. How does that help with your architecture and how does it complement what RudderStack does?

[00:15:50] Soumyadeb Mitra: Probably half of our customers are using dbt in some form or shape, right? Like, dbt at one end is a very simple thing. It's not like a new SQL or so on, but at the other end, it's, like, extremely powerful, right? For a very long time, analysts were writing pages and pages of SQL. There was no version control and no proper way to manage that.

So it's kind of like a big force in that ecosystem, right? It almost, like, took data analysts and made them software developers, right? Like, all the software engineering best practices can now be applied to the data transformation, right? So, like, I'm a big fan of dbt. To that point, a lot of our customers are using dbt to write those transformations.

So we are almost, like, complementary in the sense that we land the data and then they can write their own transformations in dbt to achieve, like, ID resolution and, like, compute features and so on. At the same time, it does require a lot of work. Writing ID resolution in SQL in dbt is quite non trivial, particularly if you have to, like, handle scale and handle it incrementally and so on.

We have also kind of taken that problem and productized that in RudderStack. We have built a layer on top of dbt, most of it, which takes, like, a high level config and it will generate the SQL, it will generate the dbt model to do that identity stitching and, like, feature generation and so on. Yes, we are complementary in that sense.

[00:17:14] Vijay Ganesan: Let's talk about analytics, product analytics, marketing analytics, digital experience, all of these types of analytics. Customer data is obviously front and center in that, and so how do you see analytics benefiting from these data warehouse centric CDP architectures?

[00:17:32] Soumyadeb Mitra: We internally have use cases that can only be enabled by a data warehouse architecture.

Like, a very common example is, we want to look at product funnels and customer journeys, but then combine that with Salesforce data, right? I mean, I want to look at funnels separately for enterprise customers and customers who closed versus, like, customers where we are competing with a specific vendor and so on, right?

And this is a very standard thing, I would imagine. Like, I mean, we see that across all our companies, but it was surprisingly hard to do, uh, with a lot of these cloud product analytics tools. And they're amazing tools, but then they're only designed to ingest a specific kind of data. And if you want to, like, combine other data sources, it becomes really fragile and complicated to, like, set up those data pipelines, right?
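[Editor's illustration] When product events and CRM data sit in the same warehouse, the kind of funnel-by-segment question described above becomes a single join. The sketch below uses an in-memory SQLite database as a stand-in for a warehouse; the `events` and `crm_accounts` tables, their columns, and the sample rows are all made up for illustration, and the funnel is a simple per-step count rather than a strict ordered funnel.

```python
import sqlite3

# In-memory SQLite stands in for a cloud data warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (account_id TEXT, user_id TEXT, step TEXT);
    CREATE TABLE crm_accounts (account_id TEXT, segment TEXT);

    INSERT INTO events VALUES
        ('acme', 'u1', 'signup'), ('acme', 'u1', 'activate'),
        ('acme', 'u2', 'signup'),
        ('bolt', 'u3', 'signup'), ('bolt', 'u3', 'activate');

    INSERT INTO crm_accounts VALUES
        ('acme', 'enterprise'),
        ('bolt', 'smb');
    """
)

# Count distinct users reaching each funnel step, split by CRM segment.
FUNNEL_BY_SEGMENT = """
    SELECT c.segment,
           COUNT(DISTINCT CASE WHEN e.step = 'signup'   THEN e.user_id END) AS signed_up,
           COUNT(DISTINCT CASE WHEN e.step = 'activate' THEN e.user_id END) AS activated
    FROM events AS e
    JOIN crm_accounts AS c ON c.account_id = e.account_id
    GROUP BY c.segment
    ORDER BY c.segment
"""

funnel = conn.execute(FUNNEL_BY_SEGMENT).fetchall()
for segment, signed_up, activated in funnel:
    print(f"{segment}: {signed_up} signed up, {activated} activated")
```

The point of the sketch is only that no separate pipeline into an analytics vendor is needed: the CRM attributes are joined in place.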

So yeah, I think Warehouse Native enables that and unlocks that set of use cases. Plus there are all these challenges around data privacy, which again, it's not so much for a company like us, but at scale, it becomes a problem, right? I mean, you're centralizing your data in a data warehouse. Why do you need to like ship everything to another vendor to like do specific parts of your analytics?

Like, it just does not make good sense.

[00:18:46] Vijay Ganesan: Let's talk about data activation and reverse ETL. It's part of this composable CDP idea where I can activate from the warehouse directly and push it to target systems and so on. But going forward, do you see the martech systems to which you're pushing data, do you see them building native connectivity to the warehouse?

So you're seeing that, for example, with Salesforce, ServiceNow, so building sort of bi directional connectivity to Snowflake, right? Is that a trend that you are seeing building up and where we would be in a place a few years from now where you really don't need an intermediary to push the data from the warehouse to these tools and these tools have native connectivity to the warehouse?

[00:19:29] Soumyadeb Mitra: Yeah, I think, like, it's almost like a sequence of steps. I think reverse ETL was almost like a temporary hack, right? I mean, we support reverse ETL because there was a big demand for it. The other vendors were building big businesses on reverse ETL, but it's kind of like, yeah, just getting data into these cloud tools.

And then, like, to your point, for these cloud tools, eventually, the data warehouse is a big source of data, and it makes complete sense for them to, like, provide that connectivity to pull data, right? So that's kind of the next stage. And I think, like, every cloud vendor would provide, uh, some kind of connectivity to the data warehouse.

But I would say, like, the third evolution of that is, like, why do you even need these cloud tools, right? Why should there be, like, a Salesforce CRM, which again, is trying to copy your warehouse data, right? And so on. Why can't you natively build those applications on top of your data warehouse, right?

And I'm sure, like, there are technical challenges, like, how do you support real time and how do you support, like, transactional, uh, updates to your data warehouse, where I think the warehouse vendors have to also innovate and so on. But I think, like, that is the future state. I think, like, even the cloud SaaS vendors will be disrupted and a lot of these things will be built on top of the data warehouse.

Like NetSpring, you folks are doing analytics on the warehouse, but, like, there is no reason it needs to stop there. And I mean, your marketing automation tool and your CRM all should be running on top of one single source of truth.

[00:20:53] Vijay Ganesan: Yeah, it's interesting that Snowflake has this Unistore, which is basically, you can build transactional applications on top of Snowflake.

This notion of hybrid transactional analytical processing systems has been around for a long time, but it will probably see the light of day now with the Snowflakes of the world innovating. Let's talk about personalization. You mentioned that earlier, personalization is probably the most important thing for marketers, the way they interact with their customers, and increasingly it's becoming very, very important, right?

And every customer wants and expects and demands personalized experiences in every channel, right? Whether it's, like you said, even an email, everything has to be very personalized, right? And there is an aspect of personalization which has to be in real time. So, I'm on the app on my phone and I do something and I get an offer or something, right?

And those things have to be in real time. Now, the conventional wisdom is warehouses are not really set up for real time, right? So, how do you deal with that?

[00:21:56] Soumyadeb Mitra: I think this is a question that comes up again and again in our conversations, and that's where I think, like, we as a product have some edge, because we provide all the connections, like, all the way from ingest to activation, and we can support data movements where the warehouse is not involved, right?

I mean, we can in real time stream an event from, like, a website back into, like, a marketing tool like Braze so that the push notification goes out, so you don't have to, like, send that event through the warehouse. When we think about personalization, right, I mean, there are two parts to it. One is, like, the understanding of the user, right?

How much do I understand an end user? And that is based on all the historic interactions that the user had with the brand, right? I mean, if you're on Netflix, then they have kind of built an understanding of the user based on all the videos they have watched and so on. That's, like, call it the user model, right?

And then the question is, like, how real time does that user model have to be, right? I mean, yes, if you just signed up on a brand and they don't know anything about you, it probably needs to be real time. But if you are, like, a 15 year Netflix customer, they probably have a very good understanding of the user anyway.

You don't have to, like, update that model in real time. That's one aspect of personalization, like, the real time aspect, whether you need it or not, depending on how long the relationship has been. The second thing is, like, taking action based on that user model, right? So if I did something, right, I drop off of a checkout page, then I want to take an action, uh, based on that user model.

Like, I want to send you a promotion or not send you a promotion. So there are these two aspects, and it is possible to support both on a warehouse-first architecture. Like, the updating of the user model can be done in batch, right? In most cases, right? What you just need is, like, the action that you are taking based on some user activity, that action has to query the user model and then take that action, right?

So that part should not go through the warehouse. And that is possible, right? I mean, even with current architectures it's possible. Like, even RudderStack can support that use case.

[00:24:06] Vijay Ganesan: So your point of view is, most things can go through the warehouse, but there is a class of things where you need a direct pipe, right?

That, that you need to support both. 
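[Editor's illustration] The split described above, a user model refreshed in batch from warehouse data plus a real-time path that only reads that model when an event arrives, might look roughly like this. Everything here is hypothetical: the `UserModel` fields, the toy `churn_risk` formula, the event shape, and the action names are invented for the sketch and do not describe any vendor's API.

```python
from dataclasses import dataclass


@dataclass
class UserModel:
    user_id: str
    lifetime_orders: int
    churn_risk: float  # toy score, not a real model


def rebuild_user_models(order_history):
    """Batch step: recompute every user model from historical warehouse rows.

    order_history is a list of user_ids, one entry per past order. In practice
    this would be a scheduled warehouse job, run hourly or daily.
    """
    counts = {}
    for user_id in order_history:
        counts[user_id] = counts.get(user_id, 0) + 1
    return {
        user_id: UserModel(user_id, n, churn_risk=1.0 / (1 + n))
        for user_id, n in counts.items()
    }


def on_event(event, models):
    """Real-time step: react to one event by looking up the precomputed model.

    The event itself never has to round-trip through the warehouse.
    """
    if event["type"] != "checkout_abandoned":
        return None
    model = models.get(event["user_id"])
    # Unknown users (nothing learned yet) and high-risk users get a promotion.
    if model is None or model.churn_risk > 0.3:
        return {"user_id": event["user_id"], "action": "send_promotion"}
    return {"user_id": event["user_id"], "action": "send_reminder"}


# Batch refresh from history, then handle events as they stream in.
models = rebuild_user_models(["u1", "u1", "u1", "u2"])

loyal = on_event({"type": "checkout_abandoned", "user_id": "u1"}, models)
newer = on_event({"type": "checkout_abandoned", "user_id": "u2"}, models)
unknown = on_event({"type": "checkout_abandoned", "user_id": "u9"}, models)
ignored = on_event({"type": "page_view", "user_id": "u1"}, models)
```

The design choice being illustrated is that only the lookup and the action need low latency; the model itself can lag by hours for long-standing customers without much loss.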

[00:24:15] Soumyadeb Mitra: Yeah. I mean, this user action driving, like, I did X and I get a response Y. And that Y is dependent on my user model, right? So that part has to be supported. Like, that is a requirement. You cannot, like, do that two days later. But the updating of the user model, that need not be in real time.

And the warehouse is a perfect way to do that, right? So you have to support this architecture.

[00:24:38] Vijay Ganesan: And it's interesting, you know, the warehouses are also constantly evolving, right? I mean, they're a lot more real time today than they were even, like, a year ago, right? In terms of streaming ingestion, you know, million events per second type ingestion into cloud data warehouses is common these days, but still there's some lag between the time the data arrives and the time that it's available for querying and so on.

But I'd imagine five years from now, the warehouses are probably a lot more, have a lot more support for real time capabilities. 

[00:25:05] Soumyadeb Mitra: 100%, right? And I think, to your earlier point, this, like, unification of OLAP and OLTP was always the dream. And hopefully the cloud data warehouses pull that off. And then that will kind of, like, merge this.

And then, like, beyond these traditional cloud data warehouses, you also have a new gen of companies who are trying to build, like, some version of real time data warehouses and so on. So there's a lot of innovation.

[00:25:27] Vijay Ganesan: And you know, that's where I feel like this betting on a warehouse centric architecture makes so much sense, because there is so much innovation that's happening around data warehouses, right, that you can sort of start leveraging more and more of these capabilities.

We'll talk about generative AI in a little bit, but Snowflake bought a company just focused on building generative AI capabilities in the warehouse. So all of that stuff becomes available for anybody that's building on top of the data warehouse.

So it's all the more reason to sort of bet on a warehouse centric architecture.

[00:26:00] Soumyadeb Mitra: 100%. Yeah, I think that's the future.

[00:26:05] Thomas Dong: Yeah, it's really interesting as we're talking about these major innovations happening on the technology side. You talk about user models, and this has been a, you know, recurring challenge for many, many years. And it makes me think of all the, you know, marketing hype around Customer 360, Single View of the Customer.

Um, obviously these have been concepts that have been bandied about for many, many years now. Curious what your thoughts are, like, have enterprises finally and successfully achieved a Customer 360 view? Um, and if not, what still stands in the 

[00:26:35] Soumyadeb Mitra: way? We have a product around that, so we have a biased view of this, but I think, like, I'd be very surprised if, like, even one person of the enterprises have built, like, a Customer 360.

Like, maybe the Amazons of the world have kind of built. But like, if you go to any reasonable company, right? I mean, any standard company, you'll see that like, their customer data is all over the place, right? I mean, I have data here, which is, uh, with team X, which team Y wants to access, and they don't have access to that data because they're using some other cloud tool, even within departments, right?

I mean, you'll see that marketing has an email marketing department and, like, a separate push, web, mobile marketing department, and some other team owning the web experience, and even their data is all over the place. And you see that outcome where you buy something and then you still keep getting email, because the email system has not been updated with the transactions that you have done, right?

So not even, like, 1 percent of the enterprises have truly built a Customer 360. And again, it goes back to the earlier point that Vijay was mentioning: you need a new architecture, right? This traditional SaaS model is kind of almost anti building a Customer 360. Like, you're trying to use 30 SaaS applications, then you're kind of trying to send the data to 30 places.

So you need to put a warehouse at the center to even build that Customer 360. So hopefully in five years, that number will go from 1 percent to, like, 50 percent.

[00:28:06] Vijay Ganesan: Soumyadeb, can you share any anecdotes, any examples of business impact that a warehouse centric CDP approach has brought about? 

[00:28:14] Soumyadeb Mitra: Yeah. So like, I mean, we have a bunch of customers who have, uh, like built this warehouse centric CDP.

I don't think I can name them, but, like, there's a company, a popular [inaudible] brand, like probably the biggest one in the U.S., and they had the same challenge around, like, Customer 360, right? Like, earlier, their data was in, like, five different tools, right? One product analytics, one marketing analytics, some emailing tool, and different teams were running these promotions in isolation and so on.

Now they have centralized everything into, uh, like a data-warehouse-first architecture with RudderStack, and they're still using those tools, but then they are being driven out of that single consistent customer view, so that the messaging and everything is very consistent. So there are some numbers to share around, like, ROI on those campaigns and so on; it, like, substantially moved the needle compared to the previous one.

[00:29:12] Vijay Ganesan: Right. I'd imagine the impact for a large enterprise could be very significant, right? This is not just an incremental thing. These are very, very fundamentally business-impactful.

[00:29:24] Soumyadeb Mitra: Yeah. I mean, there's another company we work with. It's, like, a top casino brand, and they have a mobile application, and they kind of built, like, a churn model, again by centralizing the data into a data warehouse.

They've kind of built, like, a churn model on top of that. And based on that, they are reactivating the users, and that drove their revenue up by, like, 30%, right? It was, like, a huge lift given the effort they had to put in.

[00:29:55] Vijay Ganesan: Wow, 30 percent growth in revenue. Any business would want that.

[00:29:58] Soumyadeb Mitra: Yeah, and churn is a big problem in a lot of these, in certain segments.

I mean, even in my previous company, churn was a big issue. And even building a simple churn model is hard. But then if you can centralize the data into a data warehouse, then, like, yeah, it's more of a data problem rather than an ML problem. You just need the right architecture.

[00:30:16] Thomas Dong: So let's bring it back to that ML problem.

Actually, we, we just, uh, previewed that a little bit. Um, in terms of generative AI and the, the possibilities there. It's obviously one of the hottest topics, uh, in tech discussions today. You know, as it relates to CDP and potentially the future evolution of CDPs. How are you potentially, um, viewing, uh, generative AI capabilities as a key component of, uh, a CDP's capabilities to help with things like churn and activation, uh, and whatnot?

[00:30:47] Soumyadeb Mitra: Yeah, I think, like, I mean, this is a very nascent space right now, so I'm sure there'll be a lot of innovation around, like, tooling on top of generative AI, ChatGPT and so on. But at the fundamental level, though, think about the reason people used to do, like, broad audience-segmentation-based personalization, right? You create, like, a huge segment, people who are from, like, New York and, like, age 15 to 20, and show them that same exact promotion.

That's how marketing pretty much works now, right? And the reason you used to do that is you could not create more campaigns, right? As a human, you can only think of these broad categories and then, like, okay, for these people, I'll create this campaign. And for this other segment, I can create another campaign, right?

You're kind of limited by your creativity and how many campaigns you can think of. And then the entire tooling was kind of built around that, right? And that's why you could, again, get by with, like, not so much: you just need to know whether they live in New York, what is their gender, and what is their age, right?

And then that's all you need to, like, run those marketing campaigns. That will go away, right? I mean, now with generative AI technologies, you could, like, literally personalize the campaign to, like, an individual level. I mean, people have been talking about, like, one-on-one personalization for, like, 15 years, but I think, like, finally we are at a point where you could do that, where you could literally tell ChatGPT, like, this is the person, these are the last four things they have bought, and these are our categories. What should be the next promotion?

I mean, and ChatGPT will come up with a reasonable answer. So the technology is there, I think, but you still need the data. Like, now it's not enough to say that you just need these four attributes about a person.

The more data you can feed and the more context you can give, the better would be the personalization. So I think, like, this broad-based segmentation, segment-based marketing, will not be happening in five years. And whether CDPs put generative AI in them or not, or whether it's a separate tool, the CDPs will have to be a very foundational layer.

I mean, you have to get all the data. Like, finally we will be at a point where the more data you get, the better will be your personalization, which was not always true. So that's why I think it's a good time to be in the CDP space.

[00:33:16] Vijay Ganesan: That's a great, great point you make about one-on-one personalization, which has been the dream for everybody.

And that's probably going to come true with generative AI. That's very interesting.

[00:33:26] Soumyadeb Mitra: Yeah, I mean, now creatives can be automatically generated. The messaging can be automatically generated. You are no longer limited by bandwidth constraints around what you could do.

And so I don't think we'll be doing broad segmentation. There will be a new age of, like, marketing tools to leverage this.

[00:33:44] Vijay Ganesan: Yeah, one on one, not just in, you know, the text or the email, but also even the creatives that go with it. 

[00:33:49] Soumyadeb Mitra: Yeah, exactly.

[00:33:51] Vijay Ganesan: You know that I like, you know, cartoonish style, and so you send me the creatives, you know, that align with my style. That's fascinating.

[00:33:59] Thomas Dong: Yeah. No, the future definitely looks bright for CDPs and then for end consumers out there. Uh, this vision for the future is, is definitely motivating and hopefully the research will catch up and yeah, like Vijay said, hopefully someday I get a very personalized image, um, and message, um, based off of data that's been captured about me by the companies I do business with.

[00:34:23] Soumyadeb Mitra: Yeah, it's scary too. I mean, and that's almost, like, the opposite of my CDP founder hat, right? I mean, like, at what point does personalization become truly scary? We don't know, but I think, like, till we figure that out, I think CDP companies will.

[00:34:42] Thomas Dong: Well, thank you so much for joining us today. This has been a very fascinating conversation. 

[00:34:50] Vijay Ganesan: From my perspective, I think there's two takeaways that I got. One was this idea that CDPs are no longer just a marketing concern, right? This is a concern for everybody in the organization, and associated with that is that the ownership of this has to be with the data team, you know, no longer with the marketing team.

So this is a centralized, data-team-controlled, managed system that can be used by marketers, but can also be used by customer success and product and analytics tools and a whole bunch of teams in the organization, you know, sort of similar to what we've been saying about the siloed product analytics not cutting it anymore, right?

So, the analytics around the customer is something that should be used by product teams, success teams, sales teams, support teams, marketing teams. And so this idea of centralizing on the data warehouse and making that available for all these teams in the organization makes so much sense. So the second takeaway is around one-on-one personalization with generative AI.

And you know, this dream of every marketer to make personalization custom for every single customer, which has never been possible before because it just doesn't scale and it's very... It just, it's not possible. But now with the generative AI, you can personalize individually to the, even to the point of creatives, right?

The color that you use to send me an email or a message, right? So things like that, which is, uh, which is really fascinating. I think that's going to change the way how marketers interact with their customers. 

[00:36:36] Thomas Dong: Yeah, definitely the art of the possible here is quite amazing, and that was certainly not a use case that I thought about, uh, when it comes to personalization as a marketer, obviously, you know, how far do you go before you start to get creepy, but, um, if generative AI is working off of very real data about my preferences, you know, I think I would appreciate that.

So that definitely was a very fascinating insight from him. Um, many of my takeaways were very similar to yours, Vijay. Um, but one thing that really kind of stuck in my mind is what he said in terms of SaaS being anti Customer 360, right? As marketers, we've been thinking about Customer 360, uh, for decades now, customer centricity being, um, so important.

Um, but kind of 10 years ago, when Martech and SaaS really took off, everybody just scrambled, you know, shadow IT took its course, and we bought, you know, hundreds of different Martech tools to, um, serve different needs in the organization. But, um, all they did was create data silos, right?

And so Customer 360 hasn't been achieved, as Soumyadeb had shared. He estimates 1%. Um, that's a lot lower than I even thought, right? So, um, that's been a lot of hype. Um, but, uh, with the warehouse-native approach that they're taking at RudderStack, and similar to us, that's what we're seeing in this new wave of SaaS, right?

The next evolution of SaaS will be warehouse centric. It's about warehouse native apps on one, uh, single source of truth. And it's, uh, um, certainly encouraging to see, uh, the velocity and volume of vendors taking this approach and, uh, providing, you know, effectively enough critical mass that the data teams can, uh, begin to recommend a portfolio of Warehouse native apps that they can recommend to their line of business.

That concludes today's show. Thank you for joining us and feel free to reach out to Vijay or I on LinkedIn or Twitter with any questions or suggested topics for future episodes. Until next time, goodbye.