Leaders Shaping the Digital Landscape
Nov. 1, 2023

Real-time Data for Generative AI

Retrieval-Augmented Generation (RAG) is an architecture that augments the capabilities of a Large Language Model (LLM) like ChatGPT by adding an information retrieval system that provides the data. But how is it changing the way that data is used across the board?

Well, let's find out during the conversation that host Wade Erickson conducted with William McLane, CTO Cloud of DataStax.

#cloudcomputing #data #technology #chatgpt #liveinterview #podcast

Transcript

Carlos Ponce (00:12):

Good morning, everyone. Welcome to another episode of Tech Leaders Unplugged. Let's get unplugged again today, on this Tuesday, October the 31st. So we're looking forward to Halloween, and today our guest is William McLane, the CTO of Cloud at DataStax. So William, thanks for joining us today on Tech Leaders Unplugged.

William McLane (00:41):

Thanks, Carlos. Happy to be here. I think it's going to be an exciting conversation especially, you know, given we're on Halloween.

Carlos Ponce (00:49):

Absolutely, yes, I'm sure it will be. And of course, as ever, we have our co-host, Wade Erickson, President of Business Development. A lot to hear. Thank you, Wade, for joining us as ever. Thanks for being here again. Alright, so let's get it started. Let's get unplugged. So William, let's start with you. Tell us a little bit about you, you know, your background, where you're coming from, anything you want to share with the audience today. Thank you.

William McLane (01:22):

Sure, yeah, happy to. So I've been in the technology industry for well over 20 years. The majority of my career has actually been in what I would classify as the traditional enterprise integration and communication space. I spent 20-plus years working at TIBCO Software building high-performance, low-latency messaging and communications infrastructure. And where that's actually led me is, you know, the reality of all of this has kind of been building on itself over the last 20 years, where we had situations where we started to look at data differently, specifically with the advent of things like big data and analytics. And I think we're in this next layer of sea change, where we are now at a kind of crossroads where data is becoming even more important for taking that big data concept and moving it into AI, and the rapid growth that we're seeing in the AI industry, both around predictive and generative AI. And how do we handle all of this data, and how do we leverage that data appropriately? You've got a lot of challenges, I think, in front of you with regards to how you do that, and that's really what led me to come to DataStax. So I was asked by a number of colleagues over at DataStax to come help with the building out of the communications infrastructure, and then ultimately oversee the AI cloud infrastructure that we've built to provide that real-time data platform for bringing high-quality, high-performance applications into production that can use generative AI principles. And I think that's one of the challenges that we face with application development today: AI is a really, really great buzzword and there's a lot of things happening in the space, but I think a lot of people are struggling with how do we actually bring this to a production-ready application that can be used and provide value.

Carlos Ponce (03:29):

Thank you so much, William, I appreciate your elaboration. So specifically on the AI part, that's probably something that we're going to keep elaborating on further in a bit, because we're going to be talking about real-time data for generative AI, the topic that was chosen by you. But before we get there, tell us a little bit specifically about DataStax. So what is your value proposition? What do you guys do, essentially?

William McLane (03:59):

Sure. So DataStax is a very interesting company. We've had a number of different incarnations with where our priorities and our core focus are. A lot of our customers and a lot of the people that know DataStax know us as kind of the commercial backing behind Apache Cassandra. And that's where we've spent a large portion of our 15-plus years in the business focused on, which was really the ability to take and expand on the traditional workloads that were being hindered by things like SQL databases and relational types of operations, and expanding those into high-performance NoSQL types of operations with unstructured data, being able to take that next step with large volumes of data and being able to access that data across an application infrastructure for kind of real-time processing. Within the last five years or so, though, what we've really been focused on is how do we build out that data platform to be more than just a NoSQL database, or more than just a database that's providing storage and retrieval. How do we build out the platform that allows you to do real-time data distribution for applications? How do you leverage and bring in integrations with event processing and event-driven architectures and microservices? And how do you treat your data across your enterprise holistically, from storage to generation to distribution? And then most recently, with the advent of AI-based applications and the need for data to augment things like training an LLM and then ultimately provide the foundation for retrieval-augmented generation, we've come in and said, how do we build underneath that platform the ability to take and leverage functionality like vector storage within our large-scale, high-volume, low-latency database to service those applications with vector embeddings and the ability to take and use those within an AI-based process? So that's really what we do.
You know, I like to think of us as more than just a database company, because unfortunately, I think a lot of people think of us as a database company because of our history. But the reality is, we provide a significantly larger platform for being able to take data from the database and from multiple different integration points, leverage that within the application space, and use that for building generative AI functionality.

Carlos Ponce (06:39):

There you go. Thank you so much again, William. You were telling us a little bit about DataStax, and I'm sure there's going to be a bunch of questions coming from either Wade, the audience, or myself. But in the meantime, let's get into elaborating on the topic. As I mentioned, as chosen by you, we're going to be talking about real-time data for generative AI, and we're going to be discussing how retrieval-augmented generation, or RAG, as I think it's pronounced, as you kindly told me at the beginning, is changing how data is used. So let's move on to that part. Why did you choose this particular topic, and why do you feel it is relevant for this day and age?

William McLane (07:23):

Yeah, I think, you know, from my perspective, one of the biggest challenges that we've faced over the last 20 years is, we've talked about data and the importance that data has around building out an application infrastructure. If we scroll back, say, 10 or 15 years, the question was really, do we have the right data at the right time and in the right place to make decisions? And we've kind of solved that by building out architectures that use things like data lakes, being able to coalesce that data, and then ultimately having a data scientist who can go in, review that data, and find the pieces of information that are of relevance. But when it comes to AI, we've now taken all of those same challenges that we've had in the big data space with regards to the wealth of information that we have, and we've put it into the processing space within the application and said, now we want our applications to be reactive and be able to understand not just what's happened in the past, but also what's happening in real time, so that they can make decisions and generate responses based on those decision-making processes. And that's what we really look at with regards to what we bring to the table. You know, there's a lot of ways that you can build out generative AI applications, right? You look at a ChatGPT or some of the things like DALL-E 2, some of the things that OpenAI is building: they're basically taking information, they're vectorizing that data, they're using that information. But one of the biggest challenges that people faced with ChatGPT 3.0 was that the corpus of data that it was using was limited to a timeframe. It was up until 2021. And anything new that had been generated had to then be fed back into the LLM, and the LLM had to be retrained.
And ultimately that's why they put out ChatGPT 4.0 with more functionality and more features. And that's where the real challenge comes in: as we start looking at generative AI processes, there's lots of tools out there that allow you to do things in a seamless fashion, you know, the ChatGPTs, for example. But what do you do when you want to bring in functionality or data that's core to your own business? A great example of that is a company that we work very closely with. One of the things they provide is real-time generation of advertisement campaigns for retail stores, car dealerships, and things like that. And what they do is, they can actually generate scripts. They can generate the actual data that's being used to create the advertisement, and then they use text-to-speech AI to actually create the commercial for the end user. So something that typically would take weeks or months to generate can now be driven based on real-time data that's being brought in. And then, more importantly, the actual physical media that's being generated for things like publication onto radio broadcasts within specific regions can be tailored to their specific needs based on what they're seeing. So an example of that would be, what if a car dealership is saying, hey, we want to increase the sales of SUVs in a specific region? Well, we can provide a promotional commercial for that specific brand or that specific thing, and do so in real time for those specific regions. And that's all being driven by this concept of retrieval-augmented generation: being able to pull the right information and use that information to generate new content on the fly.
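
Editor's sketch: the "pull the right information, then generate" flow described above can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not DataStax's implementation: the bag-of-words `embed` function stands in for a learned embedding model, the `DOCS` list is hypothetical business data, and the final call to an LLM is left out, so the output is only the prompt a model would receive.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical, made-up business data for illustration only.
DOCS = [
    "SUV inventory in the northwest region is high this month",
    "Sedan sales in the southeast region rose last week",
    "The dealership service center closes at 6 pm",
]

prompt = build_prompt("Which region should get an SUV promotion", DOCS, k=1)
print(prompt)
```

The design point is that the model itself is untouched: changing `DOCS` changes what the model sees at question time.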

Carlos Ponce (11:25):

Thanks again, William. Wade, I'm going to pass it on to you and we have some questions, so please, by all means.

Wade Erickson (11:33):

Yeah. So, you know, a lot of the viewers come from a strong tech space of relational databases. [inaudible] You know, a lot of our minds are very much structured around relational databases, columns and rows and those kinds of things. And I know these vector databases that you talk about, and of course big data and all those, have a very different structure. And we talked a little bit about how artificial intelligence is learning in these new structures, probably differently than in relational databases, and how this technology change has maybe landed into AI in a more friendly way than if we didn't really have this new structure of data lakes and stuff. So can you give a brief summary and a description of that difference between relational databases and vector databases, and then how AI benefits from those differences?

William McLane (12:45):

Sure. You know, it's interesting, because one of the pieces that I think AI's bringing to the table is that we are now able to access information in ways that we have a very difficult time comprehending, or at the very least, comprehending in a relatively easy approach. A relational database model is basically tables and columns. That whole concept that we've built within that relational model is really focused around our ability to think about how to structure and organize data, right? That's how our brains work. And even though it's computational, we've still structured that data in a way that allows you to do some level of human cognition around, hey, I've got a key value, or I need to be able to look something up. So I'm going to use this keyword, and that's going to take me to a table, and that table is structured in a way that I can basically organize it. What AI brings to it is that it really just explodes the ability to classify data in a completely different way. And that's really what has happened with this movement that we're seeing in the generative AI space in particular. So the whole concept of being able to take and index data and vectorize that data is really at the core of what is different between a relational database and a vector database. The difference really is that instead of categorizing data across a small number of categories, with vectorization we now have the ability to categorize data across hundreds of thousands of different points, and then we can use those points to hone in on how data is related across multiple different planes. So the concept is really that within a vector database, you can now do functions like vector search, which provides you the ability to look at data in terms of the semantic relationships between different pieces of data. And there's a number of different ways and complexities that you can think about that.
But one of the easiest ways that you can think about how data is structured within a vector database is that there's multiple planes of information. And as you drill through those planes, you become closer to or further away from pieces of data, depending on how they're related. And so a good example of that is, we may have a series of words that are being categorized. Take, for example, the words king and queen, right? If you say, hey, are these two words related with regards to royalty? Well, of course, they're very closely related there. But if you ask, are these two words related based on gender? Well, now they're much further apart. And so you have the ability, with this whole concept of vector databases and vectorization of information, to take and categorize words, pieces of data, content, and objects across hundreds of those different patterns, and then use those patterns to find out which aspects are closely related. And that's really where the big difference lies: instead of looking at a relational database where you're doing things like key-value pairing or relating one piece of information to some other tiny piece of information, we now have the ability to store these relationships mathematically and index them, and then use that math to find how they're related across multiple different planes of existence. And so you can drill in and dive deeper or farther apart, depending on what you're looking for.
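
Editor's sketch: the king and queen example can be made concrete with a toy calculation. The three-number vectors below are hand-made assumptions for illustration; real embeddings have hundreds or thousands of learned dimensions that do not carry readable labels like "royalty" or "gender". The sketch shows two words sitting close together on one axis, far apart on another, and still being closer overall than an unrelated word.

```python
import math

# Hypothetical axes and hand-assigned values, for illustration only.
AXES = ("royalty", "gender", "edible")
WORDS = {
    "king": (0.9, 0.8, 0.0),
    "queen": (0.9, -0.8, 0.0),
    "apple": (0.0, 0.0, 1.0),
}


def axis_gap(word_a, word_b, axis):
    """Distance between two words along a single named axis."""
    i = AXES.index(axis)
    return abs(WORDS[word_a][i] - WORDS[word_b][i])


def cosine(a, b):
    """Overall similarity across all axes at once."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print("royalty gap:", axis_gap("king", "queen", "royalty"))
print("gender gap:", axis_gap("king", "queen", "gender"))
print("king vs queen:", cosine(WORDS["king"], WORDS["queen"]))
print("king vs apple:", cosine(WORDS["king"], WORDS["apple"]))
```

A vector search ranks stored items by a similarity measure such as this cosine score, which is the "math" referred to above.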

Wade Erickson (16:48):

That's great, 'cause that helped me bring it back. 'Cause I think the first time I started dealing with these semantic spaces and vectorized data was back in the late nineties with resume databases and job description databases. So we brought them into, basically, something like a monster.com. And this was even before Google was working with this. I was working with a company actually here in Texas back in about '98 or '97, I guess it was, and with a term like the word Java back then, it was, is that a barista or is that a Java programmer, right? And so we looked at HTML and other things in the proximity of other terms. So when you threw the job description at it, it would look at those words and vectorize the job description, and then apply that to the resume bank, so that if you threw the word Java in there, you'd at least bring back programmers and not people that are chefs and baristas. And so that was always an analogy that helped me to look at relationships of words. And I guess it's just, you know, almost 30 years later, a refinement of that, and then doing it at a much larger level. Tell me a little bit about the performance of these systems. I know that they're really optimized for retrieval. Tell me about the ingest process. How does data get in? Is it using JSON structures or straight text? How do you ingest data in most of these kinds of structures?

William McLane (18:27):

Yeah, so that's actually a really good question. Ingestion has been one of the big pieces that is now coming into play, and it's what something like RAG is really focused on. Initially, a lot of things like ChatGPT were doing web crawling and pulling data from the internet, so it's basically unstructured, text-based operations. JSON is definitely a format that you can use to bring data in, but it's the same problem that we had with integration across a large enterprise, right? We have tons of different pieces of data that are in multiple different formats. And so what RAG capabilities really provide is this concept of saying, listen, instead of having to take and normalize all your data into a single common format, we're going to take all of these different data structures and bring them into the process, and we'll vectorize and index those in real time, as they're being written into the backend vector database. For us, it's Astra DB; we have a cloud-based solution that can do this. We can actually do that indexing in real time. So whether you're using open source tools like LangChain or LlamaIndex, or you're trying to bring data in from an event-driven architecture where you're using something like Kafka or Pulsar, you can use those applications to feed into your data platform, which can ultimately then take that data and integrate it into your AI processing so that LLMs can leverage it. You know, a good example of that, and kind of to your point about the semantic search layer: say I'm building out an e-commerce platform, and I want to look at the clickstreams that are happening with regards to my different customers.
And so a customer comes in and says, hey, I'm interested in tennis balls. And the reality, right, is that the very first thing you look at there is, okay, well, what are the things around tennis balls that I would like to help promote? It's tennis rackets, sneakers, there's all this stuff around it. Maybe I'm interested in saying, hey, have you checked out pickleball? But what if that person actually has a history that gives you more relevant information that you can pull from to provide a closer match to what they're looking for? Specifically something like, well, what if they're actually an avid dog lover, and they have never shown any interest in tennis? They're actually looking for dog toys. And so now, with that semantic search, if I bring in this real-time aspect of knowing who my customer is and actually use that as a vector within my process, I can start to fine-tune what's of interest to them. Or even another side of that coin, right? What if it's not a dog lover? What if it's somebody that uses a walker, and they need tennis balls to put on the bottom of their walker? You now have the ability with this generative AI processing to really hone in on what a customer's using. But more importantly, if you take that to the next step, let's say I'm a manufacturer that builds coffee tables. Now, with generative AI, I can actually have an interaction with my customer and take a natural language processing type of approach to asking, what are you looking for? And you can actually create images of, say, a coffee table as they're telling you the different aspects of it, and fine-tune that to their specific desires.
So, you know, I might show them a coffee table that's got squared edges, and they say, well, actually I'd like to have a table that has round edges, and I can redefine and populate images for them and show them exactly what we can provide, and actually have that created for them once they've decided on an architecture or a perspective that they think is what they're looking for.
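
Editor's sketch: the ingestion and tennis-ball examples above can be combined into one small illustration. It is a sketch under stated assumptions rather than how Astra DB, LangChain, or LlamaIndex work internally: `VectorIndex` is a hypothetical in-memory class, the word-count `embed` function stands in for a real embedding model, and the catalog entries and customer histories are made up. Records arrive as plain text, a JSON string, and a dict, each is vectorized as it is written, and the same "tennis balls" query returns different products once the customer's history is blended into the query vector.

```python
import json
import math
from collections import Counter


def to_text(record):
    """Flatten a JSON string, a dict, or plain text into one text string."""
    if isinstance(record, str):
        try:
            record = json.loads(record)
        except json.JSONDecodeError:
            return record  # not JSON, so treat it as plain text
    if isinstance(record, dict):
        return " ".join(str(value) for value in record.values())
    return str(record)


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorIndex:
    """Hypothetical in-memory index: vectorizes each record at write time."""

    def __init__(self):
        self.rows = []

    def ingest(self, record):
        text = to_text(record)
        self.rows.append((text, embed(text)))

    def search(self, query_vector, k=1):
        ranked = sorted(
            self.rows, key=lambda row: cosine(query_vector, row[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


index = VectorIndex()
index.ingest("tennis racket and tennis balls bundle")                # plain text
index.ingest('{"name": "dog fetch toy", "detail": "tennis balls"}')  # JSON string
index.ingest({"name": "walker glide", "detail": "tennis balls for walker legs"})

query = embed("tennis balls")
print(index.search(query))                           # no customer context
print(index.search(query + embed("dog toy fetch")))  # dog lover's history
print(index.search(query + embed("walker legs")))    # walker user's history
```

Adding the history vector to the query vector is only one simple way to use customer context; weighting the two parts differently, or filtering on metadata, would be equally reasonable choices.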

Wade Erickson (22:43):

That sounds good. Great. One last thing: I'd like to pivot a little bit and just talk about you and some of your background. A lot of folks that watch this show are, you know, either middle managers or senior managers, and often their next career path move is into the C-suite. Tell me a little bit about your journey as a CTO. You know, a lot of folks will tell you everybody's journey is different, that there's no one size fits all in the C-suite journey, but tell me a little bit about yours and, looking back, what were some of the pivotal points in your career or relationships that helped you to get there.

William McLane (23:26):

Yeah, I think the statement you made about everybody's journey being different is 100% accurate. For me, when I started my career, I was a software developer and engineer. I worked on a product called TIBCO SmartSockets. I worked on a multicast protocol called PGM, and I truly loved all the engineering work that I had done over the years. For me, as my career path went on, I just got asked to do more and more of the actual business management piece of things. So I got asked to step into product management. I spent a lot of time working with engineering and learning the different aspects of the business: engineering, marketing, sales, the leadership side of it. That ultimately just led me down this path of learning different aspects of the business and then applying those to provide cross-communication and functionality across them. And that's really what led me, I think, into the C-suite. I guess the guidance I would give to people that are earlier in their career is one of the things I learned very early on. It's an old adage, but I think there's a lot of truth to it: there's nothing that I would ever ask anybody to do that I wouldn't do myself. To be a true leader, you have to be willing to sometimes jump down into the trenches. That being said, as a leader, you also have to make sure that you keep that focus and drive and make sure that the trench is going in the path that it needs to go. So in my career as a CTO, sometimes the engineering and product side needs to take priority. Sometimes it's marketing, sometimes it's sales, and whatever needs the attention at the time, you've got to be willing to jump in and solve those problems.

Wade Erickson (25:24):

Great. Yeah. That's what I always tell everybody too: do not turn down the dirty work, because that's where you often find learning opportunities to figure out the problems that others might not be able to solve. And it's okay to be a mule. You don't have to be a thoroughbred all the time.

William McLane (25:43):

Well, yeah. The only other thing I would throw out there too is, I've had a lot of really, really good mentors over my career, but there was one thing that I think is very true for all of the different leaders that I've worked with, and that is that they try to surround themselves with people that are smarter than them. I have always, in my entire life, tried to surround myself with people that are smarter than me. You know, my kids tell me every day that they're smarter than me, so that's easy to do at home. But in the workplace, it's a humbling experience, and you will have so much more success if you're willing to accept that there's always going to be people that are smarter than you, and that their perspectives, and how they can challenge you, are always going to lead to a better end result.

Wade Erickson (26:40):

Carlos?

Carlos Ponce (26:41):

Yes, absolutely. Thank you, Wade. William, we're coming up on time, but there's one thing, actually, that I wanted to ask. I understand that, how can I say this, the training part in the RAG model is not necessarily, you know, cheap, right? So I understand there's a price tag in there. Why is this? I mean, is there like a scarcity of trainers, or what's going on? And I'm asking this, as I mentioned at the beginning, from the master perspective. I'm interested in this, or in case any viewer is interested in this: why does the training cost consideration come in, and why is it like that?

William McLane (27:28):

Yeah. So you broke up a little bit there, but I think what you were asking is, why is there such a high cost to training within the AI process?

Carlos Ponce (27:37):

Yes.

William McLane (27:39):

Yeah. So basically, if you really start looking at the whole AI piece, and more importantly the generative piece, the work that goes into validating and creating objects is really a pretty heavyweight piece. One of the examples that I give, right, is this, and we seem to have started to solve this problem: let's say I say to a generative AI process that I want to create a picture of a dog, but that dog looks like a blueberry muffin. If you go and search online, there's this whole thing about, can you distinguish between a blueberry muffin and a chihuahua's face? To make that generative process such that it actually will create quality information takes a lot of repetition, it takes a lot of time, it takes a lot of processing. And so what RAG is really bringing to the table is this: we can think about how we train our models and use the training to create more of a generic approach to solving some of these problems, and then we use data as the driver for how those processes are being updated, so that you don't have to ultimately retrain all the time. That is where the value proposition comes in. You know, the example I gave in the beginning, with ChatGPT 3.0 and ChatGPT 4.0, right? One of the biggest reasons why it takes so much time to release something like a ChatGPT 4 is because the corpus of data that we have to analyze is finite. ChatGPT 3 went to 2021; with ChatGPT 4.0, we're bringing in more of the relevant data from now. And that's what RAG is really trying to do: it's bringing your private data, it's bringing all of the relevant data in, so that that training and those models can be used to leverage that data without having to go off and retrain them over and over again.

Carlos Ponce (29:40):

I see. Thank you so much, William. So unfortunately we're out of time right now. And by the way, just in case anyone watching wants to get a hold of you and get in contact, you have this email address, which is [inaudible], and then of course you can reach out to DataStax; you should go to the website and it's all going to be in there. No further directions are needed, I suppose. Wade and William, it's been a great pleasure to be here again, you guys. Before we go, we have an announcement to make, and that is about tomorrow's guest right here on Tech Leaders Unplugged. So tomorrow, right here, we're going to be speaking with [inaudible], senior software engineer at Qualtrics. The topic is going to be scalable enterprise features and powering big brands with innovative solutions. That's what we have right here on Tech Leaders Unplugged. So join us tomorrow as ever, 9:30 AM Pacific, and just be here or be square. Thank you so much. Thank you, William. See you next time.

 

William McLane

CTO Cloud

William McLane has a diverse work experience spanning over two decades. William is currently the CTO Streaming at DataStax, beginning their role in September 2022. Prior to that, they worked at TIBCO Software Inc. as a Product Evangelist from May 2018 to September 2022. Before joining TIBCO Software Inc., they served as a Software Engineering Manager at IDEAL INDUSTRIES, INC from September 2016 to May 2018.

From 2002 to 2016, William McLane worked at TIBCO Software in various roles. William started as a Senior Software Engineer in January 2002 and progressed to become a Senior Product Architect, Messaging from May 2008 to June 2012. Later, they held the position of Director of Product Architecture, Messaging from June 2012 to August 2016.

Prior to their tenure at TIBCO Software, William McLane was a Member of Technical Staff at TALARIAN CORPORATION from May 2000 to January 2002. Throughout their career, they have developed expertise in software engineering, product architecture, and technical leadership roles.

William McLane attended Concordia University Chicago from 1996 to 2000, where they pursued a degree in Computer Science. William's field of study included Computer Science and Communications/Theatre.