Practicing Multimodal AI

Transcript

Trond Arne Undheim, Host: [00:00:00] Futurized goes beneath the trends to track the underlying forces of disruption in tech, policy, business models, social dynamics, and the environment. I'm your host, Trond Arne Undheim, futurist and author. In episode 96 of the podcast, the topic is practicing multimodal AI. Our guest is Slater Victoroff, CTO and founder of Indico,

[00:00:26] the enterprise AI startup. In this conversation, we talk about how Slater was picking trash off of the Wellesley dump for school engineering projects, how he loves Chinese fantasy fiction, xianxia, with its immortal heroes, his experience at Techstars founding a startup, and how startups beat juggernauts like IBM that spend billions of dollars.

[00:00:49] We discuss how his company, Indico, practices multimodal AI, a set of blended techniques that make data sync. We muse about the future, where citizen data scientists contribute to better problem framing driven by subject matter experts. Slater, how are you today?

[00:01:08] Slater Victoroff, Indico: [00:01:08] I'm doing really well. Thanks so much for having me here.

[00:01:11] Oh,

[00:01:11] Trond Arne Undheim, Host: [00:01:11] I'm so excited to have you. It's not every day you've got a real prodigy on. I have been looking into your background, and we also spoke earlier. You must've been an interesting student to have in school.

[00:01:23] Slater Victoroff, Indico: [00:01:23] I'm sure I was, to say the least. I think different teachers responded to it differently, and some I'm still wonderful friends with all these years later.

[00:01:31] And so I'm not,

[00:01:36] Trond Arne Undheim, Host: [00:01:36] Yeah, that makes a lot of sense. In all seriousness, it's a blessing and a curse, because no one really begs for being different, even in a positive way, arguably. Right?

[00:01:48] Slater Victoroff, Indico: [00:01:48] I mean, we can argue back and forth on whether or not it's a positive way. Certainly, given my state, traditional education was always a bit tough, but I think the positive that it really brought across was that it gave me a really clear sense of how differently different people can learn.

[00:02:05] Granted, it meant that a lot of high school was maybe not perfectly fit to my style, but it meant that when I found a school that really spoke to me for college, it was that much clearer to me how different it was from a traditional education.

[00:02:18] Trond Arne Undheim, Host: [00:02:18] So we didn't talk about this, so I may be in deep water here, but you went to North Hollywood highly gifted high school. I'm guessing it was different from the 90210 high school experience from watching old TV shows.

[00:02:31] Slater Victoroff, Indico: [00:02:31] Yes. Very different. Probably the first thing to clarify for people that don't know about LA: when people think Hollywood, people sometimes think of a very glitzy, fancy place. People from LA I never have to explain this to, because North Hollywood is very much not that. North Hollywood is a lot more famous for its gang violence than for anything upscale.

[00:02:53] No, it was a strange high school, to be sure. It was very interesting, and I'm extremely thankful for it. I think the highly gifted magnet is a really great program. It's been under a bit of siege within LAUSD in recent years. My graduating class, actually, if you can believe it, at a public school, was 42 students.

[00:03:09] I dunno, I always just considered it special ed, frankly. But we did have more than our fair share of academic awards.

[00:03:19] Trond Arne Undheim, Host: [00:03:19] Slater, then you went on to Olin College, which is in my neighborhood here in Wellesley. And that's another very special school in the sense that they follow a quite different curriculum.

[00:03:29] How was that experience for you?

[00:03:32] Slater Victoroff, Indico: [00:03:32] It was wonderful. I think the best way that I could sum it up is when I was out looking at colleges, I had the implicit assumption that no matter where I went I would not go to class very often. I would show up for tests and hopefully I would be able to get good grades.

[00:03:47] And so that was actually something that I was actively screening for in the schools. Do I have to show up to every class? When I showed up at Olin, though, they were the first school that ever convinced me that I had it wrong. And for those of you that aren't aware, which is probably pretty much everyone, because Olin's very small and very new, Olin is a totally engineering education and it's entirely project-based.

[00:04:07] And what that meant is that going to class wasn't about hearing a kind of half-baked lecture where I was going to have to reread the notes after class anyway. It was much more about having active help on projects that we were working on. These were actual applications of the theory that you had learned earlier in the week.

[00:04:23] And the thing that I just really loved is that there is no hiding. I had gotten pretty good at taking tests over the years, and I always felt bad that, especially if it's a multiple choice test, pure strategy can just do so much for a test like that. But, as I always say about engineers, right?

[00:04:39] There's no partial credit if the bridge falls down. Even though at Olin they probably would give you partial credit for trying something so ambitious. The idea is, when you actually have to make something work, you really have to push yourself to a much deeper level than you would otherwise.

[00:04:52] And I think a lot of schools are afraid to put that much pressure on students. Not for terrible reasons either. But for me it was really effective. I actually withdrew my application to every other school after I visited Olin for the first time. So it was the only school

[00:05:06] I applied to for college.

[00:05:08] Trond Arne Undheim, Host: [00:05:08] I wanted to spend a little time on your background because, well, we will very shortly move on to various parts of this AI picture. But I do think that your background is interesting to consider. I know you're also very much into Chinese fantasy fiction, where immortal heroes are battling, and I'm wondering about that kind of fictional universe. Was that sort of an escape that just provides a very rich environment, quite different from the everyday that you were faced with? I'm not trying to psychologize you, but there's something cool about just jumping into fiction really deeply.

[00:05:43]Tell us about that.

[00:05:44] Slater Victoroff, Indico: [00:05:44] I will say, have you ever heard of The Hero with a Thousand Faces? So this is a book by Joseph Campbell, and it's one of the first books where he outlines his hero's journey. And there's a quote that he has pretty early on in the book that, the first time I read it, I thought it was the most grandiose, absurd thing that I ever heard.

[00:06:03] And as time went on, I've learned to appreciate more and more the gravity of these words. It's, fairly roughly: the story is the portal through which the boundless energy of the cosmos pours into human society. And again, just on the face of it, and I'm probably butchering the quote a bit.

[00:06:22] You can look it up. It's not too far away from that. It just sounds so grandiose. How could this possibly be true? But what he goes on to describe, and I think people kind of rail against this unconsciously, but actually embracing it has been really effective for me, is that fiction allows you to go outside of yourself to solve problems that you're too close to as an individual.

[00:06:44] And it was something that I had never really considered. Or, if I ever thought about fiction like that, it was as me not being strong enough to get over that hump. I had never thought about it as, fiction is almost this force multiplier for your efforts, whereby getting outside of your mind a little bit, you can get a new perspective on a problem.

[00:07:01] And I just thought that was really powerful. I happen to find xianxia, which is these Chinese immortal fantasy books, really compelling, for a couple of reasons. But I think one of the ones that is most strange from a Western perspective is that these books are really heavily intertwined with religion.

[00:07:21] And one of the things that is near universal, actually, is that the line between humanity and divinity is extremely blurry and porous. And the assumption is that, if you're good enough at being a person, at some point you gain all these kinds of special abilities, which I think is a fun idea. But I think just that blurry line is such an evil concept from, I think, a traditional Abrahamic background.

[00:07:45] So I really appreciate the take.

[00:07:50] Well,

[00:07:50] Trond Arne Undheim, Host: [00:07:50] look, you provided me with a nice segue, because whether there is a massive schism between humans and the divine is arguably a little up for grabs in at least some people's fear about AI. So let's fast forward to present day. You're with the company you've founded, Indico. You've just raised $22 million, and you're involved in

[00:08:12] what you guys call intelligent process automation, which I'm taking to be a version of what people have called RPA, or robotic process automation, which is an interesting field in and of itself. But you have a little take on this, and you're definitely involved in multimodal systems. And I wanted to get us into this discussion about what's really going on in technology

[00:08:34] these days. I wanted you to maybe start us off a little bit by getting into this dichotomy between supervised and unsupervised learning and the data spectrum that we are faced with these days. Because it's not just algorithms sitting on an academic shelf anymore. This is real world data with real world consequences, and it brings us into this universe.

[00:08:59] Slater Victoroff, Indico: [00:08:59] Yeah. And I really loved the lead-in with RPA, because I think that's also an interesting lens that is brought to this whole problem. And again, for folks that don't know, I think the bounds around RPA are fuzzy in a lot of cases, and people might have different definitions, but traditionally I think of this as robotic process automation, and that's to say you're automating robotic processes, which is pretty reasonable, but people often don't break it up that way.

[00:09:21] And that's to say processes that can be really easily broken down into: pick this up, put this down, copy a value from here, paste it there. And what RPA has really exposed is that there are a huge number of use cases that fit really well into that mentality. It's a perfect fit.

[00:09:36] And what they tried to do at first, actually, was make AI fit into that kind of a bubble, that kind of a construction. And that's where you get a lot of these black box conceptions of AI, which is: it's a closed off API, and it's got an input and it's got an output, and I can't change it,

[00:09:51] and I have no idea what's going on inside of it. And I think that it's made to fit into that RPA mentality, right? It's an activity like any other: you can copy something, you can paste something, you can extract the values of an invoice. But I think what's really interesting is, and this is both true about intelligent process automation

[00:10:08] and I think AI more broadly, is that it turns out that with these very, what I'm calling myopic, problem framings, I think we're really starting to run into the inherent limits of those framings. And I think on the AI side, you see this in everything from a basic binary classification of sentiment, but you see it reflected in RPA as well.

[00:10:27] This idea that you've got these very neatly bundled outputs of every single model that you create. And certainly, those are powerful abstractions, right? Everything is classification and regression. That's a powerful abstraction. That's gotten us a very

[00:10:42] long way in the ML space, but I actually think that it's getting to the end of its life. And I think that these simple problem framings are turning really into shackles more than anything that's going to push us forward. And we can dig further into multimodal and that data spectrum, because I think those are some really interesting talking points here, but maybe before we double click on those, I think one of the really broad, interesting shifts in this space is, did I mention at all the Clever Hans effect to you

[00:11:12] when we talked earlier?

[00:11:13] Trond Arne Undheim, Host: [00:11:13] No.

[00:11:14] Slater Victoroff, Indico: [00:11:14] So it's a kind of nascent term for what I'll dub incredible intelligence and stunning ignorance in these ML models. In AI we've got this paradox where, on every problem that we can crisply construct and measure the output on, it looks like these models are performing as well as humans.

[00:11:30] At the same time, if we shift anything about it just slightly, it's almost child's play to get these same models to exhibit really stunning levels of ignorance. And I didn't come up with this term. Someone else has dubbed this the Clever Hans effect. And have you ever heard of Clever Hans?

[00:11:48] Trond Arne Undheim, Host: [00:11:48] I haven't myself.

[00:11:52] Slater Victoroff, Indico: [00:11:52] So I think it's an American story, actually. And it's from an old, like I say, I'm not a hundred percent sure where it has come from, but it was a real horse in the 1800s. And the idea is that this horse could do math. And it was in one of these old traveling medicine shows where people would come to see the horse that could do math.

[00:12:10] And the way it would work is they would have this chalkboard set up and they would write down a very basic addition, subtraction, multiplication kind of problem. And the horse would then stamp its hoof until it got to the right answer. And somehow it was always right. And this was as if

[00:12:25] Trond Arne Undheim, Host: [00:12:25] This is ringing a bell now.

[00:12:26] Slater Victoroff, Indico: [00:12:26] Yep. Yep. And it turns out,

[00:12:28] Trond Arne Undheim, Host: [00:12:28] it's funny, it's a funny situation.

[00:12:32] Slater Victoroff, Indico: [00:12:32] And so they did a lot of testing, just to figure out, okay, how on earth is this happening? And no one was really aware. Everyone thought this was almost legitimate, but what it turns out is that the only thing that the horse had to do, actually, was read its handler.

[00:12:44] And so, because of the way they were measuring it, where it was always stamping its hoof until it had to stop, it was just reading the handler to see when the handler was tense, and then, stamp, stamp, ah, it finally hit the right number. It looked for that relief in the handler's body, again completely unconscious, and gets the right answer every single time.

[00:13:01] And the argument goes, there's a close analogy for AI, where often we can get the right answer through the wrong path. That then begs this question of, okay, how do we actually force this AI, how do we actively encourage it, to follow the right path?
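
[Editor's note: the "right answer through the wrong path" point can be sketched in a few lines of Python. This is a toy illustration only, not anything discussed in the episode or used at Indico. The data, the two features, and the helper names `fit_best_single_feature` and `accuracy` are all made up for the example. The "model" simply picks whichever single feature agrees with the label most often in training, so it latches onto the handler's cue and then falls to chance once the cue is taken away.]

```python
# Toy illustration of the Clever Hans effect (shortcut learning).
# All data and function names are hypothetical, chosen only for this sketch.

def fit_best_single_feature(rows, labels):
    """Return the index of the feature that matches the label most often."""
    def agreement(index):
        return sum(row[index] == label for row, label in zip(rows, labels))
    return max(range(len(rows[0])), key=agreement)


def accuracy(feature_index, rows, labels):
    """Fraction of examples where the chosen feature equals the label."""
    hits = sum(row[feature_index] == label for row, label in zip(rows, labels))
    return hits / len(labels)


labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Feature 0 is the "real" signal (the arithmetic); it is wrong on 2 of 10 rows.
# Feature 1 is the handler's cue; in training it always gives the answer away.
train_rows = [(1 - y if i % 5 == 0 else y, y) for i, y in enumerate(labels)]

# At test time the handler is hidden: the cue is stuck at 0,
# while the real signal is perfectly clean.
test_rows = [(y, 0) for y in labels]

chosen = fit_best_single_feature(train_rows, labels)
train_acc = accuracy(chosen, train_rows, labels)
test_acc = accuracy(chosen, test_rows, labels)
print(chosen, train_acc, test_acc)  # prints: 1 1.0 0.5
```

[The learner scores perfectly in training by reading the cue (feature 1) and drops to 50% once the cue is removed, even though the real signal (feature 0) would have scored 100% on the same test rows.]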

[00:13:22] Trond Arne Undheim, Host: [00:13:22] Look, I'm interested. This is fascinating. The Clever Hans idea is fascinating. And certainly, in today's AI, I find there is a lot of both. There is a lot of clear hype, and then there's a lot of enormous skepticism: oh, because there is hype, there is nothing here.

[00:13:42] And there are people saying this is just math. Like you were onto, this is just regression. And you could be impressed or not by regression. At the end of the day, is there really nothing behind here? Is this just some pure statistics with some extra bells and whistles on it?

[00:13:57] And where's the real mass, where's the real magic in here? And if there is magic, some people would say, arguably, that magic is a black box and we can't have that. And there's a lot of debate around that, because whether it is magic or not, if it is a black box and if it does turn bad or biased, that is arguably more and more of a problem.

[00:14:20] Absolutely. What do you have to say for that? And also as we get into more multimodal data sets as well, where you're taking in an enormous array of data. You're not just comparing apples and apples that have been classified as various colors, just to simplify here, but you're taking in pictures, you're taking in things that are analyzed in various ways, and then you want them combined to do pretty amazing things.

[00:14:46] It wouldn't take much of a critical mind to think this is going to have to fall into some sort of system that we can understand. Otherwise, where is this taking us?

[00:14:57] Slater Victoroff, Indico: [00:14:57] Absolutely. And I agree a hundred percent with everything you've said. So my first thing is, I don't believe in magic.

[00:15:05] Very much. In AI, and I deal with some pretty sophisticated, modern deep learning techniques in the grand scheme of things, let me just say right out: there is absolutely no magic here. It is just statistics at the end of the day.

[00:15:18] Now, granted, I don't want that to detract from the real impact of these new techniques. My view very much is that all academic inquiry is almost out of necessity incremental, right? But making real incremental improvement is incredibly difficult to do. And when someone has made a real incremental improvement, and that's what I would say modern deep learning techniques are, right?

[00:15:44] It was this real, very significant increment made in the machine learning space. We didn't create it. These techniques existed, but it was a critical mass of almost tips and tricks that we accumulated to actually get the damn things to work. And so I find it really difficult.

[00:15:59] And again, as a deep learning guy myself, I'm really rankled, almost, at this pitch that so many people have, which is that deep learning is this panacea, it's going to walk your dog and clean your car. I very much don't believe that. So what I would say is, I also think that explainability,

[00:16:16] no, I think explainability is a very complicated term, right? I think different people mean very different things by explainability. I think that the most important aspect, though, to recognize about AI is that it is only as useful as our control interface to it is. To that point, if it's magic, if we can't control it, if we can't push it left or right or correct it, it is not a useful thing. And I think that this is where there's actually an analogy that people have for AI implicitly that's simplistic, maybe even harmful, which is this notion of AI as an autonomous, conscious thing with its own drives and desires.

[00:16:56] They'll frame things as: does the AI understand? And that's just not how it works. At the end of the day, it is a set of tools that we have to deploy in very specific ways. And where I think this touches really interestingly on multimodal learning, and where I also think that we're pushing on some of the boundaries of current

[00:17:13] problem framings, is that we are artificially giving our models too simplistic tasks. For a long time, we weren't able to solve these simplistic tasks. We weren't able to solve these really niche point tasks, but now we can. Unfortunately, we're now looking at these very niche point tasks as symbols of broader intelligence.

[00:17:35] And there's no accuracy level you can hit on sentiment analysis that is a significant advancement in this space. I don't care if it is a hundred percent. There is no a hundred percent. We have to evolve beyond the simple construction of sentiment analysis, or these simple classification constructions.

[00:17:54] Trond Arne Undheim, Host: [00:17:54] I want to dig a little deeper on some of these things in a second, but first, another thing that is puzzling in this debate, and I think Indico is, I guess, an example of it too: how is it, if it isn't magic in the colloquial sense, that a smallish startup, and you as an example, there are other startups obviously in AI, can manage to beat these massive juggernauts?

[00:18:22] Whether it's a famous IBM, famous because it's easy to beat on IBM, because they're so big and have been around, so it's like a fun thing to do. But in all seriousness, how is this even possible? Is it the more clever use of existing tools? Is it some clever access to data that corporations or otherwise have given to you?

[00:18:44] Or is it that there are still some tricks of the trade that do require a little bit of feel for the game, or something that you just have to stumble upon, that isn't just building block upon building block? I would assume, and again, I'm embellishing it a little bit, here's IBM, they've been doing all of this.

[00:19:03] They have thousands of people engaged in this activity. Why is it that they can't build rock upon rock and then just blow you guys out of the ballpark any day they want?

[00:19:14] Slater Victoroff, Indico: [00:19:14] It's an absolutely great question. And I think I'm going to start by saying that, yes, Indico has done this, but to your point, we're far from the only ones, right?

[00:19:22] You look at Clarifai, right? You look at alien, and everywhere you look the world is dotted with these really impressive companies doing real significant work in the ML space. And to your point, as an outsider looking at this, you're like, okay, Google has invested billions of dollars into an industry research lab.

[00:19:40] And how is it possible that Joe Schmoe in his bedroom, and some of these people are literally Joe Schmoe in their bedroom, right? That's how we started in this space. We were Joe Schmoe in a dorm room. How can they actually contribute to this space? And I think that the key is that people, because they don't spend a lot of time in academia, misunderstand the nature of academic exploration.

[00:20:04] I think people often have this notion, sometimes I joke it's the old man jumping out of a bathtub shouting Eureka, that there's some fixed list of problems, and then you're going to check them off, and then somehow that's everything. But especially in AI, especially in a space that is very early on, research actually does not work like that. Basically every paper that you put out is going to ask two more questions. And you've got this interesting thing where the frontier of research is expanding outward broadly in all directions.

[00:20:31] And that means the surface area that you can devote yourself to, to make real significant improvements, is also increasing. And I think that's something that a lot of people don't realize. I actually often like to tell companies there is no better time for a startup that is trying to make a real technical advantage to start their ML journey than today.

[00:20:48] And I will also say that the other thing that allows these small companies to have so much success relative to investment is that the problem is fundamentally a system level problem. If you ask what these massive companies are really good at, it's where it is one block ahead of the other, and you've got a very linear path to improvement.

[00:21:07] They can execute like crazy on that. And I don't care whether it's Google or IBM or Microsoft or Amazon, they are going to win. If I had to compete on the economics of marginal S3 capacity, not a chance. But so much of AI today is not just about model architecture, right?

[00:21:24] It's: how are you going to supervise it? How are you going to actually get the infrastructure working behind the scenes? And the ugly truth is, frankly, that academics hate to work on these problems, and with academics today, you can't get an academic to work on something that they don't want to work on.

[00:21:39] And so it means that there's this very big blind spot where they're always going to have the best architectures, but even within their own organizations, believe it or not, it's very easy for a small company like ours to look at research that their labs are publishing, and because of their internal processes, we may have it in production before they do, maybe even months or years before they do.

[00:21:58] Which again, I think it really goes to say a lot more about the space than anything.

[00:22:06] Trond Arne Undheim, Host: [00:22:06] It was so interesting to me, unless I misunderstand you, but you are characterizing big tech companies in the same vein as academic labs, in the sense that you're saying they're basically burdened with the same kinds of problems because of their size and their interest in the generic nature of the problem, so that

[00:22:28] anybody really from the outside who can pick up the pieces and execute faster and more creatively has a shot right now, at this very moment. Perhaps, though, it is a temporary advantage, right?

[00:22:40] Slater Victoroff, Indico: [00:22:40] A hundred percent. In fact, I'd go a step further, and I also will say, I think this is unique to AI as a space. Most academic spaces

[00:22:50] I don't think work like this, but there is so much open publishing and the space moves so quickly that the era of trying to have some proprietary advantage that you guard, it's gone, right? Because the duration of a state of the art algorithm today is months. And again, companies' lifetimes are measured in years and decades.

[00:23:09] If you think you've got a proprietary advantage, either you're mistaken and you're missing some research that someone else out there is doing, or you have six months to live.

[00:23:19] Trond Arne Undheim, Host: [00:23:19] And then, Slater, what might be the reason that so many of these startups are allowed to exist for quite a long time?

[00:23:25] Is that because this only serves the big tech ecosystem, to have all these startups out there that are not really a threat, because you could pick them off the ground anytime you want? And it's just that you need all that variety to make this sort of even incremental progress that I guess many of us are interested in,

[00:23:43] and then some of us fear, right? This sort of progress that would take us somewhere, which we'll get to in a second, somewhere really far beyond where we are right now, which I guess is a little bit our discussion as well. Where are we at this moment in AI? And are we heading into a winter or summer or spring or whatever season you might fancy?

[00:24:03] Slater Victoroff, Indico: [00:24:03] No, it's a great question. And I think honestly your initial take is pretty spot on. I think that for all of the progress that has been made, and this is something where, I don't know, either I've gotten more cynical or more realistic over the years, but I used to think in individual years. Now I think in terms of decades, right?

[00:24:20] So the adoption is certainly going very well, but we're still in the first decade of adoption of most of these techniques. And I think it is exactly that. I think that Google, and anyone that really understands this space, realizes that there's too much ground to cover for individuals.

[00:24:34] And I think certainly their plan is they're just going to look at all of these seeds and they're going to see what pops up. I think one thing that's also really interesting is when you look at a lot of the ecosystem players like Nvidia. Nvidia has a 50% higher market cap than Intel today.

[00:24:50] I was shocked. I did not believe that.

[00:24:52] Trond Arne Undheim, Host: [00:24:52] Yeah, it's an extremely interesting company that would take an entire podcast to explore.

[00:24:59] Slater Victoroff, Indico: [00:24:59] No. So they have one technique in particular where they invest heavily into the ecosystem, and they don't want anyone to be acquired, and they just want to sell GPUs.

[00:25:08] Trond Arne Undheim, Host: [00:25:08] In fact, I've been looking at their startup lab, basically. I can't remember what it's called right now, but it's gotten quite some attention because they literally have this ecosystem approach to startups. As long as you are interested in broadly working in the direction of, or not even, using their platform. But everybody at some point will be using their platform.

[00:25:29] So they just want a thousand flowers to bloom, essentially.

[00:25:33] Slater Victoroff, Indico: [00:25:33] And Nvidia has really done a lot for the community. I think there's a couple of huge players that I think of, where without them this AI renaissance would not have happened. And Nvidia is absolutely one of them. In fact, Nvidia is probably the only one that managed to successfully capitalize on their position.

[00:25:52] I think Kaggle, for instance, is another, but Nvidia actually made an incredibly successful business and continues to, whereas Kaggle, they did a comparable amount of good in just educating the world's data scientists out there, with obviously a very different commercial outcome.

[00:26:07] Trond Arne Undheim, Host: [00:26:07] They're an interesting counter to Nvidia though, one more hardware-based and one people-based, believing in networks of clever people. I wanted to take us a little bit towards the future, because a lot of these AI debates these days, they stop at the status quo, where you can discuss what's possible now and what's going on right now.

[00:26:27] I wanted to bring us a little bit out of that picture, but not only from this whole visionary, crazy, the-world-is-so-different view. You said you think in decades, so let's roll out some decades here. If you look at the next decade, I don't know,

[00:26:44] my first take would be, it would be further in this multimodal direction. So there's going to be a lot of blended techniques. We're going to have to spend more time than we really wanted to testing them and verifying them. And there'll be an enormous governmental regulatory backlash, obviously, because these things are getting serious.

[00:27:02] That much is very certain. What else are we to expect this decade?

[00:27:08] Slater Victoroff, Indico: [00:27:08] So I think that there's really deep potential in the multimodal space, and I think maybe to paint the picture, because again, I think you and I have talked about this, but maybe not every listener has.

[00:27:20]This idea that, traditional machine learning works within a particular modality, text or image or audio and that, and we're starting to see this today with, some of the papers out of open AI, but I think again, very much in their infancy where you actually blend together these different modalities and.

[00:27:36]I think it happens in two directions, which is interesting. So you have one direction, which is a, clip out of open AI was the recent hot thing here where you have one model that can understand both text and image, that can, capture images that, you can type in text and it'll generate some images and results and, it's very interesting to have that kind of central reasoning.

[00:27:57] Chamber, if you will. But I also think the inverse of that problem is really interesting because what they're not doing, for instance, they don't have blended image and text that they're trying to do inference on. And I think that there's another kind of very interesting spur of this research that's coming out now.

[00:28:14] And I think people, it's pretty hotly debated how interesting this field of study will eventually be, but neuro-symbolic methods, which is a little bit this idea of how do we take this really fuzzy, rich, unstructured understanding that we've got with today's deep learning techniques and maybe fuse them with some of those older symbolic techniques and find if there's a happy middle path where we can have our cake and eat it too.

[00:28:37] And I think neuro-symbolic techniques are probably one of the areas in AI where I'm most ambivalent. And not in the "I don't care one way or the other" sense; it's that I've got really conflicting ideas about whether it's going to be really successful or not. But I think, again, it's one more area of these multimodal techniques where we're starting to see some really interesting research.

[00:28:58] And I think when we pass this forward and start thinking about a lot of these kind of more detailed fringe problems, so when we think about doing video inference, when we think about combining text systems with systems that reason over time, these are all things that today you have to handle in a very choppy, inconsistent way.

[00:29:17] And I think, the techniques to handle these in more comprehensive and effective ways they're burgeoning now. And I think it's a really good time to be advancing them because I think again, just the surface area of the space is so large.

[00:29:31] Trond Arne Undheim, Host: [00:29:31] Does this mean that sort of pure-play neural nets have run out of steam, and is that why they're seeking more inspiration back in the symbolic domain? And by the way, symbolic is hard to understand for an outsider, actually, probably even for an insider. What exactly are the symbols that we're talking about?

[00:29:50] Why don't you line that up a little bit and then I'll ask my question.

[00:29:53] Slater Victoroff, Indico: [00:29:53] Yeah. And it is reasonably tough because the notion of symbol is just so vague. But really, it dates back to a very old way of thinking about human cognition, almost, right? Sorry, I don't mean old in a derogatory way.

[00:30:05] I just mean, like, they came up with this in the forties or the fifties, which is this notion that we are symbol machines, in a sense. So it's almost like language, for instance: each word I use, that is a symbol, the idea being that I have symbols inside my mind that I explicitly am manipulating to come up with them.

[00:30:23]And for a long time, I think part of this is just because it was academics, creating theory. They're just like, this is how the brain works. Everything is super explicit, we just have to figure out what each one of these symbols is. And then it'll be perfect. And they tried for a long time to do this, like decades and decades.

[00:30:40] And I think thankfully we've recognized that humans aren't quite so cut and dry. Like, you cannot mimic human reasoning without some implicit sort of surrender to the fuzziness of the problem. But again, it's one of the really interesting places of tension, where we can recognize and show explicitly, okay,

[00:30:58] there is fuzziness in this problem, but at the same time, it's not enough to just hand-wave and say, "Oh, it's fuzzy, so we can't understand it." And I think that's one of the really fierce growing tensions: as we try more and more to mimic humans, we realize that humans are a lot messier than we thought, and we're not very consistent.

[00:31:20] Trond Arne Undheim, Host: [00:31:20] I love that phrase that you had, that it requires surrendering to the implicit fuzziness of the problem. And I guess the fuzziness of the human being which is interesting to me, and it's always been a thing that I've been struggling with when it comes to understanding any kind of cognition research, which I've also been involved with for some years.

[00:31:39] And it is this assumption that at any given moment, when you are looking at this, you think we have a model of the brain and this is more or less how it works. Usually you start with a model of the brain, or if it's not the brain, you start with a model of behavior or something of that sort.

[00:31:54] Everybody who's involved in this business has the problem that they are stuck with wherever academia is at any given moment in time. And who is to say that we're any closer now, or how much closer are we than we were 10 years ago, 30 years ago, when it comes to either understanding of the brain or behavior or any of those things? Arguably science moves somewhat forward, but if you're making too explicit a reference to some academic paper or paradigm,

[00:32:21] and then you're saying this is how reality looks and that's why we're going to design a system to mirror that reality, you're arguably in trouble wherever in history you find it.

[00:32:31] Slater Victoroff, Indico: [00:32:31] Absolutely. And I think the most explicit place I can draw to, right, is this notion of biological inspiration in neural nets, and that neural nets in software are the same as neural nets in wetware, if you will.

[00:32:44] And you really did hit the nail on the head with the most important point, right? Which is, yeah, arguably, I think we understand more about how the brain works now than we did 50 years ago. Like, yes. We still don't understand how the brain works, though. We're chipping away at this absolutely massive kind of area of research.

[00:33:03] And we don't even have a clear sense of how big it is. And so it's, yes, we've chipped away, but can I say whether we're 1% closer or 80% closer or a thousandth of a percent closer? No, I have no idea. And I don't know, I'm a little bit of two minds here, because I think that there are two modalities. I think there are people that

[00:33:21] really love the idea that neural nets in software are based off the brain, and we're going to try to cram the two together at every possible juncture, even argue that the two, like, mirror each other. And there's just no support for that. We came up with the word neural network when we built neural networks a different way

[00:33:37] and we thought the brain worked a different way, and the two have just, like, strictly diverged for the past 40 years. You really have to strain to see any shadows of one in the other. On the flip side, there are some really interesting areas of exploration there.

[00:33:53]I think Yoshua Bengio is notable for this in that he really wants, the neuroscientists and the ML people to play nicer together. There's a lot of cranks that make it a little bit difficult, but he's been very good at actually finding good ideas in biology and, translating them over to the ML world in a way that, again, he has a real understanding of both sides.

[00:34:13] So I really do hope to see more of it if for no other reason than. Yeah. I think we are a bit at the end of our current paradigm of thinking, and it's clear that we've got places to go next, but we're not sure which is the right one. And so I am like, we need as much inspiration as we can get.

[00:34:34] Trond Arne Undheim, Host: [00:34:34] But it is an interesting moment, though, for the reason that if it is going to be a winter or a fall or whatever season, right, famously AI has all these winters, it's not going to be a winter in the sense that we have accomplished nothing. And you're proof of that with Indico, because you guys are ingesting valuable data and you're making inferences and you are relevant to companies today.

[00:34:55] So that's not going to go away unless we start to say we've been so wrong that these. Assumptions are creating the, cars are crashing, towers are falling down. I'll get that. That would be a problem. And we would literally go into a winter. So I hold that open as like a faint possibility, but

[00:35:13] Slater Victoroff, Indico: [00:35:13] for a long time, there's been really detailed testing. And again, that's not to say that there's never going to be an issue, but no, I think that, yeah, the chance of that is a vanishingly small outside chance.

[00:35:24] Trond Arne Undheim, Host: [00:35:24] It may be a winter in the sense that it's going to take another few decades before we make the next big leap.

[00:35:33] How probable is that? Fast forward 20 years, 30 years, then. Once you start thinking in decades, it's hard to pin it down, but if you flash forward a couple of decades or three or however many you want, what is it that we could be looking at? Are we still looking at these sort of hybrid, transfer-learning-type approaches, or are we looking at some sort of radically new paradigm that would marry these things together?

[00:35:57] Hybrid AI fashion with some completely systematic new paradigm?

[00:36:02] Slater Victoroff, Indico: [00:36:02] I think it's a good question. It is something that I think about a lot, and the disclaimer, obviously, is that the further away we get from today, the less I can have any kind of confidence. I do think that a couple of things are certainly going to be true. I think that the way that we think about training models today absolutely is not sustainable, and that has to change.

[00:36:21] Today, we think very much that we're training things up from scratch, right? Transfer learning, to your point, it's easy, it's getting much more common, right? But it's still not the way that you train most things. And while we're going to continue making incremental computational improvements, there's no shocking breakthrough we're going to hit there.

[00:36:39] And so what that means, I think, is that as we build larger and more sophisticated models, we're going to do away with this notion of training models from scratch, because it's just so impractical.

[00:36:48] Trond Arne Undheim, Host: [00:36:48] And there was a notion called zero-shot learning.

[00:36:50] Slater Victoroff, Indico: [00:36:50] It's a part of that, right?

[00:36:52] The way that you tackle zero-shot learning, I think that's a great place to look for kind of early research that's pushing along in this path. And I think another really great thing to look up is this concept of model surgery by OpenAI. And again, it is this notion that you're going to be training models together.

[00:37:05] You're going to have a base model that has been trained on the data sets, and then you're going to be making tweaks and tweaks, and there will be model lineages and questions much more along those lines. So I think that's one aspect of what is going to change.

[00:37:20] And I think there is a discontinuity coming up, and I don't know that I would call this an AI winter, but certainly there are people who have pitched far beyond AI's ability to deliver. And I think we have a good sense of what is possible in the current deep learning jag.

[00:37:35] I personally very much don't believe that this current approach is going to solve all of our problems. I think that we've got an infinite number of problems. And I do think that this is going to get us very far, call it one to two decades. And then what is going to have to advance for us to get beyond our current paradigm, though?

[00:37:56] I think it is compute, fundamentally. And right now quantum computing is the most likely contender for something to have a real impact there, but to throw out maybe an underdog, there are new memory technologies as well that could have a similarly massive impact on what the future looks like.

[00:38:12]So maybe something else there.

[00:38:16] Trond Arne Undheim, Host: [00:38:16] What sort of magnitude of improvement would we need, do you think, in the compute? Because quantum, famously, is a quantum leap. Certainly, we're talking thousands and tens of thousands and maybe much more than that in terms of effectiveness improvements. What sort of leap do you think would make a fundamental

[00:38:37] change that would change the ball game, whether you started from zero or from previous models?

[00:38:42] Slater Victoroff, Indico: [00:38:42] I think very much in orders of magnitude. I'm not a business person. Like, a 2x is nothing to me. I'm like, what is that? That's a rounding error. No, I think that GPUs are a great historic benchmark.

[00:38:54] Right. GPUs, I think, commercially accessible and available individual GPUs led to this last renaissance. And that was somewhere around a 40x practical improvement, maybe a little north of that, depending on how you did it. But I actually think it's going to happen very much the same way, because the thing is that

[00:39:10] GPUs empowered the first leap, and then we started to change our techniques to match GPUs, such that we could push those multiples higher and higher. I think exactly the same thing is going to happen, because most of the things that we do today in AI, they don't have obvious quantum improvements.

[00:39:27] It's not to say we won't find them. Obviously it's a very far-future path, and I can't make any assertions about them, but I think it's drastically more likely that we're going to find interesting approximations of the ways that we're doing this that are better fit to whatever quantum architecture we end up developing. And in doing so, there are some classes of problems in which, again, if we can hit that 100x, like two orders of magnitude, I think that's where you really have potential for a breakthrough.

[00:39:54]And obviously more and more is better, but I think it starts with that.

[00:39:56]Trond Arne Undheim, Host: [00:39:56] Towards the end here, a lot of what you do is within the broad category of NLP, which is another sort of famously kind of difficult concept when you really start rounding it up. And it has to do with language and understanding language and human language specifically, arguably.

[00:40:14] If a machine could render human language and understand human language much better, I guess, to use your language, by orders of magnitude, then that would in and of itself be the kind of progress you're looking for. How probable is that, and how much does it really have to do with compute versus with really a new mental model for how we translate our own language to a computer?

[00:40:38] Slater Victoroff, Indico: [00:40:38] So I actually, I think I'm going to argue that has just happened and has shone a light on the answers to those questions, which, I think in all cases, are maybe not quite as optimistic as we hoped the first time around. So I'm just going to point to GPT-3, right? Because from a modeling perspective, the delta between GPT-2 and GPT-3 is minimal, right?

[00:40:59] Really, GPT-3 is an exercise in scale. And GPT-3 really did, from a compute and data standpoint, push the boundary of what we consider possible from a scale perspective. And what GPT showed, which again, I think it underscores this so much, is that just scaling the method up resulted in drastic improvements. In fact, such that in a one, and there's a big "but" coming, I will say,

[00:41:26] so even in one- and two- and four-shot learning scenarios, it's just much better. However, they also cited the fact that our current problem framings are woefully insufficient to test this language understanding. And I think that's really the key to it, is that I think that before I saw really sophisticated language understanding in a sort of brain-in-a-jar type way, I was a lot more excited about it.

[00:41:50] The more that I realize it, the more that I realize that there's actually no such thing as understanding human language in a vacuum. And the second you give that to someone, they are going to push on edges around, "Oh, it should have opinions about things, it should be able to understand some physical reasoning about the outside world."

[00:42:08] And I think those are all completely valid, but it's also why I say that I think we have pushed language really far. I think we've accomplished incredible things there. And I think that our ability to model language far outstrips our ability to ask demanding language problems. And I think that's where this problem is most acute.

[00:42:29] We need better language problems, right? This is where the Clever Hans effect came from. We need people creating data sets, really using this detailed supervision that is linguistic, and also things that really importantly bring language into the broader context, right? Whether that's chat conversations, just understanding the thread, right?

[00:42:48] Whether that's Reddit comment threads, that's something. Anything in that direction. Obviously for us, that's bringing positional information in, studying the language of the document and the position of the document, but again, those are all fringes in that surface area.

[00:43:01]These are all really, I think, powerful areas of exploration.

[00:43:05] Trond Arne Undheim, Host: [00:43:05] Slater, just for the benefit of my listeners and myself: GPT, what does it stand for again? And GPT-1, 2 and 3, can you just line up roughly when these things happened and why they're relevant to what we were just talking about within NLP? Because people have different levels of awareness of all these things, and not everyone sits there and waits for GPT-3 to be issued.

[00:43:25] Slater Victoroff, Indico: [00:43:25] It's very fair. GPT stands for generative pre-training. And GPT-1, 2 and 3, that's a series of models produced by OpenAI that are fundamentally all centered around this idea that if we feed just a huge amount of text to our model and ask it to predict the text that comes forward and understand it from a kind of semantic and grammar perspective, that is enough to give us a true intelligence, if you will.

[00:43:50] And so that's the notion of this whole GPT string of papers. Now, the lead author on GPT-1 and 2, and, like, the father of GPT, Alec Radford, he's a close friend and a co-founder of mine, actually. But then maybe another note for folks that might have tracked this part of the news a little bit more.

[00:44:04] One thing that's really interesting, and this is the tragedy of the name GPT, honestly, is it's a terrible name, right? It's super not catchy, right? GPT-3, maybe it rhymes a little bit. Maybe that's why that one got picked up. But GPT-1 and BERT came out at almost exactly the same time. BERT came out just a little bit after GPT-1.

[00:44:23] And the two papers were so similar that BERT actually had to put a pause on their publishing to add five paragraphs throughout their paper being like, "Okay, but here's why it's different from GPT." But again, it was just a parallel discovery, right? They both hit on this really great idea, or had an improvement in some ways. But I think the branding success of GPT-1 versus BERT really goes to show just how much the AI community loves our Muppets.

[00:44:49] I think maybe that's the lesson for the whole thing.

[00:44:51] Trond Arne Undheim, Host: [00:44:51] It reminds me, and I will be mixing metaphors slightly here, of when we had mapped the human genome quite a long time ago, right? Now, some people who weren't in the space and some people who were very much in the space said, it's all over.

[00:45:04] Now we have mapped it. Everything's going to change. And guess what? Decades later, yes, everything has changed, and now we have applied it to vaccines. Everybody has felt what this change means, and that synthetic biology field, and many improvements have happened, but no, they didn't happen overnight.

[00:45:25] And they required much additional modeling, thinking, creative thought. And you just had to experience time.

[00:45:34] Slater Victoroff, Indico: [00:45:34] I think that's exactly right. Just like mapping the genome, it's very important for us to take a beat. We should recognize, yes, we have made progress. This is awesome.

[00:45:43] Like, pat ourselves on the back, but really the only thing we've done is we've turned over the next card and we see the next six problems that we've got to solve. And now suddenly we've got an explainability crisis on our hands, right? We've got multimodal methods that we need to improve.

[00:45:56] We need to understand how, bronze and gold and silver data come together in the future. We have to understand how we're going to regulate these and make sure we're not selling snake oil. And I'm excited for all of these problems. I think that they show more clearly than anything.

[00:46:08] The fact that we are concerned with these issues, that AI is hitting the mainstream and these problems are not trivial, but they are solvable.

[00:46:16]Trond Arne Undheim, Host: [00:46:16] Slater we're coming to an end here, but one of the things that I really have come to appreciate about you in the very short time that I have known you is that, you may have started out a prodigy and prodigies can go several ways, right?

[00:46:27] You are, in a certain sense, better than people because you have some thought patterns that are divergent and, arguably, you accelerate faster. But one thing that you haven't done is fallen into the trap of skipping the explanation step. You have been very pedagogical with me and with my listeners, and I find that a very attractive feature in smart people, because it's so important.

[00:46:50] And I find it perhaps one of the most important things these days, because knowledge arguably is advancing, but you have been so humble about the degrees to which it is also not advancing. And I think keeping all of those things in mind and bringing us all along for the ride,

[00:47:10] yes, it is slower. But explainability is important, and it may even be important to your colleagues, you know, and I think it's a great compliment that I wanted to give you. And it's so important. And I think I would even argue that I don't really want the kind of genius that doesn't explain

[00:47:27] themselves, because life is too precious for that. And we may be making progress, but you've got to string people along. So thank you for stringing me along. And I wish you all the best in taking this further, in your way. And I hope to have you back on the show some little while ahead, when we have new problems to talk about and deconstruct.

[00:47:46] Slater Victoroff, Indico: [00:47:46] Thanks so much for having me, it was a total pleasure, very happy to do it anytime.

[00:47:51] And no, I totally agree. I'll have only one gripe, which is that I don't think that I'm smart. My view is that the whole of knowledge is so large that none of us know anything. We're all running around here. So the least we can do is help each other where we can.

[00:48:06]Trond Arne Undheim, Host: [00:48:06] Thanks.

[00:48:06] Anyway,

[00:48:08] Slater Victoroff, Indico: [00:48:08] love total pleasure. Thanks so much for having me again.

[00:48:12] Trond Arne Undheim, Host: [00:48:12] You have just listened to episode 96 of the Futurized podcast with host Trond Arne Undheim, futurist and author. The topic was practicing multimodal AI. In this conversation, we talked about how Slater was picking trash off of the Wellesley dump for school engineering projects, loves Chinese fantasy fiction, had an experience with Techstars founding a startup, and how startups can indeed beat juggernauts like IBM that spend billions of dollars.

[00:48:43] My takeaway is that the secret to making money with today's AI techniques seems to lie in blending various approaches: being able to handle a myriad of data sources and meshing them together without losing the context, and stumbling along, making predictions that make sense, even though the underlying dimensions are seldom understood, using transfer learning approaches. I would personally hope we would get a few steps further sooner,

[00:49:11] so that explainability also increases. We will get there soon enough, I guess. Let's just see if the technology is weatherproof and whether we can get there without another AI winter. I find it refreshing to talk with smart people who are also humble. That's why my bet will be on folks like Slater to build these systems

[00:49:32] for the future. Thanks for listening. If you liked the show, subscribe at Futurized.org or in your preferred podcast player and rate us with five stars. If you like this topic, you may enjoy other episodes of Futurized, such as episode 74, AI talent, diversity, episode 79, futuristic AI, or episode 48, the future of AI in government. Futurized, preparing you to deal with disruption.
