Transcript
Vlad Voroninski [00:00:00]:Using generative AI and combining it with deep teaching, we're able to essentially take any of those real data sets that we've collected or sourced and then train world models that can dramatically expand the scope of those data sets. Because we're able to get very high fidelity simulation, essentially closing the gap between real data and simulated data, we can get orders of magnitude more variety in those data sets. So it's as if you drove a thousand times more than you did.
Daniel Darling [00:00:39]:Welcome to the Five Year Frontier Podcast, a preview of the future through the eyes of the innovators shaping our world. With short, insight-packed discussions, I seek to bring you a glimpse of what a key industry could look like five years out. I'm your host Daniel Darling, a venture capitalist at Focal, where I spend my days with founders at the very start of their journey to transform an industry. The best have a distinct vision of what's to come, a guiding North Star they're building towards, and that's what I'm here to share with you. Today's episode is about the future of autonomous driving and robotics. We cover simulating physical environments, helping carmakers leapfrog Tesla, real-world AI models, the economics of self-driving cars, autonomous mines and the future of autonomy. Our guide will be Vlad Voroninski, CEO and co-founder of Helm AI. An early pioneer in training AI for autonomous cars, robotics and aviation, Vlad has been instrumental in shifting the industry towards advanced simulation and synthetic data, a radical alternative to the traditional reliance on large-scale manually labeled training sets.
Daniel Darling [00:01:40]:Helm has raised over $100 million from top investors including Freeman Group and Goodyear Ventures, and its AI software powers major automakers like Honda and Volkswagen, helping them to build next-generation self-driving systems. Vlad earned his PhD at UC Berkeley, was an instructor at MIT, and has authored over 20 research papers on machine learning. He's been at the forefront of unsupervised learning, an approach that he believes will define the future of AI. Before founding Helm, Vlad was chief scientist at Sift Security, a machine learning cybersecurity startup that was acquired by Netskope in 2018. After that he set out to build Helm AI, rethinking how AI could train autonomous systems. Hi Vlad, nice to see you.
Daniel Darling [00:02:22]:Thanks for coming on and talking with me today.
Vlad Voroninski [00:02:24]:Thanks for having me on Daniel.
Daniel Darling [00:02:25]:For years we've been talking about self driving cars being just around the corner, yet they still feel like they belong to this kind of future scenario. But something about this year does feel different and has sparked expectations that the future is actually arriving now. What has changed and why are we finally at this inflection point?
Vlad Voroninski [00:02:45]:There's been a number of innovations. I would say firstly, really tackling problems in unsupervised machine learning. That was a pretty big unlock. That's something that we focused on from the very beginning. That was actually the very first thing we started to work on. And in 2018 we trained the world's first foundation models for semantic segmentation, which is a difficult computer vision problem. So that allows you to essentially unblock yourself on data, right? So you can actually train on very large data sets, and that allows you to get to kind of optimal accuracy for any particular neural network. That was a pretty big unlock, and that was actually pretty early on from our perspective when working on certain perception problems. And then another big unlock came from the world of generative AI.
Vlad Voroninski [00:03:29]:Really there it's a question of certain types of architectures that started to mature. As it turned out, if we take generative AI architectures and kind of innovate upon them to make them more capital efficient, and then combine that with certain advances in unsupervised learning, like our deep teaching approach at Helm, we're able to essentially perform AI-based simulation in a very accurate way, where you can actually close the gap between real data and simulated data. And that really allows you to tackle the tail end of corner cases in a much more effective way than before.
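To make the task concrete: semantic segmentation means assigning a class label to every pixel of an image (road, car, pedestrian, and so on), which is what makes it so data-hungry and such a natural fit for unsupervised pretraining. Below is a minimal sketch of the task using an off-the-shelf torchvision model, not Helm's deep teaching system; the input filename is hypothetical.

```python
# Minimal semantic segmentation example with a pretrained torchvision model.
# This illustrates the task Vlad describes; it is NOT Helm.ai's system.
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from PIL import Image

model = deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    # Normalization constants expected by the ImageNet-pretrained backbone.
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("road_scene.jpg").convert("RGB")   # hypothetical input
batch = preprocess(img).unsqueeze(0)                # shape: [1, 3, H, W]

with torch.no_grad():
    logits = model(batch)["out"]                    # [1, 21 classes, H, W]

# Every pixel gets a category index (road surface, person, car, ...).
labels = logits.argmax(dim=1)                       # [1, H, W]
print(labels.shape, labels.unique())
```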
Daniel Darling [00:04:02]:This deep teaching technology that Helm has innovated is really unique to the industry, and it revolves around these kinds of simulated training environments for automotive. Can you run us through a little bit how you do these simulations?
Vlad Voroninski [00:04:17]:The basic idea is that you can take any kind of sensor data, let's say video data or LiDAR or radar, and treat it as basically just a sequence of images or other types of sensor data at any given timestamp. And so it kind of evolves through time in a way that's sort of similar to how, let's say, you're forming a sentence and you have a sequence of words, right? Basically what you're able to do is take that kind of data and train on it to essentially make predictions about, you know, let's say given some number of observations or some number of images that are in a temporal sequence, what's going to be the next image in the sequence, right? So you're basically using the temporal evolution of the world as a modeling mechanism in terms of prediction. It's actually a pretty difficult problem just in terms of the dimensionality of it, right? Because images are so high-dimensional, you know, they're much more high-dimensional than words. It becomes a pretty compute-intensive task. But it turns out that you can use generative AI to essentially train models that are able to make those kinds of predictions. In a certain very high-level sense it's, you know, similar to LLMs, but it's dealing with the physical world. And so these models are able to not only model the way things look and really get that accuracy in terms of realism, but also start to understand sort of basic properties about the physical world, like object permanence.
Vlad Voroninski [00:05:41]:Right. So if there's an object in a scene, it shouldn't just disappear at random, or appear at random, right? It really has physical consistency to it. Same with other aspects of behavior of agents like vehicles or pedestrians, or whatever you're trying to model. It's very interesting to see how the models evolve. And so again, similar to what we've seen with the large language models, where as you train them on more data with larger architectures they're kind of tapping into different levels of understanding, we're seeing something very similar happen with what we call world models, kind of world simulation models.
Vlad Voroninski [00:06:16]:We've done a lot of optimization on the architectures, changing them and innovating upon them. And combining that with our unsupervised learning techniques, which essentially accelerate the training process, we're able to tap into better scaling laws and actually train these world models with significantly less compute than what's typical. And what you end up with is basically these world simulators that can model the entire autonomous vehicle stack, from the vision data to the LiDAR data, to, you know, the perception stack, even the physical position of the vehicle. You can use these models to, for example, simulate rare corner cases. You can even reduce your data collection costs, because the models can tell you, given real vision data, what the LiDAR data would look like. So they can actually simulate LiDAR for you based on vision data. And in effect, because these models are predicting out into the future and giving you essentially the potential scenarios for what might happen next, including the motion of all the other agents as well as the ego vehicle motion, they technically act as self-driving systems. So you can actually technically use them for the purposes of self-driving, provided that you actually get them to run in real time on a car.
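To make the next-frame objective concrete, here is a toy PyTorch sketch of the training setup Vlad gestures at: encode each frame, summarize the history, and predict the frame that follows. Every architectural choice and shape below is an illustrative assumption; production world models like Helm's are far larger generative architectures, and random tensors stand in here for real video.

```python
# Toy sketch of the "predict the next frame" world-model objective.
# All names, shapes, and the architecture are illustrative assumptions.
import torch
import torch.nn as nn

class TinyWorldModel(nn.Module):
    def __init__(self, channels=3, hidden=64):
        super().__init__()
        # Encode each 64x64 frame, aggregate history with a GRU,
        # then decode a prediction of the next frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, 4, stride=4), nn.ReLU())
        self.temporal = nn.GRU(input_size=hidden * 16 * 16,
                               hidden_size=512, batch_first=True)
        self.decoder = nn.Sequential(
            nn.Linear(512, hidden * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (hidden, 16, 16)),
            nn.ConvTranspose2d(hidden, channels, 4, stride=4), nn.Sigmoid())

    def forward(self, frames):                    # frames: [B, T, C, 64, 64]
        B, T, C, H, W = frames.shape
        feats = self.encoder(frames.reshape(B * T, C, H, W))
        feats = feats.reshape(B, T, -1)
        out, _ = self.temporal(feats)             # summary of frames 1..t
        return self.decoder(out[:, -1])           # predicted frame t+1

model = TinyWorldModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

clip = torch.rand(2, 8, 3, 64, 64)                # stand-in for real video
pred = model(clip[:, :-1])                        # predict the final frame
loss = nn.functional.mse_loss(pred, clip[:, -1])  # compare to ground truth
opt.zero_grad(); loss.backward(); opt.step()
print(f"next-frame loss: {loss.item():.4f}")
```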
Daniel Darling [00:07:27]:Essentially you have Tesla in the lead for the industry, and it has orders of magnitude more data that it's collected through its visual systems for all the years it's been on the road and for all the vehicles it's had on the road. And then you have the traditional automakers like Honda, Volkswagen, et cetera, that you guys work with, needing to close that data gap. So does this kind of simulation technology enable them to catch up, in terms of being able to offer autonomous systems in their cars, without having all of that data collection advantage that Tesla has?
Vlad Voroninski [00:08:02]:Yeah, absolutely. That's really one of the key points of this technology. Tesla's been investing into their software-defined fleet for over 10 years now, and they have millions of cars collecting potentially, like, trillions of images of real-world data per year. And yeah, if you're a traditional automaker, you just have not been investing in that nearly as much, although they all are doing that now. But I mean, the point is, if you're solely relying on real data and you wanted to catch up to Tesla by taking the same approach as Tesla, then yeah, you would be inherently many years behind. But via the advent of these generative AI technologies just in the last couple of years, and really in the last year, it's been shown, and we played a part in showing, what's possible with actually pushing the quality of simulation in terms of fidelity and comparing it to real data. You can basically, instead of having a real fleet, rely on essentially an AI-enabled virtual fleet.
Vlad Voroninski [00:08:55]:And that has a lot of advantages compared to real driving data. Because if you're just relying on the real data you're collecting, even if you have a Tesla-sized fleet, the issue is that as the system improves, your definition of corner case kind of changes, right? So as your disengagement rate gets really good, that means the rate of data collection actually slows down, and that's not really a good property. That means that the better you get, the slower you are developing. If instead you are able to leverage simulation, and it's truly as high quality as real data, then you can focus all of your efforts and your computation on simulating those difficult corner cases. And that becomes much more capital efficient. It's faster. It doesn't have that kind of exponential issue of becoming more and more expensive. It actually pretty much becomes linear.
Vlad Voroninski [00:09:45]:It's like, for any given corner case, you can just simulate it for the same cost. You also get to avoid liability, right, because you're not actually putting humans in cars, and it's just a lot faster and cheaper. So it is a very interesting development for the automotive industry, and I think that it really will enable automakers to potentially leapfrog what Tesla's doing.
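The scaling argument is easy to put in numbers. Here is a rough, toy calculation of the contrast between waiting for a rare event on a real fleet and generating it in simulation; every figure below is invented purely for illustration.

```python
# Back-of-envelope illustration of the economics Vlad describes: as a real
# fleet's system improves, genuinely novel corner cases arrive ever more
# slowly, while simulation produces them at a flat per-example cost.
# All numbers below are made up for illustration.

fleet_miles_per_year = 5e9          # hypothetical large real-world fleet
occurrences_per_mile = 1 / 1e8      # a "1-in-100-million-miles" corner case
examples_needed = 1_000             # training examples wanted per scenario

real_examples_per_year = fleet_miles_per_year * occurrences_per_mile
years_to_collect = examples_needed / real_examples_per_year
print(f"Real fleet: ~{real_examples_per_year:.0f} examples/year "
      f"-> {years_to_collect:.0f} years to collect {examples_needed}")

sim_cost_per_example = 0.50         # flat, hypothetical GPU cost per example
print(f"Simulation: {examples_needed} examples at a linear cost of "
      f"${examples_needed * sim_cost_per_example:,.0f}, on demand")
```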
Daniel Darling [00:10:07]:So you're bullish on the ability of the rest of the industry to catch up using some of these advances. That's fascinating. And what about learning from the actual drivers themselves and how they're driving? Isn't a big part of how Tesla's AI operates learning the habits of drivers and how they react to different situations? Are you able to also mimic that component in your virtual fleet?
Vlad Voroninski [00:10:29]:Yeah, so, I mean, you certainly get to learn from driving behaviors to a certain extent. I'm not saying that you cut out real data entirely, right? You still definitely want to have, you know, some fraction of your training happening on real data. But the thing is, there are going to be these really rare situations, which again might not be accessible even on a large fleet, where the way that you'd react just might not be available, even coming from a very large data set.
Daniel Darling [00:10:55]:When you think about this catch-up, or this sort of rollout of this kind of autonomy, you know, there's a lot of speculation around when we'll start to see the first autonomous vehicles on the road. It's always been said that the technology is advancing ahead of the regulation and the rollout. And from this discussion here, it's clear that we've seen a really big leap in the last couple of years in the automotive industry's capability to start to adopt this. So how do you see this starting to be put into practice, and when do you actually see the rollout happening for fully autonomous vehicles?
Vlad Voroninski [00:11:29]:In a lot of ways now, I think it's more of a cost question, in the sense that you're given sort of a set of physical constraints: okay, you have a vehicle with a certain number of sensors, you have a compute platform that gives you some kind of physical capability or physical limitation, right, on what's the most the software can do. You don't know what exactly that limit is until you actually create the software. But the ability to create an optimal AI-based software system for that set of physical constraints is now becoming very possible. So for example, at Helm, developing toward L2 versus L4 looks pretty much the same for us. The pipeline is the same. And it's really just a question of, okay, look, how much compute do we have access to, what sensors do we have access to. That kind of defines what kind of neural networks you can actually use, in terms of the size of the neural networks, their capacity. And then we have the ability to address the data layer of the machine learning problem and then train the optimal neural networks.
Vlad Voroninski [00:12:29]:And then it becomes sort of a question of, okay, well, let's say you reached some hypothetical limit that nature has set on what's possible for software to do in that context. Is that going to achieve L4? You know, what's the level of safety that you get from that? And that comes back to a cost question, right? You feel, oh, well, if I had put more sensors on this, or if I had put more compute on this, it could push the safety further, but that's going to increase my cost. So ultimately there has to be a business model that works for things to actually scale. I mean, L4 deployments have already occurred, just at a small scale. It's really just a question of what will it take for that to actually become truly commercializable. And it really comes down to a cost question. My main point is that technology is no longer really the bottleneck. At least on the software side, I think you can basically build the optimal software stack given any kind of physical consideration. But then it becomes a cost question.
Vlad Voroninski [00:13:23]:The second piece is regulatory, where if you're asking yourself the question, okay, well, I've built some kind of self-driving system, it has this level of safety, what qualifies it as an L4 system that I can actually put out there as a scalable product? Well, a large part of that is just what are the regulations, you know, how will the system be interpreted by government agencies, by the public, et cetera. So I think that getting clarity on that is critical. And so, for example, there's discussion about some potential for a federally mandated framework that might clarify what a rollout could look like in the US, right, where it's not state by state like it is now. I think historically, if you look at other industries, like aviation, you had sort of the wild, wild west of aviation initially, and there were a lot of false starts there. And then the Warsaw Convention put in place liability caps and made it very clear what the requirements are for an airline fleet to actually be commercially and legally viable.
Vlad Voroninski [00:14:25]:I think something similar has to happen for self-driving cars for a fully scalable rollout to occur. Because companies can try to scale, but it really might not end well, as we've seen with Cruise, right? There's no guarantee that just scaling something and calling it L4 will lead to success.
Daniel Darling [00:14:43]:One of the things about this future is that the infrastructure in and around vehicles and roads sounds like it will need to change. And you know, Uber's founder Travis Kalanick has talked about repurposing parking garages into charging and repair stations for fully autonomous vehicles. How will infrastructure need to evolve, or how do you see it evolving in the next five years, to start to support this far more autonomous landscape?
Vlad Voroninski [00:15:12]:Yeah, that's an interesting question. I mean, I think if you look back to the earlier days of self-driving, there was definitely a lot more thinking around infrastructure, and, you know, there were these questions about, oh, should we have cities with a special self-driving lane, or other things embedded into the infrastructure to make self-driving easier. In my opinion, unless it's absolutely necessary, that's better avoided, because I think that, first of all, it can be a pretty big investment, and essentially I think that those approaches that lean heavily on that kind of infrastructure will just be leapfrogged by AI systems that don't need it. And so I just think it could be a waste. My main point is that really the key thing to address is what's actually happening on the vehicle. Is the vehicle able to make the correct real-time decisions? Infrastructure is like a crutch if you're trying to somehow help the vehicle, and I just think it's not going to be realistic for any kind of world-scale rollout.
Vlad Voroninski [00:16:14]:Like, you're not going to be able to get buy-in from every single government to change the infrastructure. And again, there will be a solution that doesn't need it. So you might as well just skip that step.
Daniel Darling [00:16:25]:We've talked a lot about self driving cars and that gets a lot of attention. But what about autonomous trucking?
Vlad Voroninski [00:16:31]:Yeah, autonomous trucking is definitely a very interesting area, and I think it's a mixed bag, basically, because you get, in some sense, an easier environment, a simpler environment, let's say. But that's a bit of a red herring, because sure, the road itself is simpler, but for the set of things that can still happen on a highway, there's still a very, very long tail end, right? You can still have a human run out on the highway, and that's incredibly dangerous because you're, you know, going super fast. There could be all sorts of obstacles. And because you're going so fast, if you're on a truck, that just creates a lot of potential risk. I mean, you basically just have so much momentum traveling at high speeds on a truck, which means your stopping distances are just going to be much, much longer, which means your visibility has to be much better. Your ability to see far out has to be very good. And that really puts a very big strain on sensors.
Vlad Voroninski [00:17:25]:Like LiDAR, for example. Currently, sort of the standard way, or the basic way, I guess, to even get to L3 capability is to combine vision and LiDAR for redundancy purposes, right? But usually those systems operate only up to a certain speed, and in part that's because LiDAR systems don't really have the range to go beyond that. And if you're doing trucking, that problem gets much worse, because again, your stopping distances are much longer, and so you're very heavily constrained by things like LiDAR. Then you have to, I guess, lean very heavily on vision, which means you have to invest a lot more into compute. Now, on the upside, if you have a truck, it's expending a lot more power, so you could technically afford a lot more compute on a truck, right? I mean, if it's like 10 to 20 times more than a car or whatever. So there are things that help you and then things that really make it more difficult.
Vlad Voroninski [00:18:19]:So I definitely don't think that trucking is somehow easier than robotaxis. I just don't think that's true. I think the jury's still out on that. I could definitely see slow-speed urban autonomy happening before full-fledged high-speed highway trucking.
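The physics behind the sensor-range point is a one-line formula: braking distance grows with the square of speed, d ≈ v²/(2a), and a loaded truck's achievable deceleration a is much lower than a car's. A quick sketch with rough, illustrative figures (not measured vehicle specs):

```python
# Why truck autonomy strains sensor range: stopping distance is reaction
# distance plus braking distance v^2 / (2a), and a loaded truck's
# deceleration is far lower than a car's. Figures below are illustrative.

def stopping_distance_m(speed_mps, decel_mps2, reaction_s=0.2):
    # Distance covered while the system reacts, plus braking distance.
    return speed_mps * reaction_s + speed_mps**2 / (2 * decel_mps2)

highway_speed = 29.0   # ~65 mph, in meters per second
car_decel = 7.0        # passenger car, good conditions (illustrative)
truck_decel = 3.0      # loaded truck (illustrative)

print(f"car:   {stopping_distance_m(highway_speed, car_decel):.0f} m")
print(f"truck: {stopping_distance_m(highway_speed, truck_decel):.0f} m")
# ~66 m for the car vs ~146 m for the truck: the truck must reliably
# perceive obstacles at more than double the range, which is exactly
# where current LiDAR runs out and vision plus compute must take over.
```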
Daniel Darling [00:18:36]:Are there other pockets of industry that may be adopting autonomy? You hear, for example, about areas of the supply chain, or places like mining, developing a lot of autonomy. What are you seeing at the frontier of other industries?
Vlad Voroninski [00:18:48]:There's a pretty wide spectrum of adjacent industries, including mining, construction, agriculture, et cetera. Mining is actually an industry that's already had automation, including technically L4 automation, for some number of years, in part because you get these much more constrained environments that are not really on public land. And there's kind of an inherent understanding that if you're at a mining site, everybody knows a mining site could be dangerous, right? So you just know that it's a much more controlled-access environment. Autonomous mining has already been achieved using simpler methods like HD maps, LiDAR, GPS-based localization. It's already provided some value, but there's a lot more that you need to do in order to get it to its full potential. Similarly to self-driving, ideally the mine itself doesn't have to change, right? Ideally it's not an infrastructure update; it's purely whatever is happening on the actual vehicle. So you outfit the vehicle with auxiliary sensors, like vision sensors, LiDAR sensors or whatever.
Vlad Voroninski [00:19:48]:Basically there's an AI system that is able to interpret what's happening at the mine, can navigate, can infer, like, hey, this is a dangerous area, don't go there.
Daniel Darling [00:19:58]:We're also at this breakout moment for robotics, and your AI simulation technology could also play a role in accelerating this. Are we at this tipping point in the capabilities of robotics and what are you seeing in terms of simulation for that industry?
Vlad Voroninski [00:20:13]:Yeah, absolutely. I would say, pretty similarly to what I was saying before about this ChatGPT moment for self-driving cars, something very similar is happening in robotics. There's not that much difference between automating a vehicle for driving versus automating a robot for whatever task. The AI technology carries over pretty directly, right? I mean, you have a different physical form factor for sure, but mathematically it's the same thing. You just have a state space with somewhat different variables, operating in a different environment with a different goal, but all the techniques essentially carry over. And from a simulation perspective, with autonomous driving we're taking real driving videos or other videos and basically training a system on that to understand how to produce simulated data that evolves according to physical laws, right? You can do something very similar for robotics, in the sense that instead of taking, let's say, driving dashcam videos or something like that, you put a GoPro on somebody performing some manual task, right? And then you can train world models on that, and then basically leverage that to build automation policies for robotics.
Vlad Voroninski [00:21:26]:So in a lot of ways there's a lot of similarity. I would say there are kind of two differences, right? So one is that, of course, for automotive, the physical form factor is incredibly mature. Cars have been around for over a hundred years. There's not really that much variance in it. It's like, okay, well, it's got wheels, a steering wheel, a gas pedal, a brake. For robots, the physical form factor is more complex. I mean, the human body is pretty incredible in a lot of ways, and trying to mimic that, or approximate it in some way using different materials, is definitely challenging. But there is a lot of progress being made there.
Vlad Voroninski [00:22:01]:And you know, it seems like that's not really going to be the bottleneck at this point. And then the other difference, which is actually, I would say, a positive thing for robotics, is that you're dealing with tasks that are not as safety-critical. When it comes to driving, you have to be able to react in the nick of time, right? Your reaction time has to be less than 200 milliseconds, and you have to make the correct decision, and if you don't, it has dire consequences. For humanoid robots, there are going to be many situations where that's not the case, where you can actually have the robot think more before it makes a decision. Right? If you have a humanoid robot, and you're out at work, and its job is basically to clean up your house or whatever, it's not a safety-critical thing.
Vlad Voroninski [00:22:44]:I mean like, sure there are certain safety hazards that can occur, but for any given action it can like think more or connect up to like some external. It can actually connect up to the cloud and communicate with the cloud to figure out what to do. Whereas like a self driving car, like the latency requirements just don't really allow you to do that. It has to be done basically in real time on the vehicle. So there's kind of like the physical form factor for robotics is more challenging, but there's the compute constraints are not as severe than self driving cars. So I think it's kind of like upsides and downsides, but I think like directionally they're all heading toward like L4 automation. We're going to see some, yeah, definitely pretty dramatic improvements I think in both in the coming years.
Daniel Darling [00:23:24]:Well, look, Vlad, we've come up on time, so thanks so much for chatting with me today. Sounds like 2025 is going to be an absolutely fascinating year to watch for your industry as it really accelerates on this innovation curve. So thanks for shedding light on what's ahead.
Vlad Voroninski [00:23:38]:Absolutely, thanks for having me.
Daniel Darling [00:23:40]:What a fascinating conversation with Vlad. The race towards autonomy is no longer just about collecting more data. It's about building AI that truly understands the world. Helm AI's breakthroughs in unsupervised learning and AI-driven world models are proving that self-driving systems don't need Tesla-scale fleets to compete. Instead, AI can simulate, predict and generalize from synthetic environments and unlock a future where cars, robots and entire industries learn like humans, through intuition, not brute-force data collection. But as Vlad highlighted, the future of autonomy will be shaped as much by economics and regulation as by technology itself. The next frontier isn't just about making AI smarter. It's about making it scalable, cost effective, and ready for the real world.
And with these breakthroughs, 2025 really could be that year where autonomy stops being the future and starts to become the present. To follow Vlad and the work he's doing at Helm, head over to their account on X @Helm_AI. I hope you enjoyed today's episode, and please subscribe to the podcast to listen to more coming down the pipe. Until next time, thanks for listening and have a great rest of your day.
