ASU+GSV 2026 Summit | Monday, April 13, 2026, 2:10 pm-2:50 pm | Sponsored Partner Programming
Speakers
- Victor Riparbelli, CEO and Co-founder, Synthesia
- Charles Westrin, BCG U
Key Takeaways
- Victor Riparbelli, CEO and co-founder of Synthesia, presented the evolution of AI video from simple avatar-based content creation to interactive "Video Agents" that transform learning from passive consumption to two-way conversation.
- Founded in 2017, Synthesia now works with 90% of the Fortune 100 and has 700 employees globally.
- Riparbelli announced two major product launches: an AI Assistant that can generate near-complete videos from a prompt, PowerPoint, or PDF, and Video Agents that enable real-time conversational interactions within video content.
- The first Video Agent product, called "Skills," enables practice-based learning through role-play scenarios -- for example, 3,500 sellers at a major company practiced sales conversations with AI agents and showed measurable improvement after just 2-3 attempts.
- Riparbelli drew an analogy to the transformation from newspapers to websites, arguing that video is about to undergo a similar reinvention from one-way broadcast to personalized, interactive experiences.
Notable Quotes
"The problem with training is that 50% of it doesn't work, but you don't know which 50% doesn't work."
— Victor Riparbelli
"What would video look like if you had to invent it in 2026? The answer is clearly not one-way broadcast video where you record it once and everybody watches exactly the same video."
— Victor Riparbelli
"Up until now, I can tell you that someone has watched the video, but I can't guarantee they've actually understood the content. For all it matters, it could be that they're scrolling TikTok while they're watching the video."
— Victor Riparbelli
"We set out on a very audacious mission to change how we create video, not by inventing better cameras or better tools for collaboration, but by actually building AI technology that could replace the need for cameras, microphones, studios, actors."
— Victor Riparbelli
Full Transcript
Good afternoon, everyone, and thank you for that introduction. I don't think I've ever been introduced with my background as a juicer before. I'm not here to talk about juice today. Today we're going to talk about AI video.
My name is Victor, I'm the CEO and co-founder of Synthesia, and today I'm excited to share a sneak peek into some of the products we're going to be launching this year, which are all about transforming the digital learning experience from consuming content to being in conversation, of course using AI. Before we get into the details of what we're building, I would love to give you a little bit of background on how we got to where we are today. Back in 2017, I founded the company with two professors and my friend Stefan. Mark Cuban was our first investor, as you can see here.
And we set out on a very audacious mission to change how we create video, not by inventing better cameras or better tools for collaboration, but by actually building AI technology that could replace the need for cameras, microphones, studios, actors, and all the things we need in the physical world to record a video. First, we thought we were going to make Hollywood films, and that got us very excited. After three years of trying to do that, we realized that the technology just wasn't really there. It was very immature.
It didn't really work. And back in 2017, AI video was a pretty fringe and pretty crazy idea to most people. We had our big breakthrough in 2020 when we launched the world's first avatars. We realized that rather than trying to build for Hollywood, where the quality requirements were extremely high and people didn't really want to change how they created video, there was a big market of people who worked corporate jobs in big and small companies who really wanted to make video, and especially they wanted to make video for education and training and learning, and they wanted to make videos of people talking to the camera.
And the avatar technology was the first time it was possible to simply write a script and then generate a video based on it. Today, we've evolved into a full-blown AI video platform that owns the lifecycle of a video, all the way from the idea to delivering that video to the final viewer. Avatars are essentially digital people. They can be based on a real person.
You can create yourself in something like five minutes with just a quick photo, or you can use a library that we have inside the platform of actors that you can use. You can customize them, put logos on them, and make them look like they're from your brand. And essentially what they did was just provide a way for people to create video without having to use a camera. And getting from that idea to a video turned out to be extremely powerful back when we launched it in 2020.
We quickly realized that the people who were using Synthesia weren't really video experts. They didn't know how to use video editing tools like Adobe's, which you have to train quite a lot to use. Most of them were more PowerPoint users than they were video editors. And so we built a platform for those people.
We made it really easy to generate the actual videos you're seeing up here with the AI models, edit them by adding in animations, screen recordings, and all the things that make a video a video, collaborate with your colleagues, translate the videos into 120 different languages with a click of a button, and finally deliver that video in our video player, which is proprietary and supports things like interactivity. You can ask multiple-choice questions in there, add buttons, and make the experience come to life more than a normal video. Today, we work with 90% of the Fortune 100, with 700 people globally, and we've built a really strong business based on helping people make better video. This year, there are two things that are very exciting.
And they, of course, all revolve around using even more AI to communicate with employees, students, or whoever you're making content for. The first thing we're launching is our new Assistant. And you can kind of think of the Assistant as being a video editor that works kind of like a real human editor. You give them a task through a prompt, and then you can iterate with them until you have a video that you're happy with.
You can either just put in a text prompt, as you know from most of the AI systems you use today, or you can give it a PowerPoint, a PDF, a web link, or something else. We'll then take the content for you, we'll write a script, and we'll also design the video in your brand colors and your typography. It's actually getting so good that I think in the next couple of months, we're going to be able to create videos that are 90% done. We even generate motion graphics that are specifically made for that script, and it works phenomenally well.
Once you've made the video, you can then chat with it, like you would with ChatGPT. You can ask it to change something in Scene 3, either the script or the visuals. And this essentially closes the gap in who can make videos, right? What we really want is for everyone to be able to make a video as easily as they can write an email or make a document or a slide deck.
That gap is rapidly narrowing. The other big idea, the other thing we're launching this year, is more fundamental: it's about what a video actually is. When we founded the company back in 2017, we founded it on two big ideas. The first big idea was that AI would eventually be able to generate content, not just analyze patterns in data, which is what most AI was used for back in 2017.
Once we can generate content with AI, the marginal cost of creating content is going to drop to almost zero. Not just in how many dollars it costs to create a video, but also in the time it requires and the skills required to create one. And those of you who have used Synthesia or some of the other AI video platforms out there will know that we're right in the middle of this. You can literally go in, and in five minutes you can create something that looks like it was shot by a Hollywood studio.
That's very fundamental. That's a huge transformation in how we communicate. The other big idea was that when we as humans invent new media technologies, we always invent new media formats that go with those technologies, right? The media formats don't stay the same.
And so, for us the question is, what would video look like if you had to invent it in 2026? With all the technologies we have around us now. We have large language models which can provide us intelligence on tap. We have offline video models that can produce very high-quality clips.
We now have real-time video models at Synthesia where you can actually talk to an avatar like you're on a video call with an avatar. You can have a conversation. And so many other technologies around us. And the answer here is clearly not one-way broadcast video where you record it once and everybody watches exactly the same video.
That's not how it would look if we were to reinvent it today. I always love to find analogies in history where technology has changed how we consume information. And what I always go back to is, if you think about physical newspapers, that's kind of what video is today, right? With a physical newspaper, you would make a newspaper.
Then you would print a lot of copies. You would send it out to people. They would all read the same newspaper. And the next day, they'd get the next newspaper.
Then we invented the Internet and websites. The first websites kind of just looked like a newspaper on a screen, because that's what we could imagine, right? As technology evolved, as we learned how the Internet and computers work and what we could do with them, it turned out that websites and newspapers are very different, right? Unlike a newspaper, on a website you can personalize the content for every individual viewer.
With one click, you can access a trillion different pages. You can have a comment section. You can show video. You can show audio.
A website is a very, very different thing from a newspaper today. And the same thing is going to happen with video. This year, we're launching what we call Video Agents. Video Agents fundamentally change how we interact with videos, taking them from a one-way broadcast to a two-way experience.
Video Agents can be inserted into the videos our customers create. You can combine scripted content with Agents, and they can take on a whole bunch of different roles, depending on what kind of content you're creating. One of the first use cases we're launching is all around training. When we sat down and thought about how training has evolved and how it intersects with this new Video Agents product, we found a study by Rice University which said that there are kind of three key ways that we learn.
It's about informing, demonstrating, and practicing. In the corporate world, we're pretty good at informing. We create lots of text documents you have to read, videos (now that it's easier), slide decks. There's lots of content about what you should learn, right?
We're also pretty good at demonstrations. We can go and watch live calls of other salespeople who've done it before, and we can kind of consume a lot of content that prepares us for the real world. But it's still very difficult to practice. If we take the example of running a sales team in a company, when someone new joins the company, they consume a lot of content, and then they do go through some practice with their manager, where the manager will role-play as a customer, and you can try to have a kind of conversation.
But obviously, that's pretty limited by the manager's time. So when we thought about where we should apply Video Agents first, we really liked this problem of helping people practice real-world skills with an agent. And so Skills is our first product that uses Video Agents. Skills is essentially a way for our customers to build interactive trainings, combining all the video technology we have here today.
We can create great scripted content. You can have quizzes, all the things you do in most learning platforms today. But you can also add in these agents that can have a conversation with someone. So if we stick with the sales example, what you can do now is when you roll out a new product, you can have people consume kind of the scripted content.
Maybe that's 10, 15 minutes of video explaining the new product, the positioning, and so on and so forth. And then we can send them into a conversation with an agent. That agent can pretend to be a customer. And so you have to answer product questions.
You have to overcome objections. You have to build empathy. You have to prove that you can set up next steps and all the sales skills you need to be successful in real life. And because the intelligence layer is now so good, it's absolutely incredible how realistic this actually is and how well it works to practice your skills before you go out into the real world.
We rolled this out for 3,500 sellers in one of the world's largest and most valuable companies recently. And seeing how people improve even after just doing a role play two or three times is phenomenal. So for the learner, it makes a lot of sense, right? You can practice.
We all know that when you practice something in real life, you learn it better than if you just read about it. But for the managerial and executive layer, this is also very interesting, because up until now, when we work with our customers who create lots of videos for training, I can tell you that someone has watched the video, but I can't guarantee they've actually understood the content, right? For all it matters, it could be that they're scrolling TikTok while they're watching the video. Maybe the content just went over their head and they didn't actually get any of it.
I can only tell you if someone watched the video. That's a problem with a lot of training. I met with an executive recently who told me that the problem with training is that 50% of it doesn't work, but you don't know which 50% doesn't work, right? With this new product, because of the way you set up a skill (you define the different skills, you teach the agent how to evaluate them, and you put in a prompt), we can actually begin to build a dataset about your team, and we can show you how well-enabled people are on an individual level, on a team level, or on a geographical level.
That means we can surface insights like what you're seeing up here, where our systems can look at all the data, all the learners, and tell you what improvements you could make and what you're really good at. We can begin to build a graph of your teams and tell you what they're good at and what they're less good at, and this, of course, makes it much, much easier for our customers to deliver training exactly where they need it, rather than just painting with a broad brush. Skills is the first product we're launching using this new Video Agents technology, but there are a lot of other use cases we're very interested in that we're going to be releasing later this year. Things like interviewing candidates: a first screening where they talk to an agent that can give them a case study and talk with them.
We're excited about doing things like pulse surveys. You can go out to your team and actually interview them instead of just sending them a long survey they have to complete. They can talk to an agent for 15 minutes. I'm actually doing this with my own team right now: we've built one of these agents, and everyone in my company is going to have a 10-15-minute conversation with it.
They're going to talk to the agent about how they use AI, how they would like to use AI, and what problems they have with using AI in the company. And then we can get all that information from all 700 employees in my company, and we can do that in the span of a couple of days, because we don't need humans to interview them. And then we can, of course, also use large language models to summarize all those learnings and help me with a plan for how I can accelerate AI adoption in my own company. There are going to be so many things these Video Agents can do in the future.
We're also going to be launching it as an API. If you want to build your own app around a real-time avatar you can talk to, that's going to be a possibility. And I'm very, very excited to play a part in inventing what I think is going to be one of the most important interfaces of computing in the future. Thank you so much for having me.
Hope to meet a lot of you on the floor, and I hope to see you on Synthesia at some point. Thank you very much.
This transcript was put together by our friend Philippos Savvides from Arizona State University. The original transcript and additional summit resources are available on GitHub. Licensed under CC BY 4.0.