
Voices of Video
Explore the inner workings of video technology with Voices of Video: Inside the Tech. This podcast gathers industry experts and innovators to examine every facet of video technology, from decoding and encoding processes to the latest advancements in hardware versus software processing and codecs. Alongside these technical insights, we dive into practical techniques, emerging trends, and industry-shaping facts that define the future of video.
Ideal for engineers, developers, and tech enthusiasts, each episode offers hands-on advice and the in-depth knowledge you need to excel in today’s fast-evolving video landscape. Join us to master the tools, technologies, and trends driving the future of digital video.
Density, Efficiency, Power: Why VPUs Are Redefining What's Possible in Video Encoding
Nacho Mileo of Cires21 takes us on a journey through the evolution of video encoding infrastructure, revealing how Video Processing Units (VPUs) have transformed from just one option among many to an essential technology enabling the next generation of streaming solutions.
Founded in 2008 during the early days of streaming, Cires21 has always worked at the cutting edge of video technology. Starting with physical hardware encoders but maintaining a cloud-first mindset, they strategically positioned themselves to understand both worlds. This foresight has paid dividends as the industry evolved, allowing them to develop solutions that seamlessly bridge on-premises equipment and cloud implementations.
What's particularly fascinating is the paradox of video economics that Mileo describes. As encoding technologies become more efficient and cost-effective, the market expands rather than contracts. New use cases emerge, consumption increases, and what was once prohibitively expensive becomes accessible to more creators and distributors. Remember when "a terabyte of CDN was $500"? Those days are thankfully behind us.
The conversation takes a deep dive into Cires21's approach to AI integration, revealing how seemingly simple features like automatic video reframing or live captioning actually require sophisticated engineering with 12+ distinct processing steps. Their comprehensive AI toolkit includes captioning, dubbing with synthetic voices, content classification, and automatic highlights generation - all built with practical broadcasting needs in mind.
The crown jewel of their NAB showcase is their VPU-powered encoder that achieves remarkable density - 16 SDI inputs in a single rack unit. Through partnerships with NETINT and Akamai, they've created one of the first truly practical implementations of hybrid cloud for video encoding workloads. The same encoder technology runs both on-premises and in Akamai's Linode Cloud, all managed through a unified control interface.
Don't miss seeing this technology in action at NAB, where Cires21 will be demonstrating at NETINT's booth.
As Nacho emphasizes, this isn't just a PowerPoint presentation or future roadmap - these solutions are ready for immediate delivery.
Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.
Speaker 1:Well, hello, I am Mark Donnigan, and we have a very special Voices of Video episode that we're bringing to you today. NAB is right around the corner. Now, if you're watching this after NAB, then, well, this is what you missed. We are talking to very significant companies who are doing interesting things with VPUs, and it's interesting that VPUs have now switched from being one of maybe a smorgasbord of encoding options and architectures to being essential. What we're going to hear today in this special interview is how VPUs are enabling a whole new generation of video encoding infrastructure architectures that don't compromise quality, yet provide tremendous density and therefore energy efficiency and, obviously, cost advantages. So, with that introduction, I want you to welcome Nacho Mileo from Cires21. Nacho, thank you for joining us.
Speaker 2:Thank you for having me. It's a pleasure to be here.
Speaker 1:Absolutely, absolutely. So, as I said in my preamble, there are a lot of interesting things happening right now in video. You're going to tell us exactly what Cires21 does and the kinds of projects you're involved in, and I know you're right in the center of it. So why don't you give an introduction for those who aren't familiar with Cires21?
Speaker 2:Yeah, absolutely. Well, Cires21 is a company that pioneered, let's say, live streaming in Europe. We started more than 15 years ago, doing encoders as physical on-prem hardware, but with the mindset that this is going to the cloud at some point. So we started developing our encoders with CPU and then GPU, but when GPU jumped in, we stuck with CPU as well, because we knew at that point that when the cloud started, it was not going to be GPU-based. So we kept both CPU and GPU for many, many years, and we built on top of that. We were born in, let's say, complex scenarios, because our first clients were big broadcasters and big brands in sports. So, you know, technically difficult clients and situations.
Speaker 2:And we were born into that fire. After we stuck, let's say, with CPU and GPU, we were able to do hybrid approaches, with on-prem equipment and cloud equipment working together. We built our orchestrator, Live Control, which is a product that talks to our encoders and does some magic in terms of what's possible on-prem and in the cloud simultaneously. And in the last, I would say, year, or less than a year, we started introducing VPUs together with NETINT, and we have seen very, very good results and some amazing leaps forward.
Speaker 1:Yeah, it's great. We're excited about the work that we're doing with you on building these workflows. You mentioned that you're primarily working with broadcasters. Give us a sense: are these broadcasters that have their traditional broadcast infrastructure, maybe satellite distribution or over the air, and they're also adding on OTT? Or are these broadcasters who are maybe discontinuing some of those more legacy approaches and going all OTT? What's driving the work that you're doing with broadcasters and streaming?
Speaker 2:We are working mostly with, let's say, big fish, big broadcasters. I don't see them shutting down classical TV for now, but they have very serious approaches when it comes to OTT. So they have it all at the same time, and we are working with them on the OTT part. But what we are seeing is that, due to the flexibility that the internet and OTT bring, they may be doing, for instance, some special things that are only on the OTT services, or, for certain sports, there's some content that's exclusively over the internet. So I still see a mixed approach where we are still sticking to classic TV, let's say, but there's a ton of new things coming up with this. We have seen FAST channels appearing at certain clients. We have seen that they are taking more advantage of how ads work in the streaming world. It's a living thing, but I don't see that we are shutting down classical TV yet.
Speaker 1:At least not yet. Yeah, yeah, interesting. So let's talk about your solution. You know, the company is roughly 15 years old, is that right?
Speaker 2:Yeah, we were founded in 2008.
Speaker 1:So, yeah, okay, over 15 years, the early days of streaming, really. It's hard to believe. I was part of a team building an instrumental video platform in the US that eventually got sold to Walmart, and, in fact, the product debuted in, like, October or November 2007. And boy, it's amazing how far we've come.
Speaker 2:Absolutely. Yeah, I still remember the days when even platforms like YouTube didn't have live. Those things are not that old.
Speaker 1:It's still a boiling industry, I would say, and there's a ton of things going on; there are always new challenges, right? You know, there's a very interesting paradox, which is often talked about around AI: as the price of a new technology or capability goes down, it has the opposite effect from what you might expect. Someone who's not into economics might ask, well, does that hurt the business? The industry used to be able to make this much money, but the price is compressing, meaning it's not as expensive to deliver, and technologies are better. Does that mean the industry gets smaller? No, it's actually completely the opposite, and that's why it's a paradox. As it gets more cost-effective to stream, and as our technologies get better and more efficient, it actually drives more consumption and more usage of video, and traffic on the internet expands even faster because there are new use cases, right? There are new applications that are created, new entertainment experiences.
Speaker 2:I still remember back in the day when a terabyte of CDN was $500. It was prohibitive for many brands to do streaming or to do some stuff and, as you said, when it becomes more available, it doesn't hurt the industry; it just makes it bigger and more accessible for more people to use it and to test. Some really cool stuff that we have seen in the last few years was only possible due to these things happening.
Speaker 1:Yeah, that's right. So, Cires21: are you primarily an engineering company, or are you an engineering company that also brings products to your clients? Or are you a product company that does a little bit of engineering? How do you describe yourself?
Speaker 2:I would say that 80% of the company are engineers, so I would stick to the engineering company definition. We have been delivering this type of streaming product for the last, as we said, 15 years, and in the last two years we have also been broadening our offer to AI services. Not just generic gen AI services, but services connected to video: how we compress video, how we read video, how we extract data from video. So we are trying to leverage all the know-how that we bring from all these years working with live broadcasters and all these types of complex clients, and add AI on top of that. But we are still a video streaming company, and we like that. We're happy with where we are and, as you said, we are an engineering team. Most of the team is engineering, and we have a huge R&D team too right now: about a third of the company is working in R&D.
Speaker 1:And so what are they focused on? Are they focused on codecs and encoding? Are they focused on streaming protocols? The software application layer?
Speaker 2:Yeah, well, we have our very own, let's say, R&D deployment platform. They work mostly, of course, on video. Everything is video, and they work mostly on improvements in encoding and processes and also, of course, on AI and data extraction from videos.
Speaker 2:So, everything we can bring on top of our current offer. For instance, right now you can take an encoder, clip a certain part of a video inside the encoder, send it over to Media Copilot, which is the AI part, and get a reframed version for TikTok. All of that is done on the same website.
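Editor's note: the geometric core of reframing a clip for a vertical or square feed is choosing a crop window with the target aspect ratio. The following is a minimal sketch of that one step only, not a description of how Media Copilot works. The function name `reframe_crop` and the `focus_x` parameter (a normalized horizontal point of interest, which a real system might get from a face or object detector) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    x: int
    y: int
    width: int
    height: int


def reframe_crop(src_w: int, src_h: int, target_w: int, target_h: int,
                 focus_x: float = 0.5) -> Crop:
    """Largest crop with aspect ratio target_w:target_h that fits the source.

    focus_x (0.0 to 1.0) is the horizontal point of interest; the crop is
    centred on it where possible and clamped to the frame otherwise.
    """
    if src_w * target_h >= src_h * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        height = src_h
        width = src_h * target_w // target_h
    else:
        # Source is narrower than the target: keep full width, trim top/bottom.
        width = src_w
        height = src_w * target_h // target_w

    # Many encoders expect even dimensions, so round down to even values.
    width -= width % 2
    height -= height % 2

    x = round(focus_x * src_w - width / 2)
    x = max(0, min(x, src_w - width))  # keep the window inside the frame
    y = (src_h - height) // 2          # vertically centred
    return Crop(x, y, width, height)


if __name__ == "__main__":
    print(reframe_crop(1920, 1080, 9, 16))  # vertical (TikTok-style)
    print(reframe_crop(1920, 1080, 1, 1))   # square
```

The resulting rectangle could then be handed to whatever cropping tool the workflow uses. Tracking the point of interest over time and smoothing the crop movement are the harder parts, and they are deliberately left out here.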
Speaker 1:Yeah, yeah, interesting. And is that just a part of your core platform, or is that something someone has to buy separately?
Speaker 2:For now we are still splitting streaming and AI. It may converge at some point, but we are still working with streaming on one side and AI on the other. Everything is pretty interconnected, but we still work in those two worlds. You can use our AI without our streaming, and you can use just our streaming without any AI features.
Speaker 1:I see. Interesting, okay. Is that in production? Is there anybody that's using the AI clipping function?
Speaker 2:Yes, absolutely. We are working with a couple of clients already. We introduced Media Copilot at IBC last September, so it's still a pretty new thing, but we have a ton of interest and a couple of clients use it already.
Speaker 1:You know, it's funny how sometimes the operations that look simplest on the surface actually become quite intensive when you really have to execute them, especially at scale. Take publishing on social networks: they all have different formats that they like. There's vertical video, there's square video, one by one.
Speaker 2:Yeah, yeah, exactly, it's a challenge. Take news: we have clients that work with news, and the thing is, they really need this quick approach.
Speaker 1:It's not just solid streaming. They can't afford to have an editor spend half a day to edit a video.
Speaker 2:Yeah, yeah. And right now we are seeing really new approaches. You know, about the transformation you mentioned in the beginning: one of the things that we are seeing more and more is this reduction of the equipment that goes out, right? In the past you needed to go with a van to a certain public event, and the van had a satellite dish on top and two cameras and a ton of cables. Right now we are going out with a mobile phone that uses, maybe, SRT and goes straight to the encoder, and that's crazy.
Speaker 1:And there are other companies, but I'm just thinking of LiveU with their backpack. This is a little bit outside the space that we work in, but I see them out there and know what they're doing. From what I understand, it's like the standard: every news organization in the world has LiveU backpacks. The reporters, or the people who are out there at an event or some public thing that they want to cover, are wearing a LiveU backpack and they're transmitting via cellular, in some cases at very high quality. It's amazing.
Speaker 2:It's amazing and this is also connected to the improvements in codecs that we are seeing, the improvements in performance that we're seeing, like 10 years ago it was impossible to think about this. Yeah, exactly.
Speaker 1:Or it was really clunky and didn't always work well. So we're going to get to what you're showing at NAB, but this is kind of a buildup for the listener. So, AI is hot. Everybody is doing something in AI; it doesn't matter what function you're in and it doesn't matter what industry. So, as an engineering company focused on video, maybe you can bring us in a little bit: what projects are you building on, or what solutions are you leveraging? Are there models that you're using today, or that you're looking at? Bring us in as much as you can to how this solution is built and what you're excited about.
Speaker 2:The thing is, we work with a bunch of different models, and what we see is that everybody's doing AI. Everybody's talking about AI, especially. I would say more talking than doing.
Speaker 1:Yeah, that's true. That's why I asked if you were in production, and I'm so happy to hear that you have at least a couple of clients. Not just chit-chatting, right?
Speaker 2:But yeah, what we have also seen is that things look simple, but they're not simple. And we have a really cool example of this from when we started doing automatic live captioning for events.
Speaker 1:We have the same experience, so I'll let you finish, yeah.
Speaker 2:But the thing is, if you talk about this lightly, it's like, yeah, you put in a Whisper instance and there you go: HLS, Whisper, and subtitles. That would be the full circle. Right now, I don't have the exact number, but we have at least 12 steps in what we are doing. So if you don't have a huge R&D or engineering team in-house, it's really hard to leverage AI right now. Not leveraging ChatGPT, which is pretty obvious and simple, but leveraging these types of applications in which, for instance, you build subtitles for live content. So we are taking the signal; right now we're working, for instance, with HLS. We read the HLS and transcribe it. We check how long the subtitle line will be. We check with dictionaries whether there are words that shouldn't be translated. We check with blacklisting whether some words should be struck out or replaced with asterisks. Then we rewrite, or let's say rebuild, the playlist, and we deliver it with a certain delay so that what we are listening to matches what the subtitles say.
Speaker 2:So all of that is just one example, and we have this for a ton of things in AI. We do dubbing, or voiceover. We recognize speakers, we recognize objects. For every piece, it's like, oh, we have object detection, yes, this company does that. But then when you need to put it all together and think about this as a product, it's way, way more complicated.
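Editor's note: two of the captioning steps Nacho lists, masking blocked words and checking subtitle line length, can be illustrated with a short sketch. This is an assumption-laden illustration rather than Cires21's pipeline: the function names, the 42-character line limit, and the choice of WebVTT as the cue format are all hypothetical, and transcription, dictionary checks, playlist rebuilding, and delay alignment are left out.

```python
import re
import textwrap


def mask_blocked(text: str, blocked: set[str]) -> str:
    """Replace each blocked word (whole word, case-insensitive) with asterisks."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in blocked else word

    return re.sub(r"[A-Za-z']+", repl, text)


def wrap_cue(text: str, max_chars: int = 42) -> list[str]:
    """Split caption text into lines no longer than max_chars (assumed limit)."""
    return textwrap.wrap(text, width=max_chars)


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_cue(start: float, end: float, text: str, blocked: set[str],
              max_chars: int = 42) -> str:
    """Build one WebVTT cue: a timing line followed by masked, wrapped text."""
    lines = wrap_cue(mask_blocked(text, blocked), max_chars)
    timing = f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}"
    return timing + "\n" + "\n".join(lines)


if __name__ == "__main__":
    print(build_cue(12.0, 15.5, "Well that was a darn good goal", {"darn"}))
```

Even this small fragment shows why the full product has many steps: each cue still has to be timed against the audio, packaged into segments, and referenced from the rebuilt playlist.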
Speaker 1:Yeah, that's super interesting. So I was curious what specific functions you were focused on using AI for. We heard about resizing and reframing, which is clearly needed for the social networks, as we discussed. And you're doing subtitling. Is that also in production, or is that coming?
Speaker 2:Yes, it's also in production. The only thing we have right now in staging and not in production is dubbing. That's coming too.
Speaker 1:Yeah, that sounds super interesting.
Speaker 2:Maybe it should be out.
Speaker 1:Really? And dubbing, so that would be: we're speaking English, right, and then it could be translated in real time, I guess? I mean real time meaning, obviously, there's a delay to the HLS stream.
Speaker 2:Not in real time for now, but we will use a synthetic voice over my voice, for instance, to say what I'm saying in English in Spanish, Portuguese, Dutch. Those things are coming.
Speaker 2:We also do video highlights, we do summarization, and we classify the content using EBU or AP, depending on the region of the world, or using a custom scheme for certain broadcasters that have their very own tree for how they treat data. So that's also something we're putting in, and you can download all of this or integrate it over an API. So if you have a MAM and you want to enrich the data that's already in your MAM, that should be pretty easy to integrate.
Speaker 1:Yeah, amazing. Well, very cool. I know you're going to be showing, at least I assume, all of that and more at NAB. Tell me this: when someone engages with you... so you're dealing with these very large operators, and a lot of them are using commercial services, so they have a mix of commercial products. The engineering is more integration engineering than it is building, say, from the ground up on open source projects.
Speaker 1:So what does that look like for you? Are you doing that integration work as well? Are you bringing a bespoke solution that your customer then integrates? Are you there advising and even building the entire system? I'm just trying to get a handle on what the scope is, because there are so many people in the market that say, oh, we have a video platform, right, and they give all the usual feature lists, like everybody else's. But then you have to kind of dig in and see, oh, okay, well, you don't do this, you don't do that, you can only work in this environment. There are all the asterisks.
Speaker 2:A lot of small print, right? Yeah, yeah.
Speaker 1:So I'm asking the question, Nacho, because I'm guessing that there's probably at least one person listening who's like, hey, sounds interesting, but how are you guys different, and how do you work with your clients compared to, fill in the blank, somebody else?
Speaker 2:Yeah, yeah, we will not name anyone here, but yeah, it depends on the project and the scope. We work a lot with customization with our clients. Of course, we have our products and we try not to, you know, add features that are just a feature request for one client, that will not be leveraged by the rest of our clients, or at least a group of them, but we get involved in general with how they integrate, for instance, with different things. I'm thinking about a project that's happening right now in which we are doing real-time streaming and there's a feature request around sending certain type of signaling inside the streams.
Speaker 2:And that came as a requirement, and we may say build it yourself, or we can point to someone that builds it, or we can do it. It depends a lot on the project. In this case, we understood that this would be useful in the future for more clients, so we added it as a feature and it's already working. So I would say that the answer is it depends, but we work a lot hand in hand with our clients.
Speaker 1:Yeah, yeah, understood. Well, very good. Okay, so let's end here: what are you showing at NAB, and why should someone come visit you?
Speaker 2:We are showing a really, really cool encoder with VPU right now.
Speaker 1:I've heard about that. I know something about that one.
Speaker 2:Yeah, you've probably heard about it. We have done a lot of tests and you're familiar with it, but the thing is, we are seeing a huge leap forward when it comes to power efficiency, when it comes to consumption, but especially when it comes to density.
Speaker 2:We have been able to put 16 SDI inputs in a one-rack-unit encoder using NETINT's VPU. So this is already in place. Our encoder does a lot of stuff, and something else that we are working on, and that is already being tested, is that, being partners of NETINT but also partners of Akamai, we are going to be able to deploy our encoder with VPU on Linode's cloud, on Akamai's cloud. That's pretty powerful when it comes to having an encoder in the cloud. And also, as I said before, we have this hybrid approach, so you may be able to have a VPU on the ground but also a VPU on Akamai's cloud, all working together and connected, using different encoders but controlled from the very same Live Control instance. So I think it's a pretty interesting approach that we have right now, with the VPU fully integrated in our encoder.
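Editor's note: the hybrid idea described here, filling on-prem capacity and flexing into the cloud when it runs out, can be sketched as a simple placement routine. This is a hypothetical illustration and not how Live Control actually schedules work. The `EncoderNode` class, the node names, and the cloud capacity of 8 are made up; only the figure of 16 inputs per on-prem rack unit comes from the interview.

```python
from dataclasses import dataclass


@dataclass
class EncoderNode:
    name: str
    location: str   # "on-prem" or "cloud" (hypothetical labels)
    capacity: int   # number of channels the node can encode at once
    assigned: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.assigned


def place_channels(channels: list[str], nodes: list[EncoderNode]) -> dict[str, str]:
    """Assign each channel to the first node with free capacity, on-prem first."""
    # sorted() is stable, so nodes keep their given order within each location.
    ordered = sorted(nodes, key=lambda n: n.location != "on-prem")
    placement: dict[str, str] = {}
    for channel in channels:
        node = next((n for n in ordered if n.free > 0), None)
        if node is None:
            raise RuntimeError(f"no encoder capacity left for {channel}")
        node.assigned += 1
        placement[channel] = node.name
    return placement


if __name__ == "__main__":
    nodes = [
        EncoderNode("cloud-1", "cloud", capacity=8),    # assumed cloud capacity
        EncoderNode("rack-1", "on-prem", capacity=16),  # 16 inputs per rack unit
    ]
    placement = place_channels([f"ch{i}" for i in range(20)], nodes)
    overflow = [ch for ch, node in placement.items() if node == "cloud-1"]
    print(f"{len(overflow)} channels flexed to the cloud: {overflow}")
```

A real orchestrator would also weigh factors the interview mentions, such as requirements that certain workloads stay on-prem, but the capacity-overflow rule is the simplest form of the flexing being discussed.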
Speaker 1:Yeah, we're really excited about this. Everybody's been talking about hybrid, hybrid cloud. I mean, for years it's been talked about and theorized, with people saying, oh yes, we're doing it.
Speaker 1:But oftentimes it was really difficult to truly have a real hybrid solution that was duplicated in a data center, or on-premise somewhere, or even in another cloud, where you seamlessly just moved and it didn't matter where the service was running. It just wasn't possible. It was theoretically possible, but there was always a difference between what cloud A provided, what cloud B provided, and what I had available in my data center, and you know this, right? So with what you're building with Akamai and the Akamai Connected Cloud, and I'm speaking from a video encoder perspective, I would argue that for the first time it is truly possible to flex hardware from a cloud environment, as in Akamai's Connected Cloud, to on-prem and back again, either based on capacity or because there's a certain use case or function that requires on-prem.
Speaker 1:I know of one large project in Europe that's actually going to be featured at NAB as a case study. For this particular project, a government project, it was a requirement that the primary infrastructure be on-prem, but there were other parts of it that could flex, and they're super excited that now they can do this without having to spend a whole lot of money. So that's really awesome. Anybody who's interested in true hybrid flexing, make sure you come see Nacho and Cires21, because they're going to be showing this, and it's real. It's not just a demo and it's not just a PowerPoint. It exists, it works, it's real.
Speaker 2:We can deliver the day after NAB without any problem. No excuses.
Speaker 1:Amazing. I love that. We're like you guys: we avoid at all costs going out and talking about things that are wishes and dreams. We only talk about what's real.
Speaker 2:So let's keep it that way. Absolutely, it's great.
Speaker 1:Well, Nacho, thank you so much for joining us. Maybe we'll do a wrap-up episode after NAB, sort of a hey, what did you see? What did you learn? What was the response? But I do encourage everyone who's listening: we're now about five or six weeks before NAB, so make sure you put Cires21 on your must-visit list. And it's really easy, because you are going to be where?
Speaker 2:Right in the corner at NETINT's booth. If you come to NETINT, come see us.
Speaker 1:That's right, it's super simple.
Speaker 2:This episode of Voices of Video is brought to you by NETINT Technologies. If you are looking for cutting-edge