Voices of Video

Breaking Barriers in Low Latency Communication

NETINT Technologies Season 2 Episode 4

Unlock the secrets of the future of streaming with Oliver Lietz, the visionary founder and CEO of nanocosmos, as he takes us through the exhilarating evolution of low latency streaming technology. With over 25 years at the forefront of this industry, Oliver unveils how his company transformed from a codec provider to a groundbreaking live streaming platform. Gain an insider's perspective on the pivotal role of low latency in interactive streaming, illustrated through a compelling case study of a major conference in Saudi Arabia, where real-time communication with a global audience was made possible.

Embark on a journey through the complex world of adaptive bitrate delivery in ultra-low latency environments. Discover how nanocosmos leverages the unique capabilities of WebRTC compared to traditional streaming methods like HLS and DASH. Oliver shares strategies for overcoming integration challenges while ensuring seamless user experiences across various devices. We'll unravel the intricacies of audience scalability, the strategic use of third-party networks for reliability, and the future of codec compatibility, focusing on H.264 and upcoming advancements like HEVC and AV1.

Explore the intricacies of modern live streaming technology and transcoding insights, as we dissect the strengths and limitations of protocols like RTMP, SRT, and WebRTC. Oliver highlights the importance of robust delivery systems capable of handling thousands of users and shares his take on the efficiency of HTTP-based protocols. We conclude with a discussion on video encoding efficiency, touching on recent updates in FFmpeg and the balance between CPU utilization and server-side scalability. If you're curious about the evolving landscape of streaming technologies, this episode promises to deliver profound insights and innovative solutions.

Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.

Speaker 1:

Voices of Video. Good morning, good afternoon, good wherever you are. I'm Jan Ozer. Thanks for coming to NETINT's Voices of Video, where we explore critical streaming-related topics with the experts who are creating and implementing new streaming-related technologies. If you're watching and have questions, please post them as a comment on whatever platform you're watching. We'll answer them live if time permits. Today's episode is all about low latency streaming, and we speak with Oliver Lietz, engineer, founder and CEO of nanocosmos, a Berlin-based company with more than two decades of experience in streaming. The company's flagship product is nanoStream Cloud, an industry reference for reliable B2B interactive live streaming on any device. Oliver, thanks for joining us. Tell us a little bit about your company, how long you've been in business, your products and services, that type of stuff.

Speaker 2:

Very good introduction already. It goes back 25 years now, so we are celebrating this year, which is amazing. Digital cameras didn't even exist when we started with digital video in 1998. We have grown from an engineering company providing codecs and software tools for the broadcast industry into a full live streaming platform for interactive use cases, which means the latency between the camera and the viewer needs to be very low. Ultra low, as we say, is around one second end to end, to enable real-time interaction between the camera and the viewers.

Speaker 1:

Okay, I was going to ask you about a typical customer, but we got a free NAB case study today and it talked about a conference that you helped produce called the Future Investment Initiative. So why don't you describe what that is and what was involved and what your role was in getting all that set up and operating?

Speaker 2:

Well, it was a big conference in Saudi Arabia with several thousand attendees, 10,000 attendees at the location and around 15,000 in the virtual space, and there were high-profile people speaking on the stage, celebrities but also important politicians and investment people talking about renewable energy and investment policies. So a really high-profile event, and they needed a reliable solution for interactive live streaming. They wanted a global audience anywhere in the world, directly available in the browser, and they needed it working reliably for interaction between the audience and the panels, to be able to communicate with chat feedback from the audience and to ask questions for this event. So several channels were set up for that, and we were getting the stream from the production setup, picking it up into our CDN, doing the whole delivery worldwide and also running the player in the browser, which ran with adaptive bitrate to accommodate all the different quality levels. So you have a running live stream for interactive purposes on every device, and that needed to stay very stable and 100% reliable for these shows.

Speaker 1:

Were you the whole streaming component or just a piece of it?

Speaker 2:

We were the core of the streaming component, but there were also several channels sent out to social media platforms like YouTube and Facebook. The primary stream was run through our system.

Speaker 1:

Give us a sense of why low latency was important. Were videos coming back from the audience, or was it the chat function that needed to be interactive? Why was low latency important in that application?

Speaker 2:

With interactive streaming, we understand that there is a kind of feedback from the audience back to the presenters, which in almost all cases is not video-based but text-based. It can be a voting, a chat, a question, Q&A, any kind of action which is triggered by someone in the audience who then gets some interaction back. In these cases that can be shared Q&A questions, and this enables the interaction. Only if you have real-time, low latency streaming do you have a very low delay between both parties.

Speaker 1:

Looking outside of that particular application, are there any applications that you serve where interactive video is a major component of that?

Speaker 2:

So we see two major scenarios for that. One is similar to what you just mentioned, the enterprise space, the large event space, where we also have corporate customers doing town hall meetings with Q&A. It all needs to be accessible directly in the browser, so you don't need to set up any kind of Zoom client and can directly watch the stream on every device. The majority of clients are mobile phones, so it needs to be directly accessible on every handset. And then you have the interaction elements, separate from the video or on top of the video, to ask questions and give any kind of feedback. That's the corporate space. On the other end there's also the large space of monetized video content, which is betting, gaming, auctions. Live auctions is quite large. There you have a revenue channel directly based on this video stream and can't use the application without real-time video.

Speaker 1:

So I mean lower latency is always better, but there are trade-offs associated with latency. Can you talk about those?

Speaker 2:

There are certain applications which really require low latency. In the auction space, you can't go higher than two seconds end to end; it must be very low, otherwise you can't legally keep up with the live auction against the real people sitting at the venue, for example. This requires complete control of the whole workflow and adaptation of the live stream you send to the devices. Very important is that you have a good adaptive bitrate control system, which requires transcoding of the right bitrate ladder to send the right bitstream to the receiver. That means it's not always best to send out the highest quality, like 4K or Full HD, to the clients; it can be as small as 320×240. At least you have the live content in real time, and you can enable this interaction on your monetized revenue channel.

Speaker 1:

So you're saying that quality or resolution may suffer if you try and make the latency as low as possible?

Speaker 2:

Not necessarily, but everybody's running in all kinds of different networks, so you may have hostile environments when you're commuting or in remote locations. Network availability varies, especially on mobile: you go into some location where network quality suddenly drops, and then you need to adjust. Keeping the quality as high as possible means not only the highest video quality but the whole quality of experience. The interaction needs to stay active. It can't buffer; it needs to adjust to the network conditions you are on. And if you have a small-bandwidth network like 3G with 500 kilobits per second max, it just needs to be a low quality video, but at least you have a signal and you can keep the application running.
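To make the adaptation Oliver describes concrete, here is a minimal TypeScript sketch of a bandwidth-driven rendition switch. The `Rendition` type, the thresholds and the safety factor are illustrative assumptions; the actual nanoStream player logic is not public.

```typescript
// Hypothetical types and values; the real player's internals are not public.
interface Rendition {
  label: string;        // e.g. "1080p", "720p", "320p"
  bitrateKbps: number;  // total bitrate of this ladder rung
}

// Pick the highest rung that fits the measured bandwidth, keeping a
// safety margin so the buffer never grows (a growing buffer means
// growing latency in a sub-second stream).
function pickRendition(
  ladder: Rendition[],   // sorted high to low
  measuredKbps: number,
  safetyFactor = 0.8     // assumed margin, not a nanocosmos value
): Rendition {
  const budget = measuredKbps * safetyFactor;
  for (const r of ladder) {
    if (r.bitrateKbps <= budget) return r;
  }
  return ladder[ladder.length - 1]; // worst case: lowest rung, keep the stream alive
}

// Example: on a 3G link measured at ~500 kbps, only a low rung survives.
const ladder: Rendition[] = [
  { label: "1080p", bitrateKbps: 3500 },
  { label: "720p", bitrateKbps: 1500 },
  { label: "320p", bitrateKbps: 400 },
];
console.log(pickRendition(ladder, 500).label); // "320p"
```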

Speaker 1:

So what technologies are available for low latency streaming that can operate in the sub two second realm?

Speaker 2:

When we started, there was not much available. Everything was shifting towards HLS and DASH, and people were surprised to suddenly see these large latency values of 30 seconds, one minute, several minutes. It was quite surprising to many people who wanted to have interactive video. So these use cases were not covered by the traditional CDNs, and they still are not. Traditional CDNs based on HLS and DASH are still buffering things, and you need to keep that under control to get the latency very low, which can go down to maybe three or four seconds with these approaches, but then you are really fighting at the limits of what you can get there. That's why, in the end, we decided to create our own technology, which really keeps the latency in the second range and keeps both the server side and the player side under control. It's an active connection between the player and the server, not like HLS or DASH, where the player just passively watches a stream and buffers whatever is coming. So the whole technical challenge is a bit different there.

Speaker 2:

There are also other approaches, like WebRTC, which was created for real-time interaction but originally for smaller groups, like a video meeting such as the one we are in now, where guests might join as well. If you want to enlarge this group to have everybody available on all mobile devices and all mobile networks, WebRTC also gets very challenging. That's another reason we decided to create our own technology. What we also noticed is that there are many technologies available for certain kinds of use cases, but technology is not leading these decisions anymore. It's more the complete platform, the complete application. There are things like adaptive bitrate, transcoding, analytics, the player, the whole CDN, the network ingest. Keeping everything under control is not only the streaming technology but the whole platform, and that's quite challenging for business customers who want to focus on getting their business accomplished.

Speaker 1:

So you're saying that low latency HLS and low latency DASH... I mean, what's the lowest latency you can achieve reliably with those technologies?

Speaker 2:

With HLS and DASH you can go down to maybe three seconds or so, which is higher than the ultra-low latency real-time interactivity I mentioned. If it goes back and forth, it's already six seconds, so no interaction is possible anymore, and that's only the best case. It needs complete control on both ends, on the CDN and on the player side, to keep that under control, and that's why our technology approach is a bit different here.

Speaker 1:

And you're saying that once you look at the ultra-low latency technologies, the people you're dealing with don't care whether it's WebRTC or any other technology. They just want a turnkey solution that works overall. So why don't you talk about the solution that you guys offer in that respect?

Speaker 2:

We offer the complete CDN: the network, the ingest points globally, so we can ingest a live stream from anywhere in the world you want.

Speaker 2:

We do the transcoding in the first place, to several bitrates.

Speaker 2:

We provide quality levels for every client network, and we do the distribution around the world with several edge locations, also globally available. And we have the player software, which our customers install on their web pages and which picks up the live stream from the closest edge location. So it's a complete low latency CDN plus the live transcoding plus the player, as an integrated product. On top of that we also have an analytics platform, which is very valuable and more and more required by our customers. It gives insight into the quality of experience of the whole workflow and can identify potential issues in any part of the video streaming workflow: if latency goes up, if it's buffering, or if you have network connection problems. You need identification of every part of the workflow and need to track these things down. Plus you have, of course, the option to get business intelligence, to see in which part of the world which streams are playing, how large the volume is, et cetera.

Speaker 1:

You know, you talk about adaptive bitrate delivery. Is that unusual in the ultra-low latency technology realm? I mean, does WebRTC do that, or is that possible? WebRTC has some kind of scalable mode included.

Speaker 2:

I'm not sure about the details of how far that has gone yet, but it's different from traditional adaptive bitrate. We do a kind of hybrid approach, like in traditional HLS/DASH scenarios: you send something to us, we create a bitrate ladder and create several bitstreams out of that, and then we send it all out, based on our low latency protocol, to the edge locations and to the players, to keep the right quality up and running. So it's a ladder-based approach, and in WebRTC I think it's managed a bit differently. It's also challenging to get the player under control and have a compatible stream running at the same time on all devices. WebRTC is, I would say, a bit of a beast of a standard. It has grown from the initial beta version in the Chrome browser to now being available on every device, but still every browser does it a bit differently, and the whole handling is quite complex. So getting that under control is quite challenging.

Speaker 1:

What does that mean from an integration standpoint? If I implement your technology, how do I integrate my player? I guess I send you a stream into the cloud and you handle the distribution. What does the playback side look like? How do I integrate that into whatever application I'm building for that event or that program?

Speaker 2:

It's like other player vendors do it. We have a JavaScript code snippet which you can put into your player page. You also have an iframe which you can easily use, or a dashboard website where you can directly watch the stream. So there are different levels of integration you can use. Copy and paste the code snippet into your web page and you directly have the live stream integrated, and the adaptive bitrate handling, the connection handling to the closest location and the right network, et cetera, is all done by the player. That's the intelligence in the player library, and it makes it more valuable than just having a player which you then need to connect to a third-party CDN and a third-party encoder, et cetera. It's more the complete integrated approach which creates the value here.
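For a sense of what such a snippet-level integration can look like, here is a hedged sketch. The class name `NanoPlayer`, the `setup()` call and the config fields are assumptions for illustration; the vendor's documentation provides the real snippet.

```typescript
// Illustrative only - the actual player API may differ; the vendor's
// copy-paste snippet is what you would really use.
declare class NanoPlayer {
  constructor(containerId: string);
  setup(config: object): Promise<unknown>;
}

const player = new NanoPlayer("playerDiv"); // a <div id="playerDiv"> on your page
player
  .setup({
    source: {
      // Stream identifier handed out by the platform (hypothetical field names).
      entries: [{ h5live: { rtmp: { streamname: "your-stream-id" } } }],
    },
    playback: {
      autoplay: true,
      muted: true, // browsers generally require muted autoplay
    },
  })
  .catch((err) => console.error("player setup failed", err));
```

The point of the embedded approach is that ABR switching, reconnection and edge selection live inside this one library call rather than in your application code.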

Speaker 1:

The program we talked about a few minutes ago, I think they had 15,000 remote viewers. What's the largest audience size you've supported with your technology?

Speaker 2:

It's not like in the CDN environments where you can directly go to millions, at least as some vendors claim, but it usually goes to about 100,000 or 200,000 concurrent streams. That's sufficient for all the applications we have seen until now. If there's a need for larger growth, you can scale up with other services as well. That's an indication of the scale you can reach with this technology.

Speaker 1:

And you talk about your own CDN. Do you have your own build-out of a CDN, or are you working with third-party CDNs and just integrating your technology stack as necessary into their technology?

Speaker 2:

We're working with third-party partners, but not CDNs, because a CDN already has this caching approach with the HLS chunk file transfer. That's why we need software control of our own. We build on our own virtual or bare metal machines, which we run on our partners' networks. So it's a multi-provider approach: Amazon, but also other providers, with a multi-failover system built in. Our goal is really to reach 100% stability and no downtime, which we enable by automatic failover to a data center from another vendor if something goes down or goes wrong, and we put high effort into keeping that system up and running. There are a lot of challenges, and we are quite happy that we have complete control over the system and insight, and can tune the knobs on these systems if required.

Speaker 1:

How do you typically charge for an event? Is it by audience? Is it by, you know, delivery streams? How does that work?

Speaker 2:

We ask for a rough indication of the volume you want to have, but in the end it's volume-based, like other providers: the bytes going through the system, traffic-based. The larger the audience, the larger the volume, and there can be some packages prepared. Usually we have self-service packages which start at around 500 bucks per month, but usually it's customized quotes and offerings which we discuss with the customers and our partners, so they have the right package available, and then this can scale up easily. Based on these smaller packages, they can grow as they need. You can instantly live stream, add more streams, add more nodes, et cetera. So it's a self-service system once it's started, and then you can grow by yourself based on your demands.
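Since billing is traffic-based, a back-of-the-envelope estimate is straightforward. The numbers below are illustrative, not nanocosmos pricing inputs:

```typescript
// Rough traffic estimate for a volume-billed event.
const viewers = 15000;      // concurrent viewers (the virtual audience size mentioned earlier)
const avgBitrateMbps = 2;   // average delivered bitrate across the ladder
const hours = 3;            // event duration

const gigabytes =
  (viewers * avgBitrateMbps * hours * 3600) / 8 / 1000; // Mbit -> MB -> GB
console.log(`~${gigabytes.toLocaleString()} GB delivered`); // ~40,500 GB
```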

Speaker 1:

What's the codec side look like? Are you stuck on H.264? Are you starting to extend out into other codecs? Where are you with that?

Speaker 2:

We are still with H.264 because that's the most compatible format, running on all devices, for the distribution side. We are working on other codecs, of course, like HEVC but also AV1, which then have challenges again for the real-time encoding mode. And there are interesting solutions coming into place, like the NETINT solution, which makes real-time encoding more efficient on the server side. So we pick up the stream in the right format, at as high a quality as possible. That's still mostly H.264, and it can be 2K, 4K, whatever, but it could also be H.265, AV1, et cetera. And then we transcode it to the formats which are available on devices, which is still H.264 on the delivery side.

Speaker 1:

Why the delay with HEVC? I mean, transcoding has been available for... our product came out in, gosh, '18 or '19. So why is it so slow? Is it the decoder side, the whole Chrome thing, or what delayed that?

Speaker 2:

Yeah, HEVC is maybe comparable to H.264 in these things if you set up the right profiles. It's a bit more complex: we have more profiles available for HEVC encoding than for H.264, and you have even more profiles for AV1. And then it gets difficult; it buffers a bit more on the encoding side. So there are trade-offs we consider, and it's always a question based on the customer requirements. Is it whatever, 100, 500 milliseconds more? Would that be sufficient or not, compared to the quality you get? Those things depend very much on the use case.

Speaker 1:

Have you done experiments that show how much bandwidth reduction you can achieve with HEVC or AV1?

Speaker 2:

That's very dependent on the encoder.

Speaker 2:

As every video expert knows, but maybe not everybody in the industry knows, the encoding and quality results are very much dependent on which encoder brand you are using and which configuration you set on the encoder. There are things like baseline profile and main profile, which decide on the quality, but there are also implementation details in the encoders which create higher or lower quality. There are standard encoders available, but there are also software encoders, hardware encoders, all kinds of encoders, and if quality is key, you need to check whether the quality results are really comparable. And when you compare them, x264 to H.264 to H.265/HEVC or whatever, you need to be sure that you compare apples to apples and not one profile to another profile which doesn't match. In the end there is of course a benefit, but it's not easy to get that under control and to find the right profile for the right distribution. For HEVC there are numbers, you know them better than me, like 20%, 50%, whatever benefit, but that's not the deciding factor in all cases, because the whole handling also needs to be under control.
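A minimal sketch of what "apples to apples" means in practice: hold rate control, preset and GOP settings constant across codecs before comparing quality. The flags are standard FFmpeg/libx264/libx265 options; file names are placeholders.

```typescript
// Build FFmpeg argument lists that keep everything equal except the codec.
function encodeArgs(codec: "libx264" | "libx265", outFile: string): string[] {
  return [
    "-i", "input.mp4",
    "-c:v", codec,
    "-b:v", "2M",        // same target bitrate for both codecs
    "-preset", "medium", // same speed/quality preset
    "-g", "60",          // same GOP length
    "-an",               // drop audio so only video is compared
    outFile,
  ];
}

console.log("ffmpeg", encodeArgs("libx264", "out_h264.mp4").join(" "));
console.log("ffmpeg", encodeArgs("libx265", "out_hevc.mp4").join(" "));
// Then score both outputs with the same metric (VMAF, PSNR) at the same
// resolution before claiming any "x% better" number.
```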

Speaker 1:

How much of the charge to the customer relates to the bandwidth that you distribute? Is that a major component of it, or is that just an afterthought?

Speaker 2:

We charge basically based on the bandwidth. That's then decided on the player side, right? If you are on a low bandwidth network, you can't go up with the bandwidth anymore, so you need to go to the limits of what the audience has.

Speaker 1:

We have a question from Mark Kogan. Like all his questions, I think it's going to lead to another question, but let me throw this out at you: is it a must-have for the segment generation to be in direct sync with the incoming timestamps from the encoder, with frame accuracy and sync? Can you answer that, or is that too vague?

Speaker 2:

That's very technical and that's the details we really don't care too much about.

Speaker 2:

In the end, it doesn't matter what we get. You can configure your encoder even with a two-second or four-second GOP size and we still make a low latency stream out of it. And that's the challenge I meant: when you want to handle these things yourself, you need to worry about all these details, like the GOP size, syncing segments to your distribution, et cetera, and that's what we try to hide below our APIs to make it as easy as possible to use.
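A quick worked example of why GOP size is one of those details: a player can only join cleanly at a keyframe, so join time and chunk duration are bounded below by the GOP duration unless the platform re-encodes.

```typescript
// GOP ("group of pictures") size vs. worst-case join delay.
const fps = 30;
const gopFrames = 60;               // e.g. a "keyframe interval: 2 s" encoder setting
const gopSeconds = gopFrames / fps; // = 2 s
console.log(`worst-case keyframe wait: ${gopSeconds} s`);
// A platform that transcodes (as described above) can insert its own
// keyframes, which is why the ingest GOP stops being the customer's problem.
```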

Speaker 1:

So let's cover the implementation side. If I'm running a low latency event with your technology... I guess I wasn't aware that it was available on a turnkey basis. Turnkey is a bad word, but I can go to your website, I can sign up for it, I don't need to talk to a human. I can just send you a stream, integrate the player, and then I'm done. What percent of your customers are actually doing that?

Speaker 2:

Many customers are starting with that. Our approach has always been to provide an honest and transparent way to use our technology, make it easy to use, and follow the approach:

Speaker 2:

Seeing is believing, which means you can try it out: you can directly sign up and use it directly for your system. There's even a seven-day free trial. We believe in what we do and what we can provide. So it's very easy to get started, and that's the nice thing about it: you directly get a live stream as a result and you get a visible picture. You can play with it, and you can even run the open source software OBS to start from your webcam and get low latency results. You don't need any specific hardware or camera setup on your end; it works out of the box with webcams as well. Starting with that, you can then grow, add more professional equipment, integrate the player on your web page, et cetera, and that makes it very easy to start from very basic things and grow from there.

Speaker 1:

Let's walk through the implementation. I want to do a live event, call it an auction, a celebrity or charity auction. How does it work from an integration standpoint? You said I can use a webcam, but if I've got a camera setup, do you just provide an RTMP-type link that I need to send to, or how does that work?

Speaker 2:

You need to have an encoder on your side. Based on your camera, you need a live encoder. It can be software, as I said, like OBS, and of course it can also be a hardware encoder. We have partnered with companies like Osprey Video, who integrate our encoding URLs directly into their hardware boxes. But it's basically an RTMP URL which you put into your encoder.

Speaker 2:

You configure the stream for the right bitrate, which on your end you need to decide on your own: how high shall the quality be?

Speaker 2:

Say it's a Full HD stream at three or four megabits per second. Then you send us the stream, we give you an ingest URL, and we take care of the rest. The setup you need to do on your end is to keep things under control on your network, so the network bandwidth is available for this upstream. Then we do the delivery, and you will have a player which you can put on your own web page, and you have the live stream end to end in your application. And it's not only RTMP. Of course we can also work with SRT, which is a more and more prominent protocol used everywhere in the broadcast industry and has advantages in unstable network situations for the upstream. We are also providing WHIP integration now, the WebRTC ingest protocol, which is on a similar level to SRT and makes it easier in hostile environments. So whatever is sent to us, we take care of the delivery worldwide and the delivery to the browsers.
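As a concrete illustration of the contribution side, here is a sketch that pushes a test stream to an RTMP ingest URL by spawning FFmpeg from Node.js. The URL shape is hypothetical; the platform hands you the actual ingest URL and stream name. All FFmpeg flags shown are standard options.

```typescript
import { spawn } from "node:child_process";

const ingestUrl = "rtmp://ingest.example.com/live/your-stream-id"; // assumed format

const ffmpeg = spawn("ffmpeg", [
  "-re",                  // pace a file input like a live source
  "-i", "input.mp4",      // stand-in for a camera or OBS feed
  "-c:v", "libx264",
  "-b:v", "3M",           // the "three or four megabits" Full HD upstream
  "-maxrate", "3M", "-bufsize", "6M",
  "-g", "60",
  "-c:a", "aac", "-b:a", "128k",
  "-f", "flv",            // RTMP carries FLV
  ingestUrl,
]);

ffmpeg.stderr?.on("data", (d) => process.stderr.write(d));
ffmpeg.on("close", (code) => console.log(`ffmpeg exited with ${code}`));
```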

Speaker 1:

What are the analytics I care about from a stability standpoint? I know I'm going to get stream count and number of viewers and all that stuff, but from a reliability or stream viability perspective, what information do you give me that lets me know things are either working well or about to go wrong?

Speaker 2:

Yeah, of course, things like you said: the stream volume, the number of viewers. You have a geo-based world map where you can see what's playing where. And in terms of the streaming integration, what kinds of errors can happen: latency can go up, the error rate can go up, like the buffering ratio, and networks can drop if you get lost in whatever network situation. There might be larger customers running in corporate spaces where you only have small-bandwidth connections to the servers. Those are things you notice only when you have metrics collection from the server side but also from the player side, and we can aggregate all of that to make it visible.
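The metrics Oliver lists suggest a per-player sample shape roughly like the following. This is a hypothetical schema for illustration; the actual nanocosmos analytics format is not public.

```typescript
// Hypothetical per-player QoE sample, aggregated server-side per stream/region.
interface PlayerMetricsSample {
  streamId: string;
  timestamp: number;        // epoch ms
  latencySeconds: number;   // end-to-end, camera to glass
  bufferingRatio: number;   // fraction of playtime spent buffering
  droppedFrames: number;
  downstreamKbps: number;   // measured network throughput at the player
  rendition: string;        // currently selected ladder rung
  geo: { country: string; city?: string };
}
// Rolling these up per stream and region is what lets a dashboard flag
// "latency going up" or "buffering ratio rising" before viewers complain.
```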

Speaker 1:

Hang on. We have a question from Paul Connell. If I'm comparing your system to a WebRTC-based system, what are the five points of comparison that I care most about? What do I look at when I'm comparing you with a WebRTC-based system?

Speaker 2:

I don't know if I can count to five now, but I can try to list some differences. WebRTC is usually browser-based: when you create a WebRTC stream, you would usually do that from the browser. It was originally created for real-time interaction, and that means only smaller bitrate profiles were allowed in WebRTC. I think it's still only baseline profile that's allowed in the Chrome browser, and you can go much higher in ingest quality when you use a studio-based protocol like SRT or RTMP, up to whatever 10-bit 4:2:2 4K stream you can send, which is not possible in WebRTC. On the distribution side, there are platform providers who use WebRTC and cover the complexity under the hood.

Speaker 2:

But if you want to do that yourself, it's really challenging. There's a big difference between creating a small WebRTC prototype, end to end, which works well, and creating a complete platform which then works for thousands. Because even with WebRTC, the buffer can go up. You need to get that under control, you need to measure it somehow, make it visible, et cetera. You need to have the bitrate adaptation.

Speaker 2:

If the network suddenly goes down, what do you do then? Will you skip frames, drop frames? Will video go off and audio stay on, et cetera? It's quite challenging to do that on your own, and the platforms using WebRTC of course have similar challenges to ours. We consider our solution quite stable and usable on any device in any browser, because it's based on HTTPS and WebSocket delivery, which is much more lightweight, easier to get under control, and doesn't need any third-party protocols, firewall openings, et cetera. It can go through the standard HTTPS ports.
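A minimal sketch of the WebSocket-plus-Media-Source-Extensions pattern this kind of delivery implies: fragments arrive over a plain WSS connection and are appended to a `SourceBuffer`. The URL, codec string and fragment format are assumptions; a production player handles far more (reconnects, ABR, buffer trimming).

```typescript
const video = document.querySelector("video")!;
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener("sourceopen", () => {
  const sb = mediaSource.addSourceBuffer(
    'video/mp4; codecs="avc1.64001f, mp4a.40.2"' // H.264 + AAC in fragmented MP4
  );
  const queue: ArrayBuffer[] = [];

  const ws = new WebSocket("wss://edge.example.com/live/your-stream-id"); // assumed URL
  ws.binaryType = "arraybuffer";
  ws.onmessage = (ev) => {
    // Each message is assumed to be one fMP4 fragment from the edge.
    queue.push(ev.data as ArrayBuffer);
    if (!sb.updating && queue.length) sb.appendBuffer(queue.shift()!);
  };
  sb.addEventListener("updateend", () => {
    if (queue.length) sb.appendBuffer(queue.shift()!);
  });
});
```

Because everything rides on standard HTTPS/WSS ports, this avoids the UDP ports and ICE/STUN negotiation a WebRTC session needs.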

Speaker 1:

So it's going to be the quality of the incoming stream and then the robustness of the delivery mechanism. That's two. Were those kind of the two key ones, or?

Speaker 2:

Yeah, also the scale somehow, to get to a scale of thousands of users. There are challenges in WebRTC if you want to build it yourself. If you use a platform... I won't say anything about other vendors who are doing that, but as we learn from our customers, they are very satisfied with how we do it. It's very lightweight and easy to use, which makes the integration more seamless. There are newer things; WebRTC is still under development. Google, Apple and the other browser vendors are still working on improving the technology. We're looking at that too, of course, to see what's the best technology going forward, and those are the things we then cover by integrating into our player or APIs: deciding what is best for the use case, what is best for the user experience and for integrating into the whole product, and trying to cover the technical complexity around that.

Speaker 1:

We've got a question from Mark Ritzman. What about closed captioning for your application?

Speaker 2:

That's a good question. The demand for closed captioning in our applications is actually rather low, so we don't do that. Usually there are ways to deliver it as a separate channel. In the real-time interaction space it's usually not so much required, because you directly interact with the presenter and the audience, and these kinds of applications don't have that requirement, at least as we see it. So the short answer is no, but there's a reason for that.

Speaker 1:

Getting into the inner workings of the transcoding capability in your system: you get a stream in, and you need to transcode it to encoding ladders. Did you start with software? Did you start with GPUs? I know you're looking at our technology, and I'm not going to ask you a lot about that, but talk to us about the transcoding side: where you started, and where you think it's headed.

Speaker 2:

Transcoding is challenging. Generally, as you know, you need the right balance between performance, quality and overall throughput. Like probably everybody else, we started with software encoding first, to see how that works: getting many parallel channels running on a server, trying to increase the number of channels per server, using larger machines, and tuning the bits and knobs of the encoder to use efficient profiles and find the balance between quality and results. It's always a trade-off. As everybody knows, in a live video stream you just can't get 4K quality on a low bandwidth mobile network. In the end you need to compress something and you will see the compression results, and making that as efficient as possible is the job of the encoder and the transcoder.

Speaker 2:

There are efficient software solutions which we use; based on our history, we created software encoders ourselves, so we know what it's about. There are things like x264, open source based, built into tools like FFmpeg. There is GStreamer. There is hardware acceleration on Intel Quick Sync machines and AMD machines. There are ASIC solutions, like NETINT provides, which make encoding very efficient. So there is a wide range of solutions which you can control and change, and finding the sweet spot is quite challenging. The more volume you get, the more challenging it gets to find the right approach, and in the end it's also a question of how large the business value is of having the highest quality. With software encoding you can go very high with the quality; you can create overwhelming CPU load with a single channel of 4K encoding if you use the wrong profile. But you can also make it efficient. That's part of the trade-off you need to control on the software side.

Speaker 1:

What's the typical encoding ladder? I mean, if you get a 1080p stream in, what are you sending out from a ladder perspective?

Speaker 2:

Typically for our applications the bitrates are rather medium or low. When we get 1080p in, it's something like three or four megabits only, and then it goes down to either a second 1080p profile at around two megabits, or a 720p profile, which is standard HD delivery, and we have lower profiles like 480 and even 320. So it can go very low to keep the quality of experience up on low bandwidth networks and devices. But usually it's not more than three or four steps on the ladder in ultra-low latency, because you need to decide very quickly on the player side if you need to change, and making the ladder too complex or too large is not optimal either. Those are the rough operating points we work in. There are more and more questions about getting the quality higher, at least for premium streams, going to 4K, et cetera. We also enable that and then extend the ladder a bit with the Full HD profiles as well.
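Written out as data, a ladder like the one Oliver sketches might look as follows; the exact numbers are illustrative, not a nanocosmos default:

```typescript
// Three-to-four-rung ladder for an ultra-low-latency stream, derived from
// a 1080p ingest at 3-4 Mbps.
const abrLadder = [
  { label: "1080p", width: 1920, height: 1080, videoKbps: 2000 }, // below the ingest bitrate
  { label: "720p",  width: 1280, height: 720,  videoKbps: 1200 },
  { label: "480p",  width: 854,  height: 480,  videoKbps: 600 },
  { label: "320p",  width: 568,  height: 320,  videoKbps: 300 },
] as const;
// Few rungs keep the player's switching decision fast, which matters when
// there is under a second of buffer to hide a wrong choice.
```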

Speaker 1:

Here's a question from our own testing: one of the questions we have is how far you want to push the server. We think we can push our hardware to like 95% and not really worry about anything crashing. If you're running transcoding on a regular computer using software, what CPU utilization level starts to get you scared that you may have a crash?

Speaker 2:

Yeah, it's gradual somehow. It already starts having an impact at 70 or 80%, so usually you should be very careful about that. It's interesting to learn what kinds of things create that load: the whole processing pipeline loads the whole system. It's not only CPU, it's also the memory load, and things like scaling. You know what I'm talking about.
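A toy sketch of the admission-control consequence of that 70-80% remark: measure CPU busy time over a short window and refuse to place a new transcode channel on the host above the threshold. The threshold and sampling approach are illustrative.

```typescript
import * as os from "node:os";

// Fraction of CPU time spent busy over a short sampling window.
function cpuBusyFraction(sampleMs = 500): Promise<number> {
  const snap = () =>
    os.cpus().reduce(
      (acc, c) => {
        const total = Object.values(c.times).reduce((a, b) => a + b, 0);
        return { idle: acc.idle + c.times.idle, total: acc.total + total };
      },
      { idle: 0, total: 0 }
    );
  const a = snap();
  return new Promise((resolve) =>
    setTimeout(() => {
      const b = snap();
      resolve(1 - (b.idle - a.idle) / (b.total - a.total));
    }, sampleMs)
  );
}

async function canAcceptNewChannel(): Promise<boolean> {
  return (await cpuBusyFraction()) < 0.7; // leave headroom; memory and scaling load the box too
}
```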

Speaker 1:

Another question from Mark. Let me try and repeat this word for word: is it HTTP/2-based, or how does it compare to QUIC? I guess he wants to discuss those input protocols.

Speaker 2:

No, that's more on the delivery side. HTTP is available in several versions. The current standard used almost everywhere is HTTP/1.1. There are HTTP/2 and HTTP/3, and part of HTTP/3 is the QUIC you mentioned, which is UDP-based.

Speaker 2:

There are challenges around that as well. We are working on making that available at large scale too, but there are even networks which don't allow it, so it's challenging to go through all networks, because it has compression and encryption built into the protocol, which not every provider and not every legislation likes. So whether you can use it or not is not our decision, but we enable these things to keep the quality up and the latency as low as possible. And it's a good point, because generally, theoretically, UDP-based protocols create less friction and less latency in the system, even on the network level, because on TCP you need to acknowledge every packet and on UDP you can just pipe it through. But there are a lot of challenges around that.

Speaker 1:

Ricardo Ferreira is asking if the stream will be shared afterwards. I feel safe in saying that there's nowhere you can go where you won't see copies of this stream: LinkedIn, Facebook, our website. It will be available afterward. Go to the Voices of Video page on NETINT and I'm sure it'll be up there, hopefully with the transcript, within the next few days. Oliver, did you have anything for me? I know you talked about that in our initial meeting. Or have you decided to take pity on me and let me go on with my day?

Speaker 2:

If you want to mention some of the challenges you see, we can see if we can match that somehow.

Speaker 1:

Some of the challenges... Actually, one of the big ones is one you're kind of in the middle of. Talk to us about the efficiency of FFmpeg, not to go totally wonky on the audience, but the multi-threaded efficiency of FFmpeg. I don't know how much you've experienced that or how much you've compared it with GStreamer, but that's been a big challenge for us. We've had great performance with high CPU utilization, and then we throw a complex encoding run at it, maybe 4K, and different things kind of trigger it, but then we see throughput stop while CPU utilization is still pretty modest, in the 40% to 50% range, and utilization of our cards is still pretty modest. So we see this issue with FFmpeg. How much have you encountered that, and how much have you played with GStreamer as a way to avoid it?

Speaker 2:

Yeah, that's a great question that goes really deep into the details of video coding. Tools like FFmpeg or GStreamer work quite well; they're quite efficient, they have been very stable over the last years and they do a good job. But when it comes to really high performance and large throughput, you need to get into the details and maybe do your own software development or your own configuration to find the right spot to make that really scalable on the server side, and that's a great effort. I agree that it's challenging. Switching the software from FFmpeg to GStreamer creates completely different results, and tuning things in FFmpeg, changing buffers and profiles, also changes results. So that's interesting to learn, and it's an ongoing process, of course, to make it more and more efficient.
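For readers hitting the same wall, these are the FFmpeg threading knobs usually tried first; they are standard options, but whether they help is exactly the pipeline-dependent tuning Oliver describes.

```typescript
// Standard FFmpeg threading options (values are starting points, not answers).
const threadingArgs = [
  "-threads", "8",                // codec thread count (0 = auto)
  "-filter_threads", "4",         // threads for simple filter graphs (e.g. scaling)
  "-filter_complex_threads", "4", // threads for -filter_complex graphs
];
console.log(threadingArgs.join(" "));

// A common workaround when one process's internal threading becomes the
// bottleneck: run one FFmpeg process per ladder rung instead of a single
// process producing every output, trading memory for parallelism.
```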

Speaker 1:

Have you played with FFmpeg 6 in that regard?

Speaker 2:

Not yet. We just moved to the latest 5 version, but I'm looking forward to seeing how 6 performs. The announcement said there seems to be improved threading behavior there, but we haven't verified that yet.

Speaker 1:

I did some very simplistic tests. I had one of the command strings I talked about that really crashed our system, and all systems, FFmpeg in software or with our hardware. I ran it with five and I ran it with six, and I saw absolutely no difference, and I did that with two of those. So there could be a switch that I'm missing. It's in front of our engineering team at this point; we're trying to figure out what's available in six that wasn't available before.

Speaker 1:

The thing about FFmpeg is it's really easy to use, and there are plenty of examples out there. GStreamer is a little bit harder, and then if you go to the SDK, you've got complete flexibility in how you approach our encoder, but it's a lot more work. I mean, it's not challenging work; most of our customers at scale are using the SDK. But all of our demos and all of our quick starts are FFmpeg, and it just really hurts when you can get up to a certain performance level and then you just hit this wall. So yeah, I had high hopes for six, but I don't see any quick resolution to those issues. Looks like I'll be learning GStreamer.

Speaker 2:

Yeah, and we keep working on that, so that's what we do.

Speaker 1:

We're done with questions, so I'll let you go on with your day. I appreciate you spending time with us. It's interesting to hear about new technology; I learned a lot of stuff that I didn't know, and I appreciate you taking the time. Thank you very much. Okay, we're going to sign off. Everybody, thanks for coming, and we'll see you next time. This episode of Voices of Video is brought to you by NETINT Technologies.

Speaker 2:

If you are looking for cutting-edge video encoding solutions, check out NETINT's products.
