Voices of Video
Explore the inner workings of video technology with Voices of Video: Inside the Tech. This podcast gathers industry experts and innovators to examine every facet of video technology, from decoding and encoding processes to the latest advancements in hardware versus software processing and codecs. Alongside these technical insights, we dive into practical techniques, emerging trends, and industry-shaping facts that define the future of video.
Ideal for engineers, developers, and tech enthusiasts, each episode offers hands-on advice and the in-depth knowledge you need to excel in today’s fast-evolving video landscape. Join us to master the tools, technologies, and trends driving the future of digital video.
The Benefits and Limitations of FFmpeg, GStreamer & GPAC
Unlock the secrets of video streaming technology with our special guest, Romain Bouqueau, a key contributor to the GPAC framework and CEO of MotionSpell. We promise that by tuning in, you'll gain a deep understanding of multi-threaded performance challenges in FFmpeg, especially when pushing the limits with complex tasks like 4K HEVC encoding. Romain brings us behind the scenes, explaining how threading and scheduling come into play, and why issues like input-output management and memory can make or break your streaming success. Together, we also explore how giants like YouTube and Vimeo optimize these technologies to deliver seamless streaming experiences.
Our journey doesn't stop there. We dissect the strengths of FFmpeg, GStreamer, and GPAC, focusing on codec performance and their unique roles in the streaming ecosystem. From the efficiency of the x264 and x265 encoders to the power-saving potential of AWS Graviton processors, learn why companies like MulticoreWare are at the forefront of performance enhancements. We also share insights into the evolving world of FFmpeg's threading infrastructure and GStreamer's superior handling of bottlenecks, shedding light on the importance of choosing the right tools for your streaming needs without the hassle of constantly compiling for different CPUs.
Finally, we delve into media packaging standards such as HLS, DASH, and the emerging CMAF format. The discussion underscores the critical role of ISO BMFF and highlights tools like GPAC's Compliance Warden in maintaining interoperability. We tackle the risks of using under-maintained open-source projects like Bento4 and explore the supportive communities that provide commercial support, ensuring you stay ahead in the fast-paced world of multimedia technology. With an eye on the future, we touch on advancements like VVC support and consider the potential for user-friendly interfaces to simplify complex processes. Join us for an episode packed with insights and expert guidance on navigating the ever-evolving landscape of video streaming.
Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.
Speaker 2:Welcome to NETINT's Voices of Video, where we explore critical streaming-related topics with the experts who create and implement new technology-related solutions. If you're watching and have questions, please post them as a comment on whichever platform you're watching on, and we'll answer live if time permits. Today's episode is all about encoding and packaging, and we'll be discussing the GPAC open-source technology framework, recently licensed by Netflix for their live streams. Specifically, I'll speak with Romain Bouqueau, on his birthday, and he's a main contributor to GPAC and the founder and CEO of MotionSpell, the licensing arm of GPAC. And the reason we're talking with Romain today relates back to an experience we had with FFmpeg.
Speaker 2:Here at NETINT, we started doing some testing with some of our cards, and the cards performed well up to a point. When the FFmpeg operations became more and more complex, we saw that throughput dropped significantly, even though capacity in the cards and in the CPU was fine. So we started researching alternative solutions. We found GStreamer and we also found GPAC. Romain was kind enough to join us today to talk about these and many other applications of both programs, describing the technologies, where they fit, what they do, and how to decide which ones to use for your application. So let's start there. Romain, thanks for joining us.
Speaker 1:Thank you, Jan.
Speaker 2:So why don't we just jump into the FFmpeg issue? What's going on with multi-threaded performance with FFmpeg? What are the symptoms, what are the causes, and how do we work around it? And take it a bite at a time; don't try to jump into all of that at once.
Speaker 1:So we're going to talk a lot about multi-threading. First, what is a thread? A thread is a sequence of instructions, things that you want to do sequentially. If you want to do things in parallel, because you have many cores on your computer, for example, so your computer is able to compute in parallel, then you need several threads. These threads, most of the time, not all the time, but most of the time, are so-called kernel threads: objects that are understood by the kernel of your operating system and handled by a component called the scheduler. The scheduler analyzes which threads are running, which threads are eligible for computing, and it handles all of that. More specifically on FFmpeg, for your question: FFmpeg is a multi-threaded application, and there are several levels where it's multi-threaded. A codec can be multi-threaded: when you execute, for example, x264 or x265 as libraries inside FFmpeg, they create their own threads, but the FFmpeg application also creates its own threads.
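To make the two threading levels concrete, here is a minimal sketch; the filenames and thread counts are placeholders, not values from the discussion:

```bash
# Encoder-level threading: -threads is handed to libx264's own thread pool.
ffmpeg -i input.mp4 -c:v libx264 -threads 8 output_h264.mp4

# x265 manages its own worker pools through private options.
ffmpeg -i input.mp4 -c:v libx265 -x265-params pools=8 output_hevc.mp4
```

In both cases the FFmpeg application still spawns its own threads for demuxing, filtering, and muxing on top of what the codec library creates.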
Speaker 2:So what was I experiencing when we saw a drop in throughput? Let me set it up: typically, the operation was a large file in, say 4K, and then a pretty complex encoding ladder out, say five or six rungs, testing HEVC, because with 4K output, H.264 doesn't make sense. It's a pretty simple command line, right? We do it all the time. But what's happening that causes throughput to drop?
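As an editorial illustration, here is a hypothetical reconstruction of the kind of job described: one 4K source fanned out to a multi-rung HEVC ladder in a single invocation. The rung count, resolutions, and bitrates are illustrative only:

```bash
ffmpeg -i source_4k.mp4 -filter_complex \
  "split=4[v1][v2][v3][v4]; \
   [v2]scale=1920:1080[o2]; \
   [v3]scale=1280:720[o3]; \
   [v4]scale=854:480[o4]" \
  -map "[v1]" -c:v libx265 -b:v 15M out_2160p.mp4 \
  -map "[o2]" -c:v libx265 -b:v 6M out_1080p.mp4 \
  -map "[o3]" -c:v libx265 -b:v 3M out_720p.mp4 \
  -map "[o4]" -c:v libx265 -b:v 1.5M out_480p.mp4
```

Every additional rung adds another encoder instance, each with its own thread pool, which is exactly the situation where the scheduling and memory effects discussed below start to bite.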
Speaker 1:It's not an easy question. First, I think there's been an effort in FFmpeg to try to bring more threads in at the application level. That's what I was saying: there were not so many threads before. In FFmpeg 5, one of the big changes is that now, when you have multiple outputs, each of these outputs runs in a separate thread, which means that if, for example, you have two outputs connected to two different servers and there is a problem on one of the outputs, the second one is not hung; they can run in parallel.
Speaker 1:When it comes to 4K, it's a really huge question. I think there are many factors that affect performance when you're trying to reach the limits of computers. There is threading, of course: if you're not threaded enough, or if you have too many threads, and maybe that's something we may want to talk about, then you're not going to run at your maximum speed. But there are also the IOs: if you don't deal with the input-output correctly, then you can be limited by your IOs. And there are also memory issues. We're going to compare with the other frameworks, maybe, but in FFmpeg the queues are really large. If you queue a high number of decoded frames, which are pretty large by themselves, it can take several gigabytes of memory. RAM is cheap, but if you don't have enough RAM, then it's going to write to the disk, and then it's going to be super slow.
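Two FFmpeg options touch the queues Romain mentions; the values below are illustrative starting points rather than recommendations from the episode:

```bash
# -thread_queue_size bounds the packet queue for an input;
# -max_muxing_queue_size bounds packets held while an output stream
# waits for its muxer to initialize.
ffmpeg -thread_queue_size 512 -i input.mp4 \
  -c:v libx265 -max_muxing_queue_size 1024 output.mp4
```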
Speaker 2:Interesting. How do you mitigate that? If you're working with a command string, as opposed to, I guess, working more programmatically, is there any way you can mitigate against that?
Speaker 1:FFmpeg is two parts, I think, for most people. There are the libraries, like libavcodec for the codecs, libavformat for the demuxers, muxers, and network protocols, and a bunch of others, and then there is the FFmpeg application. There are other applications in FFmpeg too: there is FFplay, which is a simple player, and FFprobe, to inspect your content. The efforts I was talking about are on the FFmpeg application part, which means other people were able to leverage some of the benefits that the FFmpeg developers are trying to implement, because they directly integrated the FFmpeg libraries. When you consider big companies like Vimeo or YouTube, or other projects like GPAC, we integrate the libraries and we could investigate and make some improvements on performance. FFmpeg is really great for this because you can plug in at many points, like the IOs, the memory allocators, or the threading, and you can do your own stuff. Not everything is solved on the FFmpeg side but, as I said, they've been working on it.
Speaker 2:Version 6 came out right in the middle of these problems, and I basically just used the same command lines and tried them on version 6, and I found no difference. I thought that one of the goals of version 6 was better multi-threaded performance. What happened? What changed between 5 and 6, and why didn't I see any performance difference between those two versions with these types of encoding projects?
Speaker 1:Okay, yeah, that's a bit technical. In version 5, as I said, they separated the different outputs, so now you can run several outputs in parallel, and if there is a problem on one, it's not going to create problems on the other outputs. Version 6 is a continuation of this, and they've been working on it for at least a year. I could find comments related to better threading in FFmpeg, and at the beginning of the year, at the FOSDEM open-source conference, the FFmpeg developers said that the effort would last the whole year, so you will probably need to test with FFmpeg 7 next year.
Speaker 2:You talked a moment ago about too many cores. Go into that, and then maybe describe how you should configure an encoding machine for 4K as compared to, say, 1080p-type projects.
Speaker 1:There are maybe many factors in this. If you create too many threads, the problem is that in your operating system's kernel, the scheduler we talked about is what's called preemptive. It allocates you slices of time, typically tens of milliseconds, and every few tens of milliseconds it's going to look whether there are new eligible tasks, so that everybody can have some time to execute. If you create too many threads, there are two kinds of problems. The first one is that you switch all the time. It's called a context switch. The price is pretty low, but each time you have to save your registers, you have to save a bunch of stuff and switch to the other thread, so instead of spending time doing computation, you're spending time switching tasks. The second problem is that most of the time we are CPU bound, so we rely heavily on caches, and when you change cores, sometimes you break the cache, and again you have a performance penalty. But this is something that is really, really difficult to measure.
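On Linux, one way to experiment with the context-switch and cache effects described above is to pin the process to a fixed set of cores; the core range here is illustrative:

```bash
# Restrict ffmpeg and all its threads to cores 0-15, so the scheduler
# cannot migrate them across the whole machine and caches stay warmer.
taskset -c 0-15 ffmpeg -i input.mp4 -c:v libx265 -f null -
```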
Speaker 2:Can you give any advice at all? Let's stick with software encoding, because that's the most generic case. If I'm encoding 4K videos, is there a benefit to going to 128 cores, or should I stick with 32? Does it matter? More jobs, fewer jobs? What's your sense of the best machine for high-volume 4K transcoding? Again, VOD.
Speaker 1:I think the number of threads on your computer should be a couple of times the number of physical cores available on your CPU, and if you make some graphs of this, you're going to see that it scales almost linearly up to a certain point. That being said, it also depends on the application. As I mentioned, the FFmpeg application creates its own threads, but then the codecs may create their own threads. You have codec-specific options, for example for x264 or x265, for AVC or HEVC encoding, or for any encoder, where you can try to tune this. Most of the time, the encoders themselves provide an automatic parameter so that they try to scale by themselves, because only the encoder knows whether it scales better with twice the number of cores as the number of threads, or just the number of cores plus a certain number because it uses those for scheduling or whatever.
Speaker 1:And again, as I said, it's not only CPU cores. If you saturate your memory bandwidth, or if you saturate your IOs, for example because you're reading from the network or from a disk that is not fast enough because you launched several tasks at the same time, then it can be a problem. Measuring performance is really not easy, and sometimes with small changes, like going from FFmpeg the application to another framework that uses FFmpeg as libraries, you can see big differences.
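A quick, hedged way to chart the scaling curve Romain describes: time the same encode at increasing thread counts, using a null output so disk IO doesn't pollute the measurement (the loop values and filename are placeholders):

```bash
for t in 2 4 8 16 32; do
  echo "threads=$t"
  /usr/bin/time -f "%e seconds" \
    ffmpeg -loglevel error -y -i input.mp4 \
      -c:v libx264 -threads "$t" -f null -
done
```

If the curve flattens well before the physical core count, the bottleneck is likely IO or memory bandwidth rather than threading.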
Speaker 2:Just as an aside, I recently did some work analyzing the CPUs that are available on AWS, so I looked at Intel versus Graviton versus AMD. What I found was that Graviton was best for x264 and AMD was best for x265. Are those general conclusions you can reach, or do you have different thoughts as to which CPUs do the best job with those codecs?
Speaker 1:I can say that we use them a lot, and we advocate for Graviton and equivalents, because they are more power efficient, and also because the FFmpeg developers working on the codecs have done a tremendous job over the years implementing all the assembly acceleration that was needed.
Speaker 2:I found that true with x264, but x265 was very bad with Graviton, and that was including a very new version from MulticoreWare that actually improved performance by 300 or 400%. So any comment there?
Speaker 1:Very different projects, right? x264 is really community-based, passionate people, and x265 is maintained by MulticoreWare. And even though there is passion in companies, especially on video codecs, and you can have really interesting discussions with them, it's not the same, right? x264 was made by people who were geniuses, I would say even crazy at some points. So for me, a totally different perspective.
Speaker 2:What about compiling different versions of FFmpeg for different CPUs? Do you need to take it to that level, or is that not necessary?
Speaker 1:I don't think that's necessary, but yeah, you could do this. You always have operational problems; you can come into a situation where you'd need to have several builds, or even several versions of a specific tool. That's a possibility, yes.
Speaker 2:So let's transition over to GPAC, and let me set it up. I know FFmpeg is really good on the encoding and transcoding side, a little bit less good on the packaging side. We used GStreamer as kind of a workaround for some of the bottlenecks that I described a few minutes ago, and that worked pretty well. And then, I've known you for years and known about GPAC for years, but you really came to prominence when Netflix licensed the technology for their live streams. Give me an overview, just really high level, of where you see those three projects, FFmpeg, GStreamer, and GPAC: what they do, how they fit together, what they're good at, and what they're not so good at.
Speaker 1:FFmpeg, as I said, is the libraries, and these people are implementing codecs and protocols. It's a really nice toolbox that is really, really complete. There has been a huge effort on it. A really successful project, I would say critical infrastructure for the industry today. And then there is the FFmpeg application that people use. People like it because the interface is quite stable: you can reuse a lot of your command lines from years ago and they still work, which is not the case at the library level, where you have to implement against the API, the interface at the coding level, which changes from time to time. So it's more work to go with the libraries, but you can get some of the benefits; for most people, the FFmpeg command line is enough. Then GPAC: we are also implementing standards and streaming protocols, so we're really experts on this. Demuxing, muxing, packaging, encrypting, streaming. We are active in standardization, so we have a lot of R&D; that's a difference with FFmpeg, for example. Inside GPAC we have our own code and we integrate it with other libraries, and FFmpeg is a big one for us. It's a big dependency, and we leverage a lot of FFmpeg for our customers and our users in general. GStreamer is more like an aggregator. There is no real code being implemented in GStreamer itself, even though that has changed over time; its ambition is to aggregate everything into what they call elements, which are kind of modules that you connect together in a big graph. GStreamer is really good at this.
Speaker 1:You mentioned that GStreamer could handle some of the bottlenecks that you had with FFmpeg. The framework is really more complex, and, for example, we were talking about saturating the memory: GStreamer has nice mechanisms for this. It pre-allocates the buffers; it never allocates on the fly, unlike FFmpeg and GPAC. And with GStreamer, when you have a queue and you are high in a buffer or low in a buffer, it's going to emit a message. I was talking about the scheduler; this is something that is important. At the GStreamer level, they can decide to execute or not execute some of their elements just because the buffer level is high or not, and so you avoid saturating, for example, your memory usage, because when a buffer is full, nobody is able to push data to you. That's a really precise example of where GStreamer may have better performance. GPAC is also quite flexible, and we have all kinds of, I think, intelligent mechanisms to try to handle performance in the best way.
Speaker 1:But again, it's magic by default. There is a lot of automation, and if you don't like it you can disable it, and there are a lot of flags to try to control performance depending on the whole environment, which GPAC may not be aware of. When you run these tools, there is nothing benchmarking your system and then saying, okay, the best flags for you are this or this. It just asks how many CPU cores do I have, or how much memory do I have, and it doesn't consider the fact that you're running a browser at the same time that is already using a lot of memory, or other considerations.
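Returning to the bounded-queue back-pressure Romain described a moment ago, here is a minimal GStreamer pipeline that makes it visible; the elements are standard GStreamer plugins, and the queue limits are illustrative:

```bash
# The queue blocks upstream elements once it holds 8 buffers, so decoded
# frames cannot pile up in memory the way an unbounded queue allows.
gst-launch-1.0 filesrc location=input.mp4 ! qtdemux ! h264parse ! \
  avdec_h264 ! queue max-size-buffers=8 max-size-bytes=0 max-size-time=0 ! \
  x264enc ! h264parse ! mp4mux ! filesink location=output.mp4
```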
Speaker 2:Let's look at the individual components of what I want from the encoder-packager. It feels like, again, FFmpeg is very strong on the encode side, very usable, very accessible, and pretty weak on the packaging side. I don't know where GStreamer is on packaging. I know that you're pretty sophisticated there, and I know there are projects like Bento4 that are primarily packaging. So, just trying to understand that, and I think you've given us a good start: what can you tell us about GStreamer's packaging capabilities?
Speaker 1:As I said, they write little code, so basically they're more of an aggregator. They don't have a GPAC integration, for example, and most of their packaging relies on FFmpeg, so mostly they have the packaging capabilities of FFmpeg, with a few other components, on WebRTC et cetera, where they wrote their own code. Otherwise, I think GStreamer has some particular markets; it's more on the hardware side, which is why I'm not surprised that a company like NETINT would try to use it. But when it comes to pure software workflows, I think FFmpeg is way, way more popular. Historically, GPAC is more about file-based packaging with MP4Box. MP4Box is the main tool of GPAC, but then people had to encode their content with FFmpeg, dump it, and package it.
Speaker 1:That was okay a couple of years ago, when people wouldn't trust open-source software to do live. But now things have changed. It's really difficult to be integrated as a third party into FFmpeg. I hope that we can change that in the near future, as near as possible for me, because FFmpeg is super popular. But we decided to do the reverse. We said, okay, we can build something that we think leverages FFmpeg and is better than FFmpeg, and so we have a new application in GPAC, which is simply called gpac, like the FFmpeg application in FFmpeg. It allows you to build complex workflows, and it's used for live by many people; we have a lot of users on this. And, as I said, because we're doing standardization and a lot of R&D in GPAC, it's a mix of a lot of things. It's a player, it's an analyzer, it's a content generator, so you can encode, you can package, you can stream, you can dump, and we embed an HTTP server. It does a lot of things.
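A hedged sketch of that gpac application in action; exact filter options vary by version, so verify them against the built-in help:

```bash
# Package an existing file as DASH; the dasher filter is selected
# automatically from the .mpd destination extension.
gpac -i input.mp4 -o dash/manifest.mpd

# List the dasher filter's options before tuning segment durations,
# profiles, and so on.
gpac -h dasher
```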
Speaker 2:Taking a step in a different direction for a moment: what are you seeing on the packaging side regarding utilization? We've heard about CMAF for years, we know DASH, we know HLS. What are you seeing as trends in packaging utilization among the people that you're working with?
Speaker 1:I said that we do standardization, and we advocate for ISO BMFF; that's the MPEG-4 file format. Because we are in standardization, because it's pretty active, and because there is open-source software like GPAC, when it comes to open standards, MPEG systems and the MPEG file format are really, really powerful. ISO BMFF is everywhere: the QuickTime file format on the production side; CMAF and fragmented MP4 for delivery; everything related to encryption with Common Encryption; DASH and HLS, that's fragmented MP4 and CMAF; and for storage, there is MP4.
Speaker 1:I think there's a huge consensus, contrary to what happens in codecs right now, where the situation is pretty fragmented. That's also true for input contribution protocols, where people still use RTMP; they also use SRT, but they don't know exactly where to move. When it comes to file formats, it's pretty stable. ISO BMFF is everywhere, and I believe that GPAC is the best tool. But again, you can try it, it's open source. There are a lot of nice features I think people are not so aware of. If you take a raw H.264 or H.265 file that you dump from your encoder and you package it in MP4, the MP4 is going to be smaller than your input, because we eliminate all the redundancy, and in addition you get indexation of the content and a lot of other nice features that come with packaging your content.
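The size savings from packaging a raw elementary stream are easy to check with MP4Box; the filenames and frame rate are placeholders:

```bash
# Wrap a raw Annex B H.264 stream into MP4; redundant parameter sets
# are factored out and the file gains an index.
MP4Box -add video.264:fps=30000/1001 -new packaged.mp4

# Compare sizes and inspect the result.
ls -l video.264 packaged.mp4
MP4Box -info packaged.mp4
```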
Speaker 2:Well, how does that translate to HLS, DASH, or CMAF? It feels like HLS is still the leader, years after standards-based approaches became available. Are you saying that's not true? I don't remember what Bitmovin said in their developer survey; I haven't looked at that in a while. But people have to output one of those packaging standards. What are you seeing in terms of support?
Speaker 1:HLS has dominated for years, and it's going to continue to dominate; that's the truth. But it's not that important, right? The way you handle your manifest is just a really tiny part compared to what you're doing with the rest of the data, which is the majority of everything. And there again, there is a convergence on CMAF, which is ISO BMFF based. So I think we're pretty close, now that there is a convergence also on the encryption: everything that was industry-based was AES-CTR, and Apple said, okay, but CBC, and CBCS in particular, is where we want to head. So there is a consensus right now to go to CMAF and, for the encryption, CBCS. I think the convergence is here. We are really near the point where you would have only one media format to distribute.
Speaker 2:And that is ISO BMFF with different packaging. That's the whole promise of CMAF: the manifest is such a small portion of it, it really doesn't matter.
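Common-encryption packaging in GPAC is driven by an XML DRM description; a minimal sketch, assuming a hypothetical drm.xml that declares the cbcs scheme and its keys:

```bash
# Encrypt with Common Encryption as described in drm.xml, producing
# content that DASH and HLS players can both consume.
MP4Box -crypt drm.xml input.mp4 -out encrypted.mp4
```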
Speaker 1:It's a bit messy. I say ISO BMFF, which is maybe a technical term for most of the audience; that's the base standard that is edited at MPEG, and you can think of CMAF as a profile of it.
Speaker 1:So it restricts the toolbox. I would have liked it to be restricted to the point where you could say, okay, I need to package this, and you know exactly what's going to come out, like we have with bit accuracy for codecs, where when you take an encoded stream, you know exactly what you're going to get as an output. That's not really true anymore. There have been a few discussions on that, but basically, even with CMAF, it's not that restricted, so there is a lot of freedom, and when there is freedom, of course, interoperability is much, much more difficult. But again, in GPAC, separate from the main GPAC repository, we have a repository called Compliance Warden, which is all about compliance on ISO BMFF, and maybe one day we'll get some sponsors to build the full ISO BMFF and CMAF stack so that people can actually have better compliance on this.
Speaker 2:So let's take a deeper look at GPAC. You said it's open source, you said it's command line. The codecs it handles are pretty much the standard ones: H.264, HEVC, AV1. Any limitations there?
Speaker 1:A lot more. You can mix everything, all codecs. You could DASH any codec; you can also DASH Matroska, WebM, whatever. You can set special flags to be interoperable with basically everything. That's the chance we have: we have pretty big users, not only Netflix, and that gives us a wide range of applications and interoperability.
Speaker 1:Because we're doing R&D, for example on VVC, we've had VVC integration for three years. So basically, when you want to package VVC, there is a high chance that you come to GPAC and try it. It's not a real strategy from us to try to attract users, but it's pretty effective compared to other packagers. You were mentioning Bento4. Bento4 has only one maintainer, who hasn't worked in packaging for like six or seven years. So for me, there are a lot of packaging projects that are really good and used by the industry, but there is a real risk when you have only one maintainer and the project is not actively maintained anymore. That's why, as we were saying, there are different strategies in open source, and projects like FFmpeg or GPAC come from a real pain point, from users, passionate people.
Speaker 1:That's my case: I needed this solution, I started to contribute to GPAC, and then I made a company because I'm passionate, and I think people can see that I'm passionate about it. I tried to find a way to make it a living, and also, with GPAC Licensing, the commercial arm of GPAC inside MotionSpell, to allow other people to be able to live off their contributions. FFmpeg has the same, right?
Speaker 1:They have FFlabs now; they had a booth at NAB. And for GStreamer, there are these two companies, Fluendo and Collabora. I think that's really important to give the industry confidence that there are real people supporting the project and understanding their needs. It's not a bunch of teenage kids in their garage, because sometimes that's the feeling you can get if you go, for example, on the FFmpeg bug tracker; you can get the wrong image, but still, they are trying to build something for the industry.
Speaker 2:I mean, they're tremendously talented programmers. So, looking back at GPAC, what about HDR? Dolby Vision support is huge, HDR10+ is huge. What are your capabilities there?
Speaker 1:We support everything. We have very generic support, and the latest release, last December, had a special focus on HDR. We support everything: static and dynamic metadata, all kinds of metadata. You can re-inject your own metadata, and you can fix the metadata along the stream, because sometimes, let's frame it this way, there are a lot of streams that are just broken. So we support everything. And again, we have a really deep integration with FFmpeg's internals that allows us to take the raw matrix coefficients, the color space, et cetera, and map them into something that would be recognized as Dolby Vision or HDR10 or HDR10+, et cetera.
Speaker 2:Where does the Dolby licensing come in? And I guess, taking a step back, can I do Dolby Vision with FFmpeg? I know I can do HDR10+, but can I do Dolby? And what about the licensing? Is that just the device decode side, or is that on the encode side as well?
Speaker 1:I don't know. I know they support it on the encode side; they support the metadata. They have some special side data that comes with the packets, and they support it. That's all I can say, I don't know more about this. But that's sufficient for GPAC to package it.
Speaker 2:One of the things that's been interesting to me is the compare-and-contrast with commercial software that you can get from a number of vendors; Harmonic comes to mind. What are the pros and cons of open source versus buying a commercial version that's going to have a pretty extensive UI, a lot of testing, and people you can call and get support from pretty easily? What's your sense of that decision?
Speaker 1:Well, I think there are several reasons why you would want to do that. I had the chance to talk at an NAB Streaming Summit with Netflix, and they explained that they do it first because it's open standards. They don't see how they could compete on this; it's not their goal. They are pushing contributions there. They want to compete on other things, not these open standards.
Speaker 1:And I think Netflix made a demonstration when they migrated to GPAC. They moved super fast: you see Bandersnatch, you see them evolving to new-generation audio codecs, they announced an ad-based tier, they moved to live. I don't know any company able to move that fast.
Speaker 1:I'm not saying it's 100% because they are using open source, but clearly those are points where packaging is crucial, right? If you're doing interactive, how do you make sure that you can stitch your content? Do you know any commercial tool that says, okay, I can stitch and create scenarios on this? If you don't control exactly what a packager does, you cannot do this. And again, when there are new codecs, or other new things, using an open-source tool means you can leverage the work that has already been done by the community. So I think that's important. That being said, Netflix is a pretty innovative company, so we have the chance to have them as sponsors, and also they pay us when they need integration of new tools, new codecs, or whatever. For us it's really easy to work with such a company; they're kind of an easy customer.
Speaker 2:I mean, they've got a phenomenal engineering team and I guess certainly a lot of publishers don't have the engineering resources in-house that Netflix does. But let's move to Netflix. Tell us what it is you license to them and what it's being used for at Netflix.
Speaker 1:They gave a number; what was it? I don't remember the scale, but it's millions of pieces of content every month being packaged. Basically, they package everything. They mentioned that, as I said, on the production side they handle the content, or at least some of it, with GPAC, because they don't control 100 percent of the formats that come in. Also previews for the journalists, and packaging the content for their 200 million viewers. So my understanding is that they package everything.
Speaker 1:But again, when it comes to open source, the magic is that you don't know what your users do. We don't put telemetry in our tools; that's pure freedom. So I don't know what Netflix is actually working on if they don't send me an email. When we work with customers, we have acceptance tests. Those are tests that we both execute, that MotionSpell for GPAC would execute and Netflix would execute, and we agree that these shouldn't be broken, otherwise there's a problem and they send us an email. And when they want to upgrade their version of GPAC, they execute this list of tests to give them confidence. That's the only way for us to know how a customer actually uses GPAC.
Speaker 2:So is it for live? Is it for live and VOD, or do you just not know?
Speaker 1:I don't know. I know for VOD for sure. We were not able to talk about live for the NAB presentation, but I hope that if there is some GPAC involved, we can share things. They have a really great technical blog, like many of these companies, and it would be great if they could communicate on it. But I don't know; maybe this needs to mature also. As I said, I'm not inside Netflix or any other customer internally. We're a pure technical vendor.
Speaker 1:That's something you may want to do to get some support. Also, if you use us and you're successful, you may want to support us so that the project continues and is sustainable, basically. But sometimes the open-source GNU LGPL license is a bit tricky for companies to deal with. For example, you cannot forbid reverse engineering, and that's the type of clause you look at twice before accepting. We are in a position to relicense GPAC, so we can issue commercial licenses. But with these licenses, we're not looking specifically into details.
Speaker 1:What is important for us is that people have a solution. We're really building a platform, and, I don't have all the statistics of course, but I'm pretty sure the majority of packaging now is done with GPAC. And, as I said, because we are on open standards, I'm pretty sure there is not going to be a lot of competition on this. I think it's going to stabilize. We have ISO BMFF, which is a really good technical basis, and at some point the innovation slows down, and I think it's going to happen where we are. I hope so.
Speaker 1:There is still the question, as I said, of the contribution protocols; we're not there, and there is a real problem. There is also the question of what exactly is going to replace MPEG-TS. We went into HTTP delivery with OTT, and ISO BMFF is pretty good, but it's not a streaming protocol, right? So there are still things to figure out at the transport level. But that's the future. I think ISO BMFF is going to stay; it's going to stay for 20 or 30 years, and after that, who knows?
Speaker 2:Give me the compare and contrast with Bento4, because it seems like you extended back into FFmpeg for more of the transcoding operation, but primarily you're being used as a packager. Is Bento4, I guess, the most pressing competitor at this point, or the one that's most obvious?
Speaker 1:I think a lot of people in the industry use Bento4. Some of them realize that it's not maintained anymore, and that's a technological risk, but that's not the only risk they face. As I said, it's a problem: you go into open source, and then the project is not maintained anymore. You're happy with the project, but then a new video codec appears. Who's going to implement it? If you have nobody to coordinate the development, what happens? That's why I think a lot of people like GPAC and come to see us: there is an active community, it's an active project, we're always ahead of time, everything is easy, everything is flexible, and there is a commercial arm. So maybe the next move for us is also to help customers migrate from Bento4 to us, just because at some point there is no other solution.
Speaker 1:There is another competitor, a pretty big one, in the same area: Shaka Packager. The two main maintainers are not working on it anymore. A lot of people came to see us and said, what do we need to do? It's a Google project; we need to de-Google-ize it, and then we need to maintain it, and when there are new codecs, who's going to do this? Even for big companies, there were no internal resources. Netflix is the proof of that: they had their own packager and they decided to move to open source. And some companies went, through no fault of their own, to an open-source project that isn't maintained anymore. Now what do they do? Either they invest in something else, like GPAC, or they need to maintain the tool that isn't maintained anymore, or create a community around it, or whatever.
Speaker 1:So I think it's also our responsibility to help people migrate easily to us.
Speaker 2:Okay, I should note: I'm doing a talk at Mile High Video in a couple of weeks, and I looked at this. There are some responses to questions posed on the Bento4 support resource, whatever it is. I don't know if there's any new development, but there have been responses within the last two or three weeks, so I don't know what that means. I just want to mention that because I'm not saying it's not better to switch over to your tool; I'm just saying there are signs of life there that may be different from what you're expressing.
Speaker 1:There are things you can have a look at, like how many open issues a project has, or how many pending pull requests. It gives you a sense of whether the project is active, and whether there are enough people to deal with the activity of the project. I think that people comparing open-source projects are, most of the time, comparing stars on GitHub, when the projects are on GitHub, which is a good indicator of the popularity of a project. But then look at the open issues.
Speaker 2:I'm actually looking at GPAC now to integrate into the same talk. What educational resources do you have? That's going to be important both for new implementers and for people switching over from Shaka and Bento4. How developed are those capabilities on your website?
Speaker 1:The website is pretty old; it just dispatches users, that's all it does. We have a pretty good wiki that's hosted on GitHub, so if you make searches on GitHub directly, you can find a lot of things. And if you read the readme on the front page, we give a lot of pointers to our tests and our buildbot. There is Doxygen documentation for developers. But again, I think the wiki is the best entry point for people.
Speaker 1:And then on the command line, we have some really good magic. If you run gpac -h and you put in whatever word, it's going to output everything related to this word. And by reading the doc, you will learn that we have several levels of verbosity in the help. As I said, we want to make it magic. I think that for a lot of people, command lines can feel painful, and the learning curve really matters.
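A few examples of the searchable help just described; the multi-level help flags are taken from GPAC's documentation, so check your build if they differ:

```bash
gpac -h dasher     # help on the DASH/HLS packager filter
gpac -h encrypt    # everything matching the word "encrypt"
gpac -hx dasher    # expert-level options for the same filter
```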
Speaker 1:What I would like is that in the future we have a graphical interface. I think that would be great for people, to see what's happening, et cetera, because multimedia is a lot of magic. You have H.264, it's Annex B, and then someone has to take care of this. Same with AAC: implicit signaling, ADTS, LATM. When you don't look into the details, it can be a nightmare, and I think people would educate themselves by having a graphical interface showing them the actual work that's being done. And I wouldn't do that only for users; I would do it so that we can also keep talent and make it less complex. Talent retention is something the whole industry should look at, because there are a lot of gifted people leaving our industry after a couple of years, and we need more of them.
Speaker 2:We have a question: can you describe the extent of your VVC support? I know Fraunhofer had an issue with getting their VVC implementation into FFmpeg, so which version are you supporting, and how does that work?
Speaker 1:Yeah, we're pretty up to date because we have a close integration with the people of OpenVVC, which is an open source VVC decoder. They have some integration in FFmpeg and so we're following the latest developments of the standard. If that's not the case, just open an issue on our bug tracker. But we're pretty up to date. There is MPEG this week and so we're going to make an update if there are fixes to the spec or additions.
Speaker 2:Another question on your inspector. What can your inspector do and how is it different from, say, MediaInfo?
Speaker 1:Yeah, okay. I think MediaInfo is a great inspector. You run MediaInfo with the name of the file, you can go command line or with the graphical interface, and it immediately gives you an overview. But one thing it doesn't do is deep inspection of each of your packets, and GPAC does that.
Speaker 1:We have several levels of verbosity, so it can display general information, and, as I told you, I would like to improve that, maybe with a graphical interface so that people can navigate into the content, and then you can go deep, deep, deep, up to the point of every bit that's been parsed. Netflix had an anecdote about this: GPAC was really, really picky and was going really deep into the bitstreams, and they were actually able to report errors in some standards where files were already widespread. And then, what do you do in that situation? Do you need to modify the standard, or do you need to ping the other implementation and say, hey, you should modify what you're doing?
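A hedged sketch of that deep inspection using GPAC's inspect filter; the option names are from memory, so verify them with gpac -h inspect:

```bash
# Per-packet inspection of the stream.
gpac -i input.mp4 inspect:deep

# Bitstream-level analysis that decodes syntax elements rather than just
# framing: the "every bit that's been parsed" view.
gpac -i input.mp4 inspect:analyze=bs
```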
Speaker 2:One last question: what is MotionSpell? I think you mentioned it at the start, and we just got a related question: what are the licensing terms for GPAC and FFmpeg? So, what is MotionSpell, and what are the licensing terms for GPAC and FFmpeg?
Speaker 1:MotionSpell is the company that holds the commercial arm of GPAC, and we're also making products, most of them coming from the GPAC technology or related to it. For example, we have an OTT live subtitle inserter for people who have a live OTT stream but need to add subtitles on top of it and don't know how to do it. And, as I said, we also do work related to compliance; we're doing a lot of work with the biggest companies in the industry, and I think that's pretty important. Regarding the license, the licenses of FFmpeg and GPAC are the same: LGPL v2.1 or later. And my chance at MotionSpell is that I went to see every copyright holder in GPAC, and they signed an agreement with me, so I can offer a commercial license, which is basically your terms: we propose basic terms, and you can modify them to suit your needs.
Speaker 2:Another question came in: what's the status of mp4box.js? Is it being maintained?
Speaker 1:It's not actively maintained. Well, no, that's not true: we're in the same situation as Bento4, right? We're merging the pull requests, but we're not actively developing it, and there's a reason for that. There is WebAssembly now, and you can build GPAC and run it in your browser, so I think mp4box.js is going to be deprecated within a couple of years. That being said, in the meantime, I think mp4box.js is a really, really nice tool that's really popular, and people use it, coordinated with WebCodecs, to transform their content in the browser. That's really amazing.
Speaker 2:Pleasure seeing you at NAB, pleasure working with you on this and I'm sure we'll be in touch over the next year. I'm going to need some help with the presentation I talked about. I'll be in touch and thank you for participating today.
Speaker 1:Yeah, my pleasure. Thank you, Jan. Thank you everybody for attending, and if you have questions, I'm always here.