Voices of Video

Designing Video Systems Around Latency Constraints

NETINT Technologies Season 4 Episode 1


Your neighbor cheered before your stream - now what?

In this episode of Voices of Video, we move past the generic advice to “make it faster” and dig into why latency has become a structural constraint in modern video systems. It’s no longer just a performance metric. It dictates where compute lives, how encoding is deployed, how traffic is routed, and what it really takes to deliver reliable, real-time video at scale.

With i3D.net co-founder Stefan Ideler, we unpack the architectural decisions that separate stable platforms from fragile ones.

We start with the workload lens. Live sports, betting, auctions, and interactive formats cannot hide behind buffers. When user actions, commentary, or creator feedback loops back into the stream, encoding can’t sit in a distant region. It must move closer to viewers, often across multiple sites. That shift forces teams to balance geographic distribution against blast radius, cost, and the very real operational load of running many locations.

One key insight: pretty ping times don’t equal quality. What matters is sustained throughput into last-mile ISPs during peak hours, packet loss behavior, jitter, and whether your providers truly have capacity headroom when it counts.

From there, we zoom out to platform strategy. Cloud accelerates early builds, but egress-heavy video workloads can quietly crush budgets. Hybrid models and bare metal often win on cost and control, yet introduce vendor sprawl and operational complexity.

Stefan outlines a pragmatic path forward:
Prototype quickly in the cloud across a few regions. Validate failover and backhaul. Then expand step by step into the right geographies with strong peering and measurable demand.

Instrument everything - edge QoE, per-ISP performance, jitter, exit rates - and let real telemetry guide routing decisions and site expansion. The objective is not just scale, but resilience: a self-healing system that detects carrier trouble and automatically shifts users onto healthier paths.

We close with practical guardrails to avoid over-engineering:
Optimize until users can no longer perceive improvement. Choose five excellent sites over thirty fragile ones. Challenge every “we must” with observability data. And recognize that legal, licensing, and sovereignty requirements now shape placement decisions as much as physics does.

The takeaway is clear: architectural intent beats component speed.

If you’re wrestling with latency budgets, interactive encoding, or the cost of scaling globally, this conversation offers a blueprint for making durable, data-driven choices.

In this episode, we cover:

  • Defining latency budgets and aligning them with architectural intent
  • Mapping workloads to geography and last-mile ISP realities
  • Measuring sustained throughput, packet loss, and capacity, not just RTT
  • Deciding when to distribute compute and how far to push it
  • Placing encoding for interactive and feedback-driven streaming
  • Balancing resilience against operational overhead
  • Expanding stepwise with strong observability and real telemetry
  • Navigating cost trade-offs: cloud egress vs. hybrid and bare metal
  • Designing for automated failover and self-healing routing
  • Accounting for legal, licensing, and data sovereignty constraints

If you’re coming to NAB Show, come talk with our team at NETINT and explore these questions in person.

Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.

Setting The Stakes For NAB

Voices Of Video

Voices of video. Voices of video. The voices of video.

Anita Flejter

Voices of video. Welcome to Voices of Video. As part of our preparation for NAB, we are speaking with partners across the VPU ecosystem about the architectural realities behind modern video systems, not product features, but the design decisions that shape how platforms actually operate. For today's conversation, designing video systems around latency budgets, we are joined by Stefan Ideler, co-founder of i3D.net. Stefan has more than 20 years of experience in infrastructure, networking, and game services. From the early days of consumer-hosted game servers to today's globally distributed orchestration platforms powering always-online environments, he's been deeply involved in how real-time systems scale across regions. Stefan, welcome. Great to have you with us.

Stefan Ideler, i3D.net

Thank you, Anita. Very glad to be here again at the Voices of Video.

Anita Flejter

Joining us as moderator is Leonardo Nieto, Director of Market Development for EMEA at NETINT, who moderates these conversations from a market perspective, bringing forward the recurring patterns, concerns, and architectural tensions he hears from video teams navigating real deployment decisions. Leo, I'll hand it over to you.

Leonardo Nieto, NETINT

Thanks, Anita. And yeah, it's great to start doing this for the European market, and therefore the European context of what we do as video technologists. Stefan, thank you for joining us. How have you been?

Stefan Ideler, i3D.net

Very good, actually. Of course, with the new year there's a little bit of a pause in the travel season within our main industry initiatives, which is starting to pick up again this and next month, when the famous industry conferences in the games industry get going, which is of course one of our big operating theaters, and of course also the broadcasting conferences such as NAB in Vegas.

Latency As A Structural Constraint

Leonardo Nieto, NETINT

Yeah. Which brings us to the themes we want to discuss, and that I think our audience is curious about. To tackle it directly: we're hearing repeatedly that latency is not just a performance metric, but is becoming a structural constraint. Teams, even people on the commercial side, are not just asking how fast the encoder is or where the bottlenecks are. The task itself has become quite complex as the demand for video increases constantly, and of course monetizing becomes much more challenging. They're asking: where should compute live? What happens to density, and thereby quality and other parameters, when you start affecting those variables? What does this do to failure domains? Fundamentally, that shifts the architecture discussion away from centralized efficiency and toward placement and locality. And obviously i3D is a great example of an outfit that manages this worldwide. So, Stefan, at what point does latency become the dominant constraint in systems design?

Workloads, Geography, And Risk

Stefan Ideler, i3D.net

Actually, it is a good question. There are a few different ways it comes into play. One is of course what kind of workloads you're trying to serve. Going back to our games group: computer games are one workload. Streaming could be a sports match going on, or it could be just a simple movie you're watching. In the case of gaming or the sports match, latency is definitely a constraint: the higher the latency, the more delay and the more buffering you have to do to account for variance in your streams. And you really don't want to hear your neighbors already cheering when on your screen the goal hasn't even been scored yet. So for certain workloads, latency matters a lot more than for others.

Another big question is where your users are, and from which points in the world you are serving your content, your streams, your infrastructure. There too, what is acceptable in terms of your latency budget matters, but what people sometimes forget is that even if your latency budget is relatively relaxed, there are questions like: is it a good idea to serve all of your worldwide users from, for example, two locations, one in the US and one in the EU? What if something goes wrong there? Then we're thinking about blast radius. Wouldn't it be easier to run multiple sites closer to the users? The internet breaks all the time, continuously, so you can work around disruptions by routing around them. So in summary: it depends on the workloads, it depends on where your users are and what their specific demands are, and sometimes you also have to research the specific region, where the connections by default might not go as the crow flies. They might take large detours, so you need to dig into the geographic makeup in terms of fiber connectivity and latency, because there might be things you don't expect.

And the third part is of course the age-old discussion of centralized versus distributed. If the stream is important, if it has real-time ties to current events or anything like that, you definitely want to avoid or minimize the risk of a total outage. And because there are disruptions on the internet, you might have quality constraints in one region with a certain ISP. But thanks to your measurements and operational excellence, you see that in your five other locations, for example in Europe, those throughput constraints are not there. Then, if you control where the user gets their stream from, you have the freedom to optimize quality by shifting some of that load toward your other locations. If instead you had only one location and something is not performing on the ISP's end, then you're stuck and you cannot act right away to alleviate the operational problem.
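
As a concrete sketch of the load shifting Stefan describes, the following Python picks the healthiest serving site for users of a given ISP. The site names, AS number, and scoring weights are invented for illustration, and a real platform would feed this from continuous per-ISP measurements rather than a hard-coded list.

from dataclasses import dataclass

@dataclass
class PathQuality:
    site: str          # serving location, e.g. "eu-central"
    isp_asn: int       # end-user ISP, keyed by AS number
    loss_pct: float    # packet loss over the last measurement window
    jitter_ms: float   # delay variation between packets
    rtt_ms: float      # round-trip time

def pick_site(measurements: list[PathQuality], isp_asn: int) -> str:
    """Pick the healthiest site for one ISP. Loss and jitter are weighted
    above raw RTT; the weights are assumptions to tune per workload."""
    candidates = [m for m in measurements if m.isp_asn == isp_asn]
    if not candidates:
        raise ValueError(f"no telemetry for AS{isp_asn}")
    return min(candidates,
               key=lambda m: m.loss_pct * 50 + m.jitter_ms * 5 + m.rtt_ms).site

telemetry = [
    PathQuality("eu-west", 3320, loss_pct=2.5, jitter_ms=12.0, rtt_ms=14.0),
    PathQuality("eu-central", 3320, loss_pct=0.1, jitter_ms=2.0, rtt_ms=21.0),
]
print(pick_site(telemetry, 3320))  # eu-central wins despite the higher RTT

The weighting makes exactly the point made above: the site with the prettier ping time loses once loss and jitter are taken into account.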

Distribution Versus Centralization

Leonardo Nieto, NETINT

And by the nature of what you've just shared, one size doesn't fit all. When you try to tackle these things from a technical approach and then mix in commercial realities, it's rather challenging to find, I don't want to say a holistic, but an encompassing way that makes sense for both the technical challenges and the commercial objectives. In that respect, when does latency stop being a tuning exercise, or one customer objective among many, and become a driving force in changing the infrastructure or the topology itself? I know it's not an easy question, and there's never one right answer. We've mentioned moving things closer to the customer, but at some point it's difficult to pinpoint where we have to stop thinking in simple either/or systems. Where would you say that point is, where the infrastructure, or something much larger, has to change?

Stefan Ideler, i3D.net

Yeah, I think one thing that resonates with me: if you want to do this well, and do it well over a longer time, not just a lucky one-off where it happens to go right, there are actually a lot of skill sets you need to combine to properly understand and design around low-latency requirements and what they mean for your infrastructure deployment. Back in the day, for low latency you would build up a big location, arrange some connectivity with the ISPs on handshake agreements, and you'd have it in place. But in today's world, with today's massive traffic volumes shifting around, that's no longer good enough. You need to monitor it; a lot of operational things come into play. Ideally, as we discussed before, you start distributing the environment, but there is of course overhead in dealing with distributed environments. Depending on the requirements of your product, the latency requirements, and the guarantees you need to meet, that overhead might be acceptable. But there's always a tipping point: at what point does it become a burden versus a gain? It's a tricky question. It's no longer just "I have a sports stream, it's X latency." Maybe that works once or twice, but if you want to build a long-term product with customer satisfaction and returning clients, then you really need an almost holistic view of all the different little knobs you can turn to optimize the experience. I think that's what teams underestimate. You can fire up a stream in a cloud on a single server with a 25 gig uplink and do your first experiments, but when you scale that to a more professional level, a lot of complexities come into play that at first glance you might not realize are there. That's the difficulty.

Leonardo Nieto, NETINT

Why do you think teams so often underestimate these requirements? From an initial approach, it seems like something you would treat as a big priority. Why do you think that is?

The Tipping Point For Change

Stefan Ideler, i3D.net

Well, one thing that's definitely a common pitfall: yes, you have latency, and latency from a certain test location might be low, but the second question you should always ask is: what throughput can I sustain? If you know where your audience is, you know what their main ISPs are. So you should be asking your vendor: what's the capacity from you toward these ISPs? The latency might be good right now, but if the path is already running at 90% capacity at prime time in the evening, it's not going to be good. The latency will probably degrade, there will be packet loss, and the throughput will not be there. Normally, when you do latency tests, you run a script and look at the numbers, but you need to move beyond that and also test the throughput, or find providers who give you real-time insight into, for example, the throughput toward those end-user ISPs.

The other big thing people forget is the operational overhead I mentioned before. I've met with companies that realized they had to distribute toward multiple locations because they had very strict latency requirements, problematic legislation, and so on. But before they knew it, they were working with 40 different vendors at 40 different locations, with 40 different support teams, 40 different legal entities, 40 different billing systems. That required full-time effort just to manage. Yes, you can then reduce the overhead by consolidating vendors and so on, but it's something you might not realize when you start. Of course, depending on how cost-driven your platform is, you can also just pick a single cloud and expand everywhere there. But we see with many companies nowadays, especially with egress-heavy products like content distribution and streaming, that cloud might not always be the best fit cost-wise because of the egress costs associated with your product. So almost by default you're again looking at bare metal vendors around the world who can give you the best deal on all that egress bandwidth you need to consume. But then we come back to the previous point: yes, latency matters, but how much throughput can you actually push, and does it meet what you need to reach? Don't just believe the providers you go with at first glance. You need to do more research, and that often gets forgotten. And to be honest, that research should also be done on your cloud providers, because they deal with the same constraints. It could be that some availability zones, or even specific zones within data centers, have congestion toward certain ISPs at certain times of the day. So you really need to do your homework well if this is a crucial operation for your business.

And then a final point: once you have all these sites, you usually have edges close to the users serving the content, but you might have backends somewhere else doing processing, user authorization, maybe user payments, tokens, whatever. Then you need to figure out: how do all my systems at the edges communicate with my backends? Are my backends redundant? Can they communicate well? Are those physical connections, or are they tunnels? Do they go back into VPCs at cloud providers?
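
Stefan's distinction between a good-looking ping and sustained throughput can be made concrete with a probe like the following sketch. It assumes you operate a sink service at a hypothetical probe.example.net inside the target ISP that simply reads and discards whatever it receives; the host, port, and durations are placeholders.

import socket
import time

PROBE_HOST = "probe.example.net"   # hypothetical sink inside the target ISP
PROBE_PORT = 9009
CHUNK = b"\x00" * 64 * 1024

def rtt_sample_ms() -> float:
    """One TCP connect, roughly what a ping script would report."""
    t0 = time.perf_counter()
    with socket.create_connection((PROBE_HOST, PROBE_PORT), timeout=5):
        pass
    return (time.perf_counter() - t0) * 1000

def sustained_throughput_mbps(seconds: float = 10.0) -> float:
    """Push data for a while and report goodput in Mbit/s. Run it during
    the evening peak: a 14 ms RTT says nothing about whether the path
    still carries the stream at 8 p.m."""
    sent = 0
    t0 = time.perf_counter()
    with socket.create_connection((PROBE_HOST, PROBE_PORT), timeout=5) as s:
        while time.perf_counter() - t0 < seconds:
            s.sendall(CHUNK)        # blocks once the path is congested
            sent += len(CHUNK)
    return sent * 8 / (time.perf_counter() - t0) / 1e6

Repeated across sites, ISPs, and times of day, the second number, not the first, predicts whether viewers will see buffering at prime time.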

Throughput, Capacity, And ISP Reality

Leonardo Nieto, NETINT

No, no, this is a great discussion, because we've opened a can of worms. There are just so many things that are important to consider. I don't want to say that one is more important than another, but the more we touch each of these facets, the more each one affects the others. It's important to keep that in mind, especially, as you mentioned earlier, the human operational load, not just the bare metal infrastructure, which, as technology evolves, becomes a little bit lighter. In itself, it gets commoditized. Exactly.

Stefan Ideler, i3D.net

Being able to consume compute on demand is pretty much standardized nowadays. But then managing the compute, installing the observability, that's where a lot of the workload comes in, and the choices you have to get right.

Leonardo Nieto, NETINT

Yeah. Maybe a side question about this, because we talk latency, and there's this human overhead, and technology keeps getting better, and infrastructure keeps getting more solid and, I guess, more permissive in the current growing market. Whether compute moves closer to the users, or whether, as you mentioned earlier, the cloud is too expensive and we make a change to do something on-prem: what truly changes? We often start tweaking these rather large decisions, and commercially it always sounds nice when you pitch it to the world, but behind it there's very tenacious work from engineering departments, and afterwards from marketing departments trying to understand how to monetize it. What really changes in this whole supply chain, because that's what it is, when we start restructuring these large volumes of hardware, software, and of course intellectual power? Maybe you can share your vision and your experience about what you're seeing.

Stefan Ideler, i3D.net

I think I can definitely offer a viewpoint on that, because especially during COVID we had to undertake this exact scenario a few times. We were dealing with systems which at that point were very centralized. We talked about those blast zones earlier, and there was a need to distribute closer to the users, so that if there was a problem with the internet somewhere, at least the impact was greatly reduced and the majority of the platform would remain stable. That's the discussion we had about where the compute goes and how it talks back to the more centralized backends that do the workflow logic, compared to the part that serves the users. But another thing is that moving toward the users generally is a performance improvement. Going from 200 milliseconds away to 10 is a massive, noticeable improvement from the user's point of view as well.

Leonardo Nieto, NETINT

Yeah.

Vendor Sprawl And Cost Pressures

Stefan Ideler, i3D.net

Going from 10 milliseconds of latency to 2 milliseconds, though, is no longer noticeable for the user. For them, it still loads essentially instantly; it works. At some point the user's perception of latency diminishes. If you've installed your observability and metrics, your automated systems can of course still measure the difference the human no longer feels or perceives, but the tangible benefit gets smaller. Once you reach that threshold, optimizing even further might not be needed, unless a single country has such a big population of users that you want a two-city solution, again for resilience reasons. Another thing is that the more insight you have into your expected future workloads, the better you can plan your capacity and density. There are definitely countries where you want to be close to the users; certain countries in Latin America instantly come to mind. But importing new equipment into those countries can be a very lengthy and difficult process. So for the countries where you know scaling up will be difficult, the predictions and analysis of what you will need three months and six months from now in those regions are absolutely crucial to making sure your hardware will actually be there when you need it. And that's just the compute. Compute might not always be the bottleneck; it might be the ISPs you're dealing with in a certain region. Looking at past projects, the Middle East definitely comes to mind, where capacity between the different ISPs is not always guaranteed. And if you want to increase it with those ISPs, we've seen in the past that it can be a very, very lengthy process, because of additional fees, taxes, bureaucracy. What in some parts of Europe might take a month, upgrading a private network interconnect with a certain ISP from 100 to 200 gig, might take a year in other parts of the world. So there are different variables around being able to expand your capacity depending on where you are. In some cases you have to deal with it, because there's no other option to be close to the users. In other cases, depending on your workloads, you might be able to accept a little higher latency and put your equipment in a neighboring country, still acceptable enough, with more flexibility to scale capacity up and down. But the more real-time your product is, and the more interactivity there is, where users can take actions back, talk back, or stream back on interactive platforms, the more stringent the latency requirements are, right? Of course. And then you have to deal with it.
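
Two of those planning rules lend themselves to a tiny worked sketch: stop optimizing once the change drops below what users can perceive, and place hardware orders a full lead time before forecast demand exhausts capacity. The 15 ms threshold and the per-region lead times below are invented placeholders.

def worth_optimizing(current_ms: float, proposed_ms: float,
                     perceptible_ms: float = 15.0) -> bool:
    """Below the perception threshold, spend the budget on resilience
    or capacity instead; the threshold itself is an assumption."""
    return current_ms > perceptible_ms and proposed_ms < current_ms

LEAD_TIME_MONTHS = {"nl": 1, "br": 6, "pe": 9}   # invented import/install times

def latest_order_month(region: str, months_until_capacity_exhausted: int) -> int:
    """Months from now by which the order must go out to arrive in time."""
    slack = months_until_capacity_exhausted - LEAD_TIME_MONTHS[region]
    if slack < 0:
        raise RuntimeError(f"{region}: lead time already blown; lease or reroute")
    return slack

print(worth_optimizing(200, 10))    # True: users will clearly feel this
print(worth_optimizing(10, 2))      # False: nobody notices 10 -> 2 ms
print(latest_order_month("br", 8))  # 2: order within two months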

Edge To Core: Backends And Links

Leonardo Nieto, NETINT

Of course. And I'm curious specifically about that, because across all video workflows, something relatively constant is the encoding power. Through the chain things get compressed and decompressed and transported, this constant pipeline, let's call it. Does that transport fundamentally change, drastically, in the style in which it's done at certain places, or does it become more a question of strategic placement, strategic density placement? Before, when the world was a bit more linear, it was certainly simpler: you sent the signal forward, people consumed it, and that was it. You said the magic word earlier: when you start doing this in an interactive way, which is how modern communications happen, everything is one-to-many, many-to-one, many-to-many, one-to-one, and everything gets juxtaposed on the network. It's a circus. How can one make sense of that workflow from the infrastructure perspective and identify those pipelines?

Moving Compute Closer To Users

Stefan Ideler, i3D.net

One great thing you mentioned there: the more interactivity there is, and if that interactivity also needs to be pushed back into the video stream, for example when a player says something back and it gets immediately featured in the stream, then suddenly it no longer suffices, even if you previously had a product that wasn't very constrained by latency, to have maybe one or two encoding locations in the world distributing the finished streams toward the edge nodes the users actually connect to. When there's a lot of interactivity going on, your encoding clusters need to be more distributed as well. And that's quite an operational change from what it was, for example, five or ten years ago. Very different. That ties back to the original point that workloads impact your architecture a lot. If you want to be a one-size-fits-all shop, then in my opinion there's no way around having both the compute and your encoding capabilities, together with the CPUs, as close to the users as possible, because then you can of course take on any workload. But if you're a more constrained team, with a specific mission, a specific company, specific workloads, then really look at what those are, and you can extrapolate: we need to be in this many locations, with this many distribution clusters for the streams, this many encoding sites. And if there's interactivity, there could be a question like: the majority of my interactive users sending me things are located in this country, so we need to be there, but the content might actually be consumed in another country across the world, where it happens to be popular to watch at the same time. It could even be a sports match with interactive commentary consumed by viewers somewhere else in the world, something that goes worldwide with lots of commentary and interactive possibilities in the streams nowadays.
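
The extrapolation from latency budget to number of encoding sites can be put in rough numbers. Light in fiber covers about 200 km per millisecond one way, so a round trip burns roughly 1 ms per 100 km, and real paths detour well beyond the straight line. The budgets in this sketch are illustrative, not prescriptive.

FIBER_KM_PER_MS_ONE_WAY = 200   # about two thirds of c; derate for detours

def max_site_distance_km(rtt_budget_ms: float, processing_ms: float) -> float:
    """Straight-line distance at which propagation alone eats the budget."""
    one_way_ms = (rtt_budget_ms - processing_ms) / 2
    return one_way_ms * FIBER_KM_PER_MS_ONE_WAY

# An interactive format with a 50 ms end-to-end budget that spends 30 ms
# on capture, encode, and packaging leaves 20 ms for the network: sites
# within roughly 2,000 km as the crow flies, and much less once real
# fiber routes and last-mile latency are counted.
print(max_site_distance_km(50, 30))   # 2000.0

A one-region-per-continent layout cannot meet a budget like that, which is why a feedback-driven stream forces multiple regional encoding sites, just as described above.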

Leonardo Nieto, NETINT

That's a bit like what we're talking about in this specific context, a sort of workflow process. But in this new world, this approach is more about infrastructure, or maybe topology and logic: understanding what's really out there and how it can perhaps be better leveraged. Incumbent technologies are technically able to do these things, but there are always big commercial questions that have to be answered as well. In essence, virtually anything is possible, but if it costs an arm and a leg, it doesn't really move things forward. What's the balance? Yes, we can do it, but it's not going to benefit anybody. No, no, continue, sorry.

Stefan Ideler, i3D.net

Yeah, no, that's a really good point. We touched on it a little earlier: how do you balance these latency and workload requirements against the added cost and complexity they might bring? If your goal is purely user satisfaction, then optimizing to the point where the user can no longer really notice whether the latency gets better is probably already good enough. You can keep distributing your workloads even further, but at some point it simply might not be needed anymore, and you're doing overkill that your workloads don't actually require. As I mentioned, you don't want a single cluster for a continent because of blast zones: there might be a bug in one of the routers causing problems, or one of the ISPs you connect to in that zone might have an outage. But there's also a trade-off: if instead of one big point of presence in Europe you use 30, then you have to observe and monitor 30 different domains, with all their latency and their ISPs. That increases the stress and workload on your monitoring systems and your LiveOps team, and it needs proper playbooks for every single location: how to escalate when something goes wrong. So it comes back again to the question: does my workload warrant that additional complexity and stress? Sometimes it definitely does, sometimes it doesn't, or only to a certain extent. Then you might go with five locations instead of thirty. I'm technical; I run all the technical things in the company. We always like to make it perfect. We like to build that Ferrari and make sure it works. But sometimes that leads you into the trap of over-engineering.
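
The "five locations instead of thirty" judgment can be sketched as arithmetic. The numbers below are invented; the shapes are the point: latency gains flatten quickly while operational overhead grows with every added site.

def median_latency_ms(sites: int) -> float:
    # Toy model: more sites shrink the distance to the median user,
    # with quickly diminishing returns.
    return 10 + 150 / sites

def ops_overhead_units(sites: int) -> float:
    # Monitoring, playbooks, escalation paths, vendor relationships:
    # roughly linear per site in practice.
    return 3.0 * sites

for n in (1, 5, 30):
    print(n, round(median_latency_ms(n), 1), ops_overhead_units(n))
# 1 site:   160.0 ms latency,  3.0 overhead units
# 5 sites:   40.0 ms latency, 15.0 overhead units
# 30 sites:  15.0 ms latency, 90.0 overhead units

Going from 5 to 30 sites buys 25 ms at six times the overhead; whether that trade is worth it is exactly the workload question Stefan raises.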

Leonardo Nieto, NETINT

Of course, correct. You have to weigh how complex is good enough before it stops being beneficial for anyone involved, right? When we take on the complexity of these very hard challenges, at what point does it stop serving the mission? Latency budgets shrink and operational complexity increases. How does one balance the benefit of lower latency against increased distribution and more failure domains?

Interactive Streams Reshape Encoding

Stefan Ideler, i3D.net

Yeah. The points we covered just now are the risks. If I summarize them into advice: first, avoid building a Ferrari if you don't actually need that Ferrari. Avoid over-engineering, even though it might be technically satisfying to do so. Look at what your client, or your workload, actually needs, and build for that. Second, really sort out operationally, with your monitoring teams and supporting people, how you are going to manage all your different domains and locations. By having that discussion, it will also become clear what the limit is, what the overhead cost is of adding another domain, another playbook. And the last part is figuring out when distributing across many, many different locations is actually a net loss versus a net gain. It could be that five locations is the optimal spread instead of thirty. It could even be just two, or your workload might not be that latency-sensitive at all, and you just want to distribute to mitigate a single blast risk; then one or two locations per continent might already be sufficient. So look at it strategically: what are my workloads, what are my capabilities? And yes, I also believe you should listen to the engineering feedback, but challenge them at the same time: does this meet the actual requirements, or are we over-engineering?

Leonardo Nieto, NETINT

Correct. And it's a good segue, because now, not necessarily to play devil's advocate, but: is there a point where distribution itself becomes counterproductive, and where is that point?

Stefan Ideler, i3D.net

I think, from what we learned in the past, when I look at our own experience deploying and building up new locations, what was always most important for us: will our users tangibly feel better by us being there? Will our clients feel a tangible benefit? And the benefit did not always have to be about latency. It was often latency-driven, yes, but it could also be that by going into a certain location, the internet ecosystem there was much more favorable for getting direct connections than the country we were serving it from previously. From that other country we might have had to use middleman parties to reach those end users, whereas by going into the country itself, even if the latency didn't differ much, we could enable direct connections with those ISPs. You cut out the middleman, another party you have no visibility into and that might add unreliability to your connectivity. So getting more control is sometimes also a reason. And there definitely are cases with legal requirements for why you want to cover and be inside certain regions. Having that clear is also very important, especially for legally firewalling things like sports licensing streams, which may only be served toward certain countries, or from certain countries toward certain users. Getting a clear picture of that can also really define what your workload is going to be; it's another thing people tend to overlook on the workload side. And one more thing: let's say you've identified that you do need to distribute pretty heavily. Don't try to do everything all at once.
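
Because legal and licensing constraints filter the placement decision before latency even enters it, a placement tool can apply them first. A minimal sketch, with entirely hypothetical sites and jurisdiction tags:

SITES = {
    "eu-nl": {"jurisdiction": "EU"},
    "us-tx": {"jurisdiction": "US"},
    "br-sp": {"jurisdiction": "BR"},
}

def eligible_sites(licensed_jurisdictions: set[str]) -> list[str]:
    """Keep only sites legally allowed to process and serve this stream."""
    return [name for name, meta in SITES.items()
            if meta["jurisdiction"] in licensed_jurisdictions]

# A sports stream licensed only for the EU must be encoded and served
# from EU sites, no matter where the latency-optimal site would sit.
print(eligible_sites({"EU"}))   # ['eu-nl']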

Avoiding Over‑Engineering

Leonardo Nieto, NETINT

That's a wise approach, and I think a very honorable one. A lot of people would tell you: yes, do this, and do it all the way. But no, this is a sensible thing to hear, Stefan.

Stefan Ideler, i3D.net

Good. Just do it step by step. Start with adding one more location. Verify it works, verify that your teams know how to work with an additional location, verify that the whole workflow, from the encoding to the end user consuming the streams and video, actually works. Ideally, by doing it incrementally, you learn every time you expand a little bit, so you can apply the lessons toward the next location, and so on. That includes optimizing your design every time, being able to make use of improvements in hardware or encoding technologies with every new iteration, and also optimizing your design for when something fails in one of those regions, so that you end up with a self-healing distributed platform. That's the utopia we all try to go for: if I have a problem with a certain ISP in a certain region that I'm currently serving content from, I automatically detect it and redirect those specific users toward a location where they can optimally consume the content. That's the holy grail of having it distributed. But you only get there by doing it step by step.

Leonardo Nieto, NETINT

No, but this is great, because for those in our audience designing latency-sensitive systems for the first time, it's a challenge. When you try something for the first time, it always looks a lot bigger than it turns out to be once you've done it. For those listening, what would a relatively simple approach to a first rollout look like? Give us a few points, consider this, this, and that, so that people who come back to us after the session can be directed toward you. A little podcast blueprint on how to approach it, simple enough to at least ask the right questions and take away a bit of that monster feel of the initial task.
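
Stefan's "holy grail" of a self-healing platform reduces, at its core, to a small state machine per (site, ISP) pair, with hysteresis so routes don't flap. A minimal sketch; the thresholds are assumptions, and the reroute stub stands in for whatever steering mechanism (DNS, anycast policy) a real platform would use:

DEGRADED_LOSS_PCT = 1.0    # trip the breaker above this loss level
RECOVERED_LOSS_PCT = 0.2   # only fail back once clearly healthy again

state: dict[tuple[str, int], str] = {}   # (site, asn) -> "ok" or "degraded"

def reroute(asn: int, away_from: str) -> None:
    # A real system would update steering here; this stub just logs it.
    print(f"AS{asn}: draining traffic away from {away_from}")

def evaluate(site: str, asn: int, loss_pct: float) -> None:
    """Feed each fresh loss measurement through the state machine."""
    key = (site, asn)
    current = state.get(key, "ok")
    if current == "ok" and loss_pct > DEGRADED_LOSS_PCT:
        state[key] = "degraded"
        reroute(asn, away_from=site)
    elif current == "degraded" and loss_pct < RECOVERED_LOSS_PCT:
        state[key] = "ok"   # eligible to receive this ISP's users again

evaluate("eu-west", 3320, 2.4)   # degrades: prints the drain decision
evaluate("eu-west", 3320, 0.6)   # between thresholds: stays degraded
evaluate("eu-west", 3320, 0.1)   # recovers quietly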

Stefan Ideler, i3D.net

Now, one thing that, coming from me, people might find surprising: if you're starting greenfield, then when you're creating your application and building the links between where your stream comes in, where you process it, and where clients watch it, I would definitely use cloud providers, where it's extremely easy to do your development. Greenfield: start with a few VMs, get it all hooked up. Even though it's not at scale, you can at least start generating proof that it works, and you can start experimenting with multiple availability zones and multiple regions. When you've got a good feel for the technology, and a good expectation of what your initial workloads are going to be, then, before any big production traffic hits it, you make the step of selecting providers in the regions where your research shows you need to be. Then it becomes a bit more serious, yet you're still not at 30 providers right away; you're starting. You start optimizing at those few locations first, with your first production traffic. That first production traffic is also the decision moment for your observability and monitoring: do you actually measure user satisfaction correctly? Do the technical details report back? Do you see problems with specific ISPs, reasons to expand to yet another country to maybe resolve those problems? Then you get into this incremental mode of going where your workloads and your latency constraints need you to go. Of course, it's a different scenario if you already have a very large production traffic volume. Then you're probably already running a big deployment somewhere, maybe one gigantic monolithic system. Then there are still always two options: either you start a little greenfield project to build something ready for the future, or you start disconnecting tiny pieces from the monolithic system to see if you can duplicate them in a thin form at a new location, get them working, and get an additional distribution approach going, because you already have existing tech. Yes, it might be old, but maybe you can still make the distribution part of it work. And if not, then you have to start fresh.
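
The "do you actually measure user satisfaction correctly" checkpoint implies collecting QoE signals per end-user ISP, not just server metrics. One minimal starting point is a player-side beacon like this sketch; the endpoint and field names are made up:

import json
import urllib.request

def send_qoe_beacon(session_id: str, asn: int, region: str,
                    startup_ms: int, rebuffer_events: int,
                    rebuffer_ratio: float) -> None:
    """POST one quality-of-experience sample to a hypothetical collector."""
    payload = {
        "session": session_id,
        "asn": asn,                        # aggregate per end-user ISP
        "region": region,
        "startup_ms": startup_ms,          # time to first frame
        "rebuffer_events": rebuffer_events,
        "rebuffer_ratio": rebuffer_ratio,  # time stalled / time watched
    }
    req = urllib.request.Request(
        "https://telemetry.example.net/qoe",   # hypothetical endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=2)

Aggregated per (asn, region), these numbers answer the expansion question Stefan poses: whether a specific ISP is the reason to open the next location.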

Leonardo Nieto, NETINT

And yeah, correct.

When Distribution Backfires

Stefan Ideler, i3D.net

Even when we look at our own internal systems, to take a little self-blame: when we look at our own game orchestration platform, at one point we also decided that what we had worked for all the use cases of the past, but for the use cases of the future, which in this case meant the ability to go multi-cloud with seamless distribution, you need to start fresh. The old tech was from a time when a lot of programming concepts weren't as developed as they are now. And when I skip ahead to today's world, with all the advances in AI and how quickly you can put new applications together and get to a POC: if I had to choose right now, say I was given a big enterprise workload, maybe 10 or 15 years old, and a mission, "go fix this, distribute it, and optimize the latency for users," I would definitely try to build the application from scratch. With today's advances, it's a lot easier to do so.
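
The second option, peeling thin slices off the monolith, usually starts by routing a small, stable share of traffic to the extracted piece at its new location. A minimal sketch; the 5% share and the site names are assumptions:

import hashlib

NEW_STACK_SHARE = 0.05   # start tiny, grow as confidence grows

def route(user_id: str) -> str:
    """Deterministic split: each user stays on one side across sessions."""
    digest = int(hashlib.sha256(user_id.encode()).hexdigest(), 16)
    bucket = (digest % 10_000) / 10_000
    return "new-edge-site" if bucket < NEW_STACK_SHARE else "legacy-monolith"

print(route("viewer-1234"))   # same answer for this user every time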

Leonardo Nieto, NETINT

No, and obviously it's always challenging. I think it's exciting, but it's also something that requires a lot of collaboration. And for our audience who's listening, those coming to NAB: we can have this discussion with our teams, just have a chat, really, because it's not really about just mentioning VPUs, or infrastructure, or i3D, et cetera. It's something that requires input from a lot of people at different stages of the video transport.

Stefan Ideler, i3D.net

I do think one thing has changed. Yes, users have always been critical. In gaming as well, the pitchforks come out whenever a game crashes, and when the video stream isn't working during an important event, everyone instantly complains. But the reason it's even more pressing, I believe, in today's world, is that there are so many other alternatives competing for your users' time. A hiccup that might have been acceptable 10 years ago, when users would just be annoyed but continue watching, might now mean that users are already gone, because there are so many other things, second screens, social media, competing for their time. So in a way, I personally feel that over the last few years the demands toward the infrastructure, the video transport, the whole chain, have definitely become higher, more demanding, even though the latency and the geography didn't really change. We cannot afford the level of interruptions anymore that might have been acceptable in the past.

Leonardo Nieto, NETINT

Great. I guess what we can take from this session is, number one, that latency is not solved by a faster component. It's a much more complex challenge to fix, and, as you just mentioned, it's solved by architectural intent. And as you said, there is more competition for people's attention, and we cannot lose viewers to a brief moment of distraction or signal failure. Placement decisions affect density; it becomes a worldwide puzzle of how to maximize the video transport.

Stefan Ideler, i3D.net

Yes, and there's also the legality aspect, which for some workloads is really becoming a thing. There have been licensing requirements with geofencing in the past, but now, in some of the countries we are active in, the sovereignty requirements are also coming into play. They want to be able to point at where the data gets processed. A couple of years ago this was almost never mentioned in any of the organizations.

Stepwise Expansion And Observability

Leonardo Nieto, NETINT

People associate latency with this archaic idea that every nanosecond is what's important. That was all well and good when it was just one signal going from one side to the other, glass to glass, as they call it, and then that's it. But now there's more interaction, more competition, not on just one screen but maybe multiple screens, picture-in-picture ways of sharing information. What we're taking from this is that it's not just about the milliseconds and how fast the video is delivered, but more about the design of the entire system. And that requires a lot of very smart disciplines working together at different stages, because yes, the FPGA technologies are nice, but eventually the world changes. It's great to have these insights from you, Stefan. We really appreciate it. This has been a very grounded discussion, and I think it's valuable for our audience, for people listening. Obviously we've tried to just open the conversation; we've covered a lot of subjects here that may bring up more questions, and we simply invite people to come talk to us. We love having these crazy ideas, and then bringing them down and really delivering true innovative solutions into what's happening in the market today. Thank you for joining, Stefan. I'm not sure if we'll see you at NAB, but I'm sure we'll see your team.

Stefan Ideler, i3D.net

The team is flying in, and if we don't see each other in person at NAB, then there's always, of course, IBC in September in the Netherlands. Perfect.

Anita Flejter

Thank you both: Leo, for channeling the real questions we hear across the market, and Stefan, for sharing a perspective shaped by years of building and scaling distributed infrastructure. What's interesting about conversations like this is how consistent the concerns are across teams: not just performance, but predictability, placement, and operational control. For those heading into NAB, these are exactly the architectural discussions that matter, not headline metrics, but how video systems are actually structured and operated. Thanks to everyone listening, and we'll see you at the next Voices of Video.