Voices of Video

Who Owns Your Digital Brain?

NETINT Technologies Season 3 Episode 14

Reza Rassool, CEO of nonprofit AI lab Kwaai, shares his vision for democratizing artificial intelligence through personal AI systems that prioritize user privacy and local computing.

• Kwaai aims to create personal AI assistants that run locally on users' devices rather than in the cloud

• Current cloud-based AI services create privacy concerns as they require uploading personal data to remote servers

• Retrieval Augmented Generation (RAG) technology separates language models from knowledge bases for more efficient local processing

• The streaming media industry's evolution provides valuable lessons for developing standardized, component-based AI systems

• Moving AI processing from cloud to edge devices addresses both privacy concerns and power consumption limitations

• Comparing Kwaai's mission to Linux - an open-source reaction to monopolized operating systems that made the industry healthier

• Personal AI Operating System will provide a framework for running AI assistants with extensible "abilities" similar to apps

• Technical innovations like Mamba architecture reduce computational complexity from quadratic to linear growth

• Future vision includes distributed, peer-to-peer AI networks functioning as digital public infrastructure

• The opportunity window for creating democratic, user-controlled AI is limited and requires volunteer participation

Visit kwaai.ai to explore Kwaai's proof-of-concept avatar and learn more about joining their community of over 500 volunteers working to democratize AI.


Stay tuned for more in-depth insights on video technology, trends, and practical applications. Subscribe to Voices of Video: Inside the Tech for exclusive, hands-on knowledge from the experts. For more resources, visit Voices of Video.

Mark Donnigan:

Voices of Video. This is another edition of Voices of Video and, as usual, we bring you the very best guests, movers and shakers, and we have today the best of the movers and shakers, in video streaming, that is, I guess I have to clarify that. Reza, welcome to the show and thank you for joining us today.

Reza Rassool:

Thank you, Mark, for having me on the screen. Thanks again.

Mark Donnigan:

Yeah, that's right. That's right. Yeah, well, this is really an exciting discussion because, wow, I mean, I can't go anywhere without seeing AI and avatars and generative AI and all of this, and for those of us working in video, and we're going to get to your background, because it is fair to say that you are an OG, this is really exciting to think about the revolution that's upon us. So, yeah, this is going to be a great discussion today, so why don't you go ahead and jump in and introduce yourself, tell the audience who you are and a few of the things you've done and what you're doing now?

Reza Rassool:

Yes, so my name's Reza Rassool. I'm CEO and chair of a nonprofit AI lab called Kwaai. Kwaai is a word from the tip of Africa. It means cool, but it also means angry or wild, so it's one of those sort of nuanced words that have-

Mark Donnigan:

I like it.

Reza Rassool:

Things like sick or brat, for instance.

Mark Donnigan:

I like it.

Reza Rassool:

Okay, so my background. What brings me to this screen? I've been fortunate to be in the right place at the right time and been involved in some of the coolest digital media projects. In the 90s I fell in with a group of developers in the UK, I guess it was called a startup, and we created one of the first nonlinear editing systems around. The product won a technical Oscar and a technical Emmy Award, and it's on the strength of that that I came to America. This is 30 years ago.

Mark Donnigan:

And.

Reza Rassool:

I had just been very fortunate to be on some pretty cool projects. Gosh, we can go into the details of that history.

Mark Donnigan:

Well, maybe you should do a quick highlight reel, because I think you're being a little too modest here.

Reza Rassool:

Okay. So the product was called a nonlinear editor at that time. Prior to that, movies were edited on film by splicing film together, or by recording to videotape and then doing a multi-deck edit, and it was very painful to do that. The notion of transferring the footage onto a disk drive was new and we built that. The product was called Lightworks and it sold very well in Hollywood, and it was actually on the strength of that that I brought my family out here 30 years ago. The product's still going. Lightworks is still a going concern. It's been through many different ownerships. Martin Scorsese still cuts on Lightworks.

Mark Donnigan:

Amazing.

Reza Rassool:

And so you know there's a whole sort of rap sheet of Oscar-winning movies that were created on an Oscar-winning product. Let's see, the job that brought me over here: I came to work for a disk drive company that was setting up a video systems division. You might recall the name Micropolis. They actually built the world's first video-on-demand server. So it was this notion of, okay, video on disk drives is going to be a big thing.

Reza Rassool:

And then I had a stint of time in the biomedical space, where my audio and video expertise was used on a cochlear implant project and on bionic eye development for its sister company, Second Sight. But then I started working for a startup called Widevine Technologies, now part of Google, and that was in the business of encrypting video. This thing we're doing at the moment, the encryption of it, at least for the TV that you watch, over-the-top television and so on, that's largely encrypted by Widevine technology. Gosh, and then another startup that also exited to Google, in the music gaming space.

Reza Rassool:

Then I thought I was hotter than anything and I created a venture studio and started to nurture young startups. We had a CTO-as-a-service practice where we would bring a CTO to a startup, bring them a development team, which was a nearshore team based in Mexico, and then also bring them finance, and that model worked for a while until it didn't. And then a recruiter from Real Networks said, hey, we're looking for a CTO, and so I went to finish off my career as a CTO and helped pivot that iconic brand from streaming media to AI. And when I retired from Real Networks, I formed this nonprofit called Kwaai.

Mark Donnigan:

Amazing. So now we go in a circle.

Mark Donnigan:

Yeah, that's amazing, that's great. Well, thank you for that. So, you know, we're going to get into the technical side. Our audience are largely engineers and I know everybody's just, you know, chomping at the bit to hear us dive in, I mean, what technologies are involved, you know, in both producing, but also streaming and delivering, an avatar experience. So we're going to get into that. But I think the huge question that everybody has before you start talking about the tech is, what about, you know, ethics, regulation, what about that kind of stuff? And I know that that, you know, definitely dovetails into why you're even doing this, why you created the nonprofit, you know. So why don't we start there? And maybe, you know, give us kind of an initial: where are we in the industry in terms of AI development, you know, with ethical, you know, both concerns, what people are doing, what they're not doing? You know, what's your perspective there, and Kwaai's perspective?

Reza Rassool:

Well, the industry is in its infancy. There's still a lot to learn, and one way of accelerating that learning is by learning from previous industries and learning from the mistakes we've made in the past as well. The impetus for setting up Kwaai when I retired from Real Networks was, jokingly, that I thought setting up a nonprofit would be easier than learning how to play golf, but instead-

Mark Donnigan:

You should have taken up golf.

Reza Rassool:

Instead, I'm actually working harder now than I've ever done throughout my career. The angst that I felt when I left Real Networks was an angst that AI was heading on a dangerous path. The choices we have would be to write sort of whiny blog posts about it and sort of wag your finger at the wind, but the alternative is to actually go and do something about it. And so I put out a call-to-action video, it was a year ago, almost a year to the day, and asked, hey, is anyone else feeling the same angst about AI? If so, let's just start a discussion group. And we started weekly public meetings on a Friday; we just celebrated our 50th weekly public meeting.

Reza Rassool:

So that's amazing, and so in those meetings we started to ideate about how we can do something practical, how we can mount an intervention of volunteers that can help bend the arc of history, help actually change the course that we're on. And it seems idealistic. So the broad mission was to democratize AI, and that sounds like a fancy sort of bumper sticker that really doesn't mean anything. But the way we're implementing it is through personal AI. We have this notion, a vision, that everyone should own their own AI. If AI is going to be truly uplifting for humanity, well, let's start with the problems of humans first, rather than corporate AI.

Mark Donnigan:

Yeah, yeah, yeah.

Reza Rassool:

So, Mark, imagine if you had your own personal assistant that gave you a 10x improvement. That's a vision, that's a destination of where we want to take the technology. If it did that, if it gave me a 10x improvement in just managing my domestic life, my personal life, it'll make me a better employee. It'll make me a more efficient consumer. It'll make me a more engaged citizen.

Mark Donnigan:

Yeah.

Reza Rassool:

So that was the thought. Yeah, we had some kind of revolutionary rhetoric in the early calls to action, but now it's very practical. It's around personal AI and we're actually building stuff. The movement has grown. The movement is now over 500 members, 570, maybe 575, but the thing is I've lost count, because the size of the movement is doubling every quarter. And it's largely engineers; about 80% of our energy is going into code development, and the rest is split between research and policy development.

Reza Rassool:

So there's a lot to tell you about where it's going. People can go and find it on kwaai.ai.

Mark Donnigan:

That's right, and we'll link it up in the show notes and all, for those that watch this later. So maybe a good question that I have, and maybe others have too, is, okay, so personal AI, that sounds like a custom GPT. You know, at least that's, I think, what a lot of us would associate it to, you know, something that somehow has ingested and processed, you know, all of my documents and, you know, all the information it knows about me, you know, via certain apps and online experiences, so it truly gives the Mark Donnigan experience and knowledge and, you know, in other words, who I am, you know, versus the, you know, Reza Rassool one. Which right now, we all know, if we just query ChatGPT, you know, prompt it, we get all of the world's, you know, intelligence that's been indexed, right. So maybe you can explain, for me anyway, and certainly for our listeners, is that it, you know, is it? Or, you know, how is it different, or how does it relate, or, you know, yeah.

Reza Rassool:

I guess if I was one of the SaaS providers, that is how we would architect it. But if, rather than building up a big SaaS business model, a big SaaS service, you focus on solving the problems of the individuals, you'd approach the problem differently. The big angst people feel is that in order to interact with these SaaS AI services in the cloud, you have to upload your information to the cloud, and the terms of service, well, if you're using the free version, is that they-

Mark Donnigan:

Own your information. That's right.

Reza Rassool:

If you are, let's say, a small business, you're concerned about intellectual property rights, you're preserving your own trade secrets and your intellectual property, so you'll feel nervous about that. If you are an individual and you wanted to have a personal assistant that is truly informed by all of your private and your most intimate data, you'd feel nervous about sending that to the cloud, especially if those cloud services say, hey, we're free to use that data-

Reza Rassool:

Any way we darn well please, to monetize you better, to do all sorts of things.

Reza Rassool:

So we said, well, the data really needs to stay with the user, and so that moves you down an architectural path of saying, well then, the processing needs to come to where the data is. So that's the big architectural decision, I guess: is AI going to run as a service in the cloud, or is AI going to run as a local service? So this is also an architectural destination. We clearly can't get there now, but those are the challenges that we're going to face and that we're trying to overcome in order to get to what Mozilla calls local AI. So there are a few organizations that get it, that understand what's going on.

Reza Rassool:

The oligopoly, this small group of maybe half a dozen corporations that control AI, want to keep AI up in the cloud, and so they will over-bloat their large language models. They'll unnecessarily conflate a language model with a body of general knowledge, much like the big, you know, multi-billion-parameter models. This is an unnecessary conflation. In fact, what it does is it forces the model to run in the cloud only.

Reza Rassool:

What we learned in the streaming media industry was that there is a stack, it's a layered approach, that there's a role for a codec, there's a role for the video effects layer, there's a role for the rendering and display. And to try and achieve all of those functions of an entire video pipeline, all the way from encoding to decode, in a single body of bloatware is not practical. Yes, it'll preserve the moat around your business for as long as you can, but eventually, in order to get more people and more innovation engaged in the problem, in the entire pipeline, it has to be broken up into its components and standardized, like we did in the streaming industry.

Reza Rassool:

We created standards that interface between each layer in the stack, and we realized that some parts of the stack can run locally. If you look at your television, you know, the decoding part is happening in your TV.

Reza Rassool:

It's not happening at a big broadcast center that's then sending you the rendered video. So we understand that, and that understanding comes from an industry maturing, and there were lots of sort of dead ends that we took in the video industry until we got to the state where we are now. So look, we are drawing from those painful lessons that we learned, and we're applying that to the AI pipeline.

Mark Donnigan:

Yeah, interesting. So, how does what you're working on... so I understand what you just said. The way I would think about this is that personal AI has almost a little bit of a double meaning, in that it's personal, as in, you know, it is a custom, I guess you would say a trained model, or a custom model that is trained on my data, right, but it's also personal in that it's running on my local personal device.

Mark Donnigan:

It's not, you know, sort of this shared resource out there where, again, as you pointed out, you really don't know, is my data, my information, getting, you know, intermingled with someone else's. And not only does that create privacy, you know, issues and concerns, which we all should be worried about, but there's even just the issue of, again, you know, like, if this is a personal assistant and it's helping me, well, guess what, you know, Reza has a different set of experiences than Mark, you know. So it's like if ours are intermingled, you know, that may be useful on some level. It may be good to be like, hey, Mark, you know, Reza over here has something you could learn from, you know, and vice versa, perhaps.

Reza Rassool:

Well, let me just make one modification to that. It's actually not retrained on your local data. What we've done is we've separated the role of the language model from the knowledge base, and the knowledge base we implement in a technology called RAG, Retrieval Augmented-

Mark Donnigan:

Generation. Yeah, that's right.

Reza Rassool:

And the language model is simply used for its linguistic capability, its ability to understand language. The knowledge, the general knowledge that, for instance, GPT-4, GPT-3.5 and all of those carry, that is a conflation of two parts of the stack.

Reza Rassool:

It's an unnecessary conflation, yeah, that's right, and it's a conflation in order to preserve a moat. Let's break that moat apart. Okay, so now you realize that you can actually use a smaller language model, an SLM if you will, and I only need it so that I can talk to my data. I want to have a conversation with my data.

Reza Rassool:

I don't want to have to construct an SQL script or whatever.

Mark Donnigan:

And that's basically all it's doing, converting a spoken query into a database query, I guess. Yeah, and so are you creating a format, then, that this body of knowledge is stored in so that it can be retrieved? Because, you know, standards are important to you, I know.

Reza Rassool:

Well, fortunately, these things exist already. There are a number of vector databases, ChromaDB, Pinecone, Couchbase, and many of the traditional database companies have started to enable and support vector objects. So let me explain, let's just go under the hood of RAG a bit.

Mark Donnigan:

Yes, please.

Reza Rassool:

And so RAG is called Retrieval Augmented Generation. It allows you to take a body of knowledge, and it's basically written knowledge. It'll become multimodal eventually, but for now it's written knowledge. Let's say it's the entire scraping of a website, or let's say it's the transcriptions of meetings that we have, and all of that is in a big bucket. And now it translates all of that written text into its meaning, so it's a semantic representation, and the math for creating that semantic representation is called vectorizing. Now, a vector, let's say you've got a two-dimensional vector, it'll have an X component and a Y component. These vectors have many hundreds of dimensions, and I think the database we're using has got 1536 dimensions. So that means every chunk of text has now got a vector representation. And so when you're trying to compare two chunks to say, do they mean the same thing, it is the mathematics of comparing the alignment of these two vectors, and that math is easily understood.

Reza Rassool:

That is the sort of matrix math that we're doing anyway in AI processes. So we've separated what is the job of the language model and what is the job of the knowledge base, and now you can have your knowledge base locally.
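To make that comparison concrete, here is a minimal illustrative sketch (not Kwaai's actual code) of the vector-alignment math using cosine similarity; the tiny 3-D vectors are invented for the example, whereas real embeddings have hundreds of dimensions, like the 1536 mentioned above.

```python
# Illustrative sketch only: two chunks "mean the same thing" when their
# embedding vectors point in nearly the same direction. These 3-D vectors are
# made up for the example; real embeddings have hundreds of dimensions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| * |b|): 1.0 = same direction, near 0 = unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

chunk_about_meetings = np.array([0.9, 0.1, 0.2])   # pretend embedding of one chunk
chunk_about_codecs   = np.array([0.1, 0.8, 0.3])   # pretend embedding of another
query                = np.array([0.8, 0.2, 0.1])   # pretend embedding of the question

print(cosine_similarity(query, chunk_about_meetings))  # high, about 0.99
print(cosine_similarity(query, chunk_about_codecs))    # lower, about 0.38
```

The chunk whose vector aligns best with the question's vector is what gets handed to the language model as context.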

Mark Donnigan:

Yeah, that's right, because you don't need an H100 sitting in your desktop, you know.

Reza Rassool:

And this knowledge base now can be secured by you. It's not a honeypot waiting for hackers to crack, where they only need to crack it once and then they've got access to millions and millions of customers' data; that's the current topology. So it's more secure. It means everyone is securing their own data. It's actually less of a burden for the big SaaS providers; they can focus on what is their special sauce. But the knowledge should really stay local.
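As a hypothetical illustration of keeping that knowledge base on your own machine, here is a short sketch using ChromaDB, one of the local vector databases mentioned earlier; the collection name and documents are invented for the example, and this is not Kwaai's actual implementation.

```python
# Illustrative sketch: a RAG knowledge base that lives on the user's device,
# using ChromaDB as the local vector store. Names and documents are made up.
import chromadb

client = chromadb.Client()  # in-process and local; nothing is uploaded anywhere
collection = client.create_collection(name="personal_knowledge")

# Add chunks of personal text; Chroma embeds them with its default local model.
collection.add(
    documents=[
        "Dentist appointment moved to Thursday at 3pm.",
        "Meeting notes: the community call now happens every Friday.",
    ],
    ids=["note-1", "note-2"],
)

# Retrieve the chunk whose vector best aligns with the question, then hand it
# to a small local language model as context for the answer.
results = collection.query(query_texts=["When is the weekly meeting?"], n_results=1)
print(results["documents"][0][0])
```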

Mark Donnigan:

Yeah, yeah, for sure, very interesting. Well, what progress are you making evangelizing this through the developers of these, you know, these large language models? But it seems to me like it's even the device makers, the, you know, Apples of the world. And I know how difficult it is to talk about, so maybe Apple's actively engaged with you and you can't say it, but, you know, just generically, are you making some progress getting the attention of, you know, Samsung and Apple and these big device makers, which it appears to me would be a natural place?

Reza Rassool:

Absolutely.

Mark Donnigan:

Yeah, where this technology needs to get to. At least they need to say, yes, this is the right approach.

Reza Rassool:

Yeah, absolutely. And so look, if you look at the way I approached AI at Real Networks, let's rather-

Mark Donnigan:

By the way, yeah, sorry for interrupting. You know, we totally scooted over that, so maybe you better rewind and explain what you were building at Real Networks.

Reza Rassool:

Fantastic, okay, and that will actually answer this question without divulging what we are doing at the moment.

Mark Donnigan:

Yes, great, great.

Reza Rassool:

So, Real Networks, I joined in 2016. The company was ailing. It was bleeding cash. Rob Glaser, the original founder, had taken a leave of absence from the company and came back to a company that had completely lost direction. It was bleeding cash and it was in dire need of a turnaround. So Rob hired a whole new raft of executives, myself included. I was the CTO, and I guess the implicit mission was to build new technologies that would be the foundation for a turnaround, and the bet that I made in 2016 was on AI. Look, it wasn't an obvious bet in 2016; it's kind of, you know, kind of duh now. And so I started injecting AI into all of the product streams within Real Networks. My hunch was that it would take hold in the streaming media business, and I was also running that business in China. I was overseeing the engineering development in China, where the codec business was; we had lots of relationships.

Mark Donnigan:

And Real Networks was really, or maybe still is, very strong in China.

Reza Rassool:

Right. For the codec, it was like the de facto codec of China, the RMVB codec.

Reza Rassool:

And so we had lots of connections with device manufacturers, and that business just, unfortunately, was the victim of US and China trade tensions. COVID also came along and did a number on it. But look, my bet was on AI, and within one of the product groups there was a group called RealTimes, which was a camera roll backup for your mobile phone and it would create movies out of your content. And within that we started to experiment: well, can we identify faces within the videos?

Reza Rassool:

Yeah, and within a very short space of time we found that our accuracy was going up, and we ended up with, I'd say, arguably the best real-time face recognition stack in the industry. And this was running in the cloud, on GPUs, and as a topology, I thought this was probably the dumbest thing to do. And we had some success, you know, we were starting to have technical success and some early business success. But I started to lobby against building the business in the cloud, and instead I said, no, this processing needs to happen at the edge.

Reza Rassool:

It made no sense to stream the video of cameras at the edge up to a cloud for processing, to then give you an answer and say, yeah, that's Mark's face in the stream. It's got privacy concerns, it makes no sense on the data ingress costs, and you simply can't create a service big enough to support a massive metropolitan source of camera data. This has to happen at the edge. It has to happen in devices, in cameras, and so I started to pursue embedded processing, and eventually that's where Real Networks ended up. So one of its divisions, a division called SAFR, ended up with a kind of enterprise-level Ring-doorbell type of solution where the video is processed locally in the device. Instead of a $10,000 GPU system, you can actually run it on a processor that costs a few dollars, and that is possible.

Reza Rassool:

Our current addiction to GPUs is completely brain-dead. They're really great for training models, but for running inference, zero sense. And so I've been through that painful process of first trying to persuade the decision makers of the sense of taking the processing to where the data is. And I'm going through that again now, but having learned from those painful lessons, I know that this makes sense. That's what we're doing. And so, Real Networks-

Reza Rassool:

Let's just close the story up. On the strength of these two divisions, the SAFR division and the KONTXT division, one, KONTXT, is a natural language processing division, and SAFR is a computer vision division. On the strength of having built those two, Rob was able to take the company private, and so, where in 2016 he owned 38% of the company, now he owns 100% of a debt-free company that has got two promising AI divisions. So, on the strength of that, I retired, and here we are.

Mark Donnigan:

Yeah, that's great. So, yeah, thank you for that. I'm glad we covered that, because we referenced that you were at Real Networks but didn't talk at all about what you were doing, and that's important. So, coming back to how we get the device makers and the ecosystem partners to think different, because there's a common theme that I'm hearing you reference, implicitly at least, although somewhat explicitly as well, and that is that we need to think different about how we're implementing AI, you know, how we're using these models. Some of it is, there's just incredible waste, you know, which I completely agree with, and I think, you know, a lot of people are talking about that. I like your reference to the addiction to GPUs. You know, at the same time, there's all of the, you know, privacy concerns, there's everything that you're trying to address. But so what is the attitude right now? As you're out there and you're trying to get people to think different and behave different, you know, are they playing along? Are they not quite sure? Are they right with you?

Reza Rassool:

I think, as a nonprofit, we're in this fortunate position of being, well, first of all, we don't have the angst of trying to generate a profit.

Reza Rassool:

So that motive often deforms the directions, the decisions you'd make in a business. So I think to some extent it's liberating, and we can always say, hey, that's the right way. And all we can do is point the way, but also show the way. And so, rather than just pointing the way, we are creating products and we're illustrating, and so if you go to the Kwaai website, you'll find a proof of concept of an avatar and you can play with that, and this is one where it's a talking head. But the knowledge base is a RAG knowledge base and it's not running in one of the SaaS services. It's running using an open source language model; it's actually using Llama 3. And so we're grateful to Meta for creating it.

Mark Donnigan:

Thank you to Meta. Yes.

Reza Rassool:

I mean, you know, many companies have been philanthropists in the industry. I mean, Real Networks' philanthropy was inventing streaming media. You know, it published the first streaming media protocol, RTSP, and gave it away for free. When we look at some of the sort of iconic or pivotal tools that get developed, you know, TensorFlow, that was given away for free, thank you very much, Google. PyTorch, given away for free, thank you very much. And these become important acts of philanthropy that for-profit companies make, and it makes the industry healthier. So Kwaai is a nonprofit mounting an altruistic intervention to make the industry healthier. We actually liken ourselves to the Linux of AI, maybe the Linux of personal AI.

Mark Donnigan:

Interesting.

Reza Rassool:

Remember, Linux was actually a reaction to the monopolization of the server operating system.

Mark Donnigan:

That's right. Yeah, that's right.

Reza Rassool:

You maybe had half a dozen choices of server operating system 30 years ago, and if you wanted to do anything at the kernel level, you needed to have a license, a million-dollar license, to get access to the source. And so along comes Linux: volunteer, open source, nonprofit, and now the industry is healthier for it.

Mark Donnigan:

That's right yeah.

Reza Rassool:

That's the example from history that we are drawing from.

Mark Donnigan:

Yeah, I love it. I think that's a great example. So, you know, one of the things that I've heard you talk about, and in our one-on-one discussions you made reference to, like, a personal AI operating system, right? So maybe explain that. You know, is that actually what you're doing? I mean, do you position this as, we're building a personal AI OS, or is it just a way to explain it?

Reza Rassool:

It's an operating system, and, of course, yes, it also causes a conversation and pushback.

Mark Donnigan:

Sure. Well, also, because people say really, what is that?

Reza Rassool:

We've got Linux, why create an operating system? Hold on, macOS is built on top of another OS and calls itself an OS. Okay, so let's dive into that term. What is an operating system? An operating system is a layer of software that manages compute resource and allows you to run applications on top of it. And so, in the same way, Amazon's Alexa is classed as an operating system, if you go to the Wikipedia entry.

Reza Rassool:

But what is it? It's not running on the bare metal of IoT devices. It's a framework for automating IoT devices and running applications; they call their applications skills. So, in that way, the Pi OS, the personal AI operating system, is an operating system. It allows you to run a personal assistant that is extensible through packages of functionality we call abilities. Those might be like the apps that you have. And so you will have a personal AI assistant out of the box that will have some innate abilities, like the ability, maybe, to understand speech and to talk back at you, much like your phone has some innate apps in there when it comes out of the box, it's got a dialer. But everything else, all the other extensions, would come from for-profit vendors that are building towards an API and a specification.
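As a purely hypothetical sketch of the "abilities as apps" idea (the class and method names below are invented for illustration, not Kwaai's actual Pi OS API), an assistant framework could define a small interface that each ability package implements:

```python
# Hypothetical illustration of extensible "abilities"; names are invented for
# the example and are not Kwaai's real API.
from abc import ABC, abstractmethod

class Ability(ABC):
    """A package of functionality the personal assistant can load, like an app."""

    name: str

    @abstractmethod
    def can_handle(self, request: str) -> bool:
        """Return True if this ability should respond to the user's request."""

    @abstractmethod
    def run(self, request: str) -> str:
        """Carry out the request and return a reply for the assistant to speak."""

class CalendarAbility(Ability):
    name = "calendar"

    def can_handle(self, request: str) -> bool:
        return "meeting" in request.lower() or "schedule" in request.lower()

    def run(self, request: str) -> str:
        # A real ability would read the user's local calendar data here.
        return "You have one meeting today: the Friday community call."

class Assistant:
    def __init__(self, abilities: list[Ability]):
        self.abilities = abilities  # innate abilities plus vendor extensions

    def handle(self, request: str) -> str:
        for ability in self.abilities:
            if ability.can_handle(request):
                return ability.run(request)
        return "I don't have an ability for that yet."

assistant = Assistant([CalendarAbility()])
print(assistant.handle("What meetings do I have today?"))
```

An out-of-the-box assistant would ship with a few innate abilities along these lines, and third-party vendors could add more by building toward the same interface.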

Mark Donnigan:

Interesting, interesting.

Reza Rassool:

That is our vision for personal AI.

Mark Donnigan:

Yeah, so what technical challenges are you facing in developing this?

Reza Rassool:

Well, yeah, it's great. So we've got over 500 volunteers, and the technical challenge we've set them onto is bringing AI local. So, the sort of technologies we are researching: I don't know if you're familiar with Mamba. Mamba is a new formulation of a transformer where its complexity grows in a linear fashion rather than in a quadratic fashion. In fact, this is the problem of the regular neural network. If your layer, let's say, has 1,000 nodes in it, a node in one layer is informed by all of the nodes in the previous layer, so you'll have a million connections between two layers. And if you were to grow that, basically the number of connections, and these are the weights of the model, grows by n squared, and so it's no wonder that these things have to run in the cloud.

Mark Donnigan:

On expensive processors, massive, massive hardware.

Reza Rassool:

Exactly, as you try and grow. But Mamba makes that a linear problem, and so you can now grow your number of parameters, you can grow the size of your context window, and you end up with only a linear growth in your processing complexity, rather than an x-squared or n-squared growth.
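To put rough numbers on that quadratic-versus-linear point, here is a small illustrative calculation (not from the episode) comparing a cost term that grows with the square of the context length, as in standard self-attention, against one that grows linearly, as in a Mamba-style model; the absolute numbers are arbitrary and only the growth rates matter.

```python
# Illustrative only: quadratic growth (every token attends to every other
# token) versus linear growth (one pass over the sequence).
def quadratic_cost(context_len: int) -> int:
    return context_len ** 2

def linear_cost(context_len: int) -> int:
    return context_len

for length in (1_000, 10_000, 100_000):
    print(f"L={length:>7,}: quadratic={quadratic_cost(length):>15,}  linear={linear_cost(length):>9,}")

# At a context of 100,000 the quadratic term is 100,000x larger than the linear
# one, which is why long contexts push transformer inference toward big cloud GPUs.
```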

Reza Rassool:

So that's one of the areas that we are researching into. The other thing is, how can AI be streamed? Streaming AI, hey, that sounds familiar. Can we draw any lessons from the way we streamed video? And you know, video is, by its nature, a real-time problem, and so we're specifically looking at real-time AI.

Reza Rassool:

What are the real-time use cases? For the use case of having a conversation with a personal assistant, what part of that problem can be solved offline and what part of it really needs to be real-time? We learned a lot from the streaming media industry.

Reza Rassool:

We realized that, hmm, you actually only need to encode the movie once, and you can do that offline; you're decoding it in real time. We learned that we need to have standards if we're going to have a vibrant industry. If we have multiple people working at different parts of the stack, you're going to have to have standards.

Mark Donnigan:

That's right, different standards.

Reza Rassool:

So we learned about that, and all of that learning we're now applying to AI. Distributed processing, yeah. Can a model be distributed to run on multiple nodes concurrently, and will it run more efficiently, not only in compute, but run greener as well?

Mark Donnigan:

Yeah, that's right.

Reza Rassool:

And so, yes, we have in our lab AI that's distributed amongst multiple nodes. And imagine a sort of torrent of AI, imagine a peer-to-peer mesh where it is a digital public infrastructure. That's the end destination of where we're taking it.

Mark Donnigan:

That's super, yeah, that's super interesting. For several years now, I've sort of landed on this concept that I use to describe where I personally think we need to get to in video infrastructure, and that is where there is a fabric, you know, where there's effectively a video fabric that is accessible very, very easily. It's not through walled gardens, it's not through, you know, this platform and that platform and the other. It's accessible, you know, developers could use the fabric and then, you know, apply that for, you know, this social network or this user-generated site or maybe this premium service, you know. What you're evangelizing, what you're talking about, is that very notion: this needs to be a utility, it needs to be accessible to all. That's the democratization aspect I assume, you know, you're largely referring to. Like, this shouldn't be only for those who have, you know, very special access to this kind of device, or to these very expensive devices, or to, you know, whatever technical knowledge that only a handful of people in the world have. You know, and there's so much we could talk about and, believe me, I've got a lot of questions around, like, you know, just advancements in natural language processing.

Mark Donnigan:

You know, computer vision, how we're going to get to more realistic avatars, what the challenges are in that. How do you make them more, I'm going to say in air quotes, human-like, and that doesn't necessarily mean, you know, the skin and everything, but it makes it so the interaction, right, feels, you know, human. So I have a lot of questions around that. But our time, we're coming up on the end here, so why don't you tell us, you know, what is the future? And I don't want to look five years out, so, you know, let's talk like 18 months out, 24 months out. Where do you see, you know, how AI is advancing, the industry, Kwaai's vision, how does all of this converge, you know, over the next 18, 24 months? You know, what are you excited about, what are you worried about? Give us your thoughts there.

Reza Rassool:

Yeah, look. What's echoing in my mind is that Yuval Harari said AI has cracked the codec of humanity. What does that mean?

Mark Donnigan:

That's interesting. He used the word codec?

Reza Rassool:

Does he use the word codec? Well, maybe I did.

Reza Rassool:

So, language. He says AI has hacked humanity by tapping into language. My view of what language is: language is a codec. It's a codec for telepathy, for getting a thought that's in my head into your head, and we've standardized that language, but we've also got some individualism to it. One of the worries is that these large language models have vacuumed up all of human writing, at least all that's accessible to them, and they homogenize that language, and so the individual tones and meanings and nuances will be lost. The individual voices will be lost. Okay, so that's one broad angst, it's kind of a dull angst. The other is that the notion that cloud-based SaaS AI is the direction for it makes no sense at all. There's literally not enough electrical power to run the data centers to be able to serve a large metropolitan community for their AI needs. AI has got to come down to earth.

Reza Rassool:

It's got to come down from the clouds of the oligopoly, and it's got to run at the edge, on a mesh, on a fabric, which would be digital public infrastructure. Yes, there will be tokens. There will be some sort of compensation for you providing local compute, much like there is for you putting solar panels on your roof and contributing back to the grid.

Reza Rassool:

Contributing back to the grid, exactly. So we've already got models of how this works, and some people have more panels, some people have no panels, and that's going to be a choice that each individual makes. But there is a common format of how you contribute back and how you make that resource available to the entire grid. Okay, so that's our vision for where AI goes if it is going to be truly uplifting for humanity, and if it doesn't go that way, then it's going to be truly horrible. We are going to go down the path that Yuval Harari described, where AI basically takes over. It becomes another weapon in the tool belt of capitalism to further monetize us and, worse than further monetize, to actually control us.

Reza Rassool:

And all those dystopian fears will come to pass. I think there's a window of opportunity to make a change. There's a window of opportunity for an altruistic intervention from a nonprofit, from volunteers, and that window is not indefinite; that window is going to close. And so, look, you are the agents of change. If you volunteer, if you come and participate, either in this organization or a similar organization, you are the agents of change. There's no cavalry coming. No one else is coming. You are the cavalry. And so think about what you're doing. AI has done a number on this industry, on the software industry. There are lots of talented folks out there that are in between jobs. Don't be unemployed, come and do it, go contribute.

Mark Donnigan:

Yeah, that's right.

Reza Rassool:

You will come away better. You'll come away smarter. You will network. You might even get a job out of it. There's a whole bunch of benefits, and you'll feel good about yourself. Okay, Mark, thanks so much for giving me an opportunity.

Mark Donnigan:

That's amazing. That's amazing, Reza. Yeah, well, thank you so much for coming on Voices of Video. I know that this is going to be a very popular episode, so we'll have to have you back and get some updates later in the year.

Reza Rassool:

I'm happy to take questions from the audience afterwards. They can email us.

Mark Donnigan:

Absolutely, yeah, and that's an invitation. Thank you for pointing that out. We always welcome the audience to reach out to our guests, and Reza is, I know, very accessible on LinkedIn. Just in case you missed the website, that is Kwaai, that's K-W-A-A-I, so K-W-A-A-I dot A-I. So yeah, we'll link all this up and all. Well, Reza, have an awesome rest of the day. Thank you for coming on, and it was really great talking with you.

Reza Rassool:

Great, thanks so much. Thanks, Mark. All right, cheers, bye. This episode of Voices of Video is brought to you by NETINT Technologies. If you are looking for cutting-edge video encoding solutions, check out NETINT's products at netint.com.
