System Load Balancing for AI Systems: The Case Of AI Autonomous Cars

 

 

I recall an occasion when my children had decided to cook a meal in our kitchen and went whole hog into the matter (so to speak). I’m not much of a cook and tend to enjoy eating a meal more so than the labor involved in preparing a meal. In this case, it was exciting to see the joy of the kids as they went about putting together a rather amazing dinner. Perhaps partially due to watching the various chef competitions on TV and cable, and due to their own solo cooking efforts, when they joined together it was a miraculous sight to see them bustling about in the kitchen in a relatively professional manner.

I mainly aided by asking questions and serving as a taste tester. From their perspective, I was more of an interloper than someone actually helping to progress the meal making process.

One aspect that caught my attention was the use of our stove top. The stove top has four burner positions. For everyday cooking, I believe that four heating positions are sufficient. I could see that with the extravagant dinner being put together, having only four available was a constraint. Indeed, seemingly quite a difficult constraint.

During the cooking process, there were quite a number of pots and pans containing food that needed to be heated up. I’d wager that at one point there were at least a dozen such pots and pans holding food that required some amount of heating. Towards the start of the cooking, it was somewhat manageable because they were using only three of the available heating spots. Using just three allowed them to allocate the fourth spot as an “extra” for round robin needs. They used this fourth spot for quick warm-ups, while the other three spots were for thorough cooking that required a substantive amount of dedicated cooking time.

Pots and pans were sliding on and off that fourth spot like a hockey puck on ice. The other three spots had large pots that were gradually each coming to a bubbling and high-heat condition. When one of the three pots had cooked well enough, the enterprising cooks took it off the burner almost immediately and placed it onto a countertop waiting area they had established for super-heated pots and pans that could simmer for a bit.

The moment that one pot came off of any of the three spots, another one was instantly put into its place.

Around and around this went, in a dizzying manner as they contended with only having four available heating spots. They kept one spot in reserve and used it for more of a quick paced warm-up and had opted to use the other three for deep heated cooking. As they neared the end of the cooking process for this meal, they began to use nearly all of the spots for the quick paced warm-up needs, apparently because they had by then done the needed cooking already and no longer needed to devote any of the pots to a prolonged period on a heating spot.

As a computer scientist at heart, I was delighted to see them performing a delicate dance of load balancing.

System Load Balancing Is Unheralded But Crucial

You’ve probably had situations involving multiple processors or maybe multiple web sites wherein you had to balance the load across them.

In the case of web sites, it’s not uncommon for popular sites to be replicated at multiple geographic locations around the world, allowing for speedier responses to users in each part of the world. It also can help when one part of the world starts to bombard one of your sites and you need to flatten out the load, or else that particular web site might choke on the volume.

In the cooking situation, the kids realized that having just four burner positions on the stove top was insufficient for the true amount of cooking that needed to take place for the dinner. If they had opted to place pots of food onto the burners serially, in a one-at-a-time manner, some parts of the meal would have been cooked much earlier than other parts. In the end, when trying to serve the meal, the result would have been nightmarish: some food that had been cooked earlier would now be cold, while other parts of the meal would be superhot and would need to wait to be eaten.

If the meal had involved much less preparation, such as if they had only three items to be cooked, they would have readily been able to use the stove top without any of the shenanigans of having to float around the pots and pans. They could have just put on the three pots and then waited until the food was cooked. But since they had more cooking needs than available heating spots, they needed to devise a means to make use of the constrained resources in a manner that would still allow the cooking process to proceed properly.

This is what load balancing is all about.

There are situations wherein there is a limited supply of resources, and the number of requests to utilize those resources might exceed the supply. The load balancer is a means or technique or algorithm or automation that can try to balance out the load.
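To make the idea concrete, here is a minimal sketch in Python of one possible balancing rule: each pot goes onto whichever burner frees up first. The function name `schedule`, the burner count, and the cooking times are all my own illustrative assumptions, and a real load balancer would weigh many more factors.

```python
import heapq

def schedule(pot_minutes, num_burners=4):
    """Greedy balancing: each pot goes to the burner that frees up first."""
    # Heap of (minute_the_burner_becomes_free, burner_id).
    free_at = [(0, burner) for burner in range(num_burners)]
    heapq.heapify(free_at)

    plan = []  # entries are (pot_index, burner_id, start_minute, end_minute)
    for pot, minutes in enumerate(pot_minutes):
        start, burner = heapq.heappop(free_at)
        end = start + minutes
        plan.append((pot, burner, start, end))
        heapq.heappush(free_at, (end, burner))

    finish = max(end for _, _, _, end in plan)
    return plan, finish

# Hypothetical cooking times in minutes for a dozen pots (not from the dinner itself).
pots = [30, 30, 30, 5, 5, 5, 10, 10, 5, 5, 20, 5]
plan, total = schedule(pots)
print("all pots done after", total, "minutes")
print("one pot at a time would take", sum(pots), "minutes")
```

With these made-up times, the three long-cooking pots occupy three burners while the short warm-ups cycle through the fourth, much like what I saw in the kitchen. The dozen pots finish in 50 minutes rather than the 160 minutes that one-at-a-time cooking would need.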

Another valuable aspect of a load balancer is that it can try to even out the workload, which might help in various other ways. Suppose that one of the burners was known to sometimes get a bit cantankerous when it is on high-heat for a long time. One approach for a load balancer might be to try to keep that resource from peaking, and so purposely shift the work to some other resource for a while.

We can also consider the aspect of resiliency.

You might have a situation wherein one of the resources unexpectedly goes bad or otherwise becomes unusable. Suppose that one of the burners broke down during the cooking process. A load balancer would try to ascertain that a resource is no longer functioning, and then see if it might be possible to shift the request or consumption over to another resource instead.
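Here is a minimal sketch of that failover behavior, assuming a simple rule of sending work to the healthy resource with the fewest tasks. The class `FailoverBalancer` and its method names are hypothetical ones I made up for illustration, and in a real system the failure would be detected by some kind of health check rather than reported by hand.

```python
class FailoverBalancer:
    """Tracks which resources are healthy and moves work off failed ones."""

    def __init__(self, resources):
        self.tasks = {name: [] for name in resources}
        self.healthy = set(resources)

    def assign(self, task):
        if not self.healthy:
            raise RuntimeError("no healthy resources left")
        # Pick the healthy resource with the fewest tasks; sorting the
        # names first makes tie-breaking deterministic.
        target = min(sorted(self.healthy), key=lambda r: len(self.tasks[r]))
        self.tasks[target].append(task)
        return target

    def mark_failed(self, resource):
        """Take a resource out of service and reassign whatever it held."""
        self.healthy.discard(resource)
        displaced = self.tasks[resource]
        self.tasks[resource] = []
        return [(task, self.assign(task)) for task in displaced]

balancer = FailoverBalancer(["burner1", "burner2", "burner3", "burner4"])
for pot in ["soup", "sauce", "rice", "beans"]:
    balancer.assign(pot)

# Burner 3 breaks down, so its pot is shifted to a burner that still works.
moved = balancer.mark_failed("burner3")
print(moved)
```

In this sketch the rice that was sitting on the broken burner gets moved over to one of the three burners that still work.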

Load Balancing Difficulties And Challenges

Being a load balancer can be a tricky task.

Suppose the kids had decided that they would keep one of the stove top burners in reserve and not use it unless it was absolutely necessary. In that case, they might have opted to use the three other burners by allocating two for the deep heating and one for the warming up. All during this time, the fourth burner would remain unused, being held in reserve. Is that a good idea?

It depends. I’d bet that the cooking with just the three burners would have stretched out the time required to cook the dinner. I can imagine that someone waiting to eat the dinner might become disturbed if they saw that there was a fourth burner that could be used for cooking, and yet it was not, and the implication being that the hungry person had to wait longer to eat the dinner. This person might go ballistic that a resource sat unused for that entire time. What a waste of a resource, it would seem to that person.

Imagine further that at the start of the cooking process we were to agree that there should be an idle back-up for each of the stove burners being used. In other words, since we only have four, we might say that two of the burners will be active and the other two are the respective back-ups for each of them. Let’s number the burners as 1, 2, 3, and 4. We might decide that burner 1 will be active and its back-up is burner 2, and burner 3 will be active and its back-up is burner 4.

While the cooking is taking place, we won’t place anything onto burners 2 and 4 unless one of the primaries, burner 1 or burner 3, goes out. We might decide to keep the back-up burners entirely turned off, in which case they would be starting from a cold condition if we needed to suddenly switch over to one of them. We might instead agree to put the two back-ups onto a low-heat setting, without actually heating anything per se, so that they are ready to rapidly go to high-heat if they are needed in their back-up failover mode.

I had just now said that burner 2 would be the back-up for primary burner 1. Suppose I adhered to that rule and would not budge. If burner 3 suddenly went out and I switched over to burner 4 as the back-up, but then somehow burner 4 went out too, should I go ahead and use burner 2 at that juncture? If I was insistent that burner 2 would only and always be a back-up exclusively for burner 1, presumably I would want the load balancer to refuse to use burner 2, even though burners 3 and 4 are both kaput. Maybe that’s a good idea, maybe not.
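The two choices can be sketched as a single policy switch. This is only an illustration under my own assumptions: the function `pick_backup`, the `strict` flag, and the `idle_backups` set are hypothetical names, and the relaxed rule of borrowing any idle back-up is just one of many possible rules.

```python
def pick_backup(failed, idle_backups, pairs, strict=True):
    """Return the back-up burner to use for `failed`, or None if none is allowed.

    pairs maps each primary burner to its dedicated back-up.
    idle_backups is the set of back-up burners that still work and sit unused.
    """
    dedicated = pairs.get(failed)
    if dedicated in idle_backups:
        return dedicated
    if strict:
        # Strict policy: a back-up serves only its own primary.
        return None
    # Relaxed policy: borrow any idle back-up that still works.
    spares = sorted(idle_backups)
    return spares[0] if spares else None

pairs = {1: 2, 3: 4}  # burner 2 backs up burner 1, burner 4 backs up burner 3

# Burner 3 has failed and so has its back-up, burner 4. Only burner 2 is idle.
print(pick_backup(3, idle_backups={2}, pairs=pairs, strict=True))   # None
print(pick_backup(3, idle_backups={2}, pairs=pairs, strict=False))  # 2
```

The relaxed policy keeps the cooking going, but it leaves burner 1 with no back-up of its own, which is exactly the trade-off at stake.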

These are the kinds of considerations that go into establishing an appropriate load balancer. You need to decide what the rules are for the load balancer. Different circumstances will dictate different aspects of how you want the load balancer to do its thing. Furthermore, you might not set up the load balancer entirely in advance, such that it acts in a static manner during the load balancing, but instead might have the load balancer figure out what action to take dynamically, in real-time.
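As a rough illustration of static versus dynamic balancing, the sketch below compares a fixed round robin assignment, decided entirely in advance, against a rule that looks at the current load before placing each task. The function names and the task durations are assumptions of mine, chosen only to show how the two can differ.

```python
def static_round_robin(durations, num_resources):
    """Static rule: task i always goes to resource i % num_resources."""
    loads = [0] * num_resources
    for i, duration in enumerate(durations):
        loads[i % num_resources] += duration
    return loads

def dynamic_least_loaded(durations, num_resources):
    """Dynamic rule: each task goes to whichever resource has the least work so far."""
    loads = [0] * num_resources
    for duration in durations:
        target = loads.index(min(loads))
        loads[target] += duration
    return loads

# Hypothetical workload: one long task followed by several short ones.
durations = [30, 5, 5, 5, 5, 5]
static_loads = static_round_robin(durations, 2)
dynamic_loads = dynamic_least_loaded(durations, 2)
print("static :", static_loads)
print("dynamic:", dynamic_loads)
```

For this particular workload, the static rule piles 40 minutes of work onto one resource and 15 onto the other, while the dynamic rule ends up with 30 and 25. The static rule is simpler and needs no knowledge of the current load, which is sometimes the better trade.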

When using load balancing for resiliency or redundancy purposes, there is a standard nomenclature of referring to the number of resources as N, and then appending a plus sign along with an integer value that ranges from 0 to some number M. If I say that my system is set up as N+0, I’m saying that there are zero redundancy devices. If I say it is N+1, then that implies there is 1 and only 1 such redundancy device. And so on.
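A small sketch can tabulate what each N+M choice implies. It assumes, for simplicity, that any redundancy device can stand in for any failed primary and that every device costs the same. The function `redundancy_summary` and the unit cost are hypothetical.

```python
def redundancy_summary(n_required, m_spares, unit_cost):
    """Summarize an N+M setup: N devices needed for the work, M held as spares."""
    total = n_required + m_spares
    return {
        "label": f"N+{m_spares}",
        "total_devices": total,
        # With interchangeable spares, up to M failures still leave N working devices.
        "failures_tolerated": m_spares,
        "total_cost": total * unit_cost,
        "overhead": m_spares / n_required,
    }

# Two active burners with zero, one, or two back-ups, at a made-up cost of 100 each.
for m in range(3):
    print(redundancy_summary(n_required=2, m_spares=m, unit_cost=100.0))
```

The earlier arrangement of two active burners, each with its own idle back-up, would be N+2 with N equal to 2 in this notation, though dedicating each back-up to a single primary is stricter than the interchangeable spares assumed here.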

You might be thinking that I should always have a plentiful set of redundancy devices, since that would seem the safest bet. But there’s a cost associated with the redundancy. Why was my stove top limited to just four burners? Because I wasn’t willing to shell out the bigger bucks to get the model that had eight. I had assumed that for my cooking needs, the four-burner stove was sufficient, and actually ample.

For computer systems, the same kind of consideration needs to come into play.

How many devices do I need, and how much redundancy do I need? Both questions have to be considered in light of the costs involved. This can be a significant decision in that later on it can be harder and even costlier to adjust. In the case of my stove top, the kitchen was built in such a manner that the four-burner stove top fits just right. If I were to now decide that I want the eight-burner version, it’s not just a simple plug-and-play. Instead, they would need to knock out my kitchen counters, and likely some of the flooring, and so on. The choice I made at the start has somewhat locked me in, though of course if I want to have the kids cooking more of the time, it might be worth the dough to expand the kitchen accordingly.

In computing, you can consider load balancing for just about anything. It might be the CPUs that underlie your system. It could be the GPUs. It could be the servers. You can load balance on an actual hardware basis, and you can also do load balancing on a virtualized system. The target resource is often referred to as an endpoint, or perhaps a replica, or a device, or some other such wording.

Those in computing who don’t explicitly consider the matter of load balancing are either unaware of its significance or unsure of what it can achieve.

Many AI software developers figure that it’s really a hardware issue or maybe an operating system issue, and thus they don’t put much of their own attention toward the topic. Instead, they hope or assume that the OS specialists or hardware experts have done whatever is required to figure out any needed load balancing.

Similar to my example about my four burner stove, the problem with this kind of thinking is that if later on the AI application is not running at a suitable performance level and all of a sudden you want to do something about load balancing, the horse is already out of the barn. Just like my notion of possibly replacing the four burner stove with an eight burner, it can take a lot of effort and cost to retrofit for load balancing.

AI Autonomous Cars And Load Balancing The On-Board Systems

What does this have to do with AI self-driving driverless autonomous cars?

At the Cybernetic AI Self-Driving Car Institute, we are developing AI systems for self-driving cars. One key aspect of an AI system for a self-driving car is its ability to perform responsively in real-time.

On board the self-driving car, you have numerous processors that are intended to run the AI software. This can also include various GPUs and other specialized devices.