Wednesday, August 24, 2011

The Lean Approach to Context Switching

Tip of the Month: March 2011

A great insight of lean manufacturing was recognizing the pivotal importance of reducing changeover costs. American manufacturers would run the same parts on their stamping machines for two weeks because it took 24 hours to change over the machine. Along came the Japanese, who reduced the changeover time by 100x, and suddenly short run lengths became cost-effective. With shorter run lengths, batch sizes became smaller, and this improved quality, efficiency, and flow-through time. The great blindness of the American manufacturers was accepting the cost of changeovers as immutable. This condemned them to use large batch sizes.

Today software developers wrestle with a similar problem. Some view the cost of switching context as a form of waste. They think they can eliminate this waste by minimizing the number of times that developers must switch context. This approach inherently treats the cost of context switching the same way American manufacturers treated the cost of changeovers.

Is there a leaner approach? Rather than avoiding context switching, we should ask how we can minimize the cost of switching context. Let’s use a simple technical analogy. When we design a microprocessor-based system, we can choose to service interrupts immediately when they come in, or we can periodically check for waiting interrupts, a technique called polling. If we service interrupts immediately, we must stop the operation in progress, save the register contents to memory, fetch the interrupt data, process it, and then restore the state of the operation we just interrupted. This creates lots of overhead.

What happens when we poll for interrupts? We periodically check a memory location to see if an interrupt is waiting, and process it if one is. The advantage of polling is that we control when we check for interrupts. By checking during the natural breaks between jobs, we massively reduce the cost of context switching. The key point is that we can engineer technical and human systems to lower the cost of context switching. We don’t need to simply accept this cost as a constraint.
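To make the comparison concrete, here is a minimal sketch of the overhead arithmetic. The cost constants, function names, and sample workload are all hypothetical, chosen only for illustration. The time spent actually handling each interrupt is left out because it is the same under both schemes.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical costs, in arbitrary time units (assumptions, not measurements).
SAVE_RESTORE_COST = 5  # save and restore state when a job is preempted mid-run
POLL_COST = 1          # check a "waiting" flag at a natural break between jobs


def switching_overhead(job_lengths, interrupt_times):
    """Return (immediate, polled) context-switching overhead."""
    # Immediate servicing: every interrupt preempts the running job.
    immediate = len(interrupt_times) * SAVE_RESTORE_COST
    # Polling: one cheap check at the end of each job, no mid-job state to save.
    polled = len(job_lengths) * POLL_COST
    return immediate, polled


def polling_delay(job_lengths, interrupt_times):
    """Total time interrupts wait for the next job boundary under polling."""
    boundaries = list(accumulate(job_lengths))  # times at which each job ends
    total = 0
    for t in interrupt_times:
        i = bisect_left(boundaries, t)  # first boundary at or after time t
        if i < len(boundaries):         # ignore arrivals after the last job
            total += boundaries[i] - t
    return total


jobs = [10, 10, 10]           # three jobs of 10 units each
arrivals = [3, 4, 12, 25, 27]  # interrupt arrival times

immediate, polled = switching_overhead(jobs, arrivals)
waiting = polling_delay(jobs, arrivals)
print(immediate, polled, waiting)
```

With these assumed numbers, polling cuts the switching overhead from 25 units to 3, at the price of 29 units of total waiting spread across the five interrupts. Shorter jobs create more natural breaks and reduce that wait, which mirrors the lean argument for smaller batches.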

But why would we want to switch context more frequently? It isn't always desirable to have long uninterrupted efforts on a primary activity. There are cases where parallel secondary activities can improve the quality or efficiency of the primary activity. For example, most engineering degree programs force students to switch context between different subjects. We could teach a year of pure math before we tackle physics, but students would have a much harder time seeing the connections between these two subjects. Similarly, as an author, I never write my books by completing single chapters in isolation. By periodically shifting between chapters I can achieve much better integration between them. Authors who work one chapter at a time often lapse into repetitiveness. By the time they are writing Chapter 10, Chapter 1 is a distant memory.

So, if you find yourself instinctively trying to reduce context switching, you should ask yourself two questions. First, have you done everything you can to lower the cost of switching context? Second, are you capturing any benefits by switching contexts? If the benefits of context switching exceed its cost, then don't try to eliminate it.

Don Reinertsen
