Using the Toyota Production System to explain DevOps

The current project I’m on is finding a path to adopting DevOps in an enterprise environment. The weeks of preparation and the pitches to directors and executive management are in the past, and I’ve received the necessary approval to proceed without any limitations (though also without any additional authority to actually enforce change). I still need to persuade each product group to see the light and change on their own, while guiding them down a yellow brick road.

News had spread a little that we got the green light, and some individual contributors in the company were interested in what DevOps was really all about. There was as much curiosity about it as there was fear of job loss from the change, even though no headcount changes were in the pitch. The only changes we suggested were lateral moves in the existing reporting structure.

In trying to explain what DevOps was similar to, I tried to get an idea of people’s current understanding. All around, the concept of DevOps was pretty much a miss for everyone I talked to, except for a few newer folks and the folks on the architecture team who were around for the last attempt. Weirdly, upper management had a better grasp of it than the individual contributors, for which I applaud them, given that DevOps has historically been a grassroots movement where awareness and education spread upward.

So what exactly is DevOps?

Ask anyone with DevOps experience how they would define it, and the answer will be completely different from person to person. That becomes evident when you compare those definitions with what you find on Wikipedia: they vary, sometimes greatly. I wouldn’t expect a verbatim recital of Wikipedia’s entry in anyone’s response. What they give makes perfect sense – it will match exactly what they’ve implemented.

For some companies, it’s just a job title change for system administrators (which it is not). For other companies, it’s just the usage of automation tooling (which it is not). And then for others, it’s just putting the development and operations teams together (again, which it is not). Although I wouldn’t write these off as things that don’t happen, I would just write them off as things a group or company might solely do for the DevOps label. It’s much deeper than that.

Since there isn’t a manifesto or set of principles authored anywhere (which was on purpose), it’s hard to define all of the concepts of DevOps beyond the oft-repeated “continuous deployment” and “continuous integration” phrases used in sales pitches. If I had to describe it in a single sentence, I would probably sum it up simply as a set of “loosely coupled ideas at scale”. Going a bit further, I understand the DevOps movement as an application of the principles of the Toyota Production System to the information technology sector.

What about a Toyota?

The Toyota Production System is a manufacturing philosophy developed at Toyota between 1948 and 1975, principally by Taiichi Ohno. It is among the most widely recognized sets of principles in any industry, and one that most people would recognize even if they have never worked in any sort of manufacturing.

The system defines these core principles:

  • Long-term philosophy
    • Base your management decisions on a long-term philosophy, even at the expense of short-term financial goals.
  • The right process will produce the right results
    • Create continuous process flow to bring problems to the surface.
    • Use the “pull” system to avoid overproduction.
    • Level out the workload. (Work like the tortoise, not the hare.)
    • Build a culture of stopping to fix problems, to get quality right from the first.
    • Standardized tasks are the foundation for continuous improvement and employee empowerment.
    • Use visual control so no problems are hidden.
    • Use only reliable, thoroughly tested technology that serves your people and processes.
  • Add value to the organization by developing your people and partners
    • Grow leaders who thoroughly understand the work, live the philosophy, and teach it to others.
    • Develop exceptional people and teams who follow your company’s philosophy.
    • Respect your extended network of partners and suppliers by challenging them and helping them improve.
  • Continuously solving root problems drives organizational learning
    • Go and see for yourself to thoroughly understand the situation.
    • Make decisions slowly by consensus, thoroughly considering all options; implement decisions rapidly.
    • Become a learning organization through relentless reflection and continuous improvement.

Obviously, the Toyota Production System methodology was developed to tackle an entirely different sector than DevOps, but the two aren’t all that different. I like using the Toyota example because people know the brand and the quality improvements it is continually seeking in its products. Toyota is pretty active in advertising this (albeit not in television or print advertisements) through its use of ‘Kaizen’ in both printed and online product literature. Impressively, the Toyota system has been written in such a way that it applies to many more sectors than manufacturing.

How exactly do these apply to Information Technology?

The objectives of the Toyota system are to remove any sort of overburden (muri), inconsistency (mura), and waste (muda). In fact, the system identifies seven types of muda alone. Here is how each of the seven could apply to information technology, along with some of the solutions that follow from them:

  1. Waste of overproduction
    • Low-value features, applications, and services that customers don’t want or won’t use, or that will never ship
    • An abundance of resources that aren’t being used, such as virtual machine instances left running after testing is finished or after the load has subsided. Shut down or terminate them when they are no longer needed.
  2. Waste of time on hand (waiting)
    • Slow performance in features, applications, and services
    • Manual processes that could be automated
    • Performing processes sequentially instead of in parallel
  3. Waste of transportation
    • Physical datacenter visits to resolve issues instead of using remote management tools such as DRAC (Dell), iLO (HP), or third-party solutions.
    • Physical processes for audits
    • Unnecessary movement of data, such as taking only full database backups instead of combining periodic full backups with transaction log backups to lower bandwidth and data ingress/egress costs.
  4. Waste of processing itself
    • Having the system process something that clients can process themselves (sorting tables in the browser, for instance)
  5. Waste of stock at hand (inventory)
    • Server sprawl, which increases power and (possibly) operating system licensing costs. This is why cloud providers are so popular – you don’t have the overhead of managing the infrastructure, and you can burst/scale when you need to instead of becoming a sitting duck if your service suddenly gets popular.
  6. Waste of movement
    • Firefighting issues as they come up instead of investing in self-healing concepts or taking the time to find a better solution. Productivity is lost by ‘moving’ between mindsets.
  7. Waste of making defective products (defects)
    • Out-of-band bug fixes and changes. These usually lead to configuration management issues because the changes are never fed back into the lifecycle. Think of making the datacenter services and their configurations immutable once deployed – push changes up from lower environments.
    • Failure to properly design the system, inability to foresee issues, or being blind to how users would actually use the system.
    • Not instrumenting your services correctly or fully, leaving valuable feedback inaccessible.
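
To put rough numbers on the backup example under “waste of transportation” above, here is a minimal Python sketch. The function name `weekly_transfer_gb` and the sizes used (a 500 GB database generating about 5 GB of transaction log per day) are made-up assumptions for illustration only, not measurements from any real system.

```python
def weekly_transfer_gb(full_size_gb, daily_log_gb, fulls_per_week):
    """Return the GB moved per week for a simple backup schedule.

    Hypothetical model: on `fulls_per_week` days a full backup is shipped;
    on the remaining days only that day's transaction log is shipped.
    """
    if not 1 <= fulls_per_week <= 7:
        raise ValueError("fulls_per_week must be between 1 and 7")
    log_days = 7 - fulls_per_week
    return fulls_per_week * full_size_gb + log_days * daily_log_gb


if __name__ == "__main__":
    # Assumed, illustrative sizes: 500 GB database, 5 GB of log per day.
    every_day_full = weekly_transfer_gb(500, 5, fulls_per_week=7)
    full_plus_logs = weekly_transfer_gb(500, 5, fulls_per_week=1)
    print(f"Full backup every day: {every_day_full} GB/week")   # 3500 GB/week
    print(f"Weekly full + logs:    {full_plus_logs} GB/week")   # 530 GB/week
```

Under these assumed sizes, shipping a full backup every day moves 3500 GB a week, while one full backup plus six days of logs moves 530 GB. The real ratio depends entirely on how fast your data changes.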

In essence, you are never done improving. You should always be evaluating areas for improvement and change as you deal with issues that arise. There is much more behind the DevOps label than Continuous Integration and Continuous Deployment, and the change isn’t always technical; sometimes it’s a cultural one as well.