DevOps Vision Blog


About the Blog

Various posts about software delivery in the 21st century. Managed by Anders Lundsgard, @anderslundsgard. Posts are my own but inspired by many.

Autonomous teams Supporting Microservices 24/7

DevOps | Posted by Anders Lundsgård, Tue, April 24, 2018 06:47PM

For agile software teams, taking full responsibility not only for Development and Test but also for Operations is a challenge. On top of that, supporting their services 24/7 may seem like more than one can demand.

This post hopefully gives you some useful thoughts to consider when deciding how to effectively deal with support of your software applications/services 24/7.

First, many have seen the value of clear ownership of the code that builds your application. By code I mean application code, tests, database scripts and configurations, but also the delivery pipeline definition and infrastructure code. It is important that each application or service has one (and only one) owning team.

At the very start, the team that chooses to create a repository (e.g. a Git repo in GitLab) and store some code in it is the owner of the lifecycle of that code. From the first day the code hits production, the team that owns it must also take care of its 24/7 operational support. If the team does not, it is not (at least by my definition) autonomous. A team is usually between 3 and 8 people.

Consider the aspects below before deciding whether to involve an external team/partner to support the operation of your code in production:

- Which team do you think handles support better than your own team? Your team knows its code best. Agree?

- Do you believe you can write solution templates for an external person to follow when things start to fail at night? And if you do write “disaster instructions”, will the external person who follows them do more good than harm? For all types of things that can go wrong?

- What about offloading the heavy lifting of, for example, a database server to a cloud provider? Using, for example, the AWS RDS service, your team doesn’t need to patch servers or care about high availability. The cloud provider manages that for you. Of course, you need to invest in understanding what the cloud provider does and does not do for you. With Serverless services your team probably does not need any capacity planning at all.

- Zero Downtime deployments: When your services can be updated without end-user impact, your team members will do it during office hours. The QA mindset will increase significantly when the engineers who write the code also push the change to production the same day. And when the change is pushed, they are awake and ready to deal with good or bad end-user impact.

- If your services are very business critical, consider introducing a pager system, that is, an on-call schedule for all team members. If the system goes down at night, the engineer(s) who have the pager will be notified. But for that to work you really need reliable monitoring of your apps. False alarms during the night are not very popular. On-call schedules like this are unusual in Europe but more common in the US.
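
To make the pager idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a recommendation of a specific tool: the function `on_call_engineer`, the class `AlertFilter`, the weekly shift length and the threshold of three failed checks are all hypothetical names and values chosen for the example. It shows a simple round-robin on-call schedule and one way to reduce false alarms, by paging only after several consecutive failed health checks.

```python
from dataclasses import dataclass
from datetime import date


def on_call_engineer(team: list[str], day: date, rotation_start: date,
                     shift_days: int = 7) -> str:
    """Round-robin rotation: each engineer holds the pager for `shift_days` days.

    All names and the weekly default are assumptions made for this sketch.
    """
    shifts_elapsed = (day - rotation_start).days // shift_days
    return team[shifts_elapsed % len(team)]


@dataclass
class AlertFilter:
    """Pages only after `threshold` consecutive failed health checks.

    A single failed check is treated as noise, not as an outage.
    """
    threshold: int = 3
    consecutive_failures: int = 0
    paged: bool = False

    def record(self, healthy: bool) -> bool:
        """Record one health-check result; return True if a page should be sent now."""
        if healthy:
            # Service recovered: reset so that a later outage pages again.
            self.consecutive_failures = 0
            self.paged = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold and not self.paged:
            # Page once per outage, not once per failed check.
            self.paged = True
            return True
        return False
```

With a team of three and weekly shifts, each engineer carries the pager every third week. The filter trades a slightly later notification for fewer wake-ups caused by a single flaky check; which threshold is reasonable depends on how often you run your checks and how critical the service is.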

Finally. If you seriously consider external support, are you doing that to increase customer satisfaction? Or just to have someone else to blame when your services are not available?






2016 State of DevOps Report

DevOps | Posted by Anders Lundsgård, Mon, June 27, 2016 10:36PM
"We hope the findings, analysis and guidance in this report help you better understand the potential impact of DevOps on your organization."

As always (for the 5th year in a row), it is exciting when the State of DevOps Report from Puppet comes out. The quote above is the closing sentence of the report, and I can only say that the guys behind it have accomplished yet another success. I hope people from my company will read this report and adopt its practices and findings.

Below are my favorite quotes and key takeaways from the report:

#1
"Our analysis shows that high performers are deploying 200 times more frequently than low performers, with 2,555 times faster lead times. They also have the fastest recovery time and the lowest change failure rate."

#2
"The integration of security objectives is just as important as the integration of other business objectives, and security must be integrated into the daily work of delivery teams."

#3
"Always try to minimize the amount of test data you require in order to run your automated tests, and avoid large database dumps wherever possible."

#4
"The idea that developers should work in small batches off master or trunk rather than on long-lived feature branches is still one of the most controversial ideas in the Agile canon, despite the fact that it is the norm in high-performing organizations such as Google."

#5
"Teams that don't have code freeze periods also achieve higher performance."

#6
"Agile has more or less won the methodology wars, but in larger companies it's still common to see months spent on budgeting, analysis and requirements-gathering before work starts."

#7
"Leaders can change culture. In today's fast-moving and competitive world, the best thing you can do for your products, your company and your people is institute a culture of experimentation and learning, and invest in the technical and management capabilities that enable it."

#8
"DevOps is no longer a mere fad or buzzword, but an understood set of practices and cultural patterns."

- Yes, I took the survey for the report and will do so again next year. Thanks to Puppet, Inc!








Deadlines kill your Agility

NoEstimates | Posted by Anders Lundsgård, Thu, June 16, 2016 12:39PM

Just a post to make you think about the value of your estimates to deliver software "on time".

It is the year of 2016, and large organizations still hire a bunch of agile coaches to make teams of developers more agile. I think they are mistaken if they think they can improve IT performance merely by getting everyone in the engineering department to adopt Scrum.

Before you continue reading: if you don’t believe that agility is good for your business, please do something better than continuing to read this post.

Agile thinking must extend beyond the boundaries of the Engineering Department for it to work at all. We need DevOps and Continuous Delivery for a fast and reliable path to production. Continuous Integration with automated tests in the pipeline, trunk-based development and so on are today’s mainstream, even in large companies. Microservice architectures and Cloud environments make delivery and operation of software a no-brainer. But all these practices only address the IT delivery cycle. They help to build the thing right, not the right thing.

Building the right thing is an iterative process that requires full co-operation of the product management and other business stakeholders. However, when teams attempt to iterate at this level, they encounter friction on account of the organization’s structure, operational practices, politics, and culture. People in an organization act rationally in a way that maximizes their own success. Putting the emphasis on departmental output maximization, rather than optimizing the overall flow for the customer, means that the natural interests of the departmental manager come into conflict with the long-term goals of the company as a whole. In short, siloed organizations sub-optimize for their own success.

One of the things that people new to agile thinking have a hard time getting their heads around is that time (and estimates, and deadlines) is simply not a factor in an agile shop. They say that deadlines are a necessary part of the "real world." Yes, there are real-world issues that surround time. It's hard to move the date of the "Y2K bug" or the opening ceremony of the Olympics, but time is not a central part of the agile planning process. When you deal with complex systems (as we do in software delivery), my experience is that you should avoid deadlines wherever possible.

In a waterfall world, a deadline provides the illusion of security. A traditional deadline is a way to measure progress in order to manage a budget. The reasoning is that the product perhaps won't be totally finished for a year, but at least we know that they (the developer teams) will be "Done" within a year.


Agile processes promote sustainable development. The business, developers, and users should be able to maintain a constant pace indefinitely. A constant pace means that a deadline-based management process simply can't work. If a team is punished for not meeting its deadlines (and overtime is a kind of punishment), then it will pad the estimates to avoid the pain. Also, in a code-monkey culture, the developers take shortcuts to meet an expected short-term business result. Omitting a reliable pipeline and test coverage increases technical debt. With that, the quality and the ability to keep up the pace after the deadlines will decrease. And everyone asks why?!

In my experience deadlines are based on long-term estimates, and estimates of that sort almost never work. In reality everyone (including you) knows that estimates always fail. But for some strange reason we keep behaving as if we don’t know that. Perhaps the simple reason is that we are too lazy to find better alternatives to master our software delivery. The hashtag #NoEstimates was born just because of that. It is a discussion on how to find alternatives to estimates in software delivery.

A deadline-focused manager typically starts with the assumption that certain features are essential, and if you can't build these features by the deadline, then you've failed. When deadlines serve as a substitute for close collaboration between developers and the business, they do not serve a good purpose. An agile company has realized that "essential features" will change as the product develops, so any estimate that was made in the past will be incorrect because the set of things on which you based the estimate will change. You always find flaws in the specification and encounter new implementation problems. Sorry, but that is probably true for all your software projects for the rest of your lifetime.


Questions

All projects are different. In your environment, give some thought to these questions:

1. Do you estimate the time team members should be out of office for various personal reasons?
2. Do you estimate the time to write tests and manage the CI/CD environment to run these tests and deploys?
3. Do you estimate the time your team will explore new territory to find better ways of solving your challenges?
4. Do you estimate the time your team will manage pull-requests?
5. Do you welcome new requirements between the time of estimate and the deadline?

The "solution"

When you manage to have a constant pace with short iterations, you can guarantee that you can deliver a working product every single day. The software is guaranteed to work when that trade-show arrives because it always works. The goal changes from "building a specific set of features by the 15th of September" to "building the best possible product in the time we have".

Continuous prioritizing based on feedback is important in this non-waterfall world. To achieve that we need day-to-day collaboration (ideally also co-location) between development and business people. These days Outcome-oriented or Autonomous Teams are popular expressions for that. The responsibility for the business outcome must belong to the developer teams building the product. As long as the company culture prevents that from happening, the agile journey will be close to infinite.

Summary

A company with a culture of top-down deadlines will die. The date of the funeral depends on how your competitors are doing in their "agile movement". Deadlines, especially those pushed from above, undermine teams with a constant push towards an illusory goal, and any attempt to work in an agile way will fail as a consequence. Such companies will eventually also build products that nobody wants. Agile thinking must extend way beyond the boundaries of the Engineering Department for it to work at all.

And finally, I am not saying do Estimates or do No Estimates (or deadlines). But probably all companies need to improve their current processes. Experiment with the alternatives that fit your situation best. And of course: one size does not fit all.


References

#NoEstimates talk by Allen Holub
https://youtu.be/QVBlnCTu9Ms

There's No Room for Deadlines, June 30, 2014,
http://www.drdobbs.com/architecture-and-design/theres-no-room-for-deadlines/240168577

[Podcast] Designing Outcome-Oriented Teams: Part 1,
https://www.thoughtworks.com/insights/blog/podcast-designing-outcome-oriented-teams-part-1

NoEstimates book
By Vasco Duarte, http://noestimatesbook.com

Lean Enterprise book
By Jez Humble and Barry O’Reilly, https://barryoreilly.com/lean-enterprise/

Toyota Kata book
By Mike Rother, http://www.lean.org/Bookstore/ProductDetails.cfm?SelectedProductId=324










Brooks’s Law

Laws of software | Posted by Anders Lundsgård, Tue, June 14, 2016 08:41AM
Brooks's Law on Wikipedia

"Adding people to a delayed project will make the project more delayed."



My very first blog post

DevOps | Posted by Anders Lundsgård, Sun, June 12, 2016 10:14PM
Finally. I could not wait any longer than today to create my very first blog post on my brand new DevOps blog.
