DevOps
Posted by Anders Lundsgård Tue, April 24, 2018 06:47PM
For agile software teams, moving to fully take responsibility not only for Development and Test, but also for Operations is a challenge. On top of that, supporting their services 24/7 may seem more than one can demand.
This post hopefully gives you some useful thoughts to consider when deciding how to effectively deal with support of your software applications/services 24/7.
First. Many have seen the value of clear ownership of the code that builds your application. By code I mean application code, tests, database scripts and configurations, but also the delivery pipeline definition and infrastructure code. It is important that each application or service has one (and only one) owning team.
At the very start, the team that chooses to create a repository (e.g. a Git repo in GitLab) and store some code in it is the owner of the lifecycle of that code. From the first day the code hits production, the team that owns it must also take care of its 24/7 operational support. If the team does not, it is not (at least by my definition) autonomous. A team usually has between 3 and 8 people.
Consider the aspects below before deciding whether to involve an external team or partner in supporting the operation of your code in production:
- Which team do you think would handle support better than your own? Your team knows its code best. Agree?
- Do you believe you can write solution templates for an external person to follow when things start to fail at night? And if you do write “disaster instructions”, will the external person who follows them do more good than harm? For all types of things that can go wrong?
- What about offloading the heavy lifting of, for example, a database server to a cloud provider? Using for example the AWS RDS service, your team doesn’t need to patch servers or care about high availability. The cloud provider manages that for you. Of course, you need to invest in understanding what the cloud provider does and does not do for you. With serverless services your team probably does not need to do any capacity planning at all.
- Zero-downtime deployments: when your services can be updated without end-user impact, your team members will do it during office hours. The QA mindset increases significantly when the engineers who write the code also push the change to production the same day. And when the change is pushed, they are awake and ready to deal with good or bad end-user impact.
- If your services are very business critical, consider introducing a pager system, that is, an on-call schedule for all team members. If the system goes down at night, the engineer(s) who have the pager will be notified. But for that to work you really need reliable monitoring of your apps. False alarms at night are not very popular. This is unusual in Europe but more common in the US.
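To make the on-call idea a bit more concrete, here is a minimal sketch in Python of a weekly pager rotation. The team member names, the start date and the one-week shift length are assumptions made up for illustration; a real pager product will have its own way of defining schedules.

```python
from datetime import date, timedelta
from itertools import cycle


def build_rotation(engineers, start, weeks):
    """Assign one engineer per week, round-robin, beginning on `start`."""
    if not engineers:
        raise ValueError("at least one engineer is needed")
    names = cycle(engineers)
    schedule = []
    for week in range(weeks):
        shift_start = start + timedelta(weeks=week)
        shift_end = shift_start + timedelta(days=6)  # inclusive last day
        schedule.append((shift_start, shift_end, next(names)))
    return schedule


def on_call(schedule, day):
    """Return who holds the pager on `day`, or None if not covered."""
    for shift_start, shift_end, engineer in schedule:
        if shift_start <= day <= shift_end:
            return engineer
    return None


# Hypothetical team of three, rotating weekly from a Monday.
team = ["alice", "bob", "carol"]
rotation = build_rotation(team, date(2018, 4, 23), weeks=6)

for shift_start, shift_end, engineer in rotation:
    print(f"{shift_start} to {shift_end}: {engineer}")
```

With a team of 3 to 8 people, a plain round-robin like this means each person carries the pager every third to eighth week, which is worth knowing before you decide whether on-call is sustainable for your team.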
Finally. If you seriously consider external support, are you doing that to increase customer satisfaction? Or just to have someone else to blame when your services are not available?
DevOps
Posted by Anders Lundsgård Mon, June 27, 2016 10:36PM
"We hope the findings, analysis and guidance in this report help you better understand the potential impact of DevOps on your organization."
As always (this is the 5th year in a row), it is exciting when the State of DevOps Report from Puppet comes out. The quote above is the last words of the report, and I can only say that the people behind it have accomplished yet another success. I hope people from my company will read the report and adopt its practices and findings.
Below are my favorite quotes and key takeaways from the report:
"Our analysis shows that high performers are deploying 200 times more frequently than low performers, with 2,555 times faster lead times. They also have the fastest recovery time and the lowest change failure rate."
"The integration of security objectives is just as important as the integration of other business objectives, and security must be integrated into the daily work of delivery teams."
"Always try to minimize the amount of test data you require in order to run your automated tests, and avoid large database dumps wherever possible."
"The idea that developers should work in small batches off master or trunk rather than on long-lived feature branches is still one of the most controversial ideas in the Agile canon, despite the fact that it is the norm in high-performing organizations such as Google."
"Teams that don't have code freeze periods also achieve higher performance."
"Agile has more or less won the methodology wars, but in larger companies it's still common to see months spent on budgeting, analysis and requirements-gathering before work starts."
"Leaders can change culture. In today's fast-moving and competitive world, the best thing you can do for your products, your company and your people is institute a culture of experimentation and learning, and invest in the technical and management capabilities that enable it."
"DevOps is no longer a mere fad or buzzword, but an understood set of practices and cultural patterns."
- Yes, I took the survey for the report and will do so again next year. Thanks to Puppet, Inc!
NoEstimates
Posted by Anders Lundsgård Thu, June 16, 2016 12:39PM
Just a post to make you think about the value of your estimates to deliver software "on time".
It is 2016, and large organizations still hire a
bunch of agile coaches to make teams of developers more agile. I think they are mistaken if they think they can improve
IT performance merely by getting everyone in the engineering department to
Before you continue reading: if you don’t believe that agility is good for your business, please do something better than continuing to read this post.
Agile thinking must extend beyond the boundaries of the Engineering Department for it to work at all. We need DevOps and Continuous Delivery for a fast and reliable path to production. Continuous Integration with automated tests in the pipeline, trunk-based development and so on are mainstream today, even in large companies. Microservice architectures and Cloud environments make delivery and operation of software a no-brainer. But all these practices only address the IT delivery cycle. They help to build the thing right, not the right thing.
Building the right thing is an iterative process that
requires the full cooperation of product management and other business
stakeholders. However, when teams attempt to iterate at this level, they
encounter friction on account of the organization’s structure, operational practices,
politics, and culture. People in an
organization act rationally in a way that maximizes their own success. Putting
the emphasis on departmental output maximization, rather than optimizing the
overall flow for the customer, means that the natural interests of the
departmental manager come into conflict with the long-term goals of the company
as a whole. In short, siloed organizations sub-optimize
for their own success.
One of the things that people new to agile thinking have a hard time getting their heads around is that time (and estimates, and deadlines) is simply not a factor in an agile shop. They say that deadlines are a necessary part of the "real world." Yes, there are real-world issues that surround time. It's hard to move the date of the "Y2K bug" or the opening ceremony of the Olympics, but time is not a central part of the agile planning process. When you deal with complex systems (as we do in software delivery), my experience is that you should avoid deadlines wherever possible.
In a waterfall world, a deadline provides the illusion of security. A traditional deadline is a way to measure progress in order to manage a budget. The reasoning is that the product perhaps won't be totally finished for a year, but at least we know that they (the developer teams) will be "Done" in a year.
Agile processes promote sustainable development. The business, developers, and users should be able to maintain a constant pace indefinitely. Constant pace means that a deadline-based management process simply can't work. If a team is punished for not meeting its deadlines (and overtime is a kind of punishment), then they'll pad the estimates to avoid the pain. Also, in a code-monkey culture, the developers take shortcuts to meet an expected short-term business result. Omitting a reliable pipeline and test coverage increases technical debt. With that, quality and the ability to keep up the pace after the deadline will decrease. And everyone asks why?!
In my experience deadlines are based on long-term estimates, and estimates of that sort almost never work. In reality everyone (including you) knows that estimates always fail. But for some strange reason we keep behaving as if we don’t know that. Perhaps the simple reason is that we are too lazy to find better alternatives for mastering our software delivery. The hashtag #NoEstimates was born just because of that. It is a discussion on how to find alternatives to Estimates in
manager typically starts with the assumption that certain features are essential, and if you can't build these features by the deadline, then you've failed. When deadlines serve as a substitute for close collaboration between developers and the business, they do not serve a good purpose. An agile company has realized that "essential features" will change as the product develops, so any estimate that was made in the past will be incorrect because the set of things on which you based the estimate will change. You always find flaws in the specification and encounter new implementation problems. Sorry, but that is probably true for all your software projects for the rest of your lifetime.
All projects are different. For your own environment, give some thought to these questions:
1. Do you estimate the time team members will be out of the office for various personal reasons?
2. Do you estimate the time to write tests and to manage the CI/CD environment that runs these tests and deployments?
3. Do you estimate the time your team will spend exploring new territory to find better ways of solving your challenges?
4. Do you estimate the time your team will spend handling pull requests?
5. Do you welcome new requirements between the time of the estimate and the deadline?
When you manage to have a constant pace with short
iterations, you can guarantee that you can deliver a working product every single
day. The software is guaranteed to work when that trade-show arrives because it always works. The goal
changes from "building a specific set of features by the 15th
of September" to "building the best possible product in the time we
Continuous prioritizing based on feedback is important in this non-waterfall world. To achieve that we need day-to-day collaboration (ideally also co-location) between development and business people. These days Outcome-oriented or Autonomous Teams are popular expressions for that. The responsibility for the business outcome must belong to the developer teams building the product. As long as the company culture prevents that from happening, the agile journey will be close to
A company with a culture of top-down deadlines will die. The date of the funeral depends on how your competitors are doing in their "agile movement". Deadlines, especially those pushed from above, undermine teams with a constant push towards an illusory goal, and any attempt to work in an agile way will fail as a consequence. Such teams will eventually also build products that nobody wants. Agile thinking must extend way beyond the boundaries of the Engineering Department for it to work at all.
And finally. I am not saying do Estimates or do No Estimates (or deadlines). But probably all companies need to improve their current processes. Experiment with the alternatives that fit your situation best. And of course: one size does not fit all.
References
#NoEstimates talk by Allen Holub
There's No Room for Deadlines, June 30, 2014,
[Podcast] Designing Outcome-Oriented Teams: Part 1,
By Vasco Duarte, http://noestimatesbook.com
Lean Enterprise book
By Jez Humble and Barry O’Reilly, https://barryoreilly.com/lean-enterprise/
Toyota Kata book
By Mike Rother, http://www.lean.org/Bookstore/ProductDetails.cfm?SelectedProductId=324