25 Release Management KPIs You Should Be Tracking
by Scott Reece, on Aug 19, 2019 2:20:00 PM
Release management is an integral part of DevOps. But it’s not always obvious how to make objective, empirical measurements of efficiency or effectiveness. It’s easy to get lost in the myriad of other responsibilities and metrics and forget to measure the overall performance.
Release management can carry on without even a basic understanding of performance. However, without understanding performance it is impossible to optimize, and every decision ends up based on anecdotal evidence or gut feelings.
We’ve put together a list of 25 core metrics companies can use to get a baseline understanding of release management performance.
1. Release Downtime
Any time users can’t access the software they need counts as downtime. Downtime is most often associated with hardware failures, but it is part of release management as well. If releasing new software affects end users, that impact must be taken into account and planned for during any release.
Downtime during a release is often measured in total minutes that a system is unavailable because of changes being released.
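As a rough sketch of that measurement, the snippet below adds up the minutes a system was unavailable during one release. The function name and the outage windows are hypothetical; in practice the start and end times would come from your monitoring or deployment logs.

```python
from datetime import datetime


def release_downtime_minutes(outage_windows):
    """Sum the minutes a system was unavailable across (start, end) windows."""
    total_seconds = sum(
        (end - start).total_seconds() for start, end in outage_windows
    )
    return total_seconds / 60


# Hypothetical outage windows recorded during a single release.
windows = [
    (datetime(2019, 8, 19, 2, 0), datetime(2019, 8, 19, 2, 12)),   # 12 minutes
    (datetime(2019, 8, 19, 3, 30), datetime(2019, 8, 19, 3, 35)),  # 5 minutes
]

print(release_downtime_minutes(windows))  # 17.0
```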
2. Estimated Release Downtime
Downtime can be unavoidable during a release. Software gets updated, databases get rebuilt, and servers get re-configured. The downtime needed for a release might be estimated in seconds, minutes, or even hours.
Estimating how much downtime a release cycle will cause should be part of planning, so that release timing can be taken into account and so that the engineering and release teams have realistic expectations of what is ‘actually’ happening during releases.
If a team estimates that downtime will be less than 5 seconds while a server updates, but in reality the update takes several minutes, there is a disconnect somewhere in the process that needs to be addressed.
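One way to spot that kind of disconnect is to compare each estimate with what actually happened. The sketch below is illustrative only: the function names are made up, and the tolerance of 2x is an arbitrary assumption that each team would set for itself.

```python
def downtime_overrun_ratio(estimated_seconds, actual_seconds):
    """Return actual downtime as a multiple of the estimate."""
    if estimated_seconds <= 0:
        raise ValueError("estimated_seconds must be positive")
    return actual_seconds / estimated_seconds


def needs_review(estimated_seconds, actual_seconds, tolerance=2.0):
    """Flag a release whose downtime exceeded the estimate by more than
    `tolerance` times (the default of 2.0 is an assumed threshold)."""
    return downtime_overrun_ratio(estimated_seconds, actual_seconds) > tolerance


# Estimated 5 seconds, actually took 3 minutes (180 seconds).
print(downtime_overrun_ratio(5, 180))  # 36.0
print(needs_review(5, 180))            # True
```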
3. Actual Release Downtime
Measuring actual release downtime is important for future planning and scheduling of releases.
If a release takes a system offline for hours, it is important to find a way to mitigate the downtime, either by releasing off-peak or by rolling out the release in a way that doesn’t drastically affect users. Depending on the release and its timing, it may be necessary to notify users in advance so they can plan for the outage.
There may always be some downtime in a release, but the trend should be toward minimizing the time needed.
4. Number of Outages Caused by a Release
Outages can happen during a release because of undocumented changes, human error, or unexpected interactions between IT segments. These outages can be disastrous for an organization.
If outages continually occur from planned changes, it is a good indication that there are undocumented elements that need to be addressed, or that the underlying systems are too fragile and need to be shored up.
While organizations will have recovery plans in place, needing them consistently after releasing new versions of software indicates a failure of process.
5. Number of Incidents Caused by a Release
Releases change a system, and inevitably some changes will cause incidents that must be resolved for the organization to move forward. Depending on their severity, incidents can directly add to downtime.
There are many ways of categorizing the incidents that releases cause (emergency, major, standard, minor, etc.), and the terms will vary by organization.
Tracking the number of incidents caused by a release speaks directly to the business impact of the software delivered. However they are labeled, tracking and reducing the number of incidents for any release helps identify gaps in the release process once you find out how those incidents occurred.
6. Percent (%) of All Changes that End Up Causing Major Incidents
Knowing when incidents occur due to planned changes is important, as already noted. Tracking major, or top-tier, incidents is key to ensuring that changes are going through a rigorous evaluation process before release.
Segmenting incidents helps to prioritize action needed to resolve incidents. Major incidents are any that need to be resolved immediately. These incidents cause direct and impactful harm to the organization and must be fixed.
Working to reduce major incidents as much as possible is important for business stability.
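The percentage itself is simple arithmetic. Here is a minimal sketch, assuming you can count both the total number of changes and the number of changes that led to a major incident; the function name and the sample counts are hypothetical.

```python
def major_incident_rate(total_changes, changes_causing_major_incidents):
    """Percentage of all changes that ended up causing a major incident."""
    if total_changes == 0:
        return 0.0  # no changes released, so nothing could have gone wrong
    return 100.0 * changes_causing_major_incidents / total_changes


# Hypothetical quarter: 200 changes, 3 of which caused major incidents.
print(major_incident_rate(200, 3))  # 1.5
```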
7. On-time Delivery
Delivering software to users is a necessity for any organization. If an organization can’t deliver software changes to end users consistently and on time, that business will always lag behind its competitors.
On-time delivery is measured as the number of days between the scheduled release date and the date the release actually reaches end users.
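A minimal sketch of that measurement follows. The function name is hypothetical, and treating early releases as zero days late (rather than a negative number) is an assumption you may want to change.

```python
from datetime import date


def days_late(scheduled, actual):
    """Days after the scheduled date that the release reached end users.

    Releases that ship on time or early count as 0 days late (an assumption).
    """
    return max((actual - scheduled).days, 0)


print(days_late(date(2019, 8, 1), date(2019, 8, 6)))  # 5
print(days_late(date(2019, 8, 1), date(2019, 7, 30)))  # 0
```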
8. Releases Delivered on Schedule by Application
Knowing when applications are released according to their predetermined schedule allows for better resource management and future planning. Without knowing which applications are consistently on or behind schedule, changes can’t be implemented in an effective and meaningful way to help application teams deliver on schedule.
If a specific application is consistently behind its release schedule, the cause may be a lack of personnel, constantly changing priorities from leadership, unrealistic scheduling, etc. Knowing which applications are not being delivered on schedule is the first step in determining the underlying issue.
9. Releases Delivered on Schedule by Priority
Not all applications, tickets, or features are created equal; some are more critical or will have a larger impact on the organization. The higher the priority of a release, the more it matters that it ships on time.
Tracking release priority and delivery as a metric will give insight into how teams respond when the business defines priority goals. If releases that are deemed high importance are often delayed or late, resource allocation and the communication of what priority means need to be addressed for effective change management.
10. Total Number of Days Late by Application
An application release schedule is created with business value in mind. Getting new features to market, ensuring compliance, and improving user experience are all critical to organizational success.
If an application is consistently behind schedule, there are real and material effects on the business, whether in lost market share, increased security risks, or possible fines from compliance authorities. This is unacceptable if an organization is to thrive and grow.
Knowing the total number of days an application has historically been late can illuminate actual costs to an organization that could have been minimized or avoided.
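To illustrate, the sketch below totals the days late per application from a release history. The application names and dates are made up, and early releases are counted as zero days late, which is an assumption.

```python
from collections import defaultdict
from datetime import date


def total_days_late_by_application(release_history):
    """Sum days late per application from (app, scheduled, actual) records."""
    totals = defaultdict(int)
    for app, scheduled, actual in release_history:
        totals[app] += max((actual - scheduled).days, 0)
    return dict(totals)


# Hypothetical release history.
history = [
    ("billing", date(2019, 3, 1), date(2019, 3, 4)),    # 3 days late
    ("billing", date(2019, 6, 1), date(2019, 6, 8)),    # 7 days late
    ("portal", date(2019, 4, 15), date(2019, 4, 15)),   # on time
]

print(total_days_late_by_application(history))  # {'billing': 10, 'portal': 0}
```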
11. Duration of Major Deployments
Businesses must always change and innovate, but getting new features and innovations to the marketplace can only be considered ‘complete’ when the target users have access to the changes. These changes are most often done via a major deployment due to their scale.
The duration of a major deployment is the time between defining a major release (new features, bug fixes, etc.) and that release being available to end users.
How long it actually takes to get the changes in a major deployment to end users dictates how fast a business can serve its market.
12. Release Priorities
Release priorities may be due to external factors, like maintaining compliance, or internal decisions made by the business unit. Ensuring that the highest-priority releases receive the proper attention and scheduling is critical for organizational continuity.
Measuring release priority can be difficult, as lower-priority releases may be completed before higher-priority ones, but a lower-priority release should never be given precedence over a high-priority release. High-priority releases continuing to move toward release on schedule is the best indicator of success.
13. Coordinated Releases
No organizational unit exists in a vacuum; each is affected and controlled by outside forces. The software that an organization needs to thrive is no different, and there will frequently be times when outside departments, other applications, or shared resources all need to be coordinated and kept informed of a release.
These releases will take more time, energy, oversight, and communication than isolated releases. However, they are more important to an organization because they affect multiple systems and directly contribute to overarching goals set by the business.
Defining coordinated releases as high priority and knowing how many coordinated releases are scheduled at any given time will help ensure that there are enough resources available to release successfully and on schedule.
14. Isolated Releases
Isolated releases will generally involve only the individual application team, and affect only that application’s stand-alone systems. While isolated releases may still be challenging to implement at the development level, they introduce less risk at the organization level because they have a limited sphere of contact with other applications, departments, and resources.
While still important, isolated releases are generally not as critical to organizational success and planning as coordinated releases.
15. Emergency Patches
There can be situations that demand emergency patches, whether from internal errors or external vulnerabilities. These emergency patches often need unexpected resources for approval or quality assurance, and can throw non-emergency timelines into chaos.
Having clear priorities for emergency patches enables organizations to allocate the resources needed to clear the emergency, and to adjust expectations and schedules for other applications and systems that may be delayed as a result.
The unexpected will always happen; having practices and procedures in place allows business units to adapt and adjust when it does.
16. Total Number of Errors
Mistakes will happen, and they will happen during releases. Minimizing errors early in the release cycle is critical to long-term success; however, when an error does make it into a release, having processes in place to minimize its impact is important.
Since most organizations already have an issue tracking system in place, using it to track the total number of errors at any given time is the most efficient approach.
17. Number of Rollbacks
There are times when, mid-release or shortly after a release has completed, a fatal defect or unacceptable issue is found. When this occurs, even the time to create, test, verify, and deploy an emergency patch is too long. The solution is to roll back to the last release without the flaw.
Rolling back should always be an available option; however, it should only be used in extreme cases when it is the only acceptable solution. Tracking the number of rollbacks that occur throughout a system indicates the quality and thoroughness of development and testing.
18. Number of Software Defects that Made it into Production
Code perfection, while the goal, is never achievable. Updates to infrastructure, new features, and adherence to new regulations can all create bugs in production software systems. Knowing the overall number of bugs in a system allows for better planning and resource management.
Reducing the number of known defects in a system to zero doesn’t need to be the overall goal for a reliable application. It is more important to make sure that the defects are non-critical and don’t pile up with new releases in an unmanageable way.
Every organization will be different in their production defect tolerance, and it may vary from system to system depending on that system’s importance. Knowing about those defects allows for resource allocation, and release cycle planning.
19. Production Issue Open/Close Rates
Measuring the rate at which tickets come in and are closed during the release cycle is important for trend monitoring. There will always be flaws in production, and new features will introduce new bugs; however, by continually monitoring the open and close rates of these tickets you’ll be able to spot trends in production and plan accordingly.
Spikes outside the average may indicate a rushed process or an inadequate level of testing prior to release. Monitoring issue open and close rates sets a baseline for application performance and resource planning.
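As an illustration, the sketch below counts how many production tickets were opened and closed in each ISO week. Grouping by week is an assumption (any period works), and the ticket data is hypothetical; real data would come from an export of your issue tracker.

```python
from collections import Counter
from datetime import date


def week_key(day):
    """Return (ISO year, ISO week number) for a date."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def weekly_open_close_counts(tickets):
    """Map each week to (opened, closed) counts.

    `tickets` is an iterable of (opened_on, closed_on) dates, where
    closed_on is None for tickets that are still open.
    """
    opened, closed = Counter(), Counter()
    for opened_on, closed_on in tickets:
        opened[week_key(opened_on)] += 1
        if closed_on is not None:
            closed[week_key(closed_on)] += 1
    weeks = sorted(set(opened) | set(closed))
    return {week: (opened[week], closed[week]) for week in weeks}


# Hypothetical tickets spanning two consecutive weeks.
tickets = [
    (date(2019, 8, 19), date(2019, 8, 21)),
    (date(2019, 8, 20), date(2019, 8, 27)),
    (date(2019, 8, 28), None),  # still open
]

for week, (n_opened, n_closed) in weekly_open_close_counts(tickets).items():
    print(week, "opened:", n_opened, "closed:", n_closed)
```

A week where far more tickets open than close, compared with the usual pattern, is the kind of spike worth investigating.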
20. Mean Time to Repair
Tracking the amount of time that elapses between an application crashing and its recovery is valuable for assessing the overall health and reliability of systems.
Knowing which systems recover faster and why, whether through early crash indication or automated disaster recovery, allows better allocation of resources for critical systems, and the practices that work can be replicated for better overall system reliability.
Systems will crash; monitoring the time from those crashes until recovery will allow the organization to reduce the overall effects of a crash.
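A minimal sketch of the calculation, assuming each crash is recorded with the time it happened and the time the system was recovered. The function name and incident times are hypothetical.

```python
from datetime import datetime


def mean_time_to_repair_minutes(incidents):
    """Average minutes between crash and recovery over (crashed, recovered) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [
        (recovered - crashed).total_seconds() / 60
        for crashed, recovered in incidents
    ]
    return sum(durations) / len(durations)


# Hypothetical crash log: one 30-minute and one 90-minute recovery.
incidents = [
    (datetime(2019, 8, 1, 9, 0), datetime(2019, 8, 1, 9, 30)),
    (datetime(2019, 8, 9, 14, 0), datetime(2019, 8, 9, 15, 30)),
]

print(mean_time_to_repair_minutes(incidents))  # 60.0
```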
21. Percent (%) of Release Success Rate
Software developed on a set release schedule will likely need more than just the pre-scheduled releases. Additional releases to fix issues and perform maintenance are almost always necessary. These are not generally planned because it’s impossible to know what issues will arise out of a release until the release happens.
These unplanned releases can throw off a release schedule and delay planned releases. The release success rate is the percentage of planned releases that were deployed as scheduled over a given period of time.
Some releases may need a delay because of plan changes, or for priority changes, but consistently missing planned releases may indicate underlying concerns that need to be addressed.
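The release success rate can be sketched in a few lines. The function name and the sample numbers are hypothetical.

```python
def release_success_rate(planned_releases, deployed_as_scheduled):
    """Percentage of planned releases that were deployed as scheduled."""
    if planned_releases <= 0:
        raise ValueError("planned_releases must be positive")
    return 100.0 * deployed_as_scheduled / planned_releases


# Hypothetical year: 12 planned releases, 9 shipped as scheduled.
print(release_success_rate(12, 9))  # 75.0
```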
22. Release Efficiency
Delivering software to users is at the core of any mature organization. The ability to deliver that software efficiently is often the difference between success and failure for business units.
Measuring efficiency means determining whether the team can deliver as the business demands, on schedule and with the resources allocated.
23. Average Lead Time
The ability to release new features to the marketplace is key to growing the business and to development teams delivering value to non-technology stakeholders.
Understanding how long it takes to go from idea, or feature request, through to released software is a key indicator of organizational health. If it takes months or quarters to release, then feature planning will always be that far behind when trying to stay competitive with other organizations.
Understanding how long lead time is will set a baseline for improvement, or give guidance about where resources need to be allocated to improve.
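To set that baseline, a simple average is enough to start with. The sketch below assumes each feature is recorded with the date it was requested and the date it was released; the function name and dates are hypothetical.

```python
from datetime import date


def average_lead_time_days(features):
    """Average days from feature request to release over (requested, released) pairs."""
    if not features:
        raise ValueError("at least one feature is required")
    lead_times = [(released - requested).days for requested, released in features]
    return sum(lead_times) / len(lead_times)


# Hypothetical features: 30 and 60 days from request to release.
features = [
    (date(2019, 1, 1), date(2019, 1, 31)),
    (date(2019, 3, 1), date(2019, 4, 30)),
]

print(average_lead_time_days(features))  # 45.0
```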
24. Average Cycle Time
Knowing how long it takes from committing code until that code is released is necessary for efficient planning.
Cycle times may be measured in weeks or even months for some organizations; others that focus on delivery can measure cycle time in minutes or seconds, allowing them to deliver code almost immediately when needed.
While no two releases are ever the same, on average any given release cycle of a similar type should take about the same amount of time. Establishing this baseline allows for innovation in both methodologies and tooling to reduce the cycle time.
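Because the baseline only makes sense for releases of a similar type, one possible approach is to average cycle time per release type, as sketched below. The type labels ("hotfix", "feature") and the timestamps are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def average_cycle_time_hours(release_log):
    """Average hours from commit to release, grouped by release type.

    `release_log` is an iterable of (release_type, committed_at, released_at).
    """
    hours_by_type = defaultdict(list)
    for release_type, committed_at, released_at in release_log:
        elapsed_hours = (released_at - committed_at).total_seconds() / 3600
        hours_by_type[release_type].append(elapsed_hours)
    return {
        release_type: sum(hours) / len(hours)
        for release_type, hours in hours_by_type.items()
    }


# Hypothetical release log.
release_log = [
    ("hotfix", datetime(2019, 8, 1, 10, 0), datetime(2019, 8, 1, 12, 0)),   # 2 hours
    ("hotfix", datetime(2019, 8, 5, 9, 0), datetime(2019, 8, 5, 13, 0)),    # 4 hours
    ("feature", datetime(2019, 8, 1, 9, 0), datetime(2019, 8, 3, 9, 0)),    # 48 hours
]

print(average_cycle_time_hours(release_log))  # {'hotfix': 3.0, 'feature': 48.0}
```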
25. Average Cost Per Release
Software in business is ubiquitous, so much so that it is sometimes hard to see how much it actually costs. This is especially true for software developed internally, since resources are often seen as ‘already paid for’. However, this point of view can lead to wasted resources. If a non-critical change to the software will take man-months to create but will deliver very little (or no) new revenue and is only a very minor improvement, it is worth reevaluating its value and priority.
Multiplying the man-hours spent on a release by the salary cost of those hours gives a simple cost for releasing those changes. Knowing this cost will allow you to focus on the most value-positive changes that can be made during a release.
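That calculation might look like the sketch below. The hours and hourly rates are invented for illustration, and the sketch deliberately ignores overhead such as tooling and infrastructure.

```python
def release_cost(work_items):
    """Sum hours multiplied by hourly rate over (hours, hourly_rate) pairs."""
    return sum(hours * hourly_rate for hours, hourly_rate in work_items)


# Hypothetical effort for one release: (hours worked, hourly salary cost).
work_items = [
    (40, 60.0),  # development
    (25, 45.0),  # testing
    (10, 80.0),  # release coordination
]

print(release_cost(work_items))  # 4325.0
```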
See How Inedo Can Improve Your Release Management KPIs
Understanding some of the above release management KPIs will give insight into how truly reliable, efficient, and adaptive an organization is. BuildMaster, and other tools, help measure and increase efficiency in an organization and enable improvements in releasing software.
Inedo’s DevOps tools maximize developer time, minimize release risk, and empower stakeholders to bring their vision to life faster. All with the people and technology you have right now.