Software intelligence company Dynatrace has announced the release of a new, in-depth State of SRE report, based on an independent survey of 450 site reliability engineers (SREs). The report highlights that SREs are taking on a more strategic role, as organizations have a growing need to ensure teams have the answers and intelligent automation needed to accelerate digital transformation. The growth of new technologies used in cloud-native development, however, has created an explosion of complexity that is hindering these efforts.
The research reveals:
• 88% of SREs say there is now more understanding of the strategic importance of their role than there was three years ago.
• SREs currently dedicate the largest amount of their time to reducing mean time to recovery, or MTTR (67%), building and maintaining automation code (60%), and ensuring security vulnerabilities are detected and eliminated quickly (58%).
• 68% of SREs expect their role in security to become more central in the future, as organizations continue using third-party libraries, such as Log4j, for cloud-native application development.
• 99% of SREs encounter challenges when defining and creating SLOs to evaluate service levels for applications and infrastructure. The most common challenges include:
o Too many data sources (64%),
o Difficulty finding the most relevant metrics for a service (54%),
o The inability of monitoring tools to easily define and track SLO performance (36%).
• 68% of SREs say siloed teams and multiple tools make it difficult to align on a single version of ‘the truth’ about service levels.
“Reliability, experience, and security have become critical success factors in a world where every second of downtime leads to lost revenue, declining share prices, and lasting reputational damage,” said Bernd Greifeneder, Founder and Chief Technology Officer at Dynatrace. “This makes SRE central to driving faster digital transformation. Most organizations, however, remain relatively immature in their adoption of SRE practices. At a time when demand far outstrips the supply of skilled engineers, organizations should be doing everything in their power to amplify the efforts of these teams. Despite this, manual steps and unnecessary effort are a major distraction for SREs, which holds organizations back. SREs must define a ‘golden path,’ a set of steps development teams can take to navigate the complexity of cloud-native delivery, to overcome these barriers and fully unleash digital innovation.”
Additional findings from the report include:
• 85% of organizations say their ability to scale SRE practices will depend on automation and AI capabilities.
• 71% of organizations are increasing the use of automation across every part of the lifecycle to reduce toil for developers and SREs.
• Organizations are primarily using automation in SRE to resolve security vulnerabilities (61%) and application failures (57%), increase the speed of delivery (56%), and predict SLO violations before they occur (55%).
• SREs say AIOps will enable teams to automate more processes critical to ensuring service levels are continually met (64%), prioritize problems that have the biggest impact on user satisfaction (63%), and prioritize security vulnerabilities to minimize downtime (62%).
• 85% of SREs want to standardize on a single observability platform spanning development, operations, and security by 2025.
“SREs need a single, unified platform that enables reliability, security, and automation by default,” continued Greifeneder. “Self-service observability and monitoring-as-code capabilities are key, allowing development teams to build feedback loops into their applications in just a few clicks. Through this, SREs will lead the charge in going beyond basic automation to smart orchestration of customer experience and business outcomes. That will empower organizations to drive digital transformation faster than ever, through self-healing cloud applications that quickly scale with business needs. As a result, SREs can be free to focus on the things that are core to their role, enabling them to create greater value by driving best practices for reliability, resiliency, security, and performance, to ultimately deliver better business outcomes.”
This report is based on a global survey of 450 SREs from large organizations with more than 1,000 employees, including 150 in the U.S., 150 across EMEA, and 150 in Asia-Pacific.