Get your head out of the silo, with Vendor-Agnostic Situational Awareness

About

This content is brought to you by Evolven. Evolven Change Analytics is a unique AIOps solution that tracks and analyzes all actual changes carried out in the enterprise cloud environment. Evolven helps leading enterprises cut the number of incidents, slash troubleshooting time, and eliminate unauthorized changes.

Intellyx BrainBlog for Evolven, by Jason English

The biggest obstacle to gaining visibility into potential risks and failures across an enterprise application estate isn’t a lack of dashboards. It’s too many dashboards.

No development manager, IT operations leader, or CIO in the world will tell you that they wish they had more Slack alerts and Jira tickets, more data feeds to monitor, more dashboards to check.

Application architectures will only increase in complexity and interconnectedness over time. Simply throttling change is not an option, especially in a scenario like a merger and acquisition, where neither party can afford to cut off their current revenue streams by taking down critical applications.

So how do you gain control over configuration and change risk in complex environments, without throwing sand in the gears? An approach called Vendor-Agnostic Situational Awareness (VASA for short) might offer a solution.

What if we can’t break down silos?

Cloud hyperscalers like AWS and Azure offer their own forms of system and cloud usage monitoring and security. They still sell even more bolt-ons for APM, vulnerability scanning, release management, and issue tracking that are affiliated at the account level through their marketplaces.

Companies have also already made huge on-premises investments in datacenters, managed vSphere instances, and bespoke private cloud infrastructures. These are often supported by service providers, each of which may come equipped with its own measurement and compliance tools to prove it is meeting SLAs.

Below that, there are many more platforms and SPOG (single pane of glass) dashboards that can tell you ‘what’s wrong’ within specific ITSM, SIEM, observability and CI/CD pipelines, and an ecosystem of vendors for each. Take ServiceNow. Or Splunk. Or Datadog. Or Atlassian or GitLab.

Source systems | Sample vendors | Measures/monitors
ITSM | ServiceNow, BMC, Atlassian (Jira), PagerDuty | Tickets, Incident reports
ITOM | BMC, IBM, Broadcom | System events, upgrades, performance metrics, storage
Observability, APM | Cisco (AppDynamics), Grafana, Dynatrace, Datadog, New Relic | Latency, traffic, errors, saturation, performance
CI/CD, GitOps, SSC | Jenkins, JFrog, GitHub, GitLab, Azure DevOps | Pull requests, check-ins, documentation, packages
Security Operations | Splunk, LogRhythm, Securonix, Elastic | Vulnerabilities, threats detected, DDoS, ransomware
Networks | Cisco, NETSCOUT, Kentik | Traffic, latency, connections

Figure 1: Sampling different sources of information from a mix of vendors results in a plethora of seemingly unrelated measures that enterprise IT teams must glean awareness from. [Not a complete list.]

Introducing Vendor-Agnostic Situational Awareness

Many dashboards offer a window into proprietary silos. That is a challenge in itself, because only events from vendor-approved or platform-specific integrations are visible within each platform’s purview.

Open source projects such as OpenTelemetry and the ELK stack are helping to democratize data and metrics, with community and vendor support.

Any composite organization is going to have a need for further neutrality, a trusted third party that can monitor any platform as a source of telemetry. Rather than looking inside a limited set of enterprise silos, the org should gain visibility into them all—ITSM, SIEM, observability and software delivery automation—to mitigate shared configuration and change risk across all departments and all technologies.

Each corporate department, every upstream partner, and every downstream customer will still operate in their own silos. When a team shapes the huge volume of incoming data and configures dashboards, they will naturally prioritize their own needs first. But for technology leaders, SecOps teams and SREs who must deal with the consequences of global risk across the IT estate, VASA is the way to go.

This starts with discovering and collecting configuration and change data wherever it occurs across multiple environments, bringing it all together in one place, then collating and correlating this event data according to time and impact. You might even think of VASA as a shared organizational consciousness for risk and change.
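The collect, collate, and correlate steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Evolven or any vendor implements it: the `ChangeEvent` record, its 1-to-5 `impact` score, the source labels, and the sample events are all hypothetical. It merges per-source feeds into one timeline, then lists the changes that landed shortly before an incident, highest impact first.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    source: str          # hypothetical source label, e.g. "itsm" or "cicd"
    system: str          # the system the change touched
    timestamp: datetime  # when the change was observed
    description: str
    impact: int          # hypothetical score: 1 (low) to 5 (high)


def merge_timeline(*feeds):
    """Combine per-source event lists into one list ordered by time."""
    merged = [event for feed in feeds for event in feed]
    return sorted(merged, key=lambda e: e.timestamp)


def changes_before(timeline, incident_time, window=timedelta(hours=2)):
    """Return changes inside the window before an incident.

    Results are ordered by impact (highest first), then by how close
    the change was to the incident (most recent first).
    """
    start = incident_time - window
    candidates = [e for e in timeline if start <= e.timestamp <= incident_time]
    return sorted(candidates, key=lambda e: (-e.impact, incident_time - e.timestamp))


# Hypothetical sample feeds from two silos.
itsm_feed = [
    ChangeEvent("itsm", "claims-db", datetime(2024, 1, 10, 8, 0),
                "Approved schema change", 3),
]
cicd_feed = [
    ChangeEvent("cicd", "mobile-api", datetime(2024, 1, 10, 9, 30),
                "Deployed new build", 4),
    ChangeEvent("cicd", "mobile-api", datetime(2024, 1, 9, 9, 30),
                "Deployed previous build", 2),
]

timeline = merge_timeline(itsm_feed, cicd_feed)
incident_time = datetime(2024, 1, 10, 10, 0)
suspects = changes_before(timeline, incident_time, window=timedelta(hours=3))

for event in suspects:
    print(event.timestamp, event.source, event.system, event.description)
```

The point of the shared record type is that once every silo’s events are expressed in the same shape, correlation by time and impact becomes a simple sort and filter rather than a hunt across dashboards.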

A single pane of glass for insurance against risk

Evolven has taken a non-opinionated approach to identifying change and configuration risk within an enterprise-wide view of the technology estate, by collecting near-real-time data from a distributed inventory of hybrid cloud assets.

A Case Study Using Evolven: A large insurance firm wanted to improve its customer-facing presence, while proactively managing compliance and SLOs (service level objectives) for availability and performance as a policy, rather than firefighting application failures in a reactive mode.

They managed their legacy application environment (Linux VMs, WebSphere, Oracle, etc.) through a long-standing ITIL process, with a Change Advisory Board (or CAB) authorizing changes through ServiceNow to prevent change-related failures.

At the same time, the firm’s innovation team conducted agile development with a CI/CD pipeline for automating deployments of its modern mobile application, running in containers orchestrated by Kubernetes on OpenShift. The new app is strategic to its improved digital experience, allowing customers to use functionality such as payments and claims on their client-side devices.

Upon a daily update to the mobile app, the application started failing to show some of the necessary customer information. With no visibility into what might be causing the error, the firm’s Production Support team got a Sev1 notification and had an SRE drop in to try and figure out what was wrong.

Using Evolven, they first checked both the legacy environment’s CAB-driven change approvals, and the mobile team’s CI/CD pipeline, to prove that all the change requests made to each system were authorized within ServiceNow. Check.

Then, they started checking that all of the production deployments in both legacy and modern environments were verified complete. Check. But, wait a minute… one of the back-end data services actually failed to deploy correctly to staging, prior to production!

Armed with exact telemetry of the change event that caused the staging failure, and knowing that the production deployment had been deemed successful only because it matched the staging environment, the SRE asked the irresponsible DevOps team why they hadn’t bothered to alert anyone else.

“Oh well, we didn't think anyone would notice that, and we were going to fix it first thing in the morning…” Good thing they were able to get situational awareness at exactly the right time!
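The two checks the SRE walked through (were all changes authorized, and did every production deployment first succeed in staging?) can be expressed as a small audit over combined records. The sketch below is a hypothetical illustration, not Evolven’s or ServiceNow’s API: the `Deployment` record, the `CHG-` identifiers, and the service names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    change_id: str     # hypothetical change request identifier
    service: str
    environment: str   # "staging" or "production"
    succeeded: bool


def audit_changes(authorized_ids, deployments):
    """Audit production deployments against approvals and staging results.

    Returns two lists of production deployments: those whose change
    request was never authorized, and those with no successful staging
    deployment for the same change.
    """
    staging_ok = {
        d.change_id
        for d in deployments
        if d.environment == "staging" and d.succeeded
    }
    unauthorized, unverified = [], []
    for d in deployments:
        if d.environment != "production":
            continue
        if d.change_id not in authorized_ids:
            unauthorized.append(d)
        if d.change_id not in staging_ok:
            unverified.append(d)
    return unauthorized, unverified


# Hypothetical records: both changes were approved, but CHG-2 failed in staging.
authorized_ids = {"CHG-1", "CHG-2"}
deployments = [
    Deployment("CHG-1", "mobile-ui", "staging", True),
    Deployment("CHG-1", "mobile-ui", "production", True),
    Deployment("CHG-2", "customer-data-service", "staging", False),
    Deployment("CHG-2", "customer-data-service", "production", True),
]

unauthorized, unverified = audit_changes(authorized_ids, deployments)
print("Unauthorized:", [d.change_id for d in unauthorized])
print("Reached production without a good staging run:",
      [d.change_id for d in unverified])
```

As in the case study, every change passes the authorization check, and only the cross-environment comparison surfaces the service that never deployed cleanly to staging.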

The Intellyx Take

There are teams that interface daily with ServiceNow, Splunk, or Dynatrace to track issues and hunt for anomalies. These platforms are all useful, but why not dig deeper and look for situational awareness data at all of its sources, where change happens?

Sure, VASA may be represented as yet another single-pane-of-glass dashboard, but the practice would align multiple teams with a common consciousness of configuration, change and risk wherever it is exhibited across multiple environments.

All parties can keep their own silos if they want. However, a single view of all of the systems that contribute configuration and change data is essential to the modern version of an enterprise-wide risk control board, which now has more distributed systems and stakeholders than ever before. Deep insight into configuration state and high-level visibility into risks and compliance failures are both critical for the stability of modern applications, now and in the future.

About the Author
Jason English

Jason English (@bluefug) is Partner & Principal Analyst at Intellyx, an analyst firm that advises enterprises on their digital transformation strategies and publishes the Cloud-Native Computing Poster and the weekly Cortex and Brain Candy newsletters.