CMDB: Lost in the Clouds
About
This content is brought to you by Evolven. Evolven Change Analytics is a unique AIOps solution that tracks and analyzes all actual changes carried out in the enterprise cloud environment. Evolven helps leading enterprises cut the number of incidents, slash troubleshooting time, and eliminate unauthorized changes.
“…But now they only block the sun / They rain and they snow on everyone / So many things I would have done / But clouds got in my way…” - “Both Sides Now”, by Joni Mitchell
Clouds most definitely do get in the way, at least if you try to operate IT environments in the cloud exactly as you did in the datacenter. That approach is generally referred to as “lift and shift”. With it, you will not automatically capture the advantages in agility and scale that a cloud-based architecture can provide. As a result, your expectations for improvement may end up “lost in the clouds”.
Where is the CMDB?
The CMDB is safe and sound where it belongs, in the datacenter - on-premises. The CMDB (Configuration Management Database) is a logical database containing all the essential information about IT infrastructure components and the dependencies between them. The practice of IT Service Management (ITSM) using the IT Infrastructure Library (ITIL) typically leverages the CMDB for configuration management, changes to configurations, as well as the processes for incident, problem, release management, and more. The CMDB is applied to help improve IT service delivery and governance.
While CMDBs can provide significant business value in a traditional on-premises deployment, delivering on the promise of a CMDB requires more than tool investments. Infrastructure and operations (I&O) leaders must implement a service asset and configuration management (SACM) program to capture this value. SACM is the process of maintaining information (i.e., configurations) about the IT assets and configuration items (CIs) required to deliver an IT service, including their relationships. Fully realizing the benefits of the CMDB requires a high degree of ITSM maturity.
According to Gartner (in their recent Market Guide for Cloud Management Tooling), “Enterprises are intentionally and unintentionally needing to support multi-cloud deployments, which is stressing their current operational process and tooling that is often focused on a single environment”. This evolution of the standard IT deployment model to multi-cloud (and hybrid) is where the trouble starts. CMDBs and SACM processes were not designed to handle this model.
Clouds Got in My Way
As Joni so eloquently said, “but clouds got in my way”. Here’s how that happens.
Need for Speed
The SACM processes that manage change are just too slow for use in hybrid, multi-cloud deployments. While the process of discovery for capturing configurations and the changes already deployed can be done automatically, the process for approving new changes is typically not. The CMDB may be “up-to-date” after automatic discovery; however, there would be no certainty of whether the changes discovered were approved, correct, or engender risk.
Typically, the processes to manage change are manual and dependent on human interactions, such as approval of a change by a Change Advisory Board (CAB). An important, possibly business-critical change may well sit and wait for a meeting before it can be approved and then deployed. However, the agility that the cloud offers depends on almost constant change, and those changes need to be applied immediately. This fluid, almost shape-shifting quality enables businesses to take advantage of fluctuations in demand, market, and competition and use them as opportunities for growth and market advantage.
Awareness of the impact of these changes as they happen, or better yet before they happen, is critical to ensure that there is no impact on stability, security, or compliance. Unfortunately, manual processes are too slow for this. In a cloud deployment their results arrive too late to be relevant, and essentially they “get in the way”.
Scale
The CMDB was never designed to store the enormous volumes of deep and ephemeral configuration data in dynamic cloud-native frameworks such as Kubernetes, nor to clarify what must be done to avoid risky changes (see “The Kung Fu of Change Risk Intelligence”). If the CMDB is to be the “single source of truth”, the missing configuration data could complicate and lengthen troubleshooting. A CMDB can store some ephemeral data, and commonly used CMDBs have a wide breadth of discovery. However, CMDBs generally fall short in several areas, including the following:
- Depth of discovery is limited and not sufficiently granular. CMDBs do not store the complete extent of details per configuration item (CI).
- CMDBs do not connect configuration with automation assets.
- Analysis of discovered configuration data is limited to asset classification.
The CMDB’s depth of discovery, contextualization, and analysis is insufficient for helping enterprises navigate the volatile, digital ocean of the cloud and avoid the impact of misconfigurations and risky changes.
As a result, significant amounts of critical configuration data that are changed on the fly may never make it into the CMDB. Changes to the details of CIs may result from heroic efforts to restore stability, or from the constant stream of changes flowing out of DevOps as the cadence of new releases accelerates in line with business objectives. Even if these changes do make it into the CMDB, it may be too late to use the data to determine risk, avoid instability, and ensure compliance.
For example, CMDBs may capture the hierarchical structure of Kubernetes, including namespace, deployment, service, and pod. But they do not typically retrieve all the properties of each element, such as namespaces, deployments, services, DeploymentConfigs, Secrets, ConfigMaps, and more.
Since Kubernetes is often used to construct microservice applications, a change to the configuration of a single component can have an enormous impact on behavior, performance, and security, and troubleshooting it can be daunting. Without deep, granular configuration data, correlating low-level changes with incidents to drive root cause analysis is not possible.
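To make the idea of deep, granular configuration data concrete, here is a minimal Python sketch of a parameter-level diff between two snapshots of a Kubernetes-style object. The function names (`flatten`, `diff_snapshots`), the `ABSENT` marker, and the sample Deployment values are all hypothetical and chosen only for illustration. This is not how any particular CMDB or Evolven product works.

```python
from typing import Any, Dict, Tuple

# Hypothetical marker for a parameter that exists in only one snapshot.
ABSENT = "<absent>"


def flatten(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into {path: leaf value}.

    Empty dicts and lists produce no entries.
    """
    items: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def diff_snapshots(before: Any, after: Any) -> Dict[str, Tuple[Any, Any]]:
    """Return {path: (old value, new value)} for every parameter that differs."""
    old, new = flatten(before), flatten(after)
    changes: Dict[str, Tuple[Any, Any]] = {}
    for path in sorted(old.keys() | new.keys()):
        old_value = old.get(path, ABSENT)
        new_value = new.get(path, ABSENT)
        if old_value != new_value:
            changes[path] = (old_value, new_value)
    return changes


# Illustrative snapshots of one Deployment (values are made up).
before = {
    "kind": "Deployment",
    "spec": {
        "replicas": 3,
        "template": {"spec": {"containers": [
            {"name": "api", "image": "api:1.4.2",
             "resources": {"limits": {"memory": "512Mi"}}},
        ]}},
    },
}
after = {
    "kind": "Deployment",
    "spec": {
        "replicas": 3,
        "template": {"spec": {"containers": [
            {"name": "api", "image": "api:1.4.3",
             "resources": {"limits": {"memory": "256Mi"}}},
        ]}},
    },
}

changes = diff_snapshots(before, after)
for path, (old_value, new_value) in changes.items():
    print(f"{path}: {old_value} -> {new_value}")
```

A repository that only records that the Deployment exists would treat these two snapshots as identical. The flattened view surfaces the halved memory limit as a distinct, attributable change.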
Business-critical decisions are made on the data that is in the CMDB. But if the data is incomplete, the decisions may be incorrect as well as late, and reliability will suffer (see “It’s Time to Turn and Face the Changes”).
Configuration Changes Determine your Fate
Configurations are the blueprints that determine behavior. They are the guidelines, almost like IT DNA, that govern the conduct of code, frameworks, firewalls, security, automation, deployment tools, and probably just about everything else in an IT environment. Many of the changes that occur in an IT environment are changes to configurations. These can be, for example, memory constraints, buffering, threads, instances, rules, services, replicas, jobs, code, security settings, and so much more. Modern cloud-native applications often have thousands of components with short-lived, interdependent relationships between them. The dynamic dependencies between these parts are immediately impacted by changes in their configurations, sometimes with unexpected results.
Changes to these parameters in a configuration can be why an application performs much faster or delivers a new feature to customers, or why it crashes and burns, taking your company’s reputation with it.
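The way a single configuration change ripples through dependent components can be sketched as a graph traversal. The Python example below is only an illustration under stated assumptions: the `DEPENDS_ON` map and its component names are hypothetical, and a real environment would derive dependencies from discovery data rather than a hand-written dictionary.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical dependency map: component -> components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "checkout-ui": ["cart-service", "auth-service"],
    "cart-service": ["pricing-service", "redis-cache"],
    "pricing-service": ["pricing-config"],
    "auth-service": ["ldap-gateway"],
}


def invert(depends_on: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Build the reverse map: component -> components that depend on it."""
    dependents: Dict[str, Set[str]] = {}
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            dependents.setdefault(dependency, set()).add(component)
    return dependents


def affected_by(changed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every component that directly or transitively depends on `changed`."""
    dependents = invert(depends_on)
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


print(sorted(affected_by("pricing-config", DEPENDS_ON)))
```

In this made-up graph, a change to `pricing-config` reaches `pricing-service`, `cart-service`, and `checkout-ui`, even though only one of them reads the configuration directly.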
Artificial Intelligence: Analysis before Paralysis
Artificial Intelligence (AI), as a form of Change-Centric AIOps, can use supervised and unsupervised machine learning (ML), predictive analytics, and anomaly detection to uncover patterns that implicate configuration changes in unexpected system behavior. The data commonly collected from typical observability sources, such as logs, metrics, and traces, will only show symptoms and not the reason why stability has suffered (see “Prevent Problems Using Four-Dimensional Observability”).
AI can be applied to this problem to establish causal relationships between elements of telemetry data and configuration changes. By configuration changes, we mean changes to the entire state of the environment, including code, data schemas, security, containers, and more. AI can also estimate the blast radius of what has been impacted and, going further, predict the risk incurred by these changes. This last benefit can help you evolve toward a more preventative approach to problem management.
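As a rough illustration of the raw material such analysis works from, the Python sketch below ranks recent configuration changes as suspects for an incident using only timing and component overlap. It is a simple heuristic, not machine learning and not causal inference. The `Change` record, the `suspect_changes` function, and the two-hour lookback window are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Set


@dataclass(frozen=True)
class Change:
    """Hypothetical record of one detected configuration change."""
    component: str
    parameter: str
    applied_at: datetime


def suspect_changes(
    changes: List[Change],
    incident_start: datetime,
    incident_components: Set[str],
    lookback: timedelta = timedelta(hours=2),
) -> List[Change]:
    """Return changes applied within `lookback` before the incident.

    Changes on components involved in the incident come first,
    then the most recent changes.
    """
    window_start = incident_start - lookback
    candidates = [
        c for c in changes if window_start <= c.applied_at <= incident_start
    ]
    return sorted(
        candidates,
        key=lambda c: (
            c.component not in incident_components,  # False sorts first
            incident_start - c.applied_at,           # smaller gap sorts first
        ),
    )


# Illustrative data (all values are made up).
incident = datetime(2024, 1, 15, 14, 0)
history = [
    Change("cart-service", "replicas", datetime(2024, 1, 15, 9, 0)),
    Change("pricing-service", "cache_ttl", datetime(2024, 1, 15, 12, 30)),
    Change("cart-service", "memory_limit", datetime(2024, 1, 15, 13, 40)),
    Change("auth-service", "token_ttl", datetime(2024, 1, 15, 13, 55)),
]

suspects = suspect_changes(history, incident, {"cart-service"})
for change in suspects:
    print(change.component, change.parameter, change.applied_at.isoformat())
```

Even this crude ranking only works if the change history is complete and granular, which is the gap the article attributes to most CMDBs.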
To be applied effectively to this problem, AI must have the necessary data sets and models. The symptomatic data is readily available from monitoring tools. APIs and REST interfaces make data acquisition from the cloud easy. Yet most CMDBs do not have the data necessary to analyze the impact of misconfigurations and risky changes.
However, the real value comes from the analysis of this data that leads to action to ensure stability, compliance, and security. Deep visibility into all the parameters of a configuration, as well as the changes that are dynamically applied in modern cloud deployments using continuous integration/continuous deployment (CI/CD) processes, is essential.
Commercial CMDBs typically describe their solutions as utilizing AI but, in many cases, they are referring to policies and detecting differences from a baseline and not true AI. The results of their analysis are visualized in a dashboard and an operator is left to decide whether intervention is necessary. A connection from configuration to automation assets is essential but unfortunately is missing in those solutions. As a result, the resolution of a misconfiguration becomes a manual process and one that is unable to keep up with the speed of the cloud…if it is detected at all.
CMDBs are also missing AI-driven automation of change reconciliation between actual changes, change requests, authorized automated deployments, and CI/CD steps. This is essential to correlate detected changes with automated deployments and CI/CD metadata.
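The reconciliation described above can be sketched, in its simplest rule-based form, as a three-way classification. The Python example below is a minimal illustration and does not involve AI. The `DetectedChange` record, the `reconcile` function, and the idea of matching on a `(component, parameter)` pair are assumptions made for the example; real change requests and CI/CD metadata would need far richer matching.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

ChangeKey = Tuple[str, str]  # (component, parameter)


@dataclass(frozen=True)
class DetectedChange:
    """Hypothetical record of a change found in the live environment."""
    component: str
    parameter: str


def reconcile(
    detected: List[DetectedChange],
    approved_requests: Set[ChangeKey],
    pipeline_deployments: Set[ChangeKey],
) -> Dict[str, List[DetectedChange]]:
    """Classify each detected change by where its authorization came from."""
    report: Dict[str, List[DetectedChange]] = {
        "approved": [],      # matches an approved change request
        "automated": [],     # matches an authorized CI/CD deployment
        "unauthorized": [],  # matches neither
    }
    for change in detected:
        key = (change.component, change.parameter)
        if key in approved_requests:
            report["approved"].append(change)
        elif key in pipeline_deployments:
            report["automated"].append(change)
        else:
            report["unauthorized"].append(change)
    return report


# Illustrative data (all values are made up).
detected = [
    DetectedChange("cart-service", "memory_limit"),
    DetectedChange("auth-service", "token_ttl"),
    DetectedChange("firewall", "allow_rule_443"),
]
report = reconcile(
    detected,
    approved_requests={("auth-service", "token_ttl")},
    pipeline_deployments={("cart-service", "memory_limit")},
)
for status, items in report.items():
    print(status, [f"{c.component}.{c.parameter}" for c in items])
```

Here the firewall rule change matches neither source and lands in the `unauthorized` bucket, the kind of change the reconciliation is meant to surface.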
CMDBs also lack AI analysis of code, a critical capability that enables DevOps teams to catch risky changes and misconfigurations before they are deployed to production.
Configuration Risk Intelligence
Proper governance in the cloud must include immediate determination of the impact that configuration change has on compliance, stability, performance, reliability, security, customer experience, and more. Businesses need cloud governance solutions that regulate configuration change and analyze risk. This requires a new repository and service, implemented in the cloud as well as on-premises, that captures and persists the dynamic churn of change and uses AI to determine and score current and future risks. AI is necessary to automate risk assessment from change, as the manual processes typically used impact agility; today’s CMDB tooling does not provide this. As Gartner recently said, “This requires leveraging a data store that handles the unique challenges presented by cloud deployment”. There are tools available that provide support for multi-cloud deployments, but their governance capabilities do not include configuration risk intelligence.
Configuration Governance
As most enterprises will continue to use a hybrid approach even as they explore multi-cloud deployments, there needs to be a dynamic, working integration between their CMDB deployments and the new system for cloud governance of configuration change. The CMDB doesn’t go away; instead, it gets a working partner designed to handle the scale and churn that is endemic to the cloud. The CMDB should also evolve to better handle the cloud’s requirements for speed and agility. For now, enterprises should adopt a bi-modal approach in which mode one (legacy, run-the-business) applications continue to leverage traditional IT processes and the CMDB in the datacenter, while newer mode two (grow-the-business) applications use the automated, AI-driven configuration governance approach. This new aspect of configuration governance depends on automation, AI, and frictionless processes that avoid people dependencies.
Summary
Adopting configuration governance in hybrid, multi-cloud environments will require changes to tools, processes, and staff:
- Leverage tooling that delivers configuration intelligence providing deep discovery of configuration parameters across your entire IT environment, including the cloud.
- Utilize AI to determine the risk that configuration change will impact the stability of IT.
- Employ AI integrated with CI/CD to predict potential risks from the changes in a new release before it is pushed to production.
- Make use of automation to reconcile changes between the new automated, AI-driven configuration governance approach and the legacy CMDB.
- Reduce the toil in your processes and free up your teams to focus on rapidly delivering value to customers via change.
Joni said, “I really don’t know clouds at all”. But you do. Stop letting “clouds get in the way” and begin to establish configuration governance as a key part of your Cloud Ops strategy.
Contact Evolven to see the Evolven Change Control technology in action.