Microsoft Azure’s pay-as-you-go pricing gives organizations substantial flexibility to scale infrastructure on demand, but that flexibility can also quickly become a problem. Without proper governance and monitoring, cloud costs can spiral out of control as multiple teams provision resources independently across the organization.

Traditional FinOps tools often fall short in Azure environments, where decentralized provisioning is the norm. These tools typically focus on reactive cleanup, which involves identifying waste after it has already piled up on your cloud bill. Instead, organizations need a shift toward measurable, ongoing optimization that prevents waste before it occurs while keeping the agility that made cloud adoption attractive in the first place.

Native Azure tools like Azure Cost Management and Azure Advisor provide a solid start for cost visibility and basic recommendations. That said, you can significantly enhance them with external solutions that offer advanced automation, machine learning-driven insights, and cross-functional orchestration. This new generation of FinOps automation technology bridges the gap between insight and action, letting you move from just spotting optimization opportunities to implementing them at scale.

This article explores best practices for Azure cost optimization, focusing on practical strategies that balance saving money with meeting performance requirements. It looks at how to optimize compute resources, implement intelligent storage tiering, leverage Azure’s commitment-based discounts, and establish governance frameworks that enable innovation rather than restricting it.

Azure cost optimization best practices

Establish cost allocation and chargeback: Assign Azure spending to teams and projects through tagging policies and allocation rules to create accountability. Kubernetes environments require namespace-level tracking and showback reports that drive optimization decisions.

Optimize compute resources: Eliminate idle VMs, rightsize instances to match workload needs, and use Azure Hybrid Benefit for licensing savings. Implement shutdown schedules for non-production environments.

Implement storage optimization: Configure lifecycle management to tier data, select storage classes based on access patterns, and delete orphaned disks and snapshots.

Leverage Azure reservations and savings plans: Commit to steady-state usage through Reserved Instances and Azure Savings Plans to get significant discounts over pay-as-you-go pricing for predictable workloads.

Establish cost governance: Deploy guardrails, enforce resource tagging, and set up budget alerts to prevent cost overruns while letting teams innovate safely.

Enable comprehensive monitoring: Use Cost Management tools for granular visibility into spending patterns and optimization opportunities across your Azure estate.

Automate cost controls: Implement automated VM rightsizing, scheduled shutdowns for non-production workloads, and policy enforcement to reduce manual work and speed up implementation.

Set container requests and limits algorithmically: Eliminate manual CPU and memory configuration for Kubernetes workloads by using machine learning that watches actual usage and automatically adjusts resource allocation.

Establishing cost allocation and chargeback

You can’t optimize what doesn’t have ownership. When Azure resources lack clear assignment to specific teams or projects, there’s no accountability for the spending decisions behind them. Orphaned resources accumulate because no one takes responsibility for reviewing whether they’re still needed, and rightsizing decisions stall because no team feels empowered to make changes to resources they don’t technically own. 

Address multi-cloud and Kubernetes allocation 

Azure environments often run alongside other cloud platforms and on-premises infrastructure, which means allocation needs to work across your entire estate. Kubernetes cost allocation presents particular challenges since containers share underlying infrastructure, making it difficult to attribute costs accurately to specific namespaces or applications.

Cloud cost allocation and ownership dashboard example

Effective cloud cost allocation and chargeback systems use tagging policies, resource hierarchy mapping, and algorithmic splitting of shared costs to assign spending to the teams and projects actually consuming resources. Once teams can see their own spending in regular showback reports, they gain both the visibility and motivation to implement the rightsizing, scheduling, and reservation strategies covered in this article.
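As a minimal illustration of the algorithmic shared-cost splitting mentioned above, the sketch below divides a shared Kubernetes cluster bill across namespaces in proportion to measured usage. The namespace names, core-hour figures, and dollar amount are hypothetical illustration values, not a prescribed allocation method:

```python
# Sketch: split a shared cluster bill across namespaces in proportion
# to their CPU consumption. All figures below are illustrative.
def allocate_shared_cost(total_cost, usage_by_namespace):
    """Return each namespace's share of total_cost, proportional to usage."""
    total_usage = sum(usage_by_namespace.values())
    return {
        ns: round(total_cost * usage / total_usage, 2)
        for ns, usage in usage_by_namespace.items()
    }

cpu_core_hours = {"checkout": 500, "search": 300, "batch-jobs": 200}
shares = allocate_shared_cost(1000.00, cpu_core_hours)
# "checkout" carries half the bill because it consumed half the core-hours
```

The same proportional logic extends to memory, storage, or blended usage metrics; the key design choice is that every dollar of the shared bill lands on exactly one owner.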

Optimize compute resources

Virtual machines represent the biggest chunk of Azure spending for most organizations. The challenge is that engineering teams often overprovision resources to avoid performance issues, while dev/test environments run 24/7 even though they're only needed during business hours.

Identify and eliminate idle VMs

Start by finding VMs that are consistently underutilized. Azure Advisor offers recommendations based on CPU usage, but this approach has limitations: It doesn’t factor in memory pressure, disk I/O, or network throughput. A more comprehensive analysis should examine multiple metrics over at least 30 days to capture full usage patterns and avoid false positives from seasonal variations.

Custom queries using Azure Monitor can help identify idle resources, like this:

Perf
| where TimeGenerated > ago(30d)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| where InstanceName == "_Total"
| summarize AvgCPU = avg(CounterValue) by Computer
| where AvgCPU < 5
| project Computer, AvgCPU

Alternatively, CloudBolt’s Policy Configurations feature offers a simple drag-and-drop interface for automating the elimination of idle VMs.

The Waste Signal Configuration option (also known simply as Signal Configuration) lets you customize the telemetry the CloudBolt Platform uses to identify underutilized resources. Policy Configuration then uses this data to identify and recommend actions for specific services, and you can define multiple waste signal configurations for a single service type.

Example of CloudBolt’s Waste Management signals configuration for automated idle resource elimination

Rightsize overprovisioned instances

Rightsizing involves analyzing CPU, memory, disk, and network metrics to determine if a smaller VM SKU would still support the workload. The key is to look beyond average utilization and examine peak usage patterns; as an example, a VM running at 20% average CPU but spiking to 90% during business hours probably isn’t a good candidate for downsizing.
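The peak-aware check described above can be sketched as follows. The thresholds and sample data are illustrative assumptions, not Azure-recommended values:

```python
# Sketch: flag a VM as a downsizing candidate only when both its average
# and its 95th-percentile CPU stay low. Thresholds are illustrative.
def is_downsize_candidate(cpu_samples, avg_threshold=20.0, p95_threshold=50.0):
    samples = sorted(cpu_samples)
    avg = sum(samples) / len(samples)
    p95 = samples[int(0.95 * (len(samples) - 1))]
    return avg < avg_threshold and p95 < p95_threshold

# A VM with a low average but 90% spikes during business hours is kept as-is
spiky = [5] * 90 + [90] * 10   # low average, high peaks -> not a candidate
flat  = [8] * 100              # low average, low peaks  -> candidate
```

Requiring both conditions is what prevents the false positive in the 20%-average, 90%-peak example above.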

When rightsizing, think about moving within the same VM series (e.g., from D4s_v3 to D2s_v3) or across series if the workload characteristics have changed. For instance, memory-intensive apps might do better on the E-series, while compute-intensive workloads should use the F-series.

Implement shutdown schedules

Development and test environments typically don't need to run outside business hours. Implementing automated shutdown schedules can reduce compute costs by about 65-75% for these workloads. Azure Automation runbooks can schedule VM start/stop operations, as shown in this PowerShell script.

# Stop every VM tagged Environment = Dev in the Dev-RG resource group
$VMs = Get-AzVM -ResourceGroupName "Dev-RG"
foreach ($VM in $VMs) {
    if ($VM.Tags["Environment"] -eq "Dev") {
        Stop-AzVM -Name $VM.Name -ResourceGroupName $VM.ResourceGroupName -Force
    }
}

Proper, continuous workload optimization is a complex task: when you configure it with default Azure features alone, you must account for factors such as time of day, day of the week, holidays, and IOPS. CloudBolt's Waste Signals feature helps automate this kind of optimization instead.

Example of the Idle AWS VMs Waste Signal configuration in the CloudBolt portal

Leverage the hybrid benefits of Azure

Organizations with existing Windows Server and SQL Server licenses that include Software Assurance can apply these licenses to Azure VMs, which cuts costs by up to 40% for Windows VMs and as much as 55% for SQL Server workloads. You can combine this benefit with Reserved Instances for additional savings.
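To see how the two discounts stack, here is a back-of-the-envelope calculation. The hourly rate, license share, and RI discount below are assumed figures for illustration, not published Azure prices:

```python
# Sketch: stacking Azure Hybrid Benefit (AHB) with a Reserved Instance.
# AHB removes the Windows license portion of the hourly rate; the RI
# discount then applies to the remaining compute portion.
payg_rate = 0.40        # $/hour, Windows VM pay-as-you-go (assumed)
license_share = 0.30    # fraction of the rate attributable to the license (assumed)
ri_discount = 0.60      # reservation discount on the compute portion (assumed)

compute_rate = payg_rate * (1 - license_share)     # AHB strips the license cost
effective_rate = compute_rate * (1 - ri_discount)  # RI discounts what's left
savings_pct = (1 - effective_rate / payg_rate) * 100
# $0.40 -> $0.28 after AHB -> $0.112 after the RI, i.e. 72% off pay-as-you-go
```

The point of the sketch is that the discounts compound multiplicatively on different portions of the rate rather than simply adding together.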

CloudBolt’s automation capabilities can speed up these optimization efforts by continuously monitoring utilization patterns, generating tailored recommendations, and orchestrating the implementation of changes through integrated workflows. The platform’s conversational AI interface lets teams query their environments naturally. For example, you can ask “which VMs in the production subscription have been idle for more than 14 days?” and then immediately act on the results.

Implement storage optimization

Azure Storage costs come from several places: the volume of data stored, the storage tier you select, transactions performed, and data egress. Optimizing storage requires a strategic approach that balances access needs with cost efficiency.

Configure intelligent tiering

Azure Blob Storage offers four access tiers: hot, cool, cold, and archive. Each tier is designed for different access patterns and is priced accordingly. The hot tier costs more to store data but less to access it, while archive tier storage is the cheapest but has higher retrieval costs and latency.
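The storage-versus-access trade-off can be made concrete with a small cost model. The per-GB and per-read prices below are illustrative placeholders, not current Azure list prices:

```python
# Sketch: compare monthly blob cost across tiers for a given access
# pattern. Prices are hypothetical, chosen only to show the trade-off.
TIERS = {
    #            $/GB-month  $/10k read operations
    "hot":     (0.018,       0.004),
    "cool":    (0.010,       0.010),
    "archive": (0.002,       5.000),
}

def monthly_cost(tier, gb_stored, reads):
    storage_price, read_price = TIERS[tier]
    return gb_stored * storage_price + (reads / 10_000) * read_price

# 1 TB of backups read ~100 times a month: archive wins despite read fees
costs = {tier: round(monthly_cost(tier, 1024, 100), 2) for tier in TIERS}
```

Rerunning the same model with a read-heavy pattern flips the ranking, which is exactly why tier selection should follow measured access patterns rather than storage price alone.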

Lifecycle management policies automate moving data between tiers based on rules you define, as seen in this JSON example.

{
  "rules": [
    {
      "name": "moveToArchive",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["backups/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 30
            },
            "tierToArchive": {
              "daysAfterModificationGreaterThan": 90
            }
          }
        }
      }
    }
  ]
}

Note that Azure lifecycle policies automatically move data down to cooler tiers based on modification time. However, accessing a file doesn’t automatically promote it back to a hotter tier; you’ll need to either manually move it or set up separate promotion policies based on access patterns.

Identify and eliminate storage waste

Orphaned managed disks (those not attached to any VM) keep racking up charges even though they provide no value. They usually happen when VMs are deleted but the “Delete with VM” option wasn’t checked. Similarly, old snapshots created for backups but never cleaned up can eat up a big part of your storage costs. Consider whether snapshots are even necessary for VMs and disks that can be easily recreated through automation or infrastructure as code; eliminating them entirely provides the most cost savings.

A regular audit process should surface these resources; Azure Resource Graph queries like this one can help.

Resources
| where type == "microsoft.compute/disks"
| where properties.diskState == "Unattached"
| project name, resourceGroup, location, properties.diskSizeGB

Premium storage should also be carefully evaluated. Many workloads provisioned on Premium SSDs don’t actually require that level of performance and could be migrated to Standard SSD or even Standard HDD for non-critical data, resulting in cost reductions of 50-80%.

To address this, CloudBolt’s Waste Signals support scanning various cloud storage components, such as idle or detached block storage volumes and volume snapshots. 

Waste Signals configuration to spot idle and detached block storage volumes and snapshots

Leverage Azure reservations and savings plans

Commitment-based discounts are one of the most impactful ways to optimize Azure costs, but they require careful analysis to avoid overcommitting or underutilizing your purchase.

Reserved instances for predictable workloads

Azure Reserved Instances (RIs) provide up to 72% savings over pay-as-you-go pricing. The trade-off is the need for a one-year or three-year commitment to specific VM instances in a particular region. The key to successful RI management is finding steady-state workloads that will run consistently throughout that commitment period.

Analyze your usage patterns over the past 30-60 days to find candidates. Look for VMs that run 24/7 in production environments. Calculate the breakeven point; typically, a VM needs to run approximately 60-70% of the time for a reservation to be more cost-effective than pay-as-you-go.
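The break-even logic is simple enough to express directly: with a discount d off pay-as-you-go pricing, a reservation pays for itself once the VM runs more than a (1 - d) fraction of the time. A sketch, with illustrative discount values:

```python
# Sketch: reservation break-even as a fraction of hours a VM must run.
# You pay (1 - d) of the full-time pay-as-you-go price no matter what,
# so PAYG only wins while utilization stays below that fraction.
def breakeven_utilization(ri_discount):
    """Minimum fraction of hours a VM must run for an RI to beat PAYG."""
    return 1 - ri_discount

# e.g. a 35% discount needs ~65% uptime to break even,
# while a 72% discount breaks even at only ~28% uptime
```

This is why the 60-70% rule of thumb above corresponds to typical one-year discount levels; deeper three-year discounts break even at much lower utilization.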

Azure Savings Plans provide flexibility

Azure Savings Plans offer more flexibility than RIs by applying discounts to compute usage across VM series, regions, and even services like Azure App Service and Azure Functions. In exchange for committing to spend a certain dollar amount per hour, you get a discount of up to 65%.

Savings Plans work well for organizations with dynamic workloads that might shift between regions or instance types. The trade-off for the significantly greater flexibility is slightly lower maximum savings compared to RIs.

Guidance on avoiding overcommitment

The biggest risk is buying more capacity than you’ll actually use. To mitigate this:

  • Start conservatively by committing to 60–70% of your baseline usage.
  • Use one-year terms with a monthly payment (no extra cost) initially to maintain flexibility and limit up-front costs.
  • Monitor utilization monthly and adjust future purchases.
  • Account for planned migrations or decommissions.
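A conservative hourly commitment can be derived straight from usage history, as in this sketch. The 65% coverage factor and the spend samples are illustrative assumptions:

```python
# Sketch: size a savings-plan commitment against the observed baseline
# (the minimum hourly spend), not the average or peak, then apply a
# conservative coverage factor. All figures are illustrative.
def conservative_commitment(hourly_spend, coverage=0.65):
    baseline = min(hourly_spend)   # steady-state floor of compute spend
    return round(baseline * coverage, 2)

hourly_compute_spend = [120, 95, 110, 88, 130, 92]   # $/hour samples
commit = conservative_commitment(hourly_compute_spend)
# commits to $57.20/hour: 65% of the $88/hour floor
```

Anchoring on the floor rather than the average means the commitment stays fully utilized even during the quietest hours, and future purchases can ratchet coverage upward as confidence grows.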

CloudBolt’s reservation management capabilities continuously monitor commitment utilization. They proactively alert teams when reservations are underutilized or when new purchase opportunities arise based on usage trends. 

CloudBolt’s Anomaly Detection and Waste Triggers help streamline commitment utilization management.

The platform’s recommendation engine factors in historical patterns, seasonal variations, and planned infrastructure changes to make the best commitment decisions.

Establish cost governance

Governance isn’t about restricting teams but rather enabling them to innovate safely and quickly within clearly defined guardrails. This philosophy fits perfectly with CloudBolt’s mission to help organizations move faster while keeping financial control.

CloudBolt’s FOCUS-native platform automatically maps costs across Azure, Kubernetes, and other clouds, eliminating the need to manage tags manually.

Implement proactive cost controls

Azure Policy lets you enforce organizational standards and assess compliance at scale. For cost optimization, policies can restrict VM sizes to approved SKUs, require specific tags for cost allocation, or prevent deployment to expensive regions, as shown below.

{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Compute/virtualMachines"
      },
      {
        "field": "Microsoft.Compute/virtualMachines/sku.name",
        "notIn": ["Standard_D2s_v3", "Standard_D4s_v3", "Standard_E2s_v3"]
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}

Resource tagging strategy

A consistent tagging strategy is crucial for accurate cost allocation and chargeback. Essential tags include the following:

  • Environment: Prod, dev, test, staging
  • Application: The application name or identifier
  • Owner: The team or individual responsible
  • CostCenter: For chargeback and showback
  • Project: For project-based cost tracking

Enforce tagging through Azure Policy and use tag inheritance to automatically propagate tags from resource groups to child resources.

Budget alerts and showback models

Azure Cost Management budgets will alert you when spending exceeds thresholds, but they’re reactive by nature. Combine budgets with predictive analytics to forecast spending and take preemptive action. Set up multi-tier alerts (e.g., at 50%, 75%, and 90% of the budget) with escalating response procedures.
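The multi-tier alert scheme can be sketched as a simple escalation table. The thresholds mirror the 50/75/90% tiers above; the response actions are illustrative examples, not prescribed procedures:

```python
# Sketch: escalating budget alerts. Highest tier is checked first so the
# most severe threshold crossed determines the response.
ALERT_TIERS = [
    (0.90, "page the service owner and freeze non-essential deploys"),
    (0.75, "notify the team lead and review open provisioning requests"),
    (0.50, "post a heads-up in the team channel"),
]

def budget_alert(spend, budget):
    """Return the escalation action for the highest threshold crossed."""
    ratio = spend / budget
    for threshold, action in ALERT_TIERS:
        if ratio >= threshold:
            return action
    return None

# e.g. 80% of a $10,000 budget lands in the 75% tier
```

Wiring each tier to a different audience (engineer, team lead, finance) is what turns a reactive alert into an escalating response procedure.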

Showback and chargeback models create accountability by attributing costs to specific teams or business units. This transparency encourages teams to optimize their own spending while understanding the true cost of their cloud consumption.

CloudBolt’s native reporting dashboard showcases predictive budgeting and real-time alerts.

CloudBolt enhances native governance capabilities by providing a unified platform for policy enforcement across hybrid environments. Its conversational AI interface lets stakeholders at different organizational levels access relevant cost insights. Engineers see their resource consumption, while financial managers view budget variances by department.

Enable comprehensive monitoring

Visibility is the foundation of optimization. Native Azure tools offer good cost visibility within Azure, but organizations operating in hybrid or multi-cloud environments need more comprehensive solutions.

Limitations of native tools

Azure Cost Management offers robust cost analysis and reporting within the Azure ecosystem. However, it has limitations for organizations with complex environments:

  • No visibility into on-premises infrastructure costs
  • Limited integration with third-party SaaS spending
  • Basic forecasting capabilities that don’t account for complex scenarios
  • Manual processes for translating insights into actions

Third-party solutions for visibility and automation

External platforms like CloudBolt address the limitations of Azure Cost Management by giving you a unified view across cloud and non-cloud spending categories. The platform takes in data from Azure, on-premises infrastructure, and SaaS services to create a comprehensive financial picture.

CloudBolt helps implement accurate, automated and integrated chargeback tailored to different stakeholder groups.

CloudBolt’s AI-powered analytics deliver real-time insights tailored to different stakeholder groups. Here are some example queries:

  • “Break down my Azure costs by application and environment for the last quarter.”
  • “Which services drove the 23% increase in my subscription costs this month?”
  • “Show me all resources tagged to the finance department running in premium storage.”

This level of granular, queryable visibility accelerates the path from identifying opportunities to realizing savings.

Automate cost controls

Manual optimization processes simply don’t scale. As cloud environments get more complex, automation becomes essential for maintaining cost efficiency without overwhelming your teams.

Scheduled automation for non-prod workloads

Automated start/stop schedules for development and test environments provide immediate ROI with minimal risk. Azure Automation accounts can run PowerShell or Python scripts on defined schedules, while Logic Apps can orchestrate more complex workflows that consider dependencies between resources.

Continuous optimization loops

Instead of treating optimization as a quarterly review, implement continuous feedback mechanisms. Monitor resource utilization in real time, automatically generate recommendations when thresholds are exceeded, and track the implementation of changes to validate savings.

Self-service reporting aligns fully with your unique financial logic and cost allocation methodologies.

CloudBolt’s automation capabilities go beyond basic scheduling to include intelligent orchestration. The platform can automatically implement low-risk optimizations (like snapshot cleanup) while routing higher-risk changes (like VM rightsizing) through approval workflows integrated with ITSM tools like ServiceNow or Jira.

Set requests and limits algorithmically

For organizations running containerized workloads on Azure Kubernetes Service (AKS), resource requests and limits present a unique optimization challenge. Setting these values manually creates a three-way failure pattern: omitting resource requests causes performance issues, one-size-fits-all configurations waste money, and per-workload manual tuning creates developer toil.

Traditional approaches fail because they can’t adapt to changing application behavior. For example, an application that needs 2 GB of memory during normal operation might require 4 GB during month-end processing. Static configuration either overallocates for typical usage or underallocates for peak periods.
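A simplified version of the percentile-based approach such tools take might look like this. The 90th-percentile request, 25% limit headroom, and usage samples are illustrative assumptions, not StormForge's actual algorithm:

```python
# Sketch: derive container memory requests/limits from observed usage
# rather than static guesses. Request covers typical usage (p90);
# limit adds headroom over the observed peak. Parameters are assumed.
def recommend_memory(usage_mb, request_pct=0.90, limit_headroom=1.25):
    samples = sorted(usage_mb)
    request = samples[int(request_pct * (len(samples) - 1))]
    limit = int(max(samples) * limit_headroom)
    return {"request_mb": request, "limit_mb": limit}

# Mostly ~2 GB of daily usage with a month-end spike to 4 GB
usage = [2048] * 27 + [4096] * 3
rec = recommend_memory(usage)
# request tracks typical usage; limit leaves room for the spike
```

Rerunning the recommendation on a rolling window is what lets the values adapt when the application's behavior changes, which is exactly where static configuration fails.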

Machine-learning-based solutions observe actual resource consumption patterns over time and automatically adjust requests and limits without developer intervention. CloudBolt (through its StormForge acquisition) provides this capability for AKS workloads, going beyond Azure's native autoscaling to deliver intelligent, ML-driven resource optimization. This eliminates manual request setting while ensuring applications get the resources they need, when they need them. The result is often a 40-60% improvement in utilization without performance degradation. You can learn more about using StormForge and Karpenter by reading this article or by visiting the product's Sandbox environment for an interactive demonstration:

StormForge’s cost-saving Overview Dashboard

Final thoughts

Comprehensive Azure cost optimization requires a systematic approach that combines native tools with advanced capabilities. While Azure provides solid foundational tools for cost visibility and basic recommendations, achieving sustained optimization at enterprise scale demands more sophisticated solutions.

The key is moving from reactive cost management—cleaning up waste after it shows up on your bill—to predictive optimization that prevents waste through proactive governance, intelligent automation, and continuous feedback loops. This transformation requires three elements: accurate visibility into spending patterns, actionable intelligence about optimization opportunities, and automated mechanisms to implement changes at scale.

CloudBolt’s platform enables this complete cloud lifecycle optimization by bringing together AI-powered insights, conversational interfaces for natural interaction with financial data, and intelligent automation that speeds up the journey from recommendation to implementation. The platform’s hybrid cloud capabilities ensure that optimization efforts extend beyond Azure to encompass the entire IT estate, giving you a single source of truth for technology spending.

Organizations that embrace this approach move beyond simple cost cutting to achieve true cost optimization, maintaining or improving performance while reducing spending and freeing up resources to invest in innovation and business differentiation. In an environment where cloud spending keeps growing rapidly, this capability is essential for staying competitive.

Solve your cloud ROI problem

See for yourself how CloudBolt’s full lifecycle approach can help you.

Request a demo
