BACK

GKE Pricing: Schemes and Optimization Strategies

by: CloudBolt / January, 26 2024

Explore the chapters:

Kubernetes is an open-source platform developed by Google for managing containerized workloads. Containerization solves the problem of traditional application deployment by making apps independent of any underlying infrastructure. Since they are loosely coupled, applications become easy to replicate across cloud providers and operating system (OS) distributions. Hosted Kubernetes – A Strategic Overview provides more context on the history and inner workings of Kubernetes.

Google Kubernetes Engine (GKE) is a Kubernetes platform for easy application scaling and maintenance. There are two operational modes: autopilot and standard. The former manages the entire cluster and node infrastructure, whereas the latter provides configuration flexibility and complete control over how clusters and nodes are handled. Both modes have their pricing schemes, which are the subject of this article.

Executive summary

We will be discussing the following concepts, summarized below for your reference:

Standard mode	allows users to control the underlying node infrastructure
Autopilot mode	provisions and manages node infrastructure based on the user’s requirements
Cluster management fee	flat fee paid per hour regardless of cluster mode, topology, and regionality
Multi-cluster-ingress	mechanism that allows load-balancing amongst clusters in different regions
GKE Backups	service for taking and preserving cluster snapshots

GKE pricing

The pricing for GKE is driven primarily by the chosen mode (Standard or Autopilot). Standard is charged based on the configuration of the machines being used and is subject to Compute Engine pricing. Autopilot has a more defined price table where the user is unconcerned with technical details. Regardless of mode, there will always be a cluster management fee. When applicable, there are also multi-cluster ingress and GKE backup fees.

Note that there are slight differences in pricing between regions. All price tables in this article will be based on Iowa (us-central1). The cluster management fee of $0.10 per hour applies to all GKE clusters.

Standard mode

Standard mode allows users to configure clusters and their underlying infrastructure. Users have maximum freedom but are responsible for cluster management.

Cluster topology can be zonal or regional. The Zonal cluster control plane runs in the same zone as their nodes and – unless they are multi-zonal – have no replicas. Regional clusters have node replicas across multiple zones. The topology does not directly affect pricing but does impact availability.

Since standard mode uses Compute Engine instances as worker nodes, billing is primarily dictated by Compute Engine pricing. The default pricing scheme is on-demand, where instances are billed per second. Users can obtain price reductions using committed-use discounts, agreeing to a one-year or three-year contract, or via spot instances.

A machine’s pricing depends on its type, which caters to specific use cases. The five machine types describe how many vCPUs and how much memory is allocated to a VM and are as follows:

General-purpose
Compute-optimized
Memory-optimized
Accelerator-optimized
Shared-core

The pricing for each machine type differs and can be found here. The price table you can see below is for the E2 machine type.

Item	On-demand price	Spot price	1-year commitment price	3-year commitment price
Predefined vCPUs	0.021811 per vCPU per hour	0.006543 per vCPU per hour	0.013741 per vCPU per hour	0.009815 per vCPU per hour
Predefined memory	0.002923 per GB per hour	0.000877 per GB per hour	0.001842 per GB per hour	0.001316 per GB per hour

Stop Setting Kubernetes Requests and Limits

LEARN HOW

Autopilot mode

This mode fully handles the management of clusters and can be used to deploy and maintain them quickly and easily. However, this mode provides little flexibility in terms of the underlying infrastructure.

Autopilot mode also features Spot Pods for fault-tolerant workloads, which work precisely like Spot VMs. In this case, you can run pods at a lower price but at the risk of pods being evicted if other standard pods require the resources.

Concerning pricing, Standard mode operates on a pay-per-node basis, and Autopilot runs on a pay-per-pod basis. Users are billed based on the pods’ CPU, memory, and ephemeral storage requests.

There are two compute classes for Autopilot pods:

general-purpose
scale-out

Users can specify the compute class they wish to use for their workloads. General-purpose is the default class and is best used for medium-intensity workloads such as web servers, small to medium databases, or development environments.

Scale-out machines have simultaneous multi-threading turned off and are optimized for scale. They best suit intense workloads like containerized microservices, data log processing, or large-scale Java applications.

The price table for General-purpose pods can be found below.

Item	Regular price	Spot price	1-year commitment price	3-year commitment price
vCPU per hour	0.0445	0.0133	0.0356	0.0244750
Pod Memory per GB-hour	0.0049225	0.0014767	0.0039380	0.0027074
Ephemeral Storage GB-hour	0.0000548	0.0000548	0.0000438	0.0000301

For comparison, here is the price table for Scale-out compute class pods.

Item	Regular price	Spot price	1-year commitment price	3-year commitment price
Arm Pod vCPU per hour	0.0356	0.0107	0.02848	0.01958
Arm Pod Memory per GB-hour	0.003938	0.0011814	0.0031504	0.0021659
Pod vCPU per hour	0.0561	0.0168	0.04488	0.030855
Pod Memory per GB-hour	0.0062023	0.0018607	0.0049618	0.0034113

Multi-cluster ingress

A multi-cluster ingress is a service that load-balances traffic across Kubernetes clusters in different regions. Anthos is a Google Cloud service that consolidates other services that can be either internal or external to Google Cloud. If the Anthos API is enabled, then Multi-Cluster Ingress is billed under Anthos. Otherwise, standalone pricing applies at $0.0041096 per backend Pod per hour.

Backups

The billing for a GKE backup depends on two dimensions:

backup management
backup storage

Backup management costs $1 per month per pod, and storage costs $0.028 per GB per month. So if a user backs up 10 pods on a certain month with 20 GB worth of data, the calculation would be as follows:

Backup management fee = $1 x 10 pods = $10 per month

Backup storage fee = $0.028 x 20 = $0.56 per month

Total = $10 + $0.56 = $10.56 per month

Example pricing scenario and calculation

A user would like to provision a node pool for their web application. It should be available 24/7 with little to no downtime. The expected workload is not too heavy, so a machine with four vCPUs and 15GB RAM should be able to handle all requests. The user would like to compare the cost difference between GKE Standard and GKE Autopilot.

Here is what the cost breakdown looks like based on the GCP pricing calculator:

Image shows the pricing for a GKE Autopilot based on the user’s requirements

Based on the compute engine price table, an n1-standard-4 instance costs $0.189999 per hour. The sustained use discount applies because our clusters are running 24/7. The estimated discount is around 30%.

Base price: 2 instances x 730 hours x $0.189999 = $277.39854

After discount: $277.39854 x 0.7 = $194.18 per month

The cost breakdown for the equivalent Autopilot version would be:

Image shows estimated cost breakdown for GKE Autopilot

Note that we are assuming the same number of resources as the n1-standard-4 machine type used in the Standard mode calculation. 2 instances with 4 vCPUs and 15 GB memory. Ephemeral storage is set to 0.5 GB.

vCPU: 2 instances x 4 vCPUs x 730 hours x $0.0445 = $259.88

Memory: 2 instances x 15 GB x 730 hours x $0.0049225 = $107.80275

Ephemeral storage: 2 instances x 0.5 GB x 730 hours x $0.0000548 = $0.04

Total: $259.88 + $107.80275 + $0.04 = $367.72 per month

Comparison across cloud providers

AWS EKS (Elastic Kubernetes Service)

There are two main orchestration tools for containerized applications on AWS:

Each orchestration tool has two hosting types:

AWS EC2: the counterpart of GCP Compute Engine
AWS Fargate: the counterpart of GCP GKE Autopilot

A diagram illustrating the Kubernetes orchestration tools and hosting types on AWS.

The main difference between EKS and ECS is that the former is built explicitly on Kubernetes, whereas the latter is explicitly built for AWS. That means that the direct counterpart for GCP GKE is AWS EKS.

There is a $0.10 per cluster per hour flat fee similar to GCP GKE’s cluster management fee. If the hosting type used is Amazon EC2, then billing would depend on EC2 pricing, identical to GCP GKE’s Standard mode, where the cost ultimately depends on the VMs.

AWS EC2 has a pricing table for each instance type wherein per-second billing applies. There are also pricing structures such as on-demand, savings plans, reserved, and spot instances.

AWS Fargate is very similar to GKE Autopilot because it follows the principle of only paying for resources. The pricing table is as follows:

Per vCPU per hour	$0.012144
Per GB per hour	$0.0013335

Here is a sample comparison between AWS Fargate and GKE Autopilot based on regular pricing.

Resource	Quantity	Hours per month	AWS Fargate rate	GKE Autopilot rate	AWS Fargate total per month	GKE Autopilot total
vCPU	4	730	$0.012144	$0.0445	$35.46048	$129.94
GB	8	730	$0.0013335	$0.0049225	$7.78764	$28.7474

Azure AKS (Azure Kubernetes Service)

Unlike GCP and AWS, Azure AKS does not have a cluster management fee. However, there is an option to pay for an Uptime SLA, which costs $0.10 per cluster per hour. There is no fully managed service similar to GKE Autopilot or AWS Fargate; therefore, the Azure AKS cost is based solely on VM pricing. Users can save money by using 1-year or 3-year instances and spot instances.

Savings vehicle	Estimated savings
1-year reservation	48%
3-year reservation	65%
Spot instances	90%

Recommendations for cost optimization

The following are best practices for cost optimization recommended by Google.

A summary of Google recommended best practices for cost optimization on GKE (source)

Automate K8s autoscaling with machine learning

Learn More

Solution	Rightsizing recommendations	Automation	Fully compatible with HPA	Powered by machine learning	Historical metrics analysis	Trend forecasting
VPA
StormForge

Monitor your spending

Cost optimization ultimately boils down to proper monitoring. Features like recommendations, alerts, and quotas keep projects within budget. For GKE specifically, you have the following options:

use the monitoring dashboard
refer to GKE usage metering
use Metrics Server to examine metrics that determine autoscaling behavior
use quotas
use Anthos Policy Controller
enforce policy constraints in your CI/CD pipeline

One third-party tool made for this purpose is CloudBolt.

Configure Kubernetes applications correctly

The best features of GKE are also the most cost-effective. For instance, vertical and horizontal autoscaling is the most reliable cost-saving mechanism because it’s automatically optimized for the current workload. To start though, you must understand the application’s concurrency by undergoing stress tests. Stress tests help reveal the most likely scenarios cost-wise during peak usage.

Next, once that baseline has been established, you can define the PodSpec accurately (if using Autopilot) or choose the most appropriate machine type (if using Standard). That way, resource allocation does not become a hit-or-miss affair. If the initial configurations aren’t precisely correct, it shouldn’t be a problem because, in the cloud, it’s easy to reconfigure and scale.

Use GKE cost-optimization features and options

GKE itself offers multiple ways to save on costs. Aside from fine-tuning autoscaling based on the baseline mentioned above, you can:

locate clusters in cheaper regions provided that latency is bearable
leverage committed-use discounts
use a multi-tenancy strategy for smaller clusters to save on add-on costs
review how clusters in different regions send and receive traffic

Promote a cost-saving culture

Cost optimizations can come from promoting a frugal mindset, so employees should understand costing mechanics to help enforce efficiency. To cement the point, users should know how charges are calculated to recognize how their decisions can contribute to overall costs.

Experience StormForge in a sandbox – no email required

Access Sandbox

Conclusion

To recap, GCP GKE has two modes: standard and autopilot. Standard allows users to configure their underlying infrastructure, and billing depends on the machine types of the worker nodes. The latter, Autopilot, is a fully managed service where users only have to define a PodSpec and don’t have to worry about other technical details. The billing for GKE Autopilot depends on how much vCPU and memory are requested.

Several GKE features contribute to cost-effectivenesses, such as autoscaling, load-balancing, and monitoring. One example is the option to use Spot VMs/Pods, where users can benefit from cheaper pricing provided their workloads are fault-tolerant. Whatever options you choose, promoting a sense of efficiency and an understanding of cost structures helps developers contribute to cost-cutting processes and allows for the best utilization of funds.

Explore the chapters:

Related Blogs

Cloud Management Platform Roadmap Webinar

Get the latest on CloudBolt’s Cloud Management Platform (CMP)! Discover new hypervisor support (OpenShift & Oracle Linux), enhanced source control,…

Ensuring Budget Adherence

Simply having visibility isn’t enough. To KEEP costs in line, you need to shift left and continuously prevent costly new…

Large US insurance provider chooses CloudBolt to navigate VMware acquisition

With the recent acquisition of VMware by Broadcom and with many changes looming in the near future, this large US…

The End of Manual Optimization: Why We Acquired StormForge

From our leaders: StormForge joins CloudBolt

CloudBolt Software Listed in AWS “ICMP” for the US Federal Government

StormForge is now part of CloudBolt

State of FinOps 2025: 5 Expert Takeaways from Our Live Panel Discussion

Delivering on Augmented FinOps