BigQuery ML (BQ ML) is a BigQuery feature that allows you to create, test, and tune Machine Learning (ML) models using standard SQL queries. It supports widely used models such as linear regression for forecasting, K-means clustering for data segmentation, and Deep Neural Networks (DNNs). Google Cloud’s AutoML and Vertex AI can create the same models. However, one would typically choose BQ ML when the training data already resides in BigQuery or when Data Analysts prefer a self-service approach to building models.

BQ ML democratizes ML by allowing Data Analysts without extensive training in Data Science to create and use models based on business data. Development time is also significantly reduced because the data used for model creation already exists within the same data warehouse. 

For these reasons, many organizations migrate their analytic workloads exclusively to BigQuery. However, the most critical question remains: What will it cost?

There are two usage patterns: on-demand pricing and flat-rate pricing. Understanding these options is fundamental to creating a well-planned BQ ML cost strategy. Other aspects of Machine Learning Operations (MLOps), such as hyperparameter tuning, model deployment, evaluation, inference, and feature processing, can also be carried out within BQ ML. They are billed as queries, based on the storage used and the amount of data processed.

Executive Summary

This article will explain the following concepts, which we have summarized here for your easy reference:

| Pricing structure | Description |
|---|---|
| Free tier | The following parameters have free usage limits: storage, data processed, data inserted, and data processed by CREATE MODEL queries. Once the free usage limits have been surpassed, the user starts to incur costs according to on-demand pricing. |
| On-demand pricing | Under on-demand pricing, operations are billed as-is. Processes such as model creation, evaluation, inspection, and prediction incur costs when run ad hoc. Costs also differ based on the model type. Examples of model types include logistic regression, k-means clustering, and DNN. |
| Flat-rate pricing | Flat-rate pricing is the primary savings vehicle for BigQuery workloads. It works by purchasing commitments and assigning those commitments to Google Cloud projects. A commitment is a purchase of dedicated BigQuery slots, and a slot is a unit of query processing capacity. |

BQ ML pricing

Pricing usage patterns can be categorized into: 

  • On-demand pricing
  • Flat-rate pricing

Note that a free tier also applies to BQ ML: operations are free up to certain monthly limits. On-demand pricing refers to queries that are billed as-is with no special discounts. Customers can use flat-rate pricing to save money when the number of monthly queries is predictable.

BQ ML models also fall into two types:

  • Built-in models
  • External models

A built-in model is trained within BigQuery, while an external model is trained using other Google Cloud services like Vertex AI and AutoML. The pricing for built-in models depends on the model and operation type. With external models, pricing still depends on the model and operation type, plus the cost of any Google Cloud services used outside of BQ ML, such as Vertex AI and AutoML.

Free tier

The free tier covers four categories of usage:

  • Storage
  • Model prediction, inspection, and evaluation queries
  • Use of BigQuery Storage Write API
  • Model creation queries

| Parameter | Free usage limits |
|---|---|
| Storage | The first 10 GB per month is free |
| Model prediction, inspection, and evaluation queries | The first 1 TB of query data processed per month is free |
| BigQuery Storage Write API | The first 2 TB of data inserted into BigQuery via the API per month is free |
| Model creation queries | The first 10 GB of data processed by CREATE MODEL queries per month is free |

If your usage is well below the free-tier threshold, there’s a good chance you’re not paying for BigQuery. However, once you start scaling your services, you will likely notice that your bill is increasing, which may result from on-demand pricing.
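
To make the free-tier mechanics concrete, here is a minimal Python sketch that subtracts a monthly allowance from measured usage. The function name and the sample usage figures are hypothetical; only the storage and query allowances come from the free-tier limits described above.

```python
# Free monthly allowances for storage and query processing, as listed above.
FREE_STORAGE_GB = 10.0
FREE_QUERY_TB = 1.0


def billable(usage: float, free_allowance: float) -> float:
    """Return the usage that exceeds the free allowance (never negative)."""
    return max(0.0, usage - free_allowance)


# Hypothetical month: 25 GB stored and 0.4 TB scanned by prediction queries.
print(billable(25.0, FREE_STORAGE_GB))  # 15.0 (GB billed)
print(billable(0.4, FREE_QUERY_TB))     # 0.0 (still inside the free tier)
```

Only the portion above the allowance is charged, which is why small workloads often see no BigQuery charges at all.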

On-demand pricing

This pricing comes on an as-is basis, and the actual costs incurred depend on the operations you carry out. The billable BQ ML operations are as follows:

  • Model creation
  • Model evaluation
  • Model inspection
  • Model prediction

Note that model creation for Matrix Factorization, the most commonly used model for recommender systems, is not supported under on-demand pricing and is available only with flat-rate pricing.

| Operation type | Model type | Pricing |
|---|---|---|
| Model creation | Logistic regression, linear regression, k-means clustering, time-series | $250 per TB |
| Model creation | AutoML Tables, DNN, boosted tree | $5 per TB, plus Vertex AI training cost |
| Model evaluation, model inspection, model prediction | All types | $5 per TB of data processed |
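
As a rough illustration of how these rates combine, the Python sketch below estimates an on-demand charge from the operation, the model type, and the terabytes processed. The dictionary keys and function name are hypothetical labels chosen for this example. The estimate ignores the free tier and does not model the separate Vertex AI training charge.

```python
# On-demand rates in USD per TB, copied from the pricing table above.
CREATE_RATE_PER_TB = {
    "logistic_regression": 250.0,
    "linear_regression": 250.0,
    "kmeans": 250.0,
    "time_series": 250.0,
    # These types also incur Vertex AI training charges, not modeled here.
    "automl_tables": 5.0,
    "dnn": 5.0,
    "boosted_tree": 5.0,
}
OTHER_RATE_PER_TB = 5.0  # evaluation, inspection, and prediction


def on_demand_cost(operation: str, tb_processed: float, model_type: str = "") -> float:
    """Estimate the on-demand charge in USD, ignoring the free tier."""
    if operation == "create":
        return CREATE_RATE_PER_TB[model_type] * tb_processed
    if operation in ("evaluate", "inspect", "predict"):
        return OTHER_RATE_PER_TB * tb_processed
    raise ValueError(f"unknown operation: {operation}")


print(on_demand_cost("create", 0.5, "kmeans"))  # 125.0
print(on_demand_cost("predict", 2.0))           # 10.0
```

The gap between the two creation rates is the main thing to notice: training a built-in model over the same data costs fifty times more per terabyte than evaluating or predicting with it.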

Flat-rate pricing

This type of pricing is best for clients with large-scale BigQuery model deployments. The predictability of monthly costs makes cost optimization simpler. It works by using reservations for both built-in and external models, and the general flow is as follows:

  1. The user purchases “commitments”
  2. The user assigns slots to reservations
  3. The user assigns one or more projects to a reservation

Figure: the typical steps for using flat-rate pricing.

Commitments

Clients purchase “commitments,” which are dedicated query processing capacity. Commitments are measured in BigQuery “slots.” A slot is a vCPU used to execute SQL queries. The number of slots needed depends on the size of the query and its complexity. A commitment also has a duration. There are three types of commitment:

  • Annual
  • Monthly
  • Flex

Annual commitments last for a minimum of 365 days. Monthly commitments last for a minimum of 30 days. Lastly, Flex slots can last for as little as 60 seconds and are typically used for testing and seasonal demands. 

Reservations

After buying commitments, users assign them to different “reservations.” This resource-allocation system allows you to associate commitments with different workloads. For example, you could attach commitments to reservations called “prod” for production workloads, “dev” for development workloads, and so forth.

Once complete, one or more projects, folders, or organizations must be assigned to a reservation to make the slots usable. For example, if you assign a reservation to a project, then the datasets within that project can use the slots associated with the reservation.

Assignments

Assignments have two possible job types for BQ ML:

  • QUERY
  • ML_EXTERNAL

A QUERY job includes BQ ML queries such as model creation, inspection, evaluation, and prediction. Most queries for BQ ML will therefore fall under this type of job. An ML_EXTERNAL job applies to BQ ML queries that use external services such as Vertex AI and AutoML. Only external model reservations have this job type.
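
The distinction between the two job types can be sketched as a small Python lookup. This is an illustration only: the set of external model types is an assumption based on the model types that the on-demand pricing bills with an added Vertex AI training cost, and the sketch further assumes that only model creation calls out to external services. The function and label names are hypothetical.

```python
# Model types billed with an added Vertex AI charge under on-demand pricing.
# Treating exactly these as "external" is an assumption made for illustration.
EXTERNAL_MODEL_TYPES = {"automl_tables", "dnn", "boosted_tree"}


def job_type_for(operation: str, model_type: str) -> str:
    """Pick the assignment job type that would serve a BQ ML statement."""
    if operation == "create" and model_type in EXTERNAL_MODEL_TYPES:
        return "ML_EXTERNAL"
    return "QUERY"


print(job_type_for("create", "dnn"))     # ML_EXTERNAL
print(job_type_for("create", "kmeans"))  # QUERY
print(job_type_for("predict", "dnn"))    # QUERY
```

In practice this means a project that trains external models needs an assignment of each job type to keep all of its BQ ML work on reserved slots.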

Cost Table

| Type of commitment | Number of slots | Pricing |
|---|---|---|
| Monthly | 100 | $2,000 per month |
| Annual | 100 | $1,700 per month |
| Flex | 100 | $4 per hour, or $2,920 per month |
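
The arithmetic behind these figures can be checked with a short Python sketch. It assumes that the Monthly and Annual prices are charged per month and that a month has 730 hours (365 days × 24 hours ÷ 12). The function name and the working-hours scenario are hypothetical.

```python
HOURS_PER_MONTH = 730  # 365 days * 24 hours / 12 months

# USD prices per 100 slots, from the cost table above.
MONTHLY_PER_100 = 2000.0
ANNUAL_PER_100 = 1700.0
FLEX_PER_100_PER_HOUR = 4.0


def flex_cost(slots: int, hours: float) -> float:
    """Cost in USD of running Flex slots for a number of hours."""
    return slots / 100 * FLEX_PER_100_PER_HOUR * hours


print(flex_cost(100, HOURS_PER_MONTH))          # 2920.0, a full month of Flex
print(flex_cost(100, 8 * 22))                   # 704.0, 8 hours on 22 working days
print((MONTHLY_PER_100 - ANNUAL_PER_100) * 12)  # 3600.0 saved per year on annual
```

Flex slots are the most expensive option when left running all month, but they can undercut a monthly commitment when capacity is only needed for part of the time.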

Limitations

Flat-rate pricing has several limitations, including:

  • The inability to share reservations with other GCP organizations
  • An organization can only have a maximum of five projects with an active commitment in specific locations
  • Commitments are regional resources – you cannot move them between regions

Recommendations

The following are factors that allow you to create more realistic cost estimations:

  • Clear reservation allocation rules for complex organizations
  • An accurate estimate of how many slots are needed
  • Well-managed workloads across different reservations

To help you achieve these, we have provided several recommendations below.

Create an administration project solely for reservations

Google recommends creating an “administrative project” just for reservations, allowing you to centralize billing and management. Projects under the same organization as the administration project can use reservations. Additionally, these projects can share idle and unallocated slots, making slot management more flexible.  

Use Flex slots to get a slot estimate

Monitoring tools like Cloud Logging allow you to monitor the average capacity your workloads consume. It’s best practice to use a small number of Flex slots and increase their number as you test your workloads. Examine your logs to identify what number of slots provides the best performance-to-value ratio. 
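
One way to turn logged job statistics into a slot estimate is to divide the total slot-milliseconds consumed by the length of the observation window. The Python sketch below shows that calculation. The sample figures are hypothetical, and it assumes you have already exported each job's slot-millisecond total from your logs or job metadata.

```python
from typing import Sequence


def average_slots(slot_ms_per_job: Sequence[int], window_seconds: float) -> float:
    """Average number of slots in use over an observation window."""
    return sum(slot_ms_per_job) / (window_seconds * 1000)


# Hypothetical slot-milliseconds consumed by three jobs during one hour.
jobs = [90_000_000, 180_000_000, 270_000_000]
print(average_slots(jobs, 3600))  # 150.0
```

An average hides peaks, so treat the result as a starting point and compare it against query latency as you add or remove Flex slots.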

Define workloads clearly 

Take advantage of reservations by defining workloads based on purpose or domain. Following the previously outlined methodology, estimate the number of slots required for each workstream.

For example, given 1000 committed slots and three functional areas, the assignment of resources could work as follows:

  • Assign 500 slots to the Data Science team. They usually get the most slots because they do the most intensive work.
  • Assign 300 slots to ETL (Extract-Transform-Load). This function is responsible for cleaning and transforming business data before it arrives at the Data Science and Business Intelligence teams.
  • Assign 200 slots to BI (Business Intelligence). BI creates reports and visualizations that provide meaningful insights, allowing stakeholders to make data-driven decisions.
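
The split above can be expressed as a small Python helper that divides committed slots by fractional share. This is a sketch under stated assumptions: the reservation names and the function are hypothetical, and the shares simply mirror the example.

```python
def allocate_slots(committed: int, shares: dict) -> dict:
    """Split committed slots across reservations by fractional share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {name: round(committed * share) for name, share in shares.items()}


print(allocate_slots(1000, {"data_science": 0.5, "etl": 0.3, "bi": 0.2}))
# {'data_science': 500, 'etl': 300, 'bi': 200}
```

Expressing the allocation as shares rather than fixed numbers makes it easy to rerun the split when the total commitment changes.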

Conclusion

BQ ML is a powerful tool that allows the entire MLOps lifecycle to exist within a single Data Warehouse solution. It democratizes ML and enables Data Analysts without formal Machine Learning training to use SQL to create, evaluate, and deploy powerful industry-standard models.

Regarding pricing, there are two usage patterns: on-demand and flat-rate pricing. On-demand charges on an “as-is” basis and is an efficient choice for one-off workloads. Flat-rate is ideal for enterprises that want predictability in their billing, especially if they run large numbers of monthly BQ ML queries. 

Flat-rate pricing works by purchasing committed slots, assigning slots to reservations, and distributing reservations amongst projects. A sound BQ ML pricing strategy depends on the number of monthly queries and the estimated number of slots required if the client chooses flat-rate pricing.
