BigQuery ML (BQ ML) is a BigQuery feature that allows you to create, test, and tune Machine Learning (ML) models using standard SQL queries. It supports widely used models such as linear regression for forecasting, K-means clustering for data segmentation, and Deep Neural Networks (DNNs). Google Cloud’s AutoML and Vertex AI can create the same models. However, one would typically choose BQ ML when the training data already resides in BigQuery or when Data Analysts prefer a self-service approach to building models.
BQ ML democratizes ML by allowing Data Analysts without extensive training in Data Science to create and use models based on business data. Development time is also significantly reduced because the data used for model creation already exists within the same data warehouse.
For these reasons, many organizations migrate their analytic workloads exclusively to BigQuery. However, the most critical question remains: What will it cost?
There are two usage patterns: on-demand pricing and flat-rate pricing. Understanding these options is fundamental to creating a well-planned BQ ML cost strategy. Other aspects of Machine Learning Operations (MLOps), such as hyperparameter tuning, model deployment, evaluation, inference, and feature processing, can be conducted within BQ ML and are charged as queries relative to the storage used and the amount of data processed.
Executive Summary
This article will explain the following concepts, which we have summarized here for your easy reference:
Pricing Structure | Description |
---|---|
Free tier | The following parameters have free usage limits: storage, data processed, data inserted, and data processed by CREATE MODEL queries. Once the free usage limits have been surpassed, the user will start to incur costs according to on-demand pricing |
On-demand pricing | On-demand pricing bills each operation as it runs, with no special discounts. Operations such as model creation, evaluation, inspection, and prediction incur costs when run ad hoc. Additionally, costs differ based on the model type. Some examples of model types include logistic regression, k-means clustering, and DNN |
Flat-rate pricing | Flat-rate pricing is the primary savings vehicle used for BigQuery workloads. It works by purchasing commitments and assigning those commitments to Google Cloud projects. A commitment is a purchase of dedicated BigQuery slots, where a slot is a unit of query processing capacity. |
BQ ML pricing
Pricing usage patterns can be categorized into:
- On-demand pricing
- Flat-rate pricing
Note that free usage applies to BQ ML, so operations are free up to certain limits. On-demand pricing refers to queries that are billed as-is with no special discounts. Customers can use flat-rate pricing to save money when the number of monthly queries is predictable.
Pricing also depends on the model type. There are two model types:
- Built-in models
- External models
A built-in model is trained within BigQuery, while an external model is trained using other Google Cloud services like Vertex AI and AutoML. The pricing for built-in models depends on the model and operation type. With external models, pricing also depends on the model and operation type, plus the cost of any Google Cloud services used outside of BQ ML, such as Vertex AI and AutoML.
Free tier
There are four parameters of usage:
- Storage
- Model prediction, inspection, and evaluation queries
- Use of BigQuery Storage Write API
- Model creation queries
Parameter | Free usage limits |
---|---|
Storage | The first 10 GB per month is free |
Model prediction, inspection, and evaluation queries | First 1 TB of query data processed per month is free |
BigQuery Storage Write API | First 2 TB of data inserted into BigQuery via API per month is free |
Model creation queries | First 10 GB of data processed by CREATE MODEL queries per month is free |
If your usage is well below the free-tier threshold, there’s a good chance you’re not paying for BigQuery. However, once you start scaling your services, you will likely notice that your bill is increasing, which may result from on-demand pricing.
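To make the free-tier arithmetic concrete, here is a minimal sketch that subtracts a free allowance from each month's usage. The function name, the dictionary keys, and the sample usage figures are hypothetical; the two limits shown are the storage and query limits from the table above.

```python
def billable_usage(usage, free_limits):
    """Return the part of each usage figure that exceeds its free limit."""
    return {
        name: max(0.0, usage.get(name, 0.0) - limit)
        for name, limit in free_limits.items()
    }


# Two of the free-tier limits from the table above (keys are hypothetical).
free_limits = {
    "storage_gb": 10.0,  # first 10 GB of storage per month
    "query_tb": 1.0,     # first 1 TB of prediction/inspection/evaluation queries
}

# Hypothetical month: 25 GB stored, 0.4 TB of query data processed.
billable = billable_usage({"storage_gb": 25.0, "query_tb": 0.4}, free_limits)
print(billable)
```

In this sketch only the 15 GB of storage above the allowance would be billed; the query volume stays inside the free tier.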
On-demand pricing
On-demand pricing bills usage as-is, and the actual costs incurred depend on the operations you carry out. The available BigQuery operations are as follows:
- Model creation
- Model evaluation
- Model inspection
- Model prediction
Note that Model Creation for Matrix Factorization, the most commonly used model for recommender systems, is not supported by on-demand pricing and is available only under flat-rate pricing.
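Operation type | Model type | Pricing |
---|---|---|
Model Creation | Logistic Regression | $250 per TB |
Model Creation | Linear Regression | $250 per TB |
Model Creation | K-means clustering | $250 per TB |
Model Creation | Time-series | $250 per TB |
Model Creation | AutoML tables | $5 per TB, plus Vertex AI training cost |
Model Creation | DNN | $5 per TB, plus Vertex AI training cost |
Model Creation | Boosted tree | $5 per TB, plus Vertex AI training cost |
Model Evaluation | All types | $5 per TB of data processed |
Model Inspection | All types | $5 per TB of data processed |
Model Prediction | All types | $5 per TB of data processed |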
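The rates in the table can be turned into a quick estimate. The sketch below is illustrative only: the function name and the model-type identifiers are hypothetical, it ignores the free tier, and the Vertex AI training cost is passed in as a number because the article does not state how it is computed.

```python
# Model types grouped by the creation rate listed in the table above.
RATE_250_PER_TB = {"logistic_regression", "linear_regression", "kmeans", "time_series"}
RATE_5_PER_TB_PLUS_VERTEX = {"automl_tables", "dnn", "boosted_tree"}


def on_demand_cost(operation, tb_processed, model_type=None, vertex_ai_training_cost=0.0):
    """Estimate the on-demand cost in USD for one BQ ML operation."""
    if operation == "creation":
        if model_type in RATE_250_PER_TB:
            return 250.0 * tb_processed
        if model_type in RATE_5_PER_TB_PLUS_VERTEX:
            return 5.0 * tb_processed + vertex_ai_training_cost
        raise ValueError(f"no on-demand creation rate for model type: {model_type}")
    if operation in {"evaluation", "inspection", "prediction"}:
        return 5.0 * tb_processed
    raise ValueError(f"unknown operation: {operation}")


print(on_demand_cost("creation", 2, "kmeans"))    # 2 TB at $250 per TB
print(on_demand_cost("creation", 2, "dnn", 30.0)) # 2 TB at $5 per TB plus $30 training
print(on_demand_cost("prediction", 0.5))          # 0.5 TB at $5 per TB
```

Matrix Factorization is deliberately absent from both sets, so requesting it raises an error, which mirrors the note that it has no on-demand creation price.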
Flat-rate pricing
This type of pricing is best for clients with large-scale BigQuery model deployments. The predictability of monthly costs makes cost optimization simpler. It works by using reservations for both built-in and external models, and the general flow is as follows:
- The user purchases “commitments”
- The user assigns slots to reservations
- The user assigns one or more projects to a reservation
Commitments
Clients purchase “commitments” or dedicated query processing capacity. Commitments are measured as BigQuery “slots.” A slot is a vCPU used to execute SQL queries. The number of slots needed depends on the size of the query and its complexity. A commitment also has a duration. There are three types of commitment:
- Annual
- Monthly
- Flex
Annual commitments last for a minimum of 365 days. Monthly commitments last for a minimum of 30 days. Lastly, Flex slots can last for as little as 60 seconds and are typically used for testing and seasonal demands.
Reservations
After buying commitments, users will assign them to different “reservations.” This resource-allocation system allows you to associate commitments with different workloads. For example, you could attach commitments to reservations called “prod” for production workloads, “dev” for development workloads, and so forth.
Once the reservations exist, one or more projects, folders, or organizations must be assigned to a reservation to make the slots usable. For example, if you assign a project to a reservation, then the queries run from that project can use the slots associated with the reservation.
Assignments
Assignments have two possible job types for BQ ML:
- QUERY
- ML_EXTERNAL
A QUERY job includes BQ ML queries such as model creation, inspection, evaluation, and prediction. Most queries for BQ ML will therefore fall under this type of job. An ML_EXTERNAL job applies to BQ ML queries that use external services such as Vertex AI and AutoML. Only external model reservations have this job type.
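The relationship between commitments, reservations, and assignments can be sketched as a small in-memory model. This is an illustration only and not the BigQuery Reservation API: the class, the helper functions, the project IDs, and the slot counts are all hypothetical.

```python
from dataclasses import dataclass, field

# Job types named in the text for BQ ML assignments.
JOB_TYPES = ("QUERY", "ML_EXTERNAL")


@dataclass
class Reservation:
    """A named pool of slots carved out of purchased commitments."""
    name: str
    slots: int
    # Each assignment pairs an assignee (for example a project ID) with a job type.
    assignments: set = field(default_factory=set)

    def assign(self, assignee, job_type):
        if job_type not in JOB_TYPES:
            raise ValueError(f"unsupported job type: {job_type}")
        self.assignments.add((assignee, job_type))


def unallocated_slots(committed_slots, reservations):
    """Return slots not yet placed in a reservation; raise if over-allocated."""
    allocated = sum(r.slots for r in reservations)
    if allocated > committed_slots:
        raise ValueError("reservations exceed committed slots")
    return committed_slots - allocated


def find_reservation(reservations, assignee, job_type):
    """Return the reservation serving this assignee and job type, if any."""
    for reservation in reservations:
        if (assignee, job_type) in reservation.assignments:
            return reservation
    return None


prod = Reservation("prod", slots=300)
dev = Reservation("dev", slots=100)
external = Reservation("ml-external", slots=50)

prod.assign("analytics-prod", "QUERY")            # hypothetical project IDs
dev.assign("analytics-dev", "QUERY")
external.assign("analytics-prod", "ML_EXTERNAL")  # external model training

reservations = [prod, dev, external]
idle_slots = unallocated_slots(500, reservations)
print(idle_slots)
```

Keeping the job type as part of the assignment key means the same project can route its SQL-only BQ ML queries and its external model jobs to different reservations.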
Cost Table
Type of commitment | Number of slots | Pricing |
---|---|---|
Monthly | 100 | $2000 per month |
Annual | 100 | $1700 per month |
Flex | 100 | $4 per hour (about $2920 per month) |
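A rough way to compare the two pricing models is to ask how much data a team must process each month before a flat-rate commitment costs less than on-demand. The sketch below assumes the Monthly and Annual figures in the table are per-month prices for 100 slots, that a month has 730 hours, and that 100 slots are enough for the workload; the function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # 365 days * 24 hours / 12 months


def flex_monthly_cost(hourly_rate=4.0, hours=HOURS_PER_MONTH):
    """Cost in USD of keeping 100 Flex slots running for a full month."""
    return hourly_rate * hours


def break_even_tb(monthly_flat_cost, on_demand_price_per_tb):
    """TB per month above which the flat-rate commitment is cheaper."""
    return monthly_flat_cost / on_demand_price_per_tb


print(flex_monthly_cost())        # matches the monthly Flex figure in the table
print(break_even_tb(2000, 250))   # monthly commitment vs. $250 per TB model creation
print(break_even_tb(2000, 5))     # monthly commitment vs. $5 per TB queries
print(break_even_tb(1700, 5))     # annual commitment vs. $5 per TB queries
```

Under these assumptions, a monthly commitment pays for itself after 8 TB of built-in model creation, but only after 400 TB of evaluation, inspection, or prediction queries.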
Limitations
Flat-rate pricing has several limitations, including:
- Reservations cannot be shared with other GCP organizations
- An organization can only have a maximum of five projects with an active commitment in specific locations
- Commitments are regional resources, so you cannot move them between regions
Recommendations
The following are factors that allow you to create more realistic cost estimations:
- Clear reservation allocation rules for complex organizations
- An accurate estimate of how many slots are needed
- Well-managed workloads across different reservations
To help you achieve these, we have provided several recommendations below.
Create an administration project solely for reservations
Google recommends creating an “administration project” just for reservations, allowing you to centralize billing and management. Projects under the same organization as the administration project can use its reservations. Additionally, these projects can share idle and unallocated slots, making slot management more flexible.
Use Flex slots to get a slot estimate
Tools like Cloud Logging allow you to track the average capacity your workloads consume. It’s best practice to start with a small number of Flex slots and increase their number as you test your workloads. Examine your logs to identify what number of slots provides the best performance-to-value ratio.
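One way to read slot consumption from job records is to divide a job's total slot-milliseconds by its elapsed time, which gives the average number of slots the job held while it ran. The sketch below assumes each exported record carries those two figures; the field names and sample values are hypothetical.

```python
def average_slots(total_slot_ms, elapsed_ms):
    """Average number of slots a job used over its run time."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return total_slot_ms / elapsed_ms


# Hypothetical job records taken from a Flex slot test run.
jobs = [
    {"total_slot_ms": 1_200_000, "elapsed_ms": 60_000},
    {"total_slot_ms": 4_500_000, "elapsed_ms": 90_000},
    {"total_slot_ms": 300_000, "elapsed_ms": 30_000},
]

per_job = [average_slots(j["total_slot_ms"], j["elapsed_ms"]) for j in jobs]
peak = max(per_job)
print(per_job, peak)
```

The peak per-job average is only a starting point, since jobs that overlap in time add their slot usage together.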
Define workloads clearly
Take advantage of reservations by defining workloads based on purpose or domain. Following the previously outlined methodology, estimate the number of slots required for each workstream.
For example, given 1000 committed slots and three functional areas, the assignment of resources could work as follows:
- Assign five hundred slots to the Data Science team. They usually get the most slots because they will be doing the most intensive work
- Assign three hundred slots for ETL (Extract-Transform-Load). This function is responsible for cleaning and transforming business data before it arrives at the Data Science and Business Intelligence teams
- Assign two hundred slots to BI (Business Intelligence). BI creates reports and visualizations that provide meaningful insights, allowing stakeholders to make data-driven decisions
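The split above can be expressed as a small helper that divides committed slots by percentage share. The function name and the workload keys are hypothetical, and integer percentages are used so the arithmetic stays exact.

```python
def split_slots(total_slots, percent_shares):
    """Divide committed slots across workloads by integer percentage share."""
    if sum(percent_shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {
        name: total_slots * percent // 100
        for name, percent in percent_shares.items()
    }


allocation = split_slots(1000, {"data_science": 50, "etl": 30, "bi": 20})
print(allocation)
```

For totals that do not divide evenly, the integer division rounds down, so a few slots may stay unallocated and remain available as idle capacity.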
Conclusion
BQ ML is a powerful tool that allows the entire MLOps lifecycle to exist within a single Data Warehouse solution. It democratizes ML and enables Data Analysts without formal Machine Learning training to use SQL to create, evaluate, and deploy powerful industry-standard models.
Regarding pricing, there are two usage patterns: on-demand and flat-rate pricing. On-demand charges on an “as-is” basis and is an efficient choice for one-off workloads. Flat-rate is ideal for enterprises that want predictability in their billing, especially if they run large numbers of monthly BQ ML queries.
Flat-rate pricing works by purchasing committed slots, assigning slots to reservations, and distributing reservations amongst projects. A sound BQ ML pricing strategy depends on the number of monthly queries and the estimated number of slots required if the client chooses flat-rate pricing.