Pricing

Cut your AI spend and get to production faster with Predibase.

Fine-tune and serve with zero headaches in our cloud or your VPC.

Calculate Your Inference Cost-Savings

Use case (tokens per request)

Requests / second

Current model

Price: $30 / 1M input tokens, $60 / 1M output tokens

$15,648,058.08

estimated yearly cost savings

Based on 1 A100 replica and fine-tuned Llama-3-8B

Savings exclude Enterprise discounts and vary with hardware selection. Estimates are based on the list price of an A100 80GB GPU.
For a personalized quote, contact us at support@predibase.com.
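
The arithmetic behind an estimate like this can be sketched as follows. This is a rough illustration, not the calculator's actual formula: the function names and the example workload are hypothetical, and the sketch assumes a single A100 replica billed around the clock at the $4.80 / hr list price and able to absorb the full request volume.

```python
# Rough sketch of a yearly savings estimate: token-priced API vs. one
# always-on A100 replica. Function names and the workload are hypothetical.

SECONDS_PER_YEAR = 365 * 24 * 3600
HOURS_PER_YEAR = 365 * 24


def yearly_api_cost(input_tokens, output_tokens, requests_per_second,
                    input_price_per_m=30.0, output_price_per_m=60.0):
    """Yearly cost of a token-priced model ($ per 1M input/output tokens)."""
    cost_per_request = (input_tokens / 1e6 * input_price_per_m
                        + output_tokens / 1e6 * output_price_per_m)
    return cost_per_request * requests_per_second * SECONDS_PER_YEAR


def yearly_replica_cost(hourly_rate=4.80, replicas=1):
    """Yearly cost of replicas that run continuously (assumption: never scaled to 0)."""
    return hourly_rate * HOURS_PER_YEAR * replicas


def yearly_savings(input_tokens, output_tokens, requests_per_second):
    """Difference between the token-priced cost and the replica cost."""
    return (yearly_api_cost(input_tokens, output_tokens, requests_per_second)
            - yearly_replica_cost())


# Hypothetical workload: 1,000 input and 100 output tokens per request, 1 request / second.
print(f"${yearly_savings(1000, 100, 1):,.2f}")
```

Whether one replica can actually sustain a given request rate depends on the model and workload, so treat the result as an upper-level estimate only.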

Predibase Tiers

Predibase AI Cloud

Free Plan

Get started right away fine-tuning and serving your own LLMs that beat GPT-4

Included in our Free Tier:

Up to 1 user
Best-in-class fine-tuning with A100 GPUs
Inference:
- 1 private serverless deployment (no rate limits)
- Autoscaling and scale to 0
- Serve unlimited adapters on a single GPU with LoRAX
- Free shared serverless inference (with rate limits) for testing
Access to all available base models
Data connection via file uploads
2 concurrent training jobs
In-app chat, email, and Discord support

Get Started with $25 in Free Credits

Note: Free credits expire after 30 days.

SaaS

Enterprise Plan

Guaranteed autoscaling and priority compute access for teams ready to go into production

Everything in the Free Tier, plus:

Additional seats for your whole team
Volume discounts for serving compute
Inference:
- Guaranteed instances to ensure scaling to meet increased demand
- Additional replicas for burst usage
- Additional private serverless deployments
Guaranteed uptime SLAs
Data connection via Snowflake, Databricks, S3, BigQuery, and more
Additional concurrent training jobs
Dedicated Slack channel, plus consulting hours with our experts

Get a Custom Quote

Your VPC

Enterprise Virtual Private Cloud (VPC)

Fine-tune and serve while guaranteeing that your data never leaves your cloud

Everything in Enterprise SaaS, plus:

Deploy directly into your own cloud (AWS, Azure, GCP)
Use your own cloud commitments
Optimize usage with your own GPUs
Enterprise security and compliance

Get a Custom Quote

Use committed cloud spend on Predibase

If you have committed spend with AWS, Azure, or GCP, you will soon be able to use that commit on Predibase.

Learn More

Do you offer discounts?

Yes, discounted pricing on compute is available for Enterprise customers. Please contact us to learn more.

Learn More

Private Serverless Inference

We offer usage-based pricing billed by the second, so you can configure your deployments to scale to 0 when idle or add replicas when usage spikes. Enjoy exceptional inference performance and the ability to serve unlimited fine-tuned adapters on a single deployment to maximize your GPU utilization and cost-effectiveness.

Hardware	Base Price ($ / hr)
1 L4 (24 GB)	$2.14
1 A10G (24 GB)	$2.60
1 L40S (48 GB)	$3.20
1 A100 (80 GB)	$4.80
1 H100 (80 GB)	Enterprise-only
1 H200 (141 GB)	Enterprise-only
Multi H100 or H200	Enterprise-only
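
As a rough illustration of per-second billing, the sketch below converts the hourly base prices above into a cost for a given amount of active time. The function name is hypothetical, and two points are assumptions rather than stated terms: that time spent scaled to 0 is not billed, and that each additional replica is billed at the same base rate.

```python
# Sketch of per-second billing using the base prices listed above.
# Assumptions: idle (scaled-to-0) time is free; replicas bill at the same rate.

HOURLY_RATES = {"L4": 2.14, "A10G": 2.60, "L40S": 3.20, "A100": 4.80}


def deployment_cost(hardware, active_seconds, replicas=1):
    """Cost in dollars for `active_seconds` of uptime on `replicas` GPUs."""
    if hardware not in HOURLY_RATES:
        raise ValueError(f"No public base price for {hardware!r}")
    per_second = HOURLY_RATES[hardware] / 3600
    return per_second * active_seconds * replicas


# Example: one A100 replica active 8 hours a day for 30 days.
active = 8 * 3600 * 30
print(f"${deployment_cost('A100', active):,.2f}")  # $1,152.00
```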

Shared Serverless Inference

Predibase supports state-of-the-art, efficient inference for both pre-trained and fine-tuned models enabled by LoRA Exchange (LoRAX). Serverless pricing is designed for experimentation and is free to use for up to 1M tokens per day and 10M tokens per month.

Solar-Mini
Solar Pro Preview
Llama-3-1-8b-instruct
Llama-3-1-8b
Llama-3-70b
Llama-3-70b-instruct
Mistral-7b-instruct-v0.1
Mistral-7b-instruct-v0.2
Mistral-7b
Gemma-2B-Instruct
Gemma-7B-Instruct
Code-llama-13b-instruct

See Full List of Models
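
If you want to track usage against the free shared serverless limits described above, a client-side check could look like the following. This is only a sketch: the function name is hypothetical, and it assumes both caps apply at once and that input and output tokens count toward them equally, which the limits as stated do not spell out.

```python
# Client-side sketch of the free shared serverless limits.
# Assumptions: both caps apply together; all tokens count equally.

DAILY_TOKEN_LIMIT = 1_000_000
MONTHLY_TOKEN_LIMIT = 10_000_000


def within_free_limits(tokens_today, tokens_this_month, request_tokens):
    """Return True if a request of `request_tokens` stays under both caps."""
    return (tokens_today + request_tokens <= DAILY_TOKEN_LIMIT
            and tokens_this_month + request_tokens <= MONTHLY_TOKEN_LIMIT)


print(within_free_limits(900_000, 5_000_000, 50_000))   # True
print(within_free_limits(900_000, 9_980_000, 50_000))   # False: monthly cap exceeded
```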

Fine-tuning Costs

Predibase offers state-of-the-art fine-tuning at cost-effective prices. Expected costs vary depending on the dataset, size of the base model, and whether you're training a LoRA, Turbo LoRA, or Turbo.

Calculate Your Fine-tuning Cost

Model Size (parameters)

Dataset (tokens)

Epochs (number of iterations)

Estimated Cost for LoRA SFT fine-tuning

$0.00

Officially Supported Models for Fine-tuning Include:

Solar Mini, Solar Pro Preview
Llama 3.2 (1B, 3B; Instruct and non-Instruct)
Llama 3.1 8B (Instruct and non-Instruct)
Mistral-7b, Mistral-7b-instruct-v0.1 and v0.2
Mistral Nemo 12B 2407 (Instruct and non-Instruct)
Mixtral-8x7B-Instruct-v0.1
Codellama 13B Instruct, Codellama 70B Instruct
Zephyr 7B Beta
Gemma 2 (9B, 27B; Instruct and non-Instruct)
Phi 3.5 Mini Instruct
Phi 3 4k Instruct
Qwen 2.5 (1.5B, 7B, 14B, 32B; Instruct and non-Instruct)
Qwen 2 (1.5B; Instruct and non-Instruct)
Any OSS Model from Huggingface (best effort)

Fine-Tuning Pricing (per 1M tokens)

Model Size and Method	Price ($ / 1M tokens)
Up to 16B - SFT, Continued Pretraining (LoRA, Turbo)	$0.50
16.1 to 80B - SFT, Continued Pretraining (LoRA, Turbo)	$3.00
Up to 16B - SFT, Continued Pretraining (Turbo LoRA)	$1.00
16.1 to 80B - SFT, Continued Pretraining (Turbo LoRA)	$6.00
Up to 16B - RFT GRPO (LoRA)	$10.00
16.1 to 32B - RFT GRPO (LoRA)	$20.00
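
The per-token prices above can be turned into a rough estimate like the one below. This is a sketch under stated assumptions, not the official calculator: the function and key names are hypothetical, billed tokens are assumed to equal dataset tokens multiplied by epochs, and any model above 16B is placed in the upper tier.

```python
# Sketch of a fine-tuning cost estimate from the per-1M-token prices above.
# Assumptions: billed tokens = dataset tokens x epochs; sizes above 16B use
# the upper tier. Keys such as "sft" and "turbo_lora" are hypothetical labels.

# (method, adapter) -> list of (max model size in billions, $ per 1M tokens)
PRICE_TIERS = {
    ("sft", "lora"): [(16, 0.50), (80, 3.00)],
    ("sft", "turbo"): [(16, 0.50), (80, 3.00)],
    ("sft", "turbo_lora"): [(16, 1.00), (80, 6.00)],
    ("rft_grpo", "lora"): [(16, 10.00), (32, 20.00)],
}


def fine_tuning_cost(model_size_b, dataset_tokens, epochs,
                     method="sft", adapter="lora"):
    """Estimated cost in dollars; SFT prices also cover continued pretraining."""
    tiers = PRICE_TIERS.get((method, adapter))
    if tiers is None:
        raise ValueError(f"No listed price for {method!r} with {adapter!r}")
    for max_size_b, price_per_m in tiers:
        if model_size_b <= max_size_b:
            return dataset_tokens * epochs / 1e6 * price_per_m
    raise ValueError(f"No listed price for a {model_size_b}B model")


# Example: 8B model, 10M-token dataset, 3 epochs, LoRA SFT.
print(f"${fine_tuning_cost(8, 10_000_000, 3):,.2f}")  # $15.00
```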

Ready to efficiently fine-tune and serve your own LLM?

Try Predibase for Free