Vertex

Get API
access

Be amongst the first to ship with Vertex.
Sign up to be notified when we launch.

Free tier · No credit card

vertex.ughoron.cloud / playgroundgemma-2-9b

You

Summarise this PR diff in three bullets.

Vertex

Here's a tight read of the diff:

1. Replaces the in-memory cache with an LRU keyed by tenant.

2. Drops the synchronous fanout in favour of a queued worker.

3. Adds a 24h TTL so cold tenants don't hold memory.

142ms first token · 318ms total · 184 tokens

Send a message…⏎

OpenAI-compatible

Same SDK, new endpoint. No client to rewrite.

One key, every model

Pick a model per call. No vendor juggling.

< 200ms first token

Models stay warm. No cold-start tax.

Per-token billing

No seats, no minimums, no annual contracts.

< 200ms

P95 first token

99.9%

Routing uptime

$0.20

Per 1M tokens

Pricing

Affordable plans

Start free. Pay per token after that.

Hobby

Free

Prototypes and weekend builds.

Start free

What's included

100K tokens / month
Lean models
Community Discord

Builder

Popular

$29/mo

Production apps with steady traffic.

Choose Builder

What's included

5M tokens / month included
Then $0.20 / 1M tokens
All available models
Email support

Scale

Custom

Reserved capacity with an SLA.

Talk to us

What's included

99.9% uptime
Pinned availability
Private Slack

FAQs

Frequently Asked Questions

Is Vertex really OpenAI-compatible?

Yes. Same request shape, same response shape. Point the OpenAI SDK at the Vertex base URL, pass your Vertex key, and your existing code works unchanged.

Which models can I call today?

Open-weight chat models — Gemma 2, Phi-3, Qwen 2.5. New ones get added as we test them. Pick a model per call by id.

How does billing work?

Free tier covers 100K tokens per month. After that, $0.20 per million tokens on the Builder plan. No per-seat fees, no minimums.

Can I integrate with other tools?

Anything that speaks the OpenAI chat completions API speaks Vertex. SDKs, frameworks, third-party UIs — they all just work.

How secure is my data?

Requests aren't used for training. Prompts and completions are logged only for the duration needed to bill and serve traffic, then dropped.

Can I use my Ughoron Account?

Yes — Vertex uses the same account that signs into ughoron.cloud. One identity across the network.

Get APIaccess

OpenAI-compatible

One key, every model

< 200ms first token

Per-token billing

Affordable plans

Frequently Asked Questions

Get API
access