Fine-tune, evaluate, and (soon) serve private models.
Alpha focus: making the training surface boring and predictable. Hosted inference lands next, on the same credit wallet and the same per-token pricing model.
LoRA fine-tuning on your dataset.
Instruction, chat, or continued-pretraining tasks. The trainer handles tokenization, LoRA rank, learning-rate schedule, and the eval split. You pick the base model and the quality setting; the platform does the rest.
JSONL in, adapter out
Upload one file per task shape — instruction / chat / continued-pretraining. The trainer handles tokenization, LoRA config, and eval split.
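A minimal sketch of what one record per task shape might look like. The field names below (`instruction`/`output`, `messages`, `text`) are assumptions that mirror common fine-tuning dataset layouts, not a published ownllm schema — check the dataset docs for the authoritative shapes.

```shell
# One JSON object per line. Field names are illustrative assumptions,
# not a documented schema.
cat > dataset.jsonl <<'EOF'
{"instruction": "Summarise the ticket.", "output": "Customer reports login failures since Tuesday."}
EOF
cat > chat.jsonl <<'EOF'
{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello! How can I help?"}]}
EOF
cat > pretrain.jsonl <<'EOF'
{"text": "Raw domain text for continued pretraining goes here."}
EOF
# Sanity check: every line must be valid JSON.
python3 -c 'import json; [json.loads(l) for l in open("dataset.jsonl")]'
```

A quick local parse like the last line catches malformed records before upload, so a bad dataset never reaches the reservation step.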
3B to 70B base models
Pick by latency and cost target. Per-token pricing scales with parameter size so small experiments stay genuinely cheap.
No charge on pre-training failure
If an infrastructure error or a bad dataset stops the job before the first gradient step, reserved credits are returned in full. Honest billing.
# Fine-tune a 7B adapter on your dataset
$ ownllm finetune --data ./dataset.jsonl \
    --task instruct \
    --quality balanced \
    --model 7b
→ uploading dataset · 312 MB
→ estimated cost ≈ 3.01 credits · confirm? [y/N]
POST /job
Authorization: Bearer $OWNLLM_TOKEN

{
  "dataset_url": "s3://...",
  "task": "instruct",
  "quality": "balanced",
  "model_size": "7b"
}
Three ways to serve a finished adapter.
Alpha ships with self-hosting from day one. Hosted inference is next on the roadmap and will bill on the same per-token credits already published.
Fine-tune a LoRA adapter
Submit a job, get a downloadable adapter. Available today to every account after credit top-up.
- Per-token pricing
- Dataset deleted after run
- Adapter in safetensors + config
Hosted inference
OpenAI-compatible /v1/chat/completions backed by vLLM. Shipping after we’re sure the training surface is stable.
- Per-token billing, same credit wallet
- Bring your own adapter or ours
- Soft-launched to alpha cohort first
Serve the adapter yourself
Not willing to wait for hosted inference? Download the adapter and run it on any vLLM or TGI-compatible stack, any hardware, any cloud.
- Standard safetensors format
- No DRM, no lock-in
- Works offline
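A minimal self-hosting sketch using vLLM's standard LoRA flags. The base model name, server port, and adapter path are illustrative assumptions; the adapter directory layout matches what the CLI download produces.

```shell
# Serve the downloaded adapter on top of its base model with vLLM's
# OpenAI-compatible server. Model name and paths are illustrative.
vllm serve meta-llama/Llama-2-7b-hf \
  --enable-lora \
  --lora-modules my-adapter=./adapters/job_8c3f/

# Then address the adapter by name via the OpenAI-compatible endpoint:
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "my-adapter", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the artifact is plain safetensors, the same adapter also loads into TGI or any other stack that accepts PEFT-style LoRA adapters.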
What the job report tells you.
Training loss, eval loss, before / after samples, and an itemised token cost — attached to every completed run, readable in the web app or via the API.
Training + eval loss per step
Every job emits per-step loss you can watch live in the web app or stream from the CLI.
Before / after samples
The job report includes completions from a held-out slice for both the base and the fine-tuned adapter, side by side.
Per-token cost breakdown
Every job ledger entry shows tokens seen, runtime, and credits deducted — tied to a specific job id.
A CLI, a REST API, and the web app — same primitives everywhere.
The CLI is the reference surface during alpha. Everything it does maps to a documented REST endpoint; the web app calls the same backend.
$ ownllm status
job_8c3f… · RUNNING · epoch 2/3 · eval loss 1.58
job_4a9e… · DONE · 11m 42s · eval loss 1.42
$ ownllm download job_8c3f
✓ adapter saved to ./adapters/job_8c3f/
✓ adapter_model.safetensors · 81 MB
✓ adapter_config.json
GET /job/{id}/result
Authorization: Bearer $OWNLLM_TOKEN

{
  "data": {
    "artifact_url": "https://...",
    "eval_loss": 1.42,
    "training_loss": 1.38,
    "runtime_seconds": 702
  }
}
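A small sketch of consuming that response from a script. The JSON literal below copies the documented response shape; in practice it would come from a `curl` call to the endpoint with your bearer token.

```shell
# Pull the fields a script typically needs out of the /job/{id}/result
# response. The literal below stands in for the real API response.
RESULT='{"data":{"artifact_url":"https://...","eval_loss":1.42,"training_loss":1.38,"runtime_seconds":702}}'

printf '%s' "$RESULT" \
  | python3 -c 'import json,sys; d=json.load(sys.stdin)["data"]; print(d["eval_loss"], d["runtime_seconds"])'
# → 1.42 702
```

From there, a single `curl -L -o` against `artifact_url` fetches the same safetensors bundle the CLI download command saves.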
What’s next, in the order we plan to ship it.
No speculative bolt-ons. The list below is what the team is actively building after the training surface exits alpha.
Hosted inference
vLLM-backed, OpenAI-compatible. Per-token pricing mirrors the rates already published on the pricing page.
Signed eval bundles
Tamper-evident report artifact attached to each completed job — for compliance teams that need something auditable.
Team workspaces
Shared credits, roles, and job history across a company account — after the single-user flow stabilises.