Fine-tune, evaluate, and (soon) serve private models.
Alpha focus: making the training surface boring and predictable. Hosted inference lands next, on the same credit wallet and the same per-token pricing model.
LoRA fine-tuning on your dataset.
Instruction, chat, or continued-pretraining tasks. The trainer handles tokenization, LoRA rank, learning-rate schedule, and the eval split. You pick the base model and the quality setting; the platform does the rest.
JSONL in, adapter out
Upload one file per task shape — instruction / chat / continued-pretraining. The trainer handles tokenization, LoRA config, and eval split.
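A minimal sketch of what one record per task shape might look like. The field names below (`instruction`/`output`, `messages`, `text`) are assumptions that mirror common fine-tuning dataset layouts, not a published ownllm schema — check the dataset docs for the authoritative shapes.

```shell
# One JSON object per line. Field names are illustrative assumptions,
# not a documented schema.
cat > dataset.jsonl <<'EOF'
{"instruction": "Summarise the ticket.", "output": "Customer reports login failures since Tuesday."}
EOF
cat > chat.jsonl <<'EOF'
{"messages": [{"role": "user", "content": "Hi"}, {"role": "assistant", "content": "Hello! How can I help?"}]}
EOF
cat > pretrain.jsonl <<'EOF'
{"text": "Raw domain text for continued pretraining goes here."}
EOF
# Sanity check: every line must be valid JSON.
python3 -c 'import json; [json.loads(l) for l in open("dataset.jsonl")]'
```

A quick local parse like the last line catches malformed records before upload, so a bad dataset never reaches the reservation step.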
3B to 70B base models
Pick by latency and cost target. Per-token pricing scales with parameter size so small experiments stay genuinely cheap.
No charge on pre-training failure
If an infrastructure error or a bad dataset stops the job before the first gradient step, reserved credits are returned in full. Honest billing.
# Fine-tune a 7B adapter on your dataset
$ ownllm finetune --data ./dataset.jsonl \
    --task instruct \
    --quality balanced \
    --model 7b
→ uploading dataset · 312 MB
→ estimated cost ≈ 3.01 credits · confirm? [y/N]
POST /job
Authorization: Bearer $OWNLLM_TOKEN

{
  "dataset_url": "s3://...",
  "task": "instruct",
  "quality": "balanced",
  "model_size": "7b"
}
Three ways to serve a finished adapter.
Alpha ships with self-hosting from day one. Hosted inference is next on the roadmap and will bill on the same per-token credits already published.
Fine-tune a LoRA adapter
Submit a job, get a downloadable adapter. Available today to every account after credit top-up.
- Per-token pricing
- Dataset deleted after run
- Adapter in safetensors + config
Hosted inference
OpenAI-compatible /v1/chat/completions backed by vLLM. Shipping after we’re sure the training surface is stable.
- Per-token billing, same credit wallet
- Bring your own adapter or ours
- Soft-launched to alpha cohort first
Serve the adapter yourself
Not willing to wait for hosted inference? Download the adapter and run it on any vLLM or TGI-compatible stack, any hardware, any cloud.
- Standard safetensors format
- No DRM, no lock-in
- Works offline
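A minimal self-hosting sketch using vLLM's standard LoRA flags. The base model name, server port, and adapter path are illustrative assumptions; the adapter directory layout matches what the CLI download produces.

```shell
# Serve the downloaded adapter on top of its base model with vLLM's
# OpenAI-compatible server. Model name and paths are illustrative.
vllm serve meta-llama/Llama-2-7b-hf \
  --enable-lora \
  --lora-modules my-adapter=./adapters/job_8c3f/

# Then address the adapter by name via the OpenAI-compatible endpoint:
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "my-adapter", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the artifact is plain safetensors, the same adapter also loads into TGI or any other stack that accepts PEFT-style LoRA adapters.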
What the job report tells you.
Training loss, eval loss, before / after samples, and an itemised token cost — attached to every completed run, readable in the web app or via the API.
Training + eval loss per step
Every job emits per-step loss you can watch live in the web app or stream from the CLI.
Before / after samples
The job report includes completions from a held-out slice for both the base and the fine-tuned adapter, side by side.
Per-token cost breakdown
Every job ledger entry shows tokens seen, runtime, and credits deducted — tied to a specific job id.
A CLI, a REST API, and the web app — same primitives everywhere.
The CLI is the reference surface during alpha. Everything it does maps to a documented REST endpoint; the web app calls the same backend.
$ ownllm status
job_8c3f… · RUNNING · epoch 2/3 · eval loss 1.58
job_4a9e… · DONE · 11m 42s · eval loss 1.42
$ ownllm download job_8c3f
✓ adapter saved to ./adapters/job_8c3f/
✓ adapter_model.safetensors · 81 MB
✓ adapter_config.json
GET /job/{id}/result
Authorization: Bearer $OWNLLM_TOKEN

{
  "data": {
    "artifact_url": "https://...",
    "eval_loss": 1.42,
    "training_loss": 1.38,
    "runtime_seconds": 702
  }
}
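A small sketch of consuming that response from a script. The JSON literal below copies the documented response shape; in practice it would come from a `curl` call to the endpoint with your bearer token.

```shell
# Pull the fields a script typically needs out of the /job/{id}/result
# response. The literal below stands in for the real API response.
RESULT='{"data":{"artifact_url":"https://...","eval_loss":1.42,"training_loss":1.38,"runtime_seconds":702}}'

printf '%s' "$RESULT" \
  | python3 -c 'import json,sys; d=json.load(sys.stdin)["data"]; print(d["eval_loss"], d["runtime_seconds"])'
# → 1.42 702
```

From there, a single `curl -L -o` against `artifact_url` fetches the same safetensors bundle the CLI download command saves.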
What’s next, in the order we plan to ship it.
No speculative bolt-ons. The list below is what the team is actively building after the training surface exits alpha.
Hosted inference
vLLM-backed, OpenAI-compatible. Per-token pricing mirrors the rates already published on the pricing page.
Signed eval bundles
Tamper-evident report artifact attached to each completed job — for compliance teams that need something auditable.
Team workspaces
Shared credits, roles, and job history across a company account — after the single-user flow stabilises.