# The Bilbs Box

> **Audience.** CTOs / heads of engineering / IT directors evaluating the
> flagship product. 6-minute read.

**TL;DR.** The Bilbs Box is a rackmount server pre-flashed with a
language model fine-tuned on *your* data, the orchestration layer to run
it, the admin UI your team will use, and a printed runbook. You own the
hardware, the weights and the software. Plug it in, point your VPN at
it, done. $0 recurring required.

Full tier comparison: [offers.md](./offers.md) · deep-dive pages in
[./tiers/](./tiers/).

---

## 01 · Why the Box exists

Most AI deployments today rent. You send prompts to OpenAI, Anthropic or
Google, they send tokens back, and you pay per million. That works great
for prototypes and breaks at scale for three reasons:

1. **Cost.** Gross margin compresses as usage grows because your variable
   cost scales with revenue.
2. **Trust.** Sensitive data leaves your perimeter. Legal, compliance and
   enterprise procurement don't like it.
3. **Latency.** A round-trip to `us-east-1` is a user-experience tax you
   can't engineer around.

The Bilbs Box eliminates all three. The model runs on the silicon
sitting in your rack. The tokens never leave. The latency is LAN latency.
The cost is a single capital expense with zero ongoing obligation.

## 02 · What "you own it" actually means

When the Box ships to you, you receive:

- The **chassis + GPUs + drives** (title passes on final payment).
- The **base model weights** fine-tuned on your corpus (under the MSA,
  these are assigned to you on final payment).
- The **orchestration layer**, eval harness, admin UI, and CLI — all
  under the MSA's IP-assignment clause.
- The **Dockerfile, Helm chart, Terraform** and any other build artefact
  produced during the engagement.
- A **printed runbook** plus a Loom-style video walkthrough per subsystem.

You can:
- Re-flash the Box with a different model.
- Physically resell the chassis after you're done with it.
- Fork the orchestration layer and modify it freely — it's yours.
- Move the weights to a larger or smaller Box as your team scales.

You cannot:
- Re-sell the Bilbs AI name or mark with a resold Box.
- Claim compatibility with the "Updates plan" without paying for it.

Picking the model that runs on the Box is covered in
[models.md](./models.md) — the tl;dr is "we fine-tune an open-weight
base on your data and burn the result on the Box." Your Box tier
determines which bases comfortably fit; your use case determines
which base from the ones that fit.

## 03 · The four sizes

| Tier | Concurrent | VRAM | Typical model size | Power draw |
|------|-----------:|-----:|-------------------|-----------:|
| [Mini](./tiers/box-mini.md) | 5 | 48 GB | 7–14B 4-bit quantised | 150–550 W |
| [Box](./tiers/box.md) | 25 | 48 GB | 7–14B full-precision *or* 32B 4-bit | 300–900 W |
| [Pro](./tiers/box-pro.md) | 200 | 96 GB | 70B 4-bit *or* dual 14B HA | 600–1800 W |
| [Cluster](./tiers/box-cluster.md) | 1000+ | 640 GB – 1.1 TB | 180B+ full-precision, multi-tenant | 3–9 kW |

Sizing is keyed to *concurrent users* because that, not employee
headcount, is what actually stresses the hardware. A 200-employee
company where 10% of staff use AI daily sits happily on the 25-seat
**Box**; a 20-person startup whose whole product depends on AI probably
needs the **Pro**.

Sizing call: [cal.com/bilbsgroup/bilbs-ai](https://cal.com/bilbsgroup/bilbs-ai).

## 04 · What lands in your rack

One flight case per Box arrives via freight. Inside:

- The Box itself, rack-ear-mounted, protective foam.
- Rack rails + cable-management arm.
- A USB-C flash drive with the full image (offline-installable).
- A printed manual and wallet-card with serial + support PIN.
- Two spare NVMe drives (for the Mini: one spare).
- C13/C14 or C19/C20 power cables, regional plug head included.
- A warranty card (OEM — NVIDIA / Dell / Supermicro — passed through).

Unboxing + install: [setup/whats-in-the-box.md](./setup/whats-in-the-box.md)
and [setup/installation.md](./setup/installation.md).

## 05 · What "customisable" means

Every Box is configured-to-order against:

- Your **model choice** (Llama 3/4/5, Qwen 2.5/3, Mistral, Gemma, custom
  MoE). We recommend against bespoke models unless you have a reason.
- Your **fine-tune corpus** — the data we pre-train on before burning
  the weights.
- Your **concurrency target** (drives tier selection).
- Your **network posture** — fully airgapped, VPN-only, or WireGuard out
  to your cloud plane.
- Your **SSO / IdP** — Okta, Azure AD, Google Workspace, OneLogin, JumpCloud,
  or LDAP for on-prem shops.
- Optional **HA pair** (Pro and up) — active / passive on a shared VIP.

Customisations happen during the 6-week Build engagement.

## 06 · What you pay for, clearly

1. **One-time** — the Box + Build engagement.
2. **Zero required recurring.** The Box will run forever on the model
   it shipped with, on the silicon you own, with no phone-home.
3. **Two optional plans** you can opt into at any time and cancel at any
   time: [Updates](./plans/updates-plan.md), [Support](./plans/support-plan.md).

Full pricing matrix in [offers.md](./offers.md).

## 07 · Common questions

### "What happens when Gemma 5 (or Llama 5, or Qwen 3) ships?"

If you're on the Updates plan, we pull the new base weights, re-fine-tune
on your corpus, validate against your eval harness, and push the new
image to your Box. A flag flips in the admin UI and traffic shifts over
the next maintenance window. No per-release charge; no chassis swap.

If you're not on the Updates plan, you keep the model that was on the
Box when you bought it. It continues to work indefinitely.

See [plans/updates-plan.md](./plans/updates-plan.md) for SLA and cadence.

### "What if the box dies in year 2?"

Hardware is covered by the OEM's standard 3-year warranty. You file an
RMA directly, they send a replacement chassis, we re-burn the weights
off your last snapshot. If you're on the Support plan, we shepherd the
whole thing for you — you just ship the bad chassis back.

### "Can we airgap it fully?"

Yes — at install time you select airgap mode. The Box then refuses to
make any outbound DNS / HTTP request ever. Model updates under the
Updates plan are delivered via a signed USB drive mailed to you
(quarterly) or via a one-way data diode if you have one.

### "What about HA?"

Pro and Cluster support active/passive HA with a shared VIP. Mini and
Box do not — failure modes for small deployments are rare enough that we
recommend "keep a cold spare and restore from snapshot" as the
recovery strategy.

### "Can we start on a Mini and scale to a Pro?"

Yes. Your weights are portable. We credit 80% of the list price of the
old chassis against the new one if you trade up within 24 months.

---

Next: pick a tier →
[Mini](./tiers/box-mini.md) ·
[Box](./tiers/box.md) ·
[Pro](./tiers/box-pro.md) ·
[Cluster](./tiers/box-cluster.md).
