[ Firm · full-service tier ] 30–100 lawyers

The Firm.

The bigger server. A proper rack-mount with two power supplies, two network cables, and hot-swap drives — built so a single failure doesn’t knock anyone offline. It runs the biggest model at full precision for 30 to 100 lawyers across one or more offices. Built for full-service Québec firms and the handful of bay-street-adjacent firms in the province. Nothing leaves the building.

See the other tiers

from $29 / lawyer / month · one line on the invoice Hardware below · audited separately, yours or sourced by us

200 Concurrent users

9 Weeks from call to live

[ Who the Firm tier fits ]

For firms where AI is infrastructure, not a trial.

Full-service Quebec firms

30 to 100 lawyers across M&A, litigation, real estate, tax, and labour. The kind of firm that sees questionnaires from banks, insurers, and pension funds every week — and where the partnership wants the same answer on every one.

Multi-office practices

Montréal plus Québec City, Ottawa, or Toronto - one Firm-tier Box serves every associate over the corporate VPN. Per-office latency under 50 ms within province.

Firms with SLA clients

Where AI downtime affects a litigation deadline or a closing. Deterministic failover under 30 seconds. Optional HA pair for single-rack redundancy.

Firms growing out of Practice

Firms that started on the smaller Practice server and have grown past 30 lawyers. We swap the server for the price difference, re-train on the new hardware over a weekend, and the lawyers don’t lose a day of work.

[ Hardware specifications ]

Built on a dual EPYC 4U rackmount.

Form factor 4U rackmount (rack-native)

GPU 2× NVIDIA RTX 6000 Ada / A100
96–160 GB HBM combined · PCIe Gen 4/5

CPU Dual AMD EPYC 9354
64 cores / 128 threads total

RAM 512 GB DDR5-4800 ECC

Primary storage 4x 3.84 TB NVMe Gen 5 (RAID 10)
8 TB usable

Cold storage 8x 7.68 TB SATA SSD (RAID 10)
32 TB usable

Networking 2x 100 GbE QSFP28 + 2x 25 GbE SFP28

Power 2+2x 2600W 80-PLUS Titanium
Redundant, hot-swap

Cold spares included 1x PSU, 1x fan tray, 1x NVMe

Power & Environment

Idle power 520W

Inference load 1,100-1,600W

Peak / dual-GPU fine-tune 2,100W

Acoustic <75 dBA

Needs a proper rack room or sound-isolated closet

2x C19 (30A) on dedicated 208-240V circuits. Recommended UPS: 3500 VA on each feed for 10 min runtime.

[ Model performance ]

Run Gemma 4 70B FP16 at full precision.

Gemma 4 70B (FP16)

~48 tok/s

Tensor-parallel · default serving model · 200 concurrent

Gemma 4 70B (FP8)

~62 tok/s

250 concurrent · quantised for throughput

Gemma 4 27B (FP16)

~105 tok/s

Side serving model · lower-stakes drafting

Gemma 4 LoRA (firm)

~48 tok/s

Fine-tuned on the firm’s precedents

Simultaneous: 70B + 12B + embed

200+80

Chat users + agent users

Multi-model hosting

With up to 160 GB of GPU memory across two A100s and PCIe Gen 5 interconnect, host Gemma 4 70B FP16 + a Gemma 4 12B agent + Gemma 4 Embedding concurrently with efficient tensor parallelism. H200 NVL upgrade lifts that to 282 GB HBM3e.

[ Concurrency envelope ]

Scale to 200 simultaneous users.

Chat (Gemma 4 70B INT4, ~1k-token prompts) 200 users

Ask the firm’s files a question 160 users

Multi-step agents (6-10 tool-use) 120 users

Streaming transcription 60 streams

A mix of all of the above 80+40+40

[ High availability ]

Designed for active/passive HA pairs.

HA pair configuration

2x Pro in adjacent RUs, cross-connected via 100 GbE

Shared virtual IP managed by keepalived / VRRP

Weight sync over dedicated 25 GbE link, replicated every 30s

Snapshot: 1h rolling, 6h pinned, 24h offsite

Sub-30-second automatic failover, zero prompt loss

HA pair add-on

+ second server

Second Firm-spec server on standby · hardware quoted at OEM cost, same per-lawyer line

Recommended for production workloads. In-flight request queue replication means no lost prompts during failover. The $29 / lawyer / month subscription covers both servers — only the hardware is doubled.

[ Decision criteria ]

Pick Pro if any two are true.

The firm has 30 or more lawyers (or is on its way there)

The work has hard deadlines you cannot miss — litigation calendars, closings, regulator filings

Different practice groups need their own model, isolated from each other

A regulator or a client contract requires a second server that takes over automatically if the first one fails

You want the biggest model at full precision (the more accurate version)

Otherwise: the Practice tier may fit the firm better

[ Pricing ]

One line. From $29 per lawyer.

Bilbs subscription · per lawyer, per month

from $29

/ lawyer / month · in Canadian dollars · goes up only if the firm adds optional services

Free audit · we scope the firm before you sign anythingIncluded

Deployment, indexing, on-site training (per practice group)Included

Updates, support, audit log, 24/7 pager, OEM RMA shepherdingIncluded

Hardware (this Firm spec)Yours, or we source it

Full-service spec

4U rackmount with NVIDIA RTX 6000 Ada / A100, enterprise redundancy (dual PSU, hot-swap NVMe)

6-week deployment (fine-tune, indexing, on-site training)

Trained on every contract, memo, and matter your firm has handled

Cold spares: 1x PSU, 1x fan tray, 1x NVMe

One on-site install day (NA/UK/EU) included

Hardware path A · you have one

Use your server

Most full-service firms in this band have a server room and an IT director. If the firm already runs GPU-capable enterprise hardware (or has approved capex for it), we deploy onto it. The Firm spec on this page is the reference: 2× RTX 6000 Ada / A100, dual EPYC, 512 GB ECC, 4U rack with enterprise redundancy.

Hardware path B · you don’t

We source it

If the partnership wants to add private AI without the IT scope expanding, we source the Firm-spec server, install it on-site, configure the network, and own the runbook. Hardware sits on a separate, transparent quote at OEM list — your call once you see the spec. We don’t mark up the GPU.

If the firm grows

No penalty

If the firm grows past 100 lawyers, we swap the chassis for the National cluster and re-train on the new hardware. No re-platforming charge. The per-lawyer line stays the same.

[ Timeline ]

Contract to live in ~8 weeks.

Step 1

Contract + deposit

Same day

Step 2

Deployment

6 weeks

Step 3

Hardware

5-6 weeks

(H200 NVL lead times)

Step 4

White-glove freight

5-10 days

Step 5

On-site rack + bring-up

4-6 hours

Included

On-site install day

NA / UK / EU at no extra cost

AI infrastructure for a full-service firm.

A 45-minute call within one business day. We’ll tell you whether the Firm tier is the right size for your firm — and if it isn’t, we’ll tell you which one is. No pitch. No deck. No payment today.

Meet the person who installs it