[ Buyer’s guide · 2026 ]

Private AI for law firms, without the gamble.

A plain-language guide to what private AI does for a law firm, where it runs, how your data stays yours under Loi 25, what it costs, and how to buy it. Written for partners and IT directors, not engineers.

[ Contents ]

01 · Why now
02 · What it is
03 · Who it’s for
04 · Where it runs
05 · What it does
06 · Your data & Loi 25
07 · Security & governance
08 · How it’s built
09 · Deployment & timeline
10 · Hardware sizing
11 · What it costs
12 · The business case
13 · How it compares
14 · Procurement & IP
15 · Support & continuity
16 · What it doesn’t do
17 · Common questions
18 · Glossary

[ 01 · Why now ]

The AI is already in your firm. The question is whether you can see it.

Your associates are already pasting client matters into public AI tools after hours, because the policy says no but the tool in front of them says yes. That is the exposure a private AI removes: it gives lawyers an AI they prefer, on infrastructure you control, with a record of every question.

$10M

or 2% of worldwide turnover, the maximum administrative penalty under Loi 25.

Shadow AI

Most lawyers already use public AI, usually in browser tabs the firm can’t see, log, or govern.

Client audits

Corporate clients now send AI & data-security questionnaires before they send work.

[ 02 · What it is ]

We sell the platform and the AI layer, not the hardware.

Bilbs is the software: a lawyer-facing web app, an admin console, retrieval and inference, identity and access control, an audit log, and the connectors that read your firm’s documents. You provide the infrastructure it runs on, and a free consultation recommends which kind. We never resell hardware at a markup, and your firm’s data is never used to train Bilbs’, Anthropic’s, or anyone else’s models.

[ 03 · Who it’s for ]

Built for firms that hold confidential files.

Bilbs fits any practice that lives on privileged documents, from a 5-lawyer boutique to a 100-plus-lawyer national firm. It is a particularly strong fit when data residency, Loi 25, or a client’s AI questionnaire requires the AI to live inside the firm.

By size

Boutiques to national firms, 5 to 100-plus lawyers. Four hardware references map to the size of the practice; the platform is the same throughout.

By practice

Litigation, corporate, real estate, family, IP, in-house legal departments, and notarial offices. Bilingual French and English by design.

By pressure

Firms facing Loi 25 obligations, client AI questionnaires, shadow-AI risk, or succession risk on senior rainmakers.

[ 04 · Where it runs ]

Local first. Cloud if it fits.

Path A

On your own hardware

Open-weight models (Llama, Qwen, Gemma, Mistral) served locally with llama.cpp on a GPU server in your office. It is fully offline and can run with the network cable unplugged. Use hardware you already own, or we source it for you at cost.

· Nothing leaves the building
· You keep the fine-tuned model
· Installs from a USB key in ~30 minutes

Path B

In your AWS Bedrock account

The same platform, calling Claude through Amazon Bedrock in your own AWS account and region. No hardware to buy or maintain. Your prompts and documents stay inside your AWS tenancy and are not used to train the model.

· No hardware, no capex
· You choose the AWS region
· You pay AWS for Bedrock + Claude usage

Not sure which fits? The free consultation looks at your firm’s size, IT, and confidentiality requirements and recommends a path, with the hardware spec or the Bedrock estimate written down.

[ 05 · What it does ]

The capabilities that ship today.

Ask your matters

Plain-language questions answered from your own documents, every answer cited to the exact source file and paragraph.

Draft from precedent

First drafts of memos, letters and clauses built from the firm’s own templates and prior work, not a blank page.

Transcription & OCR

Offline Whisper transcription of calls and dictation, and Tesseract OCR (French + English) for scanned PDFs.

Matters, roles & access

Matter-scoped access with role-based permissions, so lawyers only see what they’re entitled to.
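For an IT director who wants to see the idea concretely, here is a minimal sketch of a matter-scoped, role-based check. It is an illustration only, not Bilbs’ actual code: the role names, the `ROLE_ACTIONS` table, the `User` type and the `is_allowed` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical role-to-action table; the real role names are not published here.
ROLE_ACTIONS = {
    "lawyer": {"read", "draft"},
    "paralegal": {"read"},
}


@dataclass(frozen=True)
class User:
    name: str
    role: str
    matters: frozenset  # IDs of the matters this user is assigned to


def is_allowed(user: User, action: str, matter_id: str) -> bool:
    """Allow only if the role permits the action AND the user is on the matter."""
    role_ok = action in ROLE_ACTIONS.get(user.role, set())
    return role_ok and matter_id in user.matters


alice = User("Alice", "lawyer", frozenset({"M-100"}))
print(is_allowed(alice, "read", "M-100"))  # True: her role and her matter
print(is_allowed(alice, "read", "M-200"))  # False: not her matter
```

Both conditions must hold, so a valid role alone never opens a matter the user is not assigned to.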

Audit log

Every prompt, answer and citation recorded, exportable for a client questionnaire or an internal review.

Redaction

Detect and redact personal and privileged information before documents are shared or exported.

[ 06 · Your data & Loi 25 ]

Your data stays under your control.

On the local path, documents and prompts never leave your building; the server can run air-gapped. On the AWS Bedrock path, everything stays inside your own AWS account and the region you choose, and Claude on Bedrock does not train on your inputs. Either way, access is controlled per matter, every interaction is logged, and answering a client’s “where does our data go?” questionnaire becomes a one-page answer. Loi 25 and PIPEDA were design constraints, not afterthoughts.

[ 07 · Security & governance ]

Built so the answer to a client audit is one page.

Data residency

On-premise, data never leaves the building and the server can run air-gapped. On Bedrock, it stays in your AWS account and region.

Access control

Role-based, matter-scoped permissions. Lawyers see only the matters they’re entitled to; admins manage roles centrally.

Authentication

Argon2 password hashing with JWT sessions and optional TOTP two-factor. Integrates with your identity directory.
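TOTP is the six-digit code an authenticator app shows. As a rough illustration of how such a code is computed and checked, here is a minimal sketch of the public standard (RFC 6238, HMAC-SHA1) using only the Python standard library. It is not Bilbs’ implementation; the function names `totp` and `verify` and the one-step drift window are assumptions made for the example.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a given Unix time."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of 30-second steps since the epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, code: str, at: Optional[float] = None,
           step: int = 30, window: int = 1) -> bool:
    """Accept the code for the current step or `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, now + drift * step, step, len(code)), code)
        for drift in range(-window, window + 1)
    )


# Published RFC 6238 test vector: ASCII secret, time = 59 s, 8 digits.
print(totp(b"12345678901234567890", at=59, digits=8))  # 94287082
```

The small drift window tolerates a phone clock that is slightly off, and the constant-time comparison avoids leaking how many digits matched.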

Audit log

Every prompt, answer and citation recorded and exportable, so “who asked what, when?” has a written answer.

Redaction

Detect and redact personal and privileged information before anything is shared or exported.

No model training

Your files are never used to train Bilbs’, Anthropic’s, or anyone else’s models. Full stop.

[ 08 · How it’s built ]

Boring, auditable infrastructure.

A small set of services your IT director can actually reason about: an API for authentication, role-based access and audit; an inference runtime doing retrieval-augmented generation over your documents; an ingestion worker that chunks, embeds and indexes files; and the lawyer and admin web apps. Retrieval uses PostgreSQL + pgvector; embeddings use BGE; jobs run on Redis.

Models are open-weight via llama.cpp on the local path, or Claude via Amazon Bedrock on the cloud path. Authentication is Argon2 + JWT with optional TOTP. Everything ships with a 30 to 50 page runbook so the firm is self-sufficient by design.
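To make “retrieval-augmented generation with citations” concrete, here is a toy sketch of the retrieval step. It is deliberately simplified and is not the production pipeline: a bag-of-words count stands in for BGE embeddings, an in-memory list stands in for PostgreSQL + pgvector, and the names `Chunk`, `embed`, `retrieve` and the `min_score` threshold are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str     # file the paragraph came from
    paragraph: int  # paragraph number within that file
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list, k: int = 2, min_score: float = 0.1) -> list:
    """Return up to k (score, chunk) pairs; an empty list means 'cannot ground'."""
    q = embed(question)
    scored = sorted(((cosine(q, embed(c.text)), c) for c in chunks),
                    key=lambda pair: pair[0], reverse=True)
    return [(score, c) for score, c in scored[:k] if score >= min_score]


chunks = [
    Chunk("lease_2023.pdf", 4, "The tenant may terminate the lease on ninety days written notice."),
    Chunk("shareholders.docx", 12, "Each shareholder holds a right of first refusal on any share transfer."),
]
for score, c in retrieve("What notice is required to terminate the lease?", chunks):
    print(f"{c.source}, paragraph {c.paragraph}")  # the citation shown to the lawyer
```

Each retrieved passage carries its file and paragraph, which is what allows an answer to be cited. When nothing clears the threshold the list is empty, and the system can say it has no grounded answer instead of guessing.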

[ 09 · Deployment & timeline ]

From first call to live in weeks, not quarters.

Typically 7–8 weeks contract-to-live for a single-server deployment; 18–24 weeks for a national cluster. One practice group goes first; the rest of the firm rolls in once they sign off. We install, stay on call, and own the runbook.

01 · Free

Audit & scope

We review your document system, identity, network and GPU situation, then recommend a path in writing. Free.

02 · ~Weeks 1–3

Install & connect

We deploy the platform on your server or AWS account and connect it to your document system, read-only.

03 · ~Weeks 3–6

Index & train

The platform indexes your matters and learns the firm’s style. Optional fine-tuning sharpens drafting on your precedents.

04 · ~Weeks 6–8

Train the team & go live

Two on-site sessions per practice group. Day one looks like the day before: lawyers just open a browser tab.

[ 10 · Hardware sizing ]

A reference for the local path, sized to the firm.

Indicative references for an on-premise deployment. Use hardware you already own, or we source it at transparent OEM cost. On the Bedrock path there’s no hardware at all.

Reference	Firm size	Indicative GPU	Form factor
Foundation	5–10 lawyers	1× RTX 5090	Tower
Practice	10–30 lawyers	2× RTX 5090	Water-cooled desktop
Firm	30–100 lawyers	2× RTX 6000 Ada / A100	4U rackmount
National	100+ lawyers	8× A100 SXM	Half-rack (H200/B200 path)

The free consultation confirms the exact spec, or the Bedrock usage estimate, before you commit. See The Box for full references.
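The sizing table can also be read as a simple lookup. The sketch below is illustrative only. The published ranges share their boundary values (10, 30 and 100 lawyers), so it assumes a boundary count falls into the smaller reference, and it assumes a firm below 5 lawyers maps to Foundation. Neither assumption comes from the guide, and the names `Reference` and `indicative_reference` are hypothetical.

```python
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    name: str
    max_lawyers: Optional[int]  # None means no upper bound
    gpu: str
    form_factor: str


# Rows copied from the table above; upper bounds treated as inclusive (assumption).
REFERENCES = [
    Reference("Foundation", 10, "1× RTX 5090", "Tower"),
    Reference("Practice", 30, "2× RTX 5090", "Water-cooled desktop"),
    Reference("Firm", 100, "2× RTX 6000 Ada / A100", "4U rackmount"),
    Reference("National", None, "8× A100 SXM", "Half-rack (H200/B200 path)"),
]


def indicative_reference(lawyers: int) -> Reference:
    """Return the first reference whose upper bound covers the head count."""
    if lawyers < 1:
        raise ValueError("lawyers must be a positive number")
    for ref in REFERENCES:
        if ref.max_lawyers is None or lawyers <= ref.max_lawyers:
            return ref
    raise AssertionError("unreachable: the last reference has no upper bound")


print(indicative_reference(60).name)  # Firm
```

This is a starting point for a conversation, not a quote: document volume and concurrent usage matter as much as head count.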

[ 11 · What it costs ]

One platform fee. Infrastructure at cost.

The platform & AI layer

from $99

/ lawyer / month · CAD

Covers the software, deployment, training, updates and support. One line on the invoice.

Infrastructure (on top, at cost)

Local: a GPU server you own. Use your own, or we source it at transparent OEM cost, no markup.

AWS Bedrock: your AWS Bedrock + Claude API usage, billed by AWS directly to your account.

A free consultation gives you the exact hardware spec or a Bedrock usage estimate before you commit.

In the per-lawyer fee

· Audit, deployment and indexing
· On-site training and onboarding
· Updates, model refresh and security patches
· Monitoring, backups and the audit log
· Support within 4 hours, 24/7 emergency line

A worked example

A 60-lawyer firm at $99 is $5,940 / month for the platform, one line on the invoice. Infrastructure is separate and at cost: a server you own, or a one-time OEM-cost quote, or metered AWS usage on the Bedrock path.
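The same arithmetic as a short sketch, in case you want to plug in your own head count. It covers the platform fee only, at the “from $99” rate; add-ons and infrastructure are separate. The function names are hypothetical.

```python
PLATFORM_FEE_CAD = 99  # "from $99" per lawyer per month, as quoted in this guide


def monthly_platform_fee(lawyers: int, fee_per_lawyer: int = PLATFORM_FEE_CAD) -> int:
    """Platform fee only; infrastructure and optional add-ons are billed separately."""
    if lawyers < 0:
        raise ValueError("lawyers cannot be negative")
    return lawyers * fee_per_lawyer


def annual_platform_fee(lawyers: int, fee_per_lawyer: int = PLATFORM_FEE_CAD) -> int:
    return 12 * monthly_platform_fee(lawyers, fee_per_lawyer)


print(monthly_platform_fee(60))  # 5940
print(annual_platform_fee(60))   # 71280
```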

Optional & not included

The fee rises only if you choose add-ons such as the fine-tuning tier or extra integrations. Not included: the infrastructure itself, and your AWS usage on the Bedrock path. We never mark up hardware. Cancellable on 30 days’ notice.

[ 12 · The business case ]

It pays for itself in three currencies.

The firm isn’t paying for GPUs or models. It is paying for reclaimed time, reduced risk, and operational leverage. Any one of the three usually covers the fee.

Reclaimed time

Hours lost to searching, first drafts, onboarding and summarising come back as billable capacity. One day of recovered leakage can cover a quarter of the fee.

Reduced risk

A single Loi 25 incident carries a maximum administrative penalty of $10M or 2% of worldwide turnover, whichever is greater. Keeping data in the building and logged is cheap insurance against that.

Operational leverage

Winning a client’s AI questionnaire, retaining a retiring partner’s know-how, and onboarding faster are each worth multiples of the platform’s price.

[ 13 · How it compares ]

The difference is architectural.

Public chatbots and cloud legal-AI tools are strong products. They run as hosted cloud services; Bilbs installs inside the firm. Here is how that plays out.

	Public AI (ChatGPT)	Cloud legal AI	Bilbs
Where data lives	Vendor cloud, often US	Vendor cloud	Your building or your AWS
Fine-tuned on your files	No	Limited	Yes (optional), and the model is yours to keep
On-prem audit log	No	Vendor-side	Yes, on your server
Runs offline	No	No	Yes (local path)
Loi 25 cross-border	Exposed	Depends on region	Not triggered on the local path; on Bedrock, depends on the AWS region you choose

[ 14 · Procurement, IP & paperwork ]

The contract answers the partnership will ask.

NDA & agreements

Mutual NDA within 24 hours of the first call. The MSA and privacy agreement are bilingual and already Loi 25-aligned; redlines welcome.

IP & ownership

On local deployments the firm owns the trained model outright; the software is licensed to the firm and lives in your IT’s repository. Your data is never used to train anyone’s models.

Warranty & security pack

Hardware carries the manufacturer’s 3-year warranty; we handle the paperwork. A security-questionnaire prefill and attestations are available for procurement.

[ 15 · Support & continuity ]

Installed, supported, and yours to keep.

We install, stay on call, and own the runbook, so your IT director’s weekend isn’t on the hook. And because it runs on your infrastructure, nothing depends on us still being here.

Response SLAs

Support response within 4 hours, plus a 24/7 line for Sev-1 emergencies. The team is based in Montréal, on Eastern time.

Kept current

Model refresh, retraining when you add files, security patches, monitoring and backups, all in the fee.

A real runbook

A 30 to 50 page runbook lives on your shared drive, so the firm is self-sufficient by design, not dependent on us.

No lock-in

If we close shop tomorrow, everything you use today keeps working on your own hardware, with the model and runbook in your hands.

[ 16 · What it doesn’t do ]

A first drafter, never a decider.

Honesty about the limits is part of the diligence. Every answer is grounded in and cited to your own files; if it can’t ground an answer, it says so rather than inventing one. The lawyer always reads the cited source before anything goes out, the same review a partner gives a junior’s draft.

· It does not replace legal judgment. Human oversight stays mandatory on every work product, in the app, the onboarding kit, and the terms.
· It does not browse the open web or send your data out to do so. It answers from your firm’s indexed material.
· It is not a case-management or billing system. It connects to the ones you already run and leaves them in charge.

[ 17 · Common questions ]

Local or AWS Bedrock: which should we pick?

Local suits firms that want everything air-gapped and predictable, with no per-query cost. Bedrock suits firms that prefer no hardware and want Claude’s capability, scaling usage up and down. The consultation recommends one based on your situation.

Do you mark up the hardware?

No. Use hardware you already own, or we source it at transparent OEM cost. We make our money on the platform, not the metal.

Is our data used to train models?

Never. On local it never leaves your hardware; on Bedrock it stays in your AWS account and region, and Claude on Bedrock does not train on your inputs.

What if Bilbs goes away?

The software is licensed to your firm and lives in your IT team’s repository; on local deployments the trained model is yours to keep, and a 30 to 50 page runbook lives on your shared drive. Everything keeps working.

How does it handle French and English?

Bilingually, by design. Lawyers ask in French or English and the AI answers in kind, citing French and English source documents alike. OCR covers both languages for scanned files, and the MSA and privacy agreement are bilingual and Loi 25-aligned.

Does it integrate with our document system?

Yes. It connects, read-only, to the document and practice-management systems you already run (iManage, NetDocuments, SharePoint/M365 and similar) through your existing identity directory, and indexes from there. Billing and time entry don’t change.

What about accuracy and hallucinations?

Answers are grounded in your own files and cited to the source. If it can’t ground an answer in your indexed material, it says so rather than inventing one, which sharply reduces the hallucinated-paragraph problem. The lawyer still reviews the cited source before anything leaves the firm.

[ 18 · Glossary ]

The jargon, in plain English.

On-premise / on-prem

Running on a server physically in your office, under your control, rather than in a vendor’s cloud.

Air-gapped

A machine with no network connection to the outside world. The on-prem path can run this way.

Open-weight model

An AI model whose parameters you can download and run yourself (Llama, Qwen, Gemma, Mistral), no external API.

RAG (retrieval-augmented generation)

The AI looks up relevant passages from your documents first, then answers from them, which is what lets it cite sources.

Fine-tuning

Further training the model on the firm’s own work so its drafting matches your house style. Optional.

AWS Bedrock

Amazon’s service for running models like Claude inside your own AWS account and region, with no training on your inputs.

Start with a conversation.

The 45-minute call is free and commits you to nothing. We talk through where your firm stands on shadow AI and Loi 25, and whether the local or Bedrock path fits the way you work.
