Lawyers vs Python

When the Floor Drops

I'm a Brazilian living in Japan. Married to a Japanese national, with a daughter.

Now I'm getting divorced.

When you're a foreigner divorcing in Japan, the power imbalance is brutal:

- The law is dense legalese - in Japanese, of course.

- Your spouse already has a lawyer lined up and a game plan. It feels like being a kid thrown into a ring with Mike Tyson.

- You're told: "Just sign here." And you don't even know what you're giving up.

My biggest fear: losing my daughter, with no say in visitation.

Searching for Answers

I googled. I ChatGPT'ed: "divorce custody Japan."

Some info popped up, but nothing that really prepared me.

So I tried a lawyer consultation (¥10,000/hour).

The lawyer was polite and answered my questions.

But the vibe was: "You're not even officially divorcing yet. Don't bother me until you have something legally sound."

I tried another lawyer. Conflicting answers. WTF?

I started realizing: lawyers are just humans who read those docs, interpret them, and charge per hour to repeat them back (sometimes unreliably).

Lawyers vs Python

George Hotz once said: "A human brain is ~20 PFLOPS."

So yeah, lawyers can be powerful. But also expensive, inconsistent, and slow.

Meanwhile, I've got a Linux box sitting under my desk:

- AMD Ryzen 5 4500 (6 cores / 12 threads)

- NVIDIA RTX 3060 (12GB)

- 16GB RAM

I was already running LLMs locally. So I did the obvious thing: I built an app.

What Broke First

I thought this would be a simple FastAPI RAG endpoint. It wasn't.

  • Data chaos: vertical Japanese PDFs, scanned tables, XML statutes with kanji article numbers (“第八百十九条”).

  • Tokenization hell: tokenizers built for English split on whitespace, and Japanese has no spaces between words, so they break on CJK text. I had to build Japanese-aware BM25 retrieval + embeddings.

  • Hallucinations: vanilla LLMs love to make up laws. That’s unacceptable in custody disputes. So I set guardrails:

    • Quote statutes verbatim or say “⚠️ no excerpt found.”

    • Don’t invent child-support numbers.
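
Two of the problems above are small enough to sketch in code. This is a minimal illustration, not the app's actual code: `kanji_to_int`, `article_numbers`, and `char_bigrams` are hypothetical helper names. It assumes article numbers stay below 10,000 and ignores branch articles (条の二 and the like). Character bigrams are one common stand-in for word tokens when no Japanese morphological analyzer is available; a real pipeline might use a proper tokenizer instead.

```python
import re

# Values for the kanji digits and multipliers used in statute article numbers.
DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "百": 100, "千": 1000}

# Matches references such as 第八百十九条 and captures the numeral part.
ARTICLE_RE = re.compile(r"第([一二三四五六七八九十百千]+)条")


def kanji_to_int(text: str) -> int:
    """Convert a kanji numeral below 10,000 (e.g. 八百十九) to an int."""
    total, digit = 0, 0
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in UNITS:
            # A bare unit such as 十 means 1 x 10.
            total += (digit or 1) * UNITS[ch]
            digit = 0
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    return total + digit


def article_numbers(text: str) -> list[int]:
    """Find every 第…条 reference in the text and return the numbers."""
    return [kanji_to_int(m) for m in ARTICLE_RE.findall(text)]


def char_bigrams(text: str) -> list[str]:
    """Split text into overlapping character bigrams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]
```

With this, `kanji_to_int("八百十九")` gives 819, so statute chunks can be indexed by plain integers, and `char_bigrams` output can be fed to any BM25 implementation that accepts pre-tokenized text.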

The Build

I ended up building a local RAG app:

  • Corpus: Japanese Civil Code (JP/EN), Domestic Relations Procedure Act, Hague Act, Houterasu/Embassy guides, 2019 child-support tables.

  • Retrieval: hybrid ranker (BM25 + embeddings), routed by issue (custody, visitation, child support, procedure).

  • Guardrails:

    • Custody → always cite Civil Code Articles 819 & 766

    • Visitation → guides + phased schedules, only if retrieved

    • Child support → cite the 算定表 (child-support calculation tables), never fabricate

  • Case profile: a simple YAML file with facts about my situation, fed to the model so it always had my context.
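
The retrieval and guardrail pieces above can be sketched roughly like this. Everything here is an assumption for illustration: the post doesn't say how the BM25 and embedding rankings are combined, so this sketch uses reciprocal rank fusion (RRF), and the names `rrf_fuse`, `quote_or_warn`, `missing_citations`, and `REQUIRED_ARTICLES` are hypothetical.

```python
NO_EXCERPT = "⚠️ no excerpt found."

# Articles a route must always cite, following the custody guardrail above.
REQUIRED_ARTICLES = {"custody": [819, 766]}


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document ids with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def quote_or_warn(quote: str, retrieved: list[str]) -> str:
    """Return the quote only if it appears verbatim in a retrieved passage."""
    if quote and any(quote in passage for passage in retrieved):
        return quote
    return NO_EXCERPT


def missing_citations(issue: str, cited: set[int]) -> list[int]:
    """List required article numbers that a draft answer failed to cite."""
    return [n for n in REQUIRED_ARTICLES.get(issue, []) if n not in cited]
```

The idea is that the model's draft answer is checked after generation: any statute quote that can't be found character-for-character in the retrieved passages is replaced with the warning string, and a custody answer that skips Article 819 or 766 gets flagged instead of shown.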

Not Walking Blind

Divorce is hell. Especially with kids.

I wish these things could always be solved amicably. Without lawyers. Without fear.

But they aren’t. At least not yet.

What I built doesn’t solve everything. But it means I’m not walking blind anymore.

👉 link to repo/code here