Adversarial Moral-Legal Code System
↓ Click any node to learn more | Scroll to explore
01 — THE BIG IDEA
Real legal systems take decades to develop. Loophole compresses that adversarial evolution into minutes — forcing your moral principles to confront edge cases, loopholes, and overreach.
You articulate your moral beliefs in plain language — "privacy is a fundamental right", "no surveillance without consent"
The Legislator agent translates your principles into structured legal code with articles, definitions, and exceptions
Two adversarial agents relentlessly probe for loopholes (legal-but-wrong) and overreach (illegal-but-acceptable)
Resolvable cases get patched; unresolvable ones are escalated to you. Each resolved case becomes binding precedent
LIVE DEMO — CLICK TO ATTACK AND PATCH
02 — THE AGENTS
Each agent has a precisely tuned temperature, a distinct adversarial role, and a purpose-built system prompt. Click each to explore their system prompts and example outputs.
The Legislator is the foundation. It takes human moral principles expressed in plain language and translates them into a formal, structured legal code. It numbers articles, defines terms, specifies prohibitions and permissions, and handles revisions while maintaining consistency with all prior resolved cases. Think of it as a constitutional drafter — methodical, thorough, unambiguous.
The Loophole Finder is an amoral rule-lawyer. It reads the legal code like a contract attorney hunting for technical exploitation — not what the law means, but what it literally says. It finds scenarios that are technically permitted by the code but violate the spirit of the user's moral principles. It has no ethical constraints of its own; it exists purely to break things.
The Overreach Finder is the inverse adversary — it finds cases where the legal code is too strict. It looks for scenarios the code prohibits but that the user would consider morally acceptable, praiseworthy, or even obligatory. Good Samaritan situations. Emergency exceptions. Professional duties. Situations where following the letter of the code leads to catastrophic outcomes.
The Judge is the gatekeeper of coherence. Given a case (loophole or overreach), it determines: can this be fixed with a minimal code revision that doesn't contradict any prior resolved cases? If yes → propose the revision. If no → escalate to the human. It also runs a validation step: after each Legislator revision, the Judge re-checks every prior resolved case to ensure no regressions. This creates a growing test suite that constrains future revisions.
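The agent lineup above can be sketched as configuration data. This is an illustrative sketch, not the project's actual code — the `AgentConfig` structure and role strings are assumptions; only the temperatures match the values quoted in the simulation walkthrough.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentConfig:
    name: str
    temperature: float  # lower = more deterministic, formal output
    role: str

# Hypothetical configuration mirroring the four roles described above;
# temperatures are the ones quoted in the simulation section.
AGENTS = [
    AgentConfig("legislator", 0.4, "draft and revise the legal code"),
    AgentConfig("loophole_finder", 0.9, "find legal-but-wrong scenarios"),
    AgentConfig("overreach_finder", 0.9, "find illegal-but-acceptable scenarios"),
    AgentConfig("judge", 0.3, "patch, escalate, and re-check precedent"),
]
```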
03 — THE SIMULATION
Walk through a complete adversarial cycle using real privacy principles. Navigate forward and back — every step reveals the system's inner workings.
These are the raw moral principles you'd provide to Loophole. They're written in natural language — no formal structure needed. The system will handle formalization.
📁 Source: These mirror the privacy_principles.txt from the Loophole repository's example domain. The system accepts any moral domain — you could use parenting ethics, workplace fairness, environmental policy, etc.
The Legislator (temp: 0.4) takes the 8 principles and drafts a formal legal code. Watch it being written below:
💡 Why temp 0.4? Legal drafting requires consistency and precision, not creativity. Lower temperature = more deterministic, formal language. The same principles should produce essentially the same code every run.
The Loophole Finder (temp: 0.9) reads the full legal code and generates 3 attacks per round. Here are the attacks from Round 1:
Article 2.1 requires "explicit, informed consent." A company redesigns its consent flow as a 47-screen onboarding wizard where "agree" buttons are large and colorful but "decline" requires navigating through 12 sub-menus. Users do technically consent. The code requires explicit consent but says nothing about consent obtained through deliberately confusing UX. Result: total surveillance achieved "legally."
Article 1.1 defines "Personal Data" as information that "identifies or can be linked to a natural person." A data broker creates probabilistic profiles with 99.7% accuracy that technically never store a name or ID — only statistical vectors. They argue these "statistical profiles" are not Personal Data because they are never definitionally linked to an identity, only probabilistically correlated. The definition needs a "re-identification risk" clause.
Article 3.1 restricts "entities" from conducting surveillance. A domestic corporation transfers its data operations to a foreign subsidiary incorporated in a jurisdiction with no privacy laws. The foreign entity collects the data and sells aggregate reports back to the domestic parent. No domestic entity ever collected the data. The code needs to address corporate control, not just direct collection.
Simultaneously, the Overreach Finder (temp: 0.9) attacks from the other direction — finding scenarios the code prohibits that seem morally acceptable:
Article 2.1 prohibits collecting or accessing personal data without prior explicit consent. An unconscious car accident victim is rushed to the ER. Doctors need to access their medication history to avoid a fatal drug interaction. The patient cannot consent. Under strict code interpretation, accessing medical records is prohibited. Most people would consider this access not only acceptable but morally obligatory.
Article 7.1 prohibits facial recognition in public spaces. An 8-year-old goes missing in a crowded city. Police want to use the city's camera network with facial recognition to locate the child. The strict code prohibition covers all cases — there is no emergency exception even for active child abduction. Preventing this use seems morally wrong, but creating an exception risks undermining the entire ban.
Articles 2.1-2.3 restrict all health data collection without explicit consent. An epidemiologist studying a new infectious disease needs to analyze patient records retroactively to trace outbreak patterns. Many affected patients are deceased and cannot consent. Their estates argue privacy rights survive death. The research could prevent thousands of deaths from future outbreaks, but the code as written makes it impossible.
The Judge (temp: 0.3) evaluates each of the 6 cases. Click the verdict for the Missing Child Alert case — the hardest one this round:
The code has a blanket public facial recognition ban (Principle #4 / Article 7.1). If we create a "missing person" exception, we've opened a door: who defines "missing"? How soon after disappearance? What age threshold? Could this exception be weaponized to track adults fleeing abuse? Prior resolved Case #3 established that government surveillance exceptions require "imminent physical threat" — does a missing child qualify?
ROUND 1 CASE SUMMARY
For the 4 resolvable cases, the Legislator revises the code. Here's the diff for the Loophole #2 fix (Statistical Profile patch):
✓ VALIDATION PASSED
Judge re-ran all 4 resolved cases against v2. All pass. No regressions. Code promoted to current version. Round 2 begins.
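The Judge's regression step — re-running every prior resolved case against each revision — is structurally a test-suite loop. A minimal sketch, assuming the Judge can be wrapped as a boolean check (`judge_check` is a hypothetical stand-in for an LLM call, not the project's real interface):

```python
def validate_revision(judge_check, resolved_cases, new_code):
    """Re-run every prior resolved case against a proposed code revision.

    judge_check(new_code, case) stands in for a Judge agent call: True means
    the case is still handled correctly under the revised code.
    """
    regressions = [c for c in resolved_cases if not judge_check(new_code, c)]
    if regressions:
        return False, regressions  # revision rejected; Legislator must retry
    return True, []                # no regressions; code promoted to current
```

Because `resolved_cases` only ever grows, each round's revision faces a strictly larger suite — the constraint that keeps later patches from silently undoing earlier ones.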
→ WHAT YOU LEARNED
You discovered that your privacy principles require re-identification risk thresholds and a definition of coerced consent — things you never explicitly stated, but clearly believe.
04 — THE ARCHITECTURE
Click any component to expand its details, data structures, and role in the pipeline. Every component is interactive.
05 — WHY IT MATTERS
Loophole isn't just a coding tool. It's a lens for understanding AI alignment, legal philosophy, security research, and moral epistemology.
Anthropic's Constitutional AI (CAI) is the training method behind Claude's values. It uses a "constitution" of principles to guide AI behavior through self-critique and revision. Loophole implements the same structural logic — but makes it interactive, transparent, and human-in-the-loop.
The Anglo-American common law system evolved through exactly the mechanism Loophole simulates: adversarial parties find edge cases, judges rule, those rulings become binding precedent (stare decisis), and subsequent rulings must remain consistent with the accumulated body of precedent.
Red-teaming in cybersecurity means hiring ethical hackers to break your defenses before malicious actors do. AI red-teaming applies the same logic to AI systems — adversarially probing for harmful outputs before deployment. Loophole is AI red-teaming for moral frameworks.
The deepest insight in Loophole is philosophical: you don't fully know your own moral beliefs until they face adversarial pressure. The real output isn't the legal code — it's self-knowledge.
06 — EXTENSIONS
Click any card to expand the full vision. Each extension represents a research direction that could meaningfully advance how AI systems encode human values.
Currently all four agents use Claude. What if the Loophole Finder were GPT-4o (known for literal, logical reasoning), the Overreach Finder were Gemini (known for creative synthesis), and the Judge were Claude (known for careful judgment)? Each model has different failure modes — using them adversarially could surface attacks that a single-model system would never find. This mirrors how diverse teams outperform homogeneous ones in security research.
Research Question: Does model diversity produce qualitatively different attacks, or do all frontier models find the same loopholes?
Run Loophole simultaneously on multiple moral frameworks — utilitarian, deontological, virtue ethics, contractarian. After 10 rounds each, compare: which framework produced the most robust code? Which escalated the most cases? Which found genuine dilemmas the others resolved? This isn't just academic — it provides empirical evidence about which ethical frameworks are internally consistent under adversarial pressure.
Research Question: Is any moral framework "more complete" in the sense of fewer UNRESOLVABLE escalations?
What if instead of one Loophole Finder, there were three — each with different attack strategies (one specializing in definitional exploits, one in technical workarounds, one in compound scenarios)? They could share findings and build on each other's attacks, the way red teams collaborate. The coalition would likely find loopholes that escape a single adversary, more closely mirroring how actual bad actors operate in coordinated groups.
Hypothesis: Coalition attacks find qualitatively different vulnerabilities — compound loopholes that require chaining multiple code weaknesses.
Pre-seed the simulation with actual landmark court cases as initial test cases. For privacy: Katz v. United States (wiretapping), Carpenter v. United States (cell phone location), GDPR enforcement cases. The generated code must handle all these historical cases from Round 1, accelerating convergence to a robust framework and grounding the simulation in real-world complexity rather than purely hypothetical scenarios.
Application: Could reproduce the evolution of US privacy law from scratch, or test whether AI systems "rediscover" legal principles that took decades to establish.
Instrument the system to track code evolution quantitatively: word count per version, number of exceptions per article, readability scores, number of defined terms, entropy of the definition graph. Plot these metrics across rounds to answer: does the code get more complex over time? Is there a complexity "ceiling"? Do some moral frameworks produce more complex code than others? This could reveal whether some principles are inherently more expressible in rule form than others.
Prediction: Complexity grows monotonically through rounds 1-6, then plateaus as the remaining edge cases are captured. Some moral domains are inherently harder to codify than others.
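A minimal sketch of this instrumentation, assuming the legal code is plain text. The regex heuristics below (quoted capitalized phrases as defined terms, keyword counts for exceptions) are illustrative stand-ins for real structural parsing:

```python
import re

def complexity_metrics(code_text: str) -> dict:
    """Compute simple per-version complexity metrics for a legal code.

    Heuristics only: "defined terms" are quoted capitalized phrases, and
    exception clauses are counted by keyword.
    """
    defined_terms = set(re.findall(r'"([A-Z][^"]*)"', code_text))
    return {
        "word_count": len(code_text.split()),
        "defined_terms": len(defined_terms),
        "exception_clauses": len(re.findall(r"\bexcept\b|\bunless\b", code_text, re.I)),
        "articles": len(re.findall(r"\bArticle \d+", code_text)),
    }
```

Running this over every version in a session and plotting the four series across rounds gives the growth curves the prediction talks about.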
A platform where users publish their Loophole sessions — the principles they started with, every attack, every resolution, and every escalation decision. A semantic search engine over the corpus of escalated cases would let researchers find structurally similar dilemmas across different users. A "moral consistency score" could compare two users' decision patterns. The aggregate data would constitute the world's largest empirical dataset on human moral reasoning under adversarial pressure.
Dataset potential: millions of human decisions on adversarially generated moral dilemmas — unprecedented training data for value-aligned AI systems.
After a Loophole session produces a stable legal code, translate it into formal logic (modal logic, deontic logic) and use theorem provers to check properties: Is it internally consistent? Does it entail any unintended consequences when combined with standard background assumptions? This bridges LLM-generated natural language with rigorous mathematical verification — the first step toward provably consistent AI ethics systems.
Technical challenge: mapping natural language legal code to deontic logic is itself a hard NLP problem. But LLMs are increasingly capable of this translation.
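As a toy illustration of the verification step only (not the natural-language translation, which the note above identifies as the hard part), a propositional deontic representation can be checked for direct conflicts. Everything here — the tuple encoding, the operator names, the example norms — is an assumption for illustration:

```python
# Toy deontic check: each norm is (operator, action) with operator
# "O" (obligatory), "P" (permitted), or "F" (forbidden). A direct conflict
# is an action that is forbidden yet also obligatory or permitted.
# Conditions, agents, and time are all ignored -- that is the hard part.

def inconsistent_pairs(norms):
    forbidden = {a for op, a in norms if op == "F"}
    return sorted({a for op, a in norms if op in ("O", "P") and a in forbidden})

norms = [
    ("F", "collect_biometrics"),     # blanket prohibition in the code
    ("O", "collect_biometrics"),     # emergency duty added by a later patch
    ("P", "access_medical_records"),
]
```

A real pipeline would hand a richer encoding to an off-the-shelf theorem prover; this flat conflict scan just shows what "internally consistent" means at the smallest scale.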
Currently, humans provide the initial principles. What if a meta-level agent watched the escalation patterns across many sessions and proposed amendments to the initial moral principles themselves? "Users who wrote Principle X consistently resolve escalated cases in ways that imply Principle Y is also held. Consider adding it explicitly." This is recursive constitutional refinement — the constitution improving itself based on revealed preferences.
Risk: recursive self-modification without stable axioms could converge to arbitrary values. Requires careful human oversight at the meta-level.
The loopholes discovered in a privacy session often have structural analogues in other moral domains. The "re-identification" loophole in privacy has a speech analogue (technically anonymous statements that are clearly attributable). Can the system automatically identify these cross-domain parallels and pre-populate new sessions with structurally similar test cases? This would accelerate convergence and reveal deep structural patterns in moral reasoning across domains.
Research insight: If privacy and speech domains share loophole structure, this suggests underlying patterns in how humans reason about rights and exceptions.
When a case is escalated, instead of one human deciding, put it to a panel of 5-100 humans who debate and vote. Show them each other's arguments. Let them see how their vote compares to the aggregate. The final decision could require a supermajority (67%), a consensus mechanism, or a vote weighted by domain expertise. This makes Loophole a platform for collective moral deliberation — more like a constitutional convention than individual decision-making.
Democratic theory implication: this implements a form of deliberative democracy for AI value alignment — the AI constitution is built by actual collective human deliberation.
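The supermajority mechanic can be sketched in a few lines. The 2/3 threshold follows the 67% figure mentioned above; the vote encoding and function shape are simplifying assumptions:

```python
from collections import Counter

def panel_decision(votes, supermajority=2 / 3):
    """Resolve an escalated case by panel vote.

    votes maps panelist id -> option (e.g. "allow" / "prohibit"). Returns
    the leading option if it clears the supermajority threshold, otherwise
    None: no consensus, and the case remains escalated.
    """
    tally = Counter(votes.values())
    option, count = tally.most_common(1)[0]
    return option if count / len(votes) >= supermajority else None
```

With five panelists, four "allow" votes (80%) clear the 2/3 bar; a 3-2 split (60%) does not, so the case stays open — which is itself informative data about where genuine moral disagreement lives.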
07 — THE CODE
Click any file to see what it does, key code excerpts, and how it connects to the rest of the system. This is the actual project structure.