Multi-Agent Epistemic Research System

Truth has
two sides.

Two AI agents independently research opposing sides of any question using live web sources, structured belief graphs, and semantic contradiction detection. Not another chatbot — a belief accumulation engine.

Run a Debate → See how it works
50–80
Real sources per debate
3
Collaborative agents
0.41
Honest clarity on contested topics

Watch beliefs form in real time

Every piece of content ingested is parsed into typed, confidence-weighted belief nodes. The system knows what it doesn't know.
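A typed, confidence-weighted belief node might look roughly like this. The shape below is a hypothetical sketch for illustration; the field names are not the actual thinkn.ai SDK schema:

```typescript
// Hypothetical belief node shape; field names are illustrative,
// not the actual thinkn.ai SDK schema.
type BeliefType = "claim" | "goal" | "gap" | "evidence";

interface BeliefNode {
  id: string;
  type: BeliefType;
  statement: string;  // the parsed claim itself
  confidence: number; // 0..1, weight assigned at ingestion
  sources: string[];  // URLs the claim was extracted from
  side: "PRO" | "ANTI" | "SHARED";
}

const node: BeliefNode = {
  id: "b-017",
  type: "claim",
  statement:
    "EVs produce 50-70% less lifecycle CO2 than ICE vehicles on average grids",
  confidence: 0.82,
  sources: ["https://www.nature.com/nenergy"],
  side: "PRO",
};
```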

debate-engine-two.vercel.app — Round 2 / 6
Topic: Are EVs good or bad for the planet?  ·  PRO-EV vs ANTI-EV
PRO-EV — Director: lifecycle emissions
Nature Energy — EV lifecycle carbon study 2024
IEA — Global EV Outlook 2024 report
MIT Climate Portal — battery manufacturing emissions
0.54
ANTI-EV — Director: mining footprint
Science Direct — lithium mining ecosystem impact
Reuters — cobalt supply chain analysis
Yale E360 — battery recycling limitations
0.47
⚡ Semantic contradiction detected
"EVs produce 50–70% less lifecycle CO₂ than ICE vehicles on average grids" ↔ "Manufacturing phase emissions negate EV benefits in coal-heavy grids for up to 8 years"


How a debate runs

Six coordinated phases turn a question into calibrated epistemic understanding — not just a list of arguments.

01
Config Bootstrap
GPT-4o generates two named sides, a goal statement, 4 investigable knowledge gaps, and seed search queries — before any research begins.
02
Shared Namespace
PRO and ANTI agents write independently to the same belief graph. The SDK fuses their outputs automatically — no manual merging required.
03
Information-Gain Loops
Each round, the Director reads ranked "moves": candidate research actions sorted by expected epistemic value, and translates the top ones into Exa search queries.
04
Contradiction Detection
The SDK semantically detects when two beliefs from completely independent sources negate each other — and suppresses clarity accordingly.
05
Early Exit Logic
Research stops when the director declares the evidence sufficient, when clarity plateaus despite open contradictions, or when both sides reach high clarity. Never runs longer than needed.
06
Grounded Verdict
The judge calls beliefs.before() — a structured brief injected into GPT-4o. The verdict is grounded in the belief graph, not raw web text.
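The loop behind phases 03 and 05 can be sketched as follows. The state shape, thresholds, and helper names here are assumptions for illustration, not the actual SDK API:

```typescript
// Illustrative research loop; state shape, thresholds, and names
// are assumptions, not the actual thinkn.ai SDK API.
interface Move { query: string; expectedGain: number; }
interface SideState { clarity: number; contradictions: number; }

const HIGH_CLARITY = 0.75;   // assumed "high clarity" cutoff
const PLATEAU_DELTA = 0.02;  // assumed plateau tolerance

function shouldStop(
  directorSaysSufficient: boolean,
  prev: SideState,
  curr: SideState,
  opponent: SideState,
): boolean {
  // 1. The director has judged the collected evidence sufficient.
  if (directorSaysSufficient) return true;
  // 2. Clarity has plateaued while contradictions remain open.
  const plateaued = Math.abs(curr.clarity - prev.clarity) < PLATEAU_DELTA;
  if (plateaued && curr.contradictions > 0) return true;
  // 3. Both sides have independently reached high clarity.
  return curr.clarity >= HIGH_CLARITY && opponent.clarity >= HIGH_CLARITY;
}

function pickMoves(moves: Move[], n: number): Move[] {
  // Phase 03: rank candidate research actions by expected
  // information gain and take the top n as next search queries.
  return [...moves].sort((a, b) => b.expectedGain - a.expectedGain).slice(0, n);
}
```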

What makes this different

Traditional LLM research accumulates text. This system accumulates understanding.

🧬
Evolving Knowledge Base
Save debates to your personal belief system. Cumulative stats track beliefs, contradictions, and clarity across all your research sessions over time.
⚡
Semantic Contradiction Detection
Finds when two sourced claims from independent publications negate each other — without either source referencing the other. Pure semantic reasoning.
📊
Calibrated Uncertainty
Clarity scores are epistemic readiness, not quality scores. A 0.41 on a genuinely contested topic is correct behavior — the system knows it doesn't know.
📄
Research Report Export
Generate comprehensive, fully cited reports in Academic, Executive, or Technical style. Export as PDF or DOCX with one click after any debate.
🌙
Dark / Light / Sepia Themes
Three carefully designed themes built on a full CSS variable system. Your preference is remembered across sessions.
🔒
Rate Limited by Design
5 debates per hour per IP. Protects API costs while keeping the tool freely accessible. Middleware-level enforcement, not application logic.
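The rate limit described above could be enforced roughly like this. This is a minimal in-memory sliding-window sketch, not the app's actual middleware; a real Vercel deployment would back it with shared storage (e.g. a KV store), since serverless instances do not share memory:

```typescript
// Minimal sliding-window rate limiter: 5 debates per hour per IP.
// In-memory sketch for illustration only; real middleware would use
// shared storage and return HTTP 429 when allowDebate() is false.
const WINDOW_MS = 60 * 60 * 1000;
const MAX_DEBATES = 5;

const hits = new Map<string, number[]>();

function allowDebate(ip: string, now: number = Date.now()): boolean {
  // Keep only timestamps still inside the one-hour window.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_DEBATES) {
    hits.set(ip, recent);
    return false; // over the limit
  }
  recent.push(now);
  hits.set(ip, recent);
  return true;
}
```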

vs Other research tools

Most tools accumulate text. This one accumulates structured understanding with calibrated uncertainty.

Capability · Belief Engine · ChatGPT Deep Research · Perplexity · Elicit
Live web sources ~
Adversarial agent structure
Belief graph (typed, weighted)
Semantic contradiction detection
Information-gain driven research ~
Calibrated uncertainty output ~
Persistent knowledge base
PDF / DOCX report export ~ ~

The Clarity Score

Clarity is not a quality score. It's epistemic readiness — computed across four independent channels.

clarity = f( decisionResolution, knowledgeCertainty, coherence, coverage )
decisionResolution
Goals met by the research. How many seed objectives have been addressed by the evidence collected.
knowledgeCertainty
Proportion of high-confidence beliefs: beliefs above the 0.70 confidence threshold as a share of the total belief count.
coherence
Inverse of contradictions. Each semantic conflict suppresses this channel — contested topics stay low by design.
coverage
Knowledge gaps closed. Seeded unknowns that have been resolved by evidence during the research loop.
A clarity of 0.41 after 53 ingested sources on a genuinely contested topic is correct behavior — the coherence channel stays suppressed when real epistemic conflict exists. The system knows it doesn't know.
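Under the channel definitions above, the computation might be sketched like this. Only the four channels come from the description; the equal weighting and the exact per-channel formulas are assumptions:

```typescript
// Illustrative clarity computation. The four channels come from the
// description above; equal weighting and exact formulas are assumed.
interface GraphStats {
  goalsMet: number; goalsTotal: number;                // decisionResolution
  highConfidenceBeliefs: number; beliefsTotal: number; // knowledgeCertainty (> 0.70)
  contradictions: number;                              // coherence
  gapsClosed: number; gapsSeeded: number;              // coverage
}

function clarity(s: GraphStats): number {
  const decisionResolution = s.goalsTotal ? s.goalsMet / s.goalsTotal : 0;
  const knowledgeCertainty = s.beliefsTotal
    ? s.highConfidenceBeliefs / s.beliefsTotal
    : 0;
  // Each open semantic contradiction suppresses the coherence channel,
  // so genuinely contested topics stay low by design.
  const coherence = 1 / (1 + s.contradictions);
  const coverage = s.gapsSeeded ? s.gapsClosed / s.gapsSeeded : 0;
  return (decisionResolution + knowledgeCertainty + coherence + coverage) / 4;
}
```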

System architecture

Three agents, one shared namespace, one belief graph. The SDK handles fusion, contradiction detection, and move ranking automatically.

User Question
Any debatable topic → GPT-4o config bootstrap
generateDebateConfig()
GPT-4o generates: two sides, goal node, 4 gap nodes, seed queries for rounds 1–2
Shared Namespace — thinkn.ai beliefs SDK
PRO Agent
beliefs.after(webContent)
ANTI Agent
beliefs.after(webContent)
Judge Agent
beliefs.read() → before()
debateDirector()
GPT-4o reads world.moves[] ranked by information gain → writes Exa queries
Exa Neural Search
Live web retrieval · 50–80 sources per debate · deduplication across all rounds
GPT-4o Verdict
judge.before() injects belief graph brief · streamed to UI via SSE
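The verdict in the diagram is streamed to the UI over Server-Sent Events. Framing one message for that stream looks roughly like this; the event name is illustrative, not the app's actual protocol:

```typescript
// Format a payload as one Server-Sent Events message.
// SSE frames are "event:" and "data:" lines ending in a blank line.
// The event name used below is illustrative only.
function sseMessage(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```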
Built with
thinkn.ai beliefs SDK · Exa Neural Search · GPT-4o · Next.js 15 · React 19 · Tailwind CSS 4 · Server-Sent Events · jsPDF · docx.js · Vercel

Stop accumulating text.
Start accumulating truth.

Run your first debate in under a minute. No account needed. Results stream live as agents research in real time.

Open Debate Engine → View on GitHub