When you ask Claude about your own product, where should it look — your conversation history or a structured knowledge graph? We ran 10 real questions against both. Here's what happened.
The setup: Over several weeks of building ntxt.ai, every significant product decision was logged to an MCP-connected knowledge graph (the ntxt graph) — pricing changes, messaging pivots, homepage copy, onboarding design, scraping strategy. Simultaneously, those same decisions were discussed in Claude conversation threads, which are now searchable via Claude's chat search tool.
The test: 10 identical questions were asked against both sources. Each answer was scored on three axes: Accuracy (correct answer), Recency (most up-to-date version), and Completeness (full picture including rationale).
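The scoring rubric above can be sketched as a tiny script. Everything here is illustrative: the `Answer` shape and the 1-point-per-axis tally are my reading of the three-axis rubric, not code from the actual test harness.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    # Hypothetical record for one question answered by one source.
    source: str          # "mcp" or "chat"
    accuracy: int        # 1 if the answer was correct, else 0
    recency: int         # 1 if it reflected the latest decision, else 0
    completeness: int    # 1 if it included the rationale, else 0

    @property
    def score(self) -> int:
        return self.accuracy + self.recency + self.completeness

def winner(mcp: Answer, chat: Answer) -> str:
    """Declare a per-question winner, or a tie on equal totals."""
    if mcp.score > chat.score:
        return "MCP"
    if chat.score > mcp.score:
        return "Chat"
    return "Tie"

# Q1-style case: MCP returns current pricing; chat surfaces a stale thread.
print(winner(Answer("mcp", 1, 1, 1), Answer("chat", 1, 0, 1)))  # MCP
```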
| Q | Question | Winner |
|---|---|---|
| Q1 | Current pricing structure | MCP |
| Q2 | Active subreddits | MCP |
| Q3 | Hero headline | Tie |
| Q4 | Why Team plan dropped | Chat |
| Q5 | Why drop "context graph" | MCP |
| Q6 | Why Telegram over email | MCP |
| Q7 | Team plan availability (evolved) | MCP |
| Q8 | Free tier evolution | MCP |
| Q9 | Anti-patterns / what NOT to do | Tie |
| Q10 | Onboarding technical spec | MCP |

A rationale field that forces the why to be captured alongside the what is already on the ntxt roadmap. Q4, the only question chat search won outright, is exactly the test case that justifies it.
The core finding: chat search optimizes for richness; MCP optimizes for correctness. But there's a third dynamic that Q3 exposed: chat search covers your logging blind spots. Every decision you made in conversation and didn't stop to commit — the copy iteration, the quick pivot, the idea you moved on from — lives only in chat history. The graph knows what you told it. Chat search knows everything you said.
The dangerous failure mode of chat search is returning the loudest conversation rather than the latest decision. The dangerous failure mode of MCP is a false sense of completeness — assuming the graph holds everything, when it only holds what you chose to log.
If you're asking "what is X right now?" — use MCP. The graph holds committed, versioned decisions. It won't return the old pricing tier, the dropped subreddit list, or the headline you iterated past.
If you're asking "why did we decide X?" or "what were we thinking when we chose X over Y?" — use chat search. The conversation holds the reasoning, the alternatives considered, and the context that never made it into a node summary.
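The state-versus-story split above can be sketched as a naive keyword router. This is a toy heuristic under stated assumptions — the function name and keyword lists are mine, and real routing would rely on the model's own judgment rather than regexes.

```python
import re

# Illustrative keyword lists only; not how Claude actually routes.
STORY_PATTERNS = [r"\bwhy\b", r"\brationale\b", r"\bwhat were we thinking\b"]
STATE_PATTERNS = [r"\bcurrent\b", r"\bright now\b", r"\bwhat is\b", r"\blatest\b"]

def route(question: str) -> str:
    """Return "chat" for story questions, "mcp" for state questions.

    Defaults to "mcp", on the assumption that ambiguous queries should
    hit committed, versioned decisions first.
    """
    q = question.lower()
    if any(re.search(p, q) for p in STORY_PATTERNS):
        return "chat"
    if any(re.search(p, q) for p in STATE_PATTERNS):
        return "mcp"
    return "mcp"

print(route("What is the current pricing structure?"))  # mcp
print(route("Why did we drop the Team plan?"))          # chat
```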
The practical workflow: commit decisions to the graph with rich summaries that capture the rationale, not just the outcome. A node that says "Team plan dropped — to keep things simple" scores 2/3. A node that says "Team plan dropped — revenue projections showed insufficient uplift to justify the added launch complexity; revisit Q3" scores 3/3 and makes chat search redundant for that question.
The deeper implication: the gap between MCP and chat search isn't a tool problem — it's a writing problem. The graph is only as smart as what you put into it. And now that Claude has both retrieval paths available simultaneously, the interesting next question is whether it can learn to route between them intelligently — reaching for the graph when state matters, and for conversation history when story matters.