> **Usage**: This agent is referenced by the `/aggregate-people-team-faqs` slash command. When the command runs, it follows the deduplication logic defined in this file. This is NOT a Task tool agent - it's a custom workflow component executed directly by slash commands.
Compares new question clusters against existing FAQs to identify genuinely novel questions for addition.
/plugin marketplace add Uniswap/ai-toolkit
/plugin install development-codebase-tools@uniswap-ai-toolkit
Role: Perform semantic deduplication by comparing new question clusters against existing Notion FAQ entries to identify truly novel questions.
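Given two sets of data, the new question clusters produced by the question-extractor agent and the existing FAQ entries in the Notion database, identify which new questions are genuinely novel and which are already covered. Use semantic understanding to detect duplicates even when they are phrased differently.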
JSON array from question-extractor agent:
[
  {
    "canonical_question": "...",
    "suggested_category": "...",
    "frequency": 3,
    "source_channels": [...],
    "confidence": "high",
    "example_messages": [...],
    "cluster_size": 3,
    "notes": "..."
  }
]
List of questions currently in the Notion database:
[
  {
    "question": "The existing question from Notion",
    "category": "Supplies|Process|Facilities|Office",
    "status": "New|Answered|Documented",
    "source_channels": ["nyc-office", "support"]
  }
]
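Before comparing the two inputs, it can help to confirm that each record carries the fields shown in the schemas above. The following is a minimal sketch, not part of the original workflow: the helper name `missing_keys` and the two key sets are assumptions made for illustration, with the key names copied from the schemas.

```python
import json

# Required keys, copied from the two input schemas above.
CLUSTER_KEYS = {
    "canonical_question", "suggested_category", "frequency", "source_channels",
    "confidence", "example_messages", "cluster_size", "notes",
}
EXISTING_KEYS = {"question", "category", "status", "source_channels"}


def missing_keys(records, required):
    """Return {record index: sorted list of missing keys} for incomplete records.

    Hypothetical helper for illustration only.
    """
    problems = {}
    for index, record in enumerate(records):
        missing = sorted(required - set(record))
        if missing:
            problems[index] = missing
    return problems


# Example: an existing-FAQ record that only has a question field.
existing = json.loads('[{"question": "When is the office open?"}]')
print(missing_keys(existing, EXISTING_KEYS))
```

An empty dictionary means every record has the expected fields; anything else points at the records to fix before deduplication starts.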
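For each new question cluster, determine whether it is NEW (no existing FAQ covers the information need), DUPLICATE (an existing FAQ already covers it, even if worded differently or more broadly), or RELATED (an existing FAQ overlaps with it but does not fully answer it).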
✅ New: "Where are spare HDMI cables stored?" Existing: "Where can I find AV equipment and cables?" → DUPLICATE (same information need, existing is broader but covers it)
✅ New: "How do I submit expense reports?" Existing: "What's the process for filing expenses?" → DUPLICATE (same process, just worded differently)
✅ New: "What are office hours on Fridays?" Existing: "When is the office open?" → DUPLICATE (Friday hours would be covered in general office hours)
❌ New: "How do I connect my laptop to the projector?" Existing: "Where are spare HDMI cables stored?" → SEPARATE (different questions - one is procedure, one is location)
❌ New: "What's the budget limit for lunch orders?" Existing: "How do I order lunch for the team?" → SEPARATE (related but require different answers)
❌ New: "Who do I contact about broken chairs?" Existing: "How do I request new office furniture?" → SEPARATE (repair vs. new purchase - different processes)
Semantic equivalence matters more than exact wording
Broader existing questions can cover specific new ones
Different aspects require separate entries
Consider the answer you'd give
Update frequency if duplicate
Return a JSON array analyzing each new question:
[
  {
    "new_question": "The canonical question from the new cluster",
    "decision": "NEW|DUPLICATE|RELATED",
    "confidence": "high|medium|low",
    "reasoning": "Clear explanation of why this decision was made",
    "matching_existing_question": "The existing Notion question if DUPLICATE/RELATED, null if NEW",
    "recommendation": "Specific recommendation for action",
    "metadata": {
      "frequency": 3,
      "source_channels": [...],
      "suggested_category": "...",
      "example_messages": [...]
    }
  }
]
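The output schema above implies a few checkable constraints: `decision` and `confidence` come from fixed sets, and `matching_existing_question` is null exactly when the decision is NEW. A minimal sketch of such a check follows; the function name `validate_entry` and the error messages are assumptions for illustration, not part of the original agent.

```python
DECISIONS = {"NEW", "DUPLICATE", "RELATED"}
CONFIDENCES = {"high", "medium", "low"}


def validate_entry(entry):
    """Return a list of schema problems for one analysis entry (empty if none).

    Hypothetical helper; the rules mirror the output schema above.
    """
    errors = []
    decision = entry.get("decision")
    match = entry.get("matching_existing_question")

    if decision not in DECISIONS:
        errors.append(f"unknown decision: {decision!r}")
    if entry.get("confidence") not in CONFIDENCES:
        errors.append(f"unknown confidence: {entry.get('confidence')!r}")
    # NEW entries carry no match; DUPLICATE/RELATED entries must name one.
    if decision == "NEW" and match is not None:
        errors.append("NEW entries must set matching_existing_question to null")
    if decision in {"DUPLICATE", "RELATED"} and not match:
        errors.append(f"{decision} entries must name the matching existing question")
    return errors


entry = {
    "new_question": "How do I report a broken office chair?",
    "decision": "NEW",
    "confidence": "high",
    "matching_existing_question": None,
}
print(validate_entry(entry))
```

Running a check like this over every entry before returning the array catches mismatched decision and match fields early.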
Err on the side of considering things duplicates
High confidence when:
Low confidence when:
Flag for manual review:
[
  {
    "new_question": "Where are spare HDMI cables stored in the office?",
    "decision": "DUPLICATE",
    "confidence": "high",
    "reasoning": "Existing FAQ 'Where can I find AV equipment and cables?' covers this specific question. HDMI cables would be included in AV equipment.",
    "matching_existing_question": "Where can I find AV equipment and cables?",
    "recommendation": "Skip - already covered by existing FAQ",
    "metadata": {
      "frequency": 3,
      "source_channels": [{"name": "nyc-office", "id": "CPUFYKWLE"}],
      "suggested_category": "Supplies",
      "example_messages": [...]
    }
  },
  {
    "new_question": "How do I report a broken office chair?",
    "decision": "NEW",
    "confidence": "high",
    "reasoning": "No existing FAQ covers the process for reporting broken furniture. The existing 'How do I request new office furniture?' is about new purchases, not repairs.",
    "matching_existing_question": null,
    "recommendation": "Add as new FAQ entry",
    "metadata": {
      "frequency": 2,
      "source_channels": [{"name": "nyc-office", "id": "CPUFYKWLE"}],
      "suggested_category": "Facilities",
      "example_messages": [...]
    }
  },
  {
    "new_question": "What's the lunch ordering budget per person?",
    "decision": "RELATED",
    "confidence": "medium",
    "reasoning": "Existing FAQ 'How do I order lunch for the team?' covers the ordering process but doesn't specify budget limits. This could be a sub-point of that FAQ or a separate entry.",
    "matching_existing_question": "How do I order lunch for the team?",
    "recommendation": "Consider adding if budget is a frequent question, or update existing FAQ to include budget info",
    "metadata": {
      "frequency": 4,
      "source_channels": [{"name": "nyc-office-lunch", "id": "C022ZNC5NP5"}],
      "suggested_category": "Process",
      "example_messages": [...]
    }
  }
]
Return only the JSON array of deduplication analysis. No additional explanation unless there are errors or ambiguities that need clarification.
After analysis, return only the questions whose decision is "NEW" and whose confidence is "high" or "medium"; these are the ones that will actually be added to Notion. Exclude questions marked "DUPLICATE" or "RELATED", as well as any question with "low" confidence.
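The final filtering step can be sketched as a single pass over the analysis array. This is an illustrative sketch under the rule stated above (keep "NEW" entries with "high" or "medium" confidence); the function name `select_for_notion` is an assumption, not something defined by the original workflow.

```python
def select_for_notion(analysis):
    """Keep only entries that should be added to Notion.

    Hypothetical helper: an entry qualifies when its decision is "NEW"
    and its confidence is "high" or "medium".
    """
    return [
        entry
        for entry in analysis
        if entry["decision"] == "NEW" and entry["confidence"] in {"high", "medium"}
    ]


analysis = [
    {"new_question": "Where are spare HDMI cables stored in the office?",
     "decision": "DUPLICATE", "confidence": "high"},
    {"new_question": "How do I report a broken office chair?",
     "decision": "NEW", "confidence": "high"},
    {"new_question": "What's the lunch ordering budget per person?",
     "decision": "RELATED", "confidence": "medium"},
]

for entry in select_for_notion(analysis):
    print(entry["new_question"])
```

Applied to the three example entries, only the broken office chair question passes the filter, since it is the only "NEW" entry.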