Enterprise Multi-Bot Platform

Beyond a Chatbot Widget

An enterprise-grade multi-bot AI platform with unlimited workspaces, isolated data per bot, membership-aware access control, and 40+ AI models. No per-bot fees, no per-user charges, no per-message caps.

80+
Sources per Deep Query
3,072
Embedding Dimensions
<5s
Quick Mode Response
99.9%
Uptime SLA
Live Platform Demo

See It In Action

This is not a mockup. This is a production deployment serving thousands of members with tiered content access, real-time streaming, and cross-session memory.

Enterprise AI Assistant
Membership-aware knowledge retrieval
Online

How can I help you today?

Ask anything about your organization's knowledge base, policies, and documents.

Compare our health insurance plans
What changed in the Q4 compliance policy?
Show me a cost breakdown for Mexico residency
Summarize the latest board meeting notes

Compare the tax benefits of Portugal vs Panama for US citizens

Classifying intent... Retrieving from 4 collections... Generating response...

Based on 12 verified sources from your knowledge base, here is a detailed comparison:

Tax_Guide_Portugal_2025.pdf
Panama_Residency_Overview.pdf
US_Expat_Tax_Obligations.pdf
Intent: Comparison · 12 Sources · High Confidence
Send a message...
Powered by Private RAG Infrastructure
12 Enterprise Capabilities

What Makes This an Enterprise Solution

Every component is purpose-built for organizations that need unlimited bot workspaces, isolated data, and more than a simple Q&A widget — without per-bot or per-user pricing.

01

Membership Authentication & Tiered Access

Token-based authentication integrating with WordPress, custom CMS, or any membership platform. Per-user identity with group-based access control routes queries to the correct content collection. Content is physically separated into distinct vector database collections per tier.

Per-user identity on every request
Group-based collection routing
Admin-aware elevated access
IP-based free-tier rate limiting
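The tier-to-collection routing and free-tier throttling described above can be sketched in a few lines of Python. Collection names, the admin elevation rule, and the sliding-window limiter are illustrative assumptions, not the platform's actual identifiers:

```python
from time import monotonic

# Hypothetical tier -> vector collection mapping; names are illustrative.
TIER_COLLECTIONS = {
    "free": "kb_free",
    "standard": "kb_standard",
    "premium": "kb_premium",
    "vip": "kb_vip",
}

def route_collection(tier: str, is_admin: bool = False) -> str:
    """Map a member's tier to its physically isolated vector collection.
    Admins are elevated to the highest tier's collection."""
    if is_admin:
        return TIER_COLLECTIONS["vip"]
    return TIER_COLLECTIONS.get(tier, TIER_COLLECTIONS["free"])

class IPRateLimiter:
    """Sliding-window rate limiter keyed by IP for unauthenticated traffic."""
    def __init__(self, limit: int, window_s: float):
        self.limit, self.window_s = limit, window_s
        self._hits: dict[str, list[float]] = {}

    def allow(self, ip: str) -> bool:
        now = monotonic()
        # Keep only hits still inside the window, then test the budget.
        hits = [t for t in self._hits.get(ip, []) if now - t < self.window_s]
        allowed = len(hits) < self.limit
        if allowed:
            hits.append(now)
        self._hits[ip] = hits
        return allowed
```

In a FastAPI deployment, `route_collection` would typically run inside an auth dependency so every request carries a resolved collection name before retrieval starts.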
02

Multi-Collection Knowledge Base

Content organized into separate vector database collections per membership tier. Each independently indexed, updated, and searchable. Supports articles, PDFs, video transcripts, podcasts, HTML, JSON, and CSV data.

Independent per-tier collections
Incremental CMS sync without re-indexing
Multi-format content support
Content deduplication & tracking
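One way incremental CMS sync can avoid full re-indexing is by fingerprinting each document and diffing against what is already indexed. This is a minimal sketch under that assumption; `plan_sync` and the hashing scheme are hypothetical, not the platform's actual sync logic:

```python
import hashlib

def content_fingerprint(doc_id: str, body: str) -> str:
    """Stable hash of a document's identity and content."""
    return hashlib.sha256(f"{doc_id}:{body}".encode()).hexdigest()

def plan_sync(cms_docs, indexed):
    """Diff CMS documents against the indexed state.

    cms_docs: iterable of (doc_id, body) from the CMS export.
    indexed:  {doc_id: fingerprint} recorded at last sync.
    Returns (to_upsert, to_delete) so only changed content is re-embedded.
    """
    to_upsert, seen = [], set()
    for doc_id, body in cms_docs:
        seen.add(doc_id)
        fp = content_fingerprint(doc_id, body)
        if indexed.get(doc_id) != fp:      # new or changed document
            to_upsert.append((doc_id, fp))
    to_delete = [d for d in indexed if d not in seen]  # removed from CMS
    return to_upsert, to_delete
```

Unchanged documents produce no work at all, which is what keeps per-tier collections cheap to refresh on a schedule.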
03

Persistent Session Management

Every conversation tied to a unique session ID with full multi-turn context. Chat history stored in PostgreSQL, surviving server restarts. Sessions track message count, timestamps, user identity, access tier, detected intents, and IP addresses.

PostgreSQL-backed persistence
Multi-turn context preservation
Auto-generated session titles
Searchable via admin dashboard
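A schema along these lines would support the session features above. The production store is PostgreSQL; sqlite3 stands in here so the sketch is self-contained, and all table and column names are illustrative:

```python
import sqlite3

# Illustrative schema (PostgreSQL in production; sqlite3 used here for the demo).
DDL = """
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    tier       TEXT NOT NULL,
    title      TEXT,
    ip         TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(session_id),
    role       TEXT NOT NULL,          -- 'user' or 'assistant'
    content    TEXT NOT NULL,
    intent     TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def load_context(conn, session_id, limit=20):
    """Fetch the last N turns in order, so multi-turn context survives restarts."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? "
        "ORDER BY id DESC LIMIT ?", (session_id, limit)).fetchall()
    return list(reversed(rows))
```

Because every message row carries the session ID, the same tables back both the chat context window and the admin dashboard's session browser.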
04

Cross-Session User Memory

Graph-based knowledge store (Graphiti) maintains long-term memory across sessions. Key facts stored as episodes in each user's memory graph. Subsequent visits retrieve relevant history for personalized responses. Memory is scoped per user per collection — zero cross-user leakage.

Graph-based episode storage
Cross-session fact retrieval
Per-user memory isolation
Personalized context injection
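The isolation guarantee above comes down to how memory is keyed. This toy in-memory store models only that scoping rule; the real platform uses Graphiti's graph store, and the `Episode` shape and keyword recall here are simplifications:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Episode:
    fact: str
    session_id: str

class ScopedMemory:
    """Toy stand-in for the graph-backed memory: episodes are keyed by
    (user_id, collection), so retrieval can never cross user boundaries."""
    def __init__(self):
        self._graph: dict[tuple, list[Episode]] = defaultdict(list)

    def add_episode(self, user_id, collection, fact, session_id):
        self._graph[(user_id, collection)].append(Episode(fact, session_id))

    def recall(self, user_id, collection, query):
        """Naive keyword overlap; a real store would do graph/semantic search."""
        scope = self._graph[(user_id, collection)]
        terms = set(query.lower().split())
        return [e.fact for e in scope if terms & set(e.fact.lower().split())]
```

The point of the sketch: one user's query can only ever touch the episode list filed under their own key, which is the "zero cross-user leakage" property stated above.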
05

Async Job Queue with SSE Streaming

Chat requests queued as jobs in PostgreSQL, processed by a dedicated background worker. Real-time progress streamed via Redis pub/sub and Server-Sent Events. 10 granular progress stages from job creation to completion.

PostgreSQL job queue
Redis pub/sub SSE streaming
10 granular progress stages
Multi-worker concurrency
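An SSE progress stream like the one described might look as follows. The ten stage names are assumptions for illustration (the source states the count, not the names); only the wire format, an `event:` line plus a `data:` line terminated by a blank line, follows the SSE spec:

```python
import json

STAGES = [  # illustrative names for the 10 progress stages
    "job_created", "job_queued", "worker_claimed", "intent_classified",
    "collections_selected", "retrieval_started", "retrieval_complete",
    "generation_started", "generation_streaming", "job_complete",
]

def sse_event(stage: str, pct: int) -> str:
    """Format one Server-Sent Event frame (blank line terminates the frame)."""
    payload = json.dumps({"stage": stage, "pct": pct})
    return f"event: progress\ndata: {payload}\n\n"

def progress_stream():
    """Yield one SSE frame per stage; in production the worker publishes these
    to Redis and the API relays them to the browser."""
    for i, stage in enumerate(STAGES, start=1):
        yield sse_event(stage, pct=i * 10)
```

In FastAPI, a generator like this is typically wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.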
06

Intent Classification System

Hybrid classifier (regex + LLM fallback) analyzes each query before response generation. Detected intents trigger specialized system prompts optimized for tables, comparisons, cost breakdowns, how-to guides, recommendations, and resource location.

Pattern matching + LLM fallback
7 specialized prompt templates
Configurable confidence threshold
Per-message intent analytics
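The hybrid regex-plus-LLM-fallback pattern can be sketched like this. Only three of the seven intent types are shown, and the patterns and threshold are illustrative assumptions:

```python
import re

INTENT_PATTERNS = {  # illustrative subset of the 7 intent types
    "comparison": re.compile(r"\b(compare|versus|vs\.?|difference between)\b", re.I),
    "cost_breakdown": re.compile(r"\b(cost|price|fees?|breakdown)\b", re.I),
    "how_to": re.compile(r"\b(how (do|to|can)|steps to)\b", re.I),
}

def classify_intent(query: str, llm_fallback=None, threshold: float = 0.5):
    """Cheap regex pass first; only on a miss defer to the LLM classifier.

    llm_fallback: optional callable (query) -> (intent, confidence).
    Returns (intent, confidence); 'general' when nothing clears the threshold.
    """
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(query):
            return intent, 1.0
    if llm_fallback is not None:
        intent, conf = llm_fallback(query)
        if conf >= threshold:
            return intent, conf
    return "general", 0.0
```

The detected intent then selects the specialized system prompt (table, comparison, cost breakdown, and so on) before generation.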
07

Deep Research vs Quick Mode

Two response modes give users control over the depth-versus-speed trade-off. Quick Mode scans 15 sources and answers in about 5 seconds. Deep Research scans 80 sources, synthesizes from the top 20 context documents, and allows up to 16,000 tokens for comprehensive analysis.
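Internally, the two modes can be nothing more than two parameter presets on the same pipeline. The numbers come from this page; the dataclass and function names are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResponseMode:
    sources_scanned: int   # candidates pulled from the vector store
    top_k_synthesis: int   # context documents passed to the LLM
    max_tokens: int        # generation budget

QUICK = ResponseMode(sources_scanned=15, top_k_synthesis=5, max_tokens=4_000)
DEEP = ResponseMode(sources_scanned=80, top_k_synthesis=20, max_tokens=16_000)

def select_mode(deep: bool) -> ResponseMode:
    """One pipeline, two presets: the user's toggle picks the parameters."""
    return DEEP if deep else QUICK
```

Keeping the modes as data rather than separate code paths is what makes "two modes, one pipeline" cheap to maintain.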

08

AI-Powered Web Search

Augments knowledge base answers with live web data. An LLM generates targeted queries, executed via the Tavily API. Results are compared against knowledge base context to detect outdated information. Sources are scored on domain authority, freshness, and consensus.

Multi-query execution via MCP
Date-filtered retrieval
Domain authority scoring
HIGH/MEDIUM/LOW confidence metrics
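A scoring function blending those three signals might look like this. The weights, the one-year freshness decay, and the band cutoffs are assumptions for illustration; the production formula is not published here:

```python
from datetime import date

def score_source(domain_authority: float, published: date, today: date,
                 consensus: float) -> str:
    """Blend authority (0-1), freshness, and cross-source consensus (0-1)
    into a HIGH/MEDIUM/LOW confidence label.

    Weights and cutoffs are illustrative, not the platform's actual values.
    """
    age_days = (today - published).days
    freshness = max(0.0, 1.0 - age_days / 365.0)   # linear decay over one year
    score = 0.4 * domain_authority + 0.3 * freshness + 0.3 * consensus
    if score >= 0.7:
        return "HIGH"
    if score >= 0.4:
        return "MEDIUM"
    return "LOW"
```

A fresh result from an authoritative domain that agrees with the knowledge base lands in HIGH; a stale, uncorroborated result from an unknown domain lands in LOW.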
09

Smart Multi-Entity Retrieval

Automatically detects comparison queries, decomposes them into sub-queries, retrieves separately from the vector store for each entity, then merges and deduplicates results for balanced coverage.
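The decompose-retrieve-merge flow can be sketched as two small functions. This version only handles the "X vs Y" phrasing and uses a plain round-robin merge; both are illustrative simplifications:

```python
import re
from itertools import zip_longest

def decompose_comparison(query: str) -> list[str]:
    """Split 'X vs Y' style queries into per-entity sub-queries;
    non-comparisons pass through unchanged."""
    m = re.search(r"(.+?)\s+(?:vs\.?|versus)\s+(.+)", query, re.I)
    if not m:
        return [query]
    return [m.group(1).strip(), m.group(2).strip()]

def merge_results(per_entity: list[list[dict]], limit: int = 10) -> list[dict]:
    """Interleave per-entity hits round-robin, dropping duplicate chunk ids,
    so neither entity dominates the context window."""
    merged, seen = [], set()
    for tier in zip_longest(*per_entity):
        for hit in tier:
            if hit and hit["id"] not in seen:
                seen.add(hit["id"])
                merged.append(hit)
    return merged[:limit]
```

Without the balanced merge, a single entity with denser coverage in the vector store would crowd the other out of the synthesis step, which is exactly the failure mode this retrieval path avoids.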

10

Source Citations & Rich Metadata

Every response includes title, URL, date, content type, and relevance score. Web sources include domain authority. Dates normalized to ISO format. Sources deduplicated across knowledge base and web. 6 contextual follow-up questions generated per response.

11

Admin Analytics Dashboard

Login-protected admin interface for monitoring chatbot usage. Session browser with filtering by user, date, and collection. Full conversation transcripts with metadata. User analytics including message counts, active sessions, and usage patterns.

12

Production Operations

Gunicorn with 4 Uvicorn async workers for concurrent requests. Separate systemd-managed worker process. Health monitoring endpoint, comprehensive logging, graceful shutdown, CORS whitelisting, and configurable LLM/embedding backends.
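A deployment like the one described is commonly driven by a `gunicorn.conf.py` (Gunicorn config files are plain Python). The specific values below are illustrative, not the production configuration:

```python
# gunicorn.conf.py -- illustrative settings for the setup described above.
bind = "127.0.0.1:8000"
workers = 4                                      # 4 concurrent async workers
worker_class = "uvicorn.workers.UvicornWorker"   # ASGI workers for FastAPI
graceful_timeout = 30                            # let in-flight requests finish
timeout = 120
accesslog = "-"                                  # stdout, captured by systemd
```

The background worker runs as a separate systemd unit, so chat jobs keep processing even while the web tier restarts.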

Two Modes, One Pipeline

Quick Mode

Default
15 sources scanned
Top 5 used for synthesis
4,000 max tokens
~5-10 seconds
Fast factual lookups

Deep Research

Power Mode
80 sources scanned
Top 20 used for synthesis
16,000 max tokens
~15-30 seconds
In-depth analysis & research
System Architecture

Production-Grade Infrastructure

Every component is battle-tested in production, serving real users with real membership tiers and real content archives.

Frontend (CMS / Portal)
POST /start_job
FastAPI + Gunicorn
Auth, rate limiting, job creation, SSE streaming
Redis pub/sub
Background Worker
Intent classification, vector retrieval, LLM generation
Qdrant Vector DB
Document chunks + 3072-dim embeddings
PostgreSQL
Sessions, messages, job queue
Redis
Pub/sub SSE, rate limiting
Graphiti Memory
Cross-session knowledge graph
Technology Stack

Built With Best-in-Class Tools

FastAPI
API Framework
LlamaIndex
RAG Framework
Qdrant
Vector Database
PostgreSQL
Chat & Job Store
Redis
Pub/Sub & Caching
Graphiti
Long-term Memory
OpenRouter
LLM Gateway
Gemini Embeddings
3072-dim Vectors
Gunicorn
Application Server
Uvicorn
ASGI Workers
Tavily
Web Search API
Jinja2
Dashboard Templates
systemd
Process Manager
Python 3.10
Runtime
MCP Protocol
Tool Integration
Platform Comparison

Per-Bot Pricing vs Unlimited Platform

See why organizations choose our unlimited multi-bot platform over competitors that charge per bot, per user, and per message.

Capability                             Simple Widget    FRENZY.BOT Enterprise
Answer from knowledge base             ✓                ✓
Membership authentication              ✗                ✓
Tiered content access control          ✗                ✓
Persistent multi-turn sessions         ✗                ✓
Cross-session user memory              ✗                ✓
Async job queue + SSE streaming        ✗                ✓
Intent-based prompt optimization       ✗                ✓
Deep research vs quick mode            ✗                ✓
Live web search + source validation    ✗                ✓
Smart multi-entity comparison          ✗                ✓
Source citations with metadata         ✗                ✓
Admin dashboard with analytics         ✗                ✓
Rate limiting for free users           ✗                ✓
Multi-worker concurrent processing     ✗                ✓
Configurable LLM backend               ✗                ✓
Incremental content sync from CMS      ✗                ✓
Production Case Study

Deployed & Battle-Tested

Currently powering the AI knowledge assistant for a major financial publishing platform with thousands of paying members and multiple content tiers.

4
Content Collections
1000s
Active Members
7
Intent Types
24/7
Availability

Multi-Tier Content

Free, Standard, Premium, and VIP content collections — each independently indexed and access-controlled per member subscription.

Content Types Indexed

Articles, newsletters, podcast transcripts, video transcripts, country profiles, CSV data, and HTML archives — all searchable via RAG.

Real-Time Streaming

10-stage progress updates streamed to the frontend via SSE. Users see exactly what the system is doing at every step.

Personalized Memory

Returning members get context-aware responses that reference their previous conversations and stated preferences.

Start Your Multi-Bot Deployment

Tell us about your organization, bot workspace requirements, and content volume. Our engineering team will prepare a custom architecture proposal.

NDA available on request
Custom architecture proposal
No commitment required