Enterprise Multi-Bot Platform

Beyond a Chatbot Widget

An enterprise-grade multi-bot AI platform with unlimited workspaces, isolated data per bot, membership-aware access control, and 40+ AI models. No per-bot fees, no per-user charges, no per-message caps.

80+
Sources per Deep Query
3,072
Embedding Dimensions
<5s
Quick Mode Response
99.9%
Uptime SLA
Live Platform Demo

See It In Action

This is not a mockup. This is a production deployment serving thousands of members with tiered content access, real-time streaming, and cross-session memory.

Enterprise AI Assistant
Membership-aware knowledge retrieval
Online

How can I help you today?

Ask anything about your organization's knowledge base, policies, and documents.

Compare our health insurance plans
What changed in the Q4 compliance policy?
Show me a cost breakdown for Mexico residency
Summarize the latest board meeting notes

Compare the tax benefits of Portugal vs Panama for US citizens

Classifying intent... Retrieving from 4 collections... Generating response...

Based on 12 verified sources from your knowledge base, here is a detailed comparison:

Tax_Guide_Portugal_2025.pdf
Panama_Residency_Overview.pdf
US_Expat_Tax_Obligations.pdf
Intent: Comparison · 12 Sources · High Confidence
Send a message...
Powered by Private RAG Infrastructure
12 Enterprise Capabilities

What Makes This an Enterprise Solution

Every component is purpose-built for organizations that need unlimited bot workspaces, isolated data, and more than a simple Q&A widget — without per-bot or per-user pricing.

01

Membership Authentication & Tiered Access

Token-based authentication integrating with WordPress, custom CMS, or any membership platform. Per-user identity with group-based access control routes queries to the correct content collection. Content is physically separated into distinct vector database collections per tier.

Per-user identity on every request
Group-based collection routing
Admin-aware elevated access
IP-based free-tier rate limiting
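The tier-to-collection routing and free-tier throttling described above can be sketched in a few lines of Python. Collection names, the admin elevation rule, and the sliding-window limiter are illustrative assumptions, not the platform's actual identifiers:

```python
from time import monotonic

# Hypothetical tier -> vector collection mapping; names are illustrative.
TIER_COLLECTIONS = {
    "free": "kb_free",
    "standard": "kb_standard",
    "premium": "kb_premium",
    "vip": "kb_vip",
}

def route_collection(tier: str, is_admin: bool = False) -> str:
    """Map a member's tier to its physically isolated vector collection.
    Admins are elevated to the highest tier's collection."""
    if is_admin:
        return TIER_COLLECTIONS["vip"]
    return TIER_COLLECTIONS.get(tier, TIER_COLLECTIONS["free"])

class IPRateLimiter:
    """Sliding-window rate limiter keyed by IP for unauthenticated traffic."""
    def __init__(self, limit: int, window_s: float):
        self.limit, self.window_s = limit, window_s
        self._hits: dict[str, list[float]] = {}

    def allow(self, ip: str) -> bool:
        now = monotonic()
        # Keep only hits still inside the window, then test the budget.
        hits = [t for t in self._hits.get(ip, []) if now - t < self.window_s]
        allowed = len(hits) < self.limit
        if allowed:
            hits.append(now)
        self._hits[ip] = hits
        return allowed
```

In a FastAPI deployment, `route_collection` would typically run inside an auth dependency so every request carries a resolved collection name before retrieval starts.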
02

Multi-Collection Knowledge Base

Content organized into separate vector database collections per membership tier. Each independently indexed, updated, and searchable. Supports articles, PDFs, video transcripts, podcasts, HTML, JSON, and CSV data.

Independent per-tier collections
Incremental CMS sync without re-indexing
Multi-format content support
Content deduplication & tracking
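One way incremental CMS sync can avoid full re-indexing is by fingerprinting each document and diffing against what is already indexed. This is a minimal sketch under that assumption; `plan_sync` and the hashing scheme are hypothetical, not the platform's actual sync logic:

```python
import hashlib

def content_fingerprint(doc_id: str, body: str) -> str:
    """Stable hash of a document's identity and content."""
    return hashlib.sha256(f"{doc_id}:{body}".encode()).hexdigest()

def plan_sync(cms_docs, indexed):
    """Diff CMS documents against the indexed state.

    cms_docs: iterable of (doc_id, body) from the CMS export.
    indexed:  {doc_id: fingerprint} recorded at last sync.
    Returns (to_upsert, to_delete) so only changed content is re-embedded.
    """
    to_upsert, seen = [], set()
    for doc_id, body in cms_docs:
        seen.add(doc_id)
        fp = content_fingerprint(doc_id, body)
        if indexed.get(doc_id) != fp:      # new or changed document
            to_upsert.append((doc_id, fp))
    to_delete = [d for d in indexed if d not in seen]  # removed from CMS
    return to_upsert, to_delete
```

Unchanged documents produce no work at all, which is what keeps per-tier collections cheap to refresh on a schedule.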
03

Persistent Session Management

Every conversation tied to a unique session ID with full multi-turn context. Chat history stored in PostgreSQL, surviving server restarts. Sessions track message count, timestamps, user identity, access tier, detected intents, and IP addresses.

PostgreSQL-backed persistence
Multi-turn context preservation
Auto-generated session titles
Searchable via admin dashboard
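A schema along these lines would support the session features above. The production store is PostgreSQL; sqlite3 stands in here so the sketch is self-contained, and all table and column names are illustrative:

```python
import sqlite3

# Illustrative schema (PostgreSQL in production; sqlite3 used here for the demo).
DDL = """
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    tier       TEXT NOT NULL,
    title      TEXT,
    ip         TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(session_id),
    role       TEXT NOT NULL,          -- 'user' or 'assistant'
    content    TEXT NOT NULL,
    intent     TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def load_context(conn, session_id, limit=20):
    """Fetch the last N turns in order, so multi-turn context survives restarts."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? "
        "ORDER BY id DESC LIMIT ?", (session_id, limit)).fetchall()
    return list(reversed(rows))
```

Because every message row carries the session ID, the same tables back both the chat context window and the admin dashboard's session browser.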
04

Cross-Session User Memory

Graph-based knowledge store (Graphiti) maintains long-term memory across sessions. Key facts stored as episodes in each user's memory graph. Subsequent visits retrieve relevant history for personalized responses. Memory is scoped per user per collection — zero cross-user leakage.

Graph-based episode storage
Cross-session fact retrieval
Per-user memory isolation
Personalized context injection
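The isolation guarantee above comes down to how memory is keyed. This toy in-memory store models only that scoping rule; the real platform uses Graphiti's graph store, and the `Episode` shape and keyword recall here are simplifications:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Episode:
    fact: str
    session_id: str

class ScopedMemory:
    """Toy stand-in for the graph-backed memory: episodes are keyed by
    (user_id, collection), so retrieval can never cross user boundaries."""
    def __init__(self):
        self._graph: dict[tuple, list[Episode]] = defaultdict(list)

    def add_episode(self, user_id, collection, fact, session_id):
        self._graph[(user_id, collection)].append(Episode(fact, session_id))

    def recall(self, user_id, collection, query):
        """Naive keyword overlap; a real store would do graph/semantic search."""
        scope = self._graph[(user_id, collection)]
        terms = set(query.lower().split())
        return [e.fact for e in scope if terms & set(e.fact.lower().split())]
```

The point of the sketch: one user's query can only ever touch the episode list filed under their own key, which is the "zero cross-user leakage" property stated above.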
05

Async Job Queue with SSE Streaming

Chat requests queued as jobs in PostgreSQL, processed by a dedicated background worker. Real-time progress streamed via Redis pub/sub and Server-Sent Events. 10 granular progress stages from job creation to completion.

PostgreSQL job queue
Redis pub/sub SSE streaming
10 granular progress stages
Multi-worker concurrency
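An SSE progress stream like the one described might look as follows. The ten stage names are assumptions for illustration (the source states the count, not the names); only the wire format, an `event:` line plus a `data:` line terminated by a blank line, follows the SSE spec:

```python
import json

STAGES = [  # illustrative names for the 10 progress stages
    "job_created", "job_queued", "worker_claimed", "intent_classified",
    "collections_selected", "retrieval_started", "retrieval_complete",
    "generation_started", "generation_streaming", "job_complete",
]

def sse_event(stage: str, pct: int) -> str:
    """Format one Server-Sent Event frame (blank line terminates the frame)."""
    payload = json.dumps({"stage": stage, "pct": pct})
    return f"event: progress\ndata: {payload}\n\n"

def progress_stream():
    """Yield one SSE frame per stage; in production the worker publishes these
    to Redis and the API relays them to the browser."""
    for i, stage in enumerate(STAGES, start=1):
        yield sse_event(stage, pct=i * 10)
```

In FastAPI, a generator like this is typically wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.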
06

Intent Classification System

Hybrid classifier (regex + LLM fallback) analyzes each query before response generation. Detected intents trigger specialized system prompts optimized for tables, comparisons, cost breakdowns, how-to guides, recommendations, and resource location.

Pattern matching + LLM fallback
7 specialized prompt templates
Configurable confidence threshold
Per-message intent analytics
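The hybrid regex-plus-LLM-fallback pattern can be sketched like this. Only three of the seven intent types are shown, and the patterns and threshold are illustrative assumptions:

```python
import re

INTENT_PATTERNS = {  # illustrative subset of the 7 intent types
    "comparison": re.compile(r"\b(compare|versus|vs\.?|difference between)\b", re.I),
    "cost_breakdown": re.compile(r"\b(cost|price|fees?|breakdown)\b", re.I),
    "how_to": re.compile(r"\b(how (do|to|can)|steps to)\b", re.I),
}

def classify_intent(query: str, llm_fallback=None, threshold: float = 0.5):
    """Cheap regex pass first; only on a miss defer to the LLM classifier.

    llm_fallback: optional callable (query) -> (intent, confidence).
    Returns (intent, confidence); 'general' when nothing clears the threshold.
    """
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(query):
            return intent, 1.0
    if llm_fallback is not None:
        intent, conf = llm_fallback(query)
        if conf >= threshold:
            return intent, conf
    return "general", 0.0
```

The detected intent then selects the specialized system prompt (table, comparison, cost breakdown, and so on) before generation.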
07

Deep Research vs Quick Mode

Two response modes give users control over the depth-versus-speed trade-off. Quick Mode scans 15 sources and answers in about 5 seconds. Deep Research scans 80 sources, synthesizes from the top 20 context documents, and allows up to 16,000 tokens for comprehensive analysis.
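Internally, the two modes can be nothing more than two parameter presets on the same pipeline. The numbers come from this page; the dataclass and function names are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResponseMode:
    sources_scanned: int   # candidates pulled from the vector store
    top_k_synthesis: int   # context documents passed to the LLM
    max_tokens: int        # generation budget

QUICK = ResponseMode(sources_scanned=15, top_k_synthesis=5, max_tokens=4_000)
DEEP = ResponseMode(sources_scanned=80, top_k_synthesis=20, max_tokens=16_000)

def select_mode(deep: bool) -> ResponseMode:
    """One pipeline, two presets: the user's toggle picks the parameters."""
    return DEEP if deep else QUICK
```

Keeping the modes as data rather than separate code paths is what makes "two modes, one pipeline" cheap to maintain.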

08

AI-Powered Web Search

Augments knowledge base answers with live web data. An LLM generates targeted queries, executed via the Tavily API. Results are compared against knowledge base context to detect outdated information. Sources are scored on domain authority, freshness, and consensus.

Multi-query execution via MCP
Date-filtered retrieval
Domain authority scoring
HIGH/MEDIUM/LOW confidence metrics
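A scoring function blending those three signals might look like this. The weights, the one-year freshness decay, and the band cutoffs are assumptions for illustration; the production formula is not published here:

```python
from datetime import date

def score_source(domain_authority: float, published: date, today: date,
                 consensus: float) -> str:
    """Blend authority (0-1), freshness, and cross-source consensus (0-1)
    into a HIGH/MEDIUM/LOW confidence label.

    Weights and cutoffs are illustrative, not the platform's actual values.
    """
    age_days = (today - published).days
    freshness = max(0.0, 1.0 - age_days / 365.0)   # linear decay over one year
    score = 0.4 * domain_authority + 0.3 * freshness + 0.3 * consensus
    if score >= 0.7:
        return "HIGH"
    if score >= 0.4:
        return "MEDIUM"
    return "LOW"
```

A fresh result from an authoritative domain that agrees with the knowledge base lands in HIGH; a stale, uncorroborated result from an unknown domain lands in LOW.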
09

Smart Multi-Entity Retrieval

Automatically detects comparison queries, decomposes them into sub-queries, retrieves separately from the vector store for each entity, then merges and deduplicates results for balanced coverage.
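The decompose-retrieve-merge flow can be sketched as two small functions. This version only handles the "X vs Y" phrasing and uses a plain round-robin merge; both are illustrative simplifications:

```python
import re
from itertools import zip_longest

def decompose_comparison(query: str) -> list[str]:
    """Split 'X vs Y' style queries into per-entity sub-queries;
    non-comparisons pass through unchanged."""
    m = re.search(r"(.+?)\s+(?:vs\.?|versus)\s+(.+)", query, re.I)
    if not m:
        return [query]
    return [m.group(1).strip(), m.group(2).strip()]

def merge_results(per_entity: list[list[dict]], limit: int = 10) -> list[dict]:
    """Interleave per-entity hits round-robin, dropping duplicate chunk ids,
    so neither entity dominates the context window."""
    merged, seen = [], set()
    for tier in zip_longest(*per_entity):
        for hit in tier:
            if hit and hit["id"] not in seen:
                seen.add(hit["id"])
                merged.append(hit)
    return merged[:limit]
```

Without the balanced merge, a single entity with denser coverage in the vector store would crowd the other out of the synthesis step, which is exactly the failure mode this retrieval path avoids.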

10

Source Citations & Rich Metadata

Every response includes title, URL, date, content type, and relevance score. Web sources include domain authority. Dates normalized to ISO format. Sources deduplicated across knowledge base and web. 6 contextual follow-up questions generated per response.

11

Admin Analytics Dashboard

Login-protected admin interface for monitoring chatbot usage. Session browser with filtering by user, date, and collection. Full conversation transcripts with metadata. User analytics including message counts, active sessions, and usage patterns.

12

Production Operations

Gunicorn with 4 Uvicorn async workers for concurrent requests. Separate systemd-managed worker process. Health monitoring endpoint, comprehensive logging, graceful shutdown, CORS whitelisting, and configurable LLM/embedding backends.
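A deployment like the one described is commonly driven by a `gunicorn.conf.py` (Gunicorn config files are plain Python). The specific values below are illustrative, not the production configuration:

```python
# gunicorn.conf.py -- illustrative settings for the setup described above.
bind = "127.0.0.1:8000"
workers = 4                                      # 4 concurrent async workers
worker_class = "uvicorn.workers.UvicornWorker"   # ASGI workers for FastAPI
graceful_timeout = 30                            # let in-flight requests finish
timeout = 120
accesslog = "-"                                  # stdout, captured by systemd
```

The background worker runs as a separate systemd unit, so chat jobs keep processing even while the web tier restarts.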

Two Modes, One Pipeline

Quick Mode

Default
15 sources scanned
Top 5 used for synthesis
4,000 max tokens
~5-10 seconds
Fast factual lookups

Deep Research

Power Mode
80 sources scanned
Top 20 used for synthesis
16,000 max tokens
~15-30 seconds
In-depth analysis & research
System Architecture

Production-Grade Infrastructure

Every component is battle-tested in production, serving real users with real membership tiers and real content archives.

Frontend (CMS / Portal)
POST /start_job
FastAPI + Gunicorn
Auth, rate limiting, job creation, SSE streaming
Redis pub/sub
Background Worker
Intent classification, vector retrieval, LLM generation
Qdrant Vector DB
Document chunks + 3072-dim embeddings
PostgreSQL
Sessions, messages, job queue
Redis
Pub/sub SSE, rate limiting
Graphiti Memory
Cross-session knowledge graph
Technology Stack

Built With Best-in-Class Tools

FastAPI
API Framework
LlamaIndex
RAG Framework
Qdrant
Vector Database
PostgreSQL
Chat & Job Store
Redis
Pub/Sub & Caching
Graphiti
Long-term Memory
OpenRouter
LLM Gateway
Gemini Embeddings
3072-dim Vectors
Gunicorn
Application Server
Uvicorn
ASGI Workers
Tavily
Web Search API
Jinja2
Dashboard Templates
systemd
Process Manager
Python 3.10
Runtime
MCP Protocol
Tool Integration
Platform Comparison

Per-Bot Pricing vs Unlimited Platform

See why organizations choose our unlimited multi-bot platform over competitors that charge per bot, per user, and per message.

Capability                             Simple Widget    FRENZY.BOT Enterprise
Answer from knowledge base             ✓                ✓
Membership authentication              ✗                ✓
Tiered content access control          ✗                ✓
Persistent multi-turn sessions         ✗                ✓
Cross-session user memory              ✗                ✓
Async job queue + SSE streaming        ✗                ✓
Intent-based prompt optimization       ✗                ✓
Deep research vs quick mode            ✗                ✓
Live web search + source validation    ✗                ✓
Smart multi-entity comparison          ✗                ✓
Source citations with metadata         ✗                ✓
Admin dashboard with analytics         ✗                ✓
Rate limiting for free users           ✗                ✓
Multi-worker concurrent processing     ✗                ✓
Configurable LLM backend               ✗                ✓
Incremental content sync from CMS      ✗                ✓
Production Case Study

Deployed & Battle-Tested

Currently powering the AI knowledge assistant for a major financial publishing platform with thousands of paying members and multiple content tiers.

4
Content Collections
1000s
Active Members
7
Intent Types
24/7
Availability

Multi-Tier Content

Free, Standard, Premium, and VIP content collections — each independently indexed and access-controlled per member subscription.

Content Types Indexed

Articles, newsletters, podcast transcripts, video transcripts, country profiles, CSV data, and HTML archives — all searchable via RAG.

Real-Time Streaming

10-stage progress updates streamed to the frontend via SSE. Users see exactly what the system is doing at every step.

Personalized Memory

Returning members get context-aware responses that reference their previous conversations and stated preferences.

Start Your Multi-Bot Deployment

Tell us about your organization, bot workspace requirements, and content volume. Our engineering team will prepare a custom architecture proposal.

NDA available on request
Custom architecture proposal
No commitment required