How Archie Works
A LangGraph agentic pipeline with streaming Groq inference, live web search, and persistent session memory.
Pipeline Flow
loadSession → orchestrator → queryRewriter → agents → judge → synthesis
Node Breakdown
loadSession
No LLM · pure I/O
- Queries Neon DB for existing session (name, email, importantInfo)
- Reads knowledge.json → kbJsonToText() converts JSON to clean == SECTION == text
- Builds numbered user-question list for session memory introspection
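As a rough illustration, `kbJsonToText` flattens the knowledge file into the `== SECTION ==` plain-text format mentioned above. The function name and output format come from this doc; the internal structure of `knowledge.json` and the exact flattening logic are assumptions.

```typescript
// Hypothetical sketch of kbJsonToText: flattens a knowledge.json object
// into "== SECTION ==" blocks that read cleanly as LLM context.
type KbJson = Record<string, unknown>;

export function kbJsonToText(kb: KbJson): string {
  const sections: string[] = [];
  for (const [key, value] of Object.entries(kb)) {
    const header = `== ${key.toUpperCase()} ==`;
    const body =
      typeof value === "string"
        ? value
        : JSON.stringify(value, null, 2); // keep nested objects readable
    sections.push(`${header}\n${body}`);
  }
  return sections.join("\n\n");
}
```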
orchestrator
Groq LLM · intent router
- Fast-path regex catches trivial greetings — skips all agent calls
- LLM decomposes message into typed tasks: search | internal_kb
- Detects personal mode (emotional) and session mode (introspection)
- Multi-intent: one message can spawn multiple parallel tasks
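The routing shape might look like the sketch below: a regex fast-path that answers trivial greetings before any LLM call, and a decomposition step that emits typed tasks. The `search | internal_kb` intents are from this doc; the regex, helper names, and stub logic are illustrative, not the real implementation.

```typescript
// Typed task shape produced by the orchestrator (intents from the doc).
type TaskIntent = "search" | "internal_kb";

interface Task {
  intent: TaskIntent;
  query: string;
}

// Fast-path: trivial greetings never reach the LLM or any agent.
const GREETING_RE = /^\s*(hi|hey|hello|yo|sup)[!. ]*$/i;

export function fastPathGreeting(message: string): boolean {
  return GREETING_RE.test(message);
}

// Stub for the LLM decomposition step: one message may yield several
// tasks that later run in parallel. The real version calls Groq.
export function decompose(message: string): Task[] {
  if (fastPathGreeting(message)) return [];
  return [{ intent: "search", query: message }];
}
```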
queryRewriter
Groq LLM · search optimiser
- Runs one Groq call per search task — all in parallel via Promise.all
- Rewrites vague intent into a precise 3–8 word Google query
- Resolves pronoun references using conversation history
- KB tasks bypass rewriting — passed through unchanged
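The fan-out pattern described above can be sketched as one rewrite promise per search task, all awaited together. `rewriteWithGroq` stands in for the real Groq call (its body here just truncates to eight words to mimic the 3–8 word target).

```typescript
interface Task {
  intent: "search" | "internal_kb";
  query: string;
}

// Stand-in for the Groq rewrite call: the real version asks the LLM for
// a precise 3–8 word search query resolved against conversation history.
async function rewriteWithGroq(query: string): Promise<string> {
  return query.trim().split(/\s+/).slice(0, 8).join(" ");
}

// One rewrite per search task, fired in parallel; KB tasks pass through.
export async function rewriteAll(tasks: Task[]): Promise<Task[]> {
  return Promise.all(
    tasks.map(async (t) =>
      t.intent === "search"
        ? { ...t, query: await rewriteWithGroq(t.query) }
        : t
    )
  );
}
```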
agents
Parallel execution · Promise.all
- Search agent: calls Serper API → returns top 5 results + image cards
- KB agent: keyword match scoring on kbJsonToText output (≥30% → return full profile)
- Both run simultaneously — total latency = slowest single agent, not sum
- Results tagged by intent for multi-source attribution in synthesis
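The KB agent's scoring can be read as: the fraction of query words found in the `kbJsonToText` output, with a full-profile return at ≥ 30%. The threshold is from this doc; the tokenisation and function names are assumptions.

```typescript
// Share of query words present in the KB text (0..1).
export function kbMatchScore(query: string, kbText: string): number {
  const words = query.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  if (words.length === 0) return 0;
  const haystack = kbText.toLowerCase();
  const hits = words.filter((w) => haystack.includes(w)).length;
  return hits / words.length;
}

// ≥ 30% match returns the full profile; otherwise the agent yields nothing.
export function kbAgent(query: string, kbText: string): string | null {
  return kbMatchScore(query, kbText) >= 0.3 ? kbText : null;
}
```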
judge
No LLM · decision logger
- Inspects final combined context — logs whether it is sufficient
- Populates thinkingLogs array shown in Thinking Mode UI
- No LLM call — pure state inspection
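Because the judge is pure state inspection, it reduces to something like the sketch below; the state field names are illustrative, not the actual LangGraph state shape.

```typescript
// Hypothetical slice of the pipeline state the judge touches.
interface PipelineState {
  combinedContext: string;
  thinkingLogs: string[];
}

// No LLM: check the combined context and append a Thinking Mode log entry.
export function judge(state: PipelineState): PipelineState {
  const sufficient = state.combinedContext.trim().length > 0;
  return {
    ...state,
    thinkingLogs: [
      ...state.thinkingLogs,
      sufficient ? "judge: context sufficient" : "judge: context empty",
    ],
  };
}
```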
Synthesis + Streaming
Groq streaming · Archie persona
- Builds layered system prompt: Archie persona → mode addendum → grounding rules
- KB context: "answer only from this profile, do not invent"
- Search context: "live results, use them, cite as [[N]](url)"
- Streams llama-3.3-70b-versatile response as ReadableStream to client
- Client parses: [THINKING], [SEARCH_DATA], [SUGGESTIONS], plain text
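One plausible client-side parse of those markers: each `[THINKING]`, `[SEARCH_DATA]`, or `[SUGGESTIONS]` tag switches the kind of the text that follows, and anything before the first marker is plain text. The marker names are from this doc; the exact wire format is an assumption.

```typescript
type Segment = {
  kind: "text" | "thinking" | "search_data" | "suggestions";
  payload: string;
};

const MARKER_RE = /\[(THINKING|SEARCH_DATA|SUGGESTIONS)\]/g;

// Split one streamed chunk into typed segments by marker.
export function parseStream(chunk: string): Segment[] {
  const segments: Segment[] = [];
  let kind: Segment["kind"] = "text";
  let last = 0;
  for (const m of chunk.matchAll(MARKER_RE)) {
    const idx = m.index ?? 0;
    const payload = chunk.slice(last, idx);
    if (payload) segments.push({ kind, payload });
    kind = m[1].toLowerCase() as Segment["kind"];
    last = idx + m[0].length;
  }
  const tail = chunk.slice(last);
  if (tail) segments.push({ kind, payload: tail });
  return segments;
}
```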
Session Memory
Each browser session gets a unique ID. Every conversation turn is appended to a JSONB array in Neon DB. A secondary Groq call extracts the user's name, email, and notable facts from their message — stored alongside the session. On the next turn, loadSession re-reads this, letting Archie remember who you are and personalise responses.
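The per-turn append might look like the sketch below, using Postgres JSONB concatenation (`||`) against a `sessions` table. The table and column names are assumptions; the doc only states that turns live in a JSONB array in Neon DB.

```typescript
// Parameterised append of one turn to the session's JSONB array.
// $2 must be a JSON *array* so `||` concatenates rather than nests.
export const APPEND_TURN_SQL = `
  UPDATE sessions
     SET turns = turns || $2::jsonb,
         updated_at = now()
   WHERE session_id = $1
`;

export function appendTurnParams(
  sessionId: string,
  role: "user" | "assistant",
  content: string
): [string, string] {
  const turn = JSON.stringify([
    { role, content, at: new Date().toISOString() },
  ]);
  return [sessionId, turn];
}
```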
Other DB tables (posts, admins, subscribers) belong to the blog and newsletter — separate from the chatbot pipeline.
If You Want to Scale Up
The current stack is fully free. These are paid upgrades for production-grade robustness; none are necessary for a portfolio project.
| Component | Current (Free) | Production Upgrade |
|---|---|---|
| KB Search | Keyword match (≥30% score) | Pinecone / Weaviate — semantic vector search |
| LLM Inference | Groq free tier · 100k tok/day | Groq paid / OpenAI GPT-4o — higher rate limits |
| Rate Limiting | None | Upstash Redis — sliding window per IP |
| Auth | Client-generated session IDs | Clerk / NextAuth — verified user sessions |
| Observability | console.error + thinkingLogs | LangSmith — full LangGraph trace monitoring |
| Embeddings | None | OpenAI text-embedding-3-small — semantic KB |
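For the rate-limiting row, the algorithm itself is simple; the production suggestion is Upstash Redis, but an in-memory sketch (illustrative only, not suitable behind multiple server instances) shows the sliding-window idea:

```typescript
// Per-IP sliding window: allow at most `limit` hits per `windowMs`.
export class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(ip: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(ip) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(ip, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(ip, recent);
    return true;
  }
}
```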