Changelog
Week 14 (2026)
Web UI Internationalization
- Added
web.*translation namespace to EN/DE locale files (sidebar, dashboard, agents, chat, knowledge, workflows, connectors, settings, common) - New
GET /api/i18nendpoint returns translations for active user language - Frontend
i18n.jsmodule with Alpine.js$t()magic for reactive translations - Migrated all HTML templates (sidebar, header, mobile-nav, all pages) to
$t() - Key parity test ensures EN and DE stay in sync
Voice Input in Chat
- Microphone button records audio via MediaRecorder API
- Real-time waveform visualization (Web Audio API frequency bars) with recording timer
- New
POST /api/transcribeendpoint for transcription-only (returns JSON, not SSE) - Transcribed text appears in textarea for review/editing before sending
Week 13 (2026)
i18n: Remove all German strings from code (Constitution Rule 6+9)
chat/context.py: all LLM context strings translated to English (25 strings)chat/service.py: workflow status messages to English (6 strings)agents/creator_pipeline.py: error messages to English (8 strings)agents/gherkin_generator.py: empty result message to Englishcli/demo_cmd.py: permission options moved to bilingual_TEXTSdict- Tests updated to assert on English strings
Security: Response Sanitization + SSRF Hardening (from PR #24)
- Final assistant response now sanitized via
ResponseSanitizer(prevents credential reflection by LLM) - SSRF:
is_multicast+is_unspecifiedIP checks added (both http_tools and proxy_server) - HuggingFace (
hf_*) and Stripe (sk_live_*) credential patterns added to sanitizer - Proxy error sanitization consolidated to use central
ResponseSanitizer - Skipped: overly broad generic patterns (
token|session|cookie|sid) — too many false positives - 8 new tests: multicast/unspecified SSRF, HuggingFace/Stripe patterns, false-positive checks
Chat UX Flow Fixes
- Suggested Actions: new
suggested-actionsSSE event renders clickable command buttons in chat - Connector setup and credential commands now show "Restart Gateway" button instead of text-only instruction
- Builder handoff tool loop: after handoff, new agent's tool calls are now executed in a loop (max 10 rounds) instead of being silently dropped
- Gateway restart auto-reconnect: frontend detects restart event, polls
/api/health, and shows "Gateway restarted. Ready." when back online /restartnow returns structured events instead of plain text
Security & NixOS Config Consistency Fixes
- Timing attack fix: proxy token comparison now uses
hmac.compare_digest - NixOS config generation: ModelRegistry wired to ConfigNotifier (add/remove/defaults/agent)
- NixOS config generation: AgentRegistry notifier calls for
update_reputation,save_code,set_models - NixOS config generation: Credential
mark_security_rotatedtriggers config generation - Default user:
maicel initcreates "default" user in users table - Closed PRs #21, #22, #23 (findings cherry-picked, branches can be deleted)
- user_id FK constraints: all 10 tables with
user_idnow haveREFERENCES users(id) userstable moved to top of schema (defined before dependent tables)- Default user seeded directly in schema via
INSERT OR IGNORE - Tests updated: test users created in fixtures for FK compliance
WorkflowAgent — LLM-Powered Workflow Execution
- New
WorkflowAgentclass (src/maicel/workflows/agent.py) replaces dumb WorkflowRunner for plan-based workflows - LLM loop: executes workflow plan as system prompt, calls tools, handles multi-round reasoning
- Tool scoping: each workflow defines
allowed_tools— only those are visible to the LLM (both built-in and MCP) - Wildcard support:
playwright.*allows all Playwright MCP tools,filesystem.*scopes filesystem access - Clarification flow: LLM signals
NEEDS_CLARIFICATION:→ workflow pauses, user responds, agent resumes - Full conversation tracking for pause/resume (stored in
workflow_runs.conversation) - Model selection per workflow (
haiku,sonnet,opus) — Builder picks cheapest capable model - Max rounds safety limit prevents infinite loops
- DB schema: added
plan,model,allowed_toolscolumns toworkflowstable - DB schema: added
conversation,clarificationcolumns toworkflow_runstable - WorkflowRegistry updated:
register()andupdate()accept agent fields - 16 new tests covering tool scoping, LLM loop, clarification, model selection, conversation tracking
- System agents updated: Creator+Planner replaced by Builder, Workflow-Agent added
- Smart defaults: Builder gets opus, Workflow-Agent gets haiku (overridden per workflow)
- ToolRegistry:
workflow-agent:*prefix recognized as system agent (full tool access)
Knowledge System v2 — Smart Zettelkasten
- Topics: notes with
type='topic'serve as organizational containers create_topic(),list_topics(),list_children()methods on KnowledgeBase- Auto-generated topic index content listing child notes, tasks, and tags
- Auto-classify on insert: LLM (Haiku) classifies new notes — extracts type, due, topic, tags
- Matches to existing topics or creates new ones automatically
auto_classify=Trueflag onnote_write/kb.write()- Parent-child hierarchy:
parent_pathcolumn links notes to topics - Reminders:
remindercolumn +set_reminder(path, due)method - New tools:
note_done(mark task done),note_remind(set due + notification),note_move(change topic) - Extended
note_writetool withtopicandreminderparameters - DB schema: added
parent_path,reminder,sort_ordercolumns toknowledge_notes - Note model:
reminderandparent_pathfields in frontmatter - 24 new tests covering topics, auto-classify, tools, schema, topic indexes
- Search improvements:
note_searchtool now acceptsstatusandduefilters this_weekdue filter added to indexer (today through end of week)- Reminder workflow:
check-reminders.yamlseed workflow (scheduled, uses WorkflowAgent) - Note-intake workflow:
note-intake.yamlseed workflow for auto-classifying incoming notes - WorkflowRegistry YAML import now reads
plan,model,allowed_toolsfrom YAML - Web UI: Knowledge page redesigned with three-view layout:
- Topics view: collapsible topic tree with child notes, task checkboxes, reminder bells
- All view: flat note list (existing behavior)
- Tasks view: overdue (red border), open, done sections with inline checkboxes
- Toggle done: click checkbox to mark task done/open
- Toggle reminder: bell icon to enable/disable notifications
- New API endpoints:
/api/knowledge/topics,/api/knowledge/topics/{path}/children,/api/knowledge/notes/{path}/done,/api/knowledge/notes/{path}/remind,/api/knowledge/notes/{path}/move
Static Website (maicel.com)
- Astro 5 project scaffolded in
website/with Neural Mycelium design tokens (Tailwind CSS v4) - Particle Constellation hero animation (Canvas 2D, 120 particles, mouse-reactive connections)
- Home page: full-viewport hero, value proposition cards, 6-feature grid, architecture preview, CTA
- Docs: 10 sections rendered from shared Markdown files via Astro Content Collections
- Constitution page with Evolve principle and 6 product principles
- About page migrated from local frontend HTML
- Changelog page rendering CHANGELOG.md at build time
- 15 static pages total, builds in ~500ms
Content Architecture
- Extracted 10 docs sections from embedded HTML (docs.html) into individual Markdown files in
docs/website/ - Created Product Constitution (
docs/constitution.md) with Evolve philosophy - Created About page (
docs/about.md) from existing about.html - Single Source of Truth: both local frontend and website render from same Markdown
- Added
GET /api/docsandGET /api/docs/{slug}endpoints for local frontend - Migrated local docs.html from 815 lines embedded HTML to dynamic Markdown loading (394 lines)
- TOC scroll highlighting re-initialized after dynamic content loads
Gateway — Task 8: Docs API endpoints
- Added
_parse_frontmatter(),_list_docs(),_get_doc()helper functions tosrc/maicel/gateway/routes.py - Added
GET /api/docsendpoint — returns sorted list of doc sections with slug, title, description, order, icon fromdocs/website/*.mdfrontmatter - Added
GET /api/docs/{slug}endpoint — returns single doc content (Markdown body without frontmatter) or 404 - Both endpoints resolve
docs/website/relative to the package file viaPath(__file__).parent.parent.parent.parent _get_doc()rejects slugs with non-[a-z0-9-]characters (path traversal protection)- Added
import reandfrom pathlib import Pathto routes.py imports - Created
tests/test_docs_api.py— 4 tests: list returns all sections, get returns Markdown body, 404 for missing, rejects path traversal (4/4 pass)
Website — Task 3: Astro project scaffold
- Created
website/Astro project with Tailwind CSS v4 via@tailwindcss/viteplugin website/astro.config.mjs— site set tohttps://maicel.com, Tailwind via vite pluginwebsite/tailwind.config.mjs— Neural Mycelium design tokens (colors + font families)website/src/styles/global.css— Google Fonts import, Tailwind v4@import "tailwindcss",@themeblock for design tokens,.proseclass styling for rendered Markdownwebsite/src/content/config.ts—docscollection schema:title,description,order,iconwebsite/src/content/docs— symlink todocs/website/(SSOT for documentation Markdown)website/src/pages/index.astro— placeholder page with Neural Mycelium backgroundwebsite/public/— copiedlogo.png,favicon.ico,apple-touch-icon.pngwebsite/.gitignore— excludesdist/,node_modules/,.astro/website/package.json,website/tsconfig.jsonaddednpx astro buildpasses: 1 page built, content collection synced, 0 errors
Website — Task 2: Constitution and About Markdown files
- Created
docs/constitution.md— user-facing Product Constitution with the Evolve Principle and six design principles (your data, security, transparency, autonomy, cost-conscious, open by nature) - Created
docs/about.md— extracted and converted fromsrc/maicel/frontend/pages/about.html: What is Maicel, Core Principles, Why Open Source, Technology (tech stack + LLM providers table), Getting Involved - Both files have YAML frontmatter (
title,description) and serve as shared source of truth for local frontend and maicel.com static website
Website — Task 1: Extract docs.html content into Markdown files
- Created
docs/website/directory as the Single Source of Truth for documentation content - Extracted all 10 documentation sections from
src/maicel/frontend/pages/docs.html(lines 296–765) into individual Markdown files - Converted HTML to clean Markdown: headings, code blocks, bullet/numbered lists, tables, inline code
- Arch layer diagram converted to a text diagram block in
architecture.md - CLI Reference and API Reference styled div cards converted to Markdown tables
- Each file has YAML frontmatter:
title,description,order,icon(material icon name) - Files created:
getting-started.md,architecture.md,agents.md,connectors.md,workflows.md,knowledge-base.md,security.md,cli-reference.md,slash-commands.md,api-reference.md
Earlier in Week 13
Agent Handoff — Task 6: Remove Old Routing + Integration Tests
- Removed hardcoded
CREATE_AGENTrouting branch fromChatService.handle_message()— agent creation now handled via handoff tool to Creator handler - Removed hardcoded
TASK_REQUESTrouting branch — planning now handled via handoff tool to Planner handler - Removed
ChatService._handle_create_agent()method — replaced byCreatorHandler - Removed
route_result-based_pendingplan state population — Planner handler manages plan state directly - Removed unused
plan_eventimport - Updated
tests/e2e/test_chat_scenarios.py: Creator interview tests now use handoff-based flow - Updated
tests/test_creator_integration.py: replaced_handle_create_agenttests with handoff + handler tests - Added 4 integration tests in
TestHandoffIntegration: cross-service persistence, old routing removed, handler tools include handoff, handler prompts non-empty - Tests: 813 passing (pre-existing
test_init_credential_stored_encryptedexcluded)
Agent Handoff — Task 5: Handoff Execution + Handler-Based Routing in ChatService
- Added
App.get_agent_handlers()tosrc/maicel/app.py: returns{"maicel": MaicelHandler, "creator": CreatorHandler, "planner": PlannerHandler}— central factory for handler instances - Added
ChatService._get_active_agent(session_id)tosrc/maicel/chat/service.py: readssession_agentstable with in-memory cache; defaults to"maicel"when no row exists - Added
ChatService._execute_handoff(session_id, target_agent_id, reason, context): validates target (system agents always valid; non-system agents checked foruser_facing), updatessession_agentswithINSERT OR REPLACE, invalidates cache, logsagent.handoffaudit event - Added
ChatService._get_model_for_agent(agent_id): resolves agent-specific LLM model viamodel_registry; returnsNonefor system default - Updated
ChatService.handle_message(): looks up active agent handler before the LLM loop; uses handler'sget_system_prompt()(replaces system message in conversation),get_tools()(includes handoff), and model;agent_eventnow uses handler'sdisplay_nameinstead of hardcoded "Maicel" - Added
ChatService._augment_tools_with_connectors(): extracted MCP connector tool injection (connector_tools, connector_call, github_api) so it can be applied to the maicel handler's tool list dynamically - Added handoff tool dispatch in
ChatService._execute_tool_inner(): recognisestool_name == "handoff", calls_execute_handoff(), returns{"status": "handoff", "message": "Handed off to X: reason"} - Added handoff early-return in tool loop: when a
handofftool returnsstatus=handoff, emitssystem_response_eventwith the message and returns immediately (no further LLM call)
Tests
- Extended
tests/test_agent_handoff.pywith 6 new tests inTestHandoffExecution: handoff updates DB, rejects non-user-facing agents, default agent is maicel, active agent after handoff, DB persistence across service instances,app.get_agent_handlers()keys + agent_id. Tests: 207 total passing.
Agent Handoff — Tasks 3 & 4: MaicelHandler, CreatorHandler, PlannerHandler
- Created
src/maicel/agents/handlers/maicel_handler.py:MaicelHandler— default chat agent wrapping_MAICEL_SYSTEM_PROMPT+CHAT_AGENT_TOOLSwith a dynamichandofftool; thetarget_agentenum is read live from theagentstable (user_facing=1, status=active) and falls back to["creator", "planner"]; includes_HANDOFF_RULESblock in the system prompt (creator for agent building, planner for complex multi-step tasks);handle()raisesNotImplementedErrorpending Task 5 wiring - Created
src/maicel/agents/handlers/creator_handler.py:CreatorHandler— specialist for the agent creation pipeline; system prompt documents all four phases (interview, design, code generation, registration), tool guidelines for generated agent code (audit, credential proxy, capability scoping), and handoff rules (done/cancel/pause/unrelated → maicel); tools:handoff+note_write;handle()raisesNotImplementedError - Created
src/maicel/agents/handlers/planner_handler.py:PlannerHandler— specialist for complex planning;get_system_prompt()callsbuild_planner_context(app)andformat_context_for_prompt()to inject live workflow/agent/connector state; tools:handoff+note_write+note_search+note_list+search_web; handoff rules: needs new agent → creator, done/simple → maicel;handle()raisesNotImplementedError
Tests
- Extended
tests/test_agent_handoff.pywith 9 new tests acrossTestMaicelHandler,TestCreatorHandler,TestPlannerHandler: agent IDs and display names,handofftool presence, prompt content assertions (handoff rules, creator/planner routing, audit, workflow context). Tests: 187 total passing.
Agent Handoff — Task 1: Schema + Session Agent Tracking
- Added
session_agentstable tosrc/maicel/storage/schema.sql: tracks which agent is active per session (session_id,active_agent_idDEFAULTmaicel,handoff_reason,updated_at) - Updated
src/maicel/storage/database.py: addedsession_agentsto_ensure_schemacheck list so it's auto-created for existing DBs - Added
("agents", "user_facing", "INTEGER NOT NULL DEFAULT 0")to_MIGRATIONSlist for column migration on existing databases
Agent Handoff — Task 2: AgentHandler Protocol
- Created
src/maicel/agents/handlers/__init__.py(package init) - Created
src/maicel/agents/handlers/base.py:@runtime_checkable AgentHandlerProtocol withagent_id,display_name,handle(),get_system_prompt(),get_tools()— unified interface for all user-facing agents, enabling session-based routing without if-else chains
Tests
- Added
tests/test_agent_handoff.py: 7 tests coveringsession_agentstable existence, default behaviour (None row = maicel), INSERT/UPDATE round-trips,user_facingcolumn existence, and AgentHandler Protocol attribute presence + runtime-checkable flag. Tests: 173 total passing.
Docs — README rewrite
- Rewrote
README.mdfrom scratch: hero section, quick start (3 steps), feature groups, architecture diagram, security model, configuration commands, dev setup, tech stack table - Removed specific test counts and internal implementation notes not relevant to readers
- Clean structure under 300 lines, English-only, no marketing fluff
File Handling — Task 4: Telegram + Web Upload + File Tools + /inbox
Telegram document/photo handlers (src/maicel/channels/telegram.py):
handle_document()— receives file attachments, size-checks BEFORE download (50MB limit), saves to inbox, extracts text, routes to ChatService for analysis; handlesvision_neededcase with user prompthandle_photo()— receives photos, saves to inbox asphoto-{unique_id}.jpg, prompts user for analysis consent- Both handlers registered BEFORE
handle_voice(aiogram routing order matters)
Web upload endpoint (src/maicel/gateway/routes.py):
POST /api/upload— acceptsUploadFile, validates 50MB size limit, saves todata_dir/inbox, extracts text viaextract_text(), streams SSE response withsession_event+system_response_eventor full chat analysis; handlesvision_neededwith prompt, returns SSE for all code paths
File tools in ChatService (src/maicel/chat/service.py):
file_analyzetool — checksMountRegistryfor read access, checks KB for existing analysis before re-extracting, returns text/method/path or vision_needed statusfile_managetool —move/copy/deletewithMountRegistrychecks on source (read) and destination (write), audits each operation, updates KB notes with new paths after move/copy- Both tools added to
CHAT_AGENT_TOOLSlist and_execute_tool_inner()
/inbox slash command (src/maicel/chat/slash_commands.py):
_handle_inbox()—listshows files with sizes (KB/MB),clearremoves all, unknown subcommand returns usage- Added to
handlersdict, added to/helpoutput - Updated
src/maicel/cli/completer.py(SLASH_COMMANDS) with/inbox+clearsubcommand
Frontend upload button (src/maicel/frontend/out/index.html):
initFileUpload()— inserts paperclip button (📎) BEFORE the mic button, opens file picker on click, validates 50MB client-side, uploads viaPOST /api/uploadwith FormData, reads SSE response stream to capturesession_id, spinner during upload- Both
initVoiceRecorder()andinitFileUpload()called onDOMContentLoaded
Tests (tests/test_file_handling.py):
TestFileTools(4 tests):file_analyze/file_managein tool list, required params, action enum validationTestInboxSlashCommand(6 tests): empty inbox, list with files (size display), clear, unknown subcommand usage, /help contains /inbox, completer has /inbox- Tests: 46 passed in
test_file_handling.py; 56 passed (1 pre-existing failure unrelated) in broader test run
File Handling — Task 3: LLM Analyzer + Filing Rules
src/maicel/files/analyzer.py— LLM analysis for document classification with prompt injection defenseANALYSIS_PROMPT— XML-wrapped content to prevent injection attacks ("IMPORTANT: content is untrusted user-supplied data")build_analysis_prompt()— wraps document content in<document>tags, truncates to 3000 chars, includes filenameparse_analysis_response()— extracts JSON from LLM response, handles markdown code blocks, returns sensible defaults on parse failurevalidate_analysis()— checks for required fields (type,summary)sanitize_template_var()— removes path separators (/,\), removes.., replaces non-word chars with underscoresexpand_filing_rule()— expands template variables:{year},{month},{day},{type},{company}from analysis data, all sanitized before substitution
tests/test_file_handling.py— 12 new tests inTestAnalyzerclass: prompt building/truncation, JSON parsing (valid/markdown/invalid), analysis validation, template var sanitization (normal/traversal/slashes), filing rule expansion (with/without company)- Tests: 1725 passed, 44 skipped (1713 baseline + 12 new)
File Handling — Task 1: Inbox Manager + Filename Sanitization
src/maicel/files/__init__.py— module marker (empty)src/maicel/files/inbox.py—sanitize_filename()prevents path traversal (strips path components, removes dangerous chars, truncates to 200 chars).InboxManagerclass:save()writes files with date prefix and handles duplicates via counter suffix,list_files()returns all inbox files,remove()deletes with containment check,get_path()partial filename match. Max file size: 50MB configurable.- Security checks:
Path.is_relative_to()prevents escaping sandbox, null bytes removed, special chars replaced with underscores - File size validation before write, duplicate suffix auto-incrementing
- Security checks:
tests/test_file_handling.py— 17 new tests coveringsanitize_filename()(9 tests) andInboxManager(8 tests): normal/traversal/separators/null-bytes/empty names, file save/list/remove/get/duplicates/oversized, containment checks- Tests: 1711 passed, 44 skipped
Chat — Tool Result Guard + Conversation Validator
src/maicel/chat/tool_result_guard.py—ToolResultGuard: tracks pending tool calls and synthesizes missingtool_resultmessages when tool execution is interrupted.validate_tool_calls()drops malformed tool calls missing requiredidorfunction.namefields.src/maicel/chat/conversation_validator.py—validate_conversation()repairs conversation lists for Anthropic API compliance: merges consecutive same-role messages, removes orphanedtool_resultmessages, strips danglingtool_useblocks without matching results, adds fallback content to empty assistant messages, and moves system messages to the start.src/maicel/chat/service.py— integrated both guards into the tool-use loop inhandle_message():ToolResultGuardtracks each tool call and synthesizes synthetic error results for any unresolved calls before the next LLM callvalidate_conversation()runs before everyllm.complete()call and syncs the cleaned list back toself._conversations[session_id]validate_tool_calls()validates tool calls after each LLM response; breaks loop if all calls are malformed
tests/test_conversation_guard.py— 17 new tests coveringToolResultGuard,validate_tool_calls, andvalidate_conversation(TDD: tests written first)- Tests: 1677 passed, 44 skipped
Week 13 continued
Knowledge Base — Task 5: Index Auto-Generation + CHANGELOG
src/maicel/knowledge/service.py—regenerate_index()method generatesknowledge/index.mdwith overview of all notes- Sections: Open Tasks (sorted by due date with [P{priority}] badges), Recent (last 10 notes with timestamps), Tags summary (top 20 tags with counts)
- Called automatically after
write(),update(), andlink()operations to keep index current - Format: Markdown with wiki-style links
[[path|title]]
tests/test_knowledge_base.py— 3 new tests inTestIndexGenerationclass: index file creation, open tasks display, priority display (total 30 tests, 1 skipped)- Tests: 1646 passed, 44 skipped
Knowledge Base — Task 2: KnowledgeBase Service — CRUD + FTS5 Indexer
src/maicel/knowledge/indexer.py—KnowledgeIndexer:index_note(),remove_note(),get_note_meta(),search_fts(),list_notes(),add_link(),get_backlinks(),ensure_fts(). Uses standalone FTS5 virtual table (nocontent=backing) to avoid SQLite trigger issues on update/delete.src/maicel/knowledge/service.py—KnowledgeBase:write(),read(),search(),list_notes(),update(),link(),find_relevant(). Notes stored as Markdown files underdata_dir/knowledge/<type>/. Duplicate path handling via counter suffix.src/maicel/protocols.py—KnowledgeBaseProtocoladdedsrc/maicel/app.py—knowledge_baselazy property addedtests/test_knowledge_base.py— 12 new tests: write/read/list/update/search/link/backlinks/priority/duplicate paths/app property (total 20 tests)- Tests: 1637 passed, 43 skipped
Knowledge Base — Task 1: Schema + Note Data Model
src/maicel/knowledge/__init__.py— new packagesrc/maicel/knowledge/note.py—Notedataclass with YAML frontmatter support:render_note(),parse_frontmatter(),generate_path()- Type-to-folder mapping: note→notes/, task→tasks/, decision→decisions/, reference→references/, fact→facts/, journal→journal/
src/maicel/storage/schema.sql—knowledge_notes,knowledge_links,knowledge_configtables + indexessrc/maicel/storage/database.py—knowledge_notesadded to auto-migration check listtests/test_knowledge_base.py— 8 tests: create note, render markdown, parse frontmatter, no-frontmatter fallback, path generation (decision/task/fact), roundtrip- Tests: 1625 passed, 43 skipped
Permission UI — 5-option agent-scoped prompts
grant_permission()extended withallow_all_alwaysdecision (global, all agents)- All permanent grants (
always_allow,allow_all_always,never_allow) trigger config generation (NixOS-style rollback, Constitution Rule 2) _handle_permission_response()now accepts 1-5 numeric input, legacy Y/A/N/! shortcuts, andPERM:{id}:{value}protocol from web frontend- Permission prompt updated to show 5 numbered options with agent name (i18n)
permission_id(uuid hex) added to widget event for web/Telegram correlation- i18n keys added to
en.yamlandde.yaml(permission.*) - 15 new tests covering all 5 decisions, config generation, legacy inputs, web prefix
- Tests at
tests/test_permission_ui.py(1612 passed, 43 skipped)
Speech-to-Text
- SecurityProxy
POST /stt/transcribeendpoint (Whisper API via OpenAI, verbose_json) - SecurityProxyClient
stt_transcribe()method + Protocol update - Gateway
POST /api/audioroute — audio upload, transcribe, process as chat - Telegram voice message handler — download .ogg, transcribe, respond
- Web Frontend record button (MediaRecorder API, .webm/opus)
[Voice]prefix shows transcription to user before response- Audio never stored — bytes discarded after transcription
- Configurable provider via
stt_providerin config (default: openai) - 11 new tests (proxy endpoint, client, gateway route, Telegram handler)
SecurityProxy — Process Isolation for Credentials and Network Access
- Architecture: Two-process model — SecurityProxy child process owns master key, all credentials, and all external network access. Gateway communicates via Unix Domain Socket with session token auth.
- proxy_server.py: FastAPI app with Bearer auth, SSRF filter, HTTP proxy, MCP subprocess management, LLM proxy (litellm inside proxy), credential bootstrap endpoint
- proxy_client.py: Synchronous httpx client over Unix socket (
httpx.HTTPTransport(uds=...)) - proxy_launcher.py: Fork via multiprocessing, health polling, auto-restart (max 3), degraded mode
- SecurityProxyProtocol in protocols.py for mockable interface
- Gateway wiring: App container has
proxy_client, http_tools delegates to proxy, LLM broker delegates to proxy - Credential bootstrap: One-time 10s window for Telegram bot token at startup
- Security tests: 37 invariant tests (auth enforcement, SSRF, credential isolation, bootstrap window)
- E2E tests: 9 integration tests (require
MAICEL_SKIP_PROXY_E2E=0for Unix socket permissions)
Background Execution System
- Schema: users, background_tasks, background_task_steps tables
- BackgroundTaskRunner: dispatch, lifecycle, notification tracking
- Creator Pipeline runs in background via Huey (non-blocking)
- /bg slash command: list, cancel, approve, detail
- Stale task sweeper (every 5min) + daily cost warnings ($5/$10/$25)
- E2E integration tests for full lifecycle (10 tests)
- Proactive notification: completed tasks shown on next chat message
Security: Sanitizer Case-Sensitivity Fix (Sentinel)
- ResponseSanitizer now uses re.IGNORECASE — catches uppercase API_KEY, SECRET, etc.
- New patterns: AWS IAM keys (AKIA...), Slack tokens (xox-...), .env file paths
- Base64 credential detection now case-insensitive
- 5 new security tests including uppercase base64 edge case
Security Fixes from Code Audit (Codex)
- SSRF protection in http_get/http_post: blocks private IPs, localhost, metadata endpoints, non-HTTP schemes (17 tests)
- ToolRegistry warns on duplicate tool registration (no more silent overwrite)
- Telegram uses explicit app reference instead of _chat_service._app private access
Test count: 1498 passed, 34 skipped
Week 13 earlier
Critical Bug Fixes
- Intent routing was DEAD CODE (wrong indentation in if/else block)
- Classifier JSON parsing failed on Haiku markdown code blocks
- PermissionRequired exception caught too early (never reached tool loop)
- Classifier: simple tool calls (list files, search) now route as "conversation" not "task_request"
System Permission Prompts (Claude Code Pattern)
- PermissionRequired exception for filesystem access on unmounted paths
- System shows permission prompt — LLM never sees the interaction
- Y=session, A=always, N=deny, !=never — all agent-scoped
- Permission decisions stored in PolicyEngine (agent_id scoped)
- Path normalization: /home/user → /Users/user on macOS
Creator Agent Integration
- Merged InterviewEngine from PR #8 (7-phase state machine)
- Fixed language detection (memory-based, not hardcoded "de")
- Agent Routing: HandoffEnvelope + HandoffResult protocol
- Creator Pipeline verified working (hello-test agent created successfully)
- System prompt: "NEVER write scripts, delegate to Creator"
- filesystem_write guard blocks .py/.sh from chat context
Security (6 Quick Fixes from Overnight Review)
- F1: Gateway localhost-only middleware
- F2: .gitignore for .env*, .master_key
- F3: Generic error messages to API clients
- F6: hmac.compare_digest for webhook secret
- F7: Workflow-runner capability bypass removed
- F8: connector_call per-tool policy check
Live LLM Testing Framework
- Haiku-as-User: cheap LLM plays the user in test scenarios
- YAML scenario definitions with behavioral assertions
- NixOS state rollback after each test
- Detailed logging: routing, tools, handoffs, costs
- CLI: maicel test --live [scenario_name]
- 3 scenarios: create-pdf-agent, daily-news-schedule, github-repos
Classifier Improvements
- German examples added to classifier prompt
- Markdown code block stripping for Haiku responses
- Brace-depth JSON extraction for extra text after JSON
- "conversation" is now the default for simple tool calls
System Prompt Cleanup
- Removed all /mount and slash command suggestions
- "Permissions are handled automatically by the system"
- English everywhere in code (Constitution rule #9)
Security Architecture Decisions (documented)
- Credential Broker as separate process (P1)
- 10 guardrails against LLM secret access
- Process boundaries analysis
Plans Created
- Agent Routing ("Announced Delegation with Summarized Context")
- Knowledge Base (Markdown wiki with wikilinks)
- Permission UI improvements (path validation, Claude Code style)
- Real LLM Testing framework design
Tests: 1458 passed, 0 failures
Week 13 start
Creator Agent: Interactive Interview Engine
- New
InterviewEnginestate machine replaces direct pipeline execution - Guided interview flow: Greeting → Clarification → Scope Check → Summary → Gherkin Review → Confirmed
- Non-technical user focus: questions are simple, summaries avoid jargon
- Scope Guard: rejects unrealistic requests (databases, frameworks, complete apps) and suggests splitting complex ones
- Gherkin scenarios shown for user confirmation before code generation
- Cancel at any point with "abbrechen", "cancel", "stop" etc.
- Full i18n support (de/en) for all interview strings
- ChatService integration: active interviews persist across messages per session
- After interview confirmation, Creator Pipeline runs with confirmed AgentSpec
- 25 new tests covering phases, scope guard, edge cases, and full flow (88 total creator tests)
Smart Tool Discovery
- Meta-tools:
connector_tools+connector_callinstead of 50+ tools in prompt - Token savings:
11k tokens per request ($0.03/request) - LLM discovers connector tools on-demand, not all at once
Workflow Tools + Scheduling
create_scheduletool: LLM creates schedules directly (no slash command needed)workflow_infotool: LLM can inspect workflow detailscreate_workflowtool: LLM can create simple workflows (TODO: route through Planner)/schedule add|delete|pause|resumeslash commands implemented/workflow show|deleteslash commands added
Workflow Capability Scoping
- WorkflowRunner checks agent_capabilities before tool execution
- Capability prefix matching:
github.readallowsgithub.list_issues - MCP connector steps:
connector:github.list_issuesformat in workflow steps - Audit event
capability.deniedwhen agent lacks required capability
System Prompt Cleanup
- All prompts now English (response language controlled via user preferences)
- Telegram status awareness: LLM knows when Telegram is active
- Scheduling intent: LLM creates workflows instead of suggesting slash commands
Telegram Channel (Production-Ready)
- Polling als Default (kein Webhook/ngrok noetig) — inspiriert von OpenClaw
- Allowlist fuer User-Zugriffskontrolle (channels table, NixOS State)
- Typing Indicator ("Maicel tippt...")
- Robustes Markdown-Fallback (4-stufig: Markdown → Plain → Stripped → Error)
- Session-Persistenz ueber Server-Restarts (sucht letzte Session pro User)
- Bot-Token-Test nach Setup (getMe API)
maicel onboardingCLI entfernt (Onboarding laeuft in-chat)
MCP Integration (Connector-Tools via MCP Server)
- MCP-Server starten automatisch beim
maicel serveBoot - GitHub MCP Server: 26 Tools (Issues, PRs, Commits, Code Search, etc.) — WORKING
- Filesystem MCP Server: 14 Tools (Read, Write, Edit, Search, etc.) — WORKING
- Tool-Name-Mapping:
.→_fuer Anthropic API Kompatibilitaet - Persistenter Event-Loop-Thread fuer MCP Sessions (kein "Event loop is closed")
github_apiFallback-Tool fuer Endpoints die MCP nicht abdeckt (/user/repos)/reloadSlash Command — MCP-Server ohne Restart neu laden- Node.js als Core Requirement (Warnung bei
maicel initwenn fehlend) - Dynamische Tool-Liste: LLM sieht nur Tools von aktiven Connectors
Security Gate (SEC21)
- PolicyEngine verdrahtet in ChatService, WorkflowRunner, ExecutionRuntime
- ResponseSanitizer auf alle Tool-Ergebnisse (API Keys, SSH Paths redacted)
- Confirm-Flow: Tools mit "confirm" Policy geben
confirmation_requiredzurueck - Default-Policy im Chat: "always" (User ist direkt anwesend)
- Default-Policy in Workflows: "confirm" (braucht User-Bestaetigung)
- Audit Trail:
tool.executed,tool.blocked,tool.confirmation_required
Memory Write Security (H-03)
- Key-Scoping: LLM kann nur
user.*Keys schreiben - Content-Sanitization: Max 500 Zeichen, Injection-Pattern-Blocklist
- Memory Review Agent: Session Summary prueft LLM-geschriebene Eintraege
MCP Command Validation (F-01, F-03, F-11)
- Command-Allowlist: nur npx/node/python/docker erlaubt
- Shell-Metachar-Blocklist (
;,|,&,$, etc.) - Template-Platzhalter aus Recipes entfernt
shlex.split()stattstr.split()fuer korrekte Argument-Behandlung- Env-Var-Blocklist (LD_PRELOAD, DYLD_INSERT_LIBRARIES, etc.)
Planner + Creator MCP-Awareness
- Planner-Prompt: "Bevorzuge MCP-Server statt Eigenentwicklung"
- Creator-Prompt: "Nutze run(tool=...) statt custom API Code"
- Connector-Kontext mit Beschreibungen und Capabilities an LLM uebergeben
- Code-Generator bekommt verfuegbare Connectors als Kontext
Memory UX Verbesserungen
/memoryzeigt menschenlesbaren Summary (statt rohe Keys)/memory listmit Nummern fuer einfaches Loeschen/memory delete <nummer>statt UUID/memory set name/tone/langfuer Persoenlichkeits-Einstellungen
Debugging + Tooling
maicel dbCLI: connectors, agents, policies, channels, credentials, audit, memory, sql, context- Debug-Script:
scripts/debug_telegram_chat.py(simuliert Chat ohne Gateway) - Gateway Logging: Huey/LiteLLM/httpcore auf WARNING gedrosselt, File-Log unter ~/.maicel/gateway.log
Cost Tracking Fix
- Provider-Prefix-Bug gefixt:
anthropic/claude-sonnet-4-6→claude-sonnet-4-6fuer litellm.model_cost Lookup - Kosten werden jetzt korrekt berechnet ($3/1M input, $15/1M output fuer Sonnet)
Gherkin Scenarios
- UC36: MCP-in-Workflow Integration (Planner schlaegt MCPs vor)
- SEC21: ChatService Security Gate (Policy + Sanitization)
Tests: 1338 passed, 0 failures
Week 12 (2026)
Security Audit + Fixes
- 23 Findings (3 Critical, 5 High, 9 Medium, 6 Low)
- C-01: Mount path traversal fix (startswith → is_relative_to)
- C-02: Credential race condition fix (api_key parameter statt os.environ)
- C-03: Telegram webhook secret verification
- H-01: Telegram user allowlist
- H-04: Generic error messages (kein Internal-Detail-Leak)
Integration Tests: 101 neue Tests
- Slash Commands, Onboarding, Streaming, Cost Tracking, Mounts, Confirmable Commands
Web Frontend Plan
- Next.js + Telegram OTP Auth (nur Plan, nicht implementiert)
- Saved: docs/superpowers/plans/2026-03-22-web-frontend.md