IAF Architecture Map

This document serves as the operations navigation guide for external LLMs (Claude Code, GPT Agent, etc.) taking over IAF. It answers not "what functions are in each file," but "which files can be touched, which should be left alone, and which areas to modify when touching them."

Core principle: Do not touch infrastructure unless absolutely necessary. 99% of requirements can be fulfilled by adjusting Adjustable components.

v0.4 changes: Integrated full external LLM takeover plan. Added MANIFEST.json, validate.py, call_log.jsonl, tool hot-reload, Tube retry/data-passing, auto_commit.sh and other component descriptions. Maintains independent agent engine copy design (possibility management) without introducing engine inheritance.


0. External LLM Takeover Entry Point

When an external LLM first takes over an IAF instance, read these three files in order to begin work:

| Order | File | Question It Answers |
|---|---|---|
| 1 | MANIFEST.json | What does the system look like right now? What agents, tubes, and dispatches exist? |
| 2 | PLAYBOOK.md | What file operations correspond to each intent? What are the steps? |
| 3 | This document | How does the system work? What is each file's role? Which can be touched and which cannot? |

0.1 MANIFEST.json (System Map)

Auto-generated by generate_manifest.py, which scans the directory tree; it is also refreshed automatically when chat_server.py starts.

{
  "framework": "IAF",
  "version": "0.1.0",
  "generated_at": "2026-04-01T14:30:00Z",

  "structure": {
    "agents_dir": "agents/",
    "template_dir": "template/",
    "dispatch_dir": "dispatch/",
    "tube_dir": "tube/",
    "global_config": "config.json",
    "tube_config": "tube/tubes.json",
    "tube_log": "tube/tube_log.jsonl",
    "pages_dir": "pages/"
  },

  "agents": {
    "charlie": {
      "config": "agents/charlie/agent_config.json",
      "soul": "agents/charlie/SOUL.md",
      "tools": ["file_tools.py", "shell_tools.py", "dispatch_tools.py", "tube_tools.py"],
      "history": "agents/charlie/history.jsonl",
      "call_log": "agents/charlie/call_log.jsonl",
      "model": "google/gemini-3-flash-preview"
    }
  },

  "dispatches": {
    "roundtable": {
      "config": "dispatch/roundtable/dispatch_config.json",
      "agents": ["charlie", "mcmillan"],
      "ui": "dispatch/roundtable/roundtable.html"
    }
  },

  "tubes": {
    "morning_news": {
      "enabled": true,
      "triggers": ["cron:0 3 * * *", "manual"],
      "steps": ["agent:charlie"]
    }
  },

  "conventions": {
    "tool_file_pattern": "*_tools.py",
    "tool_export_variable": "TOOLS",
    "context_strategy_dir": "context/",
    "skill_dir": "skills/"
  }
}

External LLM manual refresh: python3 generate_manifest.py

0.2 PLAYBOOK.md (Operations Manual)

Plain-text operations manual covering complete steps for all common operations. See standalone file.

0.3 validate.py (Validation Script)

The external LLM calls this via Bash after modifying files, to confirm the changes are valid.

python3 validate.py agent charlie          # Validate single agent
python3 validate.py tool agents/charlie/tools/http_tools.py  # Validate tool file
python3 validate.py tube                   # Validate tubes.json
python3 validate.py all                    # Global validation

Output format: Success → OK, Failure → FAIL: N error(s) + line-by-line error descriptions. Plain text, no colours.

0.4 auto_commit.sh (Safety Snapshot)

The external LLM calls this before performing batch modifications:

bash auto_commit.sh "Pre-modification snapshot: adding http tools to charlie"

Rollback: git revert HEAD or git checkout -- {file}


1. Global Classification

All files in the framework fall into two categories:

| Category | Meaning | AI Operation Mode |
|---|---|---|
| Infrastructure | How the framework operates. The plumbing. | Read once to build understanding, then don't touch |
| Adjustable | What the system does. The building blocks and wiring. | May be operated on every task |

Decision rule: Upon receiving a requirement, first confirm whether it can be achieved by adjusting Adjustable components. Only when the requirement involves the framework's capability boundary itself (new LLM response formats, new storage mechanisms, new communication protocols) should you consider touching Infrastructure.


2. Infrastructure Inventory (Do Not Touch)

2.1 Shared Layer lib/

| File | Lines | Role | Interface AI Needs to Know |
|---|---|---|---|
| lib/llm_client.py | ~76 | HTTP LLM calls + retry + error classification | call_llm(url, key, model, messages, tools=None) → response |
| lib/token_utils.py | ~13 | Token count estimation | estimate_tokens(text) → int |

2.2 Web Service Layer

| File | Lines | Role | What AI Needs to Know |
|---|---|---|---|
| chat_server.py | ~220 | Flask router + Tube Runner startup + MANIFEST generation | Contains no business logic, only route dispatch. Auto-starts the TubeRunner background thread and generates MANIFEST.json on startup |
| dispatch_routes.py | ~330 | Dispatch-layer Flask Blueprint | Calls dispatch.py public functions; contains no orchestration logic |
| tube_routes.py | ~280 | Tube-layer Flask Blueprint | Manages tube listing, status queries, manual triggers, and log read/clear |

2.3 Agent Engine template/

| File | Lines | Role | What AI Needs to Know |
|---|---|---|---|
| template/core/direct_llm.py | ~290 | Core loop engine | call_agent(message, mode, max_loops) → response. Loads the system prompt via context_files. Includes call_log.jsonl structured logging |
| template/core/tool_executor.py | ~60 | Tool auto-discovery registry + hot-reload | Scans tools/*_tools.py and imports the TOOLS dict. Checks the tools/ directory mtime before each execute() call and auto-rescans on changes |
| template/context/sliding_window.py | ~47 | Default context-trimming strategy | trim(messages, max_tokens) → trimmed_messages |
| template/tools/file_tools.py | ~78 | Default toolset (read/write files, list directory) | Can be replaced or extended after copying to a new Agent |
| template/tools/TOOL_CONTRACT.md | - | Tool file format contract | External LLMs reference this when writing new tools |

Note: template/ is the copy source. When creating a new Agent: cp -r template/ agents/xxx/, after which files in agents/xxx/ become Adjustable. Do not modify template/ itself.
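A minimal tool file, shown only to illustrate the conventions MANIFEST.json declares (the *_tools.py filename pattern and the TOOLS export variable). The authoritative format is template/tools/TOOL_CONTRACT.md; the dict shape below, the check_url name, and its parameter schema are illustrative assumptions, not the contract itself.

```python
# agents/xxx/tools/http_tools.py -- illustrative sketch only.
# The authoritative schema is template/tools/TOOL_CONTRACT.md; the
# name -> {definition, function} shape here is an assumption.
import urllib.request


def check_url(url: str) -> str:
    """Return the HTTP status of a GET request to `url`, or an error string."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return f"{url} -> {resp.status}"
    except Exception as exc:  # report errors as text so the calling LLM can read them
        return f"{url} -> ERROR: {exc}"


# TOOLS is the export variable tool_executor.py scans for (per MANIFEST conventions).
TOOLS = {
    "check_url": {
        "definition": {
            "name": "check_url",
            "description": "Check whether a URL is reachable and return its status.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
        "function": check_url,
    },
}
```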

direct_llm.py five-layer message assembly model:

Layer 1: context_files content → concatenated as system prompt
Layer 2: skills trigger matching → injected as user+assistant dialogue pairs when matched
Layer 3: history.jsonl → loaded in chat mode only, skipped in batch mode
Layer 4: current user message
Layer 5: trim → ensures total stays within max_context - 8000
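The five layers above can be sketched as follows. This is an illustration of the assembly order, not the actual direct_llm.py source; the function name and the "Understood." acknowledgement are assumptions.

```python
# Illustrative sketch of the five-layer message assembly in direct_llm.py.
def assemble_messages(context_files_text, matched_skills, history,
                      user_message, trim, max_context):
    # Layer 1: context_files content concatenated as the system prompt
    messages = [{"role": "system", "content": "\n\n".join(context_files_text)}]
    # Layer 2: matched skills injected as user+assistant dialogue pairs
    for skill_text in matched_skills:
        messages.append({"role": "user", "content": skill_text})
        messages.append({"role": "assistant", "content": "Understood."})
    # Layer 3: history.jsonl records (chat mode only; batch mode passes [])
    messages.extend(history)
    # Layer 4: the current user message
    messages.append({"role": "user", "content": user_message})
    # Layer 5: trim to stay within max_context - 8000
    return trim(messages, max_context - 8000)
```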

direct_llm.py path resolution rules (shared by context_files and skill_file):

| Priority | Resolution Method | Example |
|---|---|---|
| 1 | Agent-directory relative path | "SOUL.md" → agents/xxx/SOUL.md |
| 2 | Framework-root relative path | "dispatch/roundtable/rules/default.md" |
| 3 | Absolute path | "/data/external/reference.md" |
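The resolution order above could be implemented roughly as follows. This is a sketch, not the actual direct_llm.py code; the assumption is that the first existing candidate wins and unresolved paths fall through to root-relative.

```python
import os

# Sketch of the three-tier path resolution used for context_files and skill_file.
# Assumption: an agent-relative path wins only if the file actually exists there.
def resolve_path(ref: str, agent_dir: str, project_root: str) -> str:
    if os.path.isabs(ref):                    # priority 3: absolute path as-is
        return ref
    agent_rel = os.path.join(agent_dir, ref)  # priority 1: relative to the agent dir
    if os.path.exists(agent_rel):
        return agent_rel
    return os.path.join(project_root, ref)    # priority 2: relative to framework root
```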

direct_llm.py structured logging (call_log.jsonl):

Each call_agent() invocation records the following events in agents/xxx/call_log.jsonl:

| Event | Key Fields | Meaning |
|---|---|---|
| call_started | model, mode, message_preview | Agent was called |
| llm_call | loop, tokens_est, duration_ms | One LLM API call |
| tool_call | loop, tool_name, args_summary, result_length, is_error | One tool execution |
| call_completed | loops_used, reply_length, total_duration_ms | Agent returned a result |
| call_failed | error | Agent call failed |

External LLM viewing: tail -20 agents/xxx/call_log.jsonl or grep "tool_error" agents/xxx/call_log.jsonl
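Beyond tail and grep, the log can be summarised programmatically. A minimal sketch, assuming each JSONL record carries an "event" field named as in the table above:

```python
import json

# Count events in a call_log.jsonl and tally failed tool calls.
# Field names follow the event table; treat this as a starting point.
def summarise_call_log(lines):
    counts, tool_errors = {}, 0
    for line in lines:
        event = json.loads(line)
        counts[event["event"]] = counts.get(event["event"], 0) + 1
        if event["event"] == "tool_call" and event.get("is_error"):
            tool_errors += 1
    return counts, tool_errors
```

Usage would look like `summarise_call_log(open("agents/xxx/call_log.jsonl"))`.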

2.4 Dispatch Strategy Infrastructure

Within each dispatch strategy folder, the following files are Infrastructure:

| File | Lines | Role | What AI Needs to Know |
|---|---|---|---|
| dispatch_base.py | ~478 | Tool loop, LLM response parsing, staging management, status tracking | See the interface list below |
| session_manager.py | ~200 | JSONL session CRUD + staging formatting | create_session(), append_to_session(), load_session(), list_sessions(), delete_session(), format_session_history() |
| context_injector.py | ~80 | Reads files from the context_files path list and builds the messages array | build_context(agent_id, config, project_root, user_message) → (messages, provider, model) |
| context/sliding_window.py | ~150 | Configuration-driven context trimming | trim_records(records, max_tokens, trim_strategy) → trimmed_records |

dispatch_base.py key interfaces:

get_llm_caller(project_root) → call_llm_fn or None
load_global_config(project_root) → dict
resolve_llm_endpoint(provider, global_config) → (url, key)
load_agent_tools(agent_id, project_root) → (tool_definitions, tool_functions)
call_with_tool_loop(messages, url, key, model, call_llm_fn, tool_definitions, tool_functions, max_tool_loops) → (content, tool_history)
write_agent_memory(agent_id, tool_history) → None
write_staging_history(project_root, session_id) → None
clear_staging() → None
set_status(session_id, round_num, agent_id, agent_name, status) → None
clear_status() → None
get_status() → dict
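Assuming these signatures behave as listed, a single round of a dispatch.py orchestration might wire them together as below. The helpers are injected as parameters purely so the sketch is self-contained; a real dispatch.py would import them from dispatch_base and context_injector instead, and the run_round name is hypothetical.

```python
# Sketch of one dispatch round built on the dispatch_base interfaces above.
def run_round(agent_ids, user_message, *, build_context, resolve_llm_endpoint,
              load_agent_tools, call_with_tool_loop, call_llm_fn,
              global_config, project_root, max_tool_loops=5):
    replies = {}
    for agent_id in agent_ids:
        # Assemble this agent's context from SOUL.md / rules / config
        messages, provider, model = build_context(agent_id, global_config,
                                                  project_root, user_message)
        url, key = resolve_llm_endpoint(provider, global_config)
        tool_defs, tool_funcs = load_agent_tools(agent_id, project_root)
        # Run the LLM call plus tool loop and collect the final content
        content, _tool_history = call_with_tool_loop(messages, url, key, model,
                                                     call_llm_fn, tool_defs,
                                                     tool_funcs, max_tool_loops)
        replies[agent_id] = content
    return replies
```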

2.5 Tube Layer Infrastructure

2.5.1 Core Engine

FileLinesRoleWhat AI Needs to Know
tube/tube_runner.py~370Main loop engine: polls tubes.json, checks trigger conditions, executes steps serially (with retry and failure strategies), writes logs, manages staging data passingTubeRunner(interval=15).run() — runs as daemon thread in chat_server

Runtime mechanism key points:

- Re-reads tube/tubes.json on every polling cycle (default interval 15 seconds), so edits take effect without a restart
- Checks each enabled tube's trigger conditions via the pluggable sources in triggers/
- Executes steps serially as subprocesses built by targets/, applying per-step retry and failure strategies
- Passes data between steps via the staging directory tube/staging/{tube_id}_{timestamp}/
- Writes every lifecycle event to tube/tube_log.jsonl

2.5.2 Pluggable Trigger Sources triggers/

Each .py file is a trigger source type with unified interface:

def check(config: dict, state: dict) -> bool
| File | Lines | Role | Config Example |
|---|---|---|---|
| triggers/cron.py | ~44 | Scheduled trigger (depends on croniter) | {"expr": "0 3 * * *"} |
| triggers/manual.py | ~32 | Flag-file trigger (API or file creation) | {} (no config needed) |

state fields (provided by tube_runner):

| Field | Type | Description |
|---|---|---|
| now | datetime (UTC) | Current time |
| last_triggered | datetime or None | Last trigger time for this tube |
| tube_id | str | Current tube ID |
| flag_dir | str | Manual-trigger flag-file directory path |

User extension: Drop a .py file in triggers/, implement check() interface. Zero code changes.
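A hypothetical trigger source illustrating the check(config, state) contract and the state fields from the table above. Neither the weekday.py filename nor the "weekdays" config key exists in the framework; both are assumptions for illustration.

```python
# tube/triggers/weekday.py -- hypothetical trigger source, not shipped
# with the framework; shown only to illustrate the check() contract.
def check(config: dict, state: dict) -> bool:
    """Fire at most once per day, and only on configured weekdays (0 = Monday)."""
    now = state["now"]  # provided by tube_runner, per the state-fields table
    if now.weekday() not in config.get("weekdays", [0, 1, 2, 3, 4]):
        return False
    last = state.get("last_triggered")
    return last is None or last.date() < now.date()
```

Referencing it from tubes.json would then be a matter of using the filename as the trigger type, per the note above.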

2.5.3 Pluggable Drive Targets targets/

Each .py file is a drive target type with unified interface:

def build_command(step: dict, project_root: str) -> list[str]
| File | Lines | Role | step.type |
|---|---|---|---|
| targets/agent.py | ~34 | Builds the Agent subprocess command | "agent" |
| targets/dispatch.py | ~40 | Builds the Dispatch subprocess command | "dispatch" |

User extension: Drop a .py file in targets/, implement build_command() interface. Zero code changes. Exception: type=tube steps don't go through targets/ modules; they're handled inline recursively by tube_runner.
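A hypothetical drive target illustrating the build_command(step, project_root) contract. The script.py filename and the "target"/"args" step keys are assumptions for illustration, not part of the framework.

```python
# tube/targets/script.py -- hypothetical drive target, not part of the
# framework; shown only to illustrate the build_command() contract.
import os


def build_command(step: dict, project_root: str) -> list:
    """Run an arbitrary Python script under the project root, with optional args."""
    script = os.path.join(project_root, step["target"])
    return ["python3", script] + list(step.get("args", []))
```

tube_runner would then execute the returned argv as a subprocess, the same way it runs the built-in agent and dispatch targets.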

2.5.4 Tube API Endpoints

| Method | Path | Function |
|---|---|---|
| GET | /api/tubes | List all tube definitions + real-time running/idle status |
| GET | /api/tube/status | Lightweight status query (id, enabled, status only) |
| POST | /api/tube/trigger | Manual trigger (body: {"tube_id": "xxx"}) |
| GET | /api/tube/log?tail=50&tube_id=xxx | Query logs (filterable) |
| GET | /api/tube/log/grouped?per_tube=10 | Return logs grouped by tube |
| DELETE | /api/tube/log?tube_id=xxx | Clear logs (per tube if tube_id is given) |
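The trigger endpoint can be called from Python with only the standard library; a sketch, assuming the default host and port used elsewhere in this document:

```python
import json
import urllib.request

# Build the POST request for /api/tube/trigger, per the endpoint table above.
def build_trigger_request(tube_id: str, base_url: str = "http://127.0.0.1:5000"):
    payload = json.dumps({"tube_id": tube_id}).encode("utf-8")
    return urllib.request.Request(base_url + "/api/tube/trigger", data=payload,
                                  headers={"Content-Type": "application/json"})


def trigger_tube(tube_id: str, base_url: str = "http://127.0.0.1:5000") -> str:
    """Fire a tube manually and return the raw response body."""
    with urllib.request.urlopen(build_trigger_request(tube_id, base_url),
                                timeout=10) as resp:
        return resp.read().decode("utf-8")
```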

2.5.5 tube_log.jsonl Event Types

| Event | Key Fields | Meaning |
|---|---|---|
| runner_started | interval | tube_runner started |
| runner_stopped | (none) | tube_runner stopped |
| tube_triggered | tube_id, step_count | Tube was triggered |
| step_started | tube_id, step_index, step_type, step_target, payload | A step began |
| step_completed | tube_id, step_index, step_type, step_target, exit_code, duration_sec | A step succeeded |
| step_failed | tube_id, step_index, step_type, step_target, exit_code, duration_sec, stderr_tail | A step failed |
| step_retry | tube_id, step_index, attempt, max_attempts, delay_sec | A step is retrying |
| tube_completed | tube_id, duration_sec | All steps completed |
| tube_stopped | tube_id, stopped_at_step, duration_sec | Stopped due to step failure |
| trigger_error | tube_id, trigger_type, error | Trigger check raised an exception |
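When troubleshooting, the failing steps can be pulled out of the log directly. A minimal sketch, assuming each JSONL record carries an "event" field and the key fields named in the table above:

```python
import json

# Extract step_failed events from tube_log.jsonl so the stderr tails
# are visible at a glance; field names follow the event table.
def failed_steps(lines):
    out = []
    for line in lines:
        event = json.loads(line)
        if event.get("event") == "step_failed":
            out.append((event["tube_id"], event["step_index"],
                        event.get("stderr_tail", "")))
    return out
```

Usage would look like `failed_steps(open("tube/tube_log.jsonl"))`.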

2.6 Framework-Level Utility Tools

| File | Role | User |
|---|---|---|
| generate_manifest.py | Scans directories to generate MANIFEST.json | External LLM via Bash |
| validate.py | Validates agent/tool/tube config legality | External LLM via Bash |
| auto_commit.sh | Git snapshot (safety net before batch modifications) | External LLM via Bash |

These three files are Infrastructure, but external LLMs only call them, never modify them.


3. Adjustable Components Inventory (AI Operation Targets)

3.1 Agent Adjustable Components

Within each agents/xxx/ folder:

| File | Purpose | Modification Scenarios |
|---|---|---|
| agent_config.json | Model, provider, context sources, skill trigger rules | Change model, modify context file list, add/remove skill triggers |
| SOUL.md | Agent identity definition (referenced via context_files) | Change personality, role, behavioural guidelines |
| skills/*.md | Task instruction files | Add/remove instruction content |
| tools/*_tools.py | Agent's available tools | Add/remove tools (drop file = effective; auto-discovered via hot-reload) |
| context/sliding_window.py | This Agent's trimming strategy | When trimming behaviour different from the default is needed |

Engine code (core/direct_llm.py, core/tool_executor.py) is semi-infrastructure: not modified by default after copying from template. Only modify when the user needs this Agent to be fundamentally different from others at the engine level — this is the core promise of "possibility management."

Engine modification protocol: Each agent has an independent engine copy. When modifying engine logic:

1. Modify one agent's copy and test.
2. After confirmation, diff against the other agents' same-named files.
3. Sync individually to the agents that need the same modification (not all agents need syncing; some may have special logic).
4. After syncing, run python3 validate.py all

3.2 Dispatch Adjustable Components

Within each dispatch/xxx/ folder:

| File | Purpose | Modification Scenarios |
|---|---|---|
| dispatch.py | Orchestration logic | Change collaboration mode (round-robin → debate → star topology → serial pipeline) |
| dispatch_config.json | Participating agents, rounds, trimming strategy, context_files | Add/remove agents, change rounds, change context sources |
| rules/*.md | Agent's role definition in this collaboration | Change an agent's role in the collaboration |

Key design: Dispatch does not call Agent's call_agent. It directly calls lib/llm_client.call_llm(), assembling context itself via context_injector. The Agent folder serves Dispatch only as a data source (SOUL.md, rules, config), not as an executor.

3.3 Tube Adjustable Components

| File | Purpose | Modification Scenarios |
|---|---|---|
| tube/tubes.json | Declarative definition of all tubes; single source of topology truth | Add/remove tubes, change triggers, change steps, change signal topology |
| pages/tube-dashboard.html | Tube visual monitoring page | Customise UI display |
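For orientation, a tube entry might look like the sketch below. The field shapes are extrapolated from the MANIFEST.json excerpt in section 0.1 (enabled, triggers, steps), which may condense the real schema; copy an existing entry from tube/tubes.json and run python3 validate.py tube rather than trusting this sketch.

```json
{
  "site_check": {
    "enabled": true,
    "triggers": ["cron:*/10 * * * *", "manual"],
    "steps": ["agent:monitor"]
  }
}
```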

3.4 UI Adjustable Components

| File | Purpose | Modification Scenarios |
|---|---|---|
| index.html | Yellow pages, feature entry index | Generally unchanged (auto-discovery) |
| chat.html | Basic chat interface | Customise chat UI |
| pages/*.html | User-built feature pages | Add page = drop file |
| dispatch/xxx/*.html | Dispatch-specific UI | Customise collaboration interface |

4. AI Operations Decision Tree

Received requirement
  │
  ├─ Need a new Agent?
  │   → bash auto_commit.sh "snapshot before adding agent"
  │   → cp -r template/ agents/xxx/
  │   → Edit SOUL.md
  │   → Edit agent_config.json (set context_files, skills, model)
  │   → Don't touch engine code
  │   → python3 validate.py agent xxx
  │   → python3 generate_manifest.py
  │
  ├─ Need to change Agent's loaded files?
  │   → Edit agent_config.json's context_files array
  │   → Add path = add file, remove path = remove file
  │   → Don't touch direct_llm.py
  │
  ├─ Need to add tools to Agent?
  │   → Reference template/tools/TOOL_CONTRACT.md for format
  │   → Drop *_tools.py file in agents/xxx/tools/
  │   → Hot-reload auto-discovery, no code changes
  │   → python3 validate.py tool agents/xxx/tools/new_tools.py
  │
  ├─ Need Agent to execute shell commands?
  │   → cp template/tools/shell_tools.py agents/xxx/tools/
  │
  ├─ Need Agent to trigger Tubes?
  │   → cp template/tools/tube_tools.py agents/xxx/tools/
  │
  ├─ Need Agent to initiate multi-Agent collaboration?
  │   → cp template/tools/dispatch_tools.py agents/xxx/tools/
  │
  ├─ Need to add skills to Agent?
  │   → Drop .md file in agents/xxx/skills/
  │   → Method A: Add trigger rule in agent_config.json skills array (exact match)
  │   → Method B: Add entry in skill_router.md (semantic trigger, needs file_tools)
  │   → Method C: Add directly to context_files (full injection, always loaded)
  │
  ├─ Need a new Dispatch strategy?
  │   → cp -r dispatch/roundtable/ dispatch/xxx/
  │   → Edit dispatch.py ORCHESTRATION LOGIC section
  │   → Edit dispatch_config.json, rules/
  │   → Don't touch dispatch_base.py, session_manager.py, context_injector.py
  │
  ├─ Need to adjust signal topology?
  │   → Edit tube/tubes.json
  │   → Don't touch tube_runner.py, triggers/, targets/
  │   → python3 validate.py tube
  │
  ├─ Need a new Tube trigger source type?
  │   → Drop .py file in tube/triggers/, implement check(config, state) → bool
  │   → Use filename as type in tubes.json
  │   → Don't touch tube_runner.py
  │
  ├─ Need a new Tube drive target type?
  │   → Drop .py file in tube/targets/, implement build_command(step, project_root) → list
  │   → Use filename as type in steps of tubes.json
  │   → Don't touch tube_runner.py
  │
  ├─ Need to manually trigger a Tube?
  │   → curl -X POST http://127.0.0.1:5000/api/tube/trigger \
  │       -H "Content-Type: application/json" \
  │       -d '{"tube_id": "xxx"}'
  │   → or: touch tube/manual_triggers/xxx
  │   → or: Agent triggers via tube_tools.py
  │
  ├─ Need to check system runtime status?
  │   → Agent chat history: tail agents/xxx/history.jsonl
  │   → Agent run log: tail agents/xxx/call_log.jsonl
  │   → Tube execution log: tail tube/tube_log.jsonl
  │   → Tube step output: Read tube/staging/{tube_id}_{timestamp}/
  │   → Error troubleshooting: grep "error\|failed" tube/tube_log.jsonl
  │   → Global validation: python3 validate.py all
  │
  ├─ Need to modify Agent engine logic?
  │   → bash auto_commit.sh "snapshot before engine modification"
  │   → Modify one agent's core/ copy first and test
  │   → diff against other agents' same-named files
  │   → Sync individually (note: not all agents need syncing)
  │   → python3 validate.py all
  │
  ├─ Need a new UI page?
  │   → Drop .html file in pages/
  │   → Auto-discovered by yellow pages
  │
  ├─ Need to rollback changes?
  │   → git diff → see what changed
  │   → git log --oneline -10 → see snapshot history
  │   → git checkout -- {file} → restore single file
  │   → git revert HEAD → rollback most recent commit
  │
  └─ None of the above?
      → May need to touch Infrastructure
      → First confirm: can this really not be achieved via Adjustable components?
      → If confirmed, read the corresponding Infrastructure source code, understand, then modify
      → After modification: python3 validate.py all to verify no other modules affected

5. Agent Capability Three-Layer Model Quick Reference

| Layer | Carrier | Config Location | Load Timing | Purpose |
|---|---|---|---|---|
| Identity & Knowledge | .md / .txt files | agent_config.json context_files | Every call | Information the Agent always knows |
| Conditional Instructions | .md files + trigger rules | agent_config.json skills | On keyword match | Instructions injected for specific scenarios |
| Execution Capability | *_tools.py files | Auto-discovered (drop file = effective; hot-reload) | Every call | Operations the Agent can perform |

6. File Change Impact Scope Quick Reference

| File Changed | Impact Scope |
|---|---|
| agents/xxx/SOUL.md | That Agent only |
| agents/xxx/agent_config.json | That Agent only |
| agents/xxx/skills/*.md | That Agent only |
| agents/xxx/tools/*.py | That Agent only (hot-reload, effective on next call) |
| agents/xxx/core/direct_llm.py | That Agent only (others have independent copies and are unaffected) |
| dispatch/xxx/dispatch.py | That Dispatch strategy only |
| dispatch/xxx/dispatch_config.json | That Dispatch strategy only |
| dispatch/xxx/rules/*.md | That Dispatch strategy only |
| tube/tubes.json | Signal topology (does not affect Agent or Dispatch internals) |
| tube/triggers/new.py | Only tubes using that trigger type |
| tube/targets/new.py | Only tubes using that target type |
| pages/*.html | That page only |
| MANIFEST.json | No runtime impact (read-only for external LLMs) |
| lib/llm_client.py | Global: all Agent and Dispatch LLM calls |
| chat_server.py | Global: all API routes + Tube Runner startup |
| tube/tube_runner.py | Global: all Tube polling and execution |
| template/ | No direct impact; only affects future new Agents |

Key characteristic: every row above lib/llm_client.py has an impact scope of "that module only" or "only tubes using it." This is the value of module isolation: the AI can safely make local modifications with no global side effects.


7. Hot-Reload Mechanism Quick Reference

| File Type | Hot-Reload Method | Effective Timing |
|---|---|---|
| agent_config.json | _load_config() re-reads on every call_agent() | Next agent call |
| SOUL.md / context_files | _load_context_files() re-reads every time | Next agent call |
| tools/*_tools.py | tool_executor checks the tools/ directory mtime | Next agent call |
| tubes.json | tube_runner _load_tubes() on every polling cycle | Next polling cycle (≤15 seconds) |
| dispatch_config.json | Re-read on every dispatch call | Next dispatch call |
| rules/*.md | Re-read via context_injector every time | Next dispatch call |
| pages/*.html | Flask route re-reads on every request | Next page visit |

Conclusion: After an external LLM modifies files, no server restart is needed. All Adjustable components are hot-loaded.


8. Dangerous Operations Checklist (Always Git Commit Before Modifying)

| Operation | Risk Level | Description |
|---|---|---|
| Delete the agents/ directory | High | Agent data and history permanently lost |
| Modify the provider key in config.json | High | All agents' LLM calls may fail |
| Modify any agent's core/direct_llm.py | Medium | Affects only that agent, but may break its core loop |
| Modify dispatch/xxx/dispatch_base.py | High | Affects all collaboration sessions for that strategy |
| Modify lib/llm_client.py | High | Affects all LLM calls globally |
| Modify tube/tube_runner.py | High | Affects all Tube execution globally |
| Delete or heavily modify existing tubes in tubes.json | Medium | May interrupt running automation flows |

Before operation: bash auto_commit.sh "description"
After operation: python3 validate.py all


9. External LLM Full Takeover Acceptance Scenario

1. Read MANIFEST.json
   → Understand what agents, tubes, dispatches the system has

2. Read PLAYBOOK.md
   → Know how to operate

3. Bash: bash auto_commit.sh "pre-modification snapshot"
   → Safety backup

4. Bash: cp -r template/ agents/monitor/
   → Create new agent

5. Write agents/monitor/SOUL.md
   → Write "You are a website monitoring agent, check site status when called..."

6. Edit agents/monitor/agent_config.json
   → Set display_name, model

7. Read template/tools/TOOL_CONTRACT.md
   → Understand tool file format

8. Write agents/monitor/tools/http_tools.py
   → Write HTTP check tool conforming to contract

9. Bash: python3 validate.py agent monitor
   → Output "OK"

10. Edit tube/tubes.json
    → Add a tube triggering monitor agent every 10 minutes

11. Bash: python3 validate.py tube
    → Output "OK"

12. Bash: python3 generate_manifest.py
    → Update MANIFEST.json

13. Bash: bash auto_commit.sh "Added monitor agent and scheduled check tube"
    → Save modifications

14. After 10 minutes:
    Read tube/tube_log.jsonl → Confirm tube triggered
    Read agents/monitor/call_log.jsonl → Confirm agent ran

15. If issues:
    Bash: git diff → See what changed
    Bash: git revert HEAD → Rollback