Data Machine Documentation

Complete user and agent documentation for Data Machine, the WordPress automation product layer that combines pipelines, flows, jobs, chat agents, system tasks, policy-gated tools, REST/WP-CLI surfaces, and extension-owned handlers on top of the generic Agents API substrate.

Agent Orientation

If you are an agent landing in this repository, start with this map:

Data Machine core
  -> pipelines/flows/jobs: repeatable automation and execution state
  -> engine/actions: bounded cycle execution and job transitions
  -> AI runtime: request assembly, tools, policies, memory, transcripts
  -> REST/WP-CLI/admin: operator and integration surfaces

Agents API dependency
  -> generic agent contracts, durable conversation loop, memory-store and
     approval vocabulary, transcript/lock/event primitives

Extension plugins
  -> additional handlers, tools, abilities, bundle extras, GitHub/workspace,
     social, business, editor, frontend chat, and events behavior

Core should expose behavior through abilities, hooks, REST, CLI, handlers, and bundle contracts. Extension-specific behavior belongs in extension plugins unless it is a generic Data Machine primitive.

Quick Navigation

Core Concepts

Architecture Deep Dives

Engine & Services

Handler Documentation

  • Fetch Handlers: Source-specific data retrieval with deduplication, filtering, and engine data storage.
  • Publish Handlers: Modular destination integrations with consistent response formatting and logging.
  • Upsert Handlers: Identity-aware create-or-update operations — find existing content by identity strategy, update if changed, create if new.
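The create-or-update decision described for upsert handlers can be illustrated with a minimal sketch. This is not Data Machine's implementation: `ContentStore`, `content_hash`, and `upsert` are hypothetical names, the store is an in-memory stand-in for a real destination, and hashing the fields is just one assumed way to detect "changed".

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ContentStore:
    """Hypothetical in-memory destination keyed by an identity string."""
    items: dict = field(default_factory=dict)   # identity -> stored fields
    hashes: dict = field(default_factory=dict)  # identity -> content digest


def content_hash(fields: dict) -> str:
    # Serialize with sorted keys so key order does not affect the digest.
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def upsert(store: ContentStore, identity: str, fields: dict) -> str:
    """Return 'created', 'updated', or 'unchanged' for one incoming item."""
    digest = content_hash(fields)
    if identity not in store.items:
        # No existing content for this identity: create it.
        store.items[identity] = dict(fields)
        store.hashes[identity] = digest
        return "created"
    if store.hashes[identity] == digest:
        # Existing content is identical: skip the write.
        return "unchanged"
    # Existing content differs: overwrite it.
    store.items[identity] = dict(fields)
    store.hashes[identity] = digest
    return "updated"
```

In this sketch the identity is a plain string such as a source URL or external ID. A real identity strategy may combine several fields, so treat the key choice as an assumption.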

AI Tools

  • Tools Overview: Static, ability-backed, and adjacent handler tools available to AI agents.
  • Execute Workflow: Modular execution of multi-step workflows from the chat toolset.
  • Static Registry Tools: Research, memory, workflow-management, and site-operation tools registered through datamachine_tools.
  • Pipeline Handler Tools: Runtime tools generated from adjacent fetch, publish, and upsert handlers.
  • Policy-Gated Tools: Tools are composed from sources and filtered by mode, memory, action, and handler policies before exposure.
  • Global Tools: Local Search, Web Fetch, WordPress Post Reader, daily memory, image generation, search-console/analytics, and others used across agents.
  • Chat Tools: AddPipelineStep, ApiQuery, ConfigureFlowSteps, ConfigurePipelineStep, CreateFlow, CreatePipeline, RunFlow, UpdateFlow, and other workflow management tools.
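The compose-then-filter idea behind policy-gated tools can be sketched as follows. Everything here is illustrative: `Tool`, `mode_policy`, `memory_policy`, and `resolve_tools` are hypothetical names, the mode labels and the "later source overrides earlier by name" rule are assumptions, and only two of the policy kinds named above are modeled. The real resolution code lives under inc/Engine/AI/Tools/.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Tool:
    name: str
    source: str             # illustrative label, e.g. "static" or "handler"
    modes: frozenset        # modes in which the tool may be exposed
    writes_memory: bool = False


# A policy receives a tool and a request context and returns True to keep it.
Policy = Callable[[Tool, dict], bool]


def mode_policy(tool: Tool, ctx: dict) -> bool:
    return ctx["mode"] in tool.modes


def memory_policy(tool: Tool, ctx: dict) -> bool:
    # Tools that write memory are kept only when the context allows writes.
    return ctx.get("memory_writable", False) or not tool.writes_memory


def compose_tools(sources: Iterable[Iterable[Tool]]) -> List[Tool]:
    # Assumption: a later source replaces an earlier tool with the same name.
    merged: Dict[str, Tool] = {}
    for source in sources:
        for tool in source:
            merged[tool.name] = tool
    return list(merged.values())


def resolve_tools(sources, policies: Iterable[Policy], ctx: dict) -> List[Tool]:
    # A tool is exposed only if every policy accepts it.
    policies = list(policies)
    return [t for t in compose_tools(sources) if all(p(t, ctx) for p in policies)]
```

Requiring every policy to pass keeps each policy independent of the others. Adding an action or handler policy would then mean appending one more callable to the list.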

API Reference

  • API Overview: Canonical REST route inventory sourced from inc/Api registrations (api/index.md).
  • Endpoints: Agents, Agent Ping, Analytics, Auth, Chat, Email, Execute, Files, Flows, Internal Links, Jobs, Logs, Pipelines, Settings, System, and other REST resources.
  • Workflow Endpoints: Execute, flows, queues, webhooks, jobs, logs, and processed items.
  • Agent Endpoints: Chat, chat sessions, agents, access, tokens, memory files, daily memory, Agent Ping callbacks, and pending-action resolution.
  • Discovery/Settings Endpoints: Handlers, step types, providers, tools, settings, system tasks, auth, and users.

Development

  • Hooks: Core actions, filters, and engine hooks for extension development.
  • REST Integration: Patterns for extending the REST API and custom endpoints.
  • Extension Boundaries: Core provides generic primitives; data-machine-code, socials, business, editor, frontend-chat, events, and other plugins own their product-specific handlers/tools/abilities.
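The boundary between core primitives and extension-owned behavior can be pictured with a small model of a filter-style registry. This sketch assumes extension points such as datamachine_tools behave like WordPress-style filters, which the hooks documentation should confirm. `add_filter` and `apply_filters` below are simplified Python stand-ins, and the tool entries are hypothetical.

```python
from collections import defaultdict

# hook name -> list of (priority, callback) pairs
_filters = defaultdict(list)


def add_filter(hook, callback, priority=10):
    """Register a callback that may modify the value passed through a hook."""
    _filters[hook].append((priority, callback))


def apply_filters(hook, value, *args):
    """Pass value through every callback on the hook, lowest priority first."""
    # sorted() is stable, so callbacks with equal priority keep registration order.
    for _, callback in sorted(_filters[hook], key=lambda pair: pair[0]):
        value = callback(value, *args)
    return value


def core_tools(tools):
    # Core contributes a generic primitive (hypothetical entry).
    return {**tools, "local_search": {"owner": "core"}}


def extension_tools(tools):
    # An extension plugin contributes its product-specific tool (hypothetical entry).
    return {**tools, "github_issue": {"owner": "extension"}}


add_filter("datamachine_tools", core_tools, priority=10)
add_filter("datamachine_tools", extension_tools, priority=20)

registry = apply_filters("datamachine_tools", {})
```

In this model core never imports the extension. It only reads whatever the hook returns, which is the property the extension boundary is meant to protect.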

Admin Interface

  • Pipeline Builder: React-based page for creating pipelines, configuring steps, and enabling tools.
  • Settings Configuration: Provider credentials, tool defaults, and global behavior settings.
  • Jobs Management: React-based job history and admin cleanup actions.

Documentation Structure

docs/
├── overview.md                        # System overview, data flow, and key concepts
├── architecture.md                    # Execution engine, architecture principles, and shared components
├── architecture/                      # Architecture deep dives (axes, policies, primitives)
├── CHANGELOG.md                       # Semantic changelog for releases
├── core-system/                       # Engine, services, and core infrastructure pieces
│   ├── abilities-api.md               # Current datamachine/* ability domains, permissions, and registration contract
│   ├── ai-directives.md               # AI directive system and priority hierarchy
│   ├── ai-conversation-loop.md        # Data Machine turn runner over Agents API conversation loop
│   ├── agent-bundles.md               # Portable agent bundle schema and extras contract
│   ├── daily-memory-system.md         # Daily memory files, task, tool, CLI, and REST surface
│   ├── engine-execution.md            # Execution cycle and Single Item Execution Model
│   ├── troubleshooting-problem-flows.md # Monitoring consecutive failures and no-items
│   ├── http-client.md                 # Centralized HTTP client architecture
│   ├── import-export.md               # Pipeline import/export functionality
│   ├── memory-policy.md               # Memory section policy and pending writes
│   ├── wp-cli.md                      # Command reference and aliases
│   └── [other core system docs...]
├── handlers/                          # Fetch, publish, and upsert handler specifics
├── ai-tools/                          # AI agent tools, workflows, and tool usage
├── admin-interface/                   # User guidance for admin pages
├── api/                               # REST API for consumers
│   ├── index.md                       # Complete API overview and common patterns
│   └── endpoints/                     # Individual REST endpoint documentation
│       ├── agents.md                  # Agent CRUD, access grants, and tokens
│       ├── agent-ping.md              # Bearer-token ping callback routes
│       ├── email.md                   # Email send/fetch/mailbox routes
│       ├── internal-links.md          # Link audit and diagnostics routes
│       └── errors.md                  # Error handling reference
├── development/                       # Developer-focused documentation
│   ├── agents-api-pre-extraction-audit.md # Agents API/Data Machine boundary record
│   ├── agents-api-duplicated-substrate-inventory.md # Remaining substrate follow-ups
│   ├── hooks/                         # Core actions, filters, and engine hooks
│   └── rest-integration.md            # REST API extension patterns
└── README.md                          # This navigation and orientation page

Current Runtime Surfaces

Use these source files as authoritative anchors when docs and code disagree:

  • inc/Cli/Bootstrap.php registers the WP-CLI command tree and aliases.
  • inc/Api/ registers the Data Machine REST product API.
  • inc/Engine/AI/conversation-loop.php shows the Data Machine turn runner and Agents API loop boundary.
  • inc/Engine/AI/Tools/ contains tool sources, execution, and policy resolution.
  • inc/Engine/AI/Actions/ contains pending-action adapters, REST/ability surfaces, and resolver plumbing.
  • inc/Engine/AI/System/Tasks/ contains background system tasks including daily memory and retention.

Component Coverage

Refer to the individual files listed above for implementation details, operational guidance, and API references.