Data Machine User Documentation

Last updated: May 21, 2026

AI-first WordPress plugin for automating and orchestrating content workflows with a visual pipeline builder, conversational chat agent, REST API, and extensibility through handlers and tools.

Agent-First Architecture

Data Machine is designed for AI agents as primary users, not just tool operators.

The Self-Orchestration Pattern

While humans use Data Machine to automate content workflows, AI agents can use it to automate themselves:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   AGENT     │ ──▶ │   QUEUE     │ ──▶ │  PIPELINE   │ ──▶ │ AGENT PING  │
│ queues task │     │  persists   │     │  executes   │     │  wakes agent│
│             │     │  context    │     │             │     │             │
└─────────────┘     └─────────────┘     └─────────────┘     └──────┬──────┘
       ▲                                                          │
       └──────────────────────────────────────────────────────────┘
                         Agent processes, queues next task

Key concepts:

Prompt Queue as Project Memory: Queue items persist across sessions, storing project context that survives context window limits. Your multi-week project becomes a series of queued prompts.
Agent Ping for Continuity: The agent_ping step type triggers external agents (via webhook) after pipeline completion. This is how the loop closes — you get notified when it’s your turn to act. Agent Ping is outbound-only; inbound triggers use the REST API.
Phased Execution: Complex projects execute in stages over days or weeks. Each stage completes, pings the agent, and the agent queues the next stage.
Autonomous Loops: An agent can run indefinitely: process result → queue next task → sleep → wake on ping → repeat. Use explicit stop conditions to avoid runaway loops.
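
The loop and its stop condition can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not Data Machine code: `run_agent_loop`, `demo_process`, and the `max_iterations` cap are hypothetical names, and an in-memory deque stands in for the persisted prompt queue.

```python
from collections import deque

def run_agent_loop(queue, process, max_iterations=10):
    """Drain a prompt queue until it is empty or the iteration cap is reached."""
    completed = []
    iterations = 0
    while queue and iterations < max_iterations:
        task = queue.popleft()              # next queued prompt
        result, follow_up = process(task)   # pipeline runs, then the agent reacts to the ping
        completed.append(result)
        if follow_up is not None:
            queue.append(follow_up)         # the agent queues the next stage
        iterations += 1
    return completed

def demo_process(task):
    # Hypothetical stage handler: each phase queues the next one until phase 3.
    phase = task["phase"]
    follow_up = {"phase": phase + 1} if phase < 3 else None
    return f"phase {phase} done", follow_up

results = run_agent_loop(deque([{"phase": 1}]), demo_process)
```

The cap is the explicit stop condition: a handler that always queues a follow-up still halts after `max_iterations` passes.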

System Architecture

Pipelines are reusable workflow templates that store handler order, tool selections, and AI settings.
Flows instantiate pipelines with schedule metadata, flow-level overrides, and runtime configuration values stored per flow.
Ephemeral Workflows (@since v0.8.0) are temporary, on-the-fly workflows triggered via the REST API. They skip database persistence for the workflow definition itself, using sentinel values (flow_id='direct', pipeline_id='direct') and dynamic configuration stored within the job’s engine snapshot.
Jobs track individual flow executions, persist engine parameters, and power the fully React-based Jobs dashboard for real-time monitoring. Jobs support parent-child relationships for batch execution via parent_job_id.
Steps execute sequentially (Fetch → AI → Publish/Update) with shared base classes that enforce validation, logging, and engine data synchronization.
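
As a rough illustration of sequential step execution and the ephemeral sentinel values, the sketch below runs three steps in order against one job. Only the `'direct'` sentinels and the `source_url` key come from the text above; the step functions, the `run_job` helper, and the dictionary shapes are assumptions made for the example.

```python
def fetch_step(packets, engine):
    engine["source_url"] = "https://example.com/item"  # illustrative value
    return packets + [{"step": "fetch", "content": "raw item"}]

def ai_step(packets, engine):
    latest = packets[-1]["content"]
    return packets + [{"step": "ai", "content": latest.title()}]

def publish_step(packets, engine):
    engine["published"] = packets[-2]["content"] if packets[-1]["step"] != "ai" else packets[-1]["content"]
    return packets + [{"step": "publish", "content": "ok"}]

def run_job(job, steps):
    """Run steps strictly in order, sharing the job's engine data."""
    packets = []
    for step in steps:
        packets = step(packets, job["engine_data"])
    return packets

# An ephemeral workflow has no stored flow or pipeline, so it carries sentinels.
ephemeral_job = {"flow_id": "direct", "pipeline_id": "direct", "engine_data": {}}
packets = run_job(ephemeral_job, [fetch_step, ai_step, publish_step])
```

Keeping the configuration inside the job itself is what lets an ephemeral run skip stored flow and pipeline records.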

Multi-Agent Architecture

Agent Registry: Agents are stored in datamachine_agents with unique slugs, owner relationships, and configuration.
Access Control: The datamachine_agent_access table implements role-based access (viewer, operator, admin) for sharing agents across WordPress users.
Resource Scoping: Agents have both agent_id and agent_slug, similar to WordPress posts having IDs and slugs. Storage tables keep agent_id; portable runtime/export contexts may also carry agent_slug. Queries filter by resolved agent context automatically.
Filesystem Isolation: Each agent gets its own directory under agents/{slug}/ for identity files (SOUL.md, MEMORY.md) and daily memory.
Three-Layer Directory System: Memory files are organized into shared (site-wide), agent (identity), and user (personal) layers below Data Machine’s files root in WordPress uploads.
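
The access model above can be pictured with a small role check. This sketch rests on two assumptions that the text does not state: that roles are ordered viewer < operator < admin, and that an agent's owner always has full access. The function name `has_agent_role` and the row shapes are hypothetical.

```python
ROLE_RANK = {"viewer": 1, "operator": 2, "admin": 3}  # assumed ordering

def has_agent_role(grants, agent, user_id, required_role):
    """Return True if the user owns the agent or holds a sufficient role."""
    if agent["owner_id"] == user_id:   # assumption: owners bypass grants
        return True
    for grant in grants:
        if grant["agent_id"] == agent["agent_id"] and grant["user_id"] == user_id:
            return ROLE_RANK[grant["role"]] >= ROLE_RANK[required_role]
    return False

agent = {"agent_id": 7, "agent_slug": "editor-bot", "owner_id": 1}
grants = [{"agent_id": 7, "user_id": 2, "role": "operator"}]
```

Grants are keyed by the numeric `agent_id`, mirroring the note that storage tables keep `agent_id` while the slug travels with portable contexts.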

Memory System

Agent Memory Files

Markdown files are organized in three layers:

Layer	Directory	Contents
Shared	shared layer	SITE.md, RULES.md (site-wide context)
Agent	agent slug layer	SOUL.md, MEMORY.md (agent identity and knowledge)
User	user ID layer	USER.md (human preferences)

Daily Memory System

Maintains MEMORY.md with the single daily_memory prompt when activity or size requires compaction
Archives session-specific content to daily files, including deterministic overflow artifacts for very large memory files

Memory Path Discovery

```bash
wp datamachine memory paths --allow-root
```

Together, the prompt queue, Agent Ping, phased execution, and persistent memory transform Data Machine from a content automation tool into a self-scheduling execution layer for AI agents.

Abilities API

The Abilities API exposes Data Machine operations as focused ability classes. This includes managing the multiple agents that a single WordPress installation can host (@since v0.36.1), each with its own identity, memory, and resource scope.

Flow, pipeline, and flow-step operations live in focused classes under inc/Abilities/Flow/, inc/Abilities/Pipeline/, and inc/Abilities/FlowStep/; PipelineStepAbilities handles pipeline-step ordering and synchronization.
Job abilities monitor execution outcomes, retries, manual failure, recovery, summaries, and deletion.
ProcessedItemsAbilities deduplicates content across executions by tracking previously processed identifiers.
AgentAbilities manages agent CRUD, active-agent selection, bundle import/export, audience access grants, renaming, and deletion.
AgentTokenAbilities manages per-agent bearer tokens for external clients.
AgentMemoryAbilities provides section-based read, write, append, search, and self-memory operations on memory files.
DailyMemoryAbilities manages daily memory files — read, write, list, search, and delete by date.
EmailAbilities, SourceInventoryAbility, and SourceAggregateAbility expose email operations plus source-level inventory and aggregate reporting.
LogAbilities and the LogRepository aggregate log entries in the wp_datamachine_logs table for filtering in the admin UI.
Cache invalidation is handled by ability-level clearCache() methods to ensure dynamic handler and step type registrations are immediately reflected across the system.
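
To make the deduplication idea concrete, here is a minimal sketch of tracking processed identifiers across runs. It is not the ProcessedItemsAbilities interface: the class name, the method names, and the choice to key on step, source type, and item ID are all assumptions for illustration.

```python
class ProcessedItemsTracker:
    """Remembers which item identifiers each step has already handled."""

    def __init__(self):
        self._seen = set()

    def filter_new(self, flow_step_id, source_type, item_ids):
        # Keep only the identifiers that were never marked as processed.
        return [
            item_id for item_id in item_ids
            if (flow_step_id, source_type, item_id) not in self._seen
        ]

    def mark_processed(self, flow_step_id, source_type, item_id):
        self._seen.add((flow_step_id, source_type, item_id))

tracker = ProcessedItemsTracker()
first_run = tracker.filter_new("step-1", "rss", ["a", "b"])
for item_id in first_run:
    tracker.mark_processed("step-1", "rss", item_id)
second_run = tracker.filter_new("step-1", "rss", ["a", "b", "c"])
```

Scoping the key to a step means the same item can still be processed by a different flow step.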

See Multi-Agent Architecture for details.

System Tasks Framework

Data Machine uses WordPress itself as the persistent memory layer for AI agents — files on disk, conversations in the database, context assembled at request time.

Job lifecycle: completeJob(), failJob(), reschedule() with attempt tracking (max 24 retries)
Editable prompts: getPromptDefinitions() system with overrides stored in datamachine_task_prompts option
Undo system: supportsUndo() and undo() for reversible operations, with effect types for post content, meta, attachments, and featured images
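
The attempt-tracking rule can be sketched as follows. Only the limit of 24 comes from the text above; the job dictionary, the status strings, and the exact point at which a job is marked failed are assumptions, and this `reschedule` is a stand-in rather than the plugin's method.

```python
MAX_ATTEMPTS = 24  # limit quoted in the job lifecycle description

def reschedule(job):
    """Queue another attempt, or fail the job once the limit is used up."""
    if job["attempts"] >= MAX_ATTEMPTS:
        job["status"] = "failed"
        return False
    job["attempts"] += 1
    job["status"] = "pending"
    return True

job = {"id": 101, "attempts": 0, "status": "processing"}
retries = 0
while reschedule(job):
    retries += 1
```

Counting attempts on the job itself keeps the limit intact even when each retry runs in a separate request.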

Built-in System Tasks

Task Type	Class	Description
`image_generation`	`ImageGenerationTask`	AI-powered image generation
`image_optimization`	`ImageOptimizationTask`	Image compression and optimization
`alt_text_generation`	`AltTextTask`	AI-generated alt text for images
`internal_linking`	`InternalLinkingTask`	Automated internal link injection
`daily_memory_generation`	`DailyMemoryTask`	Daily memory synthesis and MEMORY.md cleanup
`meta_description_generation`	`MetaDescriptionTask`	AI-generated meta descriptions

Job Undo System

Undo reverses each recorded effect according to its type:

post_content_modified — restores WordPress revisions
post_meta_set — restores previous meta values
attachment_created — deletes created attachments
featured_image_set — restores or removes thumbnails

```bash
wp datamachine jobs undo <job_id> --allow-root
wp datamachine jobs undo <job_id> --dry-run --allow-root
```
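
One way to picture effect-based undo with a dry-run mode is the sketch below. The four effect type names come from the list above. Everything else is assumed for illustration: the `undo_job` function, the in-memory `site` dictionary, the shape of an effect record, and the choice to reverse effects newest first.

```python
import copy

def undo_job(effects, site, dry_run=False):
    """Reverse recorded effects, newest first, and return the planned actions."""
    actions = []
    for effect in reversed(effects):
        kind, target = effect["type"], effect["target"]
        previous = effect.get("previous")
        actions.append(f"{kind}:{target}")
        if dry_run:
            continue  # report only, leave the site untouched
        if kind == "post_content_modified":
            site["content"][target] = previous        # restore the prior revision
        elif kind == "post_meta_set":
            site["meta"][target] = previous           # restore the prior meta value
        elif kind == "attachment_created":
            site["attachments"].discard(target)       # delete the created attachment
        elif kind == "featured_image_set":
            if previous is None:
                site["thumbnail"].pop(target, None)   # remove the thumbnail
            else:
                site["thumbnail"][target] = previous  # restore the old thumbnail

    return actions

effects = [
    {"type": "post_content_modified", "target": 1, "previous": "old body"},
    {"type": "post_meta_set", "target": (1, "description"), "previous": "old meta"},
    {"type": "attachment_created", "target": 7},
    {"type": "featured_image_set", "target": 1, "previous": None},
]
site = {
    "content": {1: "new body"},
    "meta": {(1, "description"): "new meta"},
    "attachments": {7},
    "thumbnail": {1: 7},
}
preview = undo_job(effects, copy.deepcopy(site), dry_run=True)
applied = undo_job(effects, site)
```

The dry run returns the same action list as the real run, which is the property that makes a preview trustworthy.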

Data Flow

DataPacket standardizes the payload (content, metadata, attachments) that AI agents receive, keeping packets chronological and clean of URLs when not needed.
EngineData stores engine-specific parameters such as source_url, image_url, and flow context, which fetch handlers persist via the datamachine_engine_data filter for downstream handlers.
FilesRepository modules (DirectoryManager, FileStorage, RemoteFileDownloader, ImageValidator, FileCleanup, FileRetrieval) isolate file storage per flow, validate uploads, and enforce automatic cleanup after jobs complete.
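
The split between packet payload and engine parameters might look like the sketch below. The field names `content`, `metadata`, and `attachments` and the keys `source_url` and `image_url` are taken from the text. The dataclass, the `fetch_item` helper, and the raw item shape are assumptions, and the real DataPacket and EngineData classes may differ.

```python
from dataclasses import dataclass, field

@dataclass
class DataPacket:
    content: str
    metadata: dict = field(default_factory=dict)
    attachments: list = field(default_factory=list)

def fetch_item(raw_item, engine_data):
    """Build a packet for the AI and park URLs in engine data instead."""
    engine_data["source_url"] = raw_item["url"]
    if raw_item.get("image"):
        engine_data["image_url"] = raw_item["image"]
    return DataPacket(content=raw_item["body"], metadata={"title": raw_item["title"]})

engine_data = {}
packet = fetch_item(
    {"url": "https://example.com/a", "image": "https://example.com/a.png",
     "title": "Hello", "body": "Article text"},
    engine_data,
)
```

Downstream handlers can still read the URLs from engine data, while the packet handed to the AI stays free of them.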

AI Integration

Tool-first architecture enables AI agents (pipeline, chat, and system) to call tools resolved through ToolPolicyResolver, ToolSourceRegistry, and the Data Machine datamachine_tools registry.
PromptBuilder + RequestBuilder apply layered directives via the datamachine_directives filter so every request includes identity, context, and site-specific instructions.
Static registry tools include research, site operations, memory, and workflow-management tools whose modes decide whether they appear in chat, pipeline, or system requests.
Adjacent handler tools are generated from previous/next pipeline steps so AI steps can publish, upsert, or skip items using the neighboring handler configuration.
WP_Agent_Tool_Parameters + ToolResultFinder gather parameter metadata for tools and interpret results inside data packets to keep conversations consistent.
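
Mode-based tool visibility can be illustrated with a small filter. The tool names, the registry shape, and the `tools_for_mode` function are hypothetical. Only the three modes (chat, pipeline, system) and the idea that adjacent handler tools join pipeline requests come from the text above.

```python
TOOL_REGISTRY = {
    "web_search": {"modes": {"chat", "pipeline"}},
    "update_flow": {"modes": {"chat"}},
    "append_memory": {"modes": {"chat", "pipeline", "system"}},
}

def tools_for_mode(registry, mode, adjacent_handler_tools=()):
    """List the tools visible in a request of the given mode."""
    names = [name for name, tool in registry.items() if mode in tool["modes"]]
    if mode == "pipeline":
        # Tools generated from neighboring pipeline steps only apply here.
        names.extend(adjacent_handler_tools)
    return sorted(names)

pipeline_tools = tools_for_mode(TOOL_REGISTRY, "pipeline", ["wordpress_publish"])
chat_tools = tools_for_mode(TOOL_REGISTRY, "chat")
system_tools = tools_for_mode(TOOL_REGISTRY, "system")
```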

Authentication & Security

Authentication providers extend BaseAuthProvider, BaseOAuth1Provider, or BaseOAuth2Provider under /inc/Core/OAuth/; concrete providers live next to their handlers in core or extension plugins.
OAuth handlers (OAuth1Handler, OAuth2Handler) standardize callback handling, nonce validation, and credential storage.
Capability checks use Data Machine capabilities through PermissionHelper, with support for WP-CLI, Action Scheduler, pre-authenticated contexts, and agent bearer-token ceilings. WordPress nonces guard browser REST requests; inputs run through sanitize_* helpers before hitting abilities.
Multi-agent permissions: PermissionHelper handles agent-level access checks via resolve_scoped_agent_id(), can_access_agent(), and owns_agent_resource().
HttpClient centralizes outbound HTTP requests with consistent headers, browser-mode simulation, timeout control, and logging via datamachine_log.

Scheduling & Jobs

Action Scheduler drives scheduled flow execution while REST endpoints handle immediate runs.
Flow schedules support manual runs, one-time execution, and recurring intervals (from 5 minutes to weekly). See Scheduling Intervals for available options.
System task scheduling: DailyMemoryTask and other system tasks run on cron schedules via Action Scheduler.
Batch execution: Jobs support parent-child relationships via parent_job_id for processing multiple items in coordinated batches.
Job abilities and repositories update statuses, emit extensibility actions (datamachine_update_job_status), and link jobs to logs and processed items for auditing.
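
As a rough picture of batch execution, child jobs can be grouped under their parent by `parent_job_id`. The `batch_summary` helper, the job dictionaries, and the status strings below are assumptions for illustration. Only the `parent_job_id` link comes from the text.

```python
from collections import Counter

def batch_summary(jobs, parent_id):
    """Count the child jobs of one parent, grouped by status."""
    children = [job for job in jobs if job.get("parent_job_id") == parent_id]
    return {
        "total": len(children),
        "by_status": dict(Counter(job["status"] for job in children)),
    }

jobs = [
    {"job_id": 10, "parent_job_id": None, "status": "processing"},
    {"job_id": 11, "parent_job_id": 10, "status": "completed"},
    {"job_id": 12, "parent_job_id": 10, "status": "completed"},
    {"job_id": 13, "parent_job_id": 10, "status": "failed"},
    {"job_id": 14, "parent_job_id": None, "status": "completed"},
]
summary = batch_summary(jobs, 10)
```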

Admin Interface

React-First Architecture: Admin pages are React apps built with @wordpress/components and TanStack Query for server state.
Client UI state: The Pipelines page uses a small Zustand store for UI state (pipeline selection, modals, chat sidebar). Other pages may use local React state.
Pipeline Builder: Visual pipeline/flow configuration with modal-driven step and handler settings.
Job Management: React dashboard for job history with server-driven pagination and admin cleanup modal.
Logs Interface: React logs viewer with filtering controls and REST-backed content loading.
Integrated Chat: Collapsible sidebar for context-aware pipeline automation and AI-driven workflow assistance, using specialized tools to manage the entire ecosystem.
Agent Management: Agent creation, configuration, and access control UI.

Key Capabilities

Multi-agent support with isolated identity, memory, and resources per agent on a single WordPress installation.
Multi-platform publishing via core fetch/publish/upsert handlers for files, RSS, email, and WordPress, plus extension-provided handlers for social, business, and event destinations.
Daily memory system for automatic temporal knowledge management with AI-driven pruning.
System tasks for background AI operations (image generation, alt text, internal linking, meta descriptions) with undo support.
Extension points through filters such as datamachine_handlers, datamachine_tools, agents_api_tool_sources, datamachine_step_types, datamachine_auth_providers, and datamachine_engine_data.
Directive orchestration ensures every AI request is context-aware, tool-enabled, and consistent with site policies.
Chartable logging, deduplication, and error handling keep operators informed about job outcomes and prevent duplicate processing.
