Universal Web Scraper Test Command
WP-CLI command for testing the Universal Web Scraper handler with any target URL. Supports web pages, ICS feeds, and JSON APIs.
wp data-machine-events test-event-scraper --target_url=<url>
Parameters
Required
--target_url=<url>: The web page URL, ICS feed, or JSON API to test
Examples
Web Page Scraping
wp data-machine-events test-event-scraper --target_url=https://example.com/events
ICS Calendar Feeds
wp data-machine-events test-event-scraper --target_url=https://tockify.com/api/feeds/ics/calendar-name
Google Calendar Export
wp data-machine-events test-event-scraper --target_url=webcal://calendar.google.com/calendar/ical/...
Output
The command displays:
- Target URL: The URL being tested
- Extraction Details:
  - Packet title
  - Source type (e.g., wix_events, json_ld, raw_html)
  - Extraction method
  - Event title and start date
  - Venue name and address
- Status: OK (complete venue/address coverage) or WARNING (incomplete coverage)
- Warnings: Any extraction warnings encountered
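When checking several venues at once, it can help to reduce each report to its Status value. The following is a minimal sketch, not part of the plugin: the helper names scraper_status and check_urls are hypothetical, and it assumes the report contains a line with "Status:" followed by OK or WARNING, which may not match the exact output format.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for summarising test-event-scraper runs.
# Assumption: the command prints a line containing "Status:" followed by
# OK or WARNING. Adjust the matching if the real output differs.

# Read a report on stdin and print OK, WARNING, or UNKNOWN.
scraper_status() {
  local line
  # Keep only the first line that mentions "status:" (case-insensitive).
  line=$(awk 'tolower($0) ~ /status:/ && !seen { print; seen = 1 }')
  case "$line" in
    *WARNING*) echo "WARNING" ;;
    *OK*)      echo "OK" ;;
    *)         echo "UNKNOWN" ;;
  esac
}

# Run the test command once per URL and print "<status><TAB><url>".
check_urls() {
  local url output
  for url in "$@"; do
    # Capture stdout and stderr; a failing run is reported as UNKNOWN
    # unless its output still contains a Status line.
    output=$(wp data-machine-events test-event-scraper --target_url="$url" 2>&1) || true
    printf '%s\t%s\n' "$(printf '%s\n' "$output" | scraper_status)" "$url"
  done
}
```

After sourcing the file, a call such as `check_urls https://example.com/events https://tockify.com/api/feeds/ics/calendar-name` prints one line per URL. Any URL reported as WARNING or UNKNOWN is worth re-running by hand to read the full report.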
Venue Coverage Warnings
The command evaluates venue data completeness:
- Missing venue name: Venue override required in flow configuration
- Missing address fields: Address, city, and state are required for geocoding
Raw HTML packets indicate AI extraction is needed for venue data.
Exit Codes
- 0: Command completed successfully
- 1: Error (missing required parameter)
Use Cases
- Handler Testing: Verify the scraper works on a new venue website
- Extraction Debugging: Inspect raw extraction results before running a full pipeline. If extraction fails, the command outputs the full raw HTML to assist in troubleshooting.
- Coverage Assessment: Check if venue data will be complete after import
- Platform Detection: Identify which extractor is being used (Wix, Squarespace, etc.)
Reliability & Debugging
The test command is essential for verifying the scraper’s Smart Fallback and Browser Spoofing capabilities. When testing URLs known to have strict bot detection, observe the logs for "retrying with standard mode" to confirm the fallback is functioning correctly.
ICS Calendar Feed Support
The Universal Web Scraper now directly supports ICS/iCal feed URLs, replacing the deprecated ICS Calendar handler.
Supported ICS Formats
- Direct .ics files
- Tockify feeds
- Google Calendar exports
- Apple Calendar exports
- Outlook calendar exports
- Any standard ICS/iCal feed
ICS Handler Migration
The legacy ICS Calendar handler is deprecated. Migrate existing flows using:
wp data-machine-events migrate-handlers --handler=ics_calendar --dry-run
To perform the migration:
wp data-machine-events migrate-handlers --handler=ics_calendar
The migration tool automatically:
ICS-Specific Output
When testing ICS feeds, the command displays: