WP_HTML_Active_Formatting_Elements
List of active formatting elements for handling mis-nested formatting tags.
Source: wp-includes/html-api/class-wp-html-active-formatting-elements.php
Since: 6.4.0
Access: private
Overview
The active formatting elements list handles cases where formatting elements are improperly nested in HTML. For example:
<p><b>Bold <i>Bold-Italic</b> Italic</i></p>
In a well-formed tree, the </b> cannot simply close the <b>, because the <i> opened inside it is still open. The active formatting elements list lets the parser re-create the formatting that such mis-nesting interrupts.
Markers are inserted when entering certain elements (APPLET, OBJECT, MARQUEE, TEMPLATE, TD, TH, CAPTION) to prevent formatting from "leaking" across element boundaries.
Methods
count()
Returns the number of elements in the list.
public function count()
Returns: Number of elements.
current_node()
Returns the last element in the list.
public function current_node()
Returns: WP_HTML_Token|null — Last element or null if empty.
contains_node()
Checks if a specific token is in the list.
public function contains_node( WP_HTML_Token $token )
| Parameter | Type | Description |
|---|---|---|
| $token | WP_HTML_Token | Token to find |
Returns: bool — true if found.
push()
Adds an element to the list.
public function push( WP_HTML_Token $token )
| Parameter | Type | Description |
|---|---|---|
| $token | WP_HTML_Token | Token to add |
Implements the "Noah’s Ark clause": if the list already holds three elements with the same tag name, namespace, and attributes after the last marker (or anywhere in the list when it contains no marker), the earliest of them is removed before the new element is added.
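The clause described above can be illustrated with a minimal, standalone sketch. It does not use the WordPress classes: entries are plain arrays, a marker is the string 'marker', and the names sketch_push_with_noahs_ark() and sketch_same_element() are hypothetical helpers introduced only for this illustration.

```php
<?php
/**
 * Hypothetical helper, not part of WordPress. Pushes $element onto $list
 * while applying the Noah's Ark clause, and returns the updated list.
 */
function sketch_push_with_noahs_ark( array $list, array $element ): array {
	$matches = array();

	// Scan from newest to oldest and stop at the last marker, if any.
	for ( $index = count( $list ) - 1; $index >= 0; $index-- ) {
		if ( 'marker' === $list[ $index ] ) {
			break;
		}
		if ( sketch_same_element( $list[ $index ], $element ) ) {
			$matches[] = $index;
		}
	}

	// Three matching entries already exist: drop the earliest one.
	if ( count( $matches ) >= 3 ) {
		array_splice( $list, min( $matches ), 1 );
	}

	$list[] = $element;
	return $list;
}

/** Compares tag name, namespace, and attributes (attribute order is ignored). */
function sketch_same_element( array $a, array $b ): bool {
	$attrs_a = $a['attributes'];
	$attrs_b = $b['attributes'];
	ksort( $attrs_a );
	ksort( $attrs_b );

	return $a['tag'] === $b['tag']
		&& $a['namespace'] === $b['namespace']
		&& $attrs_a === $attrs_b;
}

$bold = array( 'tag' => 'B', 'namespace' => 'html', 'attributes' => array() );
$list = array();
for ( $n = 0; $n < 4; $n++ ) {
	$list = sketch_push_with_noahs_ark( $list, $bold );
}
echo count( $list ); // 3
```

Pushing a fourth identical B leaves three entries, because the oldest match is removed first. Matches that sit before the last marker are not counted.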
insert_marker()
Inserts a marker into the list.
public function insert_marker(): void
Markers prevent formatting from leaking across element boundaries. They are inserted when entering:
- APPLET
- OBJECT
- MARQUEE
- TEMPLATE
- TD
- TH
- CAPTION
remove_node()
Removes a specific token from the list.
public function remove_node( WP_HTML_Token $token )
| Parameter | Type | Description |
|---|---|---|
| $token | WP_HTML_Token | Token to remove |
Returns: bool — true if found and removed.
clear_up_to_last_marker()
Removes entries from the end of the list back to, and including, the last marker. Entries before that marker are left in place.
public function clear_up_to_last_marker(): void
Called when leaving elements that insert markers (APPLET, OBJECT, etc.) to clean up any formatting elements that were opened inside.
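To make the direction of the removal concrete, here is a minimal standalone sketch on a plain array. It assumes a marker is represented by the string 'marker'; sketch_clear_up_to_last_marker() is a hypothetical function, not the WordPress method.

```php
<?php
/**
 * Hypothetical helper, not part of WordPress. Pops entries from the end
 * of $list until a marker has been popped or the list is empty.
 */
function sketch_clear_up_to_last_marker( array $list ): array {
	while ( count( $list ) > 0 ) {
		$entry = array_pop( $list );
		if ( 'marker' === $entry ) {
			break;
		}
	}
	return $list;
}

// EM was opened before the marker, B and I after it.
$list = array( 'EM', 'marker', 'B', 'I' );
$list = sketch_clear_up_to_last_marker( $list );
echo implode( ', ', $list ); // EM
```

Only the entries added after the marker, plus the marker itself, are discarded. If the list holds no marker at all, this sketch empties it.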
walk_down()
Generator that walks from first (oldest) to last (newest).
public function walk_down()
Example:
// For '<em><strong><a>'
foreach ( $list->walk_down() as $node ) {
echo $node->node_name . ' -> ';
}
// Output: EM -> STRONG -> A ->
walk_up()
Generator that walks from last (newest) to first (oldest).
public function walk_up()
Example:
// For '<em><strong><a>'
foreach ( $list->walk_up() as $node ) {
echo $node->node_name . ' -> ';
}
// Output: A -> STRONG -> EM ->
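For readers unfamiliar with PHP generators, the following standalone sketch shows one way walk_down() and walk_up() style traversal could be built over an internal array. Sketch_Formatting_List is a hypothetical stand-in that stores bare node names; it is not the WordPress class and makes no claim about its internals.

```php
<?php
// Hypothetical stand-in for the list; not the WordPress class.
final class Sketch_Formatting_List {
	/** @var string[] Node names, oldest first. */
	private $stack = array();

	public function push( string $node_name ): void {
		$this->stack[] = $node_name;
	}

	/** Yields entries from oldest to newest. */
	public function walk_down(): Generator {
		for ( $index = 0, $total = count( $this->stack ); $index < $total; $index++ ) {
			yield $this->stack[ $index ];
		}
	}

	/** Yields entries from newest to oldest. */
	public function walk_up(): Generator {
		for ( $index = count( $this->stack ) - 1; $index >= 0; $index-- ) {
			yield $this->stack[ $index ];
		}
	}
}

$sketch = new Sketch_Formatting_List();
foreach ( array( 'EM', 'STRONG', 'A' ) as $name ) {
	$sketch->push( $name );
}

echo implode( ' -> ', iterator_to_array( $sketch->walk_up(), false ) ); // A -> STRONG -> EM
```

Because each method yields lazily, a caller can break out of the foreach as soon as it finds the entry it needs without visiting the rest of the list.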
Formatting Elements
The following elements are added to the active formatting elements list:
- A (anchor)
- B (bold)
- BIG
- CODE
- EM (emphasis)
- FONT
- I (italic)
- NOBR
- S (strikethrough)
- SMALL
- STRIKE
- STRONG
- TT (teletype)
- U (underline)
Usage Example
$list = new WP_HTML_Active_Formatting_Elements();
// Parsing '<td><b><i>' — marker inserted for TD
$list->insert_marker();
$list->push( new WP_HTML_Token( 'bk1', 'B', false ) );
$list->push( new WP_HTML_Token( 'bk2', 'I', false ) );
// List: [marker, B, I]
echo $list->count(); // 3
// Leaving TD — clear back to marker
$list->clear_up_to_last_marker();
// List is now empty
echo $list->count(); // 0
Reconstruction Algorithm
When the parser needs to reconstruct the active formatting elements (e.g., before inserting text or certain new elements):
1. If the list is empty, or the last entry is a marker, or the last entry is already on the stack of open elements, stop.
2. Let entry be the last entry in the list.
3. Rewind: If entry is not the first entry and the entry before it is neither a marker nor on the stack of open elements, let entry be that previous entry and repeat step 3.
4. Advance: Create an element for the token for which entry was created, append it to the current node, push it onto the stack of open elements, and replace entry with a new entry for this element.
5. If entry is not the last entry in the list, let entry be the next entry and go to step 4.
This ensures formatting is properly re-applied after being interrupted.
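The reconstruction steps above can be sketched in a few lines of standalone PHP. This is an illustration under simplifying assumptions, not WordPress code: entries are stdClass objects with a tag property, a marker is the string 'marker', cloning an entry stands in for "create an element for the token", and appending to the current node is omitted. The names sketch_reconstruct() and sketch_is_settled() are hypothetical.

```php
<?php
/**
 * Hypothetical sketch, not part of WordPress. $list is the list of active
 * formatting elements and $stack is the stack of open elements; both are
 * updated in place.
 */
function sketch_reconstruct( array &$list, array &$stack ): void {
	$last = count( $list ) - 1;

	// Stop early: empty list, trailing marker, or last entry already open.
	if ( $last < 0 || sketch_is_settled( $list[ $last ], $stack ) ) {
		return;
	}

	// Rewind to the earliest entry that still needs to be reopened.
	$index = $last;
	while ( $index > 0 && ! sketch_is_settled( $list[ $index - 1 ], $stack ) ) {
		$index--;
	}

	// Advance: reopen every entry from there to the end of the list.
	for ( ; $index <= $last; $index++ ) {
		$element        = clone $list[ $index ];
		$stack[]        = $element;
		$list[ $index ] = $element;
	}
}

/** True for a marker or for an entry that is already on the stack. */
function sketch_is_settled( $entry, array $stack ): bool {
	return 'marker' === $entry || in_array( $entry, $stack, true );
}

// Roughly the state at "two" in '<p><b><i>one<p>two': the second <p> closed
// the first one, so B and I left the stack but remain in the list.
$paragraph      = new stdClass();
$paragraph->tag = 'P';
$bold           = new stdClass();
$bold->tag      = 'B';
$italic         = new stdClass();
$italic->tag    = 'I';

$list  = array( $bold, $italic );
$stack = array( $paragraph );
sketch_reconstruct( $list, $stack );

$names = array_map(
	static function ( $element ) {
		return $element->tag;
	},
	$stack
);
echo implode( ' > ', $names ); // P > B > I
```

Calling the function a second time changes nothing, because the last entry in the list is now on the stack and the early exit applies.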