Charset and Collation
WordPress uses UTF-8 character encoding with MySQL/MariaDB collations optimized for Unicode.
Default Settings
Since WordPress 4.2 (2015):
// wp-config.php defaults
define( 'DB_CHARSET', 'utf8mb4' );
define( 'DB_COLLATE', '' ); // Empty: WordPress picks the collation
utf8mb4 vs utf8
| Feature | utf8 | utf8mb4 |
|---|---|---|
| Max bytes per character | 3 | 4 |
| Emoji support | ❌ | ✅ |
| Full Unicode | ❌ | ✅ |
| Supplementary characters | ❌ | ✅ |
WordPress needs utf8mb4 to store emoji and other supplementary (4-byte) characters natively.
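To see where the 3-byte limit bites, the sketch below measures the UTF-8 byte length of a few characters in PHP. It assumes the script file is saved as UTF-8; needs_utf8mb4() is a hypothetical helper name, not a WordPress function.

```php
<?php
// strlen() counts bytes, so on a UTF-8 string it reports the encoded size.
$samples = array( 'A', 'é', '€', '🎉' );

foreach ( $samples as $char ) {
    printf( "%s uses %d byte(s)\n", $char, strlen( $char ) );
}

/**
 * Hypothetical helper: true when the text contains a supplementary
 * character (U+10000 and above), which needs 4 bytes in UTF-8.
 */
function needs_utf8mb4( string $text ): bool {
    return 1 === preg_match( '/[\x{10000}-\x{10FFFF}]/u', $text );
}

var_dump( needs_utf8mb4( 'Hello, wörld' ) );  // bool(false)
var_dump( needs_utf8mb4( 'Launch day 🎉' ) ); // bool(true)
```

The four samples take 1, 2, 3, and 4 bytes respectively, so only the last one is out of reach for a 3-byte utf8 column.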
Common Collations
| Collation | Description |
|---|---|
| utf8mb4_unicode_ci | Unicode sorting, case-insensitive |
| utf8mb4_unicode_520_ci | Unicode 5.2, improved sorting |
| utf8mb4_general_ci | Faster but less accurate sorting |
| utf8mb4_bin | Binary comparison (case-sensitive) |
| utf8mb4_0900_ai_ci | MySQL 8.0 default, accent-insensitive |
WordPress default: utf8mb4_unicode_520_ci when the database server supports it, otherwise utf8mb4_unicode_ci
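Collation names follow a pattern: the charset comes first and the trailing suffixes describe the comparison rules. As a rough illustration, the sketch below decodes those suffixes in PHP. describe_collation() is a hypothetical helper written for this example, and it only knows the common suffixes.

```php
<?php
/**
 * Hypothetical helper: split a collation name into its charset and
 * the comparison rules encoded in its suffixes.
 */
function describe_collation( string $collation ): array {
    $parts = explode( '_', $collation );
    $flags = array(
        'ci'  => 'case-insensitive',
        'cs'  => 'case-sensitive',
        'ai'  => 'accent-insensitive',
        'as'  => 'accent-sensitive',
        'bin' => 'binary comparison',
    );

    // Keep only the parts after the charset that are known rule suffixes.
    $rules = array();
    foreach ( array_slice( $parts, 1 ) as $part ) {
        if ( isset( $flags[ $part ] ) ) {
            $rules[] = $flags[ $part ];
        }
    }

    return array(
        'charset' => $parts[0],
        'rules'   => $rules,
    );
}

print_r( describe_collation( 'utf8mb4_0900_ai_ci' ) );
print_r( describe_collation( 'utf8mb4_bin' ) );
```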
Checking Current Settings
Database Level
SELECT @@character_set_database, @@collation_database;
Table Level
SELECT table_name, table_collation
FROM information_schema.tables
WHERE table_schema = 'your_database';
Column Level
SELECT column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'your_database'
AND table_name = 'wp_posts';
Show Table Status
SHOW TABLE STATUS LIKE 'wp_posts';
WordPress Configuration
In wp-config.php
// Full 4-byte UTF-8 (recommended)
define( 'DB_CHARSET', 'utf8mb4' );
// Let WordPress choose collation
define( 'DB_COLLATE', '' );
// Or specify explicitly (use instead of the line above; a constant can only be defined once)
// define( 'DB_COLLATE', 'utf8mb4_unicode_520_ci' );
In $wpdb
global $wpdb;
$wpdb->charset; // 'utf8mb4'
$wpdb->collate; // 'utf8mb4_unicode_520_ci'
Converting Existing Tables
Check for Mixed Encodings
SELECT table_name, table_collation
FROM information_schema.tables
WHERE table_schema = 'your_database'
AND table_collation NOT LIKE 'utf8mb4%';
Convert Table to utf8mb4
ALTER TABLE wp_posts
CONVERT TO CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_520_ci;
Convert All Tables
-- Generate ALTER statements
SELECT CONCAT(
'ALTER TABLE `', table_name, '` ',
'CONVERT TO CHARACTER SET utf8mb4 ',
'COLLATE utf8mb4_unicode_520_ci;'
)
FROM information_schema.tables
WHERE table_schema = 'your_database'
AND table_type = 'BASE TABLE';
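The same statements can also be assembled in application code from a list of table names, for example the output of the mixed-encoding check. The PHP sketch below is one way to do that. build_convert_statements() is a hypothetical helper, and the table names in the usage lines are placeholders.

```php
<?php
/**
 * Hypothetical helper: build one ALTER TABLE statement per table name.
 * Backticks inside a name are doubled so the identifier stays quoted.
 */
function build_convert_statements(
    array $tables,
    string $charset = 'utf8mb4',
    string $collation = 'utf8mb4_unicode_520_ci'
): array {
    $statements = array();
    foreach ( $tables as $table ) {
        $quoted       = '`' . str_replace( '`', '``', $table ) . '`';
        $statements[] = sprintf(
            'ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s;',
            $quoted,
            $charset,
            $collation
        );
    }
    return $statements;
}

// Placeholder table names; substitute the tables that need converting.
foreach ( build_convert_statements( array( 'wp_posts', 'wp_postmeta' ) ) as $sql ) {
    echo $sql, "\n";
}
```

Printing the statements rather than executing them leaves room to review the list and take a backup first.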
Index Length Limits
The 767 Byte Limit
Older MySQL (pre-5.7) limits each InnoDB index key prefix to 767 bytes when tables use the COMPACT or REDUNDANT row format.
With utf8mb4 (4 bytes/char): 767 / 4 = 191 characters max
This is why you see indexes like:
KEY `meta_key` (`meta_key`(191))
Modern MySQL (5.7+)
With innodb_large_prefix=ON and ROW_FORMAT=DYNAMIC:
- Index limit: 3072 bytes
- 768 characters for utf8mb4
WordPress tables use ROW_FORMAT=Dynamic when available.
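The character limits quoted above are integer division of the byte limit by the charset's worst-case bytes per character. A minimal PHP sketch of that arithmetic follows; max_index_chars() is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: how many characters fit in an index prefix,
 * given the byte limit and the charset's worst-case bytes per character.
 */
function max_index_chars( int $limit_bytes, int $bytes_per_char ): int {
    return intdiv( $limit_bytes, $bytes_per_char );
}

echo max_index_chars( 767, 3 ), "\n";  // 255 (utf8, 3 bytes per character)
echo max_index_chars( 767, 4 ), "\n";  // 191 (utf8mb4)
echo max_index_chars( 3072, 4 ), "\n"; // 768 (utf8mb4 with large prefixes)
```

The first line shows why a full VARCHAR(255) index fit under 3-byte utf8 but has to be shortened to 191 characters under utf8mb4.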
Common Issues
Emoji Broken
Cause: Tables still using utf8 (3-byte)
Fix: Convert to utf8mb4
"Specified key was too long"
Cause: Index exceeds byte limit
Fix:
- Use a (191) prefix for varchar indexes
- Enable large prefix support
- Use ROW_FORMAT=DYNAMIC
Collation Mismatch Errors
Cause: Comparing or joining columns (or a column and a string literal) that use different collations
Fix: Ensure all tables use same collation, or cast in query:
SELECT * FROM wp_posts p
JOIN wp_postmeta pm ON p.ID = pm.post_id
WHERE pm.meta_key COLLATE utf8mb4_unicode_520_ci = 'my_key';
Mojibake (Garbled Characters)
Cause: Double-encoding or wrong connection charset
Fix:
// Ensure connection uses utf8mb4
$wpdb->set_charset( $wpdb->dbh, 'utf8mb4' );
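To recognize double-encoding when it appears, it can help to reproduce it on purpose. The PHP sketch below simulates the bug and then reverses it. It requires the mbstring extension, assumes the file is saved as UTF-8, and uses ISO-8859-1 as a stand-in for a latin1 connection. It illustrates the mechanism and is not a repair tool for a live database.

```php
<?php
$original = 'café';

// Simulate the bug: the UTF-8 bytes are misread as ISO-8859-1
// and then encoded to UTF-8 a second time.
$double_encoded = mb_convert_encoding( $original, 'UTF-8', 'ISO-8859-1' );
echo $double_encoded, "\n"; // cafÃ©

// Reverse the extra encoding step to recover the original text.
$repaired = mb_convert_encoding( $double_encoded, 'ISO-8859-1', 'UTF-8' );
echo $repaired, "\n"; // café
```

Sequences such as Ã© in place of é are the typical signature of this problem.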
Best Practices
- Use utf8mb4 for new installations
- Use utf8mb4_unicode_520_ci collation
- Set charset in wp-config.php before install
- Check collation consistency across all tables
- Backup before converting existing databases
- Test emoji after conversion (📝✅🎉)
Checking Emoji Support
-- Insert test emoji
INSERT INTO wp_options (option_name, option_value, autoload)
VALUES ('emoji_test', '🎉📝✅', 'no');
-- Verify it stored correctly
SELECT option_value FROM wp_options WHERE option_name = 'emoji_test';
-- Clean up
DELETE FROM wp_options WHERE option_name = 'emoji_test';
Server Requirements
- MySQL 5.5.3+ or MariaDB 5.5+
- innodb_file_format = Barracuda (for large indexes)
- innodb_large_prefix = ON (MySQL < 8.0)