Skip to content Skip to sidebar

Learn what is the recommended database collation and character set when building multilingual sites and how to check and change them.

What Are Character Sets and Collations in MySQL?

When WordPress stores your site’s content in the database (like posts, titles, translated strings, etc.), it uses a character set to represent the characters and a collation to define how those characters are compared and sorted.

  • Character set (charset) defines how characters are stored in the database — essentially, which bytes represent which characters
  • Collation defines the rules for comparing those characters (e.g., case-sensitivity, accent-sensitivity, and sorting order)

Why It Matters for Multilingual Sites

WPML supports multilingual content, which often includes:

  • Special characters (e.g. ñ, é, ö)
  • Non-Latin scripts (e.g. Arabic, Japanese, Chinese, Hebrew)
  • Emoji and symbols (🎉, ✔️, etc.)

Many of these require more than 3 bytes to be stored correctly. If your database uses an incompatible collation, characters may be lost, replaced with �, or throw database errors (e.g., “Incorrect string value”).

That’s why we strongly recommend using a collation that supports 4-byte Unicode characters.

Our Recommendation

Use the utf8mb4 character set and a Unicode-compatible collation, such as:

  • utf8mb4_unicode_ci – widely compatible, good default
  • utf8mb4_unicode_520_ci – better Unicode handling on MySQL 5.6+
  • utf8mb4_general_ci – slightly faster, less accurate Unicode comparison
  • utf8mb4_bin – case- and accent-sensitive (binary comparison)

These are all safe to use with WPML.

Important: The old utf8 character set in MySQL only supports up to 3 bytes per character. It cannot handle some characters like emoji or some ideograms. Avoid using it for multilingual content.

What Is the Default in MySQL?

MySQL VersionDefault CharsetDefault CollationUnicode-Compatible
< 5.5latin1latin1_swedish_ci❌ No
5.5.xutf8utf8_general_ci❌ No
5.7+utf8mb4utf8mb4_general_ci / utf8mb4_unicode_ci✅ Yes
8.0+utf8mb4utf8mb4_0900_ai_ci✅ Yes

WordPress may not always follow MySQL’s default — it can override it via configuration or migrations.

Setting Charset and Collation for new sites

To ensure WordPress creates new tables with the correct character set and collation, define the following in your wp-config.php:

define( 'DB_CHARSET', 'utf8mb4' ); // If you have utf8 that's fine, WP will automatically map it as utf8mb4
define( 'DB_COLLATE', 'utf8mb4_unicode_ci' );

This will apply to new installations and newly created tables.

Checking Charset and Collation for Existing Sites and Tables

Here’s how to check the character set and collation on your existing sites.

Option 1: phpMyAdmin

  1. Go to your site’s database.
  2. Check the “Collation” column next to each table.
  3. Click a table to view each column’s individual collation.

Option 2: WP-CLI

bash

wp db query "SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE table_schema = 'your_db_name';"

You can also inspect individual columns:

bash

wp db query "SHOW FULL COLUMNS FROM wp_posts;"

If you see that the collation starts with utf8mb4_ then it’s ready to support all types of unicode and 4-byte ideograms. The character set is also fine. 

If you see something else, for example utf8_* then you need to change it.

Please note that collation can be different per-column.

Updating Charset and Collation for Existing Tables and Columns

Updating the DB_CHARSET and DB_COLLATE constants in your wp-config.php file only affects new tables. To apply changes to existing tables and columns, you must convert them manually using SQL.

We also recommend updating the database’s default MySQL collation using the following command:

ALTER DATABASE your_database_name CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Always Start with A Back-Up

Before running any conversion queries:

  • Export your database using phpMyAdmin or mysqldump
  • Store it safely in case something goes wrong

Warning: Collation Updates May Fail Under Strict Mode

When updating the collation of an entire table (or database), MySQL may rebuild tables and revalidate all column defaults — including unrelated ones like DATETIME. If your database is running in strict SQL mode (STRICT_TRANS_TABLES, NO_ZERO_DATE), columns with default values like ‘0000-00-00 00:00:00’ will trigger errors such as:

#1067 - Invalid default value.

What is Strict Mode?

Strict mode controls how MySQL handles invalid or missing values when data is added or updated into the database. A value can be invalid for several reasons. For example, it might have the wrong data type for the column, or it might be out of range.

You can find information about the different types of data validation for Strict mode in the MySQL documentation

How Strict Mode Affects a WordPress Website Database

By default, WordPress uses ‘0000-00-00 00:00:00’ as default for some datetime columns, which is not accepted by the NO_ZERO_DATE strict mode.

Note that these defaults were allowed during the original table creation (e.g., during WordPress setup), but fail upon ALTER operations in strict environments.

How to Check if Strict Mode is Enabled

Run the following command:

SELECT @@SESSION.sql_mode, @@GLOBAL.sql_mode;

Look for:

  • STRICT_TRANS_TABLES
  • STRICT_ALL_TABLES
  • NO_ZERO_DATE

If you find any of these, strict mode is enabled.

To avoid errors during character set and collation updates, you only need to disable the NO_ZERO_DATE strict mode.

How to Proceed Safely

Do one of the following to proceed safely:

1. Avoid altering columns and tables unless the collation actually needs changing.

2. Temporarily disable strict mode for the session only while running conversion queries. You can do this using the following command:

SET SESSION sql_mode = REPLACE(@@sql_mode, 'NO_ZERO_DATE', '');

Note: there are future plans for MySQL to merge strict modes.

In case of other problems particular to your data, you may want to disable any strict mode using the following command:

SET SESSION sql_mode = '';

Queries for Converting Data

Convert the Entire Database

The following generates an ALTER TABLE query for each table in your database:

SELECT CONCAT(
  'ALTER TABLE `', TABLE_NAME, '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;'
) AS query
  FROM INFORMATION_SCHEMA.TABLES
  WHERE TABLE_SCHEMA = 'your_database_name';

Run the resulting queries in phpMyAdmin, or copy-paste into MySQL CLI / WP-CLI.

This ensures all tables in your database adopt the new charset and collation.

(Optional/Accessory) Convert a Specific Table

ALTER TABLE wp_posts 
  CONVERT TO CHARACTER SET utf8mb4 
  COLLATE utf8mb4_unicode_ci;

This updates all text columns in the wp_posts table to use utf8mb4 and the specified collation.

(Optional/Accessory) Convert a Specific Column

ALTER TABLE wp_posts 
  CHANGE post_title post_title TEXT 
  CHARACTER SET utf8mb4 
  COLLATE utf8mb4_unicode_ci;

Note:

  • You must fully re-declare the column (type, name) in CHANGE.
  • Ensure the column type matches the existing one.

Common Errors When Converting

Here are two common errors you might encounter when converting tables and columns.

Mismatched Charset and Collation

-- INVALID: utf8 collation with utf8mb4 charset
ALTER TABLE wp_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8_general_ci;
-- Error: COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'utf8mb4'

Solution:

  • Use a matching collation: utf8mb4_*

Index key length limit

ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

In some configurations, specially for MySQL versions below 5.7, the maximum index key length may be exceeded when adding one more byte per character.

Solution:

  • Shorten indexed VARCHAR(255) to VARCHAR(191) if needed
  • Or upgrade to MySQL 5.7+ and ensure innodb_large_prefix is enabled

Summary

  • Use utf8mb4 character set for full Unicode support
  • Use compatible collations: utf8mb4_unicode_ci, utf8mb4_0900_ai_ci, etc.
  • Update wp-config.php for new tables
  • Run SQL migrations to fix existing tables
  • Never mix character sets and collations — MySQL will reject or break them.