MySQL character encoding change. Is data integrity preserved?

I have to convert my database encoding from latin-1 to utf-8.

I know that converting the database is done with the command

ALTER DATABASE db_name
    [[DEFAULT] CHARACTER SET charset_name]
    [[DEFAULT] COLLATE collation_name]

(Source), and that converting an existing table is done with the command

ALTER TABLE tbl_name
    [[DEFAULT] CHARACTER SET charset_name]
    [COLLATE collation_name]

(Source).

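However, the database already exists and contains sensitive information. My question is whether the data I already have will be altered. I am asking because I have to give an estimate before making the change.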

Every (string-type) column has its own character set and collation metadata.

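If no character set/collation was explicitly given when the column's data type was specified (i.e. when the column was last created or altered), then the table's default character set and collation are used for that column.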

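If no default character set/collation was explicitly given when the table was specified, then the database's default character set and collation are used as the table's defaults.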

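The commands you quote in your question only change those default character sets/collations, for the database and the table respectively. In other words, they only affect tables and columns created afterwards - they will not affect existing columns (or data).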
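Before changing anything, it can help to see which character set each level currently declares. The read-only queries below are a minimal sketch against information_schema; 'db_name' is a placeholder for your own schema name.

```sql
-- Database-level default character set and collation.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'db_name';

-- Table-level default collation (the collation name implies the character set).
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'db_name';

-- Per-column character set and collation; non-string columns report NULL here.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'db_name'
  AND CHARACTER_SET_NAME IS NOT NULL;
```

Running the third query before and after an ALTER DATABASE or ALTER TABLE ... DEFAULT CHARACTER SET statement should show the existing columns unchanged.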

To update existing data, you should first read the Changing the Character Set section of the manual page on ALTER TABLE:

Changing the Character Set

To change the table default character set and all character columns (CHAR, VARCHAR, TEXT) to a new character set, use a statement like this:

ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;

The statement also changes the collation of all character columns. If you specify no COLLATE clause to indicate which collation to use, the statement uses default collation for the character set. If this collation is inappropriate for the intended table use (for example, if it would change from a case-sensitive collation to a case-insensitive collation), specify a collation explicitly.

For a column that has a data type of VARCHAR or one of the TEXT types, CONVERT TO CHARACTER SET changes the data type as necessary to ensure that the new column is long enough to store as many characters as the original column. For example, a TEXT column has two length bytes, which store the byte-length of values in the column, up to a maximum of 65,535. For a latin1 TEXT column, each character requires a single byte, so the column can store up to 65,535 characters. If the column is converted to utf8, each character might require up to three bytes, for a maximum possible length of 3 × 65,535 = 196,605 bytes. That length does not fit in a TEXT column's length bytes, so MySQL converts the data type to MEDIUMTEXT, which is the smallest string type for which the length bytes can record a value of 196,605. Similarly, a VARCHAR column might be converted to MEDIUMTEXT.

To avoid data type changes of the type just described, do not use CONVERT TO CHARACTER SET. Instead, use MODIFY to change individual columns. For example:

ALTER TABLE t MODIFY latin1_text_col TEXT CHARACTER SET utf8;
ALTER TABLE t MODIFY latin1_varchar_col VARCHAR(M) CHARACTER SET utf8;
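(Not part of the manual excerpt.) To see the difference between the two approaches without touching real data, one option is to try both on scratch tables. This is only a sketch: t_demo_convert, t_demo_modify and their columns are made-up names, and the resulting column types are something to inspect rather than assume.

```sql
-- Two identical scratch tables declared as latin1 (hypothetical names).
CREATE TABLE t_demo_convert (body TEXT, title VARCHAR(100)) CHARACTER SET latin1;
CREATE TABLE t_demo_modify  (body TEXT, title VARCHAR(100)) CHARACTER SET latin1;

-- Approach 1: convert the table default and every character column at once.
-- MySQL may widen column types here (for example TEXT to MEDIUMTEXT).
ALTER TABLE t_demo_convert CONVERT TO CHARACTER SET utf8;

-- Approach 2: change columns one by one and keep the declared types.
-- The table default stays latin1 unless it is changed separately.
ALTER TABLE t_demo_modify MODIFY body  TEXT         CHARACTER SET utf8;
ALTER TABLE t_demo_modify MODIFY title VARCHAR(100) CHARACTER SET utf8;

-- Compare the resulting definitions.
SHOW CREATE TABLE t_demo_convert;
SHOW CREATE TABLE t_demo_modify;

-- Clean up.
DROP TABLE t_demo_convert;
DROP TABLE t_demo_modify;
```

With the MODIFY approach the column keeps its byte limit, so a value that needs more bytes in utf8 than the type can hold may be rejected or truncated, depending on the server's sql_mode.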

If you specify CONVERT TO CHARACTER SET binary, the CHAR, VARCHAR, and TEXT columns are converted to their corresponding binary string types (BINARY, VARBINARY, BLOB). This means that the columns no longer will have a character set attribute and a subsequent CONVERT TO operation will not apply to them.

If charset_name is DEFAULT in a CONVERT TO CHARACTER SET operation, the character set named by the character_set_database system variable is used.

Warning

The CONVERT TO operation converts column values between the original and named character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:

ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;

The reason this works is that there is no conversion when you convert to or from BLOB columns.
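(Not part of the manual excerpt.) Since the question is about estimating the impact first, a read-only preview can help decide which of the two cases applies to a column. The sketch below reuses the t1 and c1 names from the excerpt and assumes c1 is declared as latin1; how the result columns are displayed also depends on the client connection's character set.

```sql
-- Look only at rows that contain non-ASCII characters: for a latin1 column,
-- those are the rows whose byte length grows when converted to utf8.
SELECT c1,
       HEX(c1)                                AS stored_bytes,
       -- What CONVERT TO CHARACTER SET would produce: a real latin1 -> utf8 conversion.
       CONVERT(c1 USING utf8)                 AS converted_from_latin1,
       -- What the BLOB round trip would produce: the same bytes relabelled as utf8.
       -- Byte sequences that are not valid utf8 may come back as NULL or truncated.
       CONVERT(CAST(c1 AS BINARY) USING utf8) AS bytes_reinterpreted_as_utf8
FROM t1
WHERE LENGTH(c1) <> LENGTH(CONVERT(c1 USING utf8))
LIMIT 20;
```

If the converted_from_latin1 column reads correctly, the data really is latin1 and CONVERT TO is the appropriate operation. If only bytes_reinterpreted_as_utf8 reads correctly, the column holds utf8 bytes under a latin1 label and the BLOB route from the warning applies.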

To change only the default character set for a table, use this statement:

ALTER TABLE tbl_name DEFAULT CHARACTER SET charset_name;

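The word DEFAULT is optional. The default character set is the character set that is used if you do not specify the character set for columns that you add to a table later (for example, with ALTER TABLE ... ADD column).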

When the foreign_key_checks system variable is enabled, which is the default setting, character set conversion is not permitted on tables that include a character string column used in a foreign key constraint. The workaround is to disable foreign_key_checks before performing the character set conversion. You must perform the conversion on both tables involved in the foreign key constraint before re-enabling foreign_key_checks. If you re-enable foreign_key_checks after converting only one of the tables, an ON DELETE CASCADE or ON UPDATE CASCADE operation could corrupt data in the referencing table due to implicit conversion that occurs during these operations (Bug #45290, Bug #74816).
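(Not part of the manual excerpt.) The foreign key workaround described above could look roughly like this. The table names parent and child are hypothetical stand-ins for the referenced and referencing tables; the setting is changed for the current session only.

```sql
-- Disable foreign key checks for this session.
SET foreign_key_checks = 0;

-- Convert BOTH tables involved in the constraint before re-enabling the checks.
ALTER TABLE parent CONVERT TO CHARACTER SET utf8;
ALTER TABLE child  CONVERT TO CHARACTER SET utf8;

-- Re-enable the checks only once both sides use the same character set.
SET foreign_key_checks = 1;
```

Taking a backup first and rehearsing the sequence on a copy of the database is a sensible precaution, given that the data is sensitive.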