MySQL character encoding change: is data integrity preserved?
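I have to convert a database's encoding from latin-1 to utf-8.
I know that converting the database is done with the command
    ALTER DATABASE db_name
    [[DEFAULT] CHARACTER SET charset_name]
    [[DEFAULT] COLLATE collation_name]
(Source) and that existing tables are converted with the command
    ALTER TABLE tbl_name
    [[DEFAULT] CHARACTER SET charset_name]
    [COLLATE collation_name]
However, the database already exists and contains sensitive information. My question is whether the data I already have will be altered. I am asking because I have to give an estimate before making the change.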
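Every (string-type) column has its own character set and collation metadata.
If no character set/collation was given explicitly when the column's data type was specified (i.e. when it was created or last altered), the table's default character set and collation are used for the column.
If no default character set/collation was given explicitly when the table was specified, the database's default character set and collation are used as the table's defaults.
The commands you quote in your question only change those defaults, for the database and for the table respectively. In other words, they only affect tables and columns created afterwards; they do not affect existing columns (or data).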
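To see which columns are currently stored in which character set, the column metadata can be read from information_schema. This is a minimal sketch, assuming the schema is called db_name as in the question (a placeholder, not a real name):

```sql
-- List every string column with its current character set and collation.
-- 'db_name' is a placeholder for the actual schema name.
SELECT TABLE_NAME,
       COLUMN_NAME,
       COLUMN_TYPE,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'db_name'
  AND CHARACTER_SET_NAME IS NOT NULL   -- non-string columns report NULL here
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

Running the same query before and after an ALTER DATABASE or ALTER TABLE ... DEFAULT CHARACTER SET statement should show identical rows, which is one way to confirm that those statements leave existing columns alone.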
To update existing data, you should first read the Changing the Character Set section of the manual page on ALTER TABLE:
Changing the Character Set
To change the table default character set and all character columns (CHAR, VARCHAR, TEXT) to a new character set, use a statement like this:
    ALTER TABLE tbl_name CONVERT TO CHARACTER SET charset_name;
The statement also changes the collation of all character columns. If you specify no COLLATE clause to indicate which collation to use, the statement uses default collation for the character set. If this collation is inappropriate for the intended table use (for example, if it would change from a case-sensitive collation to a case-insensitive collation), specify a collation explicitly.
For a column that has a data type of VARCHAR or one of the TEXT types, CONVERT TO CHARACTER SET changes the data type as necessary to ensure that the new column is long enough to store as many characters as the original column. For example, a TEXT column has two length bytes, which store the byte-length of values in the column, up to a maximum of 65,535. For a latin1 TEXT column, each character requires a single byte, so the column can store up to 65,535 characters. If the column is converted to utf8, each character might require up to three bytes, for a maximum possible length of 3 × 65,535 = 196,605 bytes. That length does not fit in a TEXT column's length bytes, so MySQL converts the data type to MEDIUMTEXT, which is the smallest string type for which the length bytes can record a value of 196,605. Similarly, a VARCHAR column might be converted to MEDIUMTEXT.
To avoid data type changes of the type just described, do not use CONVERT TO CHARACTER SET. Instead, use MODIFY to change individual columns. For example:
    ALTER TABLE t MODIFY latin1_text_col TEXT CHARACTER SET utf8;
    ALTER TABLE t MODIFY latin1_varchar_col VARCHAR(M) CHARACTER SET utf8;
If you specify CONVERT TO CHARACTER SET binary, the CHAR, VARCHAR, and TEXT columns are converted to their corresponding binary string types (BINARY, VARBINARY, BLOB). This means that the columns no longer will have a character set attribute and a subsequent CONVERT TO operation will not apply to them.
If charset_name is DEFAULT in a CONVERT TO CHARACTER SET operation, the character set named by the character_set_database system variable is used.
Warning
The CONVERT TO operation converts column values between the original and named character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:
    ALTER TABLE t1 CHANGE c1 c1 BLOB;
    ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;
The reason this works is that there is no conversion when you convert to or from BLOB columns.
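Whether this warning applies can usually be judged by looking at the bytes that are actually stored. The query below is only a sketch: it reuses the manual's example names t1 and c1, and it assumes you pick rows that you know contain non-ASCII characters.

```sql
-- t1 / c1 are the example names from the manual excerpt, not real objects.
SELECT c1,
       HEX(c1) AS stored_bytes,
       -- Reinterpret the raw bytes as utf8 without converting them.
       CONVERT(CAST(c1 AS BINARY) USING utf8) AS read_as_utf8
FROM t1
LIMIT 20;
```

For a character such as é, a stored byte of E9 indicates genuine latin1 data, for which CONVERT TO CHARACTER SET is the appropriate route. The byte pair C3A9 indicates UTF-8 bytes sitting in a column labelled latin1, which is the case that calls for the BLOB round trip.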
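To change only the default character set for a table, use this statement:
    ALTER TABLE tbl_name DEFAULT CHARACTER SET charset_name;
The word DEFAULT is optional. The default character set is the character set that is used if you do not specify the character set for columns that you add to a table later (for example, with ALTER TABLE ... ADD column).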
When the foreign_key_checks system variable is enabled, which is the default setting, character set conversion is not permitted on tables that include a character string column used in a foreign key constraint. The workaround is to disable foreign_key_checks before performing the character set conversion. You must perform the conversion on both tables involved in the foreign key constraint before re-enabling foreign_key_checks. If you re-enable foreign_key_checks after converting only one of the tables, an ON DELETE CASCADE or ON UPDATE CASCADE operation could corrupt data in the referencing table due to implicit conversion that occurs during these operations (Bug #45290, Bug #74816).
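The foreign key workaround described above might look like the following sketch. The table names parent and child are hypothetical, standing for two tables linked by a foreign key on a string column, and utf8 is used only because it is the target named in the question.

```sql
-- Hypothetical tables: child has a string column referencing parent.
-- SET without GLOBAL changes the variable for the current session only.
SET foreign_key_checks = 0;

ALTER TABLE parent CONVERT TO CHARACTER SET utf8;
ALTER TABLE child  CONVERT TO CHARACTER SET utf8;

-- Re-enable the checks only after BOTH tables have been converted.
SET foreign_key_checks = 1;
```

Running the statements in a single session keeps the disabled checks scoped to that session. Trying the conversion on a restored copy of the database first is one way to get a basis for the estimate.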