How to split emojis that contain flags without the flag breaking into 2 characters in Google Sheets

This is my initial string:

I used a not very elegant way to split the emojis apart.

=if(len(I88) = 4, REGEXEXTRACT(I88,"(.+?)\s*(.+?)"),if(len(I88) = 6, REGEXEXTRACT(I88,"(.+?)\s*(.+?)\s*(.+?)"),if(len(I88) = 8, REGEXEXTRACT(I88,"(.+?)\s*(.+?)\s*(.+?)\s*(.+?)"),if(len(I88) = 10, REGEXEXTRACT(I88,"(.+?)\s*(.+?)\s*(.+?)\s*(.+?)\s*(.+?)"), REGEXEXTRACT(I88,"\s*(.+?)" )))))

The result is 4 columns instead of 3. This is what it looks like:

  |  |   |     

I left the pipes in to mark each separate column.

What I want is this:

 |  |  

Short answer

To split the three emojis correctly we need to use a custom function. Fortunately, there are JavaScript libraries that can be used for this, like the one shared in the answer by Orlin Giorgiev to Get grapheme character count in javascript strings?

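Explanation

The OP's formula returns four elements instead of three because the Google Sheets built-in functions see four "characters" (they are actually code points), each of which needs more than 4 hexadecimal digits to represent. Code points like these are called "astral code points". A flag emoji is made of two of them (a pair of regional indicator symbols), so the formula splits the flag into two columns.

From https://mathiasbynens.be/notes/javascript-unicode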

Astral code points are pretty easy to recognize: if you need more than 4 hexadecimal digits to represent the code point, it’s an astral code point.


Internally, JavaScript [as well as Google Sheets built-in functions] represents astral symbols as surrogate pairs, and it exposes the separate surrogate halves as separate “characters”. If you represent the symbols using nothing but ECMAScript 5-compatible escape sequences, you’ll see that two escapes are needed for each astral symbol. This is confusing, because humans generally think in terms of Unicode symbols or graphemes instead.
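To make the difference between code units and code points concrete, here is a small sketch in plain JavaScript. The sample string is an assumption (a smiley, a flag built from the regional indicators U+1F1E9 U+1F1EA, and another smiley), not the OP's actual data, and it is written with escapes so it does not depend on how emojis render.

```javascript
// Assumed sample: smiley + flag + smiley, i.e. three visible emojis.
const smile = "\u{1F600}";
const flag = "\u{1F1E9}\u{1F1EA}"; // two regional indicator code points
const text = smile + flag + smile;

// .length counts UTF-16 code units: every astral code point takes two.
const codeUnits = text.length; // 2 + 4 + 2

// Spreading a string iterates by code point, so the flag yields two entries.
const codePoints = [...text];

console.log(codeUnits);         // 8
console.log(codePoints.length); // 4, one more than the 3 visible emojis
console.log(
  codePoints.map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase())
);
```

Eight code units and four code points for three visible emojis matches what the OP saw: a length of 8 and a four-column result.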

Custom function

/**
 * Splits a string into graphemes so that a flag emoji stays in one piece.
 * @customfunction
 */
function SPLITGRAPHEMES(string) {
  var splitter = new GraphemeSplitter();
  return splitter.splitGraphemes(string);
}

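Note: Don't forget to include the referenced JavaScript library.

Syntax

Assuming A1 contains the string with the three emojis, use the following formula to split them into a 1 x 3 array: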
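If flags are the only concern, the idea can also be sketched without a library. The helper below is hypothetical (the name `splitKeepingFlags` is not from any library) and much cruder than a real grapheme splitter: it only glues pairs of regional indicators back together and does not handle ZWJ sequences, skin-tone modifiers or variation selectors. It also assumes a runtime that supports ES2015 syntax.

```javascript
// Hypothetical helper, not part of GraphemeSplitter: split by code point,
// then join each pair of consecutive regional indicators into one flag.
function splitKeepingFlags(text) {
  // Regional indicator symbols occupy U+1F1E6..U+1F1FF.
  const isRegionalIndicator = (ch) => {
    const cp = ch.codePointAt(0);
    return cp >= 0x1F1E6 && cp <= 0x1F1FF;
  };
  const codePoints = Array.from(text); // iterates by code point, not code unit
  const result = [];
  for (let i = 0; i < codePoints.length; i++) {
    const current = codePoints[i];
    const next = codePoints[i + 1];
    if (isRegionalIndicator(current) && next !== undefined && isRegionalIndicator(next)) {
      result.push(current + next); // the two halves form one flag
      i++; // skip the second half
    } else {
      result.push(current);
    }
  }
  return result;
}

// Assumed sample: smiley, flag (U+1F1E9 U+1F1EA), rocket.
const parts = splitKeepingFlags("\u{1F600}\u{1F1E9}\u{1F1EA}\u{1F680}");
console.log(parts.length); // 3
```

For anything beyond flags, the library-based SPLITGRAPHEMES function remains the safer choice.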

=TRANSPOSE(SPLITGRAPHEMES(A1))

Note: The emojis in this Q&A look different on Windows than on ChromeOS, which is why the paragraph above uses an image.