uniq treats lines as equal when they are not

Question

I would expect different output from this command:

$ echo -e "あいうえお\nオエウイア" | uniq -c
      2 あいうえお

The two lines are not the same.
Compare with this example, which works as expected:

$ echo -e "aiueo\noeuia" | uniq -c
      1 aiueo
      1 oeuia

Is this a Unicode or UTF-8 problem? I could not find any option to support "exotic" characters.

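Edit: I ran into a similar problem when using sort with Japanese input. Input of the form a\nb\na\nb\n (or, omitting the '\n', abab) stays that way, whereas I would expect aabb or at least bbaa.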

Answer 1

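Here you go:

$ echo -e "あいうえお\nオエウイア" | uni2ascii -q | uniq -c | ascii2uni

This converts the input to ASCII escapes before uniq sees it, then converts the result back to Unicode.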
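As an alternative sketch, not taken from the answer above: uniq and sort compare lines using the collation rules of the current locale, so one plausible cause is a locale in which these characters carry no distinct collation weight. Assuming that is what happens here, and assuming a plain byte-wise comparison is acceptable, forcing the C locale for the single command avoids the conversion round trip. The function names `uniq_bytes` and `sort_bytes` are hypothetical helpers introduced only for this illustration.

```shell
# Hypothetical helper: run uniq -c in the C locale so that lines are
# compared byte by byte instead of through locale collation.
uniq_bytes() {
    LC_ALL=C uniq -c
}

# Hypothetical helper: the same idea for sort (byte order, not
# dictionary order for the language).
sort_bytes() {
    LC_ALL=C sort
}

# The two lines differ in their bytes, so each is counted once.
printf 'あいうえお\nオエウイア\n' | uniq_bytes

# Input shaped like a, b, a, b from the question's edit is grouped.
printf 'あ\nい\nあ\nい\n' | sort_bytes
```

Byte order in UTF-8 follows code point order, which is not necessarily the order a Japanese reader would expect, so this is a fix for grouping and counting rather than for linguistically correct sorting.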
