Remove comma character in multi-line expression using sed

Question

I have a text file like this:

CREATE TABLE `table_user` (
  `user_id` int(11) NOT NULL AUTO_INCREMENT,
  `user_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`user_id`),
  UNIQUE KEY `fk_user_idx` (`user_id`,`user_attribute1`),
  KEY `fk_user_attribute1_idx` (`user_attribute1`),
) ENGINE=InnoDB;


CREATE TABLE `table_product` (
  `product_id` int(11) NOT NULL AUTO_INCREMENT,
  `product_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`product_id`),
  UNIQUE KEY `fk_product_idx` (`product_id`,`product_attribute1`),
  KEY `fk_product_attribute1_idx` (`product_attribute1`),


) ENGINE=InnoDB;

CREATE TABLE `table_ads` (
  `ad_id` int(11) NOT NULL AUTO_INCREMENT,
  `ad_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`ad_id`),
  UNIQUE KEY `fk_ad_idx` (`ad_id`,`ad_attribute1`),
  KEY `fk_ad_attribute1_idx` (`ad_attribute1`),





) ENGINE=InnoDB;

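You will notice that before the closing parenthesis of each "CREATE TABLE" statement there is a line ending with a comma, followed by a variable number of newlines.

Using the sed command in Bash, I would like to remove that last comma character to produce valid SQL.

I tried an expression like this: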

sed 's/,[[:space:]]*)//'

But it does not work. I probably need a multi-line search, but I don't know how to do that.

How can I achieve this?

Answer 1

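With GNU sed you can do this using the -z option, which makes sed treat the whole file as a single record so that \n can be matched:

sed -zE 's/,\n*(\n\) ENGINE)/\1/g' file.db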

CREATE TABLE `table_user` (
  `user_id` int(11) NOT NULL AUTO_INCREMENT,
  `user_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`user_id`),
  UNIQUE KEY `fk_user_idx` (`user_id`,`user_attribute1`),
  KEY `fk_user_attribute1_idx` (`user_attribute1`)
) ENGINE=InnoDB;


CREATE TABLE `table_product` (
  `product_id` int(11) NOT NULL AUTO_INCREMENT,
  `product_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`product_id`),
  UNIQUE KEY `fk_product_idx` (`product_id`,`product_attribute1`),
  KEY `fk_product_attribute1_idx` (`product_attribute1`)
) ENGINE=InnoDB;

CREATE TABLE `table_ads` (
  `ad_id` int(11) NOT NULL AUTO_INCREMENT,
  `ad_attribute1` int(11) NOT NULL,
  PRIMARY KEY (`ad_id`),
  UNIQUE KEY `fk_ad_idx` (`ad_id`,`ad_attribute1`),
  KEY `fk_ad_attribute1_idx` (`ad_attribute1`)
) ENGINE=InnoDB;
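
To see why the attempt in the question fails while -z works, here is a small sketch. It assumes GNU sed and uses a made-up three-line sample rather than the real file. By default sed applies the expression to one line at a time, so the comma and the closing parenthesis are never in the pattern space together.

```shell
# Hypothetical sample: trailing comma, one blank line, then the closing parenthesis.
sample=$(printf '  KEY `k` (`a`),\n\n) ENGINE=InnoDB;')

# Default mode: each line is processed separately, so ",<whitespace>)" never matches.
line_mode=$(printf '%s\n' "$sample" | sed 's/,[[:space:]]*)/)/')

# -z (GNU sed only): records are NUL-separated, so the whole text is one record
# and \n can be matched like any other character.
slurp_mode=$(printf '%s\n' "$sample" | sed -zE 's/,\n*(\n\) ENGINE)/\1/g')

printf '%s\n---\n%s\n' "$line_mode" "$slurp_mode"
```

The first result is identical to the input. In the second, the comma and the extra blank line are gone.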

Answer 2

If you cannot use GNU's sed extensions, you can still do it with standard sed, but it is cumbersome. For this I would choose perl:

perl -e '$lines=join("",<>); $lines =~ s/,\s*\)/\n)/g; print $lines;' < sqlfile

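Reading from <> (here equivalent to <STDIN>, since the file is redirected to standard input) returns one line in scalar context, or a list of all lines in list context. We want a single scalar so that the substitution can span multiple lines, so I use join, which takes a list and returns a scalar.

The regular expression finds a , followed by zero or more whitespace characters (including newlines), followed by a ). It then replaces what it found with a newline and a ).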
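
For completeness, the "cumbersome" standard sed route could look roughly like the sketch below. It is not part of the original answer. It relies on the common N loop idiom to collect the whole input in the pattern space, and the sample input is made up for illustration.

```shell
# Hypothetical sample input; substitute the real SQL file in practice.
sample=$(printf 'CREATE TABLE `t` (\n  KEY `k` (`a`),\n\n\n) ENGINE=InnoDB;')

# :a / N / $!ba  keep appending lines until the last one, so the pattern
#                space ends up holding the whole input.
# The substitution drops a comma that is followed only by whitespace
# (including newlines) and a ")" at the start of a line. \1 puts the
# whitespace and ")" back, so the blank lines are preserved.
fixed=$(printf '%s\n' "$sample" |
  sed -e ':a' -e 'N' -e '$!ba' -e 's/,\([[:space:]]*\n)\)/\1/g')

printf '%s\n' "$fixed"
```

Unlike the perl command above, this keeps the blank lines and only removes the comma. It also holds the entire file in memory, which may be a problem for very large dumps.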

Answer 3

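This is the kind of problem where I like to reverse the file. Then we remove the trailing comma from the first non-empty line that follows a line starting with a closing parenthesis.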

tac file.sql | awk '
  NF && p {sub(/,[[:blank:]]*$/, ""); p = 0}
  $1 == ")" {p = 1}
  1
' | tac

[:blank:] is the character class consisting of horizontal whitespace (space, tab).
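
If tac is not available (it is a GNU coreutils tool), the same idea can be sketched in a single forward pass with awk by holding back the most recent non-empty line until the next non-empty line is seen. This is an illustrative variant rather than the original method. The variable names and the sample are made up, and [ \t] is used instead of [[:blank:]] on the assumption that some older awk builds lack POSIX character classes.

```shell
# Hypothetical sample: trailing comma, one blank line, then the closing parenthesis.
sample=$(printf '  KEY `k` (`a`),\n\n) ENGINE=InnoDB;')

fixed=$(printf '%s\n' "$sample" | awk '
  NF {
    if (have) {
      # The held line directly precedes a ")" line: strip its trailing comma.
      if ($1 == ")") sub(/,[ \t]*$/, "", held)
      print held
      printf "%s", blanks      # re-emit the blank lines that followed it
    }
    held = $0; blanks = ""; have = 1
    next
  }
  # Blank line: buffer it if a line is being held, otherwise print it at once.
  { if (have) blanks = blanks $0 "\n"; else print }
  END { if (have) { print held; printf "%s", blanks } }
')

printf '%s\n' "$fixed"
```

Like the tac version, this leaves the blank lines in place and only removes the comma.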

Alternatively, how about a nice compact perl one-liner:

perl -0777 -pe 's/,(?=\s+[)])//g' file.sql

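The -0777 option together with -p slurps the whole file into the default $_ variable and prints it automatically. The lookahead (?=\s+[)]) requires whitespace and a ) after the comma without consuming them, so only the comma is removed.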
