Dealing with hunks that only change whitespace
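In code that I maintain, I sometimes receive pull requests in which the committer has reflowed a paragraph for no apparent reason. Here is an example: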
diff --git a/knuth.tex b/knuth.tex
index 2f6a2f8..7b0827d 100644
--- a/knuth.tex
+++ b/knuth.tex
@@ -1,6 +1,6 @@
Thus, I came to the conclusion that the designer of a new
system must not only be the implementer and first
-large||scale user; the designer should also write the first
+large-scale user; the designer should also write the first
user manual.
The separation of any of these four components would have
@@ -9,8 +9,7 @@ all these activities, literally hundreds of improvements
would never have been made, because I would never have
thought of them or perceived why they were important.
-But a system cannot be successful if it is too strongly
-influenced by a single person. Once the initial design is
-complete and fairly robust, the real test begins as people
-with many different viewpoints undertake their own
-experiments.
+But a system cannot be successful if it is too strongly influenced by
+a single person. Once the initial design is complete and fairly
+robust, the real test begins as people with many different viewpoints
+undertake their own experiments.
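As you can see, the first hunk introduces an actual change by replacing || with -, whereas the second hunk changes nothing apart from line breaks and whitespace. In fact, the word-diff of the second hunk would be empty.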
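Is it possible to put a policy in place (e.g., on GitHub or in my CI) that rejects commits containing such "empty" hunks, or to reformat a patch so that it omits these hunks entirely, allowing me to git apply it without them?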
Related: How to git-apply a git word diff
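If you're looking for a built-in solution, I'm not aware of one. However, that doesn't mean this can't be built into a CI system relatively easily.
You can pipe the output of an appropriate git diff command into a script like the following, which exits with status 1 if the input contains a hunk like the second one above.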
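#!/usr/bin/env ruby

# Collapse every run of whitespace inside each blank-line-separated paragraph.
def filter(arr)
  arr.join.split("\n\n").map { |x| x.gsub(/\s+/, ' ') }.join("\n\n")
end

# A hunk is rejected when its old and new text differ only in whitespace.
def should_reject(before, after)
  return false if before.empty? && after.empty?

  filter(before) == filter(after)
end

chunk = nil   # header line of the hunk currently being read
before = []   # old side of the hunk: context and removed lines
after = []    # new side of the hunk: context and added lines

while (line = gets)
  trimmed = line[1..-1]   # drop the leading ' ', '+' or '-' marker
  case line
  when /^(\+\+\+|---)/
    # File header lines: do nothing.
  when /^@@ /
    # A new hunk starts, so check the one just finished.
    if should_reject(before, after)
      warn "Useless change to hunk #{chunk}"
      exit 1
    end
    chunk = line
    before = []
    after = []
  when /^ /
    before << trimmed
    after << trimmed
  when /^\+/
    after << trimmed
  when /^-/
    before << trimmed
  end
end

# Check the final hunk, which is not followed by another @@ line.
if should_reject(before, after)
  warn "Useless change to hunk #{chunk}"
  exit 1
end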
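It essentially splits each hunk into blocks separated by a blank line, turns every run of whitespace into a single space, and then compares the old and new versions. If they are equal, it complains and exits nonzero. You may want to make it more robust, for example by handling CRLF line endings, but the approach is viable.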
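To make the normalization step concrete, here is a minimal sketch that applies the same idea to two strings directly. The helper name `normalize_paragraphs` is hypothetical, and the CRLF handling and the trimming of leading and trailing whitespace are additions of mine rather than something the script above does.

```ruby
# Hypothetical helper (not part of the script above): collapses whitespace
# inside each paragraph so that re-wrapped text compares equal.
# Assumption: CRLF line endings should be treated like LF.
def normalize_paragraphs(text)
  unix = text.gsub("\r\n", "\n")
  paragraphs = unix.split(/\n{2,}/) # paragraphs are separated by blank lines
  paragraphs.map { |para| para.gsub(/\s+/, ' ').strip }.join("\n\n")
end

before = "But a system cannot be successful if it is too strongly\n" \
         "influenced by a single person.\n"
after  = "But a system cannot be successful if it is too strongly influenced by\r\n" \
         "a single person.\r\n"

# The two versions differ only in wrapping and line endings.
puts normalize_paragraphs(before) == normalize_paragraphs(after)
```

A real edit such as replacing || with - still produces different normalized strings, so only pure re-wraps compare equal.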
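As a side note, one way to make these changes easier to spot is to use a sentence-per-line style. Each sentence, however long or short, goes on a line of its own, and each line holds exactly one sentence. This makes any kind of change much easier to diff and avoids the wrapping problem entirely.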
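As a rough illustration of that style, the following sketch reflows a wrapped paragraph to one sentence per line. The helper name `one_sentence_per_line` is hypothetical, and the sentence detection is deliberately naive: it assumes a sentence ends at '.', '!' or '?' followed by whitespace, so abbreviations such as "e.g." would be split incorrectly.

```ruby
# Hypothetical helper: reflows a paragraph to one sentence per line.
# Naive assumption: a sentence ends at '.', '!' or '?' followed by a space.
def one_sentence_per_line(paragraph)
  flat = paragraph.gsub(/\s+/, ' ').strip
  # Split on the space that follows sentence-ending punctuation,
  # keeping the punctuation with its sentence.
  flat.split(/(?<=[.!?]) /).join("\n")
end

text = "But a system cannot be successful if it is too strongly\n" \
       "influenced by a single person. Once the initial design is\n" \
       "complete and fairly robust, the real test begins.\n"

puts one_sentence_per_line(text)
```

With text stored this way, re-wrapping has nothing to act on, and a diff touches only the sentences that actually changed.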