How to remove extra spaces from a specific string?
I have a string that looks like this:
Ireland, UK, United States of America, Belgium, Germany , Some Country, ...
I need help with a Regex or String.Replace function to remove the extra spaces, so that the result looks like this:
Ireland,UK,United States of America,Belgium,Germany,Some Country,
Thanks.
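You can do this by splitting the input on commas, trimming each piece and collapsing runs of spaces into a single space, and then putting the pieces back together with String.Join.
Here is how it can be done with LINQ: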
    ' Requires: Imports System.Linq and Imports System.Text.RegularExpressions
    Console.Write(String.Join(",", _
        "Ireland, UK, United States of America, Belgium, Germany , Some Country," _
        .Split(","c) _
        .Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " ")) _
        .ToArray()))
The key part is Regex.Replace(m.Trim(), "\p{Zs}{2,}", " "), which collapses any run of two or more space characters into a single space.
Result: Ireland,UK,United States of America,Belgium,Germany,Some Country,
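Although the answer written by stribizhev works fine for this case, I would like to take the opportunity to highlight the (negative) performance impact of using regex for simple tasks.
The alternatives below are noticeably faster than the regex (roughly 2x in my tests); regex tends to be slow for cases this simple.
My approach is based on removing the spaces recursively. I wrote two versions: the first uses a regular loop (withoutRegex), the second uses LINQ (withoutRegex2; it is actually identical to stribizhev's answer except for the Regex part).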
    ' Loop version: trim each comma-separated piece and collapse its inner spaces.
    Private Function withoutRegex(input As String) As String
        Dim output As String = ""
        Dim temp() = input.Split(","c)
        For i As Integer = 0 To temp.Length - 1
            output = output & recursiveSpaceRemoval(temp(i).Trim()) & If(i < temp.Length - 1, ",", "")
        Next
        Return output
    End Function

    ' LINQ version: same as the regex answer, with the Regex call swapped out.
    Private Function withoutRegex2(input As String) As String
        Return String.Join(",", _
            input _
            .Split(","c) _
            .Select(Function(x) recursiveSpaceRemoval(x.Trim())) _
            .ToArray())
    End Function

    ' Replaces each pair of spaces with one space until nothing changes.
    Private Function recursiveSpaceRemoval(input As String) As String
        Dim output As String = input.Replace("  ", " ")
        If output = input Then Return output
        Return recursiveSpaceRemoval(output)
    End Function
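The recursive helper above allocates a new string on every pass. For comparison, here is a minimal sketch of a single-pass alternative built on StringBuilder. It assumes only plain space characters need collapsing, the name collapseSpaces is hypothetical and not part of the answer, and it has not been timed against the other versions:

```vbnet
Imports System
Imports System.Text

Module SpaceCollapseSketch
    ' Hypothetical helper (not from the answer): collapses every run of
    ' spaces into one space in a single pass, without recursion.
    Function collapseSpaces(input As String) As String
        Dim sb As New StringBuilder(input.Length)
        Dim previousWasSpace As Boolean = False
        For Each ch As Char In input
            If ch = " "c Then
                ' Emit only the first space of a run.
                If Not previousWasSpace Then sb.Append(ch)
                previousWasSpace = True
            Else
                sb.Append(ch)
                previousWasSpace = False
            End If
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Prints: United States of America
        Console.WriteLine(collapseSpaces("United   States  of America"))
    End Sub
End Module
```

It could be dropped into withoutRegex or withoutRegex2 in place of recursiveSpaceRemoval; whether it is actually faster on inputs this short would have to be measured.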
To prove my point, I created the following test harness:
    Dim input As String = "Ireland, UK, United States of America, Belgium, Germany , Some Country"
    Dim output As String = ""
    Dim count As Integer = 0
    Dim countMax As Integer = 20
    Dim with0 As Long = 0
    Dim without As Long = 0
    Dim without2 As Long = 0
    While count < countMax
        count = count + 1
        Dim sw As Stopwatch = New Stopwatch
        sw.Start()
        output = withRegex(input)
        sw.Stop()
        with0 = with0 + sw.ElapsedTicks
        sw = New Stopwatch
        sw.Start()
        output = withoutRegex(input)
        sw.Stop()
        without = without + sw.ElapsedTicks
        sw = New Stopwatch
        sw.Start()
        output = withoutRegex2(input)
        sw.Stop()
        without2 = without2 + sw.ElapsedTicks
    End While
    MessageBox.Show("With: " & with0.ToString)
    MessageBox.Show("Without: " & without.ToString)
    MessageBox.Show("Without 2: " & without2.ToString)
where withRegex refers to stribizhev's answer, i.e.:
    Private Function withRegex(input As String) As String
        Return String.Join(",", _
            input _
            .Split(","c) _
            .Select(Function(m) Regex.Replace(m.Trim(), "\p{Zs}{2,}", " ")) _
            .ToArray())
    End Function
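This is a simplistic test harness that measures very fast operations, where every tick counts (the point of the 20 loop iterations is precisely to make the measurements more reliable). That said, even changing the order in which the methods are called can affect the results.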
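In any case, the differences between the methods stayed more or less consistent across all my tests. The averages I got after a number of runs (total Stopwatch ticks over the 20 iterations) were:
With: 2500-2700
Without: 1100-1300
Without 2: 900-1200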
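Note: since this amounts to a general criticism of regex performance (at least in cases simple enough to be easily replaced by alternatives like the ones shown here), any suggestions on how to improve regex performance in .NET are more than welcome. But please avoid vague, generic statements and be as specific as possible (for example, by proposing changes to the test harness above).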
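One specific change that could be tried along those lines: with only 20 iterations, one-off costs such as parsing the pattern and JIT-compiling the methods may make up a large share of the regex total. The sketch below reuses a single Regex instance created with RegexOptions.Compiled and makes one untimed warm-up call before measuring. The names withCachedRegex and multiSpace are hypothetical, the input has extra spaces added for illustration, and no timings are claimed for it:

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Linq
Imports System.Text.RegularExpressions

Module RegexWarmUpSketch
    ' Hypothetical variant of withRegex: one shared, precompiled Regex instance.
    Private ReadOnly multiSpace As New Regex("\p{Zs}{2,}", RegexOptions.Compiled)

    Function withCachedRegex(input As String) As String
        Return String.Join(",", _
            input _
            .Split(","c) _
            .Select(Function(m) multiSpace.Replace(m.Trim(), " ")) _
            .ToArray())
    End Function

    Sub Main()
        Dim input As String = "Ireland, UK, United  States of America, Belgium, Germany  , Some Country"

        ' Warm-up call: pays the one-off construction and JIT costs outside the timing.
        Dim output As String = withCachedRegex(input)

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To 20
            output = withCachedRegex(input)
        Next
        sw.Stop()

        Console.WriteLine(output)
        Console.WriteLine("With (cached, warmed up): " & sw.ElapsedTicks.ToString())
    End Sub
End Module
```

For a fair comparison the same warm-up would have to be applied to withoutRegex and withoutRegex2 as well, and a much larger iteration count would reduce the sensitivity to call order mentioned above.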