Removing all spaces within a specific string (email address) using Ruby
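Users can enter text, but the way I ingest the data often leaves unnecessary carriage returns and spaces in it.
To remove those and make the input look more like a real sentence, I use the following: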
string.delete!("\n")
string = string.squeeze(" ").gsub(/([.?!]) */, '\1 ')
But in the following case I get an unexpected space inside the email address:
string = "Hey what is \n\n\n up joeblow@dude.com \n okay"
I get the following:
"Hey what is up joeblow@dude. com okay"
How can I make an exception for the email part of the string, so that I get the following instead:
"Hey what is up joeblow@dude.com okay"
Edited
Your method does the following:
string.squeeze(" ")       # replaces each sequence of " " with a single space
gsub(/([.?!]) */, '\1 ')  # matches every char listed in the brackets [.?!] together with
                          # the spaces after it; " *" matches one, several, or none at all,
                          # and in every case a space is added after the punctuation.
                          # This is why the email address is split.
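To see the split for yourself, here is a minimal sketch of the question's two steps wrapped in a hypothetical helper called `original_cleanup` (the name is mine, not from the question). It assumes the replacement string is `'\1 '`, i.e. the captured punctuation followed by a space, since that is what produces the output quoted in the question.

```ruby
# Hypothetical helper reproducing the question's cleanup steps.
# Assumption: the replacement is '\1 ' (captured punctuation + one space).
def original_cleanup(text)
  cleaned = text.delete("\n")                    # non-mutating variant of delete!
  cleaned.squeeze(" ").gsub(/([.?!]) */, '\1 ')  # " *" also matches zero spaces
end

puts original_cleanup("Hey what is \n\n\n up joeblow@dude.com \n okay")
# prints: Hey what is up joeblow@dude. com okay
```

Because `" *"` is happy to match nothing, the `.` inside `dude.com` qualifies as a match even though no space follows it.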
I guess what you really want is to add a space after a punctuation mark when there isn't one already. You can do this instead:
string.gsub(/([.?!])\W/, '\1 ') # if there is a non-word char after one of
                                # those punctuation chars, put a space there
Then you only need to replace each run of whitespace characters with a single space. So the final solution is:
string.gsub(/([.?!])(?=\W)/, '\1 ').gsub(/\s+/, ' ')
# ([.?!]) => matches ., ? or ! and captures it
# (?=\W)  => a lookahead: requires the next char to be a non-word char,
#            but does not consume or capture it
# so /([.?!])(?=\W)/ finds the punctuation listed in the brackets when it
# is followed by a non-word char (a space, a newline, or even more
# punctuation, for example)
# '\1 '   => \1 stands for the captured group (the string matched by
#            ([.?!]), a single char in this case), so the replacement
#            adds a space after the matched punctuation
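As a quick check, the lookahead version can be run against the string from the question. This is only a sketch: the helper name `tidy` is hypothetical, and the replacement is assumed to be `'\1 '` (captured punctuation plus a space).

```ruby
# Hypothetical wrapper around the two gsub calls described above.
def tidy(text)
  text
    .gsub(/([.?!])(?=\W)/, '\1 ')  # space after punctuation only if a non-word char follows
    .gsub(/\s+/, ' ')              # collapse every whitespace run to one space
end

puts tidy("Hey what is \n\n\n up joeblow@dude.com \n okay")
# prints: Hey what is up joeblow@dude.com okay

puts tidy("Done.\nNext")
# prints: Done. Next
```

The `.` in `dude.com` is followed by `c`, a word char, so the lookahead fails there and the address is left alone.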
If you can do without the squeeze statement, then Nafaa's answer is the simplest way, but I've listed another approach in case it helps:
string = string.split(" ").join(" ")
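For illustration, here is that split/join idea as a small sketch; `collapse_whitespace` is a hypothetical name. `split(" ")` with a single-space argument splits on any run of whitespace (newlines included) and ignores leading whitespace, so joining with one space normalizes the whole string without touching punctuation.

```ruby
# Hypothetical helper: collapse all whitespace runs by splitting and rejoining.
def collapse_whitespace(text)
  text.split(" ").join(" ")
end

puts collapse_whitespace("Hey what is \n\n\n up joeblow@dude.com \n okay")
# prints: Hey what is up joeblow@dude.com okay
```

Note that this also trims leading and trailing whitespace, which the squeeze approach does not.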
However, if you want to keep that squeeze statement, you can adapt Nafaa's approach and apply it after the squeeze statement:
string.gsub(/\s+/, ' ').gsub('. com', '.com')
Or just fix up the string directly:
string.gsub('. com', '.com')
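Before relying on the literal `'. com'` replacement, it may be worth seeing what it does and does not cover. The sketch below uses a hypothetical helper `repair_dot_com` and made-up sample sentences.

```ruby
# Hypothetical helper applying the literal '. com' fix-up.
def repair_dot_com(text)
  text.gsub('. com', '.com')
end

puts repair_dot_com("Hey what is up joeblow@dude. com okay")
# prints: Hey what is up joeblow@dude.com okay

puts repair_dot_com("Write to jane@example. org today")
# prints: Write to jane@example. org today

puts repair_dot_com("I agree. computers are fun")
# prints: I agree.computers are fun
```

So it repairs `.com` addresses, leaves other domains split, and can also glue together ordinary sentences where the next word happens to start with "com".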