Regular expressions to parse addresses

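I'm trying to learn how to use regular expressions to parse location/address strings. Unfortunately, the data I've been given is inconsistent and doesn't follow the way most addresses are written. Below is what I have so far. The problem is that I need to parse the string several times to whittle it down to the correct format.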

Take the following string as an example: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649. The end result I want is 110 Spruce, Greenwood, SC 29649.

Code:

l = nil
location_str = "102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649"
1.upto(4).each do |attempt|
  l = Location.from_string(location_str)
  puts "TRYING: #{location_str}"
  break if !l.nil?
  location_str.gsub!(/^[^,:\-]+\s*/, '')
end

Output:

TRYING: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: , 108 Spruce, 110 Spruce, Greenwood, SC 29649

Expected:

TRYING: 102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: 108 Spruce, 110 Spruce, Greenwood, SC 29649
TRYING: 110 Spruce, Greenwood, SC 29649
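The repeated output comes from the substitution itself: `/^[^,:\-]+\s*/` removes the text up to the first delimiter but leaves the delimiter in place, so every later pass starts with a comma and `[^,:\-]+` has nothing to match at the start. A minimal sketch of a loop that also consumes the delimiter is shown below. `PARSER` is only a stand-in for `Location.from_string`, whose real behavior isn't shown in the question, and `trim_until_parsed` is a hypothetical name.

```ruby
# Stand-in for the asker's Location.from_string (assumption: it returns nil
# unless the string looks like "house, city, ST 12345").
PARSER = ->(str) { str if str.match?(/\A[^,]+,[^,]+,\s*[A-Z]{2} \d{5}\z/) }

# Hypothetical helper: drop leading segments until the parser accepts the rest.
def trim_until_parsed(location_str, max_attempts: 4)
  candidate = location_str.dup # avoid mutating the caller's string
  max_attempts.times do
    parsed = PARSER.call(candidate)
    return parsed if parsed
    # Remove the leading segment, its delimiter, and any following whitespace.
    # sub! returns nil when nothing matched, so stop instead of looping in place.
    break unless candidate.sub!(/\A[^,:\-]+[,:\-]\s*/, '')
  end
  nil
end

puts trim_until_parsed("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
```

Keeping `:` and `-` in the delimiter set mirrors the original pattern. If street names in the data can contain hyphens, restricting the set to commas is probably safer.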

Assuming the format is:

"Stuff you aren't interested in, more stuff, more stuff, etc., house, city, state zip"

Then you can simply anchor to the end of the string with a dollar sign to grab the last 3 parts:

location_str[/[^,]*,[^,]*,[^,]*$/]
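One detail about that expression: the first `[^,]*` also picks up the space that follows the preceding comma, so the match starts with a blank. A small sketch that strips it is below. `last_three_parts` is a hypothetical name, and the sketch uses `\z` rather than `$` because in Ruby `$` also matches before a newline.

```ruby
# Hypothetical helper name; not part of the original answer.
def last_three_parts(location_str)
  # \z anchors at the very end of the string.
  match = location_str[/[^,]*,[^,]*,[^,]*\z/]
  # The first segment keeps the space that followed the preceding comma,
  # so strip it. Returns nil when the string has fewer than two commas.
  match && match.strip
end

puts last_three_parts("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
# prints: 110 Spruce, Greenwood, SC 29649
```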

An attempt without regular expressions:

address = "102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649"
elements = address.split(",").map(&:strip)
# The last two elements are the city and the "state zip" pair.
city, state_and_zip = elements.last(2)
# Everything before them is a candidate street address.
addresses = elements[0...-2]

p [addresses.last, city, state_and_zip].join(",")

Output:

"110 Spruce,Greenwood,SC 29649"

This is one of those things where there's more than one way to do it. Here's another:

def address_from_location_string(location)
  *_, address, city, state_zip = location.split(/\s*,\s*/)
  "#{address}, #{city}, #{state_zip}"
end

address_from_location_string("102 Spruce, 108 Spruce, 110 Spruce, Greenwood, SC 29649")
# => "110 Spruce, Greenwood, SC 29649"