PHP: Split a string at the first period that isn't the decimal point in a price or the last character of the string

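I want to split a string according to the criteria listed in the title. I've tried a few different approaches, including preg_match, but with little success so far. I suspect there is a simpler solution that I just haven't found.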

I have a regex that matches the "price" mentioned in the title (see below).

/(?=.)\£(([1-9][0-9]{0,2}(,[0-9]{3})*)|[0-9]+)?(\.[0-9]{1,2})?/

Below are some example scenarios and the results I want:

Example 1:

input: "This string should not split as the only periods that appear are here £19.99 and also at the end."
output: n/a

Example 2:

input: "This string should split right here. As the period is not part of a price or at the end of the string."
output: "This string should split right here"

Example 3:

input: "There is a price in this string £19.99, but it should only split at this point. As I want it to ignore periods in a price"
output: "There is a price in this string £19.99, but it should only split at this point"

I suggest using

preg_split('~\£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?(*SKIP)(*F)|\.(?!\s*$)~u', $string)

See the regex demo.

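The pattern matches your price pattern, \£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?, and skips it with (*SKIP)(*F); otherwise, it matches a non-final period with \.(?!\s*$) (a period still counts as final when only trailing whitespace follows it).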
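As a rough sketch of how this could be used on the three examples from the question: the helper name splitAtFirstPeriod is hypothetical, and passing a limit of 2 is my own assumption so that only the first qualifying period splits the string.

```php
<?php
// Hypothetical helper: split once at the first period that is neither
// inside a price nor the last non-whitespace character of the string.
function splitAtFirstPeriod(string $string): array
{
    $pattern = '~\£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?(*SKIP)(*F)|\.(?!\s*$)~u';

    // Limit 2 (an assumption): at most one split, so at most two parts.
    $parts = preg_split($pattern, $string, 2);

    // preg_split() returns false on error (e.g. invalid UTF-8 with the u flag).
    return $parts === false ? [$string] : $parts;
}

$parts = splitAtFirstPeriod(
    'There is a price in this string £19.99, but it should only split at this point. As I want it to ignore periods in a price'
);

// One element means "no split"; otherwise the first element is the wanted part.
echo count($parts) > 1 ? $parts[0] : 'n/a', PHP_EOL;
```

When no qualifying period exists (Example 1), the returned array holds the unchanged input as its only element, which can be treated as the "n/a" case.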

If you really only need to split at the first occurrence of a qualifying period, you can use a matching approach instead:

preg_match('~^((?:\£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?|[^.])+)\.(.*)~su', $string, $match)

See the regex demo. Here,

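  • ^ - matches the start of the string
  • ((?:\£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?|[^.])+) - Group 1: one or more occurrences of your currency pattern or of any character other than .
  • \. - a . character
  • (.*) - Group 2: the rest of the string.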
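A minimal sketch of the matching approach follows. The function name firstPart is hypothetical, and the extra check on Group 2 is my own addition: as written, the pattern also matches when the only remaining period is the final one (Example 1), so the sketch treats an empty or whitespace-only remainder as "no split".

```php
<?php
// Hypothetical helper: return the text before the first qualifying period,
// or null when there is nothing to split.
function firstPart(string $string): ?string
{
    $pattern = '~^((?:\£(?:[1-9]\d{0,2}(?:,\d{3})*|[0-9]+)?(?:\.\d{1,2})?|[^.])+)\.(.*)~su';

    if (preg_match($pattern, $string, $match) !== 1) {
        return null; // no period outside a price, or a regex error
    }

    // Assumption: a period followed only by whitespace counts as the
    // final period of the string, so it must not trigger a split.
    if (trim($match[2]) === '') {
        return null;
    }

    return $match[1];
}

$result = firstPart('This string should split right here. As the period is not part of a price or at the end of the string.');
echo $result ?? 'n/a', PHP_EOL;
```

One caveat, as far as I can tell: if the input has no period at all outside a price, the regex engine can backtrack into the price and split at its decimal point. Making the quantifier on Group 1 possessive (++ instead of +) looks like one way to guard against that.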

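You could simply use this regex:

/\. /

Since only a space (and not a price) follows the period that ends the first sentence, this should work as well, right?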
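To illustrate, here is a small sketch that assumes the intended pattern is a period followed by a single space, applied with preg_split and a limit of 2. The function name splitOnPeriodSpace is hypothetical.

```php
<?php
// Hypothetical helper: split once at the first ". " (period plus space).
function splitOnPeriodSpace(string $string): array
{
    // Assumption: the suggested regex is a literal period followed by one space.
    $parts = preg_split('/\. /', $string, 2);

    return $parts === false ? [$string] : $parts;
}

$parts = splitOnPeriodSpace(
    'There is a price in this string £19.99, but it should only split at this point. As I want it to ignore periods in a price'
);

echo count($parts) > 1 ? $parts[0] : 'n/a', PHP_EOL;
```

This handles the three examples from the question, but it relies on the input being well behaved: an abbreviation such as "Jr. " inside a sentence would also trigger a split.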

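To split text into sentences while avoiding the usual traps, such as decimal points or thousands separators in numbers and some abbreviations (e.g. "etc."), the best tool is IntlBreakIterator, which is designed to deal with natural language: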

$str = 'There is a price in this string £19.99, but it should only split at this point. As I want it to ignore periods in a price';

$si = IntlBreakIterator::createSentenceInstance('en-US');
$si->setText($str);
$si->next();

echo substr($str, 0, $si->current());

IntlBreakIterator::createSentenceInstance returns an iterator that gives the offsets of the sentence boundaries in the string.
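If all sentences are needed rather than just the first, the boundaries can be walked in a loop. This is only a sketch: the function name splitSentences is hypothetical, it needs the intl extension, and each returned sentence keeps its trailing whitespace.

```php
<?php
// Hypothetical helper: cut a text into sentences at the boundaries
// reported by IntlBreakIterator (requires the intl extension).
function splitSentences(string $text, string $locale = 'en-US'): array
{
    $iterator = IntlBreakIterator::createSentenceInstance($locale);
    $iterator->setText($text);

    $sentences = [];
    $start = $iterator->first();

    // next() returns the following boundary offset, or DONE at the end.
    for ($end = $iterator->next(); $end !== IntlBreakIterator::DONE; $end = $iterator->next()) {
        $sentences[] = substr($text, $start, $end - $start);
        $start = $end;
    }

    return $sentences;
}

if (extension_loaded('intl')) {
    $str = 'There is a price in this string £19.99, but it should only split at this point. As I want it to ignore periods in a price';

    foreach (splitSentences($str) as $sentence) {
        echo rtrim($sentence), PHP_EOL;
    }
}
```

substr is used here on the assumption, also made in the snippet above, that the offsets are byte offsets into the UTF-8 string.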

It also takes ?, ! and ... into account. Besides the number and price traps, it also works with this kind of string:

$str = 'John Smith, Jr. was running naked through the garden crying "catch me! catch me!", but no one was chasing him. His psychatre looked at him from the window with a circumspect eye.';

More about IntlBreakIterator here.

The rules used.