How can I use a regular expression in Common Lisp to get everything in a string up to the last occurrence of "/"?

Suppose I have this string:

"http://www.gnu.org/software/emacs/manual/html_node/emacs/index.html"

I would like a regular expression such that:

CL-USER> (some-regex "http://www.gnu.org/software/emacs/manual/html_node/emacs/index.html")

would return:

"http://www.gnu.org/software/emacs/manual/html_node/emacs/"

And if I apply the same function again to the previous output:

CL-USER> (some-regex "http://www.gnu.org/software/emacs/manual/html_node/emacs/")

it would again return everything up to the last "/":

"http://www.gnu.org/software/emacs/manual/html_node/"

Preferably using cl-ppcre.

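Your second example does not return everything up to the last slash, but everything up to the second-to-last slash. I assume you would rather not include the trailing slash, which makes the behavior more regular. In simple cases the regular expression is then (.*)/.*. However, the result gets surprising once no path is left: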

CL-USER> (defun shorten-uri-string (s)
           (aref (nth-value 1 (cl-ppcre:scan-to-strings "(.*)/.*" s)) 0))
SHORTEN-URI-STRING
CL-USER> (shorten-uri-string
          "http://www.gnu.org/software/emacs/manual/html_node/emacs/index.html")
"http://www.gnu.org/software/emacs/manual/html_node/emacs"
CL-USER> (shorten-uri-string *)
"http://www.gnu.org/software/emacs/manual/html_node"
CL-USER> (shorten-uri-string *)
"http://www.gnu.org/software/emacs/manual"
CL-USER> (shorten-uri-string *)
"http://www.gnu.org/software/emacs"
CL-USER> (shorten-uri-string *)
"http://www.gnu.org/software"
CL-USER> (shorten-uri-string *)
"http://www.gnu.org"
CL-USER> (shorten-uri-string *)
"http:/"
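
If the trailing slash should be kept, as in the question's expected output, one possible sketch is to delete the last path segment together with an optional trailing slash. The function name drop-last-segment is hypothetical, and loading cl-ppcre through Quicklisp is an assumption about the setup:

```lisp
;; Assumes Quicklisp is installed; load cl-ppcre however your setup does it.
(ql:quickload "cl-ppcre")

;; DROP-LAST-SEGMENT is a hypothetical helper name, not part of cl-ppcre.
(defun drop-last-segment (s)
  "Remove the last run of non-slash characters in S, plus one optional
trailing slash. Returns S unchanged (as a string) when nothing matches."
  ;; REGEX-REPLACE returns two values; VALUES keeps only the new string.
  (values (cl-ppcre:regex-replace "[^/]+/?$" s "")))
```

Like the regex above, this treats the URI as a plain string and knows nothing about scheme or host: applied to "http://www.gnu.org/" it returns "http://".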

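I suggest parsing the URI and treating it as a data structure rather than a string. The parser also knows which characters are allowed and disallowed in each part of the URI.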

For example, parsing with puri:

CL-USER> (defun shorten-uri-path (uri)
           (let* ((puri (puri:parse-uri uri))
                  (new-puri (puri:copy-uri puri)))
             (when (puri:uri-parsed-path puri)
               (setf (puri:uri-parsed-path new-puri)
                     (butlast (puri:uri-parsed-path puri))))
             new-puri))
SHORTEN-URI-PATH
CL-USER> (shorten-uri-path
          "http://www.gnu.org/software/emacs/manual/html_node/emacs/index.html")
#<PURI:URI http://www.gnu.org/software/emacs/manual/html_node/emacs>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/software/emacs/manual/html_node>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/software/emacs/manual>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/software/emacs>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/software>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/>
CL-USER> (shorten-uri-path *)
#<PURI:URI http://www.gnu.org/>

You can use puri:render-uri to render the URI to a stream. You could also handle the query and fragment explicitly.
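
To get a string back rather than a URI object, a minimal sketch is to render into a string stream. The name uri-to-string is hypothetical, and loading puri through Quicklisp is an assumption about the setup:

```lisp
;; Assumes Quicklisp is installed; load puri however your setup does it.
(ql:quickload "puri")

;; URI-TO-STRING is a hypothetical helper name, not part of puri.
(defun uri-to-string (uri)
  "Render URI, a PURI:URI object, into a fresh string."
  (with-output-to-string (out)
    ;; PURI:RENDER-URI writes the textual form of URI to the stream OUT.
    (puri:render-uri uri out)))
```

It can be combined with the function above, for example (uri-to-string (shorten-uri-path "http://www.gnu.org/software/emacs/")).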