OpenEdge: how to remove HTML tags from a string?

I tried this:

REPLACE(string, "<*>", "").

But it doesn't seem to work.

REPLACE doesn't work that way. It does no wildcard matching; the search string is taken literally.
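To illustrate the point, here is a small sketch (the variable name cText and the sample string are made up for this example). It shows that REPLACE only acts on an exact substring, so "<*>" is searched for as those three characters and nothing else.

```abl
DEFINE VARIABLE cText AS CHARACTER NO-UNDO.

cText = "a <b>bold</b> word".

/* "<*>" is looked up literally. It does not occur in cText,
   so the string comes back unchanged. */
MESSAGE REPLACE(cText, "<*>", "") VIEW-AS ALERT-BOX.

/* An exact substring is replaced as expected: a bold</b> word */
MESSAGE REPLACE(cText, "<b>", "") VIEW-AS ALERT-BOX.
```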

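Below is a simple way to do this. It will fail in many cases, though (malformed HTML and so on). But maybe you can start from here and take it further yourself.

What I do is look for < and > in the text and replace everything between them, brackets included, with pipes (|). You can pick any character, preferably one that does not occur in the text, because every occurrence of it is removed at the end. Once the whole string has been scanned, all pipes are removed.

Again, this is a quick and dirty solution and not safe for production...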

PROCEDURE cleanHtml:
    DEFINE INPUT  PARAMETER pcString  AS CHARACTER   NO-UNDO.
    DEFINE OUTPUT PARAMETER pcCleaned AS CHARACTER   NO-UNDO.

    DEFINE VARIABLE iHtmlTagBegins AS INTEGER     NO-UNDO.
    DEFINE VARIABLE iHtmlTagEnds   AS INTEGER     NO-UNDO.
    DEFINE VARIABLE lHtmlTagActive AS LOGICAL     NO-UNDO.

    DEFINE VARIABLE i AS INTEGER     NO-UNDO.

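    /* Scan the string one character at a time. */
    DO i = 1 TO LENGTH(pcString):
        /* A "<" outside of a tag marks the start of a tag. */
        IF lHtmlTagActive = FALSE AND SUBSTRING(pcString, i, 1) = "<" THEN DO:
            iHtmlTagBegins = i.
            lHtmlTagActive = TRUE.
        END.

        /* A ">" inside a tag marks its end. */
        IF lHtmlTagActive AND SUBSTRING(pcString, i, 1) = ">" THEN DO:
            iHtmlTagEnds = i.
            lHtmlTagActive = FALSE.

            /* Overwrite the whole tag, both brackets included, with the same
               number of pipes. Keeping the length unchanged keeps i in step
               with the string, so a tag that follows directly is not skipped. */
            SUBSTRING(pcString, iHtmlTagBegins, iHtmlTagEnds - iHtmlTagBegins + 1) = FILL("|", iHtmlTagEnds - iHtmlTagBegins + 1).
        END.
    END.

    /* Remove the placeholders (and any pipes that were in the original text). */
    pcCleaned = REPLACE(pcString, "|", "").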

END PROCEDURE.

DEFINE VARIABLE c AS CHARACTER   NO-UNDO.

RUN cleanHtml("This is a <b>text</b> with a <i>little</i> bit of <strong>html</strong> in it!", OUTPUT c).

MESSAGE c VIEW-AS ALERT-BOX.
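
If pipes can occur in your text, a variant without a placeholder character may be worth a look. The following is only a sketch along the same lines as the procedure above, not a robust HTML parser: the function name stripTags and its variables are made up for this example. It copies every character that is outside a tag into a result string. Note one difference in behavior: an unclosed < causes the rest of the text to be dropped.

```abl
FUNCTION stripTags RETURNS CHARACTER (INPUT pcString AS CHARACTER):
    DEFINE VARIABLE cResult AS CHARACTER NO-UNDO.
    DEFINE VARIABLE cChar   AS CHARACTER NO-UNDO.
    DEFINE VARIABLE lInTag  AS LOGICAL   NO-UNDO.
    DEFINE VARIABLE i       AS INTEGER   NO-UNDO.

    DO i = 1 TO LENGTH(pcString):
        cChar = SUBSTRING(pcString, i, 1).

        IF cChar = "<" THEN
            lInTag = TRUE.                /* a tag starts here */
        ELSE IF cChar = ">" AND lInTag THEN
            lInTag = FALSE.               /* the tag ends here */
        ELSE IF NOT lInTag THEN
            cResult = cResult + cChar.    /* keep text outside of tags */
    END.

    RETURN cResult.
END FUNCTION.

/* Shows: This is a text with a little bit of html in it! */
MESSAGE stripTags("This is a <b>text</b> with a <i>little</i> bit of <strong>html</strong> in it!") VIEW-AS ALERT-BOX.
```

Like the procedure above, this treats every < as the start of a tag, so text such as "a < b" or HTML entities like &amp; are not handled.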