Is there a way to tell if a string contains "\u0000" in AppleScript?

I have a string that starts with E\u0000R\u0000R\u0000. In other words, each letter of ERR is followed by \u0000.

\u0000 is the NUL control character (U+0000), which is non-printing.

See: https://www.unicodepedia.com/unicode/basic-latin/0/control-0000/

I would like to do something like this in AppleScript:

if varStr contains "\u0000" then remove "\u0000" from varStr

Is this possible, and if so, how?

You can use AppleScript's text item delimiters to accomplish this.

Here is an example:

set myString to read "/private/tmp/file.txt"

log myString

if myString contains (character id 0) then
    set curTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to {character id 0}
    set myString to text items of myString
    set AppleScript's text item delimiters to {""}
    set myString to myString as text
    set AppleScript's text item delimiters to curTID
end if

log myString

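Notes:

The test file contains only the following:

E\u0000R\u0000R\u0000

The log commands are there only to show some output. After using AppleScript's text item delimiters and coercing the result back to text, the myString variable is literally just ERR.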

You can also use AppleScript's text item delimiters in a handler, like the one in Finding and Replacing Text in a String:

set myString to read "/private/tmp/file.txt"

if myString contains (character id 0) then
    
    set myString to my findAndReplaceInText(myString, character id 0, "")
    
end if



on findAndReplaceInText(theText, theSearchString, theReplacementString)
    set AppleScript's text item delimiters to theSearchString
    set theTextItems to every text item of theText
    set AppleScript's text item delimiters to theReplacementString
    set theText to theTextItems as string
    set AppleScript's text item delimiters to ""
    return theText
end findAndReplaceInText
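
The handler above resets the delimiters to "" instead of restoring whatever the caller had set. As a minimal sketch of a variant that saves and restores them, and that obtains the NUL character through character id 0 (the handler name stripNUL and the sample string are hypothetical, made up for illustration):

```applescript
-- Hypothetical helper: removes every NUL (U+0000) character from theText.
on stripNUL(theText)
    set nul to character id 0
    -- Remember the caller's delimiters so they can be put back afterwards.
    set savedTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to {nul}
    set theItems to text items of theText
    set AppleScript's text item delimiters to {""}
    set cleaned to theItems as text
    set AppleScript's text item delimiters to savedTID
    return cleaned
end stripNUL

-- Sample data: each letter of ERR is followed by a NUL character.
set nul to character id 0
set sample to "E" & nul & "R" & nul & "R" & nul
stripNUL(sample)
```

Restoring the saved delimiters matters mostly when the handler is called from a larger script that relies on its own delimiter setting.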

I solved it using a handler (from developer.apple.com):

on decodeCharacterHexString(theCharacters)
    copy theCharacters to {theIdentifyingCharacter, theMultiplierCharacter, theRemainderCharacter}
    set theHexList to "123456789ABCDEF"
    if theMultiplierCharacter is in "ABCDEF" then
        set theMultiplierAmount to offset of theMultiplierCharacter in theHexList
    else
        set theMultiplierAmount to theMultiplierCharacter as integer
    end if
    if theRemainderCharacter is in "ABCDEF" then
        set theRemainderAmount to offset of theRemainderCharacter in theHexList
    else
        set theRemainderAmount to theRemainderCharacter as integer
    end if
    set theASCIINumber to (theMultiplierAmount * 16) + theRemainderAmount
    return (ASCII character theASCIINumber)
end decodeCharacterHexString

Then I can call it like this:

set u00 to decodeCharacterHexString("%00")
if myStr contains u00 then
    set myStr to replace_chars(myStr, u00, "")
end if


on replace_chars(this_text, search_string, replacement_string)
    set AppleScript's text item delimiters to the search_string
    set the item_list to every text item of this_text
    set AppleScript's text item delimiters to the replacement_string
    set this_text to the item_list as string
    set AppleScript's text item delimiters to ""
    return this_text
end replace_chars

Sorry, as @red_menace pointed out, ASCII character is deprecated. I wasn't aware of that.

The correct way is:

if myStr contains character id 0 then
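
To confirm that the string really contains NUL characters before stripping them, it can help to look at its code points. A small sketch, assuming a sample string built by hand to stand in for the real data (the variable names are illustrative only):

```applescript
set nul to character id 0

-- Sample data standing in for the real string: ERR with a NUL after each letter.
set myStr to "E" & nul & "R" & nul & "R" & nul

-- For text longer than one character, the id property is a list of code points.
set codePoints to id of myStr -- {69, 0, 82, 0, 82, 0} for the sample above

if myStr contains nul then
    log "NUL found; code points: " & (codePoints as text)
end if
```

A zero after every letter in the list is the pattern you would expect if UTF-16LE bytes had been read one byte per character.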

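Assuming the AppleScript string contains misencoded UTF16-LE data (the most likely explanation for all the NUL bytes you're seeing), you can re-encode it with a bit of shuffling through NSData: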

use framework "Foundation"

-- extract the string's raw bytes as-is
set d to (current application's NSString's stringWithString:badString)'s dataUsingEncoding:(current application's NSNEXTSTEPStringEncoding)

-- reencode the raw bytes as UTF16-LE
set goodString to (current application's NSString's alloc()'s initWithData:d encoding:(current application's NSUTF16LittleEndianStringEncoding)) as text
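
If the text comes from a file, as with the /private/tmp/file.txt example earlier in this thread, it may be simpler to decode the bytes as UTF-16LE while reading, so the bad string never exists. A sketch under that assumption (the path and the encoding are guesses about your data, not something confirmed in the question):

```applescript
use AppleScript version "2.4"
use framework "Foundation"
use scripting additions

-- Assumption: this path (borrowed from the earlier example) holds UTF-16LE text.
set thePath to "/private/tmp/file.txt"

-- Decode the file's bytes as UTF-16LE rather than letting read pick an encoding.
set nsText to current application's NSString's stringWithContentsOfFile:thePath encoding:(current application's NSUTF16LittleEndianStringEncoding) |error|:(missing value)

-- The method returns missing value if the file cannot be read or decoded.
if nsText is missing value then
    error "Could not read " & thePath & " as UTF-16LE"
end if

set goodString to nsText as text
```

If the file turns out to use a different encoding, swapping in the matching NSStringEncoding constant would be the place to adjust.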

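The proper solution is to fix the problem at its source, which I assume is some elderly/cross-platform application, since I wouldn't expect a Cocoa-based application to screw up string encodings like that. (Though that will depend on whether its developers still care about crusty old legacy technologies that Apple has already abandoned, so don't blame them if they don't.)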