Is there a way to tell if a string contains "\u0000" in AppleScript?
I have a string that starts with E\u0000R\u0000R\u0000. In other words, each letter of ERR is followed by \u0000.
\u0000 is the NUL character (U+0000).
See: https://www.unicodepedia.com/unicode/basic-latin/0/control-0000/
I'd like to do something like this in AppleScript:
if varStr contains "\u0000" then remove "\u0000" from varStr
Is this possible, and if so, how?
You can use AppleScript's text item delimiters to accomplish the task.
Here is an example:
set myString to read "/private/tmp/file.txt"
log myString
-- AppleScript string literals have no \u escape, so build the NUL character from its code point
set nul to character id 0
if myString contains nul then
    set curTID to AppleScript's text item delimiters
    -- split on NUL, then rejoin the pieces with nothing between them
    set AppleScript's text item delimiters to {nul}
    set myString to text items of myString
    set AppleScript's text item delimiters to {""}
    set myString to myString as text
    set AppleScript's text item delimiters to curTID
end if
log myString
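Notes:
The test file contains only the following:
E\u0000R\u0000R\u0000
The log command is there only to show some output; after the coercion using AppleScript's text item delimiters, the myString variable is literally just ERR.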
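To make the split-and-join step easier to follow, here is a minimal sketch of the same technique with a visible delimiter standing in for NUL. The sample string "E-R-R-" and the variable names are made up for illustration only.

```applescript
-- Illustration only: "-" plays the role of the NUL character
set savedTIDs to AppleScript's text item delimiters

-- Splitting: the delimiter decides where the text is cut
set AppleScript's text item delimiters to "-"
set theParts to text items of "E-R-R-" -- {"E", "R", "R", ""}

-- Joining: the delimiter is placed between the items during coercion
set AppleScript's text item delimiters to ""
set joinedText to theParts as text -- "ERR"

-- Put the delimiters back so later code is not affected
set AppleScript's text item delimiters to savedTIDs
joinedText
```

Saving and restoring the previous delimiters, as the example above does with curTID, keeps the change from leaking into the rest of the script.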
You can also use AppleScript's text item delimiters in a handler, like the one in Finding and Replacing Text in a String:
set myString to read "/private/tmp/file.txt"
-- AppleScript string literals have no \u escape, so build the NUL character from its code point
set nul to character id 0
if myString contains nul then
    set myString to my findAndReplaceInText(myString, nul, "")
end if

on findAndReplaceInText(theText, theSearchString, theReplacementString)
    -- split on the search string, then rejoin with the replacement
    set AppleScript's text item delimiters to theSearchString
    set theTextItems to every text item of theText
    set AppleScript's text item delimiters to theReplacementString
    set theText to theTextItems as string
    set AppleScript's text item delimiters to ""
    return theText
end findAndReplaceInText
I solved it using a handler (from developer.apple.com):
on decodeCharacterHexString(theCharacters)
    copy theCharacters to {theIdentifyingCharacter, theMultiplierCharacter, theRemainderCharacter}
    set theHexList to "123456789ABCDEF"
    if theMultiplierCharacter is in "ABCDEF" then
        set theMultiplierAmount to offset of theMultiplierCharacter in theHexList
    else
        set theMultiplierAmount to theMultiplierCharacter as integer
    end if
    if theRemainderCharacter is in "ABCDEF" then
        set theRemainderAmount to offset of theRemainderCharacter in theHexList
    else
        set theRemainderAmount to theRemainderCharacter as integer
    end if
    set theASCIINumber to (theMultiplierAmount * 16) + theRemainderAmount
    return (ASCII character theASCIINumber)
end decodeCharacterHexString
Then I can call it:
set u00 to decodeCharacterHexString("%00")
if myStr contains u00 then
    set myStr to replace_chars(myStr, u00, "")
end if

on replace_chars(this_text, search_string, replacement_string)
    set AppleScript's text item delimiters to the search_string
    set the item_list to every text item of this_text
    set AppleScript's text item delimiters to the replacement_string
    set this_text to the item_list as string
    set AppleScript's text item delimiters to ""
    return this_text
end replace_chars
Sorry, as @red_menace pointed out, ASCII character is deprecated. That was my ignorance.
The correct way is:
if myStr contains character id 0 then
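For completeness, the line above could be fleshed out roughly as follows. This is only a sketch: the sample value of myStr is built by hand to mimic the string from the question, and the names nul, theParts, and savedTIDs are chosen here for illustration.

```applescript
-- Hypothetical sample input: "ERR" with a NUL after each letter
set nul to character id 0
set myStr to "E" & nul & "R" & nul & "R" & nul

if myStr contains nul then
    set savedTIDs to AppleScript's text item delimiters
    -- split on NUL, then rejoin the pieces with an empty delimiter
    set AppleScript's text item delimiters to nul
    set theParts to text items of myStr
    set AppleScript's text item delimiters to ""
    set myStr to theParts as text
    -- restore whatever delimiters were in effect before
    set AppleScript's text item delimiters to savedTIDs
end if

myStr
```

Using character id 0 directly removes the need for the decodeCharacterHexString handler and the deprecated ASCII character command.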
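Assuming the AppleScript string contains mis-encoded UTF16-LE data (which is the most likely explanation for all the NUL bytes you are seeing), you can re-encode it with a bit of juggling via NSData: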
use framework "Foundation"
-- extract the string's raw bytes as-is
set d to (current application's NSString's stringWithString:badString)'s dataUsingEncoding:(current application's NSNEXTSTEPStringEncoding)
-- reencode the raw bytes as UTF16-LE
set goodString to (current application's NSString's alloc()'s initWithData:d encoding:(current application's NSUTF16LittleEndianStringEncoding)) as text
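The correct solution is to fix the problem at its source, which I am guessing is some elderly/cross-platform app, as I would not expect a Cocoa-based app to screw up string encodings like that. (Though that will depend on whether its developers still care about crusty old legacy technologies that Apple has abandoned, so don't blame them if they don't.)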