SQL Server 2008 PATINDEX recursion
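I want to find the most recent instance of an expression, then keep looking for a better match, and finally pick the best one.
The cell I'm looking at is a log that is repeatedly appended to. It contains notes, each followed by a username and a timestamp.
Example cell contents: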
Starting the investigation.
JWAYNE entered the notes above on 08/12/1976 12:01

Taking over the case. Not a lot of progress recently.
CEASTWOOD entered the notes above on 03/14/2001 09:04

No wonder this case is not progressing, the whole town is covering up some shenanigans!
CEASTWOOD entered the notes above on 03/21/2001 05:23

Star command was right, this investigation has been tossed around like a hot potato for a long time!
BLIGHTYEAR entered the notes above on 08/29/2659 08:01
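I'm no expert on database normalization rules, but it's annoying that the entries are crammed into a single cell. It complicates my job, which is to isolate the notes and check them for specific words. It's worse because the cell is repeated on multiple rows until the investigation ends, which puts notes from future phases into the notes column of past events. On top of that, the row timestamps are unreliable, so running PATINDEX on the timestamp doesn't work even with a margin of a few minutes, as shown below: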
CaseID, Username, Notes, Phase, Timestamp
E18902, JWAYNE, Starting....08:01, E1, 03/14/2001 09:13
E18902, CEASTWOOD, Starting....08:01, E2, 03/14/2001 09:13
E18902, CEASTWOOD, Starting....08:01, E3, 03/21/2001 05:34
E18902, BLIGHTYEAR, Starting....08:01, E4, 08/29/2659 07:58
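Right now I reverse the whole string, use PATINDEX to find the username, and then use SUBSTRING to select only the notes for that investigation phase. The problem is that the same user enters notes for multiple phases, so my simple "look for the first match starting at the end of the string and moving to the top" picks the wrong entry. My first idea was to search for the username and then check again to see whether an entry further up is a better match (comparing the note's timestamp with the column timestamp), but I'm not sure how to code that...
Do I have to do some elaborate string splitting, or is there a simpler solution?
Here's my suggestion. It's written as a script, but you can turn it into a user-defined table-valued function if you like.
I'll use the sample data from above.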
declare @sourceText nvarchar(max)
, @workText nvarchar(max)
, @xml xml
set @sourceText = <your example text in your question>
set @workText = @sourceText
-- We're going to replace all the carriage returns and line feeds with
-- characters unlikely to appear in your text. (If they are, use some
-- other character.)
set @workText = REPLACE(@workText, char(10), '|')
set @workText = REPLACE(@workText, char(13), '|')
-- Now, we're going to turn your text into XML. Our first target is
-- the string of four "|" characters that the blank lines between entries
-- will be turned into. (If you've got 3, or 6, or blanks in between,
-- adjust accordingly.)
set @workText = REPLACE(@workText, '||||', '</line></entry><entry><line>')
-- Now we replace every other "|".
set @workText = REPLACE(@workText, '|', '</line><line>')
-- Now we construct the rest of the XML and convert the variable to an
-- actual XML variable.
set @workText = '<entry><line>' + @workText + '</line></entry>'
set @workText = REPLACE(@workText, '<line></line>','') -- Get rid of any empty nodes.
set @xml = CONVERT(xml, @workText)
We should now have an XML fragment that looks like this. (Insert select @xml at this point in the SQL to see it.)
<entry>
<line>Starting the investigation.</line>
<line>JWAYNE entered the notes above on 08/12/1976 12:01</line>
</entry>
<entry>
<line>Taking over the case. Not a lot of progress recently.</line>
<line>CEASTWOOD entered the notes above on 03/14/2001 09:04</line>
</entry>
<entry>
<line>No wonder this case is not progressing, the whole town is covering up some shenanigans!</line>
<line>CEASTWOOD entered the notes above on 03/21/2001 05:23</line>
</entry>
<entry>
<line>Star command was right, this investigation has been tossed around like a hot potato for a long time!</line>
<line>BLIGHTYEAR entered the notes above on 08/29/2659 08:01</line>
</entry>
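One caveat with the conversion above: if a note contains "&", "<" or ">", CONVERT(xml, ...) will reject the string as malformed XML. The following is a minimal sketch of one way around that, escaping those characters before the markup is built. The note text here is made up for illustration and is not from the question.

```sql
declare @workText nvarchar(max)
      , @xml xml

-- Hypothetical note text containing XML-special characters.
set @workText = N'Q&A with the sheriff: 2 < 3 witnesses agree.' + char(13) + char(10)
              + N'JWAYNE entered the notes above on 08/12/1976 12:01'

-- Escape "&" first, so the entities produced by the next two lines
-- are not escaped a second time.
set @workText = REPLACE(@workText, '&', '&amp;')
set @workText = REPLACE(@workText, '<', '&lt;')
set @workText = REPLACE(@workText, '>', '&gt;')

-- Same steps as in the script above.
set @workText = REPLACE(@workText, char(10), '|')
set @workText = REPLACE(@workText, char(13), '|')
set @workText = REPLACE(@workText, '||||', '</line></entry><entry><line>')
set @workText = REPLACE(@workText, '|', '</line><line>')
set @workText = '<entry><line>' + @workText + '</line></entry>'
set @workText = REPLACE(@workText, '<line></line>', '')
set @xml = CONVERT(xml, @workText)

select @xml
```

The escaping has to happen before any tags are added, otherwise the tags themselves would be escaped too. The value() method returns the original characters when the data is read back out.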
We can now transform the XML into a shape we like better:
set @xml = @xml.query(
'for $entry in /entry
return <entry><data>
{
for $line in $entry/line[position() < last()]
return string($line)
}
</data>
<timestamp>{ data($entry/line[last()]) }</timestamp>
</entry>
')
That gives us XML that looks like this (only one entry is shown, to save space):
<entry>
<data>Starting the investigation.</data>
<timestamp>JWAYNE entered the notes above on 08/12/1976 12:01</timestamp>
</entry>
You can turn it back into tabular data with the following query:
select EntryData = R.lines.value('data[1]', 'nvarchar(max)')
, EntryTimestamp = R.lines.value('timestamp[1]', 'nvarchar(MAX)')
from @xml.nodes('/entry') as R(lines)
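...and get one row per entry, with the note text in EntryData and the "entered the notes above on" line in EntryTimestamp.
From there, you can do whatever you need to do.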
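As one possible next step, here is a rough sketch of how the "best match" from the question could be picked once the entries are rows. It assumes every EntryTimestamp value has exactly the form "USERNAME entered the notes above on mm/dd/yyyy hh:mm" with no trailing whitespace. The table variable @entries and the variables @username and @rowTimestamp are names invented for this example; in practice the rows would come from the @xml.nodes query above, and the two variables from the Username and Timestamp columns of the row being checked.

```sql
-- Hypothetical stand-in for the output of the @xml.nodes('/entry') query.
declare @entries table (EntryData nvarchar(max), EntryTimestamp nvarchar(200));

insert into @entries (EntryData, EntryTimestamp) values
  (N'Taking over the case. Not a lot of progress recently.',
   N'CEASTWOOD entered the notes above on 03/14/2001 09:04'),
  (N'No wonder this case is not progressing, the whole town is covering up some shenanigans!',
   N'CEASTWOOD entered the notes above on 03/21/2001 05:23');

-- Values taken from the phase E3 row in the question.
declare @username nvarchar(50), @rowTimestamp datetime;
set @username = N'CEASTWOOD';
set @rowTimestamp = CONVERT(datetime, '03/21/2001 05:34', 101);

-- Split each footer line into user and datetime, then keep the entry by
-- this user whose datetime is closest to the row's timestamp.
select top (1) p.EntryData, p.EnteredBy, p.EnteredAt
from (
    select EntryData
         , EnteredBy = LEFT(EntryTimestamp,
                            CHARINDEX(N' entered the notes above on ', EntryTimestamp) - 1)
         , EnteredAt = CONVERT(datetime, RIGHT(EntryTimestamp, 16), 101) -- style 101 = mm/dd/yyyy
    from @entries
) as p
where p.EnteredBy = @username
order by ABS(DATEDIFF(minute, p.EnteredAt, @rowTimestamp));
```

For the E3 row this selects the 03/21/2001 05:23 entry rather than the 03/14/2001 one, because ordering by the absolute time difference tolerates the few minutes of drift between the note and the row. An entry whose footer does not match the assumed pattern would make LEFT fail, so real data may need a filter on CHARINDEX(...) > 0 first.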