SQL Parse out multiple substrings
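I have a long, complex string containing line breaks, and I am having trouble parsing it. I need to write a select query that returns one column for each of the fields below.
Ideally, the query would find each new line break and, for each line, work back to the : character. Everything before the colon should be the name of the column, and everything between the : and the new line break should be the data in that field.
All of the data comes back as strings, so I am simply building a select statement with one item per line below. I am not sure this is even possible.
The second option is to hard-code it with something like CHARINDEX ( 'Home Phone:' ,notes, 0): I find the position of the Home Phone string and then pull everything between the : and the new line break that follow it.
In that case, each select item in my query would say: find the string "Home Phone" and pull what comes after the colon, or find the string "School Name", and so on.
This is what the data looks like (all in one string, in a column called notes):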
Home Phone: 1234567890
Cell Phone: 1234567890
Date of Birth: 01/01/1971
School Name: James Jones High School
Address:123 Main Street
School City: Queens
School State: PA
School Zip: 32112
Years Teaching: 12
Grade Levels: Middle School
Total Students: 120
Subject: Music:
How did they hear: Other, provide more info: Former partner teacher in the Middle School
Type: Public/Charter
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaa aaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaa aaaaaaa aaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaa aaaaa aaaaaa aaaaaa aaaaaaaaaaaa aaaaaaaaaaaa aaa aaaa aaaaa aaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaa aaaaaaaaa aaaaaaaaaaaa.
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbb bbbbbbbbb bbbbbbb bbbbbb bbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbb
Question 3: ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccccccccccccccccc cccccccccccccccccc ccccccccccc ccccccccccccc ccccccccccccccccc cccccccc
So the output would look like this (with the answers to all the long questions also in their own fields):
Home Phone Cell Phone Date of Birth: … Type: Question 1 : Question 2: Question 3:
1234567890 1234567890 1/1/1971 Public/Charter aaaaaaaa aaaaaaaaaaaaa. bbb bbbbbbbbbb ccccccccccccccccccccccc
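I am not sure whether this makes sense, but any and all suggestions are greatly appreciated.
Here is my code for pulling out the substring and the new line character, but it is hard-coded. I can't figure out how to do this dynamically.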
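-- Hard-coded extraction of one field (Home Phone); [table] is a placeholder name
SELECT CHARINDEX('Home Phone:', notes) + LEN('Home Phone:') AS [beginning],
CHARINDEX(CHAR(10), notes, CHARINDEX('Home Phone:', notes)) AS [ending],
-- the length is the distance from the end of the label to the next line break
LTRIM(SUBSTRING(notes, CHARINDEX('Home Phone:', notes) + LEN('Home Phone:'),
CHARINDEX(CHAR(10), notes, CHARINDEX('Home Phone:', notes)) - CHARINDEX('Home Phone:', notes) - LEN('Home Phone:'))) AS [home phone]
FROM [table] a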
Thanks!
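I know a lot of people around here don't like this splitter, but I prefer it. It only handles input strings of up to 8,000 characters, and the delimiter can only be a single character. However, it has some nice features that other splitters lack, and unless your input is very large it will meet nearly every need. You can find the code here: http://www.sqlservercentral.com/articles/Tally+Table/72993/ The comments (login required) run to many pages and contain a very lengthy discussion of this splitter.
Other people would reach for PIVOT for this kind of thing. I prefer a cross tab (also known as conditional aggregation) because I find the syntax far less obtuse.
I took the liberty of modifying your sample data slightly. I changed the value of the cell phone so that it differs from the home phone. I also shortened the answers to the questions, because they don't need to be hundreds of characters long to demonstrate the technique.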
declare @SomeValue varchar(8000)
set @SomeValue = 'Home Phone: 1234567890
Cell Phone: 3344556677
Date of Birth: 01/01/1971
School Name: James Jones High School
Address:123 Main Street
School City: Queens
School State: PA
School Zip: 32112
Years Teaching: 12
Grade Levels: Middle School
Total Students: 120
Subject: Music:
How did they hear: Other, provide more info: Former partner teacher in the Middle School
Type: Public/Charter
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa.
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb
Question 3: ccccccccccccccccccccccc cccccccc';
select
MAX(case when s.ItemNumber = 1 then x.Item end) as HomePhone
, MAX(case when s.ItemNumber = 2 then x.Item end) as CellPhone
, MAX(case when s.ItemNumber = 3 then x.Item end) as DOB
, MAX(case when s.ItemNumber = 4 then x.Item end) as SchoolName
, MAX(case when s.ItemNumber = 5 then x.Item end) as SchoolAddress
, MAX(case when s.ItemNumber = 6 then x.Item end) as SchoolCity
, MAX(case when s.ItemNumber = 7 then x.Item end) as SchoolState
, MAX(case when s.ItemNumber = 8 then x.Item end) as SchoolZip
, MAX(case when s.ItemNumber = 9 then x.Item end) as YearsTeaching
, MAX(case when s.ItemNumber = 10 then x.Item end) as GradeLevels
, MAX(case when s.ItemNumber = 11 then x.Item end) as TotalStudents
, MAX(case when s.ItemNumber = 12 then x.Item end) as Subject
, MAX(case when s.ItemNumber = 13 then x.Item end) as HowHeard
, MAX(case when s.ItemNumber = 14 then x.Item end) as SchoolType
, MAX(case when s.ItemNumber = 15 then x.Item end) as Question1
, MAX(case when s.ItemNumber = 16 then x.Item end) as Question2
, MAX(case when s.ItemNumber = 17 then x.Item end) as Question3
from dbo.DelimitedSplit8K(@SomeValue, CHAR(10)) s
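cross apply dbo.DelimitedSplit8K(s.Item, ':') x
where x.ItemNumber = 2 -- keep only the piece after the first colon; without this filter MAX() would also compare the label in front of it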
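One possible variation, offered as a sketch rather than as the method used in the answer above: match each line by its label instead of its position, and cut each line at the first colon only, so that values which themselves contain a colon (such as Subject: Music:) stay intact. It assumes SQL Server 2016 or later, because it uses the built-in STRING_SPLIT in place of DelimitedSplit8K. The variable @Notes and the CTE names Lines and Pairs are made up for illustration.

```sql
-- Sketch only: assumes SQL Server 2016+ (STRING_SPLIT); @Notes, Lines and Pairs are illustrative names.
DECLARE @Notes varchar(8000) =
    'Home Phone: 1234567890' + CHAR(13) + CHAR(10) +
    'Subject: Music:' + CHAR(13) + CHAR(10) +
    'How did they hear: Other, provide more info: Former partner teacher';

WITH Lines AS
(
    -- Strip carriage returns so that CHAR(10) alone separates the lines
    SELECT value AS Line
    FROM STRING_SPLIT(REPLACE(@Notes, CHAR(13), ''), CHAR(10))
),
Pairs AS
(
    -- Cut each line at its FIRST colon only; later colons stay in the value.
    -- NULLIF makes FieldName NULL (instead of raising an error) when a line has no colon.
    SELECT LTRIM(RTRIM(LEFT(Line, NULLIF(CHARINDEX(':', Line), 0) - 1))) AS FieldName,
           LTRIM(RTRIM(SUBSTRING(Line, CHARINDEX(':', Line) + 1, 8000))) AS FieldValue
    FROM Lines
)
-- Conditional aggregation keyed on the label rather than on the line number
SELECT MAX(CASE WHEN FieldName = 'Home Phone'        THEN FieldValue END) AS HomePhone,
       MAX(CASE WHEN FieldName = 'Subject'           THEN FieldValue END) AS Subject,
       MAX(CASE WHEN FieldName = 'How did they hear' THEN FieldValue END) AS HowHeard
FROM Pairs;
```

Because the columns are picked by label, the order of the lines in the notes no longer matters, at the cost of having to spell every label exactly as it appears in the data.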
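Most of the credit (90%) should go to Alex K, who provided an in-depth answer on finding the nth occurrence of a character:
SQL Server - find nth occurrence in a string
I took that answer, adapted it to your question, and then applied PIVOT to break the result out into the required rows/columns. This approach should produce the desired output for as many unique question sets as you need, provided they always follow the same logic (each question/answer separated by a line break).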
--Creates temporary table for testing, ID column and second set of data
--used to ensure query works for each unique set of questions
IF OBJECT_ID('tempdb..#Results') IS NOT NULL
DROP TABLE #Results
CREATE TABLE #Results
(ID INT IDENTITY(1,1) NOT NULL,
Notes NVARCHAR(4000) NOT NULL)
INSERT INTO #Results
(Notes)
VALUES
('Home Phone: 1234567890
Cell Phone: 1234567890
Date of Birth: 01/01/1971
School Name: James Jones High School
Address:123 Main Street
School City: Queens
School State: PA
School Zip: 32112
Years Teaching: 12
Grade Levels: Middle School
Total Students: 120
Subject: Music:
How did they hear: Other, provide more info: Former partner teacher in the Middle School
Type: Public/Charter '),
('Home Phone: test
Cell Phone: test
Date of Birth: test
School Name: test
Address:test
School City: test
School State: test
School Zip: test
Years Teaching: test
Grade Levels: test
Total Students: test
Subject: test
How did they hear: test
Type: test ');
--Recursive CTE to determine the position of each successive line break
--Used CHARINDEX to search CHAR(13) and CHAR(10) and find line breaks and carriage returns
WITH cte
AS
(SELECT ID, Notes, 1 AS Starts, CHARINDEX(CHAR(13)+CHAR(10),Notes) AS Pos
FROM #Results
UNION ALL
SELECT ID, Notes, Pos +1, CHARINDEX(CHAR(13)+CHAR(10),Notes,Pos+1) AS Pos
FROM cte
WHERE
pos >0),
--2nd CTE breaks each question set into its own row
cte2
AS
(SELECT ID, Notes,Starts, Pos,
SUBSTRING(Notes, Starts,
CASE
WHEN pos > 0 THEN (pos - starts)
ELSE LEN(notes)
END) AS Token
FROM cte),
--3rd CTE cleans up the data, separating the Questions/Answers into separate columns
--REPLACE is used to remove Line Break (CHAR(10)), output was then showing a TAB so used
--double REPLACE and removed CHAR(9) (tab)
--LTRIM removes leading space
cte3
AS
(SELECT ID,
LTRIM(REPLACE(REPLACE(SUBSTRING(Token,CHARINDEX(CHAR(13)+CHAR(10),Token),CHARINDEX(':',Token)),CHAR(10),''),CHAR(9),'')) AS Question,
LTRIM(SUBSTRING(Token,CHARINDEX(':',Token)+1,4000)) AS Answer
FROM cte2)
--Pivot separates each Question/Answer row into its own column
SELECT *
FROM
(SELECT ID, Question, Answer
FROM cte3) AS a
PIVOT
(MAX(Answer)
FOR [Question] IN([Address],[Cell Phone],[Date of Birth],[Grade Levels],[Home Phone],[How did they hear],
[School City],[School Name],[School State],[School Zip],[Subject],[Total Students],[Type],[Years Teaching])) AS pvt
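I have commented each section to hopefully explain my logic, but let me know if you have any questions.
Edit: dynamic pivot
It is possible to use dynamic SQL to create a PIVOT that automatically picks up all the "Question" columns and adjusts accordingly. I don't believe it can be done in one step, because I had to use multiple CTEs. What I would do is take the steps above that create CTE, CTE2 and CTE3 (basically everything before the PIVOT query), create a view from those steps, and then use that view as follows (in my example the view is called "Questionaire"):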
DECLARE @columns AS NVARCHAR(MAX)
DECLARE @query AS NVARCHAR(MAX)
SET @columns = STUFF((SELECT DISTINCT ',' + QUOTENAME(q.question)
FROM questionaire AS q
FOR XML PATH(''), TYPE
).value('.','NVARCHAR(MAX)')
,1,1,'')
SET @query = 'SELECT ID, '+ @columns +' FROM
(
SELECT ID, Answer, Question
FROM questionaire
) AS a
PIVOT
(
MAX(Answer)
FOR Question IN(' +@columns+')
) AS p'
EXECUTE(@query)
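The view itself is not shown in the answer. A rough sketch of what it might look like follows. It is a simplified restatement of the CTE steps, not the author's exact code. It assumes the notes live in a permanent table, hypothetically named dbo.Results here, because a view cannot reference a temporary table such as #Results.

```sql
-- Sketch only. dbo.Results is a hypothetical permanent table standing in for
-- #Results, because a view cannot reference a temporary table.
CREATE TABLE dbo.Results
(ID INT IDENTITY(1,1) NOT NULL,
 Notes NVARCHAR(4000) NOT NULL);
GO
-- CREATE VIEW has to be the first statement in its batch, hence the GO separators
-- (GO is understood by client tools such as SSMS and sqlcmd, not by the server).
CREATE VIEW dbo.Questionaire
AS
WITH cte AS
(   -- position of each successive CR+LF line break
    SELECT ID, Notes, 1 AS Starts, CHARINDEX(CHAR(13)+CHAR(10), Notes) AS Pos
    FROM dbo.Results
    UNION ALL
    SELECT ID, Notes, Pos + 2, CHARINDEX(CHAR(13)+CHAR(10), Notes, Pos + 2)
    FROM cte
    WHERE Pos > 0
),
cte2 AS
(   -- one row per line of Notes
    SELECT ID,
           SUBSTRING(Notes, Starts,
                     CASE WHEN Pos > 0 THEN Pos - Starts ELSE LEN(Notes) END) AS Token
    FROM cte
)
-- split each line at its first colon into Question and Answer
SELECT ID,
       LTRIM(RTRIM(LEFT(Token, NULLIF(CHARINDEX(':', Token), 0) - 1))) AS Question,
       LTRIM(SUBSTRING(Token, CHARINDEX(':', Token) + 1, 4000)) AS Answer
FROM cte2
WHERE CHARINDEX(':', Token) > 0;
GO
```

The view exposes the ID, Question and Answer columns that the dynamic PIVOT above selects from. A view cannot carry an OPTION (MAXRECURSION) hint, so notes with more than 100 lines would need the hint on the query that reads from the view.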
You can try it with XML like this, but note that I removed the extra : after Music and after provide more info.
DECLARE @string nvarchar(max) = '
Home Phone: 1234567890
Cell Phone: 1234567890
Date of Birth: 01/01/1971
School Name: James Jones High School
Address:123 Main Street
School City: Queens
School State: PA
School Zip: 32112
Years Teaching: 12
Grade Levels: Middle School
Total Students: 120
Subject: Music
How did they hear: Other, provide more info, Former partner teacher in the Middle School
Type: Public/Charter
Question 1: aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaa aaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaaaaaaaa aaaaaaaa aaaaaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaa aaaaaa aaaaaa aaaaaaa aaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaa aaaaa aaaaaa aaaaaa aaaaaaaaaaaa aaaaaaaaaaaa aaa aaaa aaaaa aaaaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa aaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaaa aaaaaaaaaaa aaaaaaaaa aaaaaaaaaaaa.
Question 2: bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbb bbbbbbbbb bbbbbbb bbbbbb bbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbbbb bbbbbbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbb
Question 3: ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccccccccccccccccc cccccccccccccccccc ccccccccccc ccccccccccccc ccccccccccccccccc cccccccc'
,@xml as xml
SELECT @xml = REPLACE ('<mystring><fieldname id="'+REPLACE(REPLACE(right(@string,LEN(@string)-2),':','" >'),CHAR(10),'</fieldname><fieldname id="')+'</fieldname></mystring>' ,CHAR(13),'')
SELECT
n.v.value('(fieldname[@id="Home Phone"])[1]','NVARCHAR(11)') AS 'Home Phone',
n.v.value('(fieldname[@id="Cell Phone"])[1]','NVARCHAR(11)') AS 'Cell Phone',
n.v.value('(fieldname[@id="Date of Birth"])[1]','NVARCHAR(12)') AS 'Date of Birth',
n.v.value('(fieldname[@id="School Name"])[1]','NVARCHAR(30)') AS 'School Name',
n.v.value('(fieldname[@id="Address"])[1]','NVARCHAR(30)') AS 'Address',
n.v.value('(fieldname[@id="School City"])[1]','NVARCHAR(15)') AS 'School City',
n.v.value('(fieldname[@id="School State"])[1]','NVARCHAR(10)') AS 'School State',
n.v.value('(fieldname[@id="School Zip"])[1]','NVARCHAR(6)') AS 'School Zip',
n.v.value('(fieldname[@id="Years Teaching"])[1]','NVARCHAR(5)') AS 'Years Teaching',
n.v.value('(fieldname[@id="Grade Levels"])[1]','NVARCHAR(15)') AS 'Grade Levels',
n.v.value('(fieldname[@id="Total Students"])[1]','NVARCHAR(5)') AS 'Total Students',
n.v.value('(fieldname[@id="How did they hear"])[1]','NVARCHAR(100)') AS 'How did they hear',
n.v.value('(fieldname[@id="Type"])[1]','NVARCHAR(25)') AS 'Type',
n.v.value('(fieldname[@id="Question 1"])[1]','NVARCHAR(128)') AS 'Question 1',
n.v.value('(fieldname[@id="Question 2"])[1]','NVARCHAR(128)') AS 'Question 2',
n.v.value('(fieldname[@id="Question 3"])[1]','NVARCHAR(128)') AS 'Question 3'
FROM @xml.nodes('mystring') as n(v);
Result:
Home Phone Cell Phone Date of Birth School Name Address School City School State School Zip Years Teaching Grade Levels Total Students How did they hear Type Question 1 Question 2 Question 3
----------- ----------- ------------- ------------------------------ ------------------------------ --------------- ------------ ---------- -------------- --------------- -------------- ---------------------------------------------------------------------------------------------------- ------------------------- -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
1234567890 1234567890 01/01/1971 James Jones High School 123 Main Street Queens PA 32112 12 Middle School 120 Other, provide more info, Former partner teacher in the Middle School Public/Charter aaaaaaaa aaaaaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaaaaaaa aaaaaaaaa aaaaaaa aaaa aaa aaaaaaaa aaaaaa aaaaaaaa aaaaaaaaaaaaaaaaa bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb bbbbbbbbbbbb bbbbbbb bbb bbbbbbbbbb bbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbb ccccccccccccccccccccccc cccccccc ccccccccccc cccccccccccccccccccccc ccc ccccccccc cccccccccccccc ccccccccccccccccccccc cccccccc
(1 row(s) affected)
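The string-to-XML conversion above raises a parse error if the notes contain characters that are special in XML, such as & or <. A minimal sketch of one way to guard against that, assuming it is acceptable to escape those characters before the XML is built. The sample value and the variable @safe are made up for illustration.

```sql
-- Sketch only: the sample data and @safe are illustrative, not part of the answer above.
DECLARE @string nvarchar(max) =
    N'School Name: Smith & Jones <East>' + CHAR(13) + CHAR(10) +
    N'Type: Public/Charter';

-- Escape & first, so the ampersands introduced by the other entities are not escaped again
DECLARE @safe nvarchar(max) =
    REPLACE(REPLACE(REPLACE(REPLACE(@string, '&', '&amp;'), '<', '&lt;'), '>', '&gt;'), '"', '&quot;');

-- Same construction as in the answer: label becomes the id attribute, value becomes the element text
DECLARE @xml xml =
    REPLACE('<mystring><fieldname id="'
          + REPLACE(REPLACE(@safe, ':', '" >'), CHAR(10), '</fieldname><fieldname id="')
          + '</fieldname></mystring>', CHAR(13), '');

-- value() returns the unescaped text; LTRIM drops the space that followed the colon
SELECT LTRIM(n.v.value('(fieldname[@id="School Name"])[1]', 'nvarchar(100)')) AS [School Name],
       LTRIM(n.v.value('(fieldname[@id="Type"])[1]', 'nvarchar(25)')) AS [Type]
FROM @xml.nodes('mystring') AS n(v);
```

This does not help with extra colons inside a value, which is why they were removed from the sample data in the answer.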