正则表达式(re2 googlesheets)多行单元格中的多个值

Regex (re2 googlesheets) multiple values in multiline cell

卡在如何通过 arrayformula 从多行单元格中读取和美化这些值。

我使用正则表达式作为前一行可能会有所不同。


请只输入公式,不要自定义代码


第一列看起来像一组这样的东西: ``` [配置] 姓名 = the_name 纹理 = blah.dds 成本 = 1000

[效果0] 值 = 1000 类型 = ATTR_A

[效果1] 值 = 8 类型 = ATTR_B

[特征0] 姓名 = feature_blah

[组件] 0 = comp_one,1

[资源] res_one = 1 res_five = 1 res_four = 1

<br/>
Where to be useful elsewhere, at minimum it needs each [tag] set ([effect\d], [feature\d], ect) to be in one column each, for example the 'effects' column would look like:

ATTR_A:1000,ATTR_B:8


and so on.



Desired output can also be seen in the included spreadsheet


<br/>
<b>Here is the example spreadsheet:</b>

https://docs.google.com/spreadsheets/d/1arMaaT56S_STTvRr2OxCINTyF-VvZ95Pm3mljju8Cxw/edit?usp=sharing


**Current REGEXREPLACE**

Kinda works, finds each 'type' and 'value' great, just cant figure out how to extract just that from the rest, tried capture (and non-capturing) groups before and after but didnt work

=ARRAYFORMULA(REGEXREPLACE($A3:$A,"[\n.][effect\d][\n.](.)\n(.)","1:$1 2:$2"))


**Current SUBSTITUTE + REGEXEXTRACT + REGEXREPLACE**

A different approach entirely, also kinda works, longer form though and left with having to parse the values out of that string, where got stuck again. Idea was to use this to simplify, then regexreplace like above. Getting stuck removing content around the final matches though, and if can do that then above approach is fine too.

//先运行一个替补 =ARRAYFORMULA(替换(替换($A3:$A,char(10),";"),";;",char(10))) // 然后这个的变体(放弃单行 'effect/d' 所以打破它来尝试让它工作) =ARRAYFORMULA(IF(A3:A<>"",IFERROR(REGEXEXTRACT(A3:A,"(?m)^(?:[effect0]);(.)$" )&";;")&""&IFERROR(REGEXEXTRACT(A3:A,"(?m)^(?:[effect1]);(.)$")&" ;;")&""&IFERROR(REGEXEXTRACT(A3:A,"(?m)^(?:[effect2]);(.)$")&";;" ),"")) // 然后像上面那样使用regexreplace =ARRAYFORMULA(REGEXREPLACE($B3:$B,"value = (.);type = (.);;","1:$1 2:$2"))


**--EDIT--**

Also, as my updated 'Desired Output' sheet shows (see timestamped comment below), bonus kudos if you can also extract just the values of matching 'type's to those extra columns (see spreadsheet). 

All good if you cant though, just realized would need that too for lookups.

**--END OF EDIT--**

<br/>
Ive tried dozens of things, discarding each in turn, had a quick look in version history to grab out two promising attempts and shared them in separate sheets.

One of these also used SUBSTITUTE to simplify input column, im happy for a solution using either RAW or the SUBSTITUTE results.


<br/>
**Potentially Useful links:**

https://github.com/google/re2/wiki/Syntax



<br/>

<b>Just some more words:</b>

I also have looked at dozens of Whosebug and google support pages, so tried both REGEXEXTRACT and REGEXREPLACE, both promising but missing that final tweak. And i tried dozens of tweaks already on both.


Any help would be great, and hopefully help others in future since examples with spreadsheets are great since every new REGEX seems to be a new adventure ;) 

<br/>
P.S. if we can think of better title for OP, please say in comment or your answer :)

这是一个有趣的练习。 :-)

首先警告:我添加了一些 "input data"。示例:

[feature1]
name = feature_active_spoiler2

[components]
0 = spoiler,1
1 = spoilerA, 2

所以输出有"extra"个输出。

查看选项卡 ADW's Solution

粘贴到B3:

=ARRAYFORMULA(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
 IF(C3:E<>"", C2:E2&":"&C3:E, )),,999^99))), " ", ", "))

粘贴到 C3:

=ARRAYFORMULA(IFNA(REGEXEXTRACT($A3:$A, "(\d+)\ntype = "&C2)))

粘贴到 D3:

=ARRAYFORMULA(IFNA(REGEXEXTRACT($A3:$A, "(\d+)\ntype = "&D2)))

粘贴到E3:

=ARRAYFORMULA(IFNA(REGEXEXTRACT($A3:$A, "(\d+)\ntype = "&E2)))

粘贴在F3:

=ARRAYFORMULA(IFNA(REGEXEXTRACT(A3:A, "\[feature\d+\]\nname = (.*)")))

粘贴到G3:

=ARRAYFORMULA(IFNA(REGEXEXTRACT(A3:A, "\[components\]\n\d+ = (.*)")))

粘贴到H3:

=ARRAYFORMULA(IFNA(REGEXREPLACE(INDEX(SPLIT(REGEXEXTRACT(
 REGEXREPLACE(A3:A, "\n", ", "), "\[resources\], (.*)"), "["),,1), ", , $", )))

spreadsheet demo