Regex over EDI File
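Hello everyone, I have an EDI file that contains quantities, delivery dates, and so on.
Now I want to split it with a regular expression so that I get one line per item with the information I need. The file content is included below. I tried expressions like LIN+.* and LIN+.*?, but then I either get all the LIN segments together in one match, or I get the LIN segments separately but with less information. I want to separate each LIN element together with all the information that follows it.
Can anyone help me?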
UNB+UNOA:2+094200005561400986LA:ZZ+MTEL+200406:1436+34906++++1'UNH+112490+DELFOR:D:96A:UN'BGM+241+2004060008796+9'DTM+137:202004061436:203'DTM+157:20200406:102'DTM+36:20200206:102'NAD+BY+FRSFA0222838V::92'NAD+SE+000563X::92'UNS+D'NAD+CN+VP1::92++TEST+SK TEST:204 TEST:TEST 22:TEST ST TEST+++37540+FRA'LIN+1+3+441344:IN'PIA+1+7PK1150:VN'IMD+++:::VO-VKMV 7PK1150 VP'LOC+11+999'LOC+159+999'RFF+ON:P092303'QTY+113:100.00:PC'SCC+1'DTM+2:20200116:102'RFF+AAJ:P092303:100'QTY+113:100.00:PC'SCC+1'DTM+2:20200206:102'RFF+AAJ:P092304:100'LIN+2+3+502107:IN'PIA+1+3PK670:VN'IMD+++:::VO-VKMV 3PK670 EDC'LOC+11+999'LOC+159+999'RFF+ON:P088273'QTY+113:300.00:PC'SCC+1'DTM+2:20190503:102'RFF+AAJ:P088273:100'LIN+3+3+502109:IN'PIA+1+6PK970:VN'IMD+++:::VO-VKMV 6PK970 EDC'LOC+11+999'LOC+159+999'RFF+ON:P084470'QTY+113:200.00:PC'SCC+1'DTM+2:20190422:102'RFF+AAJ:P084470:100'LIN+4+3+6DK1215:IN'PIA+1+AVRRV50D1-VKMV 6DK1215:VN'IMD+++:::6DK1215'LOC+11+999'LOC+159+999'RFF+ON:P046369'QTY+48:533.00:PC'RFF+AAK:32299'DTM+171:20181109:102'QTY+113:533.00:PC'SCC+1'DTM+2:20190419:102'RFF+AAJ:P046369:100'LIN+5+3+6DK1320:IN'PIA+1+AVRRV50D1-VKMV 6DK1320?+282:VN'IMD+++:::6DK1320'LOC+11+999'LOC+159+999'RFF+ON:P061903'QTY+48:115.00:PC'RFF+AAK:43146'DTM+171:20181003:102'QTY+113:104.00:PC'SCC+1'DTM+2:20181005:102'RFF+AAJ:P061903:100'QTY+113:104.00:PC'SCC+1'DTM+2:20181102:102'RFF+AAJ:P062034:100'UNS+S'UNT+75+112490'UNZ+1+34906'
You can use
LIN(?:(?!LIN).)*
or a more efficient version (the same pattern after unrolling the loop):
LIN[^L]*(?:L(?!IN)[^L]*)*
See regex demo #1 and regex demo #2.
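As a rough illustration, both patterns can be tried in Python with the standard `re` module. The `edi` string below is a shortened, hypothetical sample in the same shape as the message in the question (one long line, segments terminated by `'`), not the full file.

```python
import re

# Shortened sample for illustration only (assumption: the whole interchange
# sits on a single line and every segment ends with an apostrophe).
edi = (
    "UNH+112490+DELFOR:D:96A:UN'"
    "LIN+1+3+441344:IN'PIA+1+7PK1150:VN'QTY+113:100.00:PC'"
    "LIN+2+3+502107:IN'PIA+1+3PK670:VN'QTY+113:300.00:PC'"
    "UNS+S'"
)

tempered = re.compile(r"LIN(?:(?!LIN).)*")
unrolled = re.compile(r"LIN[^L]*(?:L(?!IN)[^L]*)*")

# Neither pattern has a capturing group, so findall returns the full matches.
groups = tempered.findall(edi)
for group in groups:
    print(group)

# Both patterns produce the same list of LIN groups on this input.
print(groups == unrolled.findall(edi))
```

Note that the last match runs to the end of the line, so it also contains the trailing `UNS+S'` segment. If that matters, the trailer would have to be cut off separately.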
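The (?:(?!LIN).)* pattern matches any character (., which by default does not match line breaks) that does not start a LIN character sequence, zero or more times, as many as possible.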
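To see why the lookahead is needed, here is a small sketch comparing the two attempts from the question with the lookahead-based pattern. The sample string is made up for illustration. An unescaped `+` is a quantifier, so `LIN+` means "LI followed by one or more N" rather than a literal plus sign.

```python
import re

# Hypothetical two-item sample, much shorter than the real message.
edi = "BGM+241'LIN+1+3+A:IN'QTY+113:1:PC'LIN+2+3+B:IN'QTY+113:2:PC'"

# Greedy: .* runs to the end of the line, so everything lands in one match.
greedy = re.findall(r"LIN+.*", edi)

# Lazy: .*? is allowed to match nothing, so each match is just "LIN".
lazy = re.findall(r"LIN+.*?", edi)

# Lookahead-guarded dot: stops right before the next "LIN".
tempered = re.findall(r"LIN(?:(?!LIN).)*", edi)

print(len(greedy))  # 1
print(lazy)         # ['LIN', 'LIN']
print(tempered)
```

This matches the behaviour described in the question: one attempt returns all LIN segments together, the other returns them separately but with almost no information.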
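The [^L]*(?:L(?!IN)[^L]*)* pattern matches zero or more characters other than L, followed by zero or more repetitions of an L that is not followed by IN and then zero or more characters other than L.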
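The same unrolled pattern can be written with `re.VERBOSE` so that each part carries a comment. This is only an annotated sketch. The sample string is hypothetical and includes `LOC` segments to show that an `L` which does not start `LIN` stays inside the match.

```python
import re

unrolled = re.compile(
    r"""
    LIN            # literal start of a line-item group
    [^L]*          # a run of characters other than L
    (?:            # then, zero or more times:
        L(?!IN)    #   an L that is not followed by IN
        [^L]*      #   followed by another run of characters other than L
    )*
    """,
    re.VERBOSE,
)

# Hypothetical sample: the LOC segments contain an L that must not end a match.
edi = "LIN+1+3+441344:IN'LOC+11+999'QTY+113:100.00:PC'LIN+2+3+502107:IN'LOC+159+999'"

matches = unrolled.findall(edi)
for m in matches:
    print(m)
```

Because the pattern consumes whole runs of non-`L` characters at once, the lookahead is only evaluated at each `L`, not at every character. That is why this form tends to need fewer steps than the version with the lookahead before every dot.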