Extract a term from unstructured information

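I have a list of file names, about 1 million rows. Each row contains a value that starts with 58. I want to extract that value: the 58 plus all the digits that follow it. It is usually 8 characters long, but it can also be 9.

The file names vary widely.

I was thinking of using the SEARCH/FIND functions or extracting part of the string, but unfortunately I am not very familiar with Excel.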

010 AHKG100 58098085 70085-DIS-0082.pdf
002 AHMQ32X32 58098524.pdf
AHSG160-58098564(=A-3129)_01.dwg
003 MVTA_78_ 58098861.pdf

These are some variants of the file names.

The result should look like this, in a new column:

58098085
58098524
58098564
58098861

Assuming your data starts in cell A1, enter the following array formula in cell B1:

=LARGE(IFERROR(--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),0),1)

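An array formula must be confirmed by pressing Ctrl+Shift+Enter after you finish typing it in the formula bar, otherwise it will not work correctly. You can then simply drag the formula down to apply it to the other rows.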

The logic is as follows. FIND locates the position where 58 starts in the string, searching from every possible start position. MID then returns the 8-character and the 9-character substring from that position, and -- tries to convert each one to a number. If a substring is not numeric the conversion fails, and IFERROR returns 0 instead. Finally, LARGE returns the largest value in the resulting array, which should contain only the wanted number and zeros (if there are any).
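To make the steps easier to follow, here is a rough Python sketch of the same logic. It is only an illustration, not part of the Excel solution: the names to_number and extract_58 are made up, and to_number is a simplified stand-in for the -- coercion (it accepts plain digits with optional surrounding spaces, whereas Excel accepts more forms).

```python
def to_number(text):
    """Simplified stand-in for Excel's `--` coercion (assumption:
    only plain digits, optionally surrounded by spaces, count)."""
    stripped = text.strip(" ")
    if stripped and all(ch in "0123456789" for ch in stripped):
        return int(stripped)
    return 0  # plays the role of IFERROR(..., 0)


def extract_58(name):
    """Mimic LARGE(IFERROR(--MID(A1, FIND("58", A1, start), {8,9}), 0), 1)."""
    candidates = [0]
    for start in range(len(name)):      # every start position, like ROW(...)
        pos = name.find("58", start)    # FIND("58", A1, start)
        if pos == -1:                   # no further "58": FIND would error
            break
        for length in (8, 9):           # the {8,9} array constant
            candidates.append(to_number(name[pos:pos + length]))  # MID
    return max(candidates)              # LARGE(..., 1)


if __name__ == "__main__":
    names = [
        "010 AHKG100 58098085 70085-DIS-0082.pdf",
        "002 AHMQ32X32 58098524.pdf",
        "AHSG160-58098564(=A-3129)_01.dwg",
        "003 MVTA_78_ 58098861.pdf",
    ]
    for name in names:
        print(extract_58(name))
```

Run on the four sample file names from the question, the sketch prints the four values listed in the expected result.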

Edit #2

You can also use either of the following two formulas, which do not need to be entered with Ctrl+Shift+Enter:

=AGGREGATE(14,6,--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),1)

This one is essentially the same as the first formula, but it replaces LARGE+IFERROR with AGGREGATE (function 14 is LARGE, option 6 ignores error values), so the errors in the array are skipped instead of being turned into 0.

=AGGREGATE(14,6,--MID(A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1))),{8,9}),1)

This one directly extracts every 8-character and 9-character substring from the main string, uses -- to check whether each substring is a numeric value, and finally uses AGGREGATE to return the largest value in the array.
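As a rough illustration of this last variant, the Python sketch below slides over the string and keeps the largest 8- or 9-character window that is numeric. The function name largest_numeric_window is hypothetical, the digit check is a simplification of what -- accepts in Excel, and the sketch returns 0 when nothing matches, where AGGREGATE would return an error.

```python
def largest_numeric_window(name, lengths=(8, 9)):
    """Return the largest numeric 8- or 9-character window in `name`.

    Assumption: a window counts as numeric when it consists only of
    digits, optionally surrounded by spaces (Excel's `--` accepts more).
    """
    best = 0  # assumption: 0 when no window is numeric
    for start in range(len(name)):          # ROW(...) start positions
        for length in lengths:              # the {8,9} array constant
            chunk = name[start:start + length].strip(" ")  # MID
            if chunk and all(ch in "0123456789" for ch in chunk):
                best = max(best, int(chunk))
    return best


if __name__ == "__main__":
    print(largest_numeric_window("002 AHMQ32X32 58098524.pdf"))
    # No anchoring on "58": a larger 8-digit number elsewhere wins.
    print(largest_numeric_window("58098085 99999999.pdf"))
```

Because this variant does not look for 58 at all, it relies on the wanted value being the largest 8- or 9-digit number in the file name. That holds for the sample names in the question, but it is worth checking against your real data.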

Let me know if you have any questions. Cheers :)