Read out some terms from unstructured information

Question

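I have a list of file names, roughly 1 million rows. Each row contains a value that starts with 58. I want to read out that value, that is, the 58 and all the digits that follow it. It is usually 8 characters long, but it can also be 9.

The file names vary a lot.

I was thinking of a search/find function or something similar, but unfortunately I am not very familiar with Excel.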

010 AHKG100 58098085 70085-DIS-0082.pdf
002 AHMQ32X32 58098524.pdf
AHSG160-58098564(=A-3129)_01.dwg
003 MVTA_78_ 58098861.pdf

These are some variants of the file names.

The result should look like this, in a new column:

Answer 1

Assuming your data starts in cell A1, enter the following array formula in cell B1:

=LARGE(IFERROR(--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),0),1)

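You must press Ctrl+Shift+Enter after finishing the formula in the formula bar, otherwise it will not work correctly. Then you can simply drag the formula down to apply it to the other rows.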

The logic is to use the FIND function to locate every starting position of 58 in the string, then use the MID function to return the 8-character and 9-character substrings from each position. The -- (double unary minus) checks whether each substring is a numeric value; if it is not, IFERROR returns 0 instead. Lastly, LARGE returns the largest numeric value from the resulting array, which should contain only the wanted number and 0 (if any).
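For readers who find the nested formula hard to follow, here is a rough Python sketch of the same idea. It is not part of the Excel solution; the function name `extract_58_number` is made up for illustration, and the numeric check is stricter than Excel's `--` coercion (it accepts only full-width slices made of ASCII digits), so treat it as an approximation of the formula rather than an exact port.

```python
def extract_58_number(name: str) -> int:
    """Largest 8- or 9-digit number that starts at an occurrence of "58", else 0."""
    candidates = [0]  # plays the role of IFERROR(..., 0)
    start = name.find("58")  # like FIND("58", A1, k) for increasing k
    while start != -1:
        for width in (8, 9):  # like the {8,9} array passed to MID
            piece = name[start:start + width]
            # Assumption: only full-width, digits-only slices count as numeric.
            if len(piece) == width and piece.isascii() and piece.isdigit():
                candidates.append(int(piece))
        start = name.find("58", start + 1)  # move on to the next "58"
    return max(candidates)  # like LARGE(..., 1)


if __name__ == "__main__":
    for filename in (
        "010 AHKG100 58098085 70085-DIS-0082.pdf",
        "002 AHMQ32X32 58098524.pdf",
        "AHSG160-58098564(=A-3129)_01.dwg",
        "003 MVTA_78_ 58098861.pdf",
    ):
        print(filename, "->", extract_58_number(filename))
```

Taking the maximum is what makes a 9-digit value win over its own 8-digit prefix, which is the same reason the formula uses LARGE.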

Edit #2

You can also use either of the following two formulas, which do not need to be entered with Ctrl+Shift+Enter:

=AGGREGATE(14,6,--MID(A1,FIND("58",A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1)))),{8,9}),1)

This one is essentially the same as the first formula, but it replaces LARGE+IFERROR with AGGREGATE (function 14 is LARGE, option 6 ignores error values), which can skip the error values in the array.

Or

=AGGREGATE(14,6,--MID(A1,ROW($Z$1:INDEX($Z:$Z,LEN(A1))),{8,9}),1)

This one extracts every 8-character and 9-character substring from the main string directly, then uses -- to check whether each substring is a numeric value, and lastly uses AGGREGATE to return the largest value from the array.
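The sliding-window idea behind this last formula can be sketched in Python as well. Again, this is only an illustration under assumptions: `largest_numeric_window` is a hypothetical name, the digits-only check is stricter than Excel's `--` coercion, and the sketch returns `None` where the formula would return an error.

```python
from typing import Optional, Sequence


def largest_numeric_window(name: str, widths: Sequence[int] = (8, 9)) -> Optional[int]:
    """Largest value among all digits-only substrings of the given widths."""
    best: Optional[int] = None
    for start in range(len(name)):  # like ROW($Z$1:INDEX($Z:$Z, LEN(A1)))
        for width in widths:  # like the {8,9} array passed to MID
            piece = name[start:start + width]
            # Assumption: only full-width, digits-only slices count as numeric.
            if len(piece) == width and piece.isascii() and piece.isdigit():
                value = int(piece)
                if best is None or value > best:
                    best = value
    return best  # None when no window is numeric


if __name__ == "__main__":
    print(largest_numeric_window("002 AHMQ32X32 58098524.pdf"))
    print(largest_numeric_window("99999999 58098085.pdf"))
```

Note that this variant never looks for 58 at all. If a file name contains another, larger 8- or 9-digit number (as in the second call above), that number wins, so this version only fits when the 58 value is the only long number in the name.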

Let me know if you have any questions. Cheers :)


excel

excel-formula

office365