Read and filter column names from a file using Python

I am new to Python, and I need to get the column names from a file.

The file may contain content like the following:

OPTIONS ( SKIP=1)
LOAD DATA
TRAILING NULLCOLS
(
A_TEST                              NULLIF TEST=BLANKS,
B_TEST                          NULLIF TEST=BLANKS,
C_TEST                                  NULLIF TEST=BLANKS,
CREATE_DT       DATE 'YYYYMMDDHH24MISS' NULLIF OPENING_DT=BLANKS,
D_CST CONSTANT 'FNAMELOAD'
)

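I need to get the data after the second opening parenthesis: the first non-empty string on each line, provided the value that follows it is not CONSTANT. So for the file above, my expected output is: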

A_TEST,B_TEST,C_TEST,CREATE_DT.

You can do it like this:

# Read the whole file; the with-block closes it automatically
with open("data.txt", "r") as f:
    data = f.read()

from_index = data.rfind('(')  # index of the last occurrence of the opening parenthesis
data_sel = data[from_index:]  # keep only the chunk of data starting at that index

columns = []
for line in data_sel.split('\n'):  # split by newline
    tokens = line.split()  # split on any whitespace; blank lines give an empty list
    # Skip blank lines, the bare parentheses, and CONSTANT columns; you may have to tweak these conditions
    if tokens and tokens[0] not in ('(', ')') and "CONSTANT" not in tokens:
        columns.append(tokens[0])  # the first token on the line is the column name

print(','.join(columns))  # A_TEST,B_TEST,C_TEST,CREATE_DT
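If you want to follow the "second opening parenthesis" rule from the question literally, rather than relying on the last parenthesis in the file, a sketch along these lines might work. The function name extract_column_names and the inline sample string are my own, made up for illustration. It assumes each column sits on its own line, that the closing parenthesis starts its own line, and that a column is skipped only when its second token is exactly CONSTANT.

```python
def extract_column_names(text):
    """Return the column names listed after the second '(' in the text."""
    first = text.find('(')
    second = text.find('(', first + 1)
    if first == -1 or second == -1:
        return []  # fewer than two opening parentheses: nothing to read

    names = []
    for line in text[second + 1:].splitlines():
        tokens = line.split()  # split on any run of whitespace
        if not tokens:
            continue  # blank line
        if tokens[0].startswith(')'):
            break  # assumed end of the column list
        if len(tokens) > 1 and tokens[1] == 'CONSTANT':
            continue  # the token after the name is CONSTANT, so skip it
        names.append(tokens[0].rstrip(','))
    return names


# Sample input copied from the question
sample = """OPTIONS ( SKIP=1)
LOAD DATA
TRAILING NULLCOLS
(
A_TEST                              NULLIF TEST=BLANKS,
B_TEST                          NULLIF TEST=BLANKS,
C_TEST                                  NULLIF TEST=BLANKS,
CREATE_DT       DATE 'YYYYMMDDHH24MISS' NULLIF OPENING_DT=BLANKS,
D_CST CONSTANT 'FNAMELOAD'
)
"""

print(','.join(extract_column_names(sample)))  # A_TEST,B_TEST,C_TEST,CREATE_DT
```

Checking the second token, instead of searching the whole line for the substring CONSTANT, avoids dropping a column whose own name or NULLIF clause happens to contain that word.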