Extract a column based on its contents (CSV, Python)

Question

I have a CSV file that looks like this:

h1,h2,h3
1 year,homo sapiens,fibrous tissue
3 minutes,homo sapiens,fibrous tissue
2 hours,homo sapiens,epithelial tissue

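I am trying to get the column that contains a string I provide. For example, if I say year, the whole column should be appended to a list, e.g. [1 year, 3 minutes, 2 hours]. I have no idea how to proceed. Any help is much appreciated.

Edit: the catch is that the data can be in any column.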

Answer 1

We can use a list comprehension together with any and str.contains:

In [183]:
# filter the columns for only those that contain our text of interest
cols_of_interest = [col for col in df if any(df[col].str.contains('year'))]
cols_of_interest
Out[183]:
['h1']
In [184]:
# use the list as a column filter
df[cols_of_interest]
Out[184]:
          h1
0     1 year
1  3 minutes
2    2 hours

So this tests whether any value in a column contains the text of interest, by calling the vectorised str method contains.
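The snippet above assumes a DataFrame named df already exists. A minimal sketch of the whole flow, starting from the CSV text in the question, might look like the following. The inline csv_text string stands in for the real file, and the choices of dtype=str, regex=False and na=False are assumptions made here so that the .str accessor works on every column and the search text is matched literally.

```python
import io
import pandas as pd

# Sample data copied from the question; in practice pass a file path to read_csv.
csv_text = """h1,h2,h3
1 year,homo sapiens,fibrous tissue
3 minutes,homo sapiens,fibrous tissue
2 hours,homo sapiens,epithelial tissue
"""

# dtype=str keeps every column as text, so .str.contains works on all of them.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# regex=False matches the text literally; na=False treats missing cells as "no match".
cols_of_interest = [
    col for col in df.columns
    if df[col].str.contains("year", regex=False, na=False).any()
]

# The question asks for a plain list, so convert the first matching column.
column_values = df[cols_of_interest[0]].tolist()
print(column_values)
```

If no column matches, cols_of_interest is empty and the indexing on the last lines would raise an IndexError, so real code should check for that case first.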

It would be easy to wrap the list comprehension in a function that returns the list:

In [185]:

def cols_contains(text):
    return [col for col in df if any(df[col].str.contains(text))]

df[cols_contains('year')]
Out[185]:
          h1
0     1 year
1  3 minutes
2    2 hours
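The function above reads df from the enclosing scope and returns column names. A variant that takes the DataFrame as an argument and returns the values as plain lists, which is what the question asks for, could be sketched as follows. The name columns_containing and the dict return shape are illustrative choices, not part of the original answer.

```python
import pandas as pd

def columns_containing(df, text):
    """Return {column name: list of values} for every column in which
    at least one cell contains `text` as a literal substring."""
    result = {}
    for col in df.columns:
        # astype(str) lets numeric columns be searched too
        # (note that missing values become the string 'nan').
        as_text = df[col].astype(str)
        if as_text.str.contains(text, regex=False).any():
            result[col] = df[col].tolist()
    return result

# Small frame mirroring the CSV in the question.
df = pd.DataFrame({
    "h1": ["1 year", "3 minutes", "2 hours"],
    "h2": ["homo sapiens"] * 3,
    "h3": ["fibrous tissue", "fibrous tissue", "epithelial tissue"],
})

matches = columns_containing(df, "year")
print(matches)
```

Returning a dict keeps the column name alongside its values, which helps when more than one column matches.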

Answer 2

Try this:

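# read every line of the file into a list, dropping the trailing newline
with open('your_file.csv', 'r') as f:
    x = [line.rstrip('\n') for line in f]

# first column
for line in x:
    print(line.split(',')[0])

Output:

h1
1 year
3 minutes
2 hours

# second column
for line in x:
    print(line.split(',')[1])

Output:

h2
homo sapiens
homo sapiens
homo sapiens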
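Printing a column by a fixed index does not cover the case in the question's edit, where the matching data can be in any column. A minimal sketch using only the standard csv module is shown below. The function name column_containing is hypothetical, and the sketch assumes the first row is a header and that every row has the same number of fields.

```python
import csv
import io

def column_containing(lines, text):
    """Return the values of the first column in which any cell contains
    `text`, or an empty list if no column matches."""
    rows = list(csv.reader(lines))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    for idx in range(len(header)):
        column = [row[idx] for row in data]
        if any(text in cell for cell in column):
            return column
    return []

# Sample data from the question; with a real file, pass the open file object instead.
sample = io.StringIO(
    "h1,h2,h3\n"
    "1 year,homo sapiens,fibrous tissue\n"
    "3 minutes,homo sapiens,fibrous tissue\n"
    "2 hours,homo sapiens,epithelial tissue\n"
)

result = column_containing(sample, "year")
print(result)
```

Using csv.reader rather than split(',') also handles quoted fields that contain commas.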
