Import a .txt file row-wise but only import given columns

Question

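I want to import a .txt file containing n rows and 25 columns. Because there are 10,000,000 rows, I want to import them row by row, keep only the first 7 of the 25 columns, and then append each new 7-column row to a new list. This is what I have tried so far, which did not work: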

results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        results.append(row[:,[0,1,2,3,4,5,6,7]])
print(results)

Error:

TypeError: list indices must be integers or slices, not tuple

The link to the data is below, but the dataset is 2 GB: http://download.geonames.org/export/dump/allCountries.zip

Thanks for your help!

Answer 1

You are trying to slice, but you are using the wrong syntax.

From the docs:

While indexing is used to obtain individual characters, slicing allows you to obtain substring: [example omitted] Slice indices have useful defaults; an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced.
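
The same slicing rules apply to lists. As a small illustration, using a made-up row (the values below are hypothetical and not taken from the dataset), a plain slice works, while the NumPy-style index from the question raises the reported error:

```python
# Hypothetical row with 9 fields, standing in for one parsed line
row = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']

# Slice: the omitted first index defaults to 0, so this keeps items 0..6
first_seven = row[:7]
print(first_seven)

try:
    # "row[:, [...]]" hands a tuple (slice, list) to the list, which it rejects
    row[:, [0, 1, 2]]
except TypeError as exc:
    print(exc)  # list indices must be integers or slices, not tuple
```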

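Although I did not download your 2 GB dataset to check, and you did not provide a sample row, the error suggests that each row is a list in which each item represents one column. If so, try: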

import csv

results = []
with open('allCountries.txt', newline='') as inputfile:
    for row in csv.reader(inputfile):
        # keep only the first 7 columns of each row
        results.append(row[:7])
print(results)
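
If the file turns out to be tab-delimited rather than comma-separated (this answer did not verify that), csv.reader needs a delimiter argument. With 10,000,000 rows it may also help to process rows lazily instead of building one large list. A minimal sketch under those assumptions follows; the function name iter_first_columns and the sample data are made up for illustration:

```python
import csv
import os
import tempfile

def iter_first_columns(path, n_cols=7, delimiter='\t'):
    """Yield the first n_cols fields of each row, one row at a time."""
    with open(path, newline='', encoding='utf-8') as inputfile:
        # Assumption: fields are not quoted, so quote characters are plain data
        reader = csv.reader(inputfile, delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row[:n_cols]

# Tiny made-up sample (2 rows, 9 tab-separated fields) standing in for the real file
sample = "1\ta\tb\tc\td\te\tf\tg\th\n2\ti\tj\tk\tl\tm\tn\to\tp\n"
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8', newline='') as tmp:
    tmp.write(sample)

rows = list(iter_first_columns(tmp.name))
os.remove(tmp.name)
print(rows)
```

For the real file, looping over iter_first_columns('allCountries.txt') and handling each row inside the loop avoids holding every row in memory at once.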

Tags: python, import, pandas, rowwise, txt