When writing to a csv file, why is each letter in a column?

Question

The code I am using:

import urllib2
import csv
from bs4 import BeautifulSoup

url = "http://en.wikipedia.org/wiki/List_of_ongoing_armed_conflicts"
soup = BeautifulSoup(urllib2.urlopen(url))

fl = open('locations.csv', 'w')

def unique(countries):
    seen = set()
    for country in countries:
        l = country.lower()
        if l in seen:
            continue
        seen.add(l)
        yield country


locs = []
for row in soup.select('table.wikitable tr'):
    cells = row.find_all('td')
    if cells:
        for location in cells[3].find_all(text=True):
            locs.extend(location.split())

locs2 = []            
for locations in unique(locs):
    locations = locs2.extend(locations.split())
print sorted(locs2)

writer = csv.writer(fl)
writer.writerow(['location'])
for values in sorted(locs2):
    writer.writerow(values)

fl.close()

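When I print the values I'm writing, I get a u' in front of each element, which I think is why the output looks like this. I tried using .strip(u'') but it gives me an error saying that .strip cannot be used because it is a list. What am I doing wrong?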

Answer 1

locs2 is a list of strings, not a list of lists. As such, you are trying to write individual strings as rows:

for values in sorted(locs2):
    writer.writerow(values)

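Here values is a string, and writerow() treats it as a sequence. Each element of whatever sequence you pass to that function is treated as a separate column, so a string is split into one column per character.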
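A minimal sketch of this behaviour, written for Python 3 (the question itself uses Python 2) and using an in-memory io.StringIO buffer in place of a real file. The helper name rows_written is hypothetical and exists only for this illustration:

```python
import csv
import io


def rows_written(row):
    """Write one row with csv.writer and return the resulting CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(row)
    return buf.getvalue()


# A bare string is iterated character by character,
# so every letter ends up in its own column.
print(repr(rows_written("Iraq")))    # 'I,r,a,q\r\n'

# A one-element list produces a single column holding the whole string.
print(repr(rows_written(["Iraq"])))  # 'Iraq\r\n'
```

The trailing \r\n is the default line terminator used by csv.writer.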

If you want to write all locations as a single row, pass the whole list to writer.writerow():

writer.writerow(sorted(locs2))

If you want to write a new row for each individual location, wrap each one in a list first:

for location in sorted(locs2):
    writer.writerow([location])
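The same result can also be had with writer.writerows(), which takes an iterable of rows. The sketch below assumes Python 3 and uses made-up stand-in data for locs2 (the real list comes from the scraped page) and an in-memory buffer instead of locations.csv:

```python
import csv
import io

# Stand-in data for illustration only; in the question this list is scraped.
locs2 = ["Syria", "Iraq", "Yemen"]

# With a real file in Python 3 you would use:
#   open('locations.csv', 'w', newline='')
buf = io.StringIO()
writer = csv.writer(buf)

writer.writerow(["location"])  # header row: one column
# Each location is wrapped in its own list, so it becomes one single-column row.
writer.writerows([location] for location in sorted(locs2))

csv_text = buf.getvalue()
print(csv_text)
```

Each row is still a one-element list, so the rule from above is unchanged; writerows() only removes the explicit loop.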

You do not need to strip the u prefix from the strings; that is just Python telling you that you have Unicode string objects, not byte string objects:

>>> 'ASCII byte string'
'ASCII byte string'
>>> 'ASCII unicode string'.decode('ascii')
u'ASCII unicode string'
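The session above is Python 2. As a rough Python 3 counterpart (an assumption about the reader's environment, not part of the original question), text strings are shown without a prefix and byte strings are shown with b. In both versions the prefix belongs to the printed representation only, not to the data:

```python
text = "ASCII unicode string"   # str: Unicode text in Python 3
data = text.encode("ascii")     # bytes: the encoded form of the same text

print(repr(text))   # 'ASCII unicode string'   (no prefix)
print(repr(data))   # b'ASCII unicode string'  (b marks a bytes object)

# The prefix is notation, not content: both objects hold 20 items,
# and decoding the bytes gives back the original text.
print(len(text), len(data))
print(data.decode("ascii") == text)
```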

If you want to learn more about Python and Unicode, see:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky
Pragmatic Unicode by Ned Batchelder
The Python Unicode HOWTO

python

csv

unicode