Writing a string to CSV using line escapes in Python 3

Question

Working in Python 3.7.

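I'm currently pulling data from an API (Qualys's API, fetching a report, to be specific). It returns a string containing all of the report data in CSV format, with each new line marked by the '\r\n' escape.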

(i.e. 'foo,bar,stuff\r\n,more stuff,data,report\r\n,etc,etc,etc\r\n')

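The problem I'm having is writing this string to a CSV file properly. Every iteration of code I've tried writes the data cell by cell when viewed in Excel, with the \r\n appended wherever it occurred in the string, all on one line rather than on new lines.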

(i.e. |foo|bar|stuff\r\n|more stuff|data|report\r\n|etc|etc|etc\r\n|)

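I just made the switch from 2 to 3, so I'm almost positive this is a syntax error or a mistake in my understanding of how Python 3 handles newline delimiters or something along those lines. But even after reviewing the documentation, the answers here, and blog posts, I either can't get my head around it or I keep missing something.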

Current code:

def dl_report(id, title):
    data = {'action': 'fetch', 'id': id}
    res = a.request('/api/2.0/fo/report/', data=data)
    print(type(res)) #returns string

    #input('pause')
    f_csv = open(title,'w', newline='\r\n')
    f_csv.write(res)
    f_csv.close

But I have also tried:

with open(title, 'w', newline='\r\n') as f:
    writer = csv.writer(f,<tried encoding here, no luck>)
    writer.writerows(res)

#anyone else looking at this, this didn't work because of the difference 
#between writerow() and writerows()

I have also tried various ways of declaring the newline, such as:

newline=''
newline='\n'
etc...

And various other iterations along those lines. Any suggestions or guidance or... anything at this point would be awesome.

Edit:

OK, I kept working at it, and this kind of works:

def dl_report(id, title):
    data = {'action': 'fetch', 'id': id}
    res = a.request('/api/2.0/fo/report/', data=data)
    print(type(res)) #returns string

    reader = csv.reader(res.split(r'\r\n'), delimiter=',')

    with open(title, 'w') as outfile:
        writer = csv.writer(outfile, delimiter= '\n')
        writer.writerow(reader)

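But this is ugly, and it does produce errors in the output CSV (some rows, fewer than 1%, don't parse as CSV rows, probably because of a formatting error somewhere). More worryingly, it behaves inconsistently when a "\" appears in the data.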

I'd really be interested in a solution that works... better? More Pythonic? More consistent would be nice...

Any ideas?

Answer 1

If I'm reading your question correctly, can't you just replace the string?

with open(title, 'w') as f:
    f.write(res.replace("\r\n", "\n"))

Answer 2

See this answer:

Python csv string to array

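According to the documentation for CSVReader, it uses \r\n as the line delimiter by default, so your string should work fine with it. If you load the string into a CSVReader object, you should then be able to look at the standard ways of exporting it.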
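As a rough sketch of that idea, assuming res really contains carriage return and line feed characters (not a literal backslash followed by r and n): csv.reader accepts \r\n line endings, and csv.writer terminates rows with '\r\n' by default. The helper name write_report and the temporary path are made up for illustration.

```python
import csv
import io
import os
import tempfile

def write_report(res, path):
    # Hypothetical helper: parse the CSV text, then write it back out row by row.
    # newline='' on both sides leaves line-ending handling to the csv module.
    reader = csv.reader(io.StringIO(res, newline=''))
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)  # lineterminator defaults to '\r\n'
        writer.writerows(reader)

# Sample data from the question, with real CR/LF characters
res = 'foo,bar,stuff\r\n,more stuff,data,report\r\n,etc,etc,etc\r\n'
path = os.path.join(tempfile.mkdtemp(), 'report.csv')
write_report(res, path)

# Read the file back as bytes to inspect the line endings that were written
with open(path, 'rb') as f:
    written = f.read()
```

Note the use of writerows() with the reader: each item the reader yields is one row (a list of fields), which is what writerows() expects.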

Answer 3

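Python strings use the single \n newline character. Normally, \r\n is converted to \n when a file is read, and on write the newline is converted to \n or \r\n depending on your system default and the newline= parameter.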

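In your case, the \r wasn't removed when you read the data from the web interface. When you opened the file with newline='\r\n', Python expanded the \n as it was supposed to, but the \r passed through, and now your line ending is \r\r\n. You can see this by rereading the text file in binary mode: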

>>> res = 'foo,bar,stuff\r\n,more stuff,data,report\r\n,etc,etc,etc\r\n'
>>> open('test', 'w', newline='\r\n').write(res)
54
>>> open('test', 'rb').read()
b'foo,bar,stuff\r\r\n,more stuff,data,report\r\r\n,etc,etc,etc\r\r\n'

Since you already have the line endings you want, just write in binary mode and skip the conversions:

>>> open('test', 'wb').write(res.encode())
54
>>> open('test', 'rb').read()
b'foo,bar,stuff\r\n,more stuff,data,report\r\n,etc,etc,etc\r\n'

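Note that I relied on the default encoding here (res.encode() with no argument encodes as UTF-8), but you may want to state the encoding explicitly so it is standardized.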
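A minimal sketch of the same binary-mode write with the encoding pinned explicitly and the file closed by a with block. The function name save_report, the choice of UTF-8, and the temporary path are assumptions for illustration; use whatever encoding the consumer of the file expects.

```python
import os
import tempfile

def save_report(res, path, encoding='utf-8'):
    # Hypothetical helper: binary mode means no newline translation,
    # so the \r\n sequences already in res are written unchanged.
    with open(path, 'wb') as f:
        f.write(res.encode(encoding))

res = 'foo,bar,stuff\r\n,more stuff,data,report\r\n,etc,etc,etc\r\n'
path = os.path.join(tempfile.mkdtemp(), 'report.csv')
save_report(res, path)

# Read the bytes back to confirm the line endings were not doubled
with open(path, 'rb') as f:
    written = f.read()
```

Opening the file in text mode with newline='' and an explicit encoding= argument should give the same bytes, since newline='' also disables translation on write.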

Answer 4

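Based on your comments, the data you're being served doesn't actually include carriage returns or newlines; it includes the text representing the escapes for carriage returns and newlines (so it really has a backslash, r, backslash, n in the data). It's otherwise already in the form you want, so you don't need to involve the csv module at all. Just interpret the escapes as their correct values, then write the data directly.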

This is relatively simple using the unicode-escape codec (which also handles ASCII escapes):

import codecs  # Needed for text->text decoding

# ... retrieve data here, store to res ...

# Converts backslash followed by r to carriage return, by n to newline,
# and so on for other escapes
decoded = codecs.decode(res, 'unicode-escape')

# newline='' means don't perform line ending conversions, so you keep \r\n
# on all systems, no adding, no removing characters
# You may want to explicitly specify an encoding like UTF-8, rather than
# relying on the system default, so your code is portable across locales
with open(title, 'w', newline='') as f:
    f.write(decoded)
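To see what the decoding step does in isolation, here is a small check on a made-up sample (the raw string stands in for the API response and is an assumption about its shape). Printing repr() before and after shows the four-character sequence backslash, r, backslash, n collapsing into two real characters.

```python
import codecs

# Raw string literal: the backslashes here are literal characters,
# mimicking a response that contains escape text, not real line endings.
res = r'foo,bar,stuff\r\n,more stuff,data,report\r\n'

decoded = codecs.decode(res, 'unicode-escape')

print(repr(res))      # shows doubled backslashes: the escapes are plain text
print(repr(decoded))  # shows \r\n as real carriage return + line feed
```

One caveat, as far as I know: when given a str, this codec treats the input as Latin-1 style bytes, so non-ASCII characters in the report may come out garbled. It is worth testing on a report that contains such characters before relying on it.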

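If the string you received is actually wrapped in quotes (so print(repr(s)) includes quotes on both ends), it may be intended to be interpreted as a JSON string. In that case, just replace the import codecs line and the codecs.decode call with: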

import json


decoded = json.loads(res)
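A quick sketch of the JSON case, using a made-up sample that is wrapped in double quotes and contains literal backslash escapes (the sample value is an assumption, not real Qualys output):

```python
import json

# The outer single quotes belong to Python; the double quotes and the
# backslash sequences inside are part of the data itself.
res = r'"foo,bar,stuff\r\n,more stuff,data,report\r\n"'

# json.loads strips the surrounding quotes and interprets the escapes
decoded = json.loads(res)

print(repr(decoded))
```

If res is not valid JSON (for example, if it is not quoted), json.loads raises json.JSONDecodeError, so this only applies when the quotes are really there.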
