How to use pandas read_csv to read a CSV file containing backslashes and double quotes

Question

I have a CSV file like this (comma-separated):

ID, Name,Context, Location
123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 1"
234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 2"

I want to create a DataFrame like this:

ID | Name |Context                                                               |Location
123| John |{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 1
234| Mike |{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 2

Could you help me work out how to do this with pandas read_csv?

Answer 1

Answer - if you are happy for the \ characters to be stripped:

import pandas as pd
pd.read_csv(your_filepath, escapechar='\\')  # '\\' is a single backslash; a bare '\' is a syntax error

    ID  Name                                            Context  Location
0  123  John  {"Organization":{"Id":12345,"IsDefault":false}...    Road 1
1  234  Mike  {"Organization":{"Id":23456,"IsDefault":false}...    Road 2
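To try this without a file on disk, here is a minimal sketch that feeds a shortened version of the sample rows through io.StringIO. The shortened Context values and the variable names (sample, df) are assumptions made for illustration; the header is the tidied one without spaces.

```python
import io
import pandas as pd

# Shortened stand-in for the question's file; raw strings keep the backslashes literal.
sample = "\n".join([
    r'ID,Name,Context,Location',
    r'123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1}","Road 1"',
    r'234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1}","Road 2"',
])

# escapechar tells the parser that \" inside a quoted field is a literal quote,
# so the backslashes are consumed while parsing.
df = pd.read_csv(io.StringIO(sample), escapechar="\\")

print(df.loc[0, "Context"])
```

With the backslashes stripped, each Context value in this sample is plain JSON text.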

Answer if you really do want the backslashes - use a custom converter:

def backslash_it(x):
    # '\\"' is backslash + quote; a plain '\"' is just a quote and would change nothing
    return x.replace('"', '\\"')

pd.read_csv(your_filepath, escapechar='\\', converters={'Context': backslash_it})

    ID  Name                                            Context Location
0  123  John  {\"Organization\":{\"Id\":12345,\"IsDefault\":...   Road 1
1  234  Mike  {\"Organization\":{\"Id\":23456,\"IsDefault\":...   Road 2

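The escapechar argument to read_csv is what actually lets the CSV be parsed; the custom converter then puts the backslashes back in front of each double quote.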
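The two steps can be checked end to end with an in-memory sample. This is only a sketch: the shortened Context values and the names sample and df are assumptions, and the converter simply re-escapes every double quote, so it reproduces the original text only when double quotes were the only escaped characters.

```python
import io
import pandas as pd

def backslash_it(value: str) -> str:
    """Put a backslash back in front of every double quote."""
    return value.replace('"', '\\"')

# Shortened stand-in for the question's file; raw strings keep the backslashes literal.
sample = "\n".join([
    r'ID,Name,Context,Location',
    r'123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1}","Road 1"',
    r'234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1}","Road 2"',
])

# Step 1: escapechar parses the escaped quotes (dropping the backslashes).
# Step 2: the converter runs on the parsed Context string and restores them.
df = pd.read_csv(
    io.StringIO(sample),
    escapechar="\\",
    converters={"Context": backslash_it},
)

print(df.loc[0, "Context"])
```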

Note that I adjusted the header row so that the column names are easier to match:

ID,Name,Context,Location
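If editing the file is not an option, one possible alternative (not part of the original answer) is read_csv's skipinitialspace=True, which skips spaces that follow a delimiter. The sketch below uses the question's original header with its stray spaces and a shortened, assumed Context value; default and trimmed are illustrative names.

```python
import io
import pandas as pd

# The question's original header, with spaces after some commas.
sample = "\n".join([
    r'ID, Name,Context, Location',
    r'123,"John","{\"Organization\":{\"Id\":12345}}","Road 1"',
])

# By default the leading spaces become part of the column names.
default = pd.read_csv(io.StringIO(sample), escapechar="\\")

# skipinitialspace=True drops whitespace that follows each delimiter.
trimmed = pd.read_csv(io.StringIO(sample), escapechar="\\", skipinitialspace=True)

print(list(default.columns))
print(list(trimmed.columns))
```

The Context column has no leading space in the original header, so the converter keyed on 'Context' matches either way; the difference matters for Name and Location.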

