How to use pandas read_csv to read a CSV file containing backslashes and double quotes

I have a CSV file like this (comma-delimited):

ID, Name,Context, Location
123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 1"
234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 2"

I want to create a DataFrame like this:

ID | Name |Context                                                               |Location
123| John |{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 1
234| Mike |{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 2

Could you help me work out how to do this with pandas read_csv?

Answer, if you are happy for the \ characters to be stripped:

import pandas as pd
pd.read_csv(your_filepath, escapechar='\\')

    ID  Name                                            Context  Location
0  123  John  {"Organization":{"Id":12345,"IsDefault":false}...    Road 1
1  234  Mike  {"Organization":{"Id":23456,"IsDefault":false}...    Road 2

Answer, if you really want to keep the backslashes: use a custom converter:

def backslash_it(x):
    # Put a backslash back in front of every double quote.
    return x.replace('"', '\\"')

pd.read_csv(your_filepath, escapechar='\\', converters={'Context': backslash_it})

    ID  Name                                            Context Location
0  123  John  {\"Organization\":{\"Id\":12345,\"IsDefault\":...   Road 1
1  234  Mike  {\"Organization\":{\"Id\":23456,\"IsDefault\":...   Road 2
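The escapechar on read_csv is what allows the CSV to be parsed in the first place: each \" inside a quoted field is read as a literal double quote. The custom converter then puts the backslashes back.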
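Both variants can be checked without touching the real file. The following is only a sketch: `csv_text` is a hypothetical in-memory sample with a shortened Context value, it stands in for `your_filepath`, and its header is written without spaces after the commas.

```python
import io
import pandas as pd

# Hypothetical sample mirroring the file in the question (Context shortened).
# Raw strings keep the backslashes exactly as they would appear on disk.
csv_text = (
    'ID,Name,Context,Location\n'
    r'123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1}","Road 1"' '\n'
    r'234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1}","Road 2"' '\n'
)

def backslash_it(value):
    # Re-insert a backslash before every double quote.
    return value.replace('"', '\\"')

# Variant 1: backslashes are consumed as escape characters while parsing.
stripped = pd.read_csv(io.StringIO(csv_text), escapechar='\\')

# Variant 2: same parse, then the converter restores the backslashes.
restored = pd.read_csv(
    io.StringIO(csv_text),
    escapechar='\\',
    converters={'Context': backslash_it},
)

print(stripped.loc[0, 'Context'])
# {"Organization":{"Id":12345,"IsDefault":false},"VersionNumber":-1}
print(restored.loc[0, 'Context'])
# {\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1}
```

Note that the converter adds a backslash before every double quote in the field, so it reproduces the original text only when every quote in that column was escaped in the file, as is the case here.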

Note that I tweaked the header row, removing the spaces after the commas, to make the column names easier to match:

ID,Name,Context,Location
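If editing the file is not an option, one possible alternative is to leave the header as it is and strip the whitespace from the column names after loading. This is a sketch rather than part of the original answer; `csv_text` is again a hypothetical in-memory sample with a shortened Context value.

```python
import io
import pandas as pd

# Hypothetical sample that keeps the original header, spaces included.
csv_text = (
    'ID, Name,Context, Location\n'
    r'123,"John","{\"Id\":12345}","Road 1"' '\n'
)

df = pd.read_csv(io.StringIO(csv_text), escapechar='\\')
print(list(df.columns))
# ['ID', ' Name', 'Context', ' Location']

# Remove leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()
print(list(df.columns))
# ['ID', 'Name', 'Context', 'Location']
```

In this particular file the Context header has no leading space, so converters={'Context': ...} matches either way; the cleanup matters for columns such as Name and Location.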