df.to_csv() as tab-delimited, but commas conflict

Question

I want to save a DataFrame as a tab-delimited .csv:

df.to_csv('df.csv', index=False, sep='\t')

However, the third column holds list objects, which happen to contain commas: ,.

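As a result, my output df.csv has many columns. The first contains the 3 values, correctly separated by tabs. The second and later ones are the comma-split values.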

df (correct: 3 columns):

            0                                                  1  \
0   Emissions  305-1~GHG emissions in metric tons of CO2e~Gro...   
1   Emissions  305-1~GHG emissions in metric tons of CO2e~Bio...   
2   Emissions    305-1~Direct (Scope 1) GHG emissions by gas~CO2   
3   Emissions    305-1~Direct (Scope 1) GHG emissions by gas~N20   
4   Emissions   305-1~Direct (Scope 1) GHG emissions by gas~HFCs   
5   Emissions   305-1~Direct (Scope 1) GHG emissions by gas~PFCs   
6   Emissions    305-1~Direct (Scope 1) GHG emissions by gas~SF6   
7   Emissions  305-2~GHG Emissions in metric tons of CO2e~Gro...   
8   Emissions  305-2~GHG Emissions in metric tons of CO2e~Gro...   
9   Emissions  305-2~GHG Emissions in metric tons of CO2e~Tot...   
10  Emissions  305-2~GHG Emissions in metric tons of CO2e~Tot...   
11  Emissions  103-1~Explanation of the material topic and it...   
12  Emissions   103-2~The management approach and its components   
13  Emissions        103-3~Evaluation of the management approach   

                                                    2  
0   [2014_2760, 2015_278585, 2016_409886, 2017_972...  
1   [2014_299605, 2015_477610, 2016_822657, 2017_8...  
2   [2014_444055, 2015_730929, 2016_766490, 2017_8...  
3   [2014_510811, 2015_583265, 2016_694522, 2017_7...  
4   [2014_162816, 2015_199622, 2016_228775, 2017_3...  
5   [2014_61824, 2015_569032, 2016_607814, 2017_77...  
6   [2014_60442, 2015_64418, 2016_329338, 2017_784...  
7   [2014_53078, 2015_500448, 2016_527776, 2017_61...  
8   [2014_165580, 2015_557426, 2016_894641, 2017_9...  
9   [2014_60142, 2015_84502, 2016_532996, 2017_893...  
10  [2014_71762, 2015_72349, 2016_195351, 2017_624...  
11  consumption rate fossil fuels coal oil emissio...  
12  how evaluate companys environmental management...  
13  evaluation effectiveness companys environmenta...

df.csv (incorrect; technically I want one column, but with the original 3 column values tab-delimited):

Simplified template example

df:

text | text | ['list', 'object', 'here', 'of', 'any', 'length']
text | text | ['foo', 'bar']

Desired .CSV [one literal column, but with values separated by tabs (->)]:

| text -> text -> ['list', 'object', 'here', 'of', 'any', 'length'] |
| text -> text -> ['foo', 'bar'] |

Single-column output, with values separated by tabs. No headers or index.

How can I make sure Pandas ignores the , inside the list objects?

Let me know if I need to provide more details.

Answer 1

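FYI, you can click "Copy Value" on the df in your variable viewer (the exact name and behavior depend on the IDE) to copy its data in a form I can reproduce. In the meantime, I created a sample based on what you provided.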

import pandas as pd
import csv

Sample df:

df = pd.DataFrame({'col1': ['Emissions', 'Emissions'], 'col2': ['305-1~GHG emissions in metric tons of CO2e~Gro...', '305-1~GHG emissions in metric tons of CO2e~Bio...'], 'col3': [['2014_2760, 2015_278585, 2016_409886'], ['2014_299605, 2015_477610, 2016_822657']]})

~~Now the trick here is to use the quoting parameter, which according to the docs is:~~

> quoting : optional constant from csv module

~~Defaults to csv.QUOTE_MINIMAL. If you have set a float_format then floats are converted to strings and thus csv.QUOTE_NONNUMERIC will treat them as non-numeric.~~
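To see what the default quoting does in this situation, here is a minimal sketch. The frame `sample` and the variable names are made up for illustration, and the text is kept in memory rather than written to disk. It suggests that with a tab separator the commas inside the stringified list are left alone, so extra columns may come from whatever program opens the .csv and splits on commas, rather than from pandas itself.

```python
import io

import pandas as pd

# Hypothetical one-row frame; the third column holds a Python list.
sample = pd.DataFrame({
    "col1": ["Emissions"],
    "col2": ["305-1"],
    "col3": [["2014_2760", "2015_278585"]],
})

# Tab separator: the stringified list contains commas but no tabs,
# so the default csv.QUOTE_MINIMAL leaves the field unquoted.
tab_text = sample.to_csv(sep="\t", index=False, header=False)

# Comma separator: the same field now contains the delimiter,
# so QUOTE_MINIMAL wraps it in double quotes.
comma_text = sample.to_csv(index=False, header=False)

# Reading the tab-separated text back with the same separator
# recovers the original three columns.
round_trip = pd.read_csv(io.StringIO(tab_text), sep="\t", header=None)

print(repr(tab_text))
print(repr(comma_text))
print(round_trip.shape)
```

If the file only looks wrong in a spreadsheet or CSV viewer, saving it with a .tsv or .txt extension, or telling the viewer to use tab as the delimiter, may be enough.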

Edit:

Now that you have clarified your objective, apply should do it:

# Collapse each row into one string, with the values separated by ' -> '
df = df.apply(
    lambda x: ' -> '.join(x.astype(str)),
    axis=1)
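The question uses -> to stand for a tab and asks for no header row, so a variant of the snippet above that joins on a real tab character might look like the following sketch. The names `sample`, `joined`, and `buffer` are hypothetical, and an in-memory buffer stands in for a real file. The lines are written by hand on the assumption that no CSV quoting is wanted around the commas inside the lists.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the simplified template in the question.
sample = pd.DataFrame({
    "col1": ["text", "text"],
    "col2": ["text", "text"],
    "col3": [["list", "object", "here"], ["foo", "bar"]],
})

# Join each row's values with a real tab character instead of ' -> '.
joined = sample.apply(lambda row: "\t".join(row.astype(str)), axis=1)

# Write one line per row directly, so the csv writer never gets a chance
# to quote the commas inside the stringified lists.
buffer = io.StringIO()  # stand-in for open('sample.tsv', 'w')
buffer.write("\n".join(joined) + "\n")

lines = buffer.getvalue().splitlines()
for line in lines:
    print(repr(line))
```

Each output line is then a single piece of text with tabs between the three original values, and there is no header or index row.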

Save the file:

df.to_csv('sample.csv', index=False)

Output:
