Programmatically convert pandas dataframe to markdown table
I have a Pandas DataFrame generated from a database, which contains data with mixed encodings. For example:
+----+-------------------------+----------+------------+------------------------------------------------+--------------------------------------------------------+--------------+-----------------------+
| ID | path | language | date | longest_sentence | shortest_sentence | number_words | readability_consensus |
+----+-------------------------+----------+------------+------------------------------------------------+--------------------------------------------------------+--------------+-----------------------+
| 0 | data/Eng/Sagitarius.txt | Eng | 2015-09-17 | With administrative experience in the prepa... | I am able to relocate internationally on short not... | 306 | 11th and 12th grade |
+----+-------------------------+----------+------------+------------------------------------------------+--------------------------------------------------------+--------------+-----------------------+
| 31 | data/Nor/Høylandet.txt | Nor | 2015-07-22 | Høgskolen i Østfold er et eksempel... | Som skuespiller har jeg både... | 253 | 15th and 16th grade |
+----+-------------------------+----------+------------+------------------------------------------------+--------------------------------------------------------+--------------+-----------------------+
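As you can see, it mixes English and Norwegian (encoded as ISO-8859-1 in the database, I think). I need to output the contents of this DataFrame as a Markdown table without running into encoding problems. I followed this answer (from the question Generate Markdown tables?) and got the following: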
import sys, sqlite3
import pandas as pd  # needed for pd.read_sql_query below

db = sqlite3.connect("Applications.db")
df = pd.read_sql_query("SELECT path, language, date, longest_sentence, shortest_sentence, number_words, readability_consensus FROM applications ORDER BY date(date) DESC", db)
db.close()

# Collect one tuple per DataFrame row, in the order the headings expect
rows = []
for index, row in df.iterrows():
    items = (row['date'],
             row['path'],
             row['language'],
             row['shortest_sentence'],
             row['longest_sentence'],
             row['number_words'],
             row['readability_consensus'])
    rows.append(items)

headings = ['Date',
            'Path',
            'Language',
            'Shortest Sentence',
            'Longest Sentence since',
            'Words',
            'Grade level']

fields = [0, 1, 2, 3, 4, 5, 6]
align = [('^', '<'), ('^', '^'), ('^', '<'), ('^', '^'), ('^', '>'),
         ('^','^'), ('^','^')]

# table() comes from the linked answer and is not defined in this snippet
table(sys.stdout, rows, fields, headings, align)
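However, this yields a UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 72: ordinal not in range(128) error. How can I output the DataFrame as a Markdown table? The goal is to store the table in a file for use in writing a Markdown document. I need the output to look like this: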
| ID | path | language | date | longest_sentence | shortest_sentence | number_words | readability_consensus |
|----|-------------------------|----------|------------|------------------------------------------------|--------------------------------------------------------|--------------|-----------------------|
| 0 | data/Eng/Sagitarius.txt | Eng | 2015-09-17 | With administrative experience in the prepa... | I am able to relocate internationally on short not... | 306 | 11th and 12th grade |
| 31 | data/Nor/Høylandet.txt | Nor | 2015-07-22 | Høgskolen i Østfold er et eksempel... | Som skuespiller har jeg både... | 253 | 15th and 16th grade |
Try this out. I got it to work.
See the screenshot of my markdown file converted to HTML at the end of this answer.
import pandas as pd
# You don't need these two lines
# as you already have your DataFrame in memory
df = pd.read_csv("nor.txt", sep="|")
# Note: drop() returns a new DataFrame; without assignment this line leaves df unchanged
df.drop(df.columns[-1], axis=1)
# Get column names
cols = df.columns
# Create a new DataFrame with just the markdown
# strings
df2 = pd.DataFrame([['---',]*len(cols)], columns=cols)
#Create a new concatenated DataFrame
df3 = pd.concat([df2, df])
#Save as markdown
df3.to_csv("nor.md", sep="|", index=False)
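The separator-row trick above can also be checked without touching the file system. The following is a minimal sketch, assuming a small made-up DataFrame (the `sample` data and the helper name `separator_row_csv` are illustrative and not from the answer); it returns the pipe-separated text instead of writing nor.md.

```python
import pandas as pd


def separator_row_csv(df):
    """Return df as pipe-separated text with a '---' row under the header."""
    # One '---' cell per column, using the same column labels as df
    rule = pd.DataFrame([["---"] * len(df.columns)], columns=df.columns)
    # Put the rule row first so it lands directly below the header line
    combined = pd.concat([rule, df])
    # With no path given, to_csv returns the text as a string
    return combined.to_csv(sep="|", index=False)


# Hypothetical sample data for illustration only
sample = pd.DataFrame({"path": ["data/Nor/Høylandet.txt"], "number_words": [253]})
text = separator_row_csv(sample)
print(text)
```

The result has no leading or trailing pipes on each line, which many Markdown renderers accept for pipe tables, though not necessarily all of them.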
Right, so I took a leaf from a question suggested by Rohit (Python - Encoding string - Swedish Letters), extended his answer, and came up with the following:
# Enforce UTF-8 encoding (Python 2 only workaround; sys.setdefaultencoding does not exist on Python 3)
import sys
stdin, stdout = sys.stdin, sys.stdout
reload(sys)
sys.stdin, sys.stdout = stdin, stdout
sys.setdefaultencoding('UTF-8')
# SQLite3 database
import sqlite3
# Pandas: Data structures and data analysis tools
import pandas as pd
# Read database, attach as Pandas dataframe
db = sqlite3.connect("Applications.db")
df = pd.read_sql_query("SELECT path, language, date, shortest_sentence, longest_sentence, number_words, readability_consensus FROM applications ORDER BY date(date) DESC", db)
db.close()
df.columns = ['Path', 'Language', 'Date', 'Shortest Sentence', 'Longest Sentence', 'Words', 'Readability Consensus']
# Parse Dataframe and apply Markdown, then save as 'table.md'
cols = df.columns
df2 = pd.DataFrame([['---','---','---','---','---','---','---']], columns=cols)
df3 = pd.concat([df2, df])
df3.to_csv("table.md", sep="|", index=False)
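An important precondition is that the shortest_sentence and longest_sentence columns do not contain unnecessary line breaks. These were removed by applying .replace('\n', ' ').replace('\r', '') before submitting the text to the SQLite database. It appears that the solution is not to enforce the language-specific encoding (ISO-8859-1 for Norwegian), but rather to use UTF-8 instead of the default ASCII.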
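The same clean-up can also be applied on the pandas side. This is a minimal sketch, assuming the DataFrame is already in memory; the sample row and the helper name `strip_line_breaks` are hypothetical, and the code mirrors the two replace calls quoted above rather than the author's database code.

```python
import pandas as pd


def strip_line_breaks(df, columns):
    """Return a copy of df with line feeds turned into spaces and carriage returns removed."""
    cleaned = df.copy()
    for name in columns:
        cleaned[name] = (
            cleaned[name]
            .str.replace("\n", " ", regex=False)  # literal match, not a regex
            .str.replace("\r", "", regex=False)
        )
    return cleaned


# Hypothetical sample with a Windows-style line break inside a cell
sample = pd.DataFrame({"shortest_sentence": ["Som skuespiller\r\nhar jeg både..."]})
cleaned = strip_line_breaks(sample, ["shortest_sentence"])
print(cleaned["shortest_sentence"][0])
```

A line break inside a cell would otherwise split one table row across two lines of the Markdown file.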
I ran this through my IPython notebook (Python 2.7.10) and got a table looking like the following (fixed spacing for appearance here):
| Path | Language | Date | Shortest Sentence | Longest Sentence | Words | Readability Consensus |
|-------------------------|----------|------------|----------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|-----------------------|
| data/Eng/Something1.txt | Eng | 2015-09-17 | I am able to relocate to London on short notice. | With my administrative experience in the preparation of the structure and content of seminars in various courses, and critiquing academic papers on various levels, I am confident that I can execute the work required as an editorial assistant. | 306 | 11th and 12th grade |
| data/Nor/NoeNorrønt.txt | Nor | 2015-09-17 | Jeg har grundig kjennskap til Microsoft Office og Adobe. | I løpet av studiene har jeg vært salgsmedarbeider for et større konsern, hvor jeg solgte forsikring til studentene og de faglige ansatte ved universitetet i Trønderlag, samt renholdsarbeider i et annet, hvor jeg i en periode var avdelingsansvarlig. | 205 | 18th and 19th grade |
| data/Nor/Ørret.txt.txt | Nor | 2015-09-17 | Jeg håper på positiv tilbakemelding, og møter naturligvis til intervju hvis det er ønskelig. | I løpet av studiene har jeg vært salgsmedarbeider for et større konsern, hvor jeg solgte forsikring til studentene og de faglige ansatte ved universitetet i Trønderlag, samt renholdsarbeider i et annet, hvor jeg i en periode var avdelingsansvarlig. | 160 | 18th and 19th grade |
Thus, a Markdown table without encoding problems.
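On Python 3 the same result should not need the reload(sys) workaround, because the output encoding can be stated explicitly when writing. This is a minimal sketch, assuming Python 3 and pandas; the sample row and the temporary file are illustrative only.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with Norwegian characters
sample = pd.DataFrame({"Path": ["data/Nor/Ørret.txt"], "Words": [160]})
rule = pd.DataFrame([["---"] * len(sample.columns)], columns=sample.columns)

# Write to a temporary directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "table.md")
    # State the encoding explicitly instead of relying on a platform default
    pd.concat([rule, sample]).to_csv(target, sep="|", index=False, encoding="utf-8")
    with open(target, encoding="utf-8") as handle:
        written = handle.read()

print(written)
```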
Improving the answer further, for use in an IPython Notebook:
import pandas as pd

def pandas_df_to_markdown_table(df):
    from IPython.display import Markdown, display
    # One '---' separator cell per column
    fmt = ['---' for i in range(len(df.columns))]
    df_fmt = pd.DataFrame([fmt], columns=df.columns)
    df_formatted = pd.concat([df_fmt, df])
    display(Markdown(df_formatted.to_csv(sep="|", index=False)))

pandas_df_to_markdown_table(infodf)
Or use tabulate:
pip install tabulate
Examples of use are in the documentation.
Update
As of pandas 1.0, DataFrame to markdown is available. See the answer from @timvink (docs).
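sqlite3 returns Unicode by default for TEXT fields. Everything was set up to work before you introduced the table() function from an external source (which you did not provide in your question).
The table() function has str() calls that do not provide an encoding, so ASCII is used to protect you.
You need to rewrite table() not to do this, especially as you have Unicode objects. You may have some success by simply replacing str() with unicode().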
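The failure mode described above can be reproduced without pandas or the table() helper. This is a minimal sketch in Python 3 syntax (the original code is Python 2, where str() on a unicode object performs the implicit ASCII encode); the sample string is taken from the question's data.

```python
text = "Som skuespiller har jeg både..."

# Encoding to ASCII fails on the first character outside the 0-127 range
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# UTF-8 can represent every character, so the same call succeeds
utf8_bytes = text.encode("utf-8")

print(ascii_error)
print(utf8_bytes.decode("utf-8") == text)
```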
Export a DataFrame to markdown
I created the following function for exporting a pandas.DataFrame to markdown in Python:
def df_to_markdown(df, float_format='%.2g'):
    """
    Export a pandas.DataFrame to markdown-formatted text.
    DataFrame should not contain any `|` characters.
    """
    from os import linesep
    return linesep.join([
        '|'.join(df.columns),
        '|'.join(4 * '-' for i in df.columns),
        df.to_csv(sep='|', index=False, header=False, float_format=float_format)
    ]).replace('|', ' | ')
This function may not automatically fix the encoding issues of the OP, but that is a different issue from converting pandas to markdown.
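The docstring's restriction on `|` characters can be worked around by escaping them before the table is assembled. This is a minimal sketch, not part of the answer: the helper names `escape_pipes` and `df_to_markdown_escaped` are hypothetical, and it assumes the target renderer accepts `\|` as an escaped pipe inside a table cell (GitHub Flavored Markdown does).

```python
import pandas as pd


def escape_pipes(df):
    """Return a copy with every cell converted to str and '|' escaped as a backslash-pipe."""
    as_text = df.astype(str)
    return as_text.apply(lambda col: col.str.replace("|", r"\|", regex=False))


def df_to_markdown_escaped(df):
    """Build a pipe table by hand from the escaped cells (index is ignored)."""
    safe = escape_pipes(df)
    header = "| " + " | ".join(str(c) for c in safe.columns) + " |"
    rule = "| " + " | ".join("---" for _ in safe.columns) + " |"
    body = [
        "| " + " | ".join(row) + " |"
        for row in safe.itertuples(index=False, name=None)
    ]
    return "\n".join([header, rule] + body)


# Hypothetical sample containing a literal pipe in one cell
sample = pd.DataFrame({"a": ["x|y"], "b": [1]})
markdown = df_to_markdown_escaped(sample)
print(markdown)
```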
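I tried several of the above solutions in this post and found this one worked most consistently.
To convert a pandas DataFrame to a markdown table, I suggest using pytablewriter.
Using the data provided in this post: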
import pandas as pd
import pytablewriter
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO
c = StringIO("""ID, path,language, date,longest_sentence, shortest_sentence, number_words , readability_consensus
0, data/Eng/Sagitarius.txt , Eng, 2015-09-17 , With administrative experience in the prepa... , I am able to relocate internationally on short not..., 306, 11th and 12th grade
31 , data/Nor/Høylandet.txt , Nor, 2015-07-22 , Høgskolen i Østfold er et eksempel..., Som skuespiller har jeg både..., 253, 15th and 16th grade
""")
df = pd.read_csv(c,sep=',',index_col=['ID'])
writer = pytablewriter.MarkdownTableWriter()
writer.table_name = "example_table"
writer.header_list = list(df.columns.values)
writer.value_matrix = df.values.tolist()
writer.write_table()
This results in:
# example_table
ID | path |language| date | longest_sentence | shortest_sentence | number_words | readability_consensus
--:|--------------------------|--------|------------|------------------------------------------------|------------------------------------------------------|-------------:|-----------------------
0| data/Eng/Sagitarius.txt | Eng | 2015-09-17 | With administrative experience in the prepa... | I am able to relocate internationally on short not...| 306| 11th and 12th grade
31| data/Nor/Høylandet.txt | Nor | 2015-07-22 | Høgskolen i Østfold er et eksempel... | Som skuespiller har jeg både... | 253| 15th and 16th grade
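Here is a screenshot of the rendered Markdown.
Here is an example function that uses pytablewriter and a regular expression to make the markdown table look more like a DataFrame does in Jupyter (with the row headers in bold).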
import io
import re
import pandas as pd
import pytablewriter

def df_to_markdown(df):
    """
    Converts Pandas DataFrame to markdown table,
    making the index bold (as in Jupyter) unless it's a
    pd.RangeIndex, in which case the index is completely dropped.
    Returns a string containing markdown table.
    """
    isRangeIndex = isinstance(df.index, pd.RangeIndex)
    if not isRangeIndex:
        df = df.reset_index()
    writer = pytablewriter.MarkdownTableWriter()
    writer.stream = io.StringIO()
    writer.header_list = df.columns
    writer.value_matrix = df.values
    writer.write_table()
    writer.stream.seek(0)
    table = writer.stream.readlines()

    if isRangeIndex:
        return ''.join(table)
    else:
        # Make the indexes bold: wrap the text before the first '|' in **
        new_table = table[:2]
        for line in table[2:]:
            new_table.append(re.sub(r'^(.*?)\|', r'**\1**|', line))

        return ''.join(new_table)
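The bolding step can be tried in isolation with the standard library only. This is a minimal sketch, assuming table lines that start with the index cell and have no leading pipe, as in the pytablewriter output shown earlier; the sample lines are made up.

```python
import re

# Hypothetical table lines in the style without a leading pipe
lines = ["monday|20|100\n", "thursday|30|200\n"]

# Wrap everything before the first '|' in ** so Markdown renders it bold;
# the lazy group (.*?) stops at the first pipe instead of the last one
bold_lines = [re.sub(r"^(.*?)\|", r"**\1**|", line) for line in lines]

print("".join(bold_lines))
```

If the writer emits a leading pipe on each line, the first cell is empty and the pattern would need adjusting.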
I recommend the python-tabulate library for generating ASCII tables. The library also supports pandas.DataFrame.
Here is how to use it:
from pandas import DataFrame
from tabulate import tabulate
df = DataFrame({
    "weekday": ["monday", "thursday", "wednesday"],
    "temperature": [20, 30, 25],
    "precipitation": [100, 200, 150],
}).set_index("weekday")
print(tabulate(df, tablefmt="pipe", headers="keys"))
Output:
| weekday | temperature | precipitation |
|:----------|--------------:|----------------:|
| monday | 20 | 100 |
| thursday | 30 | 200 |
| wednesday | 25 | 150 |
Using the external tool pandoc and a pipe:
def to_markdown(df):
    from subprocess import Popen, PIPE
    s = df.to_latex()
    p = Popen('pandoc -f latex -t markdown',
              stdin=PIPE, stdout=PIPE, shell=True)
    stdoutdata, _ = p.communicate(input=s.encode("utf-8"))
    return stdoutdata.decode("utf-8")
For those looking for how to do this with tabulate, I thought I'd put this here to save you some time:
print(tabulate(df, tablefmt="pipe", headers="keys", showindex=False))
Yet another solution, this time via a thin wrapper around tabulate: tabulatehelper
import numpy as np
import pandas as pd
import tabulatehelper as th
df = pd.DataFrame(np.random.random(16).reshape(4, 4), columns=('a', 'b', 'c', 'd'))
print(th.md_table(df, formats={-1: 'c'}))
Output:
| a | b | c | d |
|---------:|---------:|---------:|:--------:|
| 0.413284 | 0.932373 | 0.277797 | 0.646333 |
| 0.552731 | 0.381826 | 0.141727 | 0.2483 |
| 0.779889 | 0.012458 | 0.308352 | 0.650859 |
| 0.301109 | 0.982111 | 0.994024 | 0.43551 |
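Pandas has merged a PR to support a df.to_markdown() method. You can find more details here. It should be available soon.
Pandas 1.0 was released on 29 January 2020 and supports markdown conversion, so you can now do this directly!
Example taken from the docs: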
df = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=['a', 'a', 'b'])
print(df.to_markdown())
| | A | B |
|:---|----:|----:|
| a | 1 | 1 |
| a | 2 | 2 |
| b | 3 | 3 |
Or without the index:
print(df.to_markdown(index=False)) # use 'showindex' for pandas < 1.1
| A | B |
|----:|----:|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |