Merge two spreadsheets using Python - source of columns in the new sheet alternates between source files
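I would like to write Python code that merges two spreadsheets in .csv format, so that the first column of the new sheet comes from either source sheet and all the other new columns are taken from the two source sheets in alternating order.
Here is an example (shown in spreadsheet format):
Source 1: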
         (A) name 1  (A) name 2  (A) name 3  (A) name 4
class 1
class 2
class 3
class 4
Source 2:
         (B) name 1  (B) name 2  (B) name 3  (B) name 4
class 1
class 2
class 3
class 4
Desired result:
         (A) name 1  (B) name 1  (A) name 2  (B) name 2  (A) name 3  (B) name 3  (A) name 4  (B) name 4
class 1
class 2
class 3
class 4
Edit:
As requested, here is a sample of my data (shown in .csv format)
Sheet 1:
,(F) Abies amabilis,(F) Abies balsamea,(F) Abies bifolia,(F) Abies concolor,(F) Abies fraseri,(F) Abies grandis,(F) Abies lasiocarpa,(F) Abies magnifica,(F) Abies procera,(F) Larix decidua,(F) Larix laricina,(F) Picea abies,(F) Picea engelmannii,(F) Picea glauca,(F) Picea mariana,(F) Picea pungens,(F) Picea sitchensis,(F) Pinus albicaulis,(F) Pinus aristata,(F) Pinus attenuata,(F) Pinus banksiana,(F) Pinus cembroides,(F) Pinus clausa,(F) Pinus contorta,(F) Pinus coulteri,(F) Pinus echinata,(F) Pinus edulis,(F) Pinus elliottii,(F) Pinus engelmannii,(F) Pinus flexilis,(F) Pinus halepensis,(F) Pinus jeffreyi,(F) Pinus lambertiana,(F) Pinus leiophylla,(F) Pinus longaeva,(F) Pinus monophylla,(F) Pinus monticola,(F) Pinus mugo,(F) Pinus muricata,(F) Pinus palustris,(F) Pinus ponderosa,(F) Pinus pumila,(F) Pinus pungens,(F) Pinus quadrifolia,(F) Pinus radiata,(F) Pinus resinosa,(F) Pinus rigida,(F) Pinus serotina,(F) Pinus strobiformis,(F) Pinus strobus,(F) Pinus sylvestris,(F) Pinus taeda,(F) Pinus thunbergii,(F) Pinus torreyana,(F) Pinus virginiana,(F) Pseudotsuga macrocarpa,(F) Pseudotsuga menziesii,(F) Tsuga canadensis,(F) Tsuga heterophylla,(F) Tsuga mertensiana
48,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
52,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
58,0,0,0,1,0,0,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,0,0
Sheet 2:
,(M) Abies amabilis,(M) Abies balsamea,(M) Abies bifolia,(M) Abies concolor,(M) Abies fraseri,(M) Abies grandis,(M) Abies lasiocarpa,(M) Abies magnifica,(M) Abies procera,(M) Larix decidua,(M) Larix laricina,(M) Picea engelmannii,(M) Picea glauca,(M) Picea mariana,(M) Picea pungens,(M) Picea sitchensis,(M) Pinus albicaulis,(M) Pinus aristata,(M) Pinus attenuata,(M) Pinus banksiana,(M) Pinus cembroides,(M) Pinus clausa,(M) Pinus contorta,(M) Pinus coulteri,(M) Pinus echinata,(M) Pinus edulis,(M) Pinus elliottii,(M) Pinus engelmannii,(M) Pinus flexilis,(M) Pinus halepensis,(M) Pinus jeffreyi,(M) Pinus lambertiana,(M) Pinus leiophylla,(M) Pinus longaeva,(M) Pinus monophylla,(M) Pinus monticola,(M) Pinus muricata,(M) Pinus palustris,(M) Pinus ponderosa,(M) Pinus pumila,(M) Pinus pungens,(M) Pinus quadrifolia,(M) Pinus radiata,(M) Pinus resinosa,(M) Pinus rigida,(M) Pinus serotina,(M) Pinus strobiformis,(M) Pinus strobus,(M) Pinus sylvestris,(M) Pinus thunbergii,(M) Pinus torreyana,(M) Pinus virginiana,(M) Tsuga canadensis,(M) Tsuga heterophylla,(M) Tsuga mertensiana
48,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
52,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,1,0,0,1,0,0,1,0,1,0,0,1,1,1,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1
58,0,0,1,0,0,1,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
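I am a very novice coder, so what I have tried is hardly worth mentioning. However, I initially assumed that I could perhaps link the sheets using zip, which works on lists. I also thought I might be able to do something like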
for line in "Source 1.csv" and row in "Source 2.csv":
    # then split the lines into lists and write to an outfile using list indices
Thank you very much in advance for your help!
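I think your idea of using zip() is on the right track, but it gets a bit tricky because zip() returns pairs of values, one from each source file. The code below gets around that by flattening the nested sequence, so I think it should work. You can also use zip() (or itertools.izip() on Python 2) to walk through the rows of the two csv files in parallel.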
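To see the pairing and flattening steps in isolation, here is a small sketch that is not part of the answer's script. The two rows hold made-up values, and the names `row1`, `row2`, `pairs` and `flat` are chosen only for illustration.

```python
import itertools

# Made-up values standing in for one row from each source file.
row1 = ["a1", "a2", "a3"]
row2 = ["b1", "b2", "b3"]

# zip() pairs the cells position by position: one tuple per column.
pairs = list(zip(row1, row2))

# chain.from_iterable() flattens those tuples into one alternating sequence.
flat = list(itertools.chain.from_iterable(pairs))

print(pairs)  # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
print(flat)   # ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```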
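Note that I usually try to use the csv module when working with files in that format, because it generally saves a lot of time and trouble and is fairly easy to use.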
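import csv
import itertools

# Written for Python 3. On Python 2, use itertools.izip() instead of zip()
# and open the files in "rb" / "wb" mode without the newline argument.
with open("Source 1.csv", newline="") as source1, \
     open("Source 2.csv", newline="") as source2, \
     open("merged_output.csv", "w", newline="") as merged_output:
    source1_reader = csv.reader(source1, delimiter=',')
    source2_reader = csv.reader(source2, delimiter=',')
    merged_output_writer = csv.writer(merged_output, delimiter=',')
    # Walk through both files in parallel, one pair of rows at a time.
    for row1, row2 in zip(source1_reader, source2_reader):
        # zip(row1, row2) pairs every cell, including the first (label) cell;
        # chain.from_iterable() flattens the pairs into a single row.
        merged_output_writer.writerow(
            tuple(itertools.chain.from_iterable(zip(row1, row2))))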
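The script above pairs every cell, so the row label in the first column ends up in the output twice. Since the question asks for the first column to come from only one source, a possible variation is sketched below. It is only a sketch for Python 3: the helper names `interleave_rows` and `merge_csv` are made up for illustration, and the in-memory sample data stands in for the real files.

```python
import csv
import io
import itertools


def interleave_rows(row1, row2):
    """Keep the first cell of row1, then alternate the remaining cells."""
    label = row1[:1]  # row label, taken from the first source only
    rest = itertools.chain.from_iterable(zip(row1[1:], row2[1:]))
    return label + list(rest)


def merge_csv(source1, source2, merged_output):
    """Read two open CSV files in parallel and write the interleaved rows."""
    reader1 = csv.reader(source1)
    reader2 = csv.reader(source2)
    writer = csv.writer(merged_output)
    for row1, row2 in zip(reader1, reader2):
        writer.writerow(interleave_rows(row1, row2))


# Small in-memory stand-ins for "Source 1.csv" and "Source 2.csv".
sheet1 = io.StringIO(",(A) name 1,(A) name 2\nclass 1,0,1\nclass 2,1,0\n")
sheet2 = io.StringIO(",(B) name 1,(B) name 2\nclass 1,1,1\nclass 2,0,0\n")
merged = io.StringIO()
merge_csv(sheet1, sheet2, merged)
print(merged.getvalue())
```

With real files, the three `io.StringIO` objects would be replaced by files opened with `newline=""`. Keep in mind that zip() pairs columns purely by position and stops at the shorter row. The sample sheets in the question do not list exactly the same species (Sheet 1 has "(F) Picea abies", while Sheet 2 has no "(M) Picea abies"), so with that data the columns would need to be lined up by name before interleaving.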