How do I copy a worksheet from one workbook to another using openpyxl?
- 2025-04-10 09:46:00
- admin (original post)
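Problem description:
I have a large number of Excel files (for example, 200), and I want to copy one specific worksheet from one workbook to another. I did some research, but I could not find a way to do it with openpyxl.
This is the code I have developed so far: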
import os
import openpyxl

def copy_sheet_to_different_EXCEL(path_EXCEL_read, Sheet_name_to_copy, path_EXCEL_Save, Sheet_new_name):
    ''' Function used to copy one EXCEL sheet into another file.
    Input data:
    1.) path_EXCEL_read: The path (including the file name) of the EXCEL file that contains the sheet to copy
    2.) Sheet_name_to_copy: The name of the EXCEL sheet to copy
    3.) path_EXCEL_Save: The path of the EXCEL file where the sheet is going to be copied
    4.) Sheet_new_name: The name of the new EXCEL sheet
    Output data:
    1.) Status: If 0, everything went OK. If 1, an error occurred.
    Version History:
    1.0 (2017-02-20): Initial version.
    '''
    status = 0
    if path_EXCEL_read.endswith('.xls'):
        print('ERROR - EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
        status = 1
        return status
    try:
        wb = openpyxl.load_workbook(path_EXCEL_read, read_only=True)
    except Exception:
        print('ERROR - EXCEL file does not exist in the following location:\n{0}'.format(path_EXCEL_read))
        status = 1
        return status
    Sheet_names = wb.sheetnames  # We compare against the name of the sheet we would like to copy
    if Sheet_name_to_copy not in Sheet_names:
        print('ERROR - EXCEL sheet "{0}" does not exist'.format(Sheet_name_to_copy))
        status = 1
        return status
    # Check whether the destination file exists
    if os.path.exists(path_EXCEL_Save):
        # If so, the file exists, so we open it
        if path_EXCEL_Save.endswith('.xls'):
            print('ERROR - Destination EXCEL xls file format is not supported by openpyxl. Please, convert the file to an XLSX format')
            status = 1
            return status
        try:
            wdestiny = openpyxl.load_workbook(path_EXCEL_Save)
        except Exception:
            print('ERROR - Destination EXCEL file does not exist in the following location:\n{0}'.format(path_EXCEL_Save))
            status = 1
            return status
        # Check whether the destination sheet exists. If so, delete it
        destination_list_sheets = wdestiny.sheetnames
        if Sheet_new_name in destination_list_sheets:
            print('WARNING - Sheet "{0}" exists in: {1}. It will be deleted!'.format(Sheet_new_name, path_EXCEL_Save))
            wdestiny.remove(wdestiny[Sheet_new_name])
    else:
        wdestiny = openpyxl.Workbook()
    # Copy the Excel sheet
    try:
        sheet_to_copy = wb[Sheet_name_to_copy]
        target = wdestiny.copy_worksheet(sheet_to_copy)
        target.title = Sheet_new_name
    except Exception:
        print('ERROR - Could not copy the EXCEL sheet. Check the file')
        status = 1
        return status
    try:
        wdestiny.save(path_EXCEL_Save)
    except Exception:
        print('ERROR - Could not save the EXCEL sheet. Check the file permissions')
        status = 1
        return status
    # Program finishes
    return status
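Solution 1:
I had the same problem. For me, style, formatting, and layout were very important. In addition, I did not want to copy formulas, only the values (of the formulas). After a lot of trial and error and Stack Overflow, I came up with the following functions. They may look a bit intimidating, but the code copies a sheet from one Excel file to another Excel file (possibly an existing one) while preserving:
- the font and color of the text
- the fill color of cells
- merged cells
- comments and hyperlinks
- the format of cell values
- the width of every row and column
- whether rows and columns are hidden
- frozen rows
This is useful when you want to gather sheets from several workbooks and bind them into one workbook. I copied most attributes, but there may be a few more. In that case, you can use this script as a starting point and add to it.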
###############
## Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
## Replace ..path\\+\\file.. and ..sheet_name.. with your own values.
import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cell values and styles
    copy_sheet_attributes(source_sheet, target_sheet)

def copy_sheet_attributes(source_sheet, target_sheet):
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)
    # set row dimensions
    # Copying the row_dimensions attribute as a whole does not work (because of metadata in the attribute, I think),
    # so we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])
    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column width')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)
    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute, so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        # Excel actually groups multiple columns under one key. Use the min and max attributes to group the columns in the target sheet as well.
        # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discusses the issue.
        # Note that this also applies to the width, not only to the hidden property.
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width)  # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)

def copy_cells(source_sheet, target_sheet):
    for (row, col), source_cell in source_sheet._cells.items():
        target_cell = target_sheet.cell(column=col, row=row)
        target_cell._value = source_cell._value
        target_cell.data_type = source_cell.data_type
        if source_cell.has_style:
            target_cell.font = copy(source_cell.font)
            target_cell.border = copy(source_cell.border)
            target_cell.fill = copy(source_cell.fill)
            target_cell.number_format = copy(source_cell.number_format)
            target_cell.protection = copy(source_cell.protection)
            target_cell.alignment = copy(source_cell.alignment)
        if source_cell.hyperlink:
            target_cell._hyperlink = copy(source_cell.hyperlink)
        if source_cell.comment:
            target_cell.comment = copy(source_cell.comment)

wb_target = openpyxl.Workbook()
target_sheet = wb_target.create_sheet(..sheet_name..)
wb_source = openpyxl.load_workbook(..path\\+\\file_name.., data_only=True)
source_sheet = wb_source[..sheet_name..]
copy_sheet(source_sheet, target_sheet)
if 'Sheet' in wb_target.sheetnames:  # remove default sheet
    wb_target.remove(wb_target['Sheet'])
wb_target.save('out.xlsx')
Solution 2:
I found a workaround:
import openpyxl
xl1 = openpyxl.load_workbook('workbook1.xlsx')
# sheet you want to copy
s = openpyxl.load_workbook('workbook2.xlsx').active
s._parent = xl1
xl1._add_sheet(s)
xl1.save('some_path/name.xlsx')
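Solution 3:
You cannot use copy_worksheet() to copy between workbooks, because it depends on global constants that may vary from workbook to workbook. The only safe and reliable way is to go row by row and cell by cell.
You may want to read the discussion about this feature.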
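The row-by-row, cell-by-cell approach described above can be sketched as follows. This is a minimal illustration that copies values only, not styles; the helper name copy_values and the sheet name "Copied" are hypothetical, and the two workbooks are built in memory purely for the demonstration.

```python
from openpyxl import Workbook


def copy_values(source_sheet, target_sheet):
    """Copy every non-empty cell value from source_sheet into target_sheet."""
    for row in source_sheet.iter_rows():
        for cell in row:
            if cell.value is not None:
                # Write the value to the same coordinates in the target sheet.
                target_sheet.cell(row=cell.row, column=cell.column, value=cell.value)


# Two independent in-memory workbooks stand in for the source and target files.
wb_source = Workbook()
ws_source = wb_source.active
ws_source.append(["name", "qty"])
ws_source.append(["apple", 3])

wb_target = Workbook()
ws_target = wb_target.create_sheet("Copied")  # hypothetical sheet name

copy_values(ws_source, ws_target)
```

With real files, the workbooks would come from load_workbook() and the target would be written with wb_target.save(). Styles, merged cells, and dimensions have to be copied separately, as the longer answers in this thread do.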
Solution 4:
For speed, I use the data_only and read_only attributes when opening the workbook. iter_rows() is also very fast.
@Oscar's excellent answer needs a few changes to support ReadOnlyWorksheet and EmptyCell:
# Copy a sheet with style, format, layout, etc. from one Excel file to another Excel file
# Replace ..path\\+\\file.. and ..sheet_name.. with your own values.
import openpyxl
from copy import copy

def copy_sheet(source_sheet, target_sheet):
    copy_cells(source_sheet, target_sheet)  # copy all the cell values and styles
    copy_sheet_attributes(source_sheet, target_sheet)

def copy_sheet_attributes(source_sheet, target_sheet):
    # A read-only worksheet does not expose these attributes, so skip them
    if isinstance(source_sheet, openpyxl.worksheet._read_only.ReadOnlyWorksheet):
        return
    target_sheet.sheet_format = copy(source_sheet.sheet_format)
    target_sheet.sheet_properties = copy(source_sheet.sheet_properties)
    target_sheet.merged_cells = copy(source_sheet.merged_cells)
    target_sheet.page_margins = copy(source_sheet.page_margins)
    target_sheet.freeze_panes = copy(source_sheet.freeze_panes)
    # set row dimensions
    # Copying the row_dimensions attribute as a whole does not work (because of metadata in the attribute, I think),
    # so we copy every row's row_dimensions. That seems to work.
    for rn in range(len(source_sheet.row_dimensions)):
        target_sheet.row_dimensions[rn] = copy(source_sheet.row_dimensions[rn])
    if source_sheet.sheet_format.defaultColWidth is None:
        print('Unable to copy default column width')
    else:
        target_sheet.sheet_format.defaultColWidth = copy(source_sheet.sheet_format.defaultColWidth)
    # set specific column width and hidden property
    # we cannot copy the entire column_dimensions attribute, so we copy selected attributes
    for key, value in source_sheet.column_dimensions.items():
        # Excel actually groups multiple columns under one key. Use the min and max attributes to group the columns in the target sheet as well.
        # https://stackoverflow.com/questions/36417278/openpyxl-can-not-read-consecutive-hidden-columns discusses the issue.
        # Note that this also applies to the width, not only to the hidden property.
        target_sheet.column_dimensions[key].min = copy(source_sheet.column_dimensions[key].min)
        target_sheet.column_dimensions[key].max = copy(source_sheet.column_dimensions[key].max)
        target_sheet.column_dimensions[key].width = copy(source_sheet.column_dimensions[key].width)  # set width for every column
        target_sheet.column_dimensions[key].hidden = copy(source_sheet.column_dimensions[key].hidden)

def copy_cells(source_sheet, target_sheet):
    for r, row in enumerate(source_sheet.iter_rows()):
        for c, cell in enumerate(row):
            source_cell = cell
            # Read-only worksheets yield EmptyCell placeholders for blank cells
            if isinstance(source_cell, openpyxl.cell.read_only.EmptyCell):
                continue
            target_cell = target_sheet.cell(column=c+1, row=r+1)
            target_cell._value = source_cell._value
            target_cell.data_type = source_cell.data_type
            if source_cell.has_style:
                target_cell.font = copy(source_cell.font)
                target_cell.border = copy(source_cell.border)
                target_cell.fill = copy(source_cell.fill)
                target_cell.number_format = copy(source_cell.number_format)
                target_cell.protection = copy(source_cell.protection)
                target_cell.alignment = copy(source_cell.alignment)
            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.hyperlink:
                target_cell._hyperlink = copy(source_cell.hyperlink)
            if not isinstance(source_cell, openpyxl.cell.ReadOnlyCell) and source_cell.comment:
                target_cell.comment = copy(source_cell.comment)
Use it like this:
from openpyxl import Workbook, load_workbook

wb = Workbook()
wb_source = load_workbook(filename, data_only=True, read_only=True)
for sheetname in wb_source.sheetnames:
    source_sheet = wb_source[sheetname]
    ws = wb.create_sheet("Orig_" + sheetname)
    copy_sheet(source_sheet, ws)
wb.save(new_filename)
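Solution 5:
My workaround goes like this:
You have a template file, say "template.xlsx". You open it, change it as needed, save it as a new file, and close the file. Repeat as needed. Just make sure to keep a copy of the original template while testing and fiddling.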
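The template workflow above might look roughly like this with openpyxl. It is only a sketch under stated assumptions: the helper fill_template, the cell coordinates, and the output file names are hypothetical, and a throwaway template is created in a temporary directory so the example is self-contained.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def fill_template(template_path, output_path, cell_values):
    """Load the template, write the given cell values, and save under a new name."""
    wb = load_workbook(template_path)
    ws = wb.active
    for coordinate, value in cell_values.items():
        ws[coordinate] = value
    wb.save(output_path)  # the template file itself is never overwritten
    wb.close()


# Create a throwaway template so the example does not depend on an existing file.
work_dir = tempfile.mkdtemp()
template_path = os.path.join(work_dir, "template.xlsx")
template = Workbook()
template.active["A1"] = "Report for:"
template.save(template_path)

# Repeat as needed: one output file per name, all derived from the same template.
for name in ("alice", "bob"):
    fill_template(template_path, os.path.join(work_dir, name + ".xlsx"), {"B1": name})
```

Because every output is saved under a different path, the template stays unchanged between iterations, which is the point of keeping an untouched original.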
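Solution 6:
I had a similar requirement: collating data from multiple workbooks into one workbook. Because openpyxl has no built-in method for this, I created the script below to do the job.
Note: in my use case, all the workbooks contain data in the same format.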
from openpyxl import load_workbook
import os

# Read the data from the active worksheet of a workbook and store it in memory.
def reader(file):
    global path
    abs_file = os.path.join(path, file)
    wb_sheet = load_workbook(abs_file).active
    rows = []
    # min_row is set to 2 to skip the first row, which contains the headers
    for row in wb_sheet.iter_rows(min_row=2):
        row_data = []
        for cell in row:
            row_data.append(cell.value)
        # custom column data I am adding, not needed for typical use cases
        row_data.append(file[17:-6])
        # Build a list of lists, where each inner list holds one row's data
        rows.append(row_data)
    return rows

if __name__ == '__main__':
    # Folder in which my source Excel files are located
    path = r'C:\Users\tom\Desktop\Qt'
    # Get the list of Excel files
    files = os.listdir(path)
    for file in files:
        rows = reader(file)
        # the file named below must already exist
        book = load_workbook('new.xlsx')
        sheet = book.active
        for row in rows:
            sheet.append(row)
        book.save('new.xlsx')
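Solution 7:
I just ran into this problem. As mentioned here, a good workaround is to modify the original wb in memory and then save it under another name. For example: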
import openpyxl

# your starting wb with 2 sheets: Sheet1 and Sheet2
wb = openpyxl.load_workbook('old.xlsx')
sheets = wb.sheetnames  # ['Sheet1', 'Sheet2']
for s in sheets:
    if s != 'Sheet2':
        # remove every sheet except Sheet2
        wb.remove(wb[s])
# your final wb with just Sheet2
wb.save('new.xlsx')
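Solution 8:
With openpyxl, copying the borders failed. In my case, xlwings worked. It opens Excel in the operating system, copies the tab to the other Excel file, then saves, renames, and closes.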
import os
import openpyxl
import xlwings as xw

def copy_tab(file_old, tab_source, file_new, tab_destination):
    delete_tab = False
    # Create the target file if it does not exist yet
    if not os.path.exists(file_new):
        wb_target = openpyxl.Workbook()
        wb_target.save(file_new)
        delete_tab = True
    wb = xw.Book(file_old)
    app = wb.app
    app.visible = False
    sht = wb.sheets[tab_source]
    new_wb = xw.Book(file_new)
    new_app = new_wb.app
    new_app.visible = False
    # Copy the source tab after the last sheet of the target workbook
    sht.api.Copy(None, After=new_wb.sheets[-1].api)
    if delete_tab:
        # Remove the default sheet of the newly created workbook
        new_wb.sheets['Sheet'].delete()
    wb.close()
    # Delete any existing tab whose name contains the destination name
    for sheet in new_wb.sheets:
        if tab_destination in sheet.name:
            sheet.delete()
    new_wb.sheets[tab_source].name = tab_destination
    new_wb.save()
    new_wb.close()

if __name__ == "__main__":
    file_old = r"C:\file_old.xlsx"
    file_new = r"C:\file_new.xlsx"
    copy_tab(file_old, "sheet_old", file_new, "sheet_new")
Solution 9:
I would add the following to @Oscar's answer:
target_sheet.print_options = copy(source_sheet.print_options)
target_sheet.protection = copy(source_sheet.protection)
target_sheet.sheet_state = copy(source_sheet.sheet_state)
target_sheet.views = copy(source_sheet.views)
target_sheet.data_validations = copy(source_sheet.data_validations)
Solution 10:
The workaround I use is to save the current sheet as a pandas DataFrame and load it into the desired Excel workbook.
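A rough sketch of that pandas round trip is shown below, assuming pandas 1.3 or later with openpyxl installed as the Excel engine. The function name copy_sheet_via_pandas and the file and sheet names are hypothetical. Note that this route carries over cell values only; styles, formulas, and merged cells are lost.

```python
import os
import tempfile

import pandas as pd


def copy_sheet_via_pandas(source_path, source_sheet, target_path, target_sheet):
    """Copy the values of one sheet into another workbook via a DataFrame."""
    # header=None keeps the first row as data instead of using it as column names.
    frame = pd.read_excel(source_path, sheet_name=source_sheet, header=None)
    if os.path.exists(target_path):
        # Append to the existing workbook, replacing a sheet with the same name.
        writer = pd.ExcelWriter(target_path, engine="openpyxl", mode="a",
                                if_sheet_exists="replace")
    else:
        writer = pd.ExcelWriter(target_path, engine="openpyxl")
    with writer:
        frame.to_excel(writer, sheet_name=target_sheet, index=False, header=False)


# Demonstration with throwaway files in a temporary directory.
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source.xlsx")
target_path = os.path.join(work_dir, "target.xlsx")
pd.DataFrame([["name", "qty"], ["apple", 3]]).to_excel(
    source_path, sheet_name="Data", index=False, header=False)

copy_sheet_via_pandas(source_path, "Data", target_path, "Copied")
result = pd.read_excel(target_path, sheet_name="Copied", header=None)
```

Reading with header=None and writing with header=False and index=False keeps the sheet contents cell for cell, rather than letting pandas add or consume a header row.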
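Solution 11:
In Oscar's solution, using deepcopy on source_sheet.row_dimensions instead of the loop eliminates the LibreOffice error: "The data could not be loaded completely because the maximum number of rows per sheet was exceeded."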
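Solution 12:
Actually, this can be done in a very simple way!
It takes just 3 steps:
Open the file with load_workbook
wb = load_workbook('File_1.xlsx')
Select the sheet you want to copy
ws = wb.active
Save the file under the name of the new file
wb.save('New_file.xlsx')
This code saves the sheets of the first file (File_1.xlsx) to the second file (New_file.xlsx).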