1, Foreword
Hello, I am an advanced Python player. A while ago, I shared the python automation article: Teach you how to use Python to easily split Excel into multiple CSV files,Teach you four ways to implement multi Excel and multi Sheet merging in batch with Python Later, in the Python advanced communication group, a fan named [playing cat] encountered a problem. She has many excel tables, and there are multiple sheet s in multiple excel. Now, these Excel files need to be merged.

Using the method corresponding to the above link to merge, it is found that only sheets can be merged, and others cannot be merged. This is really a problem.

Admittedly, it is feasible to open copy and paste one by one, but this method is time-consuming, laborious and error prone. Several files can also be manually processed. If there are dozens or even hundreds of files, you will be blind. However, for Python, so easy, let's have a look!
2, Project objectives
Use Python to realize the merging of multiple excel and multiple sheets. For the practical needs, use Python to batch merge all Sheet data in Excel files under all subfolders in the same folder. This demand is quite common in real life, and all of them are practical.
3, Project preparation
Software: PyCharm
Required libraries: pandas, os, glob
4, Project analysis
1) How to select Excel files to merge?
Use os and glob to obtain all Excel files to be merged.
2) How to select sheets to merge?
Use pandas library to read Excel and obtain the Sheet name to be merged.
3) How to merge?
Using pandas library, open all Sheet names one by one, and add and merge data through concat() function.
4) How do I save a file?
Using to_excel saves the data to get the final merged target file.
5, Project realization
Here are four methods for you. One is more concise than the other. The last three methods are provided by [Xiaoming boss], which is really too strong.
1. Method 1
This method is shared by [Wang Ning] boss. There are a lot of codes, but it is also a hands-on tutorial. It is very detailed and has comments. For details, you can stamp this article: Liberal arts students learn Python by themselves - batch summarize all Sheet data in all Excel files in the same path - basic knowledge 1.41 , the code is as follows.
# -*- coding: utf-8 -*- import pandas as pd import datetime import os # define a starting point of time start = datetime.datetime.now() def Set_Work_Path(x): try: os.chdir(x) route = os.getcwd() print(route) return route except Exception: print("No Result") work_path = r"E:\\PythonCrawler\\python_crawler-master\\MergeExcelSheet\\file\\" Set_Work_Path(work_path) # define a function to get all the xlsx file names after deleting old file if there. def Get_Dedicated_4Letter_File_List(x): path = os.getcwd() old_name = path + os.sep + "Summary data" + ".xlsx" # dim a txt name if os.path.exists(old_name): os.remove(old_name) files = os.listdir(path) # print(files) #check all files name in the path current_list = [] for i in range(0, len(files), 1): try: if files[i][-4:] == x and files[i][:4] != "Summary data": current_list.append(files[i]) except Exception: pass return current_list Current_Excel_list = Get_Dedicated_4Letter_File_List("xlsx") print(Current_Excel_list) # define a function to read all sheets one by one in excel file def Get_All_Sheets_Excel(x): file = pd.ExcelFile(x) list_sht_name = file.sheet_names # get list of sheets' names print(list_sht_name) list_sht_data = [] # get all sheet data sets into a list for i in range(0, len(list_sht_name), 1): list_sht_data.append(pd.read_excel(x, header=0, sheet_name=list_sht_name[i], index_col=None)) # merge all data sets together df = pd.concat(list_sht_data) # delete blank data df.dropna(axis=0, how="all", inplace=True) print(df) return df # define a list to get all data from sheets from different excel files data_list = [] for i in range(0, len(Current_Excel_list), 1): # print(Current_Excel_list[i]) data_list.append(Get_All_Sheets_Excel(Current_Excel_list[i])) data = pd.concat(data_list) data.dropna(axis=0, how="all", inplace=True) print(data) # save the data into excel file writer = pd.ExcelWriter("Summary data of Wang Ning.xlsx") data.to_excel(writer, encoding="utf_8_sig", sheet_name="DATA", index=False) # get the target pivot datasets writer.save() end = datetime.datetime.now() run_time = round((end-start).total_seconds()/60, 2) show = "The running time of the program is: %s minute" % run_time+",Done!" print(show)
The above code has strict requirements on the original data. The prerequisite is that all data are standard data sources and the field names and data structures are the same. In this way, there are still some limitations. But don't panic. The three methods of [Xiaoming boss] don't have this restriction. Let's have a look!
2. Method II
The following code is based on the improvement of the single sheet table consolidation code provided by [Xiaoming boss]. The key point is to merge sheets_ With the parameter name=none, it means to get all the sheets in the Excel file, which returns a dictionary. When traversing later, all values are obtained in the form of a dictionary. After that, in the place of 15 lines, it should be noted that the extend() method is used for adding. If the append() method is used, only the consolidated results of the last table will be obtained. This pit has been trodden by the editor himself, Interested friends can also step into the pit.
# -*- coding: utf-8 -*- import os import pandas as pd result = [] path = r"E:\\PythonCrawler\\python_crawler-master\\MergeExcelSheet\\testfile\\file" for root, dirs, files in os.walk(path, topdown=False): for name in files: if name.endswith(".xls") or name.endswith(".xlsx"): df = pd.read_excel(os.path.join(root, name), sheet_name=None) result.append(df) data_list = [] for data in result: # print(data.values()) data_list.extend(data.values()) # Note that this is the extend() function, not the append() function df = pd.concat(data_list) df.to_excel("testfile Merge all tables.xlsx", index=False) print("Merge complete!")
3. Method III
The following code is a code written by [Xiaoming boss], which uses the list append() method. Although the efficiency will be lower, it is still easy to process hundreds of thousands of files.
It should be noted that lines 6 and 7 in the code obtain the file path, where * * represents the recursion of sub files under the folder. The other is xls*, this is a regular writing method, which means that you can handle both xls format and xlsx format Excel files. It's really wonderful!
# -*- coding: utf-8 -*- import glob import pandas as pd path = "E:\\PythonCrawler\\python_crawler-master\\MergeExcelSheet\\file\\" data = [] for excel_file in glob.glob(f'{path}/**/[!~]*.xls*'): # for excel_file in glob.glob(f'{path}/[!~]*.xlsx'): excel = pd.ExcelFile(excel_file) for sheet_name in excel.sheet_names: df = excel.parse(sheet_name) data.append(df) # print(data) df = pd.concat(data, ignore_index=True) df.to_excel("Code provided by Xiaoming(Merge multiple tables)--glob and pandas Library list append method--Merge all tables.xlsx", index=False) print("Merge complete!")
4. Method 4
The following code is another code of the [Xiaoming boss] manual, using sheet_name=None and list extend() methods, set sheet_ With the parameter name=none, it means to obtain all the sheets in the Excel file, which returns a dictionary. When traversing later, all values are obtained in the form of a dictionary, which is more efficient than the previous methods.
It should be noted that lines 6 and 7 in the code obtain the file path, where * * represents the recursion of sub files under the folder. The other is xls*, this is a regular writing method, which means that you can handle both xls format and xlsx format Excel files. It's really wonderful!
# -*- coding: utf-8 -*- import glob import pandas as pd path = r"E:\PythonCrawler\python_crawler-master\MergeExcelSheet\file" data = [] # for excel_file in glob.glob(f'{path}/**/[!~]*.xlsx'): for excel_file in glob.glob(f'{path}/[!~]*.xlsx'): dfs = pd.read_excel(excel_file, sheet_name=None).values() data.extend(dfs) # print(data) df = pd.concat(data, ignore_index=True) df.to_excel("Code provided by Xiaoming(Merge multiple tables)--glob and pandas Library list extend method--concise--Merge all tables.xlsx", index=False) print("Merge complete!")
6, Effect display
1. Excel data before processing:

2. Operation progress prompt:

3. Consolidated results:

7, Summary
Based on the actual work and Python programming, this paper introduces four methods to realize the batch consolidation of all Sheet data in Excel files under all subfolders in the same folder, which reduces the trouble of copying and pasting, saves time and effort, and is not easy to make mistakes. There is not much code, and the loop addition is a bit convoluted. Just think about it. If you don't understand it, leave a message and ask questions at any time. Let's learn and progress together.
If you have ideas, you can also package the code in the article into an exe executable software, which can be packaged and sent to others for use. You can also earn tips. The tutorial on packaging will not be repeated here. Welcome to: Three Pyinstaller packaging commands you must remember -- using Python to package exe.
Finally, I would like to thank [Xiao Xiaoming], [Wang Ning] and [black and white] for their code, suggestions and guidance in the process of code implementation, and fans [playing cat] for their questions, so that we can make progress together!