Excel format (.xlsx) is one of the most common document formats when you’re dealing with data analysis. I generally prefer CSV document format, but sometimes you have to deal with Excel document too. One of the main differences between an Excel and CSV format is you can have up to several sheets in an Excel document, meanwhile in CSV document there is no sheet.
We are going to run through both reading Excel document from certain sheet only as well as reading Excel document from all available sheets.
Here is how you can import your data from your Excel file to pandas data frame.
Importing data from certain sheet only
Sometimes you only need to read certain sheet only. You can go straight putting the sheet name there, but in case you still don’t know all the available sheet names, you can print them all first.
Using xlrd
To do this method, you have to install xlrd first.
!pip install xlrd
After installing xlrd
you can import and use it as follows. The variable sheets
will contain all your Excel’s sheet names in a Python list.
import xlrd
fname = ""
xls = xlrd.open_workbook(fname, on_demand=True)
sheets = xls.sheet_names()
Then, you declare the list that will contain all the sheet names you want to read.
selected_sheets = ["sheets", "you", "want", "to", "read"]
After that, you can import certain sheet(s) from the Excel file.
import pandas as pd
d = []
for sheet in selected_sheets:
temp = pd.read_excel(fname, sheet_name=sheet)
d.append(temp)
df = pd.concat(d, axis=0, ignore_index=True)
Importing data from all sheets to one pandas data frame
If you want to read all sheets in Excel file to one huge pandas data frame, it’s even simpler as you only need pandas to do that.
import pandas as pd
sheets = pd.read_excel(fname, sheet_name=None)
df = pd.concat(sheets, axis=0, ignore_index=True)
The code above works by listing all sheets available and putting them in a list called sheets
. Then you concatenate the data in the list into one data frame.
I hope you find this post useful.
References
https://stackoverflow.com/questions/53877687/how-can-i-concat-multiple-dataframes-in-python
Panda are great visualization tool. It’s very helpful tool with python.
I’m python blogger,you can visit https://vipulkunwar503.code.blog
Hey, there. A little correction, pandas isn’t a visualization tool, it’s a library for data analysis. You may want to use Matplotlib or Plotly for visualization. Thanks for visiting.
Thanks, for correction
It’s great,Catris.