Skip to content

Importing An Excel File to Pandas Data Frame

Excel format (.xlsx) is one of the most common document formats when you’re dealing with data analysis. I generally prefer CSV document format, but sometimes you have to deal with Excel document too. One of the main differences between an Excel and CSV format is you can have up to several sheets in an Excel document, meanwhile in CSV document there is no sheet.

We are going to run through both reading Excel document from certain sheet only as well as reading Excel document from all available sheets.

Here is how you can import your data from your Excel file to pandas data frame.

Importing data from certain sheet only

Sometimes you only need to read certain sheet only. You can go straight putting the sheet name there, but in case you still don’t know all the available sheet names, you can print them all first.

Using xlrd

To do this method, you have to install xlrd first.

!pip install xlrd

After installing xlrd you can import and use it as follows. The variable sheets will contain all your Excel’s sheet names in a Python list.

import xlrd

fname = ""
xls = xlrd.open_workbook(fname, on_demand=True)
sheets = xls.sheet_names()

Then, you declare the list that will contain all the sheet names you want to read.

selected_sheets = ["sheets", "you", "want", "to", "read"]

After that, you can import certain sheet(s) from the Excel file.

import pandas as pd

d = []
for sheet in selected_sheets:
    temp = pd.read_excel(fname, sheet_name=sheet)
    d.append(temp)


df = pd.concat(d, axis=0, ignore_index=True)

Importing data from all sheets to one pandas data frame

If you want to read all sheets in Excel file to one huge pandas data frame, it’s even simpler as you only need pandas to do that.

import pandas as pd

sheets = pd.read_excel(fname, sheet_name=None)
df = pd.concat(sheets, axis=0, ignore_index=True)

The code above works by listing all sheets available and putting them in a list called sheets. Then you concatenate the data in the list into one data frame.

I hope you find this post useful.

References

https://stackoverflow.com/questions/12250024/how-to-obtain-sheet-names-from-xls-files-without-loading-the-whole-file

https://stackoverflow.com/questions/53877687/how-can-i-concat-multiple-dataframes-in-python

Advertisements

4 thoughts on “Importing An Excel File to Pandas Data Frame”

    1. Hey, there. A little correction, pandas isn’t a visualization tool, it’s a library for data analysis. You may want to use Matplotlib or Plotly for visualization. Thanks for visiting.

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: