Skip to content

Excel Reader

A class to read and extract data from Excel files (.xlsx, .xlsm, .xlsb).

This class provides methods to load workbooks, extract data from sheets, tables, and named ranges, with robust error handling.

__init__(file_path)

Initializes the ExcelReader with the path to the Excel file.

Parameters:

Name Type Description Default
file_path str | Path

Path to the Excel file, specified as a Path object or string.

required

Raises:

Type Description
ImportError

If the file does not exist, is not a valid Excel file, or if the openpyxl library cannot be imported.

extract_data(source_name, source_type)

Extracts data from an Excel file based on the source type and name.

Expected the specified source type to be either: - An entire worksheet with the header in the first row (sheet) - A named range (range) - An Excel table (table)

Parameters:

Name Type Description Default
source_name str

Name of the sheet, table, or range to extract.

required
source_type str

Type of source ('sheet', 'table', or 'range').

required

Raises:

Type Description
ImportError

If the source type is invalid or if there are issues extracting data.

Returns:

Name Type Description
data dict

Data extracted as a dictionary.

extract_from_range(range_name)

Extracts a table from an Excel file and returns it as a DataFrame.

Parameters:

Name Type Description Default
range_name str

Name of the range to extract.

required

Raises:

Type Description
ImportError

If the table cannot be loaded or if there are issues with the file.

Returns:

Name Type Description
data DataFrame

A DataFrame containing the range data.

extract_from_sheet(sheet_name)

Extracts a sheet from an Excel file and returns it as a DataFrame.

Parameters:

Name Type Description Default
sheet_name str

Name of the sheet to extract.

required

Raises:

Type Description
ImportError

If the sheet cannot be loaded or if there are issues with the file.

Returns:

Name Type Description
data DataFrame

A DataFrame containing the sheet data.

extract_from_table(table_name)

Extracts a table from an Excel file and returns it as a DataFrame.

Parameters:

Name Type Description Default
table_name str

Name of the table to extract.

required

Raises:

Type Description
ImportError

If the table cannot be loaded or if there are issues with the file.

Returns:

Name Type Description
data DataFrame

A DataFrame containing the table data.

load_excel_workbook(data_only=True, read_only=False)

Load an Excel workbook with robust error handling.

Parameters:

Name Type Description Default
data_only Optional[bool]

Whether to return cell values (not formulas).

True
read_only Optional[bool]

Use openpyxl's read-only mode for large files.

False

Raises:

Type Description
ImportError

If the file cannot be loaded due to various reasons (e.g., file not found, permission denied, invalid format).

Returns:

Name Type Description
object workbook

Workbook object or None if loading fails.