CSV Reader

CSVReader class for extracting data from CSV files and updating them.

The header_config argument to __init__() is a list of dict objects that defines the expected structure of the CSV file. Each dict object in the list should have the following keys:

name: The name of the column.
type: The type of the column, which can be 'str', 'int', 'float', 'date', 'datetime' or 'time'.
format (optional): A string that defines the format for date, datetime, time or float types (e.g., "%Y-%m-%d" for date or ".2f" for float).
match (optional): A boolean indicating if this column should be used for matching records in the merge_data_sets() function.
sort (optional): An integer indicating the sort order of the column.
minimum (optional): A date, datetime or int that defines a minimum value for filtering by date or datetime.

For example, consider this CSV file:

Symbol,Date,Name,Currency,Price
ACM0006AU,2025-04-28,AB Managed Volatility Equities,AUD,1.82
CSA0038AU,2025-04-28,Bentham Global Income,AUD,1.00
ETL0018AU,2025-04-28,PIMCO Global Bond Wholesale,AUD,0.90

The header configuration might look like this:

header_config = [
    {
        "name": "Symbol",
        "type": "str",
        "match": True,
        "sort": 2,
    },
    {
        "name": "Date",
        "type": "date",
        "format": "%Y-%m-%d",
        "match": True,
        "sort": 1,
        "minimum": None,
    },
    {
        "name": "Name",
        "type": "str",
    },
    {
        "name": "Currency",
        "type": "str",
    },
    {
        "name": "Price",
        "type": "float",
        "format": ".2f",
    },
]
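To illustrate how a configuration like this can drive type conversion, here is a minimal sketch using only the standard library. It is not the CSVReader implementation: `convert_row` is a hypothetical helper, and the way it handles each `type` is an assumption based on the key descriptions above.

```python
import csv
import io
from datetime import datetime

# Hypothetical helper (not part of CSVReader): convert one raw CSV row,
# where every value is a string, into typed values using header_config.
def convert_row(raw_row, header_config):
    converted = {}
    for column in header_config:
        name = column["name"]
        col_type = column["type"]
        value = raw_row[name]
        if col_type == "int":
            converted[name] = int(value)
        elif col_type == "float":
            converted[name] = float(value)
        elif col_type == "date":
            converted[name] = datetime.strptime(value, column["format"]).date()
        elif col_type == "datetime":
            converted[name] = datetime.strptime(value, column["format"])
        elif col_type == "time":
            converted[name] = datetime.strptime(value, column["format"]).time()
        else:
            converted[name] = value  # 'str' values are kept unchanged
    return converted

header_config = [
    {"name": "Symbol", "type": "str", "match": True, "sort": 2},
    {"name": "Date", "type": "date", "format": "%Y-%m-%d", "match": True, "sort": 1},
    {"name": "Name", "type": "str"},
    {"name": "Currency", "type": "str"},
    {"name": "Price", "type": "float", "format": ".2f"},
]

csv_text = (
    "Symbol,Date,Name,Currency,Price\n"
    "ACM0006AU,2025-04-28,AB Managed Volatility Equities,AUD,1.82\n"
)
rows = [convert_row(raw, header_config) for raw in csv.DictReader(io.StringIO(csv_text))]
print(rows[0]["Date"], rows[0]["Price"])
```

A conversion failure in this sketch surfaces as a `ValueError` from `int()`, `float()` or `strptime()`, which is in line with the `ValueError` documented for `read_csv()`.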

Class for reading and writing CSV files with header configuration.

`__init__(file_path, header_config=None)`

Initialize the CSVReader with the file path.

Parameters:

Name	Type	Description	Default
`file_path`	`Path \| str`	The path to the CSV file. If the file does not exist, it won't be created.	required
`header_config`	`Optional[list[dict]]`	The header configuration for the CSV file.	`None`

Raises:

Type	Description
`TypeError`	If header_config is not structured correctly.
`ImportError`	If the file exists but does not have a valid extension.

`merge_data_sets(primary_list, append_list)`

Merges two lists of dictionaries based on header configuration match fields and sorts the result.

Parameters:

Name	Type	Description	Default
`primary_list`	`list[dict]`	The primary list of dictionaries.	required
`append_list`	`list[dict]`	The list of dictionaries to append or override.	required

Raises:

Type	Description
`ValueError`	If the dictionaries in the lists do not have the same structure.

Returns:

Type	Description
`list[dict]`	The merged and sorted list of dictionaries.
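The merge can be pictured as keying each row on its `match` columns and then sorting on the `sort` columns. The sketch below is an illustration, not the library's code. `merge_on_match_keys` is a hypothetical function, and it assumes that rows from `append_list` replace rows in `primary_list` with the same match values and that a lower `sort` number means a higher sort priority.

```python
from datetime import date

# Hypothetical function (not part of CSVReader) sketching a keyed merge.
def merge_on_match_keys(primary_list, append_list, header_config):
    match_keys = [c["name"] for c in header_config if c.get("match")]
    # Assumption: columns are sorted in ascending order of their 'sort' value.
    sort_columns = sorted(
        (c for c in header_config if c.get("sort") is not None),
        key=lambda c: c["sort"],
    )
    sort_keys = [c["name"] for c in sort_columns]

    merged = {}
    for row in primary_list + append_list:
        # Later rows overwrite earlier rows that share the same match values.
        merged[tuple(row[k] for k in match_keys)] = row
    return sorted(merged.values(), key=lambda row: tuple(row[k] for k in sort_keys))

header_config = [
    {"name": "Symbol", "type": "str", "match": True, "sort": 2},
    {"name": "Date", "type": "date", "format": "%Y-%m-%d", "match": True, "sort": 1},
    {"name": "Price", "type": "float", "format": ".2f"},
]
primary = [
    {"Symbol": "CSA0038AU", "Date": date(2025, 4, 28), "Price": 1.00},
    {"Symbol": "ACM0006AU", "Date": date(2025, 4, 28), "Price": 1.82},
]
extra = [
    {"Symbol": "CSA0038AU", "Date": date(2025, 4, 28), "Price": 1.01},  # overrides
    {"Symbol": "ACM0006AU", "Date": date(2025, 4, 29), "Price": 1.83},  # appended
]
result = merge_on_match_keys(primary, extra, header_config)
for row in result:
    print(row["Date"], row["Symbol"], row["Price"])
```

With this data the result has three rows: the two 2025-04-28 rows (with the CSA0038AU price replaced by 1.01) followed by the new 2025-04-29 row.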

`read_csv()`

Read the CSV file and return its content.

If the file does not exist, returns None. If the file has a header but no data, returns an empty list.

Raises:

Type	Description
`ImportError`	If the CSV file is empty or has no header.
`ValueError`	If a value read from the CSV file cannot be converted to the expected type as defined in header_config.

Returns:

Name	Type	Description
`data`	`list[dict]`	A list of rows from the CSV file, or None if the file does not exist.
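Callers need to tell apart the three outcomes described above: a missing file, a header-only file, and an empty file. The stand-in below mimics that contract with the standard `csv` module so the difference can be seen in isolation. `read_rows` is a hypothetical function, not the library's `read_csv()`, and it performs no type conversion.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical stand-in (not CSVReader.read_csv) that follows the same
# return and error conventions as described above.
def read_rows(file_path):
    path = Path(file_path)
    if not path.exists():
        return None  # missing file
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None:
            raise ImportError(f"{path} is empty or has no header")
        return list(reader)  # an empty list when only the header is present

with tempfile.TemporaryDirectory() as tmp:
    missing = read_rows(Path(tmp) / "missing.csv")

    header_only_path = Path(tmp) / "header_only.csv"
    header_only_path.write_text("Symbol,Date\n", encoding="utf-8")
    header_only = read_rows(header_only_path)

print(missing, header_only)  # None []
```

Because an empty list and None are both falsy, a check such as `if data is None:` is safer than `if not data:` when the two cases need different handling.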

`sort_csv_data(csv_data)`

Sort the CSV data based on the header configuration.

Parameters:

Name	Type	Description	Default
`csv_data`	`list[dict]`	The data read from the CSV file.	required

Returns:

Type	Description
`list[dict]`	The sorted data.

`trim_csv_data(csv_data, max_lines=None, max_days=None)`

Trim the CSV data based on the header configuration and, optionally, the max_lines and max_days arguments.

Parameters:

Name	Type	Description	Default
`csv_data`	`list[dict]`	The data read from the CSV file.	required
`max_lines`	`Optional(int)`	If provided, the maximum number of lines to return from csv_data. If >0, the first max_lines lines are returned; if <0, all but the last abs(max_lines) lines are returned. If None, no trimming is done.	`None`
`max_days`	`Optional(int)`	If provided, the maximum number of days to keep in the data based on date headers with 'minimum' set as an int. This overrides any 'minimum' values in the header configuration.	`None`

Returns:

Type	Description
`list[dict]`	The trimmed data.
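The `max_lines` rule described above maps directly onto Python slicing, which the following sketch shows. `trim_rows` is a hypothetical function, not the library's code. Its `date_column` and `today` parameters are illustrative, and its reading of `max_days` (keep rows dated no earlier than `max_days` days before a reference date) is an assumption. The description does not cover `max_lines=0`, so the sketch leaves the data untrimmed in that case.

```python
from datetime import date, timedelta

# Hypothetical function (not part of CSVReader) sketching the trimming rules.
def trim_rows(rows, max_lines=None, max_days=None, date_column="Date", today=None):
    trimmed = list(rows)
    if max_days is not None:
        # Assumption: rows dated on or after the cutoff are kept.
        cutoff = (today or date.today()) - timedelta(days=max_days)
        trimmed = [row for row in trimmed if row[date_column] >= cutoff]
    if max_lines:
        # rows[:n] keeps the first n rows when n > 0 and drops the
        # last abs(n) rows when n < 0.
        trimmed = trimmed[:max_lines]
    return trimmed

# Five rows dated 2025-04-24 through 2025-04-28.
rows = [{"Date": date(2025, 4, 24 + i)} for i in range(5)]

first_two = trim_rows(rows, max_lines=2)
all_but_last_two = trim_rows(rows, max_lines=-2)
recent = trim_rows(rows, max_days=2, today=date(2025, 4, 28))
print(len(first_two), len(all_but_last_two), len(recent))  # 2 3 3
```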

`update_csv_file(new_data, new_filename=None, max_lines=None, max_days=None)`

Appends or merges the new_data into an existing CSV file. If the file does not exist, it will be created.

This function will also sort and trim the combined data according to the header configuration.

Parameters:

Name	Type	Description	Default
`new_data`	`list[dict]`	The new data to append or merge.	required
`new_filename`	`Optional[Path \| str]`	If provided, the data will be written to this file instead of the original file.	`None`
`max_lines`	`Optional(int)`	If provided, the maximum number of lines to keep from the combined data. If >0, the first max_lines lines are kept; if <0, all but the last abs(max_lines) lines are kept. If None, no trimming is done.	`None`
`max_days`	`Optional(int)`	If provided, the maximum number of days to keep in the data based on date headers with 'minimum' set as an int. This overrides any 'minimum' values in the header configuration.	`None`

Raises:

Type	Description
`RuntimeError`	If there is a problem processing the data.

Returns:

Name	Type	Description
`merged_data`	`list[dict]`	The merged and sorted data after appending or merging the new_data.

`write_csv(data, new_filename=None)`

Write data to the CSV file.

If the file does not exist, it will be created.
If the file exists, it will be overwritten.
The header will be written based on the header_config.
The data will be written in the order of the header_config.
If a header in the data does not exist in the header_config, it will be ignored.
If a header in the header_config does not exist in the data, a ValueError is raised.
Date fields are formatted according to the format specified in header_config.

Parameters:

Name	Type	Description	Default
`data`	`list[dict]`	The data to write to the CSV file.	required
`new_filename`	`Optional[Path \| str]`	If provided, the data will be written to this file instead of the original file.	`None`

Raises:

Type	Description
`ValueError`	If the data is empty or if a header in header_config is not found in the data.

Returns:

Type	Description
`bool`	True if the data was written successfully, False otherwise.
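The writing rules above (column order from header_config, extra keys ignored, missing keys rejected, values formatted) can be sketched with `csv.DictWriter`. This is an illustration rather than the library's implementation: `rows_to_csv_text` is a hypothetical function that returns the CSV text instead of writing a file, and applying `format` to float columns as well as date columns is an assumption drawn from the header_config description.

```python
import csv
import io
from datetime import date

# Hypothetical function (not CSVReader.write_csv): render rows as CSV text.
def rows_to_csv_text(data, header_config):
    if not data:
        raise ValueError("data is empty")
    fieldnames = [column["name"] for column in header_config]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in data:
        out = {}
        for column in header_config:
            name = column["name"]
            if name not in row:
                raise ValueError(f"Header {name!r} not found in data")
            value = row[name]
            fmt = column.get("format")
            if fmt and column["type"] in ("date", "datetime", "time"):
                value = value.strftime(fmt)
            elif fmt and column["type"] == "float":
                value = format(value, fmt)
            out[name] = value  # keys absent from header_config are never copied
        writer.writerow(out)
    return buffer.getvalue()

header_config = [
    {"name": "Symbol", "type": "str"},
    {"name": "Date", "type": "date", "format": "%Y-%m-%d"},
    {"name": "Price", "type": "float", "format": ".2f"},
]
data = [{"Symbol": "CSA0038AU", "Date": date(2025, 4, 28), "Price": 1.0, "Extra": "ignored"}]
csv_text = rows_to_csv_text(data, header_config)
print(csv_text)
```

The `Extra` key is dropped because only columns named in header_config are copied into the output row, while a row lacking `Price` would raise `ValueError`.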
