Overview
Purpose
Load a CSV file into a pandas DataFrame with a restricted set of options and explicit checks on required columns and missing values.
Parameters
path: path of the file to read.
sep: column delimiter passed to pandas.read_csv.
decimal: decimal separator, useful for files that use the Italian format (a comma as the decimal mark).
rename_columns: {source_name: destination_name} mapping applied after loading.
required_columns: list of columns that must exist after the optional rename step.
missing: policy for NaN values. It can be "error", "drop", or "allow".
comment: optional comment character for read_csv.
skip_initial_space: if true, ignores spaces immediately after the delimiter.
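As a rough illustration of what these options correspond to in pandas, the sketch below calls pandas.read_csv directly on an in-memory sample. The sample text is invented for this example, and the mapping of skip_initial_space onto the pandas keyword skipinitialspace is an assumption; the snippet does not show the internals of load_csv.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-delimited, comma as decimal mark,
# one comment line, and a space after each delimiter.
raw = (
    "# measurement log\n"
    "misura_n; lunghezza_mm; sigma_mm\n"
    "1; 10,5; 0,2\n"
    "2; 11,0; 0,3\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",                # column delimiter
    decimal=",",            # parse "10,5" as 10.5
    comment="#",            # skip lines that start with the comment character
    skipinitialspace=True,  # drop the space that follows each delimiter
)

# Rename step, applied after loading (same mapping as the example on this page).
df = df.rename(
    columns={"misura_n": "n", "lunghezza_mm": "lunghezza", "sigma_mm": "sigma"}
)
```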
Returns
A pd.DataFrame with the loaded data, with columns already renamed and incomplete rows already dropped when those options are requested.
Errors and exceptions
ValueError if missing is not a supported policy.
ValueError if one or more required columns are missing.
ValueError if the file contains missing values and missing="error".
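The three checks can be sketched as a small standalone helper. This is a minimal sketch, not the code of load_csv: the name validate_frame, the default missing="error", the order of the checks, and the error messages are all assumptions made for illustration.

```python
import pandas as pd

SUPPORTED_MISSING = ("error", "drop", "allow")


def validate_frame(df, required_columns=None, missing="error"):
    """Hypothetical helper that mirrors the documented checks."""
    # 1. Reject unsupported policies.
    if missing not in SUPPORTED_MISSING:
        raise ValueError(f"unsupported missing policy: {missing!r}")

    # 2. Every required column must be present.
    absent = [c for c in (required_columns or []) if c not in df.columns]
    if absent:
        raise ValueError(f"missing required columns: {absent}")

    # 3. Apply the missing-value policy.
    if missing == "error" and df.isna().any().any():
        raise ValueError("the data contains missing values")
    if missing == "drop":
        df = df.dropna()
    return df
```

A caller that prefers to handle bad input explicitly can wrap the call in try/except ValueError and report the message to the user.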
Example
from mespy import load_csv

df = load_csv(
    "data/reference/test_misure.csv",
    rename_columns={"misura_n": "n", "lunghezza_mm": "lunghezza", "sigma_mm": "sigma"},
    required_columns=["n", "lunghezza", "sigma"],
    missing="drop",
)
Practical notes
The required_columns check happens after rename_columns.
missing="drop" removes incomplete rows with DataFrame.dropna().
The function does not directly convert the DataFrame into numeric arrays: that step remains the responsibility of the statistics and plotting functions.
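Because the conversion to numeric arrays is left to the caller, downstream code might do it along these lines. The DataFrame here is built by hand with invented values and the column names from the example above; how the statistics and plotting functions actually perform this step is not documented on this page, so treat the snippet as one possible approach.

```python
import pandas as pd

# Hand-built stand-in for a loaded frame; the values are invented.
df = pd.DataFrame(
    {
        "n": [1, 2, 3],
        "lunghezza": [10.5, None, 11.0],
        "sigma": [0.2, 0.3, 0.3],
    }
)

# Same effect as missing="drop": rows with any NaN are removed.
clean = df.dropna()

# Explicit conversion to NumPy float arrays for numeric work.
lunghezza = clean["lunghezza"].to_numpy(dtype=float)
sigma = clean["sigma"].to_numpy(dtype=float)
```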