Implementation
This page describes the internal flow of covariance and the order in which its steps are applied: separate validation of x and y, the shape compatibility check, optional weight validation, and the final covariance calculation. Unlike Overview, the goal here is not to repeat the parameters but to show how the function actually builds the result.
The following snippets are taken from the current src/mespy/stats_utils.py implementation. Private helpers are mentioned only to clarify the flow; complete details are documented in _as_float_vector and _validate_weights.
Execution sequence
The implementation follows this sequence:
1. Converts x and y into one-dimensional, finite float64 vectors.
2. Checks that the two vectors have the same shape.
3. If w is provided, validates it against the shape of x.
4. If w is None, computes E[xy], E[x], and E[y] as simple means.
5. If w is provided, computes the same quantities as weighted means using the same weight vector.
6. Returns the difference mean_xy - mean_x * mean_y.
Input validation and identity used
def covariance(x: ArrayLike, y: ArrayLike, w: ArrayLike | None = None) -> float:
    x_values = _as_float_vector("x", x)
    y_values = _as_float_vector("y", y)
    if x_values.shape != y_values.shape:
        raise ValueError("x e y devono avere la stessa lunghezza")
    weights = _validate_weights(x_values, w)
    if weights is None:
        mean_xy = float(np.mean(x_values * y_values))
        mean_x = float(np.mean(x_values))
        mean_y = float(np.mean(y_values))
    else:
        w_sum = float(np.sum(weights))
        mean_xy = float(np.sum(weights * x_values * y_values) / w_sum)
        mean_x = float(np.sum(weights * x_values) / w_sum)
        mean_y = float(np.sum(weights * y_values) / w_sum)
    return float(mean_xy - mean_x * mean_y)
The first part of the flow is entirely dedicated to putting x and y on the same numeric footing.
The two inputs are validated separately, so both must be one-dimensional, non-empty, and composed of finite values.
Only after this normalization does the function check that x_values.shape == y_values.shape. If the vectors are not compatible, the calculation does not even start.
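The validation order described above can be illustrated with a small stand-in. The helpers below, as_float_vector_sketch and check_pair_sketch, are hypothetical and are not the library's _as_float_vector. They only mimic the documented requirements (one-dimensional, non-empty, finite), and their error messages are invented for this example.

```python
import numpy as np


def as_float_vector_sketch(name, values):
    """Hypothetical stand-in for the documented checks; not mespy's helper."""
    arr = np.asarray(values, dtype=np.float64)
    if arr.ndim != 1:
        raise ValueError(f"{name} must be one-dimensional")
    if arr.size == 0:
        raise ValueError(f"{name} must not be empty")
    if not np.all(np.isfinite(arr)):
        raise ValueError(f"{name} must contain only finite values")
    return arr


def check_pair_sketch(x, y):
    """Validate x and y separately, then compare their shapes."""
    x_values = as_float_vector_sketch("x", x)
    y_values = as_float_vector_sketch("y", y)
    # The shape comparison runs only after both inputs passed validation.
    if x_values.shape != y_values.shape:
        raise ValueError("x and y must have the same length")
    return x_values, y_values
```

Under these assumptions, an invalid x is reported before y is inspected, and a length mismatch is reported only when both inputs are individually valid.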
Implemented formula
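The function uses the identity
$$\operatorname{Cov}(x, y) = E[xy] - E[x]\,E[y]$$
In the unweighted case this means
$$E[xy] = \frac{1}{n}\sum_{i=1}^{n} x_i y_i, \qquad E[x] = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad E[y] = \frac{1}{n}\sum_{i=1}^{n} y_i$$
In the weighted case the three means are replaced by
$$E_w[xy] = \frac{\sum_{i} w_i x_i y_i}{\sum_{i} w_i}, \qquad E_w[x] = \frac{\sum_{i} w_i x_i}{\sum_{i} w_i}, \qquad E_w[y] = \frac{\sum_{i} w_i y_i}{\sum_{i} w_i}$$
In both cases the final value remains
$$\mathrm{mean\_xy} - \mathrm{mean\_x} \cdot \mathrm{mean\_y}$$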
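As a sanity check on the identity, the sketch below recomputes both branches directly with NumPy and compares them with np.cov using ddof=0. It does not call mespy. The random data, the seed, and the variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)
w = rng.uniform(0.5, 2.0, size=50)  # strictly positive weights

# Unweighted branch: E[xy] - E[x]E[y] with simple means.
cov_plain = np.mean(x * y) - np.mean(x) * np.mean(y)

# Weighted branch: the same identity, every mean divided by sum(w).
w_sum = np.sum(w)
cov_weighted = (
    np.sum(w * x * y) / w_sum
    - (np.sum(w * x) / w_sum) * (np.sum(w * y) / w_sum)
)

# Reference values: NumPy's covariance matrix with ddof=0 (no correction).
ref_plain = np.cov(x, y, ddof=0)[0, 1]
ref_weighted = np.cov(x, y, aweights=w, ddof=0)[0, 1]

assert np.isclose(cov_plain, ref_plain)
assert np.isclose(cov_weighted, ref_weighted)
```

The agreement holds only for ddof=0. With NumPy's default settings np.cov applies a sample correction, so the values would differ.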
Important interactions and explicit limits
Some implementation choices are worth making explicit.
- The function does not introduce a ddof parameter: it implements only the definition based on E[xy] - E[x]E[y].
- The same weights are used for all the means in the weighted branch; there are no separate weights for x, y, or xy.
- If w is None, the flow does not go through weighted_mean(...): the means are written directly with np.mean(...).
- If w is provided, the code does not normalize it beforehand; it relies on the implicit normalization through division by sum(w).
- Errors on non-finite values or non-positive weights are caught before the final formula, so the function never silently returns a nan covariance.
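Two of these points can be observed numerically. The sketch below restates the weighted branch in a hypothetical function, weighted_cov_sketch, which is not mespy's code. It shows that rescaling all weights by a constant does not change the result, and that the unweighted definition corresponds to ddof=0 rather than the ddof=1 sample estimate.

```python
import numpy as np


def weighted_cov_sketch(x, y, w):
    """Hypothetical restatement of the weighted branch; not mespy's code."""
    x, y, w = (np.asarray(a, dtype=np.float64) for a in (x, y, w))
    w_sum = np.sum(w)
    mean_xy = np.sum(w * x * y) / w_sum
    mean_x = np.sum(w * x) / w_sum
    mean_y = np.sum(w * y) / w_sum
    return float(mean_xy - mean_x * mean_y)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
w = [1.0, 1.0, 2.0]

# Multiplying every weight by the same factor leaves the result unchanged,
# because sum(w) is multiplied by that factor too.
base = weighted_cov_sketch(x, y, w)
scaled = weighted_cov_sketch(x, y, [10.0 * wi for wi in w])
assert np.isclose(base, scaled)

# Without a ddof parameter the unweighted result matches ddof=0;
# the ddof=1 estimate is larger by the factor n / (n - 1).
n = len(x)
population = np.cov(x, y, ddof=0)[0, 1]
sample = np.cov(x, y, ddof=1)[0, 1]
assert np.isclose(sample, population * n / (n - 1))
```

Callers who need the sample covariance would have to apply the n / (n - 1) factor themselves in the unweighted case.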
Commented example
from mespy import covariance
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
w = [1.0, 1.0, 2.0]
print(covariance(x, y))     # simple means
print(covariance(x, y, w))  # weighted means
The two calls share the same mathematical identity but differ in how mean_xy, mean_x, and mean_y are built.
Without weights, every point contributes equally.
With w, all contributions go through the same weighted normalization.
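The arithmetic behind the two calls can be followed step by step. The sketch below does not call mespy. It repeats the documented formulas on the same x, y, and w using exact fractions, so the intermediate values are not affected by floating-point rounding. The variable names are chosen for this example.

```python
from fractions import Fraction

x = [Fraction(1), Fraction(2), Fraction(3)]
y = [Fraction(2), Fraction(4), Fraction(6)]
w = [Fraction(1), Fraction(1), Fraction(2)]

n = len(x)
# Simple means: every point has the same share 1/n.
mean_xy = sum(xi * yi for xi, yi in zip(x, y)) / n  # 28/3
mean_x = sum(x) / n                                  # 2
mean_y = sum(y) / n                                  # 4
cov_plain = mean_xy - mean_x * mean_y                # 4/3

# Weighted means: every sum is divided by the same sum(w) = 4.
w_sum = sum(w)
mean_xy_w = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y)) / w_sum  # 23/2
mean_x_w = sum(wi * xi for wi, xi in zip(w, x)) / w_sum               # 9/4
mean_y_w = sum(wi * yi for wi, yi in zip(w, y)) / w_sum               # 9/2
cov_weighted = mean_xy_w - mean_x_w * mean_y_w                        # 11/8
```

In floating point these correspond to roughly 1.3333 without weights and 1.375 with weights. The weighted value is larger because the last point, which lies furthest from the unweighted mean, counts twice.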