clophfit.testing.evaluation
Functions for evaluating fit quality.
This module provides metrics for evaluating fitting performance, including:
1. Bias (accuracy)
2. Coverage (uncertainty quantification)
3. Residual distribution (goodness of fit)
4. Parameter error analysis
Functions
| calculate_bias | Calculate the bias (mean error) of the estimates. |
| calculate_rmse | Calculate the Root Mean Square Error (RMSE). |
| calculate_coverage | Calculate the coverage probability of the confidence intervals. |
| evaluate_residuals | Evaluate the normality of residuals. |
| extract_params | Extract parameter value and error from a FitResult. |
| load_real_data_paths | Find available real data directories. |
| compare_methods_statistical | Perform statistical comparison between two methods. |
Module Contents
- clophfit.testing.evaluation.calculate_bias(estimated, true_value)
Calculate the bias (mean error) of the estimates.
- Parameters:
estimated (np.ndarray) – Array of estimated values.
true_value (float) – The true value.
- Returns:
The bias (mean of estimated - true_value).
- Return type:
float
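The definition above (mean of estimated - true_value) can be sketched without the library. This is a minimal illustration, not the clophfit implementation: `bias_sketch` and the sample values are hypothetical, and plain lists stand in for NumPy arrays so the snippet runs on the standard library alone.

```python
from statistics import fmean


def bias_sketch(estimated, true_value):
    """Mean signed error: the average of (estimate - true_value)."""
    return fmean(e - true_value for e in estimated)


# Hypothetical estimates of a parameter whose true value is 7.0.
estimates = [6.9, 7.1, 7.3, 7.1]
bias = bias_sketch(estimates, 7.0)
print(f"bias = {bias:.3f}")
```

A positive result means the estimates overshoot the true value on average; a negative one means they undershoot.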
- clophfit.testing.evaluation.calculate_rmse(estimated, true_value)
Calculate the Root Mean Square Error (RMSE).
- Parameters:
estimated (ArrayF) – Array of estimated values.
true_value (float) – The true value.
- Returns:
The RMSE.
- Return type:
float
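As a rough illustration of what RMSE measures against a single true value, the sketch below computes it by hand and checks the standard identity RMSE² = bias² + variance. The name `rmse_sketch` and the data are hypothetical; this is not the library's code, and it uses only the standard library.

```python
import math
from statistics import fmean, pvariance


def rmse_sketch(estimated, true_value):
    """Square root of the mean squared deviation from true_value."""
    return math.sqrt(fmean((e - true_value) ** 2 for e in estimated))


# Hypothetical estimates of a parameter whose true value is 7.0.
estimates = [6.9, 7.1, 7.3, 7.1]
rmse = rmse_sketch(estimates, 7.0)

# RMSE combines systematic error (bias) and scatter (population variance).
bias = fmean(estimates) - 7.0
decomposed = math.sqrt(bias**2 + pvariance(estimates))
print(f"rmse = {rmse:.4f}, sqrt(bias^2 + variance) = {decomposed:.4f}")
```

Because of this decomposition, RMSE is never smaller than the absolute bias, so the two metrics are best read together.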
- clophfit.testing.evaluation.calculate_coverage(estimated, errors, true_value, confidence=0.95)
Calculate the coverage probability of the confidence intervals.
- Parameters:
estimated (ArrayF) – Array of estimated values.
errors (ArrayF) – Array of standard errors (1 sigma).
true_value (float) – The true value.
confidence (float) – The desired confidence level (default: 0.95).
- Returns:
The fraction of intervals that contain the true value.
- Return type:
float
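One way to read the coverage metric: scale each 1-sigma error to the requested confidence level, build a symmetric interval around each estimate, and count how often the true value falls inside. The sketch below assumes a normal (z) multiplier; whether clophfit uses a normal or a Student's t quantile is not stated here, and `coverage_sketch` and its data are hypothetical.

```python
from statistics import NormalDist


def coverage_sketch(estimated, errors, true_value, confidence=0.95):
    """Fraction of intervals est +/- z * err that contain true_value."""
    # Two-sided normal quantile (assumption): about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    hits = [
        abs(est - true_value) <= z * err
        for est, err in zip(estimated, errors)
    ]
    return sum(hits) / len(hits)


# Hypothetical estimates and 1-sigma errors; the true value is 7.0.
estimates = [6.9, 7.1, 7.5, 7.0]
errors = [0.1, 0.1, 0.1, 0.2]
coverage = coverage_sketch(estimates, errors, 7.0)
print(f"coverage = {coverage:.2f}")
```

With well-calibrated errors the result should approach `confidence` as the number of fits grows; a much lower value suggests the reported errors are too small.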
- clophfit.testing.evaluation.evaluate_residuals(residuals)
Evaluate the normality of residuals.
- Parameters:
residuals (np.ndarray) – Array of residuals.
- Returns:
Dictionary containing:
- ‘shapiro_stat’: Shapiro-Wilk test statistic
- ‘shapiro_p’: Shapiro-Wilk p-value
- ‘mean’: Mean of residuals
- ‘std’: Standard deviation of residuals
- Return type:
dict[str, float]
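A dictionary with the keys listed above could be assembled roughly as follows, using `scipy.stats.shapiro` for the Shapiro-Wilk test. This is a sketch under assumptions, not the library's code: `residual_summary_sketch` is a hypothetical name, the residuals are simulated, and the population standard deviation (NumPy's default `ddof=0`) is a guess at the convention used.

```python
import random
from statistics import fmean, pstdev

from scipy import stats


def residual_summary_sketch(residuals):
    """Shapiro-Wilk normality test plus mean and spread of residuals."""
    stat, p_value = stats.shapiro(residuals)
    return {
        "shapiro_stat": float(stat),
        "shapiro_p": float(p_value),
        "mean": fmean(residuals),
        # Population standard deviation (ddof=0); this is an assumption.
        "std": pstdev(residuals),
    }


# Simulated residuals drawn from a standard normal distribution.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(50)]
summary = residual_summary_sketch(residuals)
print(summary)
```

A small Shapiro-Wilk p-value (for example below 0.05) is evidence against normally distributed residuals; a large one does not prove normality.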
- clophfit.testing.evaluation.extract_params(fr, param_name='K')
Extract parameter value and error from a FitResult.
- Parameters:
fr (FitResult[MiniT]) – The fit result object.
param_name (str) – The name of the parameter to extract (default: “K”).
- Returns:
(value, error). Returns (np.nan, np.nan) if extraction fails.
- Return type:
tuple[float, float]
- clophfit.testing.evaluation.load_real_data_paths()
Find available real data directories.
- Returns:
Mapping of dataset name to path.
- Return type:
dict[str, Path]
- clophfit.testing.evaluation.compare_methods_statistical(method1_errors, method2_errors, method1_name='Method 1', method2_name='Method 2', *, verbose=True)
Perform statistical comparison between two methods.
Uses Mann-Whitney U test (non-parametric) for comparing absolute errors.
- Parameters:
method1_errors (Sequence[float]) – Errors from method 1.
method2_errors (Sequence[float]) – Errors from method 2.
method1_name (str) – Name of method 1.
method2_name (str) – Name of method 2.
verbose (bool, optional) – Whether to print detailed comparison info, defaults to True.
- Returns:
Statistical comparison results. The dictionary includes:
- ‘test’: Name of the statistical test.
- ‘statistic’: Test statistic value.
- ‘p_value’: Computed p-value.
- ‘significant’: Whether the difference is significant at alpha=0.05.
- ‘better_method’: Method with lower MAE (or ‘Equivalent’).
- ‘mae1’: Mean absolute error for method 1.
- ‘mae2’: Mean absolute error for method 2.
- ‘error’: Only present when the comparison cannot be performed.
- Return type:
dict[str, float | bool | str]
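The comparison described above can be approximated with `scipy.stats.mannwhitneyu` applied to absolute errors. The sketch below is illustrative only: `compare_sketch` is a hypothetical name, the error values are made up, and the rule that ‘Equivalent’ is reported when the test is not significant is an assumption rather than documented behaviour.

```python
from statistics import fmean

from scipy import stats


def compare_sketch(errors1, errors2, name1="Method 1", name2="Method 2"):
    """Two-sided Mann-Whitney U test on absolute errors, plus MAEs."""
    abs1 = [abs(e) for e in errors1]
    abs2 = [abs(e) for e in errors2]
    test = stats.mannwhitneyu(abs1, abs2, alternative="two-sided")
    mae1, mae2 = fmean(abs1), fmean(abs2)
    significant = bool(test.pvalue < 0.05)
    # Assumption: call the methods equivalent unless the test is significant.
    if not significant:
        better = "Equivalent"
    else:
        better = name1 if mae1 < mae2 else name2
    return {
        "test": "Mann-Whitney U",
        "statistic": float(test.statistic),
        "p_value": float(test.pvalue),
        "significant": significant,
        "better_method": better,
        "mae1": mae1,
        "mae2": mae2,
    }


# Made-up signed errors: method A is clearly tighter than method B.
errors_a = [0.01, -0.02, 0.03, -0.015, 0.025, 0.012, -0.022, 0.005]
errors_b = [0.2, -0.3, 0.25, -0.22, 0.31, 0.27, -0.24, 0.29]
result = compare_sketch(errors_a, errors_b, "A", "B")
print(result["better_method"], result["p_value"])
```

The Mann-Whitney U test makes no normality assumption, which suits absolute errors since they are non-negative and typically skewed.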