clophfit.fitting.residuals#
Residual extraction and analysis utilities for fit results.
This module provides tools to extract, analyze, and validate residuals from fitting procedures. Useful for diagnostics, model validation, and comparing different fitting methods.
Classes#
ResidualPoint – Single residual data point with metadata.
Functions#
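extract_residual_points(fr) – Extract residual points from a fit result.
residual_dataframe(fr) – Convert fit result residuals to a DataFrame.
collect_multi_residuals(fit_results, round_x=3) – Collect residuals from multiple fit results into a single DataFrame.
residual_statistics(df) – Compute residual statistics by label.
validate_residuals(fr, *, verbose=True) – Validate residual quality for a fit result.
compute_residual_covariance(all_res, value_col='resid_weighted') – Compute covariance matrix of residuals for each label.
compute_correlation_matrices(cov_by_label) – Convert covariance matrices to correlation matrices.
analyze_label_bias(all_res, n_bins=5) – Detect systematic bias by label and x-range.
detect_adjacent_correlation(all_res) – Detect correlation between adjacent residuals within wells.
estimate_x_shift_statistics(all_res, fit_results) – Estimate potential systematic x-shifts per well (heuristics).
plot_residual_vs_predicted(all_res, title='') – Plot |standardized residual| vs predicted signal per label.
plot_residual_vs_yerr(all_res, title='') – Plot raw residual² vs y_err² per label (error calibration check).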
Module Contents#
- class clophfit.fitting.residuals.ResidualPoint#
Single residual data point with metadata.
- label#
Dataset label (e.g., ‘y1’, ‘y2’ for multi-label fits)
- Type:
str
- x#
X-value (pH or ligand concentration)
- Type:
float
- resid_weighted#
Weighted residual: (y - model) / y_err
- Type:
float
- resid_raw#
Raw residual: (y - model)
- Type:
float
- raw_i#
Index into the original (unmasked) arrays for this label (DataArray.xc/yc).
- Type:
int
- y_err#
Measurement uncertainty used during fitting.
- Type:
float
- predicted#
Model-predicted signal value (y - resid_raw).
- Type:
float
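The attribute definitions above imply two identities: resid_weighted = resid_raw / y_err and predicted = y - resid_raw. A minimal sketch of these relations follows; `PointSketch` and `make_point` are hypothetical stand-ins for illustration, not the library's class or constructor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointSketch:
    """Hypothetical stand-in mirroring the documented ResidualPoint fields."""

    label: str
    x: float
    resid_weighted: float
    resid_raw: float
    raw_i: int
    y_err: float
    predicted: float


def make_point(label: str, x: float, y: float, model: float, y_err: float, raw_i: int) -> PointSketch:
    """Build a point from an observation, its model value, and its uncertainty."""
    resid_raw = y - model  # raw residual: (y - model)
    return PointSketch(
        label=label,
        x=x,
        resid_weighted=resid_raw / y_err,  # weighted residual: (y - model) / y_err
        resid_raw=resid_raw,
        raw_i=raw_i,
        y_err=y_err,
        predicted=y - resid_raw,  # recovers the model value
    )


p = make_point("y1", x=7.0, y=760.0, model=750.0, y_err=10.0, raw_i=2)
print(p.resid_raw, p.resid_weighted, p.predicted)
```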
- clophfit.fitting.residuals.extract_residual_points(fr)#
Extract residual points from a fit result.
- Parameters:
fr (FitResult[Any]) – Fit result containing residuals and dataset
- Returns:
List of residual points with metadata for each observation
- Return type:
list[ResidualPoint]
- Raises:
ValueError – If residual length doesn’t match dataset sizes
Examples
>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> from clophfit.fitting.residuals import extract_residual_points
>>> import numpy as np
>>> # Create test data
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> residuals = extract_residual_points(fr)
>>> len(residuals) > 0
True
>>> residuals[0].label
'y1'
- clophfit.fitting.residuals.residual_dataframe(fr)#
Convert fit result residuals to a DataFrame.
- Parameters:
fr (FitResult[Any]) – Fit result to extract residuals from
- Returns:
DataFrame with columns: label, x, resid_weighted, resid_raw, raw_i, y_err, predicted
- Return type:
pd.DataFrame
Examples
>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> from clophfit.fitting.residuals import residual_dataframe
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> df = residual_dataframe(fr)
>>> "label" in df.columns and "x" in df.columns
True
- clophfit.fitting.residuals.collect_multi_residuals(fit_results, round_x=3)#
Collect residuals from multiple fit results into a single DataFrame.
- Parameters:
fit_results (dict[str, FitResult[Any]]) – Dictionary mapping well/key identifiers to fit results
round_x (int | None) – Number of decimals to round x values (avoids float drift). Set to None to disable rounding.
- Returns:
Combined DataFrame with columns: well, label, x, resid_weighted, resid_raw, raw_i
- Return type:
pd.DataFrame
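The purpose of round_x can be illustrated with plain pandas: two x values that differ only by floating-point drift count as distinct until they are rounded. This is a sketch of the motivation only, using a hand-built DataFrame; it does not reproduce the function's internals.

```python
import pandas as pd

# Two wells measured at nominally the same x, but one value carries float drift.
df = pd.DataFrame(
    {
        "well": ["A01", "A02"],
        "x": [0.1 + 0.2, 0.3],  # 0.30000000000000004 vs 0.3
        "resid_weighted": [0.5, -0.5],
    }
)

n_before = df["x"].nunique()  # drifted values are treated as different x positions
df["x"] = df["x"].round(3)  # same number of decimals as the default round_x=3
n_after = df["x"].nunique()  # both rows now share one x position
print(n_before, n_after)
```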
Examples
>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> from clophfit.fitting.residuals import collect_multi_residuals
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> results = {"A01": fit_binding_glob(dataset), "A02": fit_binding_glob(dataset)}
>>> all_res = collect_multi_residuals(results)
>>> "well" in all_res.columns
True
>>> len(all_res) == 10  # 2 wells * 5 points
True
- clophfit.fitting.residuals.residual_statistics(df)#
Compute residual statistics by label.
- Parameters:
df (pd.DataFrame) – Residual DataFrame (from residual_dataframe or collect_multi_residuals)
- Returns:
Statistics by label: mean, std, median, mad, outlier_count
- Return type:
pd.DataFrame
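A rough sketch of how such per-label statistics could be computed with pandas. The MAD definition (unscaled median absolute deviation from the median) and the |residual| > 3 outlier cut-off are assumptions made for illustration, and `stats_by_label_sketch` is a hypothetical helper rather than the library function.

```python
import pandas as pd


def stats_by_label_sketch(df: pd.DataFrame, threshold: float = 3.0) -> pd.DataFrame:
    """Summarize weighted residuals per label (illustrative only)."""
    grouped = df.groupby("label")["resid_weighted"]
    out = grouped.agg(["mean", "std", "median"])
    # Assumed MAD: median of absolute deviations from the group median.
    out["mad"] = grouped.apply(lambda r: (r - r.median()).abs().median())
    # Assumed outlier rule: weighted residual beyond +/- threshold.
    out["outlier_count"] = grouped.apply(lambda r: int((r.abs() > threshold).sum()))
    return out


demo = pd.DataFrame(
    {
        "label": ["y1"] * 4 + ["y2"] * 2,
        "resid_weighted": [0.0, 1.0, -1.0, 4.0, 0.5, -0.5],
    }
)
stats = stats_by_label_sketch(demo)
print(stats)
```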
Examples
>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> from clophfit.fitting.residuals import collect_multi_residuals, residual_statistics
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> results = {"A01": fit_binding_glob(dataset)}
>>> all_res = collect_multi_residuals(results)
>>> stats = residual_statistics(all_res)
>>> "mean" in stats.columns
True
- clophfit.fitting.residuals.validate_residuals(fr, *, verbose=True)#
Validate residual quality for a fit result.
Checks for common issues:
- Systematic bias (mean significantly different from 0)
- Outliers (more than 5% beyond ±3-sigma)
- Serial correlation (adjacent residuals)
- Parameters:
fr (FitResult[Any]) – Fit result to validate
verbose (bool) – Print warnings for failed checks
- Returns:
Dictionary of check results: {‘bias_ok’, ‘outliers_ok’, ‘correlation_ok’}
- Return type:
dict[str, bool]
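The three checks can be sketched on a plain array of weighted residuals. Only the outlier rule (more than 5% beyond ±3) comes from the description above; the bias test (|mean| within 2 standard errors) and the lag-1 correlation limit of 0.5 are assumed thresholds, and `validate_sketch` is a hypothetical helper, not the library implementation.

```python
import numpy as np


def validate_sketch(resid_weighted, bias_z=2.0, max_outlier_frac=0.05, max_lag1=0.5):
    """Return the three documented flags using assumed thresholds."""
    r = np.asarray(resid_weighted, dtype=float)
    # Bias: compare the mean with its standard error (assumed criterion).
    sem = r.std(ddof=1) / np.sqrt(r.size)
    bias_ok = bool(abs(r.mean()) <= bias_z * sem)
    # Outliers: fraction of points beyond +/- 3 sigma must not exceed 5%.
    outliers_ok = bool(np.mean(np.abs(r) > 3.0) <= max_outlier_frac)
    # Serial correlation: Pearson correlation of each residual with its successor.
    lag1 = np.corrcoef(r[:-1], r[1:])[0, 1]
    correlation_ok = bool(abs(lag1) <= max_lag1)
    return {"bias_ok": bias_ok, "outliers_ok": outliers_ok, "correlation_ok": correlation_ok}


alternating = validate_sketch([1, -1, 1, -1, 1, -1, 1, -1])
shifted = validate_sketch([2.0, 2.1, 1.9, 2.0, 2.2, 1.8])
print(alternating, shifted)
```

The alternating series has zero mean but perfectly anti-correlated neighbours, so only the correlation check fails; the shifted series fails the bias check.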
Examples
>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> from clophfit.fitting.residuals import validate_residuals
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> checks = validate_residuals(fr, verbose=False)
>>> isinstance(checks, dict) and "bias_ok" in checks
True
- clophfit.fitting.residuals.compute_residual_covariance(all_res, value_col='resid_weighted')#
Compute covariance matrix of residuals for each label.
- Parameters:
all_res (pandas.DataFrame)
value_col (str)
- Return type:
dict[str, pandas.DataFrame]
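The layout of the returned matrices is not documented here. One plausible reading, sketched below as an assumption, is a covariance between x positions taken across wells: each label's residuals are pivoted to a wells × x table and passed to `DataFrame.cov`. `covariance_by_label_sketch` is a hypothetical helper.

```python
import pandas as pd


def covariance_by_label_sketch(all_res: pd.DataFrame, value_col: str = "resid_weighted"):
    """Covariance between x positions across wells, one matrix per label (assumed layout)."""
    out = {}
    for label, grp in all_res.groupby("label"):
        # Rows are wells, columns are x positions.
        wide = grp.pivot_table(index="well", columns="x", values=value_col)
        out[str(label)] = wide.cov()  # sample covariance (ddof=1)
    return out


demo = pd.DataFrame(
    {
        "well": ["A01", "A01", "A02", "A02", "A03", "A03"],
        "label": ["y1"] * 6,
        "x": [5.0, 6.0, 5.0, 6.0, 5.0, 6.0],
        "resid_weighted": [1.0, 2.0, 2.0, 4.0, 3.0, 6.0],
    }
)
cov_by_label = covariance_by_label_sketch(demo)
print(cov_by_label["y1"])
```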
- clophfit.fitting.residuals.compute_correlation_matrices(cov_by_label)#
Convert covariance matrices to correlation matrices.
- Parameters:
cov_by_label (dict[str, pandas.DataFrame])
- Return type:
dict[str, pandas.DataFrame]
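The standard conversion divides each covariance entry by the product of the two corresponding standard deviations: corr_ij = cov_ij / sqrt(cov_ii · cov_jj). A minimal sketch for a single matrix follows; `cov_to_corr_sketch` is a hypothetical helper, and indexing the matrix by x position is an assumption.

```python
import numpy as np
import pandas as pd


def cov_to_corr_sketch(cov: pd.DataFrame) -> pd.DataFrame:
    """Normalize a covariance matrix by the outer product of its standard deviations."""
    values = cov.to_numpy(dtype=float)
    sd = np.sqrt(np.diag(values))
    corr = values / np.outer(sd, sd)
    return pd.DataFrame(corr, index=cov.index, columns=cov.columns)


cov = pd.DataFrame([[4.0, 2.0], [2.0, 9.0]], index=[5.0, 6.0], columns=[5.0, 6.0])
corr = cov_to_corr_sketch(cov)
print(corr)
```

A zero variance on the diagonal would produce a division by zero, so real data may need a guard for constant columns.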
- clophfit.fitting.residuals.analyze_label_bias(all_res, n_bins=5)#
Detect systematic bias by label and x-range.
- Parameters:
all_res (pandas.DataFrame)
n_bins (int)
- Return type:
tuple[pandas.DataFrame, pandas.DataFrame]
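Bias by x-range can be illustrated by cutting x into equal-width bins and averaging the weighted residuals per label and bin. The sketch below produces a single table and makes no claim about the contents of the two DataFrames actually returned; `binned_bias_sketch` is a hypothetical helper.

```python
import pandas as pd


def binned_bias_sketch(all_res: pd.DataFrame, n_bins: int = 5) -> pd.DataFrame:
    """Mean weighted residual per label and equal-width x bin (illustrative only)."""
    # labels=False yields integer bin codes 0..n_bins-1.
    df = all_res.assign(x_bin=pd.cut(all_res["x"], bins=n_bins, labels=False))
    return df.groupby(["label", "x_bin"])["resid_weighted"].agg(["mean", "count"]).reset_index()


demo = pd.DataFrame(
    {
        "label": ["y1"] * 5,
        "x": [5.0, 6.0, 7.0, 8.0, 9.0],
        "resid_weighted": [1.0, 1.0, 1.0, -1.0, -1.0],
    }
)
bias = binned_bias_sketch(demo, n_bins=2)
print(bias)
```

A mean that changes sign between the low-x and high-x bins, as in this toy input, is the kind of pattern that suggests a systematic model mismatch rather than random scatter.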
- clophfit.fitting.residuals.detect_adjacent_correlation(all_res)#
Detect correlation between adjacent residuals within wells.
- Parameters:
all_res (pandas.DataFrame)
- Return type:
tuple[pandas.DataFrame, dict[str, numpy.ndarray]]
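One way to quantify adjacent correlation is to pair each residual with its successor in x within the same well and label, then correlate the pairs per label. This is a sketch under that assumption; `lag1_by_label_sketch` is a hypothetical helper and its return value does not match the documented tuple.

```python
import pandas as pd


def lag1_by_label_sketch(all_res: pd.DataFrame) -> dict:
    """Pearson correlation between each residual and the next one along x, per label."""
    df = all_res.sort_values(["well", "label", "x"]).copy()
    # Successor residual within the same well and label; the last point of each group gets NaN.
    df["next_resid"] = df.groupby(["well", "label"])["resid_weighted"].shift(-1)
    pairs = df.dropna(subset=["next_resid"])
    out = {}
    for label, grp in pairs.groupby("label"):
        out[str(label)] = float(grp["resid_weighted"].corr(grp["next_resid"]))
    return out


demo = pd.DataFrame(
    {
        "well": ["A01"] * 4 + ["A02"] * 4,
        "label": ["y1"] * 8,
        "x": [5.0, 6.0, 7.0, 8.0] * 2,
        "resid_weighted": [1.0, -1.0, 1.0, -1.0] * 2,
    }
)
lag1 = lag1_by_label_sketch(demo)
print(lag1)
```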
- clophfit.fitting.residuals.estimate_x_shift_statistics(all_res, fit_results)#
Estimate potential systematic x-shifts per well (heuristics).
- Parameters:
all_res (pandas.DataFrame)
fit_results (dict[str, Any])
- Return type:
pandas.DataFrame
- clophfit.fitting.residuals.plot_residual_vs_predicted(all_res, title='')#
Plot |standardized residual| vs predicted signal per label.
A flat trend at ~0.80, the expected value of |N(0,1)| (√(2/π) ≈ 0.798), indicates that the error model is correctly calibrated. A rising trend indicates under-estimated errors at high signals (multiplicative noise).
- Parameters:
all_res (pd.DataFrame) – Residual DataFrame from collect_multi_residuals. Must contain columns label, predicted, and resid_weighted.
title (str, optional) – Figure suptitle suffix.
- Returns:
Matplotlib figure (one panel per label).
- Return type:
Figure
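The ~0.80 reference level is the mean absolute value of a standard normal variable, E|Z| = √(2/π). A quick numerical check with the standard library, using a fixed seed and a sample size chosen arbitrarily for illustration:

```python
import math
import random
import statistics

# Closed form: E|Z| = sqrt(2 / pi) for Z ~ N(0, 1).
expected_abs = math.sqrt(2.0 / math.pi)

# Monte Carlo estimate of the same quantity.
rng = random.Random(0)
sample_mean_abs = statistics.fmean(abs(rng.gauss(0.0, 1.0)) for _ in range(100_000))

print(round(expected_abs, 3), round(sample_mean_abs, 3))
```

Well-calibrated standardized residuals should therefore average close to this value at every signal level.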
- clophfit.fitting.residuals.plot_residual_vs_yerr(all_res, title='')#
Plot raw residual² vs y_err² per label (error calibration check).
Points should scatter around the y=x line if the assigned uncertainties match the actual scatter. A slope < 1 means errors are over-estimated; slope > 1 means under-estimated.
- Parameters:
all_res (pd.DataFrame) – Residual DataFrame from collect_multi_residuals. Must contain columns label, y_err, and resid_raw.
title (str, optional) – Figure suptitle suffix.
- Returns:
Matplotlib figure (one panel per label).
- Return type:
Figure
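The slope interpretation can be checked on simulated data. In this sketch the true noise level and the assigned uncertainty are invented numbers; with a single y_err value, the slope of residual² against y_err² through the origin reduces to the ratio mean(residual²) / y_err².

```python
import random
import statistics

rng = random.Random(0)
true_sigma = 10.0  # actual scatter of the simulated residuals (assumed value)
assigned_y_err = 20.0  # uncertainty assigned to each point, deliberately too large

resid_raw = [rng.gauss(0.0, true_sigma) for _ in range(10_000)]

# Ratio of observed squared scatter to assigned variance; about (10 / 20)**2 = 0.25 here.
ratio = statistics.fmean(r * r for r in resid_raw) / assigned_y_err**2
print(round(ratio, 3))
```

A ratio well below 1, as here, corresponds to the over-estimated case described above; swapping the two sigma values would push it above 1.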