clophfit.fitting.residuals#

Residual extraction and analysis utilities for fit results.

This module provides tools to extract, analyze, and validate residuals from fitting procedures. These tools are useful for diagnostics, model validation, and comparing different fitting methods.

Classes#

ResidualPoint

Single residual data point with metadata.

Functions#

extract_residual_points(fr)

Extract residual points from a fit result.

residual_dataframe(fr)

Convert fit result residuals to a DataFrame.

collect_multi_residuals(fit_results[, round_x])

Collect residuals from multiple fit results into a single DataFrame.

residual_statistics(df)

Compute residual statistics by label.

validate_residuals(fr, *[, verbose])

Validate residual quality for a fit result.

compute_residual_covariance(all_res[, value_col])

Compute covariance matrix of residuals for each label.

compute_correlation_matrices(cov_by_label)

Convert covariance matrices to correlation matrices.

analyze_label_bias(all_res[, n_bins])

Detect systematic bias by label and x-range.

detect_adjacent_correlation(all_res)

Detect correlation between adjacent residuals within wells.

estimate_x_shift_statistics(all_res, fit_results)

Estimate potential systematic x-shifts per well (heuristic).

plot_residual_vs_predicted(all_res[, title])

Plot |standardized residual| vs predicted signal per label.

plot_residual_vs_yerr(all_res[, title])

Plot raw residual² vs y_err² per label (error calibration check).

Module Contents#

class clophfit.fitting.residuals.ResidualPoint#

Single residual data point with metadata.

label#

Dataset label (e.g., ‘y1’, ‘y2’ for multi-label fits)

Type:

str

x#

X-value (pH or ligand concentration)

Type:

float

resid_weighted#

Weighted residual: (y - model) / y_err

Type:

float

resid_raw#

Raw residual: (y - model)

Type:

float

raw_i#

Index into the original (unmasked) arrays for this label (DataArray.xc/yc).

Type:

int

y_err#

Measurement uncertainty used during fitting.

Type:

float

predicted#

Model-predicted signal value (y - resid_raw).

Type:

float
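The attribute descriptions above imply two identities: resid_weighted equals resid_raw / y_err, and predicted equals y - resid_raw. The following is a minimal sketch of those relations using a stand-in dataclass; the names `PointSketch` and `make_point` are hypothetical and are not part of clophfit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointSketch:
    """Hypothetical stand-in mirroring the documented ResidualPoint fields."""

    label: str
    x: float
    resid_weighted: float
    resid_raw: float
    raw_i: int
    y_err: float
    predicted: float


def make_point(label: str, x: float, y: float, model: float,
               y_err: float, raw_i: int) -> PointSketch:
    """Build a point from an observation and its model value (illustrative)."""
    resid_raw = y - model  # raw residual: (y - model)
    return PointSketch(
        label=label,
        x=x,
        resid_weighted=resid_raw / y_err,  # weighted residual: (y - model) / y_err
        resid_raw=resid_raw,
        raw_i=raw_i,
        y_err=y_err,
        predicted=y - resid_raw,  # recovers the model value
    )


p = make_point("y1", x=7.0, y=760.0, model=750.0, y_err=10.0, raw_i=2)
```

Here the observation lies one assigned standard deviation above the model, so the weighted residual is 1.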

clophfit.fitting.residuals.extract_residual_points(fr)#

Extract residual points from a fit result.

Parameters:

fr (FitResult[Any]) – Fit result containing residuals and dataset

Returns:

List of residual points with metadata for each observation

Return type:

list[ResidualPoint]

Raises:

ValueError – If residual length doesn’t match dataset sizes

Examples

>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> import numpy as np
>>> # Create test data
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> residuals = extract_residual_points(fr)
>>> len(residuals) > 0
True
>>> residuals[0].label
'y1'

clophfit.fitting.residuals.extract_residual_points(fr)#

Convert fit result residuals to a DataFrame.

Parameters:

fr (FitResult[Any]) – Fit result to extract residuals from

Returns:

DataFrame with columns: label, x, resid_weighted, resid_raw, raw_i, y_err, predicted

Return type:

pd.DataFrame

Examples

>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> df = residual_dataframe(fr)
>>> "label" in df.columns and "x" in df.columns
True

clophfit.fitting.residuals.collect_multi_residuals(fit_results, round_x=3)#

Collect residuals from multiple fit results into a single DataFrame.

Parameters:
  • fit_results (dict[str, FitResult[Any]]) – Dictionary mapping well/key identifiers to fit results

  • round_x (int | None) – Number of decimals to round x values (avoids float drift). Set to None to disable rounding.

Returns:

Combined DataFrame with columns: well, label, x, resid_weighted, resid_raw, raw_i, y_err, predicted

Return type:

pd.DataFrame
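The float drift that round_x guards against can be illustrated with plain pandas. This is a sketch of the motivation only, not the library's code: two x values that should be identical differ in the last bit and would otherwise be treated as separate positions when grouping.

```python
import pandas as pd

# 0.1 + 0.2 is not exactly 0.3 in binary floating point
df = pd.DataFrame({"x": [0.1 + 0.2, 0.3], "resid_weighted": [1.0, -1.0]})

n_unique_raw = df["x"].nunique()               # drifted values count as distinct
n_unique_rounded = df["x"].round(3).nunique()  # rounding merges them
```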

Examples

>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> results = {"A01": fit_binding_glob(dataset), "A02": fit_binding_glob(dataset)}
>>> all_res = collect_multi_residuals(results)
>>> "well" in all_res.columns
True
>>> len(all_res) == 10  # 2 wells * 5 points
True

clophfit.fitting.residuals.residual_statistics(df)#

Compute residual statistics by label.

Parameters:

df (pd.DataFrame) – Residual DataFrame (from residual_dataframe or collect_multi_residuals)

Returns:

Statistics by label: mean, std, median, mad, outlier_count

Return type:

pd.DataFrame
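As a rough illustration of the listed statistics, the sketch below computes them per label with pandas. It assumes that the statistics are taken over resid_weighted, that "mad" means the median absolute deviation from the median, and that an outlier is a weighted residual beyond ±3; the actual definitions used by residual_statistics may differ. The name `stats_sketch` is hypothetical.

```python
import pandas as pd


def stats_sketch(df: pd.DataFrame, threshold: float = 3.0) -> pd.DataFrame:
    """Per-label summary of weighted residuals (illustrative definitions)."""
    rows = {}
    for label, r in df.groupby("label")["resid_weighted"]:
        med = r.median()
        rows[label] = {
            "mean": r.mean(),
            "std": r.std(),
            "median": med,
            "mad": (r - med).abs().median(),  # assumed: median absolute deviation
            "outlier_count": int((r.abs() > threshold).sum()),  # assumed cutoff
        }
    return pd.DataFrame.from_dict(rows, orient="index")


demo = pd.DataFrame({
    "label": ["y1"] * 5,
    "resid_weighted": [-1.0, 0.0, 1.0, 0.0, 4.0],
})
stats = stats_sketch(demo)
```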

Examples

>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> results = {"A01": fit_binding_glob(dataset)}
>>> all_res = collect_multi_residuals(results)
>>> stats = residual_statistics(all_res)
>>> "mean" in stats.columns
True

clophfit.fitting.residuals.validate_residuals(fr, *, verbose=True)#

Validate residual quality for a fit result.

Checks for common issues:
  • Systematic bias (mean significantly different from 0)

  • Outliers (more than 5% beyond ±3-sigma)

  • Serial correlation (adjacent residuals)
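The three checks can be sketched with NumPy as follows. Only the 5% and ±3-sigma limits come from the description above; the bias criterion (mean within two standard errors of 0) and the lag-1 correlation limit of 0.5 are assumptions made for illustration, and `sketch_checks` is a hypothetical name rather than the library's implementation.

```python
import numpy as np


def sketch_checks(resid_weighted, outlier_frac=0.05, sigma=3.0,
                  z_crit=2.0, corr_max=0.5):
    """Illustrative residual checks on weighted residuals."""
    r = np.asarray(resid_weighted, dtype=float)
    # Bias: mean within z_crit standard errors of zero (assumed criterion)
    sem = r.std(ddof=1) / np.sqrt(r.size)
    bias_ok = bool(abs(r.mean()) <= z_crit * sem)
    # Outliers: at most outlier_frac of the points beyond +/- sigma
    outliers_ok = bool(np.mean(np.abs(r) > sigma) <= outlier_frac)
    # Serial correlation: lag-1 Pearson correlation of adjacent residuals
    lag1 = np.corrcoef(r[:-1], r[1:])[0, 1]
    correlation_ok = bool(abs(lag1) <= corr_max)
    return {"bias_ok": bias_ok, "outliers_ok": outliers_ok,
            "correlation_ok": correlation_ok}


rng = np.random.default_rng(0)
good = sketch_checks(rng.standard_normal(200))
biased = sketch_checks(rng.standard_normal(200) + 2.0)  # mean shifted by 2
```

The shifted sample fails the bias check, while well-behaved standard-normal residuals pass the outlier and correlation checks.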

Parameters:
  • fr (FitResult[Any]) – Fit result to validate

  • verbose (bool) – Print warnings for failed checks

Returns:

Dictionary of check results: {‘bias_ok’, ‘outliers_ok’, ‘correlation_ok’}

Return type:

dict[str, bool]

Examples

>>> from clophfit.fitting.core import fit_binding_glob
>>> from clophfit.fitting.data_structures import Dataset, DataArray
>>> import numpy as np
>>> x = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
>>> y = 500 + 500 * 10 ** (7.0 - x) / (1 + 10 ** (7.0 - x))
>>> da = DataArray(xc=x, yc=y, y_errc=np.ones_like(y) * 10)
>>> dataset = Dataset({"y1": da}, is_ph=True)
>>> fr = fit_binding_glob(dataset)
>>> checks = validate_residuals(fr, verbose=False)
>>> isinstance(checks, dict) and "bias_ok" in checks
True

clophfit.fitting.residuals.compute_residual_covariance(all_res, value_col='resid_weighted')#

Compute covariance matrix of residuals for each label.

Parameters:
  • all_res (pandas.DataFrame)

  • value_col (str)

Return type:

dict[str, pandas.DataFrame]
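The signature does not state which dimensions the covariance is taken over. One plausible reading, assumed here, treats each well as an observation and each x position as a variable, giving one x-by-x covariance matrix per label. The sketch below follows that reading; `covariance_sketch` is a hypothetical name.

```python
import pandas as pd


def covariance_sketch(all_res: pd.DataFrame,
                      value_col: str = "resid_weighted") -> dict:
    """Per-label covariance across x positions (assumed layout)."""
    out = {}
    for label, grp in all_res.groupby("label"):
        # rows: wells (observations); columns: x positions (variables)
        wide = grp.pivot_table(index="well", columns="x", values=value_col)
        out[str(label)] = wide.cov()
    return out


all_res = pd.DataFrame({
    "well": ["A01", "A01", "A02", "A02", "A03", "A03"],
    "label": ["y1"] * 6,
    "x": [5.0, 6.0] * 3,
    "resid_weighted": [1.0, 2.0, 2.0, 4.0, 3.0, 6.0],
})
cov = covariance_sketch(all_res)
```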

clophfit.fitting.residuals.compute_correlation_matrices(cov_by_label)#

Convert covariance matrices to correlation matrices.

Parameters:

cov_by_label (dict[str, pandas.DataFrame])

Return type:

dict[str, pandas.DataFrame]
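Converting a covariance matrix to a correlation matrix divides each entry by the product of the two corresponding standard deviations, corr_ij = cov_ij / (sd_i * sd_j). A minimal sketch for a single matrix, with the hypothetical name `correlation_sketch`:

```python
import numpy as np
import pandas as pd


def correlation_sketch(cov: pd.DataFrame) -> pd.DataFrame:
    """Normalize a covariance matrix by the outer product of its std devs."""
    values = cov.to_numpy()
    sd = np.sqrt(np.diag(values))
    corr = values / np.outer(sd, sd)
    return pd.DataFrame(corr, index=cov.index, columns=cov.columns)


cov = pd.DataFrame([[4.0, 3.0], [3.0, 9.0]],
                   index=[5.0, 6.0], columns=[5.0, 6.0])
corr = correlation_sketch(cov)
```

Applying such a function to every value of the cov_by_label dictionary would yield one correlation matrix per label.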

clophfit.fitting.residuals.analyze_label_bias(all_res, n_bins=5)#

Detect systematic bias by label and x-range.

Parameters:
  • all_res (pandas.DataFrame)

  • n_bins (int)

Return type:

tuple[pandas.DataFrame, pandas.DataFrame]
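The contents of the two returned DataFrames are not documented here. As a rough idea of what bias by label and x-range can look like, the sketch below bins x into equal-width intervals and averages the weighted residuals per label and bin; the binning scheme, the column names `x_bin`, `mean`, and `count`, and the function name `bias_by_bin_sketch` are all assumptions.

```python
import pandas as pd


def bias_by_bin_sketch(all_res: pd.DataFrame, n_bins: int = 5) -> pd.DataFrame:
    """Mean weighted residual per label and equal-width x bin (illustrative)."""
    df = all_res.copy()
    df["x_bin"] = pd.cut(df["x"], bins=n_bins)
    grouped = df.groupby(["label", "x_bin"], observed=True)["resid_weighted"]
    return grouped.agg(["mean", "count"]).reset_index()


demo = pd.DataFrame({
    "label": ["y1"] * 10,
    "x": [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5],
    # positive residuals at low x, negative at high x
    "resid_weighted": [1.0] * 5 + [-1.0] * 5,
})
bias = bias_by_bin_sketch(demo, n_bins=2)
```

A mean that changes sign across bins, as in this toy data, is the kind of x-dependent bias such an analysis is meant to expose.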

clophfit.fitting.residuals.detect_adjacent_correlation(all_res)#

Detect correlation between adjacent residuals within wells.

Parameters:

all_res (pandas.DataFrame)

Return type:

tuple[pandas.DataFrame, dict[str, numpy.ndarray]]
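One common way to quantify correlation between adjacent residuals is the lag-1 Pearson correlation within each well, after sorting by x. The sketch below takes that approach as an assumption; it does not reproduce the documented return structure, and `lag1_by_well_sketch` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def lag1_by_well_sketch(all_res: pd.DataFrame) -> pd.DataFrame:
    """Lag-1 correlation of weighted residuals per well and label."""
    rows = []
    for (well, label), grp in all_res.groupby(["well", "label"]):
        r = grp.sort_values("x")["resid_weighted"].to_numpy()
        if r.size < 3:
            continue  # too few points for a meaningful correlation
        lag1 = float(np.corrcoef(r[:-1], r[1:])[0, 1])
        rows.append({"well": well, "label": label, "lag1_corr": lag1})
    return pd.DataFrame(rows)


demo = pd.DataFrame({
    "well": ["A01"] * 5 + ["A02"] * 5,
    "label": ["y1"] * 10,
    "x": [5.0, 6.0, 7.0, 8.0, 9.0] * 2,
    # A01: smooth trend; A02: alternating sign
    "resid_weighted": [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, -1.0, 1.0, -1.0, 1.0],
})
lag = lag1_by_well_sketch(demo).set_index("well")["lag1_corr"]
```

A smooth trend in the residuals gives a lag-1 correlation near +1, while alternating residuals give a value near -1.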

clophfit.fitting.residuals.estimate_x_shift_statistics(all_res, fit_results)#

Estimate potential systematic x-shifts per well (heuristic).

Parameters:
  • all_res (pandas.DataFrame)

  • fit_results (dict[str, Any])

Return type:

pandas.DataFrame

clophfit.fitting.residuals.plot_residual_vs_predicted(all_res, title='')#

Plot |standardized residual| vs predicted signal per label.

A flat trend near 0.80, the expected value of |N(0,1)| (√(2/π) ≈ 0.798), indicates that the error model is correctly calibrated. A rising trend indicates under-estimated errors at high signals (multiplicative noise).

Parameters:
  • all_res (pd.DataFrame) – Residual DataFrame from collect_multi_residuals. Must contain columns label, predicted, and resid_weighted.

  • title (str, optional) – Figure suptitle suffix.

Returns:

Matplotlib figure (one panel per label).

Return type:

Figure
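The reference level of about 0.80 used by plot_residual_vs_predicted is the mean absolute value of a standard normal variable, sqrt(2/pi). The short simulation below checks that number and shows how the level rises when errors are under-estimated; it is independent of the plotting function itself.

```python
import math

import numpy as np

expected_abs = math.sqrt(2.0 / math.pi)  # E|Z| for Z ~ N(0, 1)

rng = np.random.default_rng(42)
z = rng.standard_normal(100_000)

# Correctly calibrated errors: standardized residuals are ~N(0, 1)
calibrated = float(np.abs(z).mean())
# Errors under-estimated by a factor of 2: standardized residuals double
underestimated = float(np.abs(2.0 * z).mean())
```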

clophfit.fitting.residuals.plot_residual_vs_yerr(all_res, title='')#

Plot raw residual² vs y_err² per label (error calibration check).

Points should scatter around the y=x line if the assigned uncertainties match the actual scatter. A slope < 1 means errors are over-estimated; slope > 1 means under-estimated.

Parameters:
  • all_res (pd.DataFrame) – Residual DataFrame from collect_multi_residuals. Must contain columns label, y_err, and resid_raw.

  • title (str, optional) – Figure suptitle suffix.

Returns:

Matplotlib figure (one panel per label).

Return type:

Figure
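The slope interpretation for plot_residual_vs_yerr can be checked numerically. The sketch below simulates residuals whose true scatter is a fixed multiple of the assigned y_err and estimates the slope of resid_raw² against y_err² with a least-squares line through the origin. The estimator and the name `slope_through_origin` are assumptions for illustration; the plotting function may summarize the trend differently.

```python
import numpy as np


def slope_through_origin(y_err, resid_raw) -> float:
    """Least-squares slope of resid_raw**2 vs y_err**2, forced through 0."""
    v = np.asarray(y_err, dtype=float) ** 2
    r2 = np.asarray(resid_raw, dtype=float) ** 2
    return float(np.sum(v * r2) / np.sum(v * v))


rng = np.random.default_rng(1)
y_err = rng.uniform(5.0, 20.0, size=5000)  # assigned uncertainties
noise = rng.standard_normal(5000)

# True scatter is twice the assigned error: errors are under-estimated
slope_under = slope_through_origin(y_err, 2.0 * y_err * noise)
# True scatter is half the assigned error: errors are over-estimated
slope_over = slope_through_origin(y_err, 0.5 * y_err * noise)
```

With these settings the expected slopes are about 4 and 0.25, on either side of the y=x reference line.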