clophfit.fitting.core#
Clophfit: Fitting of Cl- binding and pH titration curves.
This module provides a comprehensive suite of tools for analyzing titration data, particularly for chloride binding and pH titration experiments common in biochemistry, such as those involving fluorescent probes.
Core Functionality:#
Data Modeling: Implements a 1-site binding model suitable for both ligand concentration and pH titrations.
Spectral Data Processing: Processes raw spectral data (e.g., from fluorescence spectroscopy) and offers two methods for data reduction:
Singular Value Decomposition (SVD) to extract the most significant spectral component.
Band integration over a specified wavelength range.
Curve Fitting: Provides three distinct fitting backends to determine the dissociation constant (K) and other parameters:
Least-Squares (LM): Utilizes the lmfit library for robust non-linear least-squares minimization. Supports iterative reweighting and outlier removal.
Orthogonal Distance Regression (ODR): Employs scipy.odr to account for uncertainties in both x and y variables, which is crucial when x-values (e.g., pH measurements) have errors.
Bayesian Modeling (PyMC): Implements a hierarchical Bayesian model using pymc. This approach is powerful for:
Quantifying parameter uncertainties as full posterior distributions.
Modeling errors in x-values as latent variables.
Sharing information between multiple experiments (hierarchical fitting) to obtain more robust parameter estimates.
Result Visualization: Includes extensive plotting functions to visualize raw and processed spectra, fitted curves with confidence intervals, and diagnostic plots for SVD and Bayesian analyses (e.g., corner plots).
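The 1-site binding model mentioned above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `binding_1site`, the parameter names `S0` and `S1` (plateau signals), and the exact parameterization are assumptions. For a pH titration `K` is read as a pKa, for a ligand titration as a dissociation constant.

```python
import numpy as np


def binding_1site(x, K, S0, S1, is_ph=False):
    """Signal of a 1-site binding model (hypothetical helper, assumed form).

    S0 is the signal when the ratio term is 0, S1 the signal when it is large.
    """
    x = np.asarray(x, dtype=float)
    if is_ph:
        # pH titration: x is pH and K is a pKa, so the ratio is 10**(K - x).
        ratio = 10.0 ** (K - x)
    else:
        # Ligand titration: x is a concentration and K a dissociation constant.
        ratio = x / K
    # Weighted average of the two plateau signals.
    return (S0 + S1 * ratio) / (1.0 + ratio)


# At x == K the signal sits halfway between the two plateaus.
midpoint = binding_1site(10.0, K=10.0, S0=2.0, S1=4.0)
```

In both modes the curve passes through the mean of `S0` and `S1` at `x == K`, which is what makes `K` identifiable from the inflection of the titration curve.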
Functions#
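weight_da(da, *, is_ph) | Estimate initial weights for a DataArray by fitting it individually.
weight_multi_ds_titration(ds) | Assign weights to all DataArrays within a Dataset.
analyze_spectra(spectra, *, is_ph, band=None) | Analyze spectra titration, fit the data, and plot the results.
fit_binding_glob(ds, *, robust=False) | Analyze multi-label titration datasets and visualize the results.
analyze_spectra_glob(titration, ds, dbands=None) | Analyze multi-label spectra and visualize the results.
outlier2(ds, key='', threshold=3.0, *, plot_z_scores=False, error_model='uniform') | Remove outliers and reassign weights.
fit_binding_glob_reweighted(ds, key, threshold=2.05) | RLS and outlier removal for multi-label titration datasets.
fit_binding_glob_recursive(ds, max_iterations=15, tol=0.1) | Analyze multi-label titration datasets using ODR.
fit_binding_glob_recursive_outlier(ds, tol=0.01, threshold=3.0) | Analyze multi-label titration datasets using IRLS.
outlier_glob(residuals, *, threshold=2.0, plot_z_scores=False) | Identify outliers.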
Module Contents#
- clophfit.fitting.core.weight_da(da, *, is_ph)#
Estimate initial weights for a DataArray by fitting it individually.
The standard error of the residuals from this initial fit is used as the uncertainty (y_err) for subsequent weighted fits.
- Parameters:
da (DataArray) – The data array to be weighted.
is_ph (bool) – Whether the titration is pH-based.
- Returns:
True if the weighting fit was successful, False otherwise.
- Return type:
bool
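As an illustration of the weighting idea, the sketch below computes a residual standard error from an already fitted curve and uses it as a uniform `y_err`. It assumes the usual definition (root of the residual sum of squares divided by the degrees of freedom); the helper name `residual_standard_error` and the argument `n_params` are hypothetical and not part of clophfit.

```python
import numpy as np


def residual_standard_error(y, y_fit, n_params):
    """Return sqrt(SSR / (n - p)) for observed y and fitted y_fit (assumed definition)."""
    resid = np.asarray(y, dtype=float) - np.asarray(y_fit, dtype=float)
    dof = resid.size - n_params
    if dof <= 0:
        # Too few points to estimate a scatter: the weighting fit cannot succeed.
        raise ValueError("not enough data points for the number of parameters")
    return float(np.sqrt(np.sum(resid**2) / dof))


y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_fit = np.array([1.1, 1.9, 3.1, 3.9, 5.1])
se = residual_standard_error(y, y_fit, n_params=3)
# The same uncertainty is assigned to every point of this array.
y_err = np.full_like(y, se)
```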
- clophfit.fitting.core.weight_multi_ds_titration(ds)#
Assign weights to all DataArrays within a Dataset.
Iterates through each DataArray in the Dataset, calling weight_da to estimate y_err. For any DataArray where weighting fails (e.g., due to insufficient data), a fallback error is assigned based on the errors from successfully fitted arrays.
Optimized version with reduced set operations and memory allocations.
- Parameters:
ds (Dataset)
- Return type:
None
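The fallback step can be pictured with a small sketch. The source does not say which statistic of the successful errors is used, so the choice of the largest successful error here is an assumption made for illustration, as are the helper name `fill_fallback_errors` and the plain-dict representation of per-label errors.

```python
def fill_fallback_errors(y_errs):
    """Replace missing errors (None) with a fallback from the successful fits.

    y_errs maps a label to its estimated y_err, or to None if weighting failed.
    """
    good = [err for err in y_errs.values() if err is not None]
    if not good:
        raise ValueError("no successful weighting fit to derive a fallback from")
    # Assumption: use the largest successful error as a conservative fallback.
    fallback = max(good)
    return {label: fallback if err is None else err for label, err in y_errs.items()}


filled = fill_fallback_errors({"y1": 0.5, "y2": None, "y3": 1.25})
```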
- clophfit.fitting.core.analyze_spectra(spectra, *, is_ph, band=None)#
Analyze spectra titration, fit the data, and plot the results.
This function performs either Singular Value Decomposition (SVD) or integrates spectra over a specified band.
- Parameters:
spectra (pd.DataFrame) – The DataFrame containing spectra (one spectrum for each column).
is_ph (bool) – Whether the x-axis represents pH.
band (tuple[int, int] | None) – If provided, use the ‘band’ integration method. Otherwise, use ‘svd’.
- Returns:
An object containing the fit results and the summary plot.
- Return type:
FitResult[Minimizer]
- Raises:
ValueError – If the band parameters are not in the spectra’s index when the band method is used.
Notes
For the SVD method, creates plots of the spectra, the principal component vectors, the singular values, the fit of the first principal component, and the PCA. For the band method, creates plots of the spectra and the fit only.
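The two reduction methods can be sketched with plain NumPy. This is an illustration under stated assumptions, not the function's actual code: the helpers `reduce_svd` and `reduce_band` are hypothetical, the spectra are passed as a 2-D array (rows are wavelengths, columns are titration points) rather than a DataFrame, no baseline subtraction is applied before the SVD, and band integration is approximated by a plain sum.

```python
import numpy as np


def reduce_svd(spectra):
    """Return the first right singular vector scaled by its singular value."""
    _, s, vt = np.linalg.svd(spectra, full_matrices=False)
    # One value per titration point; the overall sign is arbitrary.
    return s[0] * vt[0]


def reduce_band(wavelengths, spectra, band):
    """Sum each spectrum over the wavelengths inside band (inclusive)."""
    lo, hi = band
    if lo not in wavelengths or hi not in wavelengths:
        raise ValueError("band limits must be values of the wavelength index")
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    return spectra[mask].sum(axis=0)


wavelengths = np.array([500, 510, 520, 530])
# Rank-one test data: one spectral shape scaled by a titration-dependent factor.
spectra = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
y_svd = reduce_svd(spectra)
y_band = reduce_band(wavelengths, spectra, (510, 520))
```

Either reduction yields one y-value per titration point, which can then be fitted against the x-values (pH or ligand concentration).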
- clophfit.fitting.core.fit_binding_glob(ds, *, robust=False)#
Analyze multi-label titration datasets and visualize the results.
- Parameters:
ds (Dataset) – Input dataset with x, y, and y_err for each label.
robust (bool) – If True, use Huber loss for robust fitting (reduces outlier influence).
- Return type:
FitResult[Minimizer]
- Raises:
InsufficientDataError – If there are not enough data points for the number of parameters.
Notes
Parameter uncertainties are scaled by sqrt(reduced_chi_sq) via lmfit’s Minimizer(scale_covar=True), which improves coverage when errors are underestimated.
Residuals returned are WEIGHTED (weight * (observed - predicted)) where weight = 1/y_err. This is appropriate for heteroscedastic data where different observations have different uncertainties. For homoscedastic data, weighted and raw residuals are proportional.
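The two notes above can be made concrete with a short sketch. It computes weighted residuals as weight * (observed - predicted) with weight = 1/y_err, and the reduced chi-square whose square root scales the parameter uncertainties. The helper names and the example numbers are illustrative assumptions.

```python
import numpy as np


def weighted_residuals(y, y_fit, y_err):
    """Return (observed - predicted) / y_err, i.e. weight * residual."""
    return (np.asarray(y) - np.asarray(y_fit)) / np.asarray(y_err)


def reduced_chi_square(wres, n_params):
    """Sum of squared weighted residuals divided by the degrees of freedom."""
    return float(np.sum(wres**2) / (wres.size - n_params))


y = np.array([1.0, 2.0, 3.0, 4.0])
y_fit = np.array([1.2, 1.8, 3.2, 3.8])
y_err = np.full(4, 0.1)  # homoscedastic: same uncertainty everywhere

wres = weighted_residuals(y, y_fit, y_err)
redchi = reduced_chi_square(wres, n_params=2)
# Factor applied to unscaled standard errors when the covariance is rescaled.
stderr_scale = np.sqrt(redchi)
```

With a constant `y_err` the weighted residuals are just the raw residuals divided by that constant, which is the proportionality mentioned for homoscedastic data.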
- clophfit.fitting.core.analyze_spectra_glob(titration, ds, dbands=None)#
Analyze multi-label spectra and visualize the results.
- Parameters:
titration (dict[str, pandas.DataFrame])
dbands (dict[str, tuple[int, int]] | None)
- Return type:
- clophfit.fitting.core.outlier2(ds, key='', threshold=3.0, *, plot_z_scores=False, error_model='uniform')#
Remove outliers and reassign weights.
- Parameters:
ds (Dataset) – Input dataset.
key (str) – Identifier for logging.
threshold (float) – Z-score threshold for outlier detection.
plot_z_scores (bool) – Whether to plot z-scores.
error_model (str) – Error reweighting model: “uniform” assigns uniform errors per label, “shot-noise” rescales physical errors preserving relative structure.
- Return type:
FitResult[Minimizer]
- clophfit.fitting.core.fit_binding_glob_reweighted(ds, key, threshold=2.05)#
RLS and outlier removal for multi-label titration datasets.
- Parameters:
key (str)
threshold (float)
- Return type:
clophfit.fitting.data_structures.FitResult[lmfit.minimizer.Minimizer]
- clophfit.fitting.core.fit_binding_glob_recursive(ds, max_iterations=15, tol=0.1)#
Analyze multi-label titration datasets using ODR.
- Parameters:
max_iterations (int)
tol (float)
- Return type:
clophfit.fitting.data_structures.FitResult[lmfit.minimizer.Minimizer]
- clophfit.fitting.core.fit_binding_glob_recursive_outlier(ds, tol=0.01, threshold=3.0)#
Analyze multi-label titration datasets using IRLS.
- Parameters:
tol (float)
threshold (float)
- Return type:
clophfit.fitting.data_structures.FitResult[lmfit.minimizer.Minimizer]
- clophfit.fitting.core.outlier_glob(residuals, *, threshold=2.0, plot_z_scores=False)#
Identify outliers.
- Parameters:
residuals (clophfit.clophfit_types.ArrayF)
threshold (float)
plot_z_scores (bool)
- Return type:
clophfit.clophfit_types.ArrayMask
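A z-score based outlier mask can be sketched as below. The exact z-score definition used by the library is not stated here, so this version (absolute deviation from the mean divided by the population standard deviation) is an assumption, and `zscore_outlier_mask` is a hypothetical name.

```python
import numpy as np


def zscore_outlier_mask(residuals, threshold=2.0):
    """Return a boolean mask that is True where |z-score| exceeds threshold."""
    residuals = np.asarray(residuals, dtype=float)
    std = residuals.std()
    if std == 0:
        # All residuals identical: nothing can be flagged.
        return np.zeros(residuals.shape, dtype=bool)
    z = np.abs(residuals - residuals.mean()) / std
    return z > threshold


mask = zscore_outlier_mask([0.1, -0.2, 0.0, 0.15, -0.1, 5.0], threshold=2.0)
```

Note that with few points the largest attainable z-score is bounded, so a high threshold may never flag anything in a small dataset.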