clophfit.fitting.core
=====================

.. py:module:: clophfit.fitting.core

.. autoapi-nested-parse::

   Clophfit: Fitting of Cl- binding and pH titration curves.

   This module provides tools for analyzing titration data, particularly for
   chloride binding and pH titration experiments common in biochemistry, such
   as those involving fluorescent probes.

   Core Functionality:
   -------------------

   1. **Data Modeling**: Implements a 1-site binding model suitable for both
      ligand concentration and pH titrations.

   2. **Spectral Data Processing**:

      - Processes raw spectral data (e.g., from fluorescence spectroscopy).
      - Offers two methods for data reduction:

        - Singular Value Decomposition (SVD) to extract the most significant
          spectral component.
        - Band integration over a specified wavelength range.

   3. **Curve Fitting**: Provides three distinct fitting backends to determine
      the dissociation constant (K) and other parameters:

      - **Least-Squares (LM)**: Utilizes the `lmfit` library for robust
        non-linear least-squares minimization. Supports iterative reweighting
        and outlier removal.
      - **Orthogonal Distance Regression (ODR)**: Employs `scipy.odr` to
        account for uncertainties in both x and y variables, which is crucial
        when x-values (e.g., pH measurements) have errors.
      - **Bayesian Modeling (PyMC)**: Implements a hierarchical Bayesian model
        using `pymc`. This approach is powerful for:

        - Quantifying parameter uncertainties as full posterior distributions.
        - Modeling errors in x-values as latent variables.
        - Sharing information between multiple experiments (hierarchical
          fitting) to obtain more robust parameter estimates.

   4. **Result Visualization**: Includes extensive plotting functions to
      visualize:

      - Raw and processed spectra.
      - Fitted curves with confidence intervals.
      - Diagnostic plots for SVD and Bayesian analyses (e.g., corner plots).


Functions
---------

.. autoapisummary::

   clophfit.fitting.core.weight_da
   clophfit.fitting.core.weight_multi_ds_titration
   clophfit.fitting.core.analyze_spectra
   clophfit.fitting.core.fit_binding_glob
   clophfit.fitting.core.analyze_spectra_glob
   clophfit.fitting.core.outlier2
   clophfit.fitting.core.fit_binding_glob_reweighted
   clophfit.fitting.core.fit_binding_glob_recursive
   clophfit.fitting.core.fit_binding_glob_recursive_outlier
   clophfit.fitting.core.outlier_glob


Module Contents
---------------

.. py:function:: weight_da(da, *, is_ph)

   Estimate initial weights for a DataArray by fitting it individually.

   The standard error of the residuals from this initial fit is used as the
   uncertainty (`y_err`) for subsequent weighted fits.

   :param da: The data array to be weighted.
   :type da: DataArray
   :param is_ph: Whether the titration is pH-based.
   :type is_ph: bool

   :returns: True if the weighting fit was successful, False otherwise.
   :rtype: bool


.. py:function:: weight_multi_ds_titration(ds)

   Assign weights to all DataArrays within a Dataset.

   Iterates through each `DataArray` in the `Dataset`, calling `weight_da` to
   estimate `y_err`. For any `DataArray` where weighting fails (e.g., due to
   insufficient data), a fallback error is assigned based on the errors from
   successfully fitted arrays.

   Optimized version with reduced set operations and memory allocations.


.. py:function:: analyze_spectra(spectra, *, is_ph, band = None)

   Analyze spectra titration, fit the data, and plot the results.

   This function either performs Singular Value Decomposition (SVD) or
   integrates the spectra over a specified band.

   :param spectra: The DataFrame containing spectra (one spectrum for each column).
   :type spectra: pd.DataFrame
   :param is_ph: Whether the x-axis represents pH.
   :type is_ph: bool
   :param band: If provided, use the 'band' integration method. Otherwise, use 'svd'.
   :type band: tuple[int, int] | None

   :returns: An object containing the fit results and the summary plot.
   :rtype: FitResult[Minimizer]

   :raises ValueError: If the band parameters are not in the spectra's index
       when the band method is used.

   .. rubric:: Notes

   For the SVD method, creates plots of the spectra, principal component
   vectors, singular values, fit of the first principal component, and PCA.
   For the band method, plots only the spectra and the fit.


.. py:function:: fit_binding_glob(ds, *, robust = False)

   Analyze multi-label titration datasets and visualize the results.

   :param ds: Input dataset with x, y, and y_err for each label.
   :type ds: Dataset
   :param robust: If True, use Huber loss for robust fitting (reduces outlier influence).
   :type robust: bool

   :rtype: FitResult[Minimizer]

   :raises InsufficientDataError: If there are not enough data points for the
       number of parameters.

   .. rubric:: Notes

   Parameter uncertainties are scaled by sqrt(reduced_chi_sq) via lmfit's
   Minimizer(scale_covar=True), which improves coverage when errors are
   underestimated.

   Residuals returned are WEIGHTED (weight * (observed - predicted)) where
   weight = 1/y_err. This is appropriate for heteroscedastic data where
   different observations have different uncertainties. For homoscedastic
   data, weighted and raw residuals are proportional.


.. py:function:: analyze_spectra_glob(titration, ds, dbands = None)

   Analyze multi-label spectra and visualize the results.


.. py:function:: outlier2(ds, key = '', threshold = 3.0, *, plot_z_scores = False, error_model = 'uniform')

   Remove outliers and reassign weights.

   :param ds: Input dataset.
   :type ds: Dataset
   :param key: Identifier for logging.
   :type key: str
   :param threshold: Z-score threshold for outlier detection.
   :type threshold: float
   :param plot_z_scores: Whether to plot z-scores.
   :type plot_z_scores: bool
   :param error_model: Error reweighting model: "uniform" assigns uniform
       errors per label, "shot-noise" rescales physical errors preserving
       relative structure.
   :type error_model: str

   :rtype: FitResult[Minimizer]


.. py:function:: fit_binding_glob_reweighted(ds, key, threshold = 2.05)

   RLS and outlier removal for multi-label titration datasets.


.. py:function:: fit_binding_glob_recursive(ds, max_iterations = 15, tol = 0.1)

   Analyze multi-label titration datasets using ODR.


.. py:function:: fit_binding_glob_recursive_outlier(ds, tol = 0.01, threshold = 3.0)

   Analyze multi-label titration datasets using IRLS.


.. py:function:: outlier_glob(residuals, *, threshold = 2.0, plot_z_scores = False)

   Identify outliers.
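
The weighted residuals described in the ``fit_binding_glob`` notes, and the z-score thresholding that ``outlier2`` and ``outlier_glob`` refer to, can be illustrated with a minimal sketch. It is not the module's implementation: ``weighted_residuals`` and ``flag_outliers`` are hypothetical helper names, the data values are made up for illustration, and the plain mean and standard-deviation z-score is an assumption, since this page does not state how the z-scores are computed.

```python
import numpy as np


def weighted_residuals(observed, predicted, y_err):
    """Return weight * (observed - predicted) with weight = 1 / y_err."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    y_err = np.asarray(y_err, dtype=float)
    return (observed - predicted) / y_err


def flag_outliers(residuals, threshold=2.0):
    """Hypothetical helper: mask of points whose |z-score| exceeds threshold.

    Assumption: z = (r - mean(r)) / std(r) with the population standard
    deviation; the real library may use a different statistic.
    """
    residuals = np.asarray(residuals, dtype=float)
    std = residuals.std()
    if std == 0.0:
        # All residuals are identical, so no point stands out.
        return np.zeros(residuals.shape, dtype=bool)
    z = (residuals - residuals.mean()) / std
    return np.abs(z) > threshold


# Illustrative data: the fifth observation deviates strongly from the model.
observed = [1.00, 0.92, 0.75, 0.51, 0.90, 0.12, 0.05]
predicted = [1.00, 0.90, 0.74, 0.50, 0.27, 0.13, 0.05]
y_err = [0.02] * 7

mask = flag_outliers(weighted_residuals(observed, predicted, y_err))
print(mask)
```

With these values only the fifth point is flagged. In this sketch a sample of n residuals cannot produce a z-score above sqrt(n - 1), so the choice of ``threshold`` should take the number of data points into account.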