clophfit.fitting.core
=====================

.. py:module:: clophfit.fitting.core

.. autoapi-nested-parse::

   Clophfit: Fitting of Cl- binding and pH titration curves.

   This module provides a comprehensive suite of tools for analyzing titration data,
   particularly for chloride binding and pH titration experiments common in biochemistry,
   such as those involving fluorescent probes.

   Core Functionality:
   -------------------
   1.  **Data Modeling**: Implements a 1-site binding model suitable for both
       ligand concentration and pH titrations.

   2.  **Spectral Data Processing**:

       -   Processes raw spectral data (e.g., from fluorescence spectroscopy).
       -   Offers two methods for data reduction:

           -   Singular Value Decomposition (SVD) to extract the most significant
               spectral component.
           -   Band integration over a specified wavelength range.

   3.  **Curve Fitting**: Provides three distinct fitting backends to determine the
       dissociation constant (K) and other parameters:

       -   **Least-Squares (LM)**: Utilizes the `lmfit` library for robust non-linear
           least-squares minimization. Supports iterative reweighting and outlier
           removal.
       -   **Orthogonal Distance Regression (ODR)**: Employs `scipy.odr` to account
           for uncertainties in both x and y variables, which is crucial when x-values
           (e.g., pH measurements) have errors.
       -   **Bayesian Modeling (PyMC)**: Implements a hierarchical Bayesian model
           using `pymc`. This approach is powerful for:

           -   Quantifying parameter uncertainties as full posterior distributions.
           -   Modeling errors in x-values as latent variables.
           -   Sharing information between multiple experiments (hierarchical fitting)
               to obtain more robust parameter estimates.

   4.  **Result Visualization**: Includes extensive plotting functions to visualize:

       -   Raw and processed spectra.
       -   Fitted curves with confidence intervals.
       -   Diagnostic plots for SVD and Bayesian analyses (e.g., corner plots).
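
As a rough illustration of the 1-site model mentioned above, the sketch below writes one common parameterization in which the signal moves from ``S0`` (unbound) to ``S1`` (bound) as the site becomes occupied. The function ``one_site_signal`` and its argument names are hypothetical and chosen for this example; they are not taken from the clophfit API, and the library's own parameterization may differ.

```python
import numpy as np


def one_site_signal(x, K, S0, S1, *, is_ph):
    """Hypothetical 1-site binding curve (illustration only, not clophfit's API).

    S0 is the signal of the unbound state and S1 that of the bound state.
    """
    x = np.asarray(x, dtype=float)
    if is_ph:
        # pH titration: K is a pKa, and [H+]/Ka equals 10**(K - pH).
        ratio = 10.0 ** (K - x)
    else:
        # Ligand titration: K is a dissociation constant in the units of x.
        ratio = x / K
    bound_fraction = ratio / (1.0 + ratio)
    return S0 + (S1 - S0) * bound_fraction


# The signal sits halfway between the plateaus when x equals K.
half_cl = one_site_signal(10.0, K=10.0, S0=2.0, S1=0.0, is_ph=False)
half_ph = one_site_signal(7.0, K=7.0, S0=2.0, S1=0.0, is_ph=True)
```

Writing both titration types through the same bound fraction is what lets a single model serve ligand concentration and pH data; only the definition of the ratio changes.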


Functions
---------

.. autoapisummary::

   clophfit.fitting.core.weight_da
   clophfit.fitting.core.weight_multi_ds_titration
   clophfit.fitting.core.analyze_spectra
   clophfit.fitting.core.fit_binding_glob
   clophfit.fitting.core.analyze_spectra_glob
   clophfit.fitting.core.outlier2
   clophfit.fitting.core.fit_binding_glob_reweighted
   clophfit.fitting.core.fit_binding_glob_recursive
   clophfit.fitting.core.fit_binding_glob_recursive_outlier
   clophfit.fitting.core.outlier_glob


Module Contents
---------------

.. py:function:: weight_da(da, *, is_ph)

   Estimate initial weights for a DataArray by fitting it individually.

   The standard error of the residuals from this initial fit is used as
   the uncertainty (`y_err`) for subsequent weighted fits.

   :param da: The data array to be weighted.
   :type da: DataArray
   :param is_ph: Whether the titration is pH-based.
   :type is_ph: bool

   :returns: True if the weighting fit was successful, False otherwise.
   :rtype: bool
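
The uncertainty estimate described for ``weight_da`` can be sketched as follows. This is a minimal illustration, assuming the usual definition of the residual standard error (square root of the residual sum of squares divided by the degrees of freedom) and assuming that the same value is applied to every point of the array. The helper ``residual_std_error`` and the toy numbers are hypothetical and not part of clophfit.

```python
import numpy as np


def residual_std_error(y_obs, y_fit, n_params):
    """Standard error of the residuals: sqrt(SSR / (n - n_params))."""
    residuals = np.asarray(y_obs, dtype=float) - np.asarray(y_fit, dtype=float)
    dof = residuals.size - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    return float(np.sqrt(np.sum(residuals**2) / dof))


# Toy data: observed values and the values predicted by some initial fit.
y_obs = np.array([1.9, 1.6, 1.1, 0.6, 0.3, 0.1])
y_fit = np.array([2.0, 1.5, 1.0, 0.5, 0.3, 0.2])
sigma = residual_std_error(y_obs, y_fit, n_params=3)

# Assumption: one common uncertainty for every point of this array.
y_err = np.full_like(y_obs, sigma)
```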


.. py:function:: weight_multi_ds_titration(ds)

   Assign weights to all DataArrays within a Dataset.

   Iterates through each `DataArray` in the `Dataset`, calling `weight_da`
   to estimate `y_err`. For any `DataArray` where weighting fails (e.g., due
   to insufficient data), a fallback error is assigned based on the errors
   from successfully fitted arrays.

   The implementation keeps set operations and memory allocations to a minimum.
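
The fallback step can be pictured with the sketch below. The documentation does not say how the fallback error is derived from the successful fits, so the choice made here (the largest successful error, or 1.0 when every fit failed) is purely an assumption for illustration, as are the label names and values.

```python
# Hypothetical per-label errors; None marks a label whose weighting fit failed.
estimated = {"y1": 0.12, "y2": None, "y3": 0.20}

succeeded = [err for err in estimated.values() if err is not None]

# Assumption: fall back to the largest successful error (a conservative choice),
# or to 1.0 when no weighting fit succeeded at all.
fallback = max(succeeded) if succeeded else 1.0

y_err = {
    label: (err if err is not None else fallback)
    for label, err in estimated.items()
}
```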


.. py:function:: analyze_spectra(spectra, *, is_ph, band = None)

   Analyze a spectral titration, fit the data, and plot the results.

   This function reduces the spectra either by Singular Value Decomposition
   (SVD) or by integrating them over a specified band.

   :param spectra: The DataFrame containing spectra (one spectrum for each column).
   :type spectra: pd.DataFrame
   :param is_ph: Whether the x-axis represents pH.
   :type is_ph: bool
   :param band: If provided, use the 'band' integration method. Otherwise, use 'svd'.
   :type band: tuple[int, int] | None

   :returns: An object containing the fit results and the summary plot.
   :rtype: FitResult[Minimizer]

   :raises ValueError: If the band method is used and the band limits are not in the
       spectra's index.

   .. rubric:: Notes

   For the SVD method, creates plots of the spectra, the principal component
   vectors, the singular values, the fit of the first principal component, and
   the PCA. For the band method, creates plots of the spectra and the fit only.
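
The two data-reduction routes can be compared on synthetic data with plain NumPy. This is only a sketch: the helpers ``svd_first_component`` and ``integrate_band`` are hypothetical names, band integration is assumed here to be a simple sum over the selected wavelengths, and the real function works on a ``pd.DataFrame`` and also fits and plots the result.

```python
import numpy as np

# Synthetic spectra: rows are wavelengths, columns are titration points.
wavelengths = np.arange(500, 560, 10)  # 500, 510, ..., 550
band_shape = np.array([0.1, 0.5, 1.0, 0.8, 0.3, 0.1])
amplitudes = np.array([2.0, 1.6, 1.0, 0.5, 0.2])  # signal per titration point
spectra = np.outer(band_shape, amplitudes)


def svd_first_component(matrix):
    """Return the titration profile of the dominant singular component."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    profile = s[0] * vt[0]
    # The sign of a singular vector is arbitrary; keep the profile mostly positive.
    return profile if profile.sum() >= 0 else -profile


def integrate_band(matrix, index, band):
    """Sum intensities between two wavelengths that must exist in the index."""
    lo, hi = band
    if lo not in index or hi not in index:
        raise ValueError(f"band limits {band} not found in the wavelength index")
    mask = (index >= lo) & (index <= hi)
    return matrix[mask].sum(axis=0)


y_svd = svd_first_component(spectra)
y_band = integrate_band(spectra, wavelengths, band=(510, 530))
```

For this noise-free, single-component example both reductions give curves proportional to ``amplitudes``; with real data they can differ because SVD uses the whole spectrum while the band sum uses only the chosen range.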


.. py:function:: fit_binding_glob(ds, *, robust = False)

   Analyze multi-label titration datasets and visualize the results.

   :param ds: Input dataset with x, y, and y_err for each label.
   :type ds: Dataset
   :param robust: If True, use Huber loss for robust fitting (reduces outlier influence).
   :type robust: bool

   :rtype: FitResult[Minimizer]

   :raises InsufficientDataError: If there are not enough data points for the number of parameters.

   .. rubric:: Notes

   Parameter uncertainties are scaled by sqrt(reduced_chi_sq) via lmfit's
   Minimizer(scale_covar=True), which improves coverage when errors are
   underestimated.

   Residuals returned are WEIGHTED (weight * (observed - predicted)) where
   weight = 1/y_err. This is appropriate for heteroscedastic data where
   different observations have different uncertainties. For homoscedastic data,
   weighted and raw residuals are proportional.
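
The quantities named in these notes can be reproduced by hand on a few made-up numbers. The sketch below assumes the standard definitions (weighted residual as deviation divided by ``y_err``, reduced chi-square as the weighted sum of squares divided by the degrees of freedom); the data and ``n_params`` are hypothetical, and no clophfit or lmfit call is made.

```python
import numpy as np

y_obs = np.array([2.05, 1.48, 1.02, 0.55, 0.28])
y_pred = np.array([2.00, 1.50, 1.00, 0.50, 0.30])
y_err = np.array([0.05, 0.05, 0.02, 0.02, 0.02])
n_params = 3  # hypothetical number of fitted parameters

# Weighted residuals: each deviation expressed in units of its own error.
weighted = (y_obs - y_pred) / y_err

# Reduced chi-square and the factor that multiplies the parameter standard
# errors when the covariance matrix is scaled by the reduced chi-square.
redchi = np.sum(weighted**2) / (y_obs.size - n_params)
stderr_scale = np.sqrt(redchi)

# With one common error, weighted and raw residuals differ only by a constant.
raw = y_obs - y_pred
uniform_weighted = raw / 0.05
```

A ``stderr_scale`` above 1 indicates that the scatter is larger than the supplied errors predict, which is the case in which the scaling widens the reported uncertainties.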


.. py:function:: analyze_spectra_glob(titration, ds, dbands = None)

   Analyze multi-label spectra and visualize the results.


.. py:function:: outlier2(ds, key = '', threshold = 3.0, *, plot_z_scores = False, error_model = 'uniform')

   Remove outliers and reassign weights.

   :param ds: Input dataset.
   :type ds: Dataset
   :param key: Identifier for logging.
   :type key: str
   :param threshold: Z-score threshold for outlier detection.
   :type threshold: float
   :param plot_z_scores: Whether to plot z-scores.
   :type plot_z_scores: bool
   :param error_model: Error reweighting model: "uniform" assigns uniform errors per label,
                       "shot-noise" rescales physical errors preserving relative structure.
   :type error_model: str

   :rtype: FitResult[Minimizer]


.. py:function:: fit_binding_glob_reweighted(ds, key, threshold = 2.05)

   RLS and outlier removal for multi-label titration datasets.


.. py:function:: fit_binding_glob_recursive(ds, max_iterations = 15, tol = 0.1)

   Analyze multi-label titration datasets using ODR.


.. py:function:: fit_binding_glob_recursive_outlier(ds, tol = 0.01, threshold = 3.0)

   Analyze multi-label titration datasets using IRLS.


.. py:function:: outlier_glob(residuals, *, threshold = 2.0, plot_z_scores = False)

   Identify outliers.
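
Z-score thresholding of residuals, the general technique that the ``threshold`` and ``plot_z_scores`` parameters suggest, can be sketched as follows. How clophfit actually computes its z-scores is not stated here, so the plain mean and standard deviation used below are an assumption, and ``zscore_outliers`` is a hypothetical helper.

```python
import numpy as np


def zscore_outliers(residuals, threshold=2.0):
    """Return a boolean mask that is True where |z-score| exceeds threshold."""
    residuals = np.asarray(residuals, dtype=float)
    spread = residuals.std()
    if spread == 0:
        # All residuals are identical, so nothing stands out.
        return np.zeros(residuals.shape, dtype=bool)
    z = (residuals - residuals.mean()) / spread
    return np.abs(z) > threshold


residuals = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.0, 0.1, -0.15, 5.0]
mask = zscore_outliers(residuals, threshold=2.0)
```

Because a single large residual inflates both the mean and the standard deviation, small datasets cap the attainable z-score; a robust variant based on the median and the median absolute deviation is a common alternative.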