Mutual Information Functions
This module contains various mutual information estimation methods and related functions.
Core MI Functions
- driada.information.get_mi(x, y, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=False, mi_estimator_kwargs=None)
Compute mutual information between two (possibly multidimensional) variables.
Efficiently calculates mutual information (MI) between continuous, discrete, or mixed-type variables. Supports both univariate and multivariate inputs, with time-shifted analysis capabilities for temporal dependencies.
- Parameters:
x (TimeSeries, MultiTimeSeries, or array-like) – First variable. Can be:
  - TimeSeries: univariate time series (continuous or discrete)
  - MultiTimeSeries: multivariate time series
  - array-like: converted to TimeSeries internally
y (TimeSeries, MultiTimeSeries, or array-like) – Second variable. Must have same length as x.
shift (int, default=0) – Number of samples to shift y after downsampling. Positive values shift y forward in time (y leads x). Used for time-delayed MI.
ds (int, default=1) – Downsampling factor. Takes every ds-th sample to reduce computation. Note: for GCMI with ds>1, copula transform is applied before downsampling which may affect accuracy for large ds or non-smooth signals.
k (int, default=5) – Number of nearest neighbors for KSG estimator. Common values:
  - k=4-5: optimal for most applications
  - k=3-10: for low dimensions (d≤3)
  - k=10-20: for higher dimensions
estimator ({'gcmi', 'ksg'}, default='gcmi') – MI estimation method:
  - ‘gcmi’: Gaussian Copula MI (fast, gives lower bound)
  - ‘ksg’: Kraskov-Stögbauer-Grassberger (slower, more accurate)
check_for_coincidence (bool, default=False) – If True, checks for MI(X,X) computation and handles it as follows:
  - For discrete single TimeSeries: returns H(X) (well-defined)
  - For continuous variables: raises ValueError (MI would be infinite)
  - For discrete MultiTimeSeries: raises NotImplementedError (not yet supported)
Set to False to bypass this check (use with caution).
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.
- Returns:
Mutual information in bits. Always non-negative (clipped at 0). For GCMI, this is a lower bound on the true MI.
- Return type:
Notes
The function automatically handles different variable type combinations:
  - Continuous-Continuous: Uses GCMI or KSG as specified
  - Discrete-Discrete: Uses exact MI computation (same for both estimators)
  - Mixed (Continuous-Discrete): Uses appropriate mixed estimators
  - Multivariate: Supported for continuous variables only
For discrete-discrete MI, the estimator parameter is ignored since MI can be computed exactly from the joint probability distribution.
GCMI is recommended for most applications as it’s much faster and provides a useful lower bound. KSG is more accurate but computationally expensive, especially for large datasets.
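For the discrete-discrete case, the exact computation mentioned above amounts to a plug-in sum over the empirical joint distribution. The sketch below is a minimal standard-library illustration of that formula, not driada's implementation; `discrete_mi_bits` is a hypothetical helper name.

```python
import math
from collections import Counter


def discrete_mi_bits(x, y):
    """Plug-in MI in bits from empirical joint and marginal frequencies."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


# Y is a deterministic relabeling of X, so I(X;Y) = H(X) = 1 bit.
mi_same = discrete_mi_bits([0, 0, 1, 1], ["a", "a", "b", "b"])

# X and Y are independent: every combination appears equally often.
mi_indep = discrete_mi_bits([0, 0, 1, 1], [0, 1, 0, 1])
```

No estimator choice enters this calculation, which is why the estimator parameter has no effect for discrete-discrete inputs.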
Examples
>>> # Simple correlation detection
>>> np.random.seed(42)
>>> x = np.random.randn(1000)
>>> y = x + np.random.randn(1000) * 0.5
>>> mi = get_mi(x, y)
>>> print(f"MI = {mi:.3f} bits")
MI = 1.114 bits

>>> # Time-delayed mutual information
>>> ts1 = TimeSeries(np.sin(np.linspace(0, 10*np.pi, 1000)))
>>> ts2 = TimeSeries(np.sin(np.linspace(0, 10*np.pi, 1000) + np.pi/4))
>>> mi_delay = get_mi(ts1, ts2, shift=25)  # Check 25-sample delay

>>> # Multivariate MI
>>> mts1 = MultiTimeSeries(np.random.randn(3, 1000), discrete=False)
>>> mts2 = MultiTimeSeries(np.random.randn(2, 1000), discrete=False)
>>> mi_multi = get_mi(mts1, mts2)
See also
get_1d_mi: MI for univariate time series (called internally)
get_multi_mi: MI between multiple and single time series
get_tdmi: Time-delayed MI for finding optimal embedding delays
conditional_mi: Conditional mutual information I(X;Y|Z)
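Because the shift is applied after downsampling, it is counted in downsampled samples rather than original ones. The sketch below illustrates that ordering on a plain list. It assumes a circular roll in the style of numpy.roll (whether driada wraps around in exactly this way is an assumption), and `downsample_then_shift` is a hypothetical helper name.

```python
def downsample_then_shift(values, ds=1, shift=0):
    """Take every ds-th sample, then roll the result by `shift` positions."""
    reduced = list(values[::ds])
    if not reduced:
        return reduced
    s = shift % len(reduced)
    # Circular roll (assumed numpy.roll convention): elements move toward higher indices.
    return reduced[-s:] + reduced[:-s] if s else reduced


y = list(range(12))                                   # 0, 1, ..., 11
y_ds = downsample_then_shift(y, ds=3)                 # [0, 3, 6, 9]
y_shifted = downsample_then_shift(y, ds=3, shift=1)   # [9, 0, 3, 6]
```

With ds=3, a shift of 1 therefore corresponds to a lag of three samples in the original series.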
- driada.information.get_1d_mi(ts1, ts2, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=True, mi_estimator_kwargs=None)
Compute mutual information between two 1D variables.
- Parameters:
ts1 (TimeSeries/MultiTimeSeries instance or numpy array) – First time series or variable
ts2 (TimeSeries/MultiTimeSeries instance or numpy array) – Second time series or variable
shift (int, default=0) – Number of samples by which ts2 is rolled after downsampling by a factor of ds.
ds (int, default=1) – Downsampling factor (every ds-th point is kept).
k (int, default=5) – number of neighbors for ksg estimator
estimator (str, default='gcmi') –
Estimation method. Either ‘ksg’ (accurate but slow) or ‘gcmi’ (fast, but estimates a lower bound on MI). In most cases ‘gcmi’ should be preferred.
Note on downsampling with GCMI: For performance reasons, when ds > 1, the copula transformation is applied to the full data before downsampling. This is an approximation that works well for small downsampling factors (ds ≤ 5) and smooth signals, but may introduce inaccuracies for large downsampling factors or highly variable signals.
check_for_coincidence (bool, default=True) – If True, checks for MI(X,X) computation at zero shift:
  - For discrete variables: returns H(X)
  - For continuous variables: raises ValueError (MI is infinite)
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.
- Returns:
mi – Mutual information in bits (or its lower bound in case of ‘gcmi’ estimator) between ts1 and (possibly) shifted ts2. Both estimators return values in bits.
- Return type:
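The downsampling note for GCMI comes down to the order of two operations: ranking the full series and then subsampling, versus subsampling and then ranking. The sketch below contrasts the two on a tiny series. It is a standard-library illustration only; `copula_normalize` is a hypothetical helper, and the rank / (n + 1) convention is an assumption.

```python
from statistics import NormalDist


def copula_normalize(values):
    """Map values to standard-normal scores via their ranks (rank / (n + 1))."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = NormalDist().inv_cdf(rank / (n + 1))
    return scores


x = [0.3, 2.0, -1.0, 0.5, 1.5, -0.2, 0.9]
ds = 3

transform_then_ds = copula_normalize(x)[::ds]   # ranks computed on all 7 samples
ds_then_transform = copula_normalize(x[::ds])   # ranks computed on the 3 kept samples
```

Both variants preserve the ordering of the retained samples, but only the second spreads them evenly over the normal quantiles; the first inherits scores from the full-length ranking, which is the approximation the note refers to.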
- driada.information.get_multi_mi(tslist, ts2, shift=0, ds=1, k=5, estimator='gcmi', mi_estimator_kwargs=None)
Compute mutual information between multiple time series and a single time series.
- Parameters:
tslist (list of TimeSeries) – List of TimeSeries objects (multivariate X)
ts2 (TimeSeries) – Single TimeSeries object (Y)
shift (int, optional) – Number of samples to shift ts2. Default: 0
ds (int, optional) – Downsampling factor. Default: 1
k (int, optional) – Number of neighbors for KSG estimator. Default: 5
estimator (str, optional) –
Estimation method. ‘gcmi’ (fast, lower bound) or ‘ksg’ (slower, more accurate). Default: ‘gcmi’
Note on downsampling with GCMI: For performance reasons, when ds > 1, the copula transformation is applied to the full data before downsampling. This is an approximation that works well for small downsampling factors (ds ≤ 5) and smooth signals, but may introduce inaccuracies for large downsampling factors or highly variable signals.
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.
- Returns:
Mutual information I(X;Y) in bits where X is the multivariate input from tslist
- Return type:
- driada.information.conditional_mi(ts1, ts2, ts3, ds=1, k=5)
Calculate conditional mutual information I(X;Y|Z).
Computes the conditional mutual information between ts1 (X) and ts2 (Y) given ts3 (Z) for various combinations of continuous and discrete variables.
- Parameters:
ts1 (TimeSeries) – First variable (X). Must be continuous.
ts2 (TimeSeries) – Second variable (Y). Can be continuous or discrete.
ts3 (TimeSeries) – Conditioning variable (Z). Can be continuous or discrete.
ds (int, optional) – Downsampling factor. Default: 1.
k (int, optional) – Number of neighbors for entropy estimation. Default: 5.
- Returns:
Conditional mutual information I(X;Y|Z) in bits.
- Return type:
- Raises:
ValueError – If ts1 is discrete (only continuous X is currently supported).
Notes
Supports four cases:
  - CCC (all continuous): uses Gaussian copula
  - CCD (X, Y continuous, Z discrete): uses Gaussian copula per Z value
  - CDC (X, Z continuous, Y discrete): uses chain rule identity
  - CDD (X continuous, Y, Z discrete): uses entropy decomposition
For the CDD case, GCMI estimator has limitations due to uncontrollable biases (copula transform does not conserve entropy). See https://doi.org/10.1002/hbm.23471 for details.
Estimated conditional MI can be negative due to estimation biases, especially with finite samples. This function uses adaptive thresholds:
  - Small negatives (< 1% of entropy scale): silently clipped to 0
  - Moderate negatives (1-10% of scale): clipped with a warning
  - Large negatives (> 10% of scale): raises ValueError
The CDD case is particularly prone to negative biases due to mixed estimators and receives more lenient treatment.
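The three-tier handling of negative estimates can be sketched as a small policy function. This is an illustration of the thresholds listed above, not driada's code: `clip_negative_cmi` and its `entropy_scale` argument are hypothetical, and how the library derives the entropy scale or relaxes the limits for the CDD case is not specified here.

```python
import warnings


def clip_negative_cmi(cmi, entropy_scale):
    """Apply the three-tier policy to a raw CMI estimate (both values in bits)."""
    if cmi >= 0:
        return cmi
    relative = -cmi / entropy_scale
    if relative < 0.01:
        return 0.0  # small negative: clip silently
    if relative <= 0.10:
        warnings.warn(f"Negative CMI ({cmi:.4f} bits) clipped to 0")
        return 0.0  # moderate negative: clip with a warning
    raise ValueError(f"CMI is strongly negative: {cmi:.4f} bits")
```

For example, with an entropy scale of 2 bits, an estimate of -0.01 bits is clipped silently, -0.1 bits is clipped with a warning, and -0.5 bits raises an error.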
- driada.information.interaction_information(ts1, ts2, ts3, ds=1, k=5)
Calculate three-way interaction information II(X;Y;Z).
The interaction information quantifies the amount of information that is shared among all three variables. It can be positive (synergy) or negative (redundancy).
- Parameters:
ts1 (TimeSeries) – First variable (X). Must be continuous.
ts2 (TimeSeries) – Second variable (Y). Can be continuous or discrete.
ts3 (TimeSeries) – Third variable (Z). Can be continuous or discrete.
ds (int, optional) – Downsampling factor. Default: 1.
k (int, optional) – Number of neighbors for entropy estimation. Default: 5.
- Returns:
Interaction information II(X;Y;Z) in bits.
  - II < 0: Redundancy (Y and Z provide overlapping information about X)
  - II > 0: Synergy (Y and Z together provide more information than separately)
- Return type:
Notes
The interaction information is computed using the Williams & Beer convention:
II(X;Y;Z) = I(X;Y|Z) - I(X;Y) = I(X;Z|Y) - I(X;Z)
This implementation assumes X is the target variable (e.g., neural activity) and Y, Z are predictor variables (e.g., behavioral features).
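The sign convention is easiest to check on toy variables where every entropy is exact. The sketch below evaluates II(X;Y;Z) = I(X;Y|Z) - I(X;Y) with plug-in entropies. It uses small discrete variables purely for exact arithmetic (the library function itself requires a continuous X), and `entropy_bits` and `interaction_information_bits` are hypothetical helper names.

```python
import math
from collections import Counter


def entropy_bits(*columns):
    """Joint plug-in entropy (bits) of one or more discrete columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def interaction_information_bits(x, y, z):
    """II(X;Y;Z) = I(X;Y|Z) - I(X;Y), both terms written as entropy sums."""
    mi_xy = entropy_bits(x) + entropy_bits(y) - entropy_bits(x, y)
    cmi_xy_z = (entropy_bits(x, z) + entropy_bits(y, z)
                - entropy_bits(x, y, z) - entropy_bits(z))
    return cmi_xy_z - mi_xy


# All four (y, z) combinations, equally likely.
y = [0, 0, 1, 1]
z = [0, 1, 0, 1]

# X = Y xor Z: X is predictable only from Y and Z together (synergy).
x_xor = [a ^ b for a, b in zip(y, z)]
ii_synergy = interaction_information_bits(x_xor, y, z)   # +1 bit

# X = Y = Z: both predictors carry the same information (redundancy).
ii_redundancy = interaction_information_bits(y, y, y)    # -1 bit
```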
Time-Delayed MI
- driada.information.get_tdmi(data, min_shift=1, max_shift=100, nn=5, estimator='gcmi')
Compute time-delayed mutual information (TDMI) for a time series.
Calculates mutual information between a time series and delayed versions of itself across a range of time lags. Useful for detecting temporal dependencies and optimal embedding delays.
- Parameters:
data (array-like) – 1D time series data.
min_shift (int, optional) – Minimum time lag to compute. Default: 1.
max_shift (int, optional) – Maximum time lag to compute (exclusive). Default: 100.
nn (int, optional) – Number of nearest neighbors for KSG MI estimation. Only used when estimator=’ksg’. Default: 5.
estimator ({'gcmi', 'ksg'}, optional) – MI estimator to use. ‘gcmi’ is faster but provides a lower bound, ‘ksg’ is more accurate but slower. Default: ‘gcmi’.
- Returns:
TDMI values in bits for each time lag from min_shift to max_shift-1.
- Return type:
Notes
  - The first minimum in TDMI often indicates the optimal embedding delay
  - High TDMI at specific lags indicates periodic structure
  - All values are returned in bits for consistency
  - For long time series, ‘gcmi’ is recommended for speed
  - For precise embedding delay detection, ‘ksg’ may be more accurate
Examples
>>> data = np.sin(np.linspace(0, 10*np.pi, 1000))
>>> tdmi = get_tdmi(data, min_shift=1, max_shift=50)
>>> optimal_delay = np.argmin(tdmi) + 1  # Lag of the global minimum (index 0 is lag 1)
>>>
>>> # Using KSG for more accuracy
>>> tdmi_ksg = get_tdmi(data, min_shift=1, max_shift=50, estimator='ksg')
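A global argmin and the first minimum of the TDMI curve can disagree when the curve has several dips. The sketch below finds the first interior local minimum of a TDMI array and maps its index back to a lag; `first_local_minimum_lag` is a hypothetical helper and the TDMI values are made up for illustration.

```python
def first_local_minimum_lag(tdmi, min_shift=1):
    """Return the lag of the first interior local minimum, or None if there is none."""
    for i in range(1, len(tdmi) - 1):
        if tdmi[i] < tdmi[i - 1] and tdmi[i] <= tdmi[i + 1]:
            return min_shift + i  # index 0 corresponds to lag `min_shift`
    return None


# Synthetic TDMI curve: dips at lag 4, then reaches a deeper minimum at lag 8.
tdmi = [0.9, 0.6, 0.4, 0.3, 0.5, 0.45, 0.2, 0.1, 0.3]
first_min = first_local_minimum_lag(tdmi, min_shift=1)         # lag 4
global_min = min(range(len(tdmi)), key=tdmi.__getitem__) + 1   # lag 8
```

On noisy curves a light smoothing step before the search may be needed, since small fluctuations can create spurious early minima.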
Similarity Measures
- driada.information.get_sim(x, y, metric, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=False, mi_estimator_kwargs=None)
Compute similarity between two (possibly multidimensional) variables.
- Parameters:
x (TimeSeries, MultiTimeSeries, or numpy.ndarray) – First time series. If numpy array, will be converted to TimeSeries (1D) or MultiTimeSeries (2D+).
y (TimeSeries, MultiTimeSeries, or numpy.ndarray) – Second time series. If numpy array, will be converted to TimeSeries (1D) or MultiTimeSeries (2D+).
metric (str) – Similarity metric to compute. Options include:
  - ‘mi’: Mutual information (supports multivariate data)
  - ‘spearman’, ‘pearson’, ‘kendall’: Correlation coefficients (univariate only)
  - ‘av’: Activity ratio (requires one binary and one continuous variable)
  - ‘fast_pearsonr’: Fast Pearson correlation (univariate only)
  - Any scipy.stats correlation function name (univariate only)
shift (int, optional) – Time shift to apply to y before computing similarity. Positive values shift y forward in time. Default is 0.
ds (int, optional) – Downsampling factor. Only every ds-th sample is used. Default is 1.
k (int, optional) – Number of nearest neighbors for KSG mutual information estimator. Only used when metric=’mi’ and estimator=’ksg’. Default is 5.
estimator ({'gcmi', 'ksg'}, optional) – Estimator to use for mutual information calculation. Only used when metric=’mi’. Default is ‘gcmi’.
check_for_coincidence (bool, optional) – Whether to check if x and y contain identical data (which would result in infinite MI). Only used for MI calculation. Default is False.
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.
- Returns:
similarity – Similarity value between x and (possibly shifted) y. The interpretation depends on the metric:
  - MI: Non-negative value in bits
  - Correlations: Value between -1 and 1
  - Activity ratio: Non-negative ratio
- Return type:
- Raises:
ValueError – If metric is not supported for the given variable types (e.g., ‘av’ requires one binary and one continuous variable). If trying to use correlation metrics with multivariate data.
Exception – If multidimensional inputs are not provided as MultiTimeSeries.
Gaussian Copula MI (GCMI)
Fast parametric mutual information estimation using Gaussian copulas.
- driada.information.mi_gg(x, y, biascorrect=True, demeaned=False)
Mutual information (MI) between two Gaussian variables in bits.
Computes mutual information between two (possibly multidimensional) Gaussian variables using the entropy relation: I(X;Y) = H(X) + H(Y) - H(X,Y). Assumes variables follow a multivariate Gaussian distribution.
- Parameters:
x (ndarray) – First variable, shape (n_features_x, n_samples) or (n_samples,) for 1D. Columns correspond to samples, rows to dimensions/variables.
y (ndarray) – Second variable, shape (n_features_y, n_samples) or (n_samples,) for 1D. Must have same number of samples as x.
biascorrect (bool, default=True) – Whether to apply bias correction to the MI estimate. Uses psi function (digamma) correction for finite sample bias in entropy estimation.
demeaned (bool, default=False) – Whether input data already has zero mean. Set True if data has been copula-normalized or otherwise centered.
- Returns:
Mutual information in bits. Always non-negative, with 0 indicating independence.
- Return type:
- Raises:
ValueError – If x and y have different number of samples, or if inputs are not 1D or 2D arrays.
Notes
This function assumes data follows a multivariate Gaussian distribution. For non-Gaussian data, use copula normalization first (via ctransform).
The bias correction uses the Miller-Madow correction generalized to multivariate Gaussians, improving accuracy for small sample sizes.
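For two scalar Gaussian variables the entropy relation has a closed form: the 2πe factors cancel and the covariance determinant equals var(X)·var(Y)·(1 - r²), so I(X;Y) = -0.5·log2(1 - r²) with r the Pearson correlation. The sketch below evaluates this without any bias correction. It is a standard-library illustration, not the mi_gg implementation, and the helper names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi_bits(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for scalar Gaussians, without bias correction."""
    r = pearson_r(x, y)
    return -0.5 * math.log2(1.0 - r * r)


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]
mi = gaussian_mi_bits(x, y)   # r = 0.8, so MI = -0.5 * log2(0.36), about 0.737 bits
```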
Examples
>>> # Independent Gaussian variables
>>> rng = np.random.RandomState(42)
>>> x = rng.randn(1, 1000)  # 1D Gaussian
>>> y = rng.randn(1, 1000)  # Independent 1D Gaussian
>>> mi = mi_gg(x, y)
>>> mi < 0.05  # Should be near 0 for independent variables
True

>>> # Correlated Gaussian variables
>>> x = rng.randn(1, 1000)
>>> y = 0.7 * x + 0.3 * rng.randn(1, 1000)  # Correlated
>>> mi = mi_gg(x, y)
>>> mi > 0.5  # Significant mutual information
True

>>> # Multidimensional case
>>> x = rng.randn(3, 1000)  # 3D Gaussian
>>> y = rng.randn(2, 1000)  # 2D Gaussian
>>> # Create correlation: y depends on x
>>> y[0] = 0.5 * x[0] + 0.5 * rng.randn(1000)
>>> mi = mi_gg(x, y)
>>> mi > 0.2  # Detects the dependency
True
References
Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.
- driada.information.gcmi_cc(x, y)
Gaussian-Copula Mutual Information between two continuous variables.
Main user-facing function for computing mutual information between continuous variables using the Gaussian Copula MI (GCMI) method. Handles arbitrary continuous distributions by transforming marginals to Gaussian via copula normalization.
- Parameters:
x (ndarray) – First continuous variable, shape (n_features_x, n_samples) or (n_samples,) for 1D. If 2D, rows are features and columns are samples. If 1D, treated as single feature with multiple samples.
y (ndarray) – Second continuous variable, shape (n_features_y, n_samples) or (n_samples,) for 1D. Must have same number of samples as x.
- Returns:
Mutual information in bits. Always non-negative, with 0 indicating independence. Provides a lower bound to the true MI.
- Return type:
Notes
The GCMI method:
  1. Transforms each variable to standard normal marginals using the empirical CDF (copula transform)
  2. Computes MI under the Gaussian copula assumption
  3. Applies bias correction for finite samples
This approach:
  - Is robust to outliers due to the rank-based transform
  - Is computationally efficient (no density estimation)
  - Provides an MI lower bound (exact for jointly Gaussian data)
  - Is suitable for continuous neural data (firing rates, LFP, etc.)
For discrete variables, use gcmi_cd or gccmi_ccd instead.
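The GCMI steps described in these notes can be sketched for scalar variables with the standard library alone. This is a simplified illustration rather than the gcmi_cc implementation: it omits the bias correction (the last of the listed steps), assumes no ties, and uses a rank / (n + 1) convention; `copula_normalize` and `gcmi_1d_bits` are hypothetical names.

```python
import math
import random
from statistics import NormalDist


def copula_normalize(values):
    """Replace each value by the standard-normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = NormalDist().inv_cdf(rank / (n + 1))
    return scores


def gcmi_1d_bits(x, y):
    """Copula transform followed by the scalar Gaussian MI formula (no bias correction)."""
    gx, gy = copula_normalize(x), copula_normalize(y)
    # Rank-based scores are symmetric about zero, so no demeaning is needed.
    sxy = sum(a * b for a, b in zip(gx, gy))
    sxx = sum(a * a for a in gx)
    syy = sum(b * b for b in gy)
    r = sxy / math.sqrt(sxx * syy)
    return -0.5 * math.log2(1.0 - r * r)


random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [a + 0.5 * random.gauss(0, 1) for a in x]

mi_raw = gcmi_1d_bits(x, y)
mi_warped = gcmi_1d_bits([math.exp(a) for a in x], y)   # monotonic warp of x
```

Because only ranks enter the estimate, the monotonic warp leaves the result unchanged, which is the source of the robustness to outliers and to non-Gaussian marginals.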
Examples
>>> # Linear relationship with non-Gaussian marginals
>>> rng = np.random.RandomState(42)
>>> x = rng.exponential(size=1000)  # Non-Gaussian
>>> y = 2 * x + rng.normal(0, 0.5, size=1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 1.0  # Detects strong dependency
True

>>> # Monotonic nonlinear relationship
>>> x = rng.exponential(size=1000)
>>> y = np.log(x + 1) + rng.normal(0, 0.1, size=1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 0.5  # Detects monotonic dependency
True

>>> # Multidimensional example
>>> x = rng.randn(3, 1000)  # 3D variable
>>> y = rng.randn(2, 1000)  # 2D variable
>>> # Create dependency
>>> y[0] = 0.5 * x[0] + 0.3 * x[1] + 0.2 * rng.randn(1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 0.3  # Detects multivariate dependency
True
References
Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.
- driada.information.gccmi_ccd(x, y, z, Zm=None)
Gaussian-Copula CMI between 2 continuous variables conditioned on a discrete variable.
Calculates the conditional mutual information (CMI) between two continuous variables x and y, conditioned on a discrete variable z, using a Gaussian copula approach. This method can handle multivariate continuous variables.
The Gaussian copula transforms the marginal distributions to standard Gaussian while preserving the dependence structure, allowing efficient estimation of mutual information.
- Parameters:
x (array_like, shape (n_features_x, n_samples) or (n_samples,)) – First continuous variable. If multivariate, features are in rows and samples in columns.
y (array_like, shape (n_features_y, n_samples) or (n_samples,)) – Second continuous variable. If multivariate, features are in rows and samples in columns.
z (array_like, shape (n_samples,)) – Discrete conditioning variable. Must contain integer values in the range [0, max(z)] (inclusive).
Zm (int, optional) – Number of unique values in the discrete variable z. If None (default), it is automatically computed as len(np.unique(z)). Providing this value can be useful if you know z doesn’t contain all possible values.
- Returns:
I – Conditional mutual information I(X;Y|Z) in bits.
- Return type:
- Raises:
ValueError – If x or y have more than 2 dimensions. If z is not a 1D array. If z does not contain integer values. If the number of samples doesn’t match across inputs.
Notes
The function uses a Gaussian copula transformation to estimate CMI. For each value of the discrete conditioning variable z, it transforms the conditional distributions of x and y to Gaussian, then computes the CMI using entropy calculations on the Gaussian-transformed data.
This is particularly useful for analyzing neural data where the conditioning variable might represent experimental conditions, stimulus types, or behavioral states.
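Conditioning on a discrete variable rests on the identity I(X;Y|Z) = Σ_z p(z)·I(X;Y|Z=z). The sketch below applies it with a scalar Gaussian MI inside each group. It is an illustration of the identity only: the per-group copula transform and the bias correction are omitted, the library may organise the entropy terms differently, and the helper names are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_mi_bits(x, y):
    """Scalar Gaussian MI, -0.5 * log2(1 - r^2), without bias correction."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r2 = sxy * sxy / (sxx * syy)
    return -0.5 * math.log2(1.0 - r2)


def cmi_given_discrete_bits(x, y, z):
    """I(X;Y|Z) as the p(z)-weighted average of per-group MI values."""
    groups = defaultdict(lambda: ([], []))
    for a, b, c in zip(x, y, z):
        groups[c][0].append(a)
        groups[c][1].append(b)
    n = len(z)
    return sum(len(gx) / n * gaussian_mi_bits(gx, gy) for gx, gy in groups.values())


# Positive coupling when z == 0, negative coupling when z == 1.
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0, 4.0, 2.0, 3.0, 1.0]
z = [0, 0, 0, 0, 1, 1, 1, 1]

cmi = cmi_given_discrete_bits(x, y, z)   # about 0.737 bits
mi_pooled = gaussian_mi_bits(x, y)       # 0 bits: the two couplings cancel
```

The pooled estimate misses the dependency entirely, while the conditional estimate recovers it within each state.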
Examples
>>> # Continuous variables with dependency modulated by discrete state
>>> rng = np.random.RandomState(42)
>>> n_samples = 3000
>>> z = rng.choice([0, 1, 2], size=n_samples)  # 3 discrete states
>>> x = np.zeros(n_samples)
>>> y = np.zeros(n_samples)
>>>
>>> # Different relationships for each state
>>> for state in [0, 1, 2]:
...     mask = z == state
...     n_state = mask.sum()
...     if state == 0:  # Independent in state 0
...         x[mask] = rng.randn(n_state)
...         y[mask] = rng.randn(n_state)
...     elif state == 1:  # Linear relationship in state 1
...         x[mask] = rng.randn(n_state)
...         y[mask] = 0.8 * x[mask] + 0.2 * rng.randn(n_state)
...     else:  # Nonlinear in state 2
...         x[mask] = rng.uniform(-2, 2, n_state)
...         y[mask] = x[mask]**2 + 0.5 * rng.randn(n_state)
>>>
>>> # Reshape for function input
>>> x = x.reshape(1, -1)
>>> y = y.reshape(1, -1)
>>>
>>> # CMI captures state-dependent relationships
>>> cmi = gccmi_ccd(x, y, z)
>>> cmi > 0.2  # Significant CMI due to state-dependent coupling
True

>>> # Compare with unconditional MI
>>> mi_uncond = gcmi_cc(x, y)
>>> cmi > mi_uncond * 0.8  # CMI captures most of the dependency
True
References
Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.
- driada.information.cmi_ggg(x, y, z, biascorrect=True, demeaned=False)
Conditional mutual information between two Gaussian variables given a third.
Computes CMI between two (possibly multidimensional) Gaussian variables x and y, conditioned on a third variable z. Uses entropy decomposition: I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z).
- Parameters:
x (array_like, shape (n_features_x, n_samples) or (n_samples,)) – First Gaussian variable. If 2D, rows are features and columns are samples. If 1D with shape (n,), converted to shape (1, n).
y (array_like, shape (n_features_y, n_samples) or (n_samples,)) – Second Gaussian variable. Must have same number of samples as x.
z (array_like, shape (n_features_z, n_samples) or (n_samples,)) – Conditioning Gaussian variable. Must have same number of samples as x and y.
biascorrect (bool, optional) – Whether to apply bias correction for finite samples. Default is True.
demeaned (bool, optional) – Whether input data already has zero mean (e.g., if copula-normalized). Default is False.
- Returns:
Conditional mutual information I(X;Y|Z) in bits.
- Return type:
- Raises:
ValueError – If x, y, or z are 3D or higher dimensional arrays. If the number of samples does not match between x, y, and z.
Notes
Conditional mutual information measures the dependency between X and Y after accounting for the influence of Z. If X and Y are independent given Z, then I(X;Y|Z) = 0.
This function assumes all variables follow a joint Gaussian distribution. For non-Gaussian data, apply copula normalization first.
The entropy decomposition uses: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
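For Gaussian variables each joint entropy is, up to constants that cancel across the four terms, half the log-determinant of the corresponding covariance matrix. The sketch below evaluates the decomposition for three scalar variables. It omits bias correction, handles only scalar inputs, and is not the cmi_ggg implementation; `cov`, `cov_det`, `gaussian_cmi_bits` and `gaussian_mi_bits` are hypothetical helpers.

```python
import math
import random


def cov(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)


def cov_det(*variables):
    """Determinant of the sample covariance matrix of up to three scalar variables."""
    c = [[cov(a, b) for b in variables] for a in variables]
    if len(c) == 1:
        return c[0][0]
    if len(c) == 2:
        return c[0][0] * c[1][1] - c[0][1] * c[1][0]
    if len(c) == 3:
        return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
                - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
                + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))
    raise ValueError("this sketch handles at most three scalar variables")


def gaussian_cmi_bits(x, y, z):
    """H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); the 2*pi*e factors cancel."""
    ratio = cov_det(x, z) * cov_det(y, z) / (cov_det(x, y, z) * cov_det(z))
    return 0.5 * math.log2(ratio)


def gaussian_mi_bits(x, y):
    """H(X) + H(Y) - H(X,Y) for scalar Gaussians."""
    return 0.5 * math.log2(cov_det(x) * cov_det(y) / cov_det(x, y))


random.seed(0)
z = [random.gauss(0, 1) for _ in range(2000)]
x = [c + 0.5 * random.gauss(0, 1) for c in z]   # X depends on Z
y = [c + 0.5 * random.gauss(0, 1) for c in z]   # Y depends on Z

mi_xy = gaussian_mi_bits(x, y)          # clearly positive: X and Y share Z
cmi_xy_z = gaussian_cmi_bits(x, y, z)   # close to zero: Z explains the coupling
```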
Examples
>>> # Independent given Z
>>> rng = np.random.RandomState(42)
>>> z = rng.randn(1, 1000)
>>> x = z + 0.5 * rng.randn(1, 1000)  # X depends on Z
>>> y = z + 0.5 * rng.randn(1, 1000)  # Y depends on Z
>>> # X and Y are correlated through Z
>>> mi_uncond = mi_gg(x, y)
>>> mi_uncond > 0.3  # Significant MI without conditioning
True
>>> # But independent given Z
>>> cmi = cmi_ggg(x, y, z)
>>> cmi < 0.05  # Near zero when conditioned on Z
True

>>> # Direct dependency not explained by Z
>>> z = rng.randn(1, 1000)
>>> x = rng.randn(1, 1000)
>>> y = 0.5 * x + 0.3 * z + 0.2 * rng.randn(1, 1000)
>>> cmi = cmi_ggg(x, y, z)
>>> cmi > 0.3  # Still significant after conditioning
True
References
Cover, T. M., & Thomas, J. A. (2006). Elements of information theory.
- driada.information.mi_model_gd(x, y, Ym=None, biascorrect=True, demeaned=False)
Mutual information between a Gaussian and a discrete variable in bits.
Computes MI between a (possibly multidimensional) Gaussian variable x and a discrete variable y using ANOVA-style model comparison. For 1D x this provides a lower bound to the mutual information.
Note: Each discrete class must have at least 2 samples for covariance estimation. Classes with fewer samples will be skipped with a warning.
- Parameters:
x (array_like, shape (n_features, n_samples) or (n_samples,)) – Gaussian variable data. If 2D, rows are features and columns are samples. If 1D with shape (n,), converted to shape (1, n) representing a single feature with n samples.
y (array_like, shape (n_samples,)) – Discrete variable containing integer values in range [0, max(y)]. Must be 1D array with same number of samples as x.
Ym (int, optional) – Number of discrete states. If None (default), automatically computed as np.max(y) + 1. Useful if y doesn’t contain all possible states.
biascorrect (bool, optional) – Whether to apply bias correction for finite samples. Default is True.
demeaned (bool, optional) – Whether input data x already has zero mean (e.g., if copula-normalized). Default is False.
- Returns:
Mutual information I(X;Y) in bits.
- Return type:
- Raises:
ValueError – If x is a 3D or higher dimensional array. If y is not a 1D array. If the number of samples does not match between x and y. If Ym is not an integer.
- Warns:
RuntimeWarning – If any class has fewer than 2 samples. These classes are skipped in the MI calculation because covariance estimation requires at least 2 samples per class.
Examples
>>> import numpy as np
>>> # Continuous data (2 variables, 100 samples)
>>> x = np.random.randn(2, 100)
>>> # Discrete labels (3 classes)
>>> y = np.random.randint(0, 3, 100)
>>> mi = mi_model_gd(x, y, Ym=3)
>>> isinstance(mi, float)
True
>>> # Automatic Ym detection
>>> mi_auto = mi_model_gd(x, y)
>>> isinstance(mi_auto, float)
True
See also
ent_g: Gaussian entropy estimation
KSG Estimators
Non-parametric mutual information estimation using k-nearest neighbors.
- driada.information.nonparam_mi_cc(x, y, z=None, k=5, base=2.718281828459045, alpha='auto', lf=5, precomputed_tree_x=None, precomputed_tree_y=None)
Kraskov-Stögbauer-Grassberger (KSG) mutual information estimator.
Estimates mutual information between continuous variables using k-nearest neighbors. Can compute conditional MI when z is provided: I(X;Y|Z).
- Parameters:
x (array-like) – First variable, shape (n_samples,) or (n_samples, n_features_x).
y (array-like) – Second variable, shape (n_samples,) or (n_samples, n_features_y). Must have same number of samples as x.
z (array-like, optional) – Conditioning variable for conditional MI: I(X;Y|Z). Shape (n_samples,) or (n_samples, n_features_z).
k (int, default=5) – Number of nearest neighbors. Must satisfy k < n_samples. Common values:
  - k = 4-5 for most applications
  - Use larger k for higher dimensions
base (float, default=np.e) – Logarithm base. Use np.e for nats, 2 for bits, 10 for dits.
alpha (float or "auto", default="auto") – Local Non-uniformity Correction (LNC) parameter.
  - “auto”: automatically selects optimal alpha
  - float: manual alpha value (0 disables correction)
  - Warning: LNC is disabled when k ≤ dimensionality
lf (int, default=5) – Leaf size for k-d tree construction. Smaller values may be faster for small datasets, larger values for big datasets.
precomputed_tree_x (BallTree/KDTree, optional) – Pre-built tree for x to avoid recomputation in repeated calls.
precomputed_tree_y (BallTree/KDTree, optional) – Pre-built tree for y to avoid recomputation in repeated calls.
- Returns:
Mutual information estimate in units determined by base. The true MI is non-negative, but the estimate can be slightly negative due to estimation error.
- Return type:
Notes
Uses KSG estimator algorithm 1:
I(X;Y) = ψ(k) - <ψ(n_x + 1) + ψ(n_y + 1)> + ψ(n)
where:
  - ψ is the digamma function
  - n_x, n_y are the numbers of points within the k-th-neighbor distance in the X and Y marginal spaces
  - <·> denotes the average over all samples
  - n is the total number of samples
Small noise is added to continuous variables to break ties.
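The estimator formula above can be sketched with a brute-force neighbor search. This is a minimal standard-library illustration of KSG algorithm 1, assuming scalar x and y and no ties; it includes neither the LNC correction nor the tie-breaking noise, and its quadratic cost only suits small samples. `digamma_int` and `ksg1_mi_nats` are hypothetical helper names.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg1_mi_nats(x, y, k=5):
    """KSG algorithm 1 for scalar x and y, returning MI in nats."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count marginal neighbors strictly inside eps.
        n_x = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n


random.seed(0)
x = [random.gauss(0, 1) for _ in range(300)]
y_dep = [a + 0.5 * random.gauss(0, 1) for a in x]   # strongly coupled to x
y_ind = [random.gauss(0, 1) for _ in range(300)]    # independent of x

mi_dep = ksg1_mi_nats(x, y_dep)
mi_ind = ksg1_mi_nats(x, y_ind)
```

Tree-based neighbor searches, as suggested by the precomputed_tree_x and precomputed_tree_y parameters, replace the inner loops when the sample size grows.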
References
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138.
Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient Estimation of Mutual Information for Strongly Dependent Variables. AISTATS, PMLR 38:277-286. (LNC correction)
Examples
>>> # MI between correlated Gaussians
>>> np.random.seed(42)
>>> x = np.random.randn(1000)
>>> y = x + np.random.randn(1000) * 0.5
>>> mi = nonparam_mi_cc(x, y)
>>> print(f"MI = {mi:.3f} nats")
MI = 0.760 nats

>>> # Conditional MI: I(X;Y|Z)
>>> z = np.random.randn(1000)
>>> cmi = nonparam_mi_cc(x, y, z=z)
>>> print(f"cmi = {cmi:.3f} nats")
cmi = 0.756 nats
- Raises:
ValueError – If arrays have different lengths, k is invalid, or base is not positive.
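With the default base, this estimator reports nats, whereas the higher-level functions on this page report bits. Passing base=2 is the direct route; an existing value can also be rescaled afterwards, as in the sketch below (`convert_log_base` is a hypothetical helper, not part of driada).

```python
import math


def convert_log_base(value, from_base, to_base):
    """Rescale an information quantity between logarithm bases."""
    return value * math.log(from_base) / math.log(to_base)


mi_nats = 0.760
mi_bits = convert_log_base(mi_nats, math.e, 2)   # about 1.096 bits
```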
- driada.information.nonparam_mi_cd(x_continuous, y_discrete, k=5, base=2.718281828459045)
Mutual information between continuous and discrete variables using KSG estimator.
Uses the mixed-type mutual information estimator from the KSG paper.
- Parameters:
x_continuous (array_like) – Continuous variable data of shape (n_samples,) or (n_samples, n_features). Should contain finite values.
y_discrete (array_like) – Discrete variable data of shape (n_samples,). Values should be discrete categories (integers or strings).
k (int, optional) – Number of nearest neighbors to use. Default is 5. Must be positive.
base (float, optional) – Logarithm base. Default is e (natural logarithm). Must be positive.
- Returns:
Mutual information in units determined by base. Always non-negative.
- Return type:
- Raises:
ValueError – If inputs have incompatible shapes or invalid values.
Notes
Computes MI as I(X;Y) = H(X) - H(X|Y) where H(X|Y) is the weighted average of conditional entropies. Categories with fewer than k+1 samples are skipped, which may introduce bias for small sample sizes.
- driada.information.nonparam_mi_dc(x_discrete, y_continuous, k=5, base=2.718281828459045)
Mutual information between discrete and continuous variables using KSG estimator.
This is just the symmetric version of nonparam_mi_cd.
- Parameters:
x_discrete (array_like) – Discrete variable data of shape (n_samples,). Values should be discrete categories (integers or strings).
y_continuous (array_like) – Continuous variable data of shape (n_samples,) or (n_samples, n_features). Should contain finite values.
k (int, optional) – Number of nearest neighbors to use. Default is 5. Must be positive.
base (float, optional) – Logarithm base. Default is e (natural logarithm). Must be positive.
- Returns:
Mutual information in units determined by base. Always non-negative.
- Return type:
Notes
MI is symmetric, so this function simply swaps the arguments and calls nonparam_mi_cd. See that function for implementation details.