Mutual Information Functions

This module contains various mutual information estimation methods and related functions.

Core MI Functions

driada.information.get_mi(x, y, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=False, mi_estimator_kwargs=None)

Compute mutual information between two (possibly multidimensional) variables.

Efficiently calculates mutual information (MI) between continuous, discrete, or mixed-type variables. Supports both univariate and multivariate inputs, with time-shifted analysis capabilities for temporal dependencies.

Parameters:

x (TimeSeries, MultiTimeSeries, or array-like) – First variable. Can be:
- TimeSeries: univariate time series (continuous or discrete)
- MultiTimeSeries: multivariate time series
- array-like: converted to TimeSeries internally
y (TimeSeries, MultiTimeSeries, or array-like) – Second variable. Must have same length as x.
shift (int, default=0) – Number of samples to shift y after downsampling. Positive values shift y forward in time (y leads x). Used for time-delayed MI.
ds (int, default=1) – Downsampling factor. Takes every ds-th sample to reduce computation. Note: for GCMI with ds>1, copula transform is applied before downsampling which may affect accuracy for large ds or non-smooth signals.
k (int, default=5) – Number of nearest neighbors for KSG estimator. Common values:
- k=4-5: optimal for most applications
- k=3-10: for low dimensions (d≤3)
- k=10-20: for higher dimensions
estimator ({'gcmi', 'ksg'}, default='gcmi') – MI estimation method:
- ‘gcmi’: Gaussian Copula MI (fast, gives lower bound)
- ‘ksg’: Kraskov-Stögbauer-Grassberger (slower, more accurate)
check_for_coincidence (bool, default=False) – If True, checks for MI(X,X) computation and handles it as follows:
- For discrete single TimeSeries: returns H(X) (well-defined)
- For continuous variables: raises ValueError (MI would be infinite)
- For discrete MultiTimeSeries: raises NotImplementedError (not yet supported)
Set to False to bypass this check (use with caution).
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.

Returns:

Mutual information in bits. Always non-negative (clipped at 0). For GCMI, this is a lower bound on the true MI.

Return type:

float

Notes

The function automatically handles different variable type combinations:
- Continuous-Continuous: Uses GCMI or KSG as specified
- Discrete-Discrete: Uses exact MI computation (same for both estimators)
- Mixed (Continuous-Discrete): Uses appropriate mixed estimators
- Multivariate: Supported for continuous variables only

For discrete-discrete MI, the estimator parameter is ignored since MI can be computed exactly from the joint probability distribution.
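As an illustration of why no estimator choice is needed in the discrete-discrete case, the following is a minimal standard-library sketch of a plug-in MI computed from the empirical joint distribution. The helper name discrete_mi_bits is hypothetical and is not part of driada; the library's internal implementation may differ. The second call shows the MI(X,X) = H(X) identity that check_for_coincidence relies on for discrete variables.

```python
import math
from collections import Counter


def discrete_mi_bits(xs, ys):
    """Plug-in MI in bits from the empirical joint distribution of two label sequences.

    Hypothetical helper for illustration only (not a driada function).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


x = [0, 0, 1, 1, 2, 2, 3, 3]
y = [0, 0, 0, 0, 1, 1, 1, 1]    # y is a coarse-grained copy of x
print(discrete_mi_bits(x, y))   # 1.0 bit: y carries one of the two bits of x
print(discrete_mi_bits(x, x))   # 2.0 bits, equal to H(X)
```

The result is exact with respect to the empirical distribution; with few samples per joint state it still carries the usual finite-sample bias.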

GCMI is recommended for most applications as it’s much faster and provides a useful lower bound. KSG is more accurate but computationally expensive, especially for large datasets.

Examples

>>> # Simple correlation detection
>>> np.random.seed(42)
>>> x = np.random.randn(1000)
>>> y = x + np.random.randn(1000) * 0.5
>>> mi = get_mi(x, y)
>>> print(f"MI = {mi:.3f} bits")
MI = 1.114 bits

>>> # Time-delayed mutual information
>>> ts1 = TimeSeries(np.sin(np.linspace(0, 10*np.pi, 1000)))
>>> ts2 = TimeSeries(np.sin(np.linspace(0, 10*np.pi, 1000) + np.pi/4))
>>> mi_delay = get_mi(ts1, ts2, shift=25)  # Check 25-sample delay

>>> # Multivariate MI
>>> mts1 = MultiTimeSeries(np.random.randn(3, 1000), discrete=False)
>>> mts2 = MultiTimeSeries(np.random.randn(2, 1000), discrete=False)
>>> mi_multi = get_mi(mts1, mts2)

See also

get_1d_mi: MI for univariate time series (called internally)
get_multi_mi: MI between multiple and single time series
get_tdmi: Time-delayed MI for finding optimal embedding delays
conditional_mi: Conditional mutual information I(X;Y|Z)

driada.information.get_1d_mi(ts1, ts2, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=True, mi_estimator_kwargs=None)

Computes mutual information between two 1D variables efficiently.

Parameters:

ts1 (TimeSeries/MultiTimeSeries instance or numpy array) – First time series or variable
ts2 (TimeSeries/MultiTimeSeries instance or numpy array) – Second time series or variable
shift (int, default=0) – Number of samples by which ts2 is rolled after downsampling by a factor of ds.
ds (int, default=1) – Downsampling factor (every ds-th point is kept).
k (int, default=5) – Number of neighbors for the KSG estimator.
estimator (str, default='gcmi') –
Estimation method. Must be either ‘ksg’ (accurate but slow) or ‘gcmi’ (fast, but estimates a lower bound on MI). In most cases ‘gcmi’ should be preferred.

Note on downsampling with GCMI: For performance reasons, when ds > 1, the copula transformation is applied to the full data before downsampling. This is an approximation that works well for small downsampling factors (ds ≤ 5) and smooth signals, but may introduce inaccuracies for large downsampling factors or highly variable signals.
check_for_coincidence (bool, default=True) – If True, checks for MI(X,X) computation at zero shift:
- For discrete variables: returns H(X)
- For continuous variables: raises ValueError (MI is infinite)
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.

Returns:

mi – Mutual information in bits between ts1 and the (possibly shifted) ts2. With the ‘gcmi’ estimator the value is a lower bound on the true MI.

Return type:

float
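The interplay of ds and shift can be sketched as follows. This is only an illustration under two assumptions: that the shift is a circular roll (suggested by the parameter description) and that it follows the numpy.roll direction convention. The helper downsample_then_roll is hypothetical and not a driada function.

```python
def downsample_then_roll(values, shift=0, ds=1):
    """Take every ds-th sample, then circularly shift the result by `shift` positions.

    Hypothetical helper; the roll direction mirrors numpy.roll (an assumption).
    """
    if ds < 1:
        raise ValueError("ds must be a positive integer")
    reduced = list(values[::ds])
    if not reduced:
        return reduced
    s = shift % len(reduced)
    # A positive shift moves samples toward later indices and wraps the tail around
    return reduced[-s:] + reduced[:-s] if s else reduced


signal = list(range(10))
print(downsample_then_roll(signal, shift=1, ds=2))  # [8, 0, 2, 4, 6]
```

Because the shift is applied after downsampling, shift=1 with ds=2 corresponds to a lag of two samples of the original signal.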

driada.information.get_multi_mi(tslist, ts2, shift=0, ds=1, k=5, estimator='gcmi', mi_estimator_kwargs=None)

Compute mutual information between multiple time series and a single time series.

Parameters:

tslist (list of TimeSeries) – List of TimeSeries objects (multivariate X)
ts2 (TimeSeries) – Single TimeSeries object (Y)
shift (int, optional) – Number of samples to shift ts2. Default: 0
ds (int, optional) – Downsampling factor. Default: 1
k (int, optional) – Number of neighbors for KSG estimator. Default: 5
estimator (str, optional) –
Estimation method. ‘gcmi’ (fast, lower bound) or ‘ksg’ (slower, more accurate). Default: ‘gcmi’

Note on downsampling with GCMI: For performance reasons, when ds > 1, the copula transformation is applied to the full data before downsampling. This is an approximation that works well for small downsampling factors (ds ≤ 5) and smooth signals, but may introduce inaccuracies for large downsampling factors or highly variable signals.
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.

Returns:

Mutual information I(X;Y) in bits where X is the multivariate input from tslist

Return type:

float

driada.information.conditional_mi(ts1, ts2, ts3, ds=1, k=5)

Calculate conditional mutual information I(X;Y|Z).

Computes the conditional mutual information between ts1 (X) and ts2 (Y) given ts3 (Z) for various combinations of continuous and discrete variables.

Parameters:

ts1 (TimeSeries) – First variable (X). Must be continuous.
ts2 (TimeSeries) – Second variable (Y). Can be continuous or discrete.
ts3 (TimeSeries) – Conditioning variable (Z). Can be continuous or discrete.
ds (int, optional) – Downsampling factor. Default: 1.
k (int, optional) – Number of neighbors for entropy estimation. Default: 5.

Returns:

Conditional mutual information I(X;Y|Z) in bits.

Return type:

float

Raises:

ValueError – If ts1 is discrete (only continuous X is currently supported).

Notes

Supports four cases:
- CCC: all continuous; uses Gaussian copula
- CCD: X, Y continuous, Z discrete; uses Gaussian copula per Z value
- CDC: X, Z continuous, Y discrete; uses chain rule identity
- CDD: X continuous, Y, Z discrete; uses entropy decomposition

For the CDD case, GCMI estimator has limitations due to uncontrollable biases (copula transform does not conserve entropy). See https://doi.org/10.1002/hbm.23471 for details.

Conditional MI estimates can be negative due to estimation biases, especially with finite samples (the true quantity is non-negative). This function uses adaptive thresholds:
- Small negatives (< 1% of entropy scale): silently clipped to 0
- Moderate negatives (1-10% of scale): clipped with a warning
- Large negatives (> 10% of scale): raises ValueError

The CDD case is particularly prone to negative biases due to mixed estimators and receives more lenient treatment.
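The three-tier policy described above can be sketched as a small function. This is an illustration only: the name clip_negative_cmi and the entropy_scale argument are hypothetical, the text does not specify how the entropy scale is computed, and the more lenient CDD treatment is not modeled here.

```python
import warnings


def clip_negative_cmi(cmi, entropy_scale, small=0.01, large=0.10):
    """Apply a three-tier policy to a possibly negative CMI estimate.

    Hypothetical helper; `entropy_scale` is assumed to be a positive reference
    value in the same units as `cmi`.
    """
    if entropy_scale <= 0:
        raise ValueError("entropy_scale must be positive")
    if cmi >= 0:
        return cmi
    ratio = -cmi / entropy_scale
    if ratio < small:
        return 0.0  # negligible: clip silently
    if ratio <= large:
        warnings.warn(f"negative CMI {cmi:.4f} clipped to 0")
        return 0.0  # moderate: clip, but tell the user
    raise ValueError(f"CMI {cmi:.4f} is too negative to be an estimation artifact")


print(clip_negative_cmi(-0.005, entropy_scale=2.0))  # 0.0
```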

driada.information.interaction_information(ts1, ts2, ts3, ds=1, k=5)

Calculate three-way interaction information II(X;Y;Z).

The interaction information quantifies the amount of information that is shared among all three variables. It can be positive (synergy) or negative (redundancy).

Parameters:

ts1 (TimeSeries) – First variable (X). Must be continuous.
ts2 (TimeSeries) – Second variable (Y). Can be continuous or discrete.
ts3 (TimeSeries) – Third variable (Z). Can be continuous or discrete.
ds (int, optional) – Downsampling factor. Default: 1.
k (int, optional) – Number of neighbors for entropy estimation. Default: 5.

Returns:

Interaction information II(X;Y;Z) in bits.
- II < 0: Redundancy (Y and Z provide overlapping information about X)
- II > 0: Synergy (Y and Z together provide more information than separately)

Return type:

float

Notes

The interaction information is computed using the Williams & Beer sign convention: II(X;Y;Z) = I(X;Y|Z) - I(X;Y) = I(X;Z|Y) - I(X;Z)

This implementation assumes X is the target variable (e.g., neural activity) and Y, Z are predictor variables (e.g., behavioral features).
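The sign convention can be checked on two textbook discrete cases. The sketch below uses exact plug-in entropies on enumerated outcomes rather than the continuous estimators used by this function; all helper names are hypothetical and not part of driada.

```python
import math
from collections import Counter
from itertools import product


def entropy_bits(samples):
    """Plug-in entropy in bits of a sequence of hashable outcomes."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())


def mi_bits(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy_bits(x) + entropy_bits(y) - entropy_bits(list(zip(x, y)))


def cmi_bits(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return (entropy_bits(list(zip(x, z))) + entropy_bits(list(zip(y, z)))
            - entropy_bits(list(zip(x, y, z))) - entropy_bits(z))


def interaction_information_bits(x, y, z):
    # II(X;Y;Z) = I(X;Y|Z) - I(X;Y)
    return cmi_bits(x, y, z) - mi_bits(x, y)


# Synergy: X = Y XOR Z over the four equally likely (y, z) pairs
ys, zs = zip(*product([0, 1], repeat=2))
x_xor = [a ^ b for a, b in zip(ys, zs)]
print(interaction_information_bits(x_xor, ys, zs))  # 1.0 (synergy)

# Redundancy: all three variables are copies of one fair bit
bit = [0, 1]
print(interaction_information_bits(bit, bit, bit))  # -1.0 (redundancy)
```

In the XOR case neither Y nor Z alone says anything about X, yet together they determine it, so conditioning on Z raises the information Y carries about X from 0 to 1 bit.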

Time-Delayed MI

driada.information.get_tdmi(data, min_shift=1, max_shift=100, nn=5, estimator='gcmi')

Compute time-delayed mutual information (TDMI) for a time series.

Calculates mutual information between a time series and delayed versions of itself across a range of time lags. Useful for detecting temporal dependencies and optimal embedding delays.

Parameters:

data (array-like) – 1D time series data.
min_shift (int, optional) – Minimum time lag to compute. Default: 1.
max_shift (int, optional) – Maximum time lag to compute (exclusive). Default: 100.
nn (int, optional) – Number of nearest neighbors for KSG MI estimation. Only used when estimator='ksg'. Default: 5.
estimator ({'gcmi', 'ksg'}, optional) – MI estimator to use. ‘gcmi’ is faster but provides a lower bound, ‘ksg’ is more accurate but slower. Default: ‘gcmi’.

Returns:

TDMI values in bits for each time lag from min_shift to max_shift-1.

Return type:

list of float

Notes

The first minimum in TDMI often indicates optimal embedding delay
High TDMI at specific lags indicates periodic structure
All values are returned in bits for consistency
For long time series, ‘gcmi’ is recommended for speed
For precise embedding delay detection, ‘ksg’ may be more accurate
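Locating the first minimum of a TDMI curve differs from taking its global minimum. A minimal sketch of a first-local-minimum search follows; the helper first_local_minimum_lag is hypothetical (not a driada function) and operates on any list of TDMI values such as the one returned by get_tdmi.

```python
def first_local_minimum_lag(tdmi, min_shift=1):
    """Return the lag of the first interior local minimum of a TDMI curve, or None.

    Hypothetical helper; tdmi[i] is assumed to belong to lag min_shift + i.
    """
    for i in range(1, len(tdmi) - 1):
        if tdmi[i] < tdmi[i - 1] and tdmi[i] <= tdmi[i + 1]:
            return min_shift + i
    return None


curve = [0.9, 0.6, 0.4, 0.5, 0.7, 0.3, 0.35]
print(first_local_minimum_lag(curve))  # 3: the dip at index 2, not the global minimum at index 5
```

On noisy curves it may help to smooth the TDMI values before searching, since a single noisy sample can create a spurious early dip.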

Examples

>>> data = np.sin(np.linspace(0, 10*np.pi, 1000))
>>> tdmi = get_tdmi(data, min_shift=1, max_shift=50)
>>> optimal_delay = np.argmin(tdmi) + 1  # Lag of the global minimum (index 0 is lag min_shift=1)
>>>
>>> # Using KSG for more accuracy
>>> tdmi_ksg = get_tdmi(data, min_shift=1, max_shift=50, estimator='ksg')

Similarity Measures

driada.information.get_sim(x, y, metric, shift=0, ds=1, k=5, estimator='gcmi', check_for_coincidence=False, mi_estimator_kwargs=None)

Computes similarity between two (possibly multidimensional) variables efficiently.

Parameters:

x (TimeSeries, MultiTimeSeries, or numpy.ndarray) – First time series. If numpy array, will be converted to TimeSeries (1D) or MultiTimeSeries (2D+).
y (TimeSeries, MultiTimeSeries, or numpy.ndarray) – Second time series. If numpy array, will be converted to TimeSeries (1D) or MultiTimeSeries (2D+).
metric (str) – Similarity metric to compute. Options include:
- ‘mi’: Mutual information (supports multivariate data)
- ‘spearman’, ‘pearson’, ‘kendall’: Correlation coefficients (univariate only)
- ‘av’: Activity ratio (requires one binary and one continuous variable)
- ‘fast_pearsonr’: Fast Pearson correlation (univariate only)
- Any scipy.stats correlation function name (univariate only)
shift (int, optional) – Time shift to apply to y before computing similarity. Positive values shift y forward in time. Default is 0.
ds (int, optional) – Downsampling factor. Only every ds-th sample is used. Default is 1.
k (int, optional) – Number of nearest neighbors for KSG mutual information estimator. Only used when metric='mi' and estimator='ksg'. Default is 5.
estimator ({'gcmi', 'ksg'}, optional) – Estimator to use for mutual information calculation. Only used when metric='mi'. Default is ‘gcmi’.
check_for_coincidence (bool, optional) – Whether to check if x and y contain identical data (which would result in infinite MI). Only used for MI calculation. Default is False.
mi_estimator_kwargs (dict, optional) – Additional keyword arguments passed to the MI estimator function.

Returns:

similarity – Similarity value between x and (possibly shifted) y. The interpretation depends on the metric:
- MI: Non-negative value in bits
- Correlations: Value between -1 and 1
- Activity ratio: Non-negative ratio

Return type:

float

Raises:

ValueError – If metric is not supported for the given variable types (e.g., ‘av’ requires one binary and one continuous variable). If trying to use correlation metrics with multivariate data.
Exception – If multidimensional inputs are not provided as MultiTimeSeries.

Gaussian Copula MI (GCMI)

Fast parametric mutual information estimation using Gaussian copulas.

driada.information.mi_gg(x, y, biascorrect=True, demeaned=False)

Mutual information (MI) between two Gaussian variables in bits.

Computes mutual information between two (possibly multidimensional) Gaussian variables using the entropy relation: I(X;Y) = H(X) + H(Y) - H(X,Y). Assumes variables follow a multivariate Gaussian distribution.

Parameters:

x (ndarray) – First variable, shape (n_features_x, n_samples) or (n_samples,) for 1D. Columns correspond to samples, rows to dimensions/variables.
y (ndarray) – Second variable, shape (n_features_y, n_samples) or (n_samples,) for 1D. Must have same number of samples as x.
biascorrect (bool, default=True) – Whether to apply bias correction to the MI estimate. Uses psi function (digamma) correction for finite sample bias in entropy estimation.
demeaned (bool, default=False) – Whether input data already has zero mean. Set True if data has been copula-normalized or otherwise centered.

Returns:

Mutual information in bits. Always non-negative, with 0 indicating independence.

Return type:

float

Raises:

ValueError – If x and y have different number of samples, or if inputs are not 1D or 2D arrays.

Notes

This function assumes data follows a multivariate Gaussian distribution. For non-Gaussian data, use copula normalization first (via ctransform).

The bias correction uses the Miller-Madow correction generalized to multivariate Gaussians, improving accuracy for small sample sizes.
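For two one-dimensional jointly Gaussian variables, the entropy relation reduces to the closed form I(X;Y) = -0.5 * log2(1 - r^2), where r is the Pearson correlation. The standard-library sketch below illustrates that special case without any bias correction; the helper names are hypothetical and this is not the library's implementation.

```python
import math
import random


def pearson_r(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_mi_bits(a, b):
    """MI in bits of two 1D Gaussian variables from their sample correlation."""
    r = pearson_r(a, b)
    return -0.5 * math.log2(1.0 - r * r)


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [0.7 * a + 0.3 * rng.gauss(0, 1) for a in x]

estimate = gaussian_mi_bits(x, y)
# Population value: r^2 = 0.49 / (0.49 + 0.09)
exact = -0.5 * math.log2(1 - 0.49 / (0.49 + 0.09))
print(f"estimate = {estimate:.3f} bits, closed form = {exact:.3f} bits")
```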

Examples

>>> # Independent Gaussian variables
>>> rng = np.random.RandomState(42)
>>> x = rng.randn(1, 1000)  # 1D Gaussian
>>> y = rng.randn(1, 1000)  # Independent 1D Gaussian
>>> mi = mi_gg(x, y)
>>> mi < 0.05  # Should be near 0 for independent variables
True

>>> # Correlated Gaussian variables
>>> x = rng.randn(1, 1000)
>>> y = 0.7 * x + 0.3 * rng.randn(1, 1000)  # Correlated
>>> mi = mi_gg(x, y)
>>> mi > 0.5  # Significant mutual information
True

>>> # Multidimensional case
>>> x = rng.randn(3, 1000)  # 3D Gaussian
>>> y = rng.randn(2, 1000)  # 2D Gaussian
>>> # Create correlation: y depends on x
>>> y[0] = 0.5 * x[0] + 0.5 * rng.randn(1000)
>>> mi = mi_gg(x, y)
>>> mi > 0.2  # Detects the dependency
True

See also

gcmi_cc: Gaussian-copula MI for arbitrary continuous distributions
ent_g: Gaussian entropy used in MI calculation

References

Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.

driada.information.gcmi_cc(x, y)

Gaussian-Copula Mutual Information between two continuous variables.

Main user-facing function for computing mutual information between continuous variables using the Gaussian Copula MI (GCMI) method. Handles arbitrary continuous distributions by transforming marginals to Gaussian via copula normalization.

Parameters:

x (ndarray) – First continuous variable, shape (n_features_x, n_samples) or (n_samples,) for 1D. If 2D, rows are features and columns are samples. If 1D, treated as single feature with multiple samples.
y (ndarray) – Second continuous variable, shape (n_features_y, n_samples) or (n_samples,) for 1D. Must have same number of samples as x.

Returns:

Mutual information in bits. Always non-negative, with 0 indicating independence. Provides a lower bound to the true MI.

Return type:

float

Notes

The GCMI method:

Transforms each variable to standard normal marginals using the empirical CDF (copula transform)
Computes MI under the Gaussian copula assumption
Applies bias correction for finite samples

This approach is:

Robust to outliers due to rank-based transform
Computationally efficient (no density estimation)
Provides MI lower bound (exact for jointly Gaussian data)
Suitable for continuous neural data (firing rates, LFP, etc.)
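The first two steps of the method (rank-based copula transform, then Gaussian MI) can be sketched for one-dimensional inputs with the standard library alone. This is an illustration under stated assumptions: it omits the bias correction, assumes no tied values, and uses rank / (n + 1) as the empirical CDF. The helper names are hypothetical and not driada functions.

```python
import math
import random
from statistics import NormalDist


def copula_normalize(values):
    """Map samples to standard-normal scores via their ranks (empirical CDF)."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def pearson_r(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def gcmi_sketch_bits(x, y):
    """Gaussian MI of the copula-normalized inputs (1D, no bias correction)."""
    r = pearson_r(copula_normalize(x), copula_normalize(y))
    return -0.5 * math.log2(1.0 - r * r)


rng = random.Random(1)
x = [rng.expovariate(1.0) for _ in range(2000)]   # non-Gaussian marginal
y = [2 * a + rng.gauss(0, 0.5) for a in x]

mi = gcmi_sketch_bits(x, y)
# A strictly increasing transform of x leaves the ranks, and hence the estimate, unchanged
mi_warped = gcmi_sketch_bits([math.log1p(a) for a in x], y)
print(mi, mi_warped)
```

Because only ranks enter the computation, any strictly increasing transform of either variable leaves the estimate unchanged, which is the source of the robustness to outliers noted above.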

For discrete variables, use gcmi_cd or gccmi_ccd instead.

Examples

>>> # Linear relationship with non-Gaussian marginals
>>> rng = np.random.RandomState(42)
>>> x = rng.exponential(size=1000)  # Non-Gaussian
>>> y = 2 * x + rng.normal(0, 0.5, size=1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 1.0  # Detects strong dependency
True

>>> # Monotonic nonlinear relationship
>>> x = rng.exponential(size=1000)
>>> y = np.log(x + 1) + rng.normal(0, 0.1, size=1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 0.5  # Detects monotonic dependency
True

>>> # Multidimensional example
>>> x = rng.randn(3, 1000)  # 3D variable
>>> y = rng.randn(2, 1000)  # 2D variable
>>> # Create dependency
>>> y[0] = 0.5 * x[0] + 0.3 * x[1] + 0.2 * rng.randn(1000)
>>> mi = gcmi_cc(x, y)
>>> mi > 0.3  # Detects multivariate dependency
True

See also

mi_gg: MI for Gaussian variables (without copula transform)
gccmi_ccd: Conditional MI with discrete conditioning variable

References

Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.

driada.information.gccmi_ccd(x, y, z, Zm=None)

Gaussian-Copula CMI between 2 continuous variables conditioned on a discrete variable.

Calculates the conditional mutual information (CMI) between two continuous variables x and y, conditioned on a discrete variable z, using a Gaussian copula approach. This method can handle multivariate continuous variables.

The Gaussian copula transforms the marginal distributions to standard Gaussian while preserving the dependence structure, allowing efficient estimation of mutual information.

Parameters:

x (array_like, shape (n_features_x, n_samples) or (n_samples,)) – First continuous variable. If multivariate, features are in rows and samples in columns.
y (array_like, shape (n_features_y, n_samples) or (n_samples,)) – Second continuous variable. If multivariate, features are in rows and samples in columns.
z (array_like, shape (n_samples,)) – Discrete conditioning variable. Must contain integer values in the range [0, max(z)] (inclusive).
Zm (int, optional) – Number of unique values in the discrete variable z. If None (default), it is automatically computed as len(np.unique(z)). Providing this value can be useful if you know z doesn’t contain all possible values.

Returns:

I – Conditional mutual information I(X;Y|Z) in bits.

Return type:

float

Raises:

ValueError – If x or y have more than 2 dimensions. If z is not a 1D array. If z does not contain integer values. If the number of samples doesn’t match across inputs.

Notes

The function uses a Gaussian copula transformation to estimate CMI. For each value of the discrete conditioning variable z, it transforms the conditional distributions of x and y to Gaussian, then computes the CMI using entropy calculations on the Gaussian-transformed data.

This is particularly useful for analyzing neural data where the conditioning variable might represent experimental conditions, stimulus types, or behavioral states.
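The per-state procedure described above corresponds to the identity I(X;Y|Z) = sum over z of p(z) * I(X;Y|Z=z). A minimal sketch for one-dimensional x and y follows, assuming a copula transform within each state, no bias correction, and no tied values. All helper names are hypothetical and this is not the library's implementation.

```python
import math
import random
from collections import defaultdict
from statistics import NormalDist


def copula_normalize(values):
    """Map samples to standard-normal scores via their ranks."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def gaussian_mi_bits(a, b):
    """MI in bits of two 1D Gaussian variables from their sample correlation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    r2 = cov * cov / (var_a * var_b)
    return -0.5 * math.log2(1.0 - r2)


def cmi_discrete_condition_bits(x, y, z):
    """Weighted average of per-state copula MI: sum_z p(z) * I(X;Y | Z=z)."""
    groups = defaultdict(list)
    for xi, yi, zi in zip(x, y, z):
        groups[zi].append((xi, yi))
    n = len(z)
    cmi = 0.0
    for pairs in groups.values():
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        # Copula-normalize within the state, then weight the state MI by p(z)
        cmi += len(pairs) / n * gaussian_mi_bits(copula_normalize(xs), copula_normalize(ys))
    return cmi


rng = random.Random(2)
z = [rng.randrange(2) for _ in range(2000)]
x = [rng.gauss(0, 1) for _ in z]
# Coupled only in state 1; independent in state 0
y = [0.8 * xi + 0.2 * rng.gauss(0, 1) if zi == 1 else rng.gauss(0, 1)
     for xi, zi in zip(x, z)]
print(cmi_discrete_condition_bits(x, y, z))
```

With roughly half of the samples in the coupled state, the result is about half of the MI observed within that state alone.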

Examples

>>> # Continuous variables with dependency modulated by discrete state
>>> rng = np.random.RandomState(42)
>>> n_samples = 3000
>>> z = rng.choice([0, 1, 2], size=n_samples)  # 3 discrete states
>>> x = np.zeros(n_samples)
>>> y = np.zeros(n_samples)
>>>
>>> # Different relationships for each state
>>> for state in [0, 1, 2]:
...     mask = z == state
...     n_state = mask.sum()
...     if state == 0:  # Independent in state 0
...         x[mask] = rng.randn(n_state)
...         y[mask] = rng.randn(n_state)
...     elif state == 1:  # Linear relationship in state 1
...         x[mask] = rng.randn(n_state)
...         y[mask] = 0.8 * x[mask] + 0.2 * rng.randn(n_state)
...     else:  # Nonlinear in state 2
...         x[mask] = rng.uniform(-2, 2, n_state)
...         y[mask] = x[mask]**2 + 0.5 * rng.randn(n_state)
>>>
>>> # Reshape for function input
>>> x = x.reshape(1, -1)
>>> y = y.reshape(1, -1)
>>>
>>> # CMI captures state-dependent relationships
>>> cmi = gccmi_ccd(x, y, z)
>>> cmi > 0.2  # Significant CMI due to state-dependent coupling
True

>>> # Compare with unconditional MI
>>> mi_uncond = gcmi_cc(x, y)
>>> cmi > mi_uncond * 0.8  # CMI captures most of the dependency
True

See also

gcmi_cc: Unconditional Gaussian-copula MI
cmi_ggg: CMI for all-continuous variables

References

Ince, R. A., et al. (2017). A statistical framework for neuroimaging data analysis based on mutual information estimated via a Gaussian copula. Human Brain Mapping, 38(3), 1541-1573.

driada.information.cmi_ggg(x, y, z, biascorrect=True, demeaned=False)

Conditional mutual information between two Gaussian variables given a third.

Computes CMI between two (possibly multidimensional) Gaussian variables x and y, conditioned on a third variable z. Uses entropy decomposition: I(X;Y|Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z).

Parameters:

x (array_like, shape (n_features_x, n_samples) or (n_samples,)) – First Gaussian variable. If 2D, rows are features and columns are samples. If 1D with shape (n,), converted to shape (1, n).
y (array_like, shape (n_features_y, n_samples) or (n_samples,)) – Second Gaussian variable. Must have same number of samples as x.
z (array_like, shape (n_features_z, n_samples) or (n_samples,)) – Conditioning Gaussian variable. Must have same number of samples as x and y.
biascorrect (bool, optional) – Whether to apply bias correction for finite samples. Default is True.
demeaned (bool, optional) – Whether input data already has zero mean (e.g., if copula-normalized). Default is False.

Returns:

Conditional mutual information I(X;Y|Z) in bits.

Return type:

float

Raises:

ValueError – If x, y, or z has three or more dimensions. If the number of samples does not match across x, y, and z.

Notes

Conditional mutual information measures the dependency between X and Y after accounting for the influence of Z. If X and Y are independent given Z, then I(X;Y|Z) = 0.

This function assumes all variables follow a joint Gaussian distribution. For non-Gaussian data, apply copula normalization first.

The entropy decomposition uses: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
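For Gaussian variables each entropy in this decomposition is 0.5 * log((2*pi*e)^d * det(C)), and the (2*pi*e) factors cancel because the dimensions satisfy d_xz + d_yz = d_xyz + d_z. The sketch below applies that to one-dimensional x, y, z using covariance determinants, without bias correction. The helper names are hypothetical and this is not the library's implementation.

```python
import math
import random


def covariance_matrix(columns):
    """Sample covariance matrix of equally long 1D sequences (one per variable)."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def det(m):
    """Determinant by Laplace expansion; adequate for the tiny matrices used here."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total


def gaussian_cmi_bits(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) for 1D Gaussian variables."""
    def log_det(*cols):
        return math.log2(det(covariance_matrix(cols)))
    return 0.5 * (log_det(x, z) + log_det(y, z) - log_det(x, y, z) - log_det(z))


rng = random.Random(3)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]
print(gaussian_cmi_bits(x, y, z))    # close to 0: X and Y are linked only through Z

x2 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [0.5 * xi + 0.3 * zi + 0.2 * rng.gauss(0, 1) for xi, zi in zip(x2, z)]
print(gaussian_cmi_bits(x2, y2, z))  # clearly positive: direct X-Y coupling remains
```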

Examples

>>> # Independent given Z
>>> rng = np.random.RandomState(42)
>>> z = rng.randn(1, 1000)
>>> x = z + 0.5 * rng.randn(1, 1000)  # X depends on Z
>>> y = z + 0.5 * rng.randn(1, 1000)  # Y depends on Z
>>> # X and Y are correlated through Z
>>> mi_uncond = mi_gg(x, y)
>>> mi_uncond > 0.3  # Significant MI without conditioning
True
>>> # But independent given Z
>>> cmi = cmi_ggg(x, y, z)
>>> cmi < 0.05  # Near zero when conditioned on Z
True

>>> # Direct dependency not explained by Z
>>> z = rng.randn(1, 1000)
>>> x = rng.randn(1, 1000)
>>> y = 0.5 * x + 0.3 * z + 0.2 * rng.randn(1, 1000)
>>> cmi = cmi_ggg(x, y, z)
>>> cmi > 0.3  # Still significant after conditioning
True

See also

mi_gg: Unconditional mutual information
gccmi_ccd: CMI with discrete conditioning variable

References

Cover, T. M., & Thomas, J. A. (2006). Elements of information theory.

driada.information.mi_model_gd(x, y, Ym=None, biascorrect=True, demeaned=False)

Mutual information between a Gaussian and a discrete variable in bits.

Computes MI between a (possibly multidimensional) Gaussian variable x and a discrete variable y using ANOVA-style model comparison. For 1D x this provides a lower bound to the mutual information.

Note: Each discrete class must have at least 2 samples for covariance estimation. Classes with fewer samples will be skipped with a warning.
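One common reading of this kind of model comparison is I(X;Y) = H(X) - sum over y of p(y) * H(X|Y=y), with every entropy taken from a Gaussian fit. The sketch below illustrates that reading for one-dimensional x without bias correction. It is an assumption about the approach, not the library's code, and it requires at least 2 samples in every class. The helper names are hypothetical.

```python
import math
import random
from collections import defaultdict


def variance(values):
    """Unbiased sample variance (needs at least 2 values)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def gaussian_discrete_mi_bits(x, labels):
    """H(X) - sum_y p(y) H(X | Y=y) with 1D Gaussian entropies, in bits."""
    groups = defaultdict(list)
    for value, label in zip(x, labels):
        groups[label].append(value)
    n = len(x)
    # The 0.5 * log2(2*pi*e) term of each entropy cancels because the class weights sum to 1
    h_unconditional = 0.5 * math.log2(variance(x))
    h_conditional = sum(len(g) / n * 0.5 * math.log2(variance(g))
                        for g in groups.values())
    return h_unconditional - h_conditional


rng = random.Random(4)
labels = [rng.randrange(2) for _ in range(2000)]
# Class 1 is shifted by three standard deviations relative to class 0
x = [rng.gauss(3.0 * label, 1.0) for label in labels]
print(gaussian_discrete_mi_bits(x, labels))
```

Since a binary label carries at most 1 bit, the value printed here stays below 1, which is consistent with the lower-bound behavior stated above for 1D x.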

Parameters:

x (array_like, shape (n_features, n_samples) or (n_samples,)) – Gaussian variable data. If 2D, rows are features and columns are samples. If 1D with shape (n,), converted to shape (1, n) representing a single feature with n samples.
y (array_like, shape (n_samples,)) – Discrete variable containing integer values in range [0, max(y)]. Must be 1D array with same number of samples as x.
Ym (int, optional) – Number of discrete states. If None (default), automatically computed as np.max(y) + 1. Useful if y doesn’t contain all possible states.
biascorrect (bool, optional) – Whether to apply bias correction for finite samples. Default is True.
demeaned (bool, optional) – Whether input data x already has zero mean (e.g., if copula-normalized). Default is False.

Returns:

Mutual information I(X;Y) in bits.

Return type:

float

Raises:

ValueError – If x has three or more dimensions. If y is not a 1D array. If the number of samples does not match between x and y. If Ym is not an integer.

Warning

RuntimeWarning: If any class has fewer than 2 samples. These classes will be skipped in the MI calculation as covariance estimation requires at least 2 samples per class.

Examples

>>> import numpy as np
>>> # Continuous data (2 variables, 100 samples)
>>> x = np.random.randn(2, 100)
>>> # Discrete labels (3 classes)
>>> y = np.random.randint(0, 3, 100)
>>> mi = mi_model_gd(x, y, Ym=3)
>>> isinstance(mi, float)
True
>>> # Automatic Ym detection
>>> mi_auto = mi_model_gd(x, y)
>>> isinstance(mi_auto, float)
True

See also

ent_g: Gaussian entropy estimation

KSG Estimators

Non-parametric mutual information estimation using k-nearest neighbors.

driada.information.nonparam_mi_cc(x, y, z=None, k=5, base=2.718281828459045, alpha='auto', lf=5, precomputed_tree_x=None, precomputed_tree_y=None)

Kraskov-Stögbauer-Grassberger (KSG) mutual information estimator.

Estimates mutual information between continuous variables using k-nearest neighbors. Can compute conditional MI when z is provided: I(X;Y|Z).

Parameters:

x (array-like) – First variable, shape (n_samples,) or (n_samples, n_features_x).
y (array-like) – Second variable, shape (n_samples,) or (n_samples, n_features_y). Must have same number of samples as x.
z (array-like, optional) – Conditioning variable for conditional MI: I(X;Y|Z). Shape (n_samples,) or (n_samples, n_features_z).
k (int, default=5) – Number of nearest neighbors. Must satisfy k < n_samples. Common values:
- k = 4-5 for most applications
- Use larger k for higher dimensions
base (float, default=np.e) – Logarithm base. Use np.e for nats, 2 for bits, 10 for dits.
alpha (float or "auto", default="auto") – Local Non-uniformity Correction (LNC) parameter.
- “auto”: automatically selects optimal alpha
- float: manual alpha value (0 disables correction)
- Warning: LNC disabled when k ≤ dimensionality
lf (int, default=5) – Leaf size for k-d tree construction. Smaller values may be faster for small datasets, larger values for big datasets.
precomputed_tree_x (BallTree/KDTree, optional) – Pre-built tree for x to avoid recomputation in repeated calls.
precomputed_tree_y (BallTree/KDTree, optional) – Pre-built tree for y to avoid recomputation in repeated calls.

Returns:

Mutual information estimate in units determined by base. Always non-negative (up to estimation error).

Return type:

float
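Because this function defaults to natural logarithms while get_mi reports bits, results may need converting before they are compared. A minimal sketch of the conversion follows; nats_to_bits is a hypothetical helper, and passing base=2 to the estimator is the alternative.

```python
import math


def nats_to_bits(value_nats):
    """Convert an information quantity from nats to bits (divide by ln 2)."""
    return value_nats / math.log(2)


print(round(nats_to_bits(0.760), 3))  # 1.096
```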

Notes

Uses the KSG estimator algorithm 1: I(X;Y) = ψ(k) - <ψ(n_x + 1) + ψ(n_y + 1)> + ψ(n)

where:
- ψ is the digamma function
- n_x, n_y are the number of neighbors in X, Y spaces
- <·> denotes average over all samples
- n is the total number of samples

Small noise is added to continuous variables to break ties.
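The formula above can be illustrated with a brute-force, standard-library sketch for one-dimensional x and y. It is not the library's implementation: it uses an O(n^2) neighbor search instead of trees, applies no LNC correction, adds no tie-breaking noise, and relies on a hand-written digamma approximation. All names are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0 via upward recurrence plus the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    return (result + math.log(x) - 0.5 * inv
            - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252)))


def ksg_mi_nats(xs, ys, k=5):
    """KSG algorithm 1 in nats: psi(k) + psi(n) - <psi(n_x + 1) + psi(n_y + 1)>."""
    n = len(xs)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n_samples")
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space
        dists = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                       for j in range(n) if j != i)
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count points strictly closer than eps in each marginal space
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma(n_x + 1) + digamma(n_y + 1)
    return digamma(k) + digamma(n) - total / n


rng = random.Random(5)
x = [rng.gauss(0, 1) for _ in range(400)]
y = [a + 0.5 * rng.gauss(0, 1) for a in x]
estimate = ksg_mi_nats(x, y)
exact = 0.5 * math.log(5)  # Gaussian closed form for this construction
print(f"KSG estimate = {estimate:.3f} nats, closed form = {exact:.3f} nats")
```

The quadratic neighbor search is what tree structures such as those accepted by precomputed_tree_x and precomputed_tree_y are meant to avoid on larger datasets.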

References

Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69(6), 066138.

Gao, S., Ver Steeg, G., & Galstyan, A. (2015). Efficient Estimation of Mutual Information for Strongly Dependent Variables. AISTATS, PMLR 38:277-286. (LNC correction)

Examples

>>> # MI between correlated Gaussians
>>> np.random.seed(42)
>>> x = np.random.randn(1000)
>>> y = x + np.random.randn(1000) * 0.5
>>> mi = nonparam_mi_cc(x, y)
>>> print(f"MI = {mi:.3f} nats")
MI = 0.760 nats

>>> # Conditional MI: I(X;Y|Z)
>>> z = np.random.randn(1000)
>>> cmi = nonparam_mi_cc(x, y, z=z)
>>> print(f"cmi = {cmi:.3f} nats")
cmi = 0.756 nats

Raises:

ValueError – If arrays have different lengths, k is invalid, or base is not positive.

driada.information.nonparam_mi_cd(x_continuous, y_discrete, k=5, base=2.718281828459045)

Mutual information between continuous and discrete variables using KSG estimator.

Uses the mixed-type mutual information estimator from the KSG paper.

Parameters:

x_continuous (array_like) – Continuous variable data of shape (n_samples,) or (n_samples, n_features). Should contain finite values.
y_discrete (array_like) – Discrete variable data of shape (n_samples,). Values should be discrete categories (integers or strings).
k (int, optional) – Number of nearest neighbors to use. Default is 5. Must be positive.
base (float, optional) – Logarithm base. Default is e (natural logarithm). Must be positive.

Returns:

Mutual information in units determined by base. Always non-negative.

Return type:

float

Raises:

ValueError – If inputs have incompatible shapes or invalid values.

Notes

Computes MI as I(X;Y) = H(X) - H(X|Y) where H(X|Y) is the weighted average of conditional entropies. Categories with fewer than k+1 samples are skipped, which may introduce bias for small sample sizes.

driada.information.nonparam_mi_dc(x_discrete, y_continuous, k=5, base=2.718281828459045)

Mutual information between discrete and continuous variables using KSG estimator.

This is just the symmetric version of nonparam_mi_cd.

Parameters:

x_discrete (array_like) – Discrete variable data of shape (n_samples,). Values should be discrete categories (integers or strings).
y_continuous (array_like) – Continuous variable data of shape (n_samples,) or (n_samples, n_features). Should contain finite values.
k (int, optional) – Number of nearest neighbors to use. Default is 5. Must be positive.
base (float, optional) – Logarithm base. Default is e (natural logarithm). Must be positive.

Returns:

Mutual information in units determined by base. Always non-negative.

Return type:

float

Notes

MI is symmetric, so this function simply swaps the arguments and calls nonparam_mi_cd. See that function for implementation details.