Core RSA Functions

Core RSA functions for computing and comparing RDMs.

driada.rsa.core.compute_rdm(patterns, metric='correlation', logger=None)[source]

Compute representational dissimilarity matrix from patterns.

Parameters:
  • patterns (np.ndarray or MVData) – Pattern matrix of shape (n_items, n_features) if np.ndarray (will be transposed automatically), or an MVData object. Each row is a pattern/item and each column is a feature.

  • metric (str, default 'correlation') – Distance metric: ‘correlation’, ‘euclidean’, ‘cosine’, ‘manhattan’

  • logger (logging.Logger, optional) – Logger instance for debugging (currently unused)

Returns:

rdm – Representational dissimilarity matrix (n_items, n_items)

Return type:

np.ndarray
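
As a rough illustration of what the 'correlation' metric produces, the sketch below builds a correlation-distance RDM with plain NumPy. The helper name correlation_rdm is hypothetical and this is not the library's implementation; it only assumes the (n_items, n_features) layout described above.

```python
import numpy as np

def correlation_rdm(patterns):
    """Correlation-distance RDM for an (n_items, n_features) array (illustrative only)."""
    # np.corrcoef treats each row as one variable, i.e. one item here
    corr = np.corrcoef(patterns)
    rdm = 1.0 - corr
    # Remove floating-point residue so the diagonal is exactly zero
    np.fill_diagonal(rdm, 0.0)
    return rdm

rng = np.random.default_rng(0)
patterns = rng.standard_normal((4, 30))  # 4 items, 30 features
rdm = correlation_rdm(patterns)
print(rdm.shape)  # (4, 4)
```

The result is square, symmetric, and zero on the diagonal, which is the shape of output the Returns entry describes.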

driada.rsa.core.compute_rdm_from_timeseries_labels(data, labels, metric='correlation', average_method='mean')[source]

Compute RDM from time series data using behavioral variable labels.

Parameters:
  • data (np.ndarray) – Data array of shape (n_features, n_timepoints)

  • labels (np.ndarray) – Label for each timepoint, shape (n_timepoints,). Each unique label defines a condition/item.

  • metric (str, default 'correlation') – Distance metric for RDM computation

  • average_method (str, default 'mean') – How to average within conditions: ‘mean’ or ‘median’

Return type:

Tuple[ndarray, ndarray]

Returns:

  • rdm (np.ndarray) – Representational dissimilarity matrix

  • unique_labels (np.ndarray) – The unique labels in order as they appear in the RDM
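
The label-based workflow can be sketched in two steps: average the timepoints of each condition, then compute an RDM over the averaged patterns. The helper rdm_from_labels below is hypothetical, uses NumPy only, and assumes the labels are ordered as returned by np.unique (sorted); the library's actual ordering and internals may differ.

```python
import numpy as np

def rdm_from_labels(data, labels, average_method="mean"):
    """Illustrative sketch: data is (n_features, n_timepoints), labels is (n_timepoints,)."""
    unique_labels = np.unique(labels)  # assumption: sorted unique labels
    reduce = np.mean if average_method == "mean" else np.median
    # One averaged pattern per condition -> (n_conditions, n_features)
    patterns = np.stack(
        [reduce(data[:, labels == lab], axis=1) for lab in unique_labels]
    )
    rdm = 1.0 - np.corrcoef(patterns)  # correlation distance
    np.fill_diagonal(rdm, 0.0)
    return rdm, unique_labels

rng = np.random.default_rng(1)
data = rng.standard_normal((20, 90))     # 20 features, 90 timepoints
labels = np.repeat(["A", "B", "C"], 30)  # 30 timepoints per condition
rdm, unique_labels = rdm_from_labels(data, labels)
print(rdm.shape, unique_labels)
```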

driada.rsa.core.compute_rdm_from_trials(data, trial_starts, trial_labels, trial_duration=None, metric='correlation', average_method='mean')[source]

Compute RDM from time series data using explicit trial structure.

Parameters:
  • data (np.ndarray) – Data array of shape (n_features, n_timepoints)

  • trial_starts (np.ndarray) – Start indices for each trial

  • trial_labels (np.ndarray) – Label for each trial (same length as trial_starts)

  • trial_duration (int, optional) – Fixed duration for each trial. If None, each trial lasts until the start of the next trial.

  • metric (str, default 'correlation') – Distance metric for RDM computation

  • average_method (str, default 'mean') – How to average within trial: ‘mean’ or ‘median’

Return type:

Tuple[ndarray, ndarray]

Returns:

  • rdm (np.ndarray) – Representational dissimilarity matrix

  • unique_labels (np.ndarray) – The unique trial labels in order as they appear in the RDM
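
To make the trial structure concrete, here is a minimal NumPy sketch of how trial windows could be turned into one pattern per label. It is not the library's code: the helper trial_patterns is hypothetical, and it assumes that trial_starts is sorted, that the last trial runs to the end of the recording when trial_duration is None, and that each trial is averaged first and trials with the same label are averaged afterwards.

```python
import numpy as np

def trial_patterns(data, trial_starts, trial_labels, trial_duration=None):
    """Illustrative sketch: average each trial window, then average trials per label."""
    trial_starts = np.asarray(trial_starts)
    trial_labels = np.asarray(trial_labels)
    n_timepoints = data.shape[1]
    per_trial = []
    for i, start in enumerate(trial_starts):
        if trial_duration is not None:
            end = min(start + trial_duration, n_timepoints)
        elif i + 1 < len(trial_starts):
            end = trial_starts[i + 1]  # until the next trial begins
        else:
            end = n_timepoints         # assumption: last trial runs to the end
        per_trial.append(data[:, start:end].mean(axis=1))
    per_trial = np.stack(per_trial)    # (n_trials, n_features)
    unique_labels = np.unique(trial_labels)
    patterns = np.stack(
        [per_trial[trial_labels == lab].mean(axis=0) for lab in unique_labels]
    )
    return patterns, unique_labels

data = np.arange(24, dtype=float).reshape(2, 12)  # 2 features, 12 timepoints
patterns, unique_labels = trial_patterns(
    data, trial_starts=[0, 3, 6, 9], trial_labels=["a", "b", "a", "b"]
)
print(patterns)  # rows: label 'a' -> [4, 16], label 'b' -> [7, 19]
```

The resulting (n_labels, n_features) matrix is what a distance metric would then be applied to.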

driada.rsa.core.compare_rdms(rdm1, rdm2, method='spearman')[source]

Compare two representational dissimilarity matrices.

Quantifies the similarity between two RDMs using correlation or cosine similarity metrics. Only the upper triangular portion (excluding diagonal) is compared since RDMs are symmetric.

Parameters:
  • rdm1 (np.ndarray) – First RDM, square symmetric matrix of shape (n_items, n_items).

  • rdm2 (np.ndarray) – Second RDM, must have the same shape as rdm1.

  • method (str, default 'spearman') – Comparison method:

      - ‘spearman’: Spearman rank correlation (robust to monotonic transforms)
      - ‘pearson’: Pearson correlation (assumes linear relationship)
      - ‘kendall’: Kendall’s tau (robust but slower, O(n²) complexity)
      - ‘cosine’: Cosine similarity (angle between RDM vectors)

Returns:

Similarity score between RDMs:

  - Correlations (‘spearman’, ‘pearson’, ‘kendall’): range [-1, 1]
  - Cosine similarity: range [0, 1] (NaN if either RDM has zero norm)
  - NaN if the correlation cannot be computed (e.g., constant RDMs)

Return type:

float

Raises:
  • ValueError – If the RDMs have different shapes, or if method is not one of the supported options.

  • RuntimeWarning – Issued via warnings.warn (not raised as an exception) if either RDM contains NaN values.

Notes

Only upper triangular values are used since RDMs are symmetric and diagonal is uninformative (always 0).

P-values from statistical tests are computed internally but not returned. Use bootstrap_rdm_comparison for statistical inference.

Kendall’s tau is more robust than Spearman but has O(n²) complexity in the number of unique values, making it slow for large RDMs.

For cosine similarity, if either RDM vector has zero norm (all upper-triangular values are zero), the function returns NaN and issues a warning.
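
The upper-triangle comparison described in these notes can be reproduced by hand with NumPy. The sketch below is illustrative rather than the library's code; the helper upper_triangle is hypothetical, and only the Pearson and cosine variants are shown.

```python
import numpy as np

def upper_triangle(rdm):
    """Return the entries above the diagonal as a flat vector."""
    i, j = np.triu_indices(rdm.shape[0], k=1)  # k=1 skips the diagonal
    return rdm[i, j]

rdm1 = np.array([[0, 0.5, 0.8], [0.5, 0, 0.3], [0.8, 0.3, 0]])
rdm2 = np.array([[0, 0.6, 0.7], [0.6, 0, 0.4], [0.7, 0.4, 0]])

v1, v2 = upper_triangle(rdm1), upper_triangle(rdm2)  # 3 values each for a 3x3 RDM
pearson = np.corrcoef(v1, v2)[0, 1]
cosine = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
print(f"{pearson:.3f} {cosine:.3f}")  # 0.954 0.985
```

For an n-item RDM the vectors have n(n-1)/2 entries, so a 3-item RDM yields only three values per comparison.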

Examples

>>> import numpy as np
>>> from driada.rsa.core import compare_rdms
>>> # Create two similar RDMs
>>> rdm1 = np.array([[0, 0.5, 0.8], [0.5, 0, 0.3], [0.8, 0.3, 0]])
>>> rdm2 = np.array([[0, 0.6, 0.7], [0.6, 0, 0.4], [0.7, 0.4, 0]])
>>> # Compare using different methods
>>> pearson_sim = compare_rdms(rdm1, rdm2, method='pearson')
>>> print(f"Pearson correlation: {pearson_sim:.3f}")
Pearson correlation: 0.954
>>> spearman_sim = compare_rdms(rdm1, rdm2, method='spearman')
>>> print(f"Spearman correlation: {spearman_sim:.3f}")
Spearman correlation: 1.000
>>> # Cosine similarity
>>> cosine_sim = compare_rdms(rdm1, rdm2, method='cosine')
>>> print(f"Cosine similarity: {cosine_sim:.3f}")
Cosine similarity: 0.985

See also

compute_rdm

Compute RDMs from neural patterns

bootstrap_rdm_comparison

Statistical comparison with confidence intervals

rsa_compare

High-level interface for comparing datasets

driada.rsa.core.bootstrap_rdm_comparison(data1, data2, labels1, labels2, metric='correlation', comparison_method='spearman', n_bootstrap=1000, random_state=None)[source]

Bootstrap test for RDM similarity between two datasets.

Performs statistical inference on RDM similarity using within-condition resampling. This maintains the experimental design while estimating confidence intervals and assessing reliability of the similarity.

Parameters:
  • data1 (np.ndarray) – First dataset of shape (n_features1, n_timepoints). Features can be different between datasets (e.g., comparing V1 vs V2 neurons).

  • data2 (np.ndarray) – Second dataset of shape (n_features2, n_timepoints). Must have the same number of timepoints as data1.

  • labels1 (np.ndarray) – Condition labels for each timepoint in data1, shape (n_timepoints,).

  • labels2 (np.ndarray) – Condition labels for each timepoint in data2, shape (n_timepoints,). Must contain the same unique values as labels1.

  • metric (str, default 'correlation') – Distance metric for RDM computation. See compute_rdm for options.

  • comparison_method (str, default 'spearman') – Method for comparing RDMs. See compare_rdms for options.

  • n_bootstrap (int, default 1000) – Number of bootstrap iterations. Higher values give more stable estimates but take longer.

  • random_state (int, optional) – Random seed for reproducibility. Creates a local RandomState to avoid affecting global random state.

Returns:

Dictionary containing:

  - ‘observed’: float, observed RDM similarity between datasets
  - ‘bootstrap_distribution’: np.ndarray, bootstrap similarity values
  - ‘p_value’: float, two-tailed test of observed vs bootstrap mean
  - ‘ci_lower’: float, 2.5th percentile of bootstrap distribution
  - ‘ci_upper’: float, 97.5th percentile of bootstrap distribution
  - ‘mean’: float, mean of bootstrap distribution
  - ‘std’: float, standard deviation of bootstrap distribution

Return type:

dict

Raises:

ValueError – If datasets don’t have the same unique condition labels.

Notes

The bootstrap procedure:

  1. Resamples timepoints within each condition independently
  2. Maintains the number of samples per condition
  3. Computes RDMs from resampled data
  4. Calculates similarity between resampled RDMs

This within-condition resampling preserves the experimental design while capturing trial-by-trial variability.

The p-value tests whether the observed similarity is extreme relative to the bootstrap distribution mean. This is NOT a standard null hypothesis test but rather a stability assessment.

Uses a local RandomState to avoid modifying global numpy random state, ensuring thread safety and reproducibility.
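
The resampling scheme in these notes can be sketched with NumPy alone. Everything below is an illustration under stated assumptions, not the library's implementation: the helper names are hypothetical, the same resampled indices are applied to both datasets, and Pearson correlation stands in for the default Spearman comparison to avoid extra dependencies.

```python
import numpy as np

def resample_within_conditions(labels, rng):
    """Resample timepoint indices with replacement, separately per condition."""
    idx = []
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)
        idx.append(rng.choice(members, size=members.size, replace=True))
    return np.concatenate(idx)

def condition_rdm(data, labels):
    """Average each condition, then take correlation distance between conditions."""
    patterns = np.stack(
        [data[:, labels == lab].mean(axis=1) for lab in np.unique(labels)]
    )
    return 1.0 - np.corrcoef(patterns)

def upper(rdm):
    return rdm[np.triu_indices(rdm.shape[0], k=1)]

rng = np.random.RandomState(42)  # local generator; global NumPy state is untouched
labels = np.repeat(["A", "B", "C", "D"], 30)
data1 = rng.randn(20, 120)       # 20 features
data2 = rng.randn(15, 120)       # 15 features, same timepoints

distribution = []
for _ in range(50):
    idx = resample_within_conditions(labels, rng)
    r1 = condition_rdm(data1[:, idx], labels[idx])
    r2 = condition_rdm(data2[:, idx], labels[idx])
    distribution.append(np.corrcoef(upper(r1), upper(r2))[0, 1])
distribution = np.array(distribution)

# Percentile interval, matching the 'ci_lower' / 'ci_upper' definitions
ci_lower, ci_upper = np.percentile(distribution, [2.5, 97.5])
print(distribution.shape, ci_lower <= ci_upper)
```

Because indices are drawn per condition, every bootstrap sample keeps exactly 30 timepoints per label, which is what preserves the experimental design.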

Examples

>>> import numpy as np
>>> from driada.rsa.core import bootstrap_rdm_comparison
>>> # Create two independent random datasets that share the same condition labels
>>> np.random.seed(42)
>>> n_features = 20
>>> n_timepoints = 90
>>>
>>> # 3 conditions, 30 samples each
>>> labels = np.repeat(['A', 'B', 'C'], 30)
>>>
>>> # Random data with no condition-specific structure
>>> v1_data = np.random.randn(n_features, n_timepoints)
>>> v2_data = np.random.randn(n_features, n_timepoints)
>>>
>>> # Bootstrap will resample and compute similarity distribution
>>> results = bootstrap_rdm_comparison(
...     v1_data, v2_data, labels, labels,
...     n_bootstrap=50, random_state=42
... )
>>>
>>> # Check that results contain expected keys
>>> print('Keys:', sorted(results.keys()))
Keys: ['bootstrap_distribution', 'ci_lower', 'ci_upper', 'mean', 'observed', 'p_value', 'std']
>>>
>>> # Random data should give low correlation
>>> print(f"Observed between -1 and 1: {-1 <= results['observed'] <= 1}")
Observed between -1 and 1: True

See also

compare_rdms

Direct RDM comparison without bootstrap

compute_rdm_from_timeseries_labels

Compute RDM from labeled data

rsa_compare

High-level interface with multiple data types

driada.rsa.core.compute_rdm_unified(data, items=None, data_type='calcium', metric='correlation', average_method='mean', trial_duration=None)[source]

Compute RDM with automatic data type detection and dispatching.

This function intelligently dispatches to the appropriate RDM computation method based on the input data type and items specification. It provides a unified interface for computing RDMs from various data structures (arrays, MVData, Experiments) and item definitions (pre-averaged patterns, timeseries labels, or trial structures).

Parameters:
  • data (np.ndarray, MVData, or Experiment) – The data to compute RDM from:

      - np.ndarray: Raw data matrix (n_features, n_timepoints); when items is None, a pre-averaged pattern matrix (n_items, n_features), as in the first example below
      - MVData: MVData object
      - Experiment: DRIADA Experiment object

  • items (np.ndarray, str, dict, or None) – How to define items/conditions:

      - None: Compute RDM directly from patterns (requires pre-averaged data)
      - np.ndarray: Condition labels for each timepoint
      - str: Name of dynamic feature (for Experiment objects)
      - dict: Trial structure with ‘trial_starts’ and ‘trial_labels’

  • data_type (str, default 'calcium') – For Experiment objects, which data type to use (‘calcium’ or ‘spikes’)

  • metric (str, default 'correlation') – Distance metric for RDM computation. Options: ‘correlation’, ‘euclidean’, ‘cosine’, ‘manhattan’

  • average_method (str, default 'mean') – How to average within conditions (‘mean’ or ‘median’)

  • trial_duration (int, optional) – For trial structure, fixed duration for each trial. If specified in both parameter and items dict, dict value takes precedence.

Return type:

Tuple[ndarray, Optional[ndarray]]

Returns:

  • rdm (np.ndarray) – Representational dissimilarity matrix

  • labels (np.ndarray or None) – The unique labels/conditions if items were specified

Raises:

ValueError –

  - If items are required but not provided for Experiment objects.
  - If the trial structure dict is missing required keys.
  - If metric is not one of the valid options.
  - If MVData/Embedding is used with a trial structure.

Notes

Imports are performed inside the function to avoid circular dependencies. This has minimal performance impact as the function is typically called only a few times per analysis.

When trial_duration is specified in both the items dict and as a parameter, the dict value takes precedence and a warning is issued.

Examples

>>> import numpy as np
>>> from driada.rsa import compute_rdm_unified
>>> # Direct pattern RDM (pre-averaged data)
>>> patterns = np.random.randn(10, 50)  # 10 items, 50 features
>>> rdm, _ = compute_rdm_unified(patterns)
>>> print(f"RDM shape: {rdm.shape}")
RDM shape: (10, 10)
>>> # From time series with labels
>>> data = np.random.randn(100, 1000)  # 100 features, 1000 timepoints
>>> labels = np.repeat([0, 1, 2, 3], 250)
>>> rdm, unique_labels = compute_rdm_unified(data, labels)
>>> print(f"RDM shape: {rdm.shape}, unique labels: {unique_labels}")
RDM shape: (4, 4), unique labels: [0 1 2 3]
>>> # From MVData object
>>> from driada.dim_reduction.data import MVData
>>> mvdata = MVData(np.random.randn(50, 100))  # 50 features, 100 samples
>>> rdm, _ = compute_rdm_unified(mvdata)
>>> print(f"RDM shape: {rdm.shape}")
RDM shape: (100, 100)

See also

compute_rdm

Direct RDM computation from patterns

compute_rdm_from_timeseries_labels

RDM from labeled timeseries

compute_rdm_from_trials

RDM from trial structure

compute_experiment_rdm

RDM from Experiment objects

driada.rsa.core.rsa_compare(data1, data2, items=None, metric='correlation', comparison='spearman', data_type='calcium', logger=None)[source]

Compare neural representations using RSA.

This is a simplified API for the most common RSA use case: comparing two sets of neural representations. It automatically handles different data types (arrays, MVData, Embeddings, Experiments) and computes the similarity between their representational geometries.

Parameters:
  • data1 (np.ndarray, MVData, or Experiment) – First dataset (n_items, n_features) if array, MVData object, or Experiment

  • data2 (np.ndarray, MVData, or Experiment) – Second dataset (same n_items as data1)

  • items (str, dict, or None) – How to define conditions (required for Experiment objects):

      - None: For arrays/MVData, assumes data is already averaged per item
      - str: Name of dynamic feature (e.g., ‘stimulus_type’)
      - dict: Trial structure with ‘trial_starts’ and ‘trial_labels’

  • metric (str, default 'correlation') – Distance metric for RDM computation

  • comparison (str, default 'spearman') – Method for comparing RDMs (‘spearman’, ‘pearson’, ‘kendall’, ‘cosine’)

  • data_type (str, default 'calcium') – For Experiment objects, which data to use (‘calcium’ or ‘spikes’)

  • logger (logging.Logger, optional) – Logger for debugging messages

Returns:

similarity – Similarity score between the two neural representations. Range depends on comparison method: [-1, 1] for correlations, [0, 1] for cosine similarity.

Return type:

float

Raises:

ValueError –

  - If items is not specified for Experiment objects.
  - If trying to compare Experiment with non-Experiment data.
  - If RDMs have incompatible shapes (different numbers of items).

Notes

Imports are performed inside the function to avoid circular dependencies. Embedding objects are automatically converted to MVData for uniform processing.

When comparing arrays or MVData without items specification, assumes the data is already averaged per condition (each row represents one item/condition).

Examples

>>> import numpy as np
>>> from driada.rsa.core import rsa_compare
>>> # Compare two brain areas with structured data
>>> np.random.seed(42)
>>> n_stimuli = 5
>>> n_neurons_v1, n_neurons_v2 = 20, 15
>>>
>>> # Create orthogonal patterns for each stimulus
>>> v1_data = np.zeros((n_stimuli, n_neurons_v1))
>>> v2_data = np.zeros((n_stimuli, n_neurons_v2))
>>>
>>> # Each stimulus activates different neurons in both areas
>>> for i in range(n_stimuli):
...     # V1: each stimulus activates 4 specific neurons
...     v1_data[i, i*4:(i+1)*4] = 1.0
...     # V2: similar pattern with 3 neurons per stimulus
...     v2_data[i, i*3:(i+1)*3] = 1.0
>>>
>>> # Add small noise for realism
>>> v1_data += 0.1 * np.random.randn(n_stimuli, n_neurons_v1)
>>> v2_data += 0.1 * np.random.randn(n_stimuli, n_neurons_v2)
>>>
>>> # This creates similar RDM structure in both areas
>>> similarity = rsa_compare(v1_data, v2_data, comparison='spearman')
>>> print(f"RSA similarity: {similarity:.3f}")
RSA similarity: 0.479
>>> # Compare using compute_rdm_unified first
>>> from driada.rsa import compute_rdm_unified
>>> np.random.seed(123)
>>> data1 = np.random.randn(50, 90)  # 50 features, 90 timepoints
>>> data2 = np.random.randn(40, 90)  # 40 features, same timepoints
>>> labels = np.repeat(['A', 'B', 'C'], 30)  # 30 samples per condition
>>> # First compute RDMs with labels
>>> rdm1, _ = compute_rdm_unified(data1, items=labels)
>>> rdm2, _ = compute_rdm_unified(data2, items=labels)
>>> # Both RDMs now have shape (3, 3) for the 3 conditions
>>> print(f"RDM shapes: {rdm1.shape}, {rdm2.shape}")
RDM shapes: (3, 3), (3, 3)
>>> # Compare the RDMs
>>> from driada.rsa import compare_rdms
>>> similarity = compare_rdms(rdm1, rdm2)
>>> print(f"RSA similarity between -1 and 1: {-1 <= similarity <= 1}")
RSA similarity between -1 and 1: True

See also

compute_rdm_unified

Unified RDM computation interface

compare_rdms

Direct comparison of RDM matrices

bootstrap_rdm_comparison

Statistical comparison with confidence intervals

Core functions for computing representational dissimilarity matrices (RDMs) and comparing neural representations.

JIT-Optimized Functions

JIT-compiled core functions for RSA.

These functions provide optimized implementations of computationally intensive RSA operations.

driada.rsa.core_jit.fast_correlation_distance(patterns)

Compute correlation distance matrix using JIT-optimized loops.

This function computes pairwise correlation distances between patterns using explicit loops optimized for numba JIT compilation. It handles edge cases like zero-variance patterns and uses sample correlation (ddof=1) to match numpy.corrcoef behavior.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row represents a pattern/item, each column a feature.

Returns:

rdm – Correlation distance matrix (n_items, n_items). Values range from 0 (perfectly correlated patterns) to 2 (perfectly anti-correlated patterns). Diagonal is always 0.

Return type:

np.ndarray

Notes

The function standardizes each pattern to zero mean and unit variance using sample standard deviation (n-1 denominator). For patterns with zero variance, correlation is undefined and distance is set to 0 if patterns are identical, 1 otherwise.

Correlation values are clipped to [-1, 1] to handle numerical errors before computing distance as 1 - correlation.
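
The steps in these notes (standardize with the n-1 denominator, special-case zero variance, clip, then take 1 - correlation) can be written out as plain Python loops. This is a sketch of the described behavior, not the JIT-compiled source; the name correlation_distance_sketch is hypothetical.

```python
import numpy as np

def correlation_distance_sketch(patterns):
    """Loop-based correlation distance following the notes above (illustrative)."""
    patterns = np.asarray(patterns, dtype=float)
    n_items, n_features = patterns.shape
    mean = patterns.mean(axis=1)
    std = patterns.std(axis=1, ddof=1)  # sample standard deviation (n-1)
    rdm = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(i + 1, n_items):
            if std[i] == 0 or std[j] == 0:
                # Correlation undefined: 0 if the patterns are identical, else 1
                dist = 0.0 if np.array_equal(patterns[i], patterns[j]) else 1.0
            else:
                zi = (patterns[i] - mean[i]) / std[i]
                zj = (patterns[j] - mean[j]) / std[j]
                corr = np.dot(zi, zj) / (n_features - 1)
                corr = min(1.0, max(-1.0, corr))  # clip numerical overshoot
                dist = 1.0 - corr
            rdm[i, j] = rdm[j, i] = dist
    return rdm

patterns = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 1], [3, 2, 1]])
rdm = correlation_distance_sketch(patterns)
print(rdm[0, 1], rdm[0, 2], rdm[0, 3])  # 0.0 1.0 2.0
```

The fourth row is the first row reversed, so it is perfectly anti-correlated with it and reaches the maximum distance of 2.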

Examples

>>> import numpy as np
>>> from driada.rsa.core_jit import fast_correlation_distance
>>> patterns = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 1]])
>>> rdm = fast_correlation_distance(patterns)
>>> # rdm[0,1] ≈ 0 (perfect correlation)
>>> # rdm[0,2] = 1 (undefined correlation, different patterns)

See also

compute_rdm

Higher-level function that uses this for correlation metric

fast_euclidean_distance

Alternative distance metric

fast_manhattan_distance

Alternative distance metric

driada.rsa.core_jit.fast_average_patterns(data, labels, unique_labels)

Average patterns within conditions using fast JIT-compiled loops.

This function computes the mean pattern for each unique condition label by averaging all timepoints that belong to that condition. Optimized for performance using explicit loops compatible with numba JIT compilation.

Parameters:
  • data (np.ndarray) – Data array of shape (n_features, n_timepoints)

  • labels (np.ndarray) – Label for each timepoint

  • unique_labels (np.ndarray) – Unique condition labels

Returns:

patterns – Averaged patterns (n_conditions, n_features)

Return type:

np.ndarray

Notes

If no timepoints match a given label, that condition’s pattern will be all zeros. This is intentional to maintain consistent output shape.
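
The zero-fill behavior for unmatched labels is easiest to see in a small loop-based sketch. The function below is a hypothetical NumPy stand-in written from the description above, not the JIT-compiled source.

```python
import numpy as np

def average_patterns_sketch(data, labels, unique_labels):
    """Mean pattern per label; labels with no timepoints keep an all-zero row."""
    n_features = data.shape[0]
    patterns = np.zeros((len(unique_labels), n_features))
    for c, lab in enumerate(unique_labels):
        mask = labels == lab
        if mask.any():
            patterns[c] = data[:, mask].mean(axis=1)
        # otherwise the row is left as zeros, keeping the output shape fixed
    return patterns

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=float)
labels = np.array([0, 1, 0, 1])
# Label 2 never occurs in `labels`, so its row stays zero
patterns = average_patterns_sketch(data, labels, np.array([0, 1, 2]))
print(patterns)  # rows: [2, 6], [3, 7], [0, 0]
```

An all-zero row has zero variance, so downstream correlation distances involving it fall into the undefined-correlation case.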

Examples

>>> import numpy as np
>>> from driada.rsa.core_jit import fast_average_patterns
>>> data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
>>> labels = np.array([0, 1, 0, 1])
>>> unique = np.array([0, 1])
>>> patterns = fast_average_patterns(data, labels, unique)
>>> # patterns[0] = mean of columns 0,2 = [2, 6]
>>> # patterns[1] = mean of columns 1,3 = [3, 7]

See also

compute_rdm_from_timeseries_labels

Higher-level function that uses this

compute_rdm_from_trials

Alternative averaging approach for trial data

driada.rsa.core_jit.fast_euclidean_distance(patterns)

Compute Euclidean distance matrix using JIT-optimized loops.

Computes pairwise Euclidean distances between all pattern pairs using explicit loops for numba JIT compilation compatibility.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row is a pattern in n_features-dimensional space.

Returns:

rdm – Symmetric Euclidean distance matrix (n_items, n_items) with zeros on diagonal. Values are non-negative.

Return type:

np.ndarray

Notes

Uses the standard Euclidean distance formula, summing over features k: d(i,j) = sqrt(sum_k((patterns[i,k] - patterns[j,k])^2))

No overflow protection is implemented. For very large values, consider normalizing patterns first.
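
The formula translates directly into a pair of nested loops. The sketch below is a plain NumPy illustration of that formula under the (n_items, n_features) layout; the name euclidean_distance_sketch is hypothetical and this is not the JIT-compiled source.

```python
import numpy as np

def euclidean_distance_sketch(patterns):
    """Pairwise Euclidean distances between rows (illustrative loop version)."""
    patterns = np.asarray(patterns, dtype=float)
    n_items = patterns.shape[0]
    rdm = np.zeros((n_items, n_items))
    for i in range(n_items):
        for j in range(i + 1, n_items):
            diff = patterns[i] - patterns[j]
            # Fill both halves so the matrix is symmetric
            rdm[i, j] = rdm[j, i] = np.sqrt(np.sum(diff * diff))
    return rdm

patterns = np.array([[0, 0], [3, 4], [1, 0]])
rdm = euclidean_distance_sketch(patterns)
print(rdm[0, 1], rdm[0, 2])  # 5.0 1.0
```

Since the squared differences are accumulated before the square root, very large inputs can overflow in the sum, which is why rescaling the patterns first is suggested above.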

Examples

>>> import numpy as np
>>> from driada.rsa.core_jit import fast_euclidean_distance
>>> patterns = np.array([[0, 0], [3, 4], [1, 0]])
>>> rdm = fast_euclidean_distance(patterns)
>>> # rdm[0,1] = 5.0 (distance from origin to (3,4))
>>> # rdm[0,2] = 1.0 (distance from origin to (1,0))

See also

compute_rdm

Higher-level function that uses this for euclidean metric

fast_correlation_distance

Alternative distance metric

fast_manhattan_distance

Alternative distance metric

driada.rsa.core_jit.fast_manhattan_distance(patterns)

Compute Manhattan distance matrix using JIT-optimized loops.

Computes pairwise Manhattan (L1) distances between patterns using explicit loops for numba JIT compilation compatibility.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row represents a pattern/item, each column a feature.

Returns:

rdm – Symmetric Manhattan distance matrix (n_items, n_items) with zeros on diagonal. All values are non-negative.

Return type:

np.ndarray

Notes

Manhattan distance (also called L1 distance or taxicab distance) is the sum of absolute differences over features k: d(i,j) = sum_k(abs(patterns[i,k] - patterns[j,k]))

This metric is more robust to outliers than Euclidean distance and is often used for high-dimensional or sparse data.
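
The formula and the robustness remark can both be checked with a few lines of NumPy. The sketch below uses broadcasting instead of explicit loops, so it illustrates the metric rather than the JIT implementation; manhattan_distance_sketch is a hypothetical name, and the outlier example is constructed purely for illustration.

```python
import numpy as np

def manhattan_distance_sketch(patterns):
    """Pairwise L1 distances between rows via broadcasting (illustrative)."""
    patterns = np.asarray(patterns, dtype=float)
    # Shape (n_items, n_items, n_features), then sum |differences| over features
    diff = patterns[:, None, :] - patterns[None, :, :]
    return np.abs(diff).sum(axis=2)

patterns = np.array([[0, 0], [3, 4], [1, 1]])
rdm = manhattan_distance_sketch(patterns)
print(rdm[0, 1], rdm[0, 2], rdm[1, 2])  # 7.0 2.0 5.0

# One outlier feature: nine features differ by 1, one differs by 10
a = np.zeros(10)
b = np.ones(10)
b[-1] = 10.0
l1_share = 10.0 / np.abs(a - b).sum()     # outlier's share of the L1 distance
l2_share = 100.0 / np.sum((a - b) ** 2)   # outlier's share of the squared L2 distance
print(f"{l1_share:.2f} {l2_share:.2f}")  # 0.53 0.92
```

The single outlier feature accounts for about half of the L1 distance but over 90% of the squared Euclidean distance, which is the sense in which L1 is less dominated by extreme values.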

Examples

>>> import numpy as np
>>> from driada.rsa.core_jit import fast_manhattan_distance
>>> patterns = np.array([[0, 0], [3, 4], [1, 1]])
>>> rdm = fast_manhattan_distance(patterns)
>>> # rdm[0,1] = 7 (|0-3| + |0-4|)
>>> # rdm[0,2] = 2 (|0-1| + |0-1|)
>>> # rdm[1,2] = 5 (|3-1| + |4-1|)

See also

compute_rdm

Higher-level function that uses this for manhattan metric

fast_euclidean_distance

Alternative distance metric

fast_correlation_distance

Alternative distance metric