Core RSA Functions

Core RSA functions for computing and comparing RDMs.

driada.rsa.core.compute_rdm(patterns, metric='correlation', logger=None)

Compute representational dissimilarity matrix from patterns.

Parameters:

patterns (np.ndarray or MVData) – Pattern matrix of shape (n_items, n_features) if np.ndarray (will be transposed automatically), or an MVData object. Each row is a pattern/item; each column is a feature.
metric (str, default 'correlation') – Distance metric: ‘correlation’, ‘euclidean’, ‘cosine’, ‘manhattan’
logger (logging.Logger, optional) – Logger instance for debugging (currently unused)

Returns:

rdm – Representational dissimilarity matrix (n_items, n_items)

Return type:

np.ndarray
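As a rough illustration of what a correlation-metric RDM contains, here is a minimal NumPy sketch. It is not DRIADA's implementation: the helper name `correlation_rdm_sketch` is hypothetical, and it assumes rows are items and columns are features.

```python
import numpy as np

def correlation_rdm_sketch(patterns):
    """Hypothetical helper: 1 - Pearson correlation between rows."""
    patterns = np.asarray(patterns, dtype=float)
    # np.corrcoef treats each row as one variable (here: one item)
    rdm = 1.0 - np.corrcoef(patterns)
    # Remove floating-point residue on the diagonal
    np.fill_diagonal(rdm, 0.0)
    return rdm

rng = np.random.RandomState(0)
patterns = rng.randn(4, 30)   # 4 items, 30 features
rdm = correlation_rdm_sketch(patterns)
print(rdm.shape)  # (4, 4)
```

The result is symmetric with a zero diagonal, which is why downstream comparisons only need the upper triangle.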

driada.rsa.core.compute_rdm_from_timeseries_labels(data, labels, metric='correlation', average_method='mean')

Compute RDM from time series data using behavioral variable labels.

Parameters:

data (np.ndarray) – Data array of shape (n_features, n_timepoints)
labels (np.ndarray) – Label for each timepoint, shape (n_timepoints,). Each unique label defines a condition/item.
metric (str, default 'correlation') – Distance metric for RDM computation
average_method (str, default 'mean') – How to average within conditions: ‘mean’ or ‘median’

Return type:

Tuple[ndarray, ndarray]

Returns:

rdm (np.ndarray) – Representational dissimilarity matrix
unique_labels (np.ndarray) – The unique labels in order as they appear in the RDM
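The averaging step that precedes the RDM computation might look like the sketch below. The helper `average_by_label_sketch` is hypothetical, and ordering labels with `np.unique` (sorted order) is an assumption rather than documented DRIADA behaviour. The example uses an outlier to show how 'mean' and 'median' differ.

```python
import numpy as np

def average_by_label_sketch(data, labels, average_method="mean"):
    """Hypothetical helper: one averaged pattern per unique label."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    unique_labels = np.unique(labels)          # assumption: sorted unique labels
    reduce = np.mean if average_method == "mean" else np.median
    # Select the timepoints (columns) of each condition and average over them
    patterns = np.stack(
        [reduce(data[:, labels == lab], axis=1) for lab in unique_labels]
    )
    return patterns, unique_labels

data = np.array([[1.0, 1.0, 10.0, 2.0, 2.0, 2.0]])   # 1 feature, 6 timepoints
labels = np.array(["a", "a", "a", "b", "b", "b"])
mean_patterns, unique_labels = average_by_label_sketch(data, labels, "mean")
median_patterns, _ = average_by_label_sketch(data, labels, "median")
print(mean_patterns.ravel(), median_patterns.ravel())   # [4. 2.] [1. 2.]
```

The single outlier (10.0) pulls the mean of condition 'a' to 4.0 while the median stays at 1.0.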

driada.rsa.core.compute_rdm_from_trials(data, trial_starts, trial_labels, trial_duration=None, metric='correlation', average_method='mean')

Compute RDM from time series data using explicit trial structure.

Parameters:

data (np.ndarray) – Data array of shape (n_features, n_timepoints)
trial_starts (np.ndarray) – Start indices for each trial
trial_labels (np.ndarray) – Label for each trial (same length as trial_starts)
trial_duration (int, optional) – Fixed duration for each trial. If None, uses time until next trial
metric (str, default 'correlation') – Distance metric for RDM computation
average_method (str, default 'mean') – How to average within trial: ‘mean’ or ‘median’

Return type:

Tuple[ndarray, ndarray]

Returns:

rdm (np.ndarray) – Representational dissimilarity matrix
unique_labels (np.ndarray) – The unique trial labels in order as they appear in the RDM
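How trial windows could be cut from the time series is sketched below under stated assumptions. The helper `trial_patterns_sketch` is hypothetical, and letting the last trial run to the end of the recording when `trial_duration` is None is an assumption, since the documentation only says "uses time until next trial".

```python
import numpy as np

def trial_patterns_sketch(data, trial_starts, trial_duration=None):
    """Hypothetical helper: mean pattern per trial, shape (n_trials, n_features)."""
    data = np.asarray(data, dtype=float)
    starts = np.asarray(trial_starts)
    n_timepoints = data.shape[1]
    patterns = []
    for i, start in enumerate(starts):
        if trial_duration is not None:
            end = start + trial_duration     # fixed-length window
        elif i + 1 < len(starts):
            end = starts[i + 1]              # run until the next trial begins
        else:
            end = n_timepoints               # assumption: last trial runs to the end
        end = min(end, n_timepoints)         # never index past the recording
        patterns.append(data[:, start:end].mean(axis=1))
    return np.stack(patterns)

data = np.arange(20, dtype=float).reshape(2, 10)   # 2 features, 10 timepoints
trial_starts = np.array([0, 4, 7])
per_trial = trial_patterns_sketch(data, trial_starts)
fixed = trial_patterns_sketch(data, trial_starts, trial_duration=2)
print(per_trial.shape, fixed.shape)   # (3, 2) (3, 2)
```

Per-trial patterns with the same label would then be combined into one pattern per condition before the RDM is computed.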

driada.rsa.core.compare_rdms(rdm1, rdm2, method='spearman')

Compare two representational dissimilarity matrices.

Quantifies the similarity between two RDMs using correlation or cosine similarity metrics. Only the upper triangular portion (excluding diagonal) is compared since RDMs are symmetric.

Parameters:

rdm1 (np.ndarray) – First RDM, square symmetric matrix of shape (n_items, n_items).
rdm2 (np.ndarray) – Second RDM, must have the same shape as rdm1.
method (str, default 'spearman') – Comparison method:
- ‘spearman’: Spearman rank correlation (robust to monotonic transforms)
- ‘pearson’: Pearson correlation (assumes linear relationship)
- ‘kendall’: Kendall’s tau (robust but slower, O(n²) complexity)
- ‘cosine’: Cosine similarity (angle between RDM vectors)

Returns:

Similarity score between RDMs:
- Correlations (‘spearman’, ‘pearson’, ‘kendall’): Range [-1, 1]
- Cosine similarity: Range [0, 1] (NaN if either RDM has zero norm)
- Returns NaN if correlation cannot be computed (e.g., constant RDMs)

Return type:

float

Raises:

ValueError – If RDMs have different shapes. If method is not one of the supported options.
RuntimeWarning – If either RDM contains NaN values (via warnings.warn).

Notes

Only upper triangular values are used since RDMs are symmetric and diagonal is uninformative (always 0).

P-values from statistical tests are computed internally but not returned. Use bootstrap_rdm_comparison for statistical inference.

Kendall’s tau is more robust than Spearman but has O(n²) complexity in the number of unique values, making it slow for large RDMs.

For cosine similarity, if either RDM vector has zero norm (all off-diagonal values are zero), the function returns NaN and issues a warning.
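The upper-triangle extraction and the zero-norm case described in these notes can be sketched in plain NumPy. The helpers `upper_triangle` and `cosine_sketch` are hypothetical names for illustration, not part of DRIADA.

```python
import numpy as np

def upper_triangle(rdm):
    """Return the off-diagonal upper-triangular entries as a 1-D vector."""
    rdm = np.asarray(rdm, dtype=float)
    rows, cols = np.triu_indices(rdm.shape[0], k=1)   # k=1 skips the diagonal
    return rdm[rows, cols]

def cosine_sketch(rdm1, rdm2):
    """Hypothetical helper: cosine similarity, NaN when a norm is zero."""
    v1, v2 = upper_triangle(rdm1), upper_triangle(rdm2)
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    return float(v1 @ v2 / denom) if denom > 0 else float("nan")

rdm1 = np.array([[0, 0.5, 0.8], [0.5, 0, 0.3], [0.8, 0.3, 0]])
rdm2 = np.array([[0, 0.6, 0.7], [0.6, 0, 0.4], [0.7, 0.4, 0]])
pearson = float(np.corrcoef(upper_triangle(rdm1), upper_triangle(rdm2))[0, 1])
cosine = cosine_sketch(rdm1, rdm2)
zero = cosine_sketch(np.zeros((3, 3)), rdm2)   # zero-norm vector gives NaN
print(f"{pearson:.3f} {cosine:.3f} {zero}")    # 0.954 0.985 nan
```

For an RDM with n items, the comparison vector has n(n-1)/2 entries, so a 3 × 3 RDM contributes only three values.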

Examples

>>> import numpy as np
>>> # Create two similar RDMs
>>> rdm1 = np.array([[0, 0.5, 0.8], [0.5, 0, 0.3], [0.8, 0.3, 0]])
>>> rdm2 = np.array([[0, 0.6, 0.7], [0.6, 0, 0.4], [0.7, 0.4, 0]])

>>> # Compare using different methods
>>> pearson_sim = compare_rdms(rdm1, rdm2, method='pearson')
>>> print(f"Pearson correlation: {pearson_sim:.3f}")
Pearson correlation: 0.954

>>> spearman_sim = compare_rdms(rdm1, rdm2, method='spearman')
>>> print(f"Spearman correlation: {spearman_sim:.3f}")
Spearman correlation: 1.000

>>> # Cosine similarity
>>> cosine_sim = compare_rdms(rdm1, rdm2, method='cosine')
>>> print(f"Cosine similarity: {cosine_sim:.3f}")
Cosine similarity: 0.985

See also

compute_rdm: Compute RDMs from neural patterns
bootstrap_rdm_comparison: Statistical comparison with confidence intervals
rsa_compare: High-level interface for comparing datasets

driada.rsa.core.bootstrap_rdm_comparison(data1, data2, labels1, labels2, metric='correlation', comparison_method='spearman', n_bootstrap=1000, random_state=None)

Bootstrap test for RDM similarity between two datasets.

Performs statistical inference on RDM similarity using within-condition resampling. This maintains the experimental design while estimating confidence intervals and assessing reliability of the similarity.

Parameters:

data1 (np.ndarray) – First dataset of shape (n_features1, n_timepoints). Features can be different between datasets (e.g., comparing V1 vs V2 neurons).
data2 (np.ndarray) – Second dataset of shape (n_features2, n_timepoints). Must have the same number of timepoints as data1.
labels1 (np.ndarray) – Condition labels for each timepoint in data1, shape (n_timepoints,).
labels2 (np.ndarray) – Condition labels for each timepoint in data2, shape (n_timepoints,). Must contain the same unique values as labels1.
metric (str, default 'correlation') – Distance metric for RDM computation. See compute_rdm for options.
comparison_method (str, default 'spearman') – Method for comparing RDMs. See compare_rdms for options.
n_bootstrap (int, default 1000) – Number of bootstrap iterations. Higher values give more stable estimates but take longer.
random_state (int, optional) – Random seed for reproducibility. Creates a local RandomState to avoid affecting global random state.

Returns:

Dictionary containing:
- ‘observed’: float, observed RDM similarity between datasets
- ‘bootstrap_distribution’: np.ndarray, bootstrap similarity values
- ‘p_value’: float, two-tailed test of observed vs bootstrap mean
- ‘ci_lower’: float, 2.5th percentile of bootstrap distribution
- ‘ci_upper’: float, 97.5th percentile of bootstrap distribution
- ‘mean’: float, mean of bootstrap distribution
- ‘std’: float, standard deviation of bootstrap distribution

Return type:

dict

Raises:

ValueError – If datasets don’t have the same unique condition labels.

Notes

The bootstrap procedure:
1. Resamples timepoints within each condition independently
2. Maintains the number of samples per condition
3. Computes RDMs from resampled data
4. Calculates similarity between resampled RDMs

This within-condition resampling preserves the experimental design while capturing trial-by-trial variability.

The p-value tests whether the observed similarity is extreme relative to the bootstrap distribution mean. This is NOT a standard null hypothesis test but rather a stability assessment.

Uses a local RandomState to avoid modifying global numpy random state, ensuring thread safety and reproducibility.
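The within-condition resampling described in these notes might be sketched as follows. The helper `resample_within_conditions` is a hypothetical name, and the code only illustrates the index-drawing step, not the full bootstrap loop.

```python
import numpy as np

def resample_within_conditions(labels, rng):
    """Hypothetical helper: bootstrap indices drawn separately per condition."""
    labels = np.asarray(labels)
    indices = np.empty(len(labels), dtype=int)
    for lab in np.unique(labels):
        positions = np.flatnonzero(labels == lab)
        # Sample with replacement, keeping the per-condition count unchanged
        indices[positions] = rng.choice(positions, size=len(positions), replace=True)
    return indices

rng = np.random.RandomState(42)            # local generator; global state untouched
labels = np.repeat(["A", "B", "C"], 30)
data = rng.randn(20, labels.size)          # 20 features, 90 timepoints

idx = resample_within_conditions(labels, rng)
resampled = data[:, idx]                   # labels stay valid for the resampled data
print(resampled.shape)                     # (20, 90)
```

Because each position is refilled with a timepoint from the same condition, the original label vector can be reused unchanged for every bootstrap iteration.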

Examples

>>> import numpy as np
>>> # Create two random datasets that share the same condition labels
>>> np.random.seed(42)
>>> n_features = 20
>>> n_timepoints = 90
>>>
>>> # 3 conditions, 30 samples each
>>> labels = np.repeat(['A', 'B', 'C'], 30)
>>>
>>> # Generate independent Gaussian noise for both datasets (no condition-specific signal)
>>> v1_data = np.random.randn(n_features, n_timepoints)
>>> v2_data = np.random.randn(n_features, n_timepoints)
>>>
>>> # Bootstrap will resample and compute similarity distribution
>>> results = bootstrap_rdm_comparison(
...     v1_data, v2_data, labels, labels,
...     n_bootstrap=50, random_state=42
... )
>>>
>>> # Check that results contain expected keys
>>> print('Keys:', sorted(results.keys()))
Keys: ['bootstrap_distribution', 'ci_lower', 'ci_upper', 'mean', 'observed', 'p_value', 'std']
>>>
>>> # Random data should give low correlation
>>> print(f"Observed between -1 and 1: {-1 <= results['observed'] <= 1}")
Observed between -1 and 1: True

See also

compare_rdms: Direct RDM comparison without bootstrap
compute_rdm_from_timeseries_labels: Compute RDM from labeled data
rsa_compare: High-level interface with multiple data types

driada.rsa.core.compute_rdm_unified(data, items=None, data_type='calcium', metric='correlation', average_method='mean', trial_duration=None)

Compute RDM with automatic data type detection and dispatching.

This function intelligently dispatches to the appropriate RDM computation method based on the input data type and items specification. It provides a unified interface for computing RDMs from various data structures (arrays, MVData, Experiments) and item definitions (pre-averaged patterns, timeseries labels, or trial structures).

Parameters:

data (np.ndarray, MVData, or Experiment) – The data to compute RDM from:
- np.ndarray: Raw data matrix (n_features, n_timepoints)
- MVData: MVData object
- Experiment: DRIADA Experiment object
items (np.ndarray, str, dict, or None) – How to define items/conditions:
- None: Compute RDM directly from patterns (requires pre-averaged data)
- np.ndarray: Condition labels for each timepoint
- str: Name of dynamic feature (for Experiment objects)
- dict: Trial structure with ‘trial_starts’ and ‘trial_labels’
data_type (str, default 'calcium') – For Experiment objects, which data type to use (‘calcium’ or ‘spikes’)
metric (str, default 'correlation') – Distance metric for RDM computation. Options: ‘correlation’, ‘euclidean’, ‘cosine’, ‘manhattan’
average_method (str, default 'mean') – How to average within conditions (‘mean’ or ‘median’)
trial_duration (int, optional) – For trial structure, fixed duration for each trial. If specified in both parameter and items dict, dict value takes precedence.

Return type:

Tuple[ndarray, Optional[ndarray]]

Returns:

rdm (np.ndarray) – Representational dissimilarity matrix
labels (np.ndarray or None) – The unique labels/conditions if items were specified

Raises:

ValueError – Raised in any of these cases:
- items is required but not provided for Experiment objects
- the trial structure dict is missing required keys
- metric is not one of the valid options
- MVData/Embedding is used with a trial structure

Notes

Imports are performed inside the function to avoid circular dependencies. This has minimal performance impact as the function is typically called only a few times per analysis.

When trial_duration is specified in both the items dict and as a parameter, the dict value takes precedence and a warning is issued.
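The precedence rule for `trial_duration` could be expressed as in the sketch below. The helper `resolve_trial_duration` and the dict key name 'trial_duration' are assumptions made for illustration; the documentation does not show how the library implements this check.

```python
import warnings

def resolve_trial_duration(items, trial_duration=None):
    """Hypothetical helper: pick the effective trial duration."""
    # Assumption: the items dict stores the value under the key 'trial_duration'
    dict_value = items.get("trial_duration") if isinstance(items, dict) else None
    if dict_value is not None:
        if trial_duration is not None:
            warnings.warn(
                "trial_duration given twice; using the value from the items dict"
            )
        return dict_value
    return trial_duration

items = {"trial_starts": [0, 50], "trial_labels": ["A", "B"], "trial_duration": 20}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    duration = resolve_trial_duration(items, trial_duration=30)
print(duration, len(caught))   # 20 1
```

Passing the duration in only one place avoids the warning entirely.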

Examples

>>> import numpy as np
>>> # Direct pattern RDM (pre-averaged data)
>>> patterns = np.random.randn(10, 50)  # 10 items, 50 features
>>> rdm, _ = compute_rdm_unified(patterns)
>>> print(f"RDM shape: {rdm.shape}")
RDM shape: (10, 10)

>>> # From time series with labels
>>> data = np.random.randn(100, 1000)  # 100 features, 1000 timepoints
>>> labels = np.repeat([0, 1, 2, 3], 250)
>>> rdm, unique_labels = compute_rdm_unified(data, labels)
>>> print(f"RDM shape: {rdm.shape}, unique labels: {unique_labels}")
RDM shape: (4, 4), unique labels: [0 1 2 3]

>>> # From MVData object
>>> from driada.dim_reduction.data import MVData
>>> mvdata = MVData(np.random.randn(50, 100))  # 50 features, 100 samples
>>> rdm, _ = compute_rdm_unified(mvdata)
>>> print(f"RDM shape: {rdm.shape}")
RDM shape: (100, 100)

See also

compute_rdm: Direct RDM computation from patterns
compute_rdm_from_timeseries_labels: RDM from labeled timeseries
compute_rdm_from_trials: RDM from trial structure
compute_experiment_rdm: RDM from Experiment objects

driada.rsa.core.rsa_compare(data1, data2, items=None, metric='correlation', comparison='spearman', data_type='calcium', logger=None)

Compare neural representations using RSA.

This is a simplified API for the most common RSA use case: comparing two sets of neural representations. It automatically handles different data types (arrays, MVData, Embeddings, Experiments) and computes the similarity between their representational geometries.

Parameters:

data1 (np.ndarray, MVData, or Experiment) – First dataset (n_items, n_features) if array, MVData object, or Experiment
data2 (np.ndarray, MVData, or Experiment) – Second dataset (same n_items as data1)
items (str, dict, or None) – How to define conditions (required for Experiment objects):
- None: For arrays/MVData, assumes data is already averaged per item
- str: Name of dynamic feature (e.g., ‘stimulus_type’)
- dict: Trial structure with ‘trial_starts’ and ‘trial_labels’
metric (str, default 'correlation') – Distance metric for RDM computation
comparison (str, default 'spearman') – Method for comparing RDMs (‘spearman’, ‘pearson’, ‘kendall’, ‘cosine’)
data_type (str, default 'calcium') – For Experiment objects, which data to use (‘calcium’ or ‘spikes’)
logger (logging.Logger, optional) – Logger for debugging messages

Returns:

similarity – Similarity score between the two neural representations. Range depends on comparison method: [-1, 1] for correlations, [0, 1] for cosine similarity.

Return type:

float

Raises:

ValueError – Raised in any of these cases:
- items is not specified for Experiment objects
- an Experiment is compared with non-Experiment data
- the RDMs have incompatible shapes (different numbers of items)

Notes

Imports are performed inside the function to avoid circular dependencies. Embedding objects are automatically converted to MVData for uniform processing.

When comparing arrays or MVData without items specification, assumes the data is already averaged per condition (each row represents one item/condition).

Examples

>>> import numpy as np
>>> # Compare two brain areas with structured data
>>> np.random.seed(42)
>>> n_stimuli = 5
>>> n_neurons_v1, n_neurons_v2 = 20, 15
>>>
>>> # Create orthogonal patterns for each stimulus
>>> v1_data = np.zeros((n_stimuli, n_neurons_v1))
>>> v2_data = np.zeros((n_stimuli, n_neurons_v2))
>>>
>>> # Each stimulus activates different neurons in both areas
>>> for i in range(n_stimuli):
...     # V1: each stimulus activates 4 specific neurons
...     v1_data[i, i*4:(i+1)*4] = 1.0
...     # V2: similar pattern with 3 neurons per stimulus
...     v2_data[i, i*3:(i+1)*3] = 1.0
>>>
>>> # Add small noise for realism
>>> v1_data += 0.1 * np.random.randn(n_stimuli, n_neurons_v1)
>>> v2_data += 0.1 * np.random.randn(n_stimuli, n_neurons_v2)
>>>
>>> # This creates similar RDM structure in both areas
>>> similarity = rsa_compare(v1_data, v2_data, comparison='spearman')
>>> print(f"RSA similarity: {similarity:.3f}")
RSA similarity: 0.479

>>> # Compare using compute_rdm_unified first
>>> from driada.rsa import compute_rdm_unified
>>> np.random.seed(123)
>>> data1 = np.random.randn(50, 90)  # 50 features, 90 timepoints
>>> data2 = np.random.randn(40, 90)  # 40 features, same timepoints
>>> labels = np.repeat(['A', 'B', 'C'], 30)  # 30 samples per condition
>>> # First compute RDMs with labels
>>> rdm1, _ = compute_rdm_unified(data1, items=labels)
>>> rdm2, _ = compute_rdm_unified(data2, items=labels)
>>> # Both RDMs now have shape (3, 3) for the 3 conditions
>>> print(f"RDM shapes: {rdm1.shape}, {rdm2.shape}")
RDM shapes: (3, 3), (3, 3)
>>> # Compare the RDMs
>>> from driada.rsa import compare_rdms
>>> similarity = compare_rdms(rdm1, rdm2)
>>> print(f"RSA similarity between -1 and 1: {-1 <= similarity <= 1}")
RSA similarity between -1 and 1: True

See also

compute_rdm_unified: Unified RDM computation interface
compare_rdms: Direct comparison of RDM matrices
bootstrap_rdm_comparison: Statistical comparison with confidence intervals

Core functions for computing representational dissimilarity matrices (RDMs) and comparing neural representations.

JIT-Optimized Functions

JIT-compiled core functions for RSA.

These functions provide optimized implementations of computationally intensive RSA operations.

driada.rsa.core_jit.fast_correlation_distance(patterns)

Compute correlation distance matrix using JIT-optimized loops.

This function computes pairwise correlation distances between patterns using explicit loops optimized for numba JIT compilation. It handles edge cases like zero-variance patterns and uses sample correlation (ddof=1) to match numpy.corrcoef behavior.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row represents a pattern/item, each column a feature.

Returns:

rdm – Correlation distance matrix (n_items, n_items). Values range from 0 (identical patterns) to 2 (perfectly anti-correlated). Diagonal is always 0.

Return type:

np.ndarray

Notes

The function standardizes each pattern to zero mean and unit variance using sample standard deviation (n-1 denominator). For patterns with zero variance, correlation is undefined and distance is set to 0 if patterns are identical, 1 otherwise.

Correlation values are clipped to [-1, 1] to handle numerical errors before computing distance as 1 - correlation.
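The standardization, clipping, and zero-variance handling described in these notes can be approximated with vectorized NumPy. This is a sketch under stated assumptions, not the JIT-compiled code: `correlation_distance_sketch` is a hypothetical name and the loop structure differs from the original.

```python
import numpy as np

def correlation_distance_sketch(patterns):
    """Hypothetical vectorized counterpart of the behaviour described above."""
    patterns = np.asarray(patterns, dtype=float)
    n_items, n_features = patterns.shape
    centered = patterns - patterns.mean(axis=1, keepdims=True)
    std = patterns.std(axis=1, ddof=1)                 # sample standard deviation
    flat = std == 0                                    # zero-variance patterns
    z = centered / np.where(flat, 1.0, std)[:, None]   # avoid division by zero
    corr = np.clip(z @ z.T / (n_features - 1), -1.0, 1.0)
    rdm = 1.0 - corr
    # Undefined correlation: 0 if the two patterns are identical, 1 otherwise
    for i in np.flatnonzero(flat):
        for j in range(n_items):
            d = 0.0 if np.array_equal(patterns[i], patterns[j]) else 1.0
            rdm[i, j] = rdm[j, i] = d
    np.fill_diagonal(rdm, 0.0)
    return rdm

patterns = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 1]])
rdm = correlation_distance_sketch(patterns)
print(np.round(rdm, 6))
```

Rows 0 and 1 are perfectly correlated (distance 0), while the constant row 2 has undefined correlation and receives distance 1 to both.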

Examples

>>> import numpy as np
>>> patterns = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 1]])
>>> rdm = fast_correlation_distance(patterns)
>>> # rdm[0,1] ≈ 0 (perfect correlation)
>>> # rdm[0,2] = 1 (undefined correlation, different patterns)

See also

compute_rdm: Higher-level function that uses this for correlation metric
fast_euclidean_distance: Alternative distance metric
fast_manhattan_distance: Alternative distance metric

driada.rsa.core_jit.fast_average_patterns(data, labels, unique_labels)

Average patterns within conditions using fast JIT-compiled loops.

This function computes the mean pattern for each unique condition label by averaging all timepoints that belong to that condition. Optimized for performance using explicit loops compatible with numba JIT compilation.

Parameters:

data (np.ndarray) – Data array of shape (n_features, n_timepoints)
labels (np.ndarray) – Label for each timepoint
unique_labels (np.ndarray) – Unique condition labels

Returns:

patterns – Averaged patterns (n_conditions, n_features)

Return type:

np.ndarray

Notes

If no timepoints match a given label, that condition’s pattern will be all zeros. This is intentional to maintain consistent output shape.
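The all-zeros behaviour for unmatched labels can be illustrated with a plain NumPy loop. The function `average_patterns_sketch` is a hypothetical stand-in that mirrors the documented behaviour without numba.

```python
import numpy as np

def average_patterns_sketch(data, labels, unique_labels):
    """Hypothetical pure-NumPy loop mirroring the documented behaviour."""
    data = np.asarray(data, dtype=float)
    n_features = data.shape[0]
    patterns = np.zeros((len(unique_labels), n_features))
    for c, lab in enumerate(unique_labels):
        mask = labels == lab
        if mask.any():                       # leave the row at zero if no match
            patterns[c] = data[:, mask].mean(axis=1)
    return patterns

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
labels = np.array([0, 1, 0, 1])
unique = np.array([0, 1, 2])                 # label 2 never occurs
patterns = average_patterns_sketch(data, labels, unique)
print(patterns)
```

The third row stays at zero, so the output keeps shape (n_conditions, n_features) even when a condition has no samples. A zero-variance row like this then hits the undefined-correlation case if a correlation metric is applied afterwards.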

Examples

>>> import numpy as np
>>> data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
>>> labels = np.array([0, 1, 0, 1])
>>> unique = np.array([0, 1])
>>> patterns = fast_average_patterns(data, labels, unique)
>>> # patterns[0] = mean of columns 0,2 = [2, 6]
>>> # patterns[1] = mean of columns 1,3 = [3, 7]

See also

compute_rdm_from_timeseries_labels: Higher-level function that uses this
compute_rdm_from_trials: Alternative averaging approach for trial data

driada.rsa.core_jit.fast_euclidean_distance(patterns)

Compute Euclidean distance matrix using JIT-optimized loops.

Computes pairwise Euclidean distances between all pattern pairs using explicit loops for numba JIT compilation compatibility.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row is a pattern in n_features-dimensional space.

Returns:

rdm – Symmetric Euclidean distance matrix (n_items, n_items) with zeros on diagonal. Values are non-negative.

Return type:

np.ndarray

Notes

Uses the standard Euclidean distance formula: d(i,j) = sqrt(sum((patterns[i,k] - patterns[j,k])^2))

No overflow protection is implemented. For very large values, consider normalizing patterns first.
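The formula above has a compact broadcasting equivalent in NumPy, shown here as a sketch for checking results on small inputs. The name `euclidean_rdm_sketch` is hypothetical, and the approach trades the explicit loops for an intermediate array of shape (n_items, n_items, n_features).

```python
import numpy as np

def euclidean_rdm_sketch(patterns):
    """Hypothetical helper: pairwise Euclidean distances via broadcasting."""
    patterns = np.asarray(patterns, dtype=float)
    # diff[i, j, k] = patterns[i, k] - patterns[j, k]
    diff = patterns[:, None, :] - patterns[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

patterns = np.array([[0, 0], [3, 4], [1, 0]])
rdm = euclidean_rdm_sketch(patterns)
print(rdm[0, 1], rdm[0, 2])   # 5.0 1.0
```

The intermediate array grows quickly with the number of items, which is one reason explicit loops can be preferable for large pattern matrices.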

Examples

>>> import numpy as np
>>> patterns = np.array([[0, 0], [3, 4], [1, 0]])
>>> rdm = fast_euclidean_distance(patterns)
>>> # rdm[0,1] = 5.0 (distance from origin to (3,4))
>>> # rdm[0,2] = 1.0 (distance from origin to (1,0))

See also

compute_rdm: Higher-level function that uses this for euclidean metric
fast_correlation_distance: Alternative distance metric
fast_manhattan_distance: Alternative distance metric

driada.rsa.core_jit.fast_manhattan_distance(patterns)

Compute Manhattan distance matrix using JIT-optimized loops.

Computes pairwise Manhattan (L1) distances between patterns using explicit loops for numba JIT compilation compatibility.

Parameters:

patterns (np.ndarray) – Pattern matrix of shape (n_items, n_features). Each row represents a pattern/item, each column a feature.

Returns:

rdm – Symmetric Manhattan distance matrix (n_items, n_items) with zeros on diagonal. All values are non-negative.

Return type:

np.ndarray

Notes

Manhattan distance (also called L1 distance or taxicab distance) is the sum of absolute differences: d(i,j) = sum(abs(patterns[i,k] - patterns[j,k]))

This metric is more robust to outliers than Euclidean distance and is often used for high-dimensional or sparse data.
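To make the robustness remark concrete, the following sketch compares L1 and L2 distances for a difference spread evenly across features and for a single large outlier. The vectors and the helpers `l1` and `l2` are invented for illustration only.

```python
import numpy as np

a = np.zeros(10)
spread = np.ones(10)                 # small difference on every feature
outlier = np.zeros(10)
outlier[0] = 10.0                    # one large difference on a single feature

def l1(x, y):
    """Manhattan distance: sum of absolute differences."""
    return float(np.abs(x - y).sum())

def l2(x, y):
    """Euclidean distance: square root of summed squared differences."""
    return float(np.sqrt(((x - y) ** 2).sum()))

print(l1(a, spread), l1(a, outlier))                        # 10.0 10.0
print(round(l2(a, spread), 3), round(l2(a, outlier), 3))    # 3.162 10.0
```

Manhattan distance rates both patterns as equally far from `a`, whereas Euclidean distance places the outlier pattern about three times farther because squaring amplifies the single large deviation.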

Examples

>>> import numpy as np
>>> patterns = np.array([[0, 0], [3, 4], [1, 1]])
>>> rdm = fast_manhattan_distance(patterns)
>>> # rdm[0,1] = 7 (|0-3| + |0-4|)
>>> # rdm[0,2] = 2 (|0-1| + |0-1|)
>>> # rdm[1,2] = 5 (|3-1| + |4-1|)

See also

compute_rdm: Higher-level function that uses this for manhattan metric
fast_euclidean_distance: Alternative distance metric
fast_correlation_distance: Alternative distance metric