From awesome-cognitive-and-neuroscience-skills
Guides neural population decoding with MVPA, RSA, temporal generalization, and encoding models for systems neuroscience research questions.
`npx claudepluginhub neuroaihub/awesome_cognitive_and_neuroscience_skills --plugin awesome-cognitive-and-neuroscience-skills`

This skill uses the workspace's default tool permissions.
This skill encodes expert methodological knowledge for multivariate neural decoding analyses in systems neuroscience. It covers cross-validated classification (MVPA), representational similarity analysis (RSA), temporal generalization, and encoding models. The skill provides domain-specific decision logic, parameter recommendations, and pitfall warnings that a machine-learning engineer without neuroscience training would not know.
Before executing the domain-specific steps below, you MUST review the detailed methodology guidance in the research-literacy skill.
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Univariate analysis tests whether the mean activity level differs across conditions in a region. Decoding tests whether spatial patterns of activity carry information, even when mean activity is identical across conditions (Haynes, 2015). Use decoding when:
Domain judgment: High decoding accuracy does NOT mean the decoded region is the source of the representation. It means the information is accessible from that region's patterns. A downstream region receiving a copy of the signal will also decode well (Haynes, 2015).
```
What is your research question?
|
+-- "Is stimulus/task information present in this brain region's patterns?"
|   --> Cross-validated classification (MVPA)
|       Output: classification accuracy or d-prime
|
+-- "How are representations organized? Does the geometry match a model?"
|   --> Representational Similarity Analysis (RSA)
|       Output: model-RDM correlation, noise ceiling
|
+-- "When does information emerge and how does it transform over time?"
|   --> Temporal Generalization (time x time decoding)
|       Output: temporal generalization matrix
|       Best for: EEG, MEG, intracranial recordings
|
+-- "What stimulus features drive neural responses across the feature space?"
    --> Encoding Models (voxelwise/channel-wise prediction)
        Output: prediction accuracy (R^2), feature tuning maps
```
| Classifier | When to Use | When to Avoid | Source |
|---|---|---|---|
| Linear SVM | Default choice; robust to high dimensionality; works well with small samples | When you need probabilistic outputs (use logistic regression) | Misaki et al., 2010; Varoquaux et al., 2017 |
| LDA | Fast; good when n_features << n_samples after reduction | Raw high-dimensional data (covariance estimate unstable) | Misaki et al., 2010 |
| Logistic Regression | When you need class probabilities; with L1 for sparse solutions | Rarely a bad choice; comparable to linear SVM | Varoquaux et al., 2017 |
| Linear kernel (general) | Almost always for fMRI/EEG | Nonlinear kernels rarely improve and risk overfitting | Misaki et al., 2010 |
Domain judgment: Linear classifiers are strongly preferred in neuroimaging because (1) fMRI/EEG patterns are high-dimensional relative to sample size, making nonlinear methods prone to overfitting, and (2) linear weights are more interpretable neurally, though see Haufe et al. (2014) on the distinction between classifier weights and activation patterns.
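As a minimal sketch of the default recipe above (linear classifier, cross-validated accuracy), the following uses scikit-learn on simulated "voxel" patterns. All data, dimensions, and effect sizes here are illustrative, not recommendations.

```python
# Illustrative sketch: cross-validated linear decoding of two conditions
# from simulated high-dimensional patterns (small-sample regime typical
# of fMRI/EEG). Signal strength and dimensions are made up for the demo.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score, StratifiedKFold

rng = np.random.default_rng(0)
n_trials, n_voxels = 80, 200          # many features, few trials
y = np.repeat([0, 1], n_trials // 2)  # two conditions
X = rng.standard_normal((n_trials, n_voxels))
X[y == 1, :10] += 0.8                 # weak signal in a few features

clf = SVC(kernel="linear", C=1.0)     # linear kernel, per the table above
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.2f} (chance = 0.50)")
```

Swapping `SVC(kernel="linear")` for `LogisticRegression` yields class probabilities with comparable accuracy, consistent with the table's guidance.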
| Strategy | When to Use | Rationale |
|---|---|---|
| Leave-one-run-out | fMRI (standard) | Respects temporal autocorrelation within runs; prevents leakage from slow hemodynamic signals (Varoquaux et al., 2017) |
| Stratified k-fold (k=5-10) | EEG/MEG with many trials | Balances class proportions in each fold; k=5 recommended for bias-variance tradeoff (Varoquaux, 2018) |
| Leave-one-trial-out | When few trials available | Maximum training data but high variance; avoid for fMRI due to temporal autocorrelation (Varoquaux et al., 2017) |
| Leave-one-subject-out | Between-subject generalization | Tests whether patterns generalize across individuals |
CRITICAL -- Information leakage: Feature selection, normalization, and dimensionality reduction MUST be performed WITHIN each cross-validation fold, using ONLY training data. Fitting a PCA or z-scoring across all data before splitting inflates accuracy by leaking test-set statistics into training (Kriegeskorte et al., 2009; Varoquaux et al., 2017).
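One way to enforce the within-fold rule is to wrap all preprocessing in a scikit-learn `Pipeline`, so the scaler is re-fit on training data inside every fold. The sketch below also illustrates leave-one-run-out splitting via `LeaveOneGroupOut`; the run structure and signal are simulated.

```python
# Illustrative sketch: z-scoring fitted INSIDE each fold via a Pipeline,
# with leave-one-run-out splits defined by simulated run labels.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, LeaveOneGroupOut

rng = np.random.default_rng(1)
n_runs, trials_per_run, n_voxels = 6, 20, 100
runs = np.repeat(np.arange(n_runs), trials_per_run)       # run label per trial
y = np.tile(np.repeat([0, 1], trials_per_run // 2), n_runs)
X = rng.standard_normal((len(y), n_voxels))
X[y == 1, :8] += 0.7                                      # weak class signal

# StandardScaler is re-fit on the training runs only in every fold,
# so no test-run statistics leak into training.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=LeaveOneGroupOut(), groups=runs)
print(f"leave-one-run-out accuracy: {scores.mean():.2f}")
```

Fitting `StandardScaler` (or a PCA) on the full `X` before the loop is exactly the leakage the warning above describes.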
RSA abstracts from activity patterns to a condition-by-condition dissimilarity matrix (RDM), enabling comparison across brain regions, species, and computational models (Kriegeskorte et al., 2008).
| Distance Metric | Properties | When to Use | Source |
|---|---|---|---|
| Correlation distance (1 - Pearson r) | Invariant to mean and scale | Default for comparing pattern shape; standard in early RSA | Kriegeskorte et al., 2008 |
| Euclidean distance | Sensitive to amplitude | When amplitude differences are meaningful | Kriegeskorte et al., 2008 |
| Crossnobis distance | Cross-validated Mahalanobis; unbiased estimator with interpretable zero | Preferred for inferential statistics; requires multi-run data | Walther et al., 2016; Kriegeskorte & Diedrichsen, 2019 |
Domain judgment: The crossnobis estimator is unbiased -- its expected value is zero when two conditions have identical representations, unlike correlation distance or Euclidean distance which are positively biased by noise. This means crossnobis values can be negative (not a true distance), but this property makes it valid for statistical inference without bias correction (Walther et al., 2016).
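The unbiasedness property can be seen in a toy simulation. The sketch below is a minimal illustration of the crossnobis idea, not a full toolbox implementation: the A-minus-B pattern difference is estimated independently in two data partitions, prewhitened, and combined by an inner product, so noise that is independent across partitions cancels in expectation. The noise covariance is assumed known (identity) here.

```python
# Toy sketch of the crossnobis estimator: with identical conditions
# (true difference = 0), the cross-validated estimate averages to zero
# and individual values can be negative, unlike a squared distance.
import numpy as np

rng = np.random.default_rng(2)
n_voxels = 50
prec = np.eye(n_voxels)           # assumed-known noise precision (whitening)
true_diff = np.zeros(n_voxels)    # conditions A and B are identical

def noisy_difference():
    """Noisy A-minus-B pattern difference from one independent partition."""
    a = true_diff / 2 + rng.standard_normal(n_voxels)
    b = -true_diff / 2 + rng.standard_normal(n_voxels)
    return a - b

# inner product of differences from two INDEPENDENT partitions
vals = [noisy_difference() @ prec @ noisy_difference() / n_voxels
        for _ in range(2000)]
print(f"mean crossnobis value: {np.mean(vals):+.3f} (expected ~0)")
```

Replacing the independent second partition with the same partition (i.e., squaring one noisy difference) would give a strictly positive, noise-inflated estimate, which is the bias crossnobis removes.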
Domain judgment: If a model falls within the noise ceiling, it explains as much variance as is explainable given the noise in the data. A model below the lower bound leaves systematic variance unexplained. This is NOT the same as a significance test -- a model can be significantly correlated with brain RDMs yet still fall below the noise ceiling (Nili et al., 2014).
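The core RSA computation, comparing a neural RDM to a model RDM, can be sketched in a few lines. The data, condition count, and model features below are made up for illustration; Spearman correlation on the unique off-diagonal entries is a common choice when only the rank order of dissimilarities is trusted.

```python
# Illustrative sketch: build a correlation-distance RDM from condition
# patterns and compare it to a model RDM via Spearman correlation over
# the unique condition pairs.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
n_conditions, n_voxels = 8, 120
patterns = rng.standard_normal((n_conditions, n_voxels))  # condition x voxel

# correlation distance = 1 - Pearson r between condition patterns
neural_rdm = squareform(pdist(patterns, metric="correlation"))
model_rdm = squareform(pdist(rng.standard_normal((n_conditions, 4))))

iu = np.triu_indices(n_conditions, k=1)   # unique pairs, no diagonal
rho, _ = spearmanr(neural_rdm[iu], model_rdm[iu])
print(f"model-RDM Spearman rho: {rho:.2f}")
```

For inference, this `rho` would be evaluated against the noise ceiling described above, not only against zero.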
See references/rsa-guide.md for a complete step-by-step RSA workflow.
Train a classifier at each time point t, test it at every time point t'. The resulting time x time matrix reveals the dynamics of neural representations (King & Dehaene, 2014).
| Pattern | Matrix Shape | Interpretation | Example |
|---|---|---|---|
| Diagonal only | Thin diagonal stripe | Information is present but the neural code changes over time (chain of transient states) | Sequence of processing stages |
| Square block | Broad off-diagonal generalization | Stable, sustained representation (same code maintained) | Working memory maintenance |
| Off-diagonal stripe | Horizontal or vertical extension | A code trained at one time reactivates later | Memory reactivation |
| Below-diagonal spread | Widening below diagonal | Later representations are decodable by earlier classifiers (persistent code) | Sustained sensory trace |
(King & Dehaene, 2014; Grootswagers et al., 2017)
| Parameter | Recommended Value | Rationale | Source |
|---|---|---|---|
| Window width | 50 ms for EEG/MEG | Balances temporal resolution with SNR | Grootswagers et al., 2017 |
| Step size | 10 ms for EEG/MEG | Provides smooth temporal profile without excessive computation | Grootswagers et al., 2017 |
| Baseline window | -200 to 0 ms | Standard pre-stimulus baseline | Grootswagers et al., 2017 |
| Features | All sensors at time point t | Use all channels; spatial patterns carry information | King & Dehaene, 2014 |
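The train-at-t, test-at-t' procedure can be sketched with a plain double loop over time points, using all sensors at each time point as features. The simulated data below plant a sustained signal in a mid-trial window, so the matrix should show a square block there; window/step smoothing from the table above is omitted for brevity.

```python
# Illustrative sketch of temporal generalization: train a linear
# classifier at each time point, test it at every time point.
# Data are simulated sensors x time; the signal window is made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n_trials, n_sensors, n_times = 100, 32, 20
y = np.repeat([0, 1], n_trials // 2)
X = rng.standard_normal((n_trials, n_sensors, n_times))
X[y == 1, :5, 8:14] += 0.8            # sustained signal from t=8 to t=13

train = rng.permutation(n_trials)[:70]
test = np.setdiff1d(np.arange(n_trials), train)

tg = np.zeros((n_times, n_times))     # tg[t_train, t_test]
for t_tr in range(n_times):
    clf = LogisticRegression(max_iter=1000).fit(X[train, :, t_tr], y[train])
    for t_te in range(n_times):
        tg[t_tr, t_te] = clf.score(X[test, :, t_te], y[test])

# a square block of above-chance accuracy around t=8..13 indicates a
# stable, sustained code (same classifier generalizes across the window)
print(f"signal block: {tg[8:14, 8:14].mean():.2f}, "
      f"baseline: {tg[:5, :5].mean():.2f}")
```

In practice, dedicated tooling (e.g., MNE-Python's decoding module) handles the same loop with proper cross-validation and sliding windows.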
Encoding models predict neural responses from stimulus features, complementing decoding (which predicts stimuli from neural responses).
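A minimal encoding-model sketch, assuming a simple linear feature space: ridge regression predicts each simulated "voxel" response from stimulus features, and held-out R^2 per voxel measures prediction accuracy. All names, sizes, and the noise level are illustrative.

```python
# Illustrative encoding-model sketch: ridge regression from stimulus
# features to per-voxel responses, scored by held-out R^2.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(5)
n_stimuli, n_features, n_voxels = 200, 10, 30
features = rng.standard_normal((n_stimuli, n_features))   # stimulus features
weights = rng.standard_normal((n_features, n_voxels))     # simulated tuning
responses = features @ weights + 0.5 * rng.standard_normal((n_stimuli, n_voxels))

f_tr, f_te, r_tr, r_te = train_test_split(
    features, responses, test_size=0.25, random_state=0)
model = Ridge(alpha=1.0).fit(f_tr, r_tr)
r2 = r2_score(r_te, model.predict(f_te), multioutput="raw_values")
print(f"mean held-out R^2 across voxels: {r2.mean():.2f}")
```

The fitted `model.coef_` (features x voxels) plays the role of the feature tuning maps mentioned above; with rich feature spaces, the ridge penalty `alpha` is typically tuned by nested cross-validation.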
Feature selection, z-scoring, PCA, or any data-driven preprocessing on the full dataset before cross-validation splitting will leak information from test folds into training, inflating accuracy. ALL such steps must occur WITHIN each fold (Kriegeskorte et al., 2009; Varoquaux et al., 2017).
Decoding "success" may reflect confounds rather than neural representations, such as condition differences in reaction time, eye movements, or head motion.
Selecting an ROI based on significant searchlight clusters and then performing additional analyses on those clusters is circular (Kriegeskorte et al., 2009; Etzel et al., 2013). Use independent data or pre-registered ROIs for follow-up analyses.
Raw SVM or regression weights do NOT indicate which voxels/channels are most activated by a condition. They indicate which features are most useful for discrimination, which can include suppressing noise. To obtain neurophysiologically interpretable maps, transform weights into activation patterns using the method of Haufe et al. (2014).
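The weight-to-pattern transform can be sketched for a single linear filter w, where the activation pattern is a = Cov(X) @ w up to scaling (this is the single-filter special case of Haufe et al., 2014). The toy data below are constructed so that one channel carries only noise correlated with the signal channel; the classifier weights that noise channel heavily to cancel noise, but the transformed pattern correctly concentrates on the signal channel.

```python
# Toy sketch of the Haufe et al. (2014) weight-to-pattern transform:
# channel 1 is a noise copy of channel 0; the signal lives only in
# channel 0. The filter w uses channel 1 for noise suppression, but the
# activation pattern a = Cov(X) @ w attributes the signal to channel 0.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n_trials, n_channels = 400, 20
y = np.repeat([0, 1], n_trials // 2)
X = rng.standard_normal((n_trials, n_channels))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n_trials)  # correlated noise copy
X[y == 1, 0] += 1.0                                      # signal in channel 0 only

w = LogisticRegression(max_iter=1000).fit(X, y).coef_.ravel()  # filter weights
a = np.cov(X, rowvar=False) @ w                                # activation pattern
print(f"|w[1]|/|w[0]| = {abs(w[1]) / abs(w[0]):.2f}, "
      f"|a[1]|/|a[0]| = {abs(a[1]) / abs(a[0]):.2f}")
```

Interpreting `w` directly would suggest channel 1 is "involved"; the pattern `a` removes that artifact.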
Unequal trial counts across classes bias accuracy toward the majority class. Solutions: subsample the majority class to equate trial counts, report balanced accuracy (or d-prime/AUC) instead of raw accuracy, or use class weighting in the classifier.
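The bias is easy to demonstrate: with a 75/25 split, a degenerate classifier that always predicts the majority class scores 75% raw accuracy while carrying no information, whereas balanced accuracy correctly reports chance.

```python
# Illustrative: raw accuracy rewards majority-class guessing under
# imbalance; balanced accuracy (mean per-class recall) does not.
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

y_true = np.array([0] * 75 + [1] * 25)      # 75/25 class imbalance
y_pred = np.zeros(100, dtype=int)           # always predicts the majority class

print(accuracy_score(y_true, y_pred))            # 0.75 -- looks "above chance"
print(balanced_accuracy_score(y_true, y_pred))   # 0.5  -- correctly at chance
```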
Based on Haynes (2015), Varoquaux et al. (2017), and Grootswagers et al. (2017):
See references/decoding-methods.md for detailed classifier comparisons, searchlight parameters, and software tools.
See references/rsa-guide.md for a complete step-by-step RSA analysis workflow.