CON²PHYS and the Pre-COSYNE Brainhack 2026

Overview & goals

This year’s Pre-COSYNE Brainhack draws its inspiration from CON²PHYS (CONceptual CONsistency in electroPHYSiology). The project starts from a simple but pressing observation: systems neuroscience is full of core concepts whose meanings quietly drift from lab to lab. Ideas like functional connectivity, or even something as basic as spike–spike correlations, are central to our work, yet they are often conceptualised and implemented differently depending on who you ask.

Do these differences actually matter? That’s exactly what we will find out.

During the Brainhack, participants will analyze the same electrophysiology dataset and, guided by their own methodological assumptions, answer a set of deliberately underspecified multiple-choice questions tied to core systems neuroscience concepts. This setup mirrors real research: it reveals how researchers navigate conceptual ambiguity and how seemingly arbitrary analytical choices affect scientific results in practice.

Brainhack format

In this year’s Pre-COSYNE Brainhack, you will answer 4 of the 15 multiple-choice questions that are part of CON²PHYS, using the exact same data as the full project. You’ll work in teams of 3–4, balanced in terms of prior electrophysiology experience and coding skills. During the hackathon, Research Software Engineers (RSEs) will give short, practical sessions on how to make analyses more reproducible with cleaner code, standard data layouts, and basic testing (the exact topics may change depending on the final content of the talks). The goal is not necessarily to “find the right answer”, but to take stock of the lack of consensus that might emerge. We expect heterogeneity in both answers and methods, and part of the event will focus on discussing how and why different teams arrived at different conclusions.

Since the Brainhack already covers a significant portion of the 15 CON²PHYS questions, participants are well positioned to later extend their analyses to the full set of questions and complete a CON²PHYS submission that will make them eligible for co-authorship on the manuscript resulting from the project.

Dataset

The dataset consists of 18 compressed files, one per mouse. The data includes single-unit activity (SUA) and local field potentials (LFP) recorded using a Neuropixels probe during a behavioral task. Task details and electrode placements are intentionally anonymized to reduce biases and minimize workload. The data has been collected from 3 simultaneously recorded brain areas. Every recording includes LFP (≥20 channels) and SUA (≥5 units) signals from each brain area, for a total of 1449 units. The length of the recordings varies between 55 and 101 minutes.

For each mouse, the following data is provided:

📋 Trial Data (.xlsx)

A spreadsheet containing trial-specific information, all aligned with the ephys data:

  • trial_start (s): Trial onset.
  • stim_start (s): Stimulus presentation.
  • outcome (s): Reward or punishment time.
  • trial_end (s): Trial conclusion (the trial length is variable).
  • Variable A: Binary categorical behavioral variable.
  • Variable B: Binary categorical behavioral variable.
  • Variable C: Categorical behavioral variable (1–3).
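To illustrate how the four event times delimit the segments of a trial, here is a minimal sketch that computes per-trial segment durations. It assumes the four time columns have already been read into arrays (for instance from the spreadsheet) and that events occur in the order listed above. The function name and the segment labels are hypothetical and not part of the dataset.

```python
import numpy as np


def segment_durations(trial_start, stim_start, outcome, trial_end):
    """Return the duration (s) of three consecutive trial segments, per trial.

    Assumption: within each trial, events occur in the order
    trial_start <= stim_start <= outcome <= trial_end.
    """
    trial_start = np.asarray(trial_start, dtype=float)
    stim_start = np.asarray(stim_start, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    trial_end = np.asarray(trial_end, dtype=float)
    return {
        "start_to_stim": stim_start - trial_start,
        "stim_to_outcome": outcome - stim_start,
        "outcome_to_end": trial_end - outcome,
    }


# Toy example with made-up event times for two trials (seconds).
durations = segment_durations(
    trial_start=[0.0, 10.0],
    stim_start=[1.0, 11.5],
    outcome=[3.0, 14.0],
    trial_end=[5.0, 17.0],
)
```

Because trial length is variable, checking these durations early can help you decide whether to analyze fixed-length windows or whole segments.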

🧠 Brain Area (.npy, .mat)

  • [1 × num_units] array with integer values (1–3), indicating the brain area in which a unit has been recorded.
  • [1 × num_units] array with integer values for cluster IDs.

⚡ Spikes (.npy, .mat)

  • [1 × num_spikes] array with float spike times (s).

🔍 Clusters (.npy, .mat)

  • [1 × num_spikes] array with integer values giving the cluster ID of each spike. This array therefore has the same length as spikes.
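Since spikes and clusters are parallel arrays, a common first step is to split the spike times into one spike train per unit. The sketch below shows one way to do this with NumPy. The function name is hypothetical, and the toy arrays stand in for the real files.

```python
import numpy as np


def spikes_by_cluster(spike_times, spike_clusters):
    """Group spike times (s) into one array per cluster ID."""
    # ravel() accepts both flat arrays and [1 x num_spikes] arrays.
    spike_times = np.asarray(spike_times, dtype=float).ravel()
    spike_clusters = np.asarray(spike_clusters).ravel()
    if spike_times.shape != spike_clusters.shape:
        raise ValueError("spikes and clusters must have the same length")

    # A stable sort keeps the original (temporal) order within each cluster.
    order = np.argsort(spike_clusters, kind="stable")
    sorted_clusters = spike_clusters[order]
    sorted_times = spike_times[order]

    # first_idx marks where each cluster's block of spikes begins.
    cluster_ids, first_idx = np.unique(sorted_clusters, return_index=True)
    trains = np.split(sorted_times, first_idx[1:])
    return {int(c): t for c, t in zip(cluster_ids, trains)}


# Toy example: five spikes from two made-up units (cluster IDs 3 and 7).
trains = spikes_by_cluster(
    spike_times=[0.1, 0.2, 0.35, 0.4, 0.9],
    spike_clusters=[7, 3, 7, 3, 7],
)
```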

🌊 Waveforms (.npy, .mat)

  • [num_units × 128] float array of average waveforms for each unit, recorded at 30 kHz on the best detection channel.
  • The order of waveforms matches that of cluster IDs and brain areas in brain_area.
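Because the waveform rows follow the same order as the cluster IDs and brain-area labels, one boolean mask can select all three at once. The following is a minimal sketch of that indexing. The function name and the toy values are illustrative assumptions, and the time axis simply applies the 30 kHz sampling rate to the 128 samples.

```python
import numpy as np

WAVEFORM_RATE_HZ = 30_000  # sampling rate of the average waveforms
N_WAVEFORM_SAMPLES = 128


def units_in_area(area_labels, cluster_ids, area):
    """Return the cluster IDs and row indices of units recorded in `area`."""
    area_labels = np.asarray(area_labels).ravel()
    cluster_ids = np.asarray(cluster_ids).ravel()
    mask = area_labels == area
    return cluster_ids[mask], np.flatnonzero(mask)


# Toy example: four made-up units, two of them in brain area 3.
area_labels = np.array([[1, 3, 2, 3]])        # [1 x num_units]
cluster_ids = np.array([[12, 40, 41, 57]])    # [1 x num_units]
waveforms = np.zeros((4, N_WAVEFORM_SAMPLES))  # [num_units x 128]

ids, rows = units_in_area(area_labels, cluster_ids, area=3)
area3_waveforms = waveforms[rows]  # same row order as ids

# Time axis of one waveform in milliseconds (about 4.27 ms in total).
time_ms = np.arange(N_WAVEFORM_SAMPLES) / WAVEFORM_RATE_HZ * 1e3
```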

🌀 Local Field Potentials (LFP) (.npy, .mat)

  • lfp1, lfp2, lfp3 each contain [num_channels × timestamps] float arrays of LFP signals.
  • The number of channels varies from brain area to brain area and from mouse to mouse. The minimum number of channels per brain area is 20.
  • Channels within a brain area are contiguous in space, but channels from different brain areas are not.
  • Channels within a brain area are ordered from the deepest to the most superficial with respect to the brain surface.
  • The dataset includes every second channel recorded on the Neuropixels probe. The vertical spacing between recording sites is 20 µm.
  • The signal has been recorded with an external reference and has already undergone a preprocessing pipeline.
  • Sampling rate: 500 Hz.
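With a 500 Hz sampling rate, event times in seconds can be converted to LFP sample indices to cut out a window of signal. The sketch below does this under one important assumption that you should verify on the real data: that LFP sample 0 corresponds to time 0 on the clock used for spikes and trial events. The function name is hypothetical.

```python
import numpy as np

LFP_RATE_HZ = 500.0  # sampling rate stated for the LFP arrays


def lfp_window(lfp, t_start, t_stop, rate_hz=LFP_RATE_HZ):
    """Return the [num_channels x samples] LFP slice between two times (s).

    Assumption: sample 0 of `lfp` is at t = 0 s on the trial/spike clock.
    """
    first = max(int(round(t_start * rate_hz)), 0)
    last = min(int(round(t_stop * rate_hz)), lfp.shape[1])
    return lfp[:, first:last]


# Toy example: 2 channels, 2 s of fake signal (1000 samples at 500 Hz).
fake_lfp = np.arange(2 * 1000, dtype=float).reshape(2, 1000)
window = lfp_window(fake_lfp, t_start=0.5, t_stop=1.0)
```

Slicing returns a view rather than a copy, which keeps memory use low when the full recordings are loaded.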

Questions

In this hackathon, you will work on 4 of the 15 questions that make up the full CON²PHYS project.
For each question, you will:

  • choose one categorical answer from the options below
  • briefly describe your analysis
    (inclusion/exclusion criteria, analysis steps, statistics, and any key results)

Below are the 4 questions that will be asked during the hackathon.

Question 1

Which brain area (if any) has the highest density of ripples (i.e. “hippocampal” ripples traditionally occurring during sharp wave-ripples)?
  • Brain area 1
  • Brain area 2
  • Brain area 3
  • Not enough data / no differences

Question 2

In which brain area are pairwise spike train interactions strongest at the 100 ms timescale?
  • Brain area 1
  • Brain area 2
  • Brain area 3
  • Not enough data / no differences

Question 3

Which brain area pair has the strongest directed functional connectivity?
  • Brain area 1 ⇒ Brain area 2
  • Brain area 3 ⇒ Brain area 2
  • Brain area 3 ⇒ Brain area 1
  • Not enough data / no differences

Question 4

During which trial segment is variable C best decoded?
  • Trial start ⇒ Stim start
  • Stim start ⇒ Outcome
  • Outcome ⇒ Trial end
  • Not enough data / no differences

How to prepare

Preparing in advance is crucial: time for hacking will be limited, and you will want to spend it tackling the questions.

Download and unzip the dataset before the hackathon. The dataset is roughly 29 GB and consists of 18 zipped files. Download them one by one to maximize speed. Here, we only provide the Python version of the dataset. If you want to use Matlab, please write to Mattia Chini.

Next, set up your coding environment. You are free to use the environment you prefer (e.g. your usual Python or Matlab setup). The configuration below is only a suggestion that provides a ready-to-use Python environment with Anaconda / Miniconda, and ensures that these example notebooks run smoothly. You can use these notebooks to check that you downloaded the full dataset correctly, and to start playing with the data.

  1. Install Miniconda / Anaconda

  2. Create the environment using the provided environment.yml file:

    conda env create -f environment.yml

  3. Check that everything runs correctly by starting Jupyter and opening the example notebooks.
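Before unzipping, it can be worth confirming that all 18 archives arrived intact. The snippet below is a minimal sketch using only the Python standard library. It assumes the archives carry a .zip extension and sit in a single folder. The function name and the demo file names are made up, and the demo runs on throwaway files rather than the real dataset.

```python
import tempfile
import zipfile
from pathlib import Path

EXPECTED_ARCHIVES = 18  # one zipped file per mouse


def check_download(folder, expected=EXPECTED_ARCHIVES):
    """Count the .zip files in `folder` and flag unreadable ones."""
    archives = sorted(Path(folder).glob("*.zip"))
    # is_zipfile() only checks the archive signature, which is enough
    # to catch most truncated or interrupted downloads.
    broken = [p.name for p in archives if not zipfile.is_zipfile(str(p))]
    return {
        "found": len(archives),
        "missing": max(expected - len(archives), 0),
        "broken": broken,
    }


# Demo on a temporary folder: two valid archives and one corrupted file.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(2):
        with zipfile.ZipFile(Path(tmp) / f"demo_{i}.zip", "w") as zf:
            zf.writestr("placeholder.txt", "demo")
    (Path(tmp) / "truncated.zip").write_bytes(b"not a zip")
    report = check_download(tmp)
```

On your own machine, call check_download with the path to your download folder and re-download any file listed under "broken".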

You are welcome to use your favourite packages and pipelines, but if you are unsure where to begin, here are some recommended packages:

  • neurodsp (pip install neurodsp): digital signal processing toolbox for neural time series
  • mne (pip install mne): digital signal processing toolbox for neurophysiological data
  • pynapple (pip install pynapple): general neurophysiological data analysis

This post is licensed under CC BY 4.0 by the author.

© Pre-COSYNE Brainhack. Some rights reserved.
