Scalable software and models for large-scale extracellular recordings
Date
11/03/2022
Author
Hurwitz, Cole Lincoln
Abstract
The brain represents information about the world through the electrical activity of
populations of neurons. By placing an electrode near a neuron that is firing (spiking), it
is possible to detect the resulting extracellular action potential (EAP) that is transmitted
down an axon to other neurons. In this way, it is possible to monitor the communication
of a group of neurons to uncover how they encode and transmit information. As the
number of recorded neurons continues to increase, however, so do the data processing
and analysis challenges. It is crucial that scalable software and analysis tools are developed
and made available to the neuroscience community to keep up with the large
amounts of data that are already being gathered.
This thesis is composed of three pieces of work which I develop in order to better
process and analyze large-scale extracellular recordings. My work spans all stages of extracellular
analysis from the processing of raw electrical recordings to the development
of statistical models to reveal underlying structure in neural population activity.
In the first work, I focus on developing software to improve the comparison and adoption
of different computational approaches for spike sorting. When analyzing neural
recordings, most researchers are interested in the spiking activity of individual neurons,
which must be extracted from the raw electrical traces through a process called
spike sorting. Much development has been directed towards improving the performance
and automation of spike sorting. This continuous development, while essential,
has contributed to an over-saturation of new, incompatible tools that hinders rigorous
benchmarking and complicates reproducible analysis. To address these limitations, I
develop SpikeInterface, an open-source Python framework designed to unify preexisting
spike sorting technologies into a single toolkit and to facilitate straightforward
benchmarking of different approaches. With this framework, I demonstrate that modern,
automated spike sorters have low agreement when analyzing the same dataset, i.e.
they find different numbers of neurons with different activity profiles. This result holds
true for a variety of simulated and real datasets. I also demonstrate that utilizing a
consensus-based approach to spike sorting, where the outputs of multiple spike sorters
are combined, can dramatically reduce the number of falsely detected neurons.
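The consensus idea can be illustrated with a minimal sketch in plain Python. It does not use the SpikeInterface API. The function names, the Jaccard-style agreement score, the matching window `delta`, and the thresholds are all assumptions chosen for illustration. A unit is kept only if at least one other sorter reports a unit with a sufficiently similar spike train.

```python
def count_matches(train_a, train_b, delta):
    """Count spikes in train_a with a partner in train_b within delta samples.

    Both trains must be sorted; each spike in train_b is used at most once.
    """
    matches = 0
    j = 0
    for t in train_a:
        # Skip spikes in train_b that are too early to match t.
        while j < len(train_b) and train_b[j] < t - delta:
            j += 1
        if j < len(train_b) and abs(train_b[j] - t) <= delta:
            matches += 1
            j += 1  # consume the matched spike
    return matches


def agreement(train_a, train_b, delta):
    """Jaccard-style score: matched spikes over the union of both trains."""
    n = count_matches(train_a, train_b, delta)
    denom = len(train_a) + len(train_b) - n
    return n / denom if denom else 0.0


def consensus_units(sortings, delta=10, min_agreement=0.5, min_sorters=2):
    """Return (sorter, unit) pairs supported by at least min_sorters sorters.

    sortings maps sorter name -> {unit id -> sorted list of spike times}.
    """
    kept = []
    for name, units in sortings.items():
        for unit_id, train in units.items():
            supporters = 1  # the sorter that found the unit
            for other_name, other_units in sortings.items():
                if other_name == name:
                    continue
                if any(agreement(train, other, delta) >= min_agreement
                       for other in other_units.values()):
                    supporters += 1
            if supporters >= min_sorters:
                kept.append((name, unit_id))
    return kept


# Hypothetical outputs of two sorters (spike times in samples).
sortings = {
    "sorter_a": {0: [100, 200, 300, 400], 1: [150, 950]},
    "sorter_b": {0: [101, 199, 302, 401]},
}
kept = consensus_units(sortings)
```

Here unit 1 of `sorter_a` has no counterpart in `sorter_b`, so it is dropped as a likely false detection, while both copies of the shared unit are kept. A fuller implementation would merge matched units into a single consensus spike train instead of listing them per sorter.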
In the second work, I focus on developing an unsupervised machine learning approach
for determining the source location of individually detected spikes that are
recorded by high-density microelectrode arrays. By localizing the source of individual
spikes, my method is able to determine the approximate position of the recorded neurons
in relation to the microelectrode array. To allow my model to work with large-scale
datasets, I utilize deep neural networks, a family of machine learning algorithms that
can be trained to approximate complicated functions in a scalable fashion. I evaluate
my method on both simulated and real extracellular datasets, demonstrating that it is
more accurate than other commonly used methods. Also, I show that location estimates
for individual spikes can be utilized to improve the efficiency and accuracy of spike
sorting. After training, my method allows for localization of one million spikes in approximately
37 seconds on a TITAN X GPU, enabling real-time analysis of massive
extracellular datasets.
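For context on what spike localization computes, the sketch below shows a simple center-of-mass estimate, a common baseline. It is not the deep-network method described above. The function name, the array shapes, and the use of peak-to-peak amplitude as the channel weight are assumptions made for illustration.

```python
import numpy as np


def center_of_mass_location(waveform, channel_positions):
    """Estimate a spike's 2D position as an amplitude-weighted channel average.

    waveform: array of shape (n_samples, n_channels)
    channel_positions: array of shape (n_channels, 2), e.g. in micrometres
    Assumes at least one channel has a nonzero amplitude.
    """
    # Peak-to-peak amplitude of the spike on each channel.
    amplitudes = waveform.max(axis=0) - waveform.min(axis=0)
    weights = amplitudes / amplitudes.sum()
    return weights @ channel_positions


# Hypothetical 4-channel layout and a toy spike that is largest on channel 0.
positions = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
waveform = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [-3.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
location = center_of_mass_location(waveform, positions)
```

The estimate can never fall outside the area spanned by the electrodes, which is one limitation of this kind of baseline.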
In my third and final presented work, I focus on developing an unsupervised machine
learning model that can uncover patterns of activity from neural populations
associated with a behaviour being performed. Specifically, I introduce Targeted Neural
Dynamical Modelling (TNDM), a statistical model that jointly models the neural activity
and any external behavioural variables. TNDM decomposes neural dynamics (i.e.
temporal activity patterns) into behaviourally relevant and behaviourally irrelevant dynamics;
the behaviourally relevant dynamics constitute all activity patterns required
to generate the behaviour of interest while behaviourally irrelevant dynamics may be
completely unrelated (e.g. other behavioural or brain states), or even related to behaviour
execution (e.g. dynamics that are associated with behaviour generally but are not
task specific). Again, I implement TNDM using a deep neural network to improve its
scalability and expressivity. On synthetic data and on real recordings from the dorsal premotor
(PMd) and primary motor (M1) cortices of a monkey performing a center-out reaching
task, I show that TNDM is able to extract low-dimensional neural dynamics that are
highly predictive of behaviour without sacrificing its fit to the neural data.
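The decomposition described above can be pictured with a small NumPy sketch of the readout structure alone. It is a hedged illustration, not the thesis's implementation. The dimensions, the random linear readouts, the exponential link, and the Poisson observation model are all assumptions. The one point it aims to show is that neural activity is reconstructed from both sets of latent dynamics, while behaviour is read out from the relevant set only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_time, n_neurons, n_behaviour = 50, 20, 2
n_relevant, n_irrelevant = 2, 3

# Hypothetical linear readout weights (a trained model would learn these).
W_neural = 0.1 * rng.standard_normal((n_relevant + n_irrelevant, n_neurons))
W_behaviour = rng.standard_normal((n_relevant, n_behaviour))


def readout(z_relevant, z_irrelevant):
    """Map latent dynamics to firing rates and behaviour."""
    z_all = np.concatenate([z_relevant, z_irrelevant], axis=1)
    rates = np.exp(z_all @ W_neural)      # positive rates from all latents
    behaviour = z_relevant @ W_behaviour  # relevant latents only
    return rates, behaviour


# Stand-in latent trajectories; a real model would infer these from data.
z_rel = rng.standard_normal((n_time, n_relevant))
z_irr = rng.standard_normal((n_time, n_irrelevant))

rates, behaviour = readout(z_rel, z_irr)
spikes = rng.poisson(rates)  # spike counts under the assumed Poisson model

# Replacing only the irrelevant latents changes the rates, not the behaviour.
z_irr_alt = rng.standard_normal((n_time, n_irrelevant))
rates_alt, behaviour_alt = readout(z_rel, z_irr_alt)
```

In this toy setup `behaviour_alt` equals `behaviour` exactly, because the behaviour readout never sees the irrelevant latents.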