Quantifying the isolation quality of extracellularly recorded action potentials

https://doi.org/10.1016/j.jneumeth.2007.03.012

Abstract

There have been many approaches to the problem of detection and sorting of extracellularly recorded action potentials, but only a few methods actually quantify the quality of this fundamental process. In most cases, the quality assessment is based on the subjective judgment of human observers and the recorded units are divided into “well isolated” or “multi-unit” groups. This subjective evaluation precludes comprehensive assessment of single-unit studies since their most basic parameter, i.e. data quality, is not explicitly defined. Here we propose objective measures to evaluate the quality of spike data, based on the time-stamps of the detected spikes and the high-frequency sampling of the analog signal, and apply them to cortical and basal-ganglia data. We show that quantification of recording quality by the signal-to-noise ratio (SNR) may be misleading. The recording quality is better assessed by an isolation score that measures the overlap between the noise (non-spike) and the spike clusters. Furthermore, we use a nearest-neighbors algorithm to estimate the proportion of false positive and false negative classification errors. To validate these quality measures, we simulate spike detection and sorting errors and show that the scores are good predictors of the frequency of errors. The reliability of the isolation score is further verified by errors implanted in real basal ganglia data and by using different sorting algorithms. We conclude that quantitative measures of spike isolation can be obtained independently of the method used for spike detection and sorting, and recommend reporting them in any study based on the activity of single neurons.

Introduction

The problem of extracting single neuron activity from extracellular recordings has been investigated extensively and comprehensively reviewed (e.g. Lewicki, 1998). The process of detecting action potentials from the extracellular waveforms (spikes) and clustering them into different neuronal sources is known as spike detection and sorting. Spike detection and sorting algorithms are not perfect and classification errors can occur for a number of reasons. First, most algorithms are not fully automatic (e.g. Abeles and Goldstein, 1977, Worgotter et al., 1986, Bergman and DeLong, 1992) and their real-time use can lead to human errors (Wood et al., 2004). Second, inaccurate assumptions about the data can also lead to errors. Some algorithms presuppose a parametric statistical model (Lewicki, 1994, Pouzat et al., 2002, Pouzat et al., 2004, Shoham et al., 2003), whereas other algorithms are based on non-parametric assumptions (Fee et al., 1996a). In both cases these assumptions, whether explicit or implicit, may be violated. For example, the analog trace in Fig. 1 shows significant modulation of the spike waveforms and illustrates how the stationarity (waveform stability) assumption may be violated and thus lead to classification errors.

Although many approaches to the problem of sorting spikes have been put forward, only a few methods have been developed to quantify the quality of the spike sorting (Harris et al., 2001, Pouzat et al., 2002, Schmitzer-Torbert et al., 2005). In most cases, the quality assessment is done subjectively by a human observer, and units with high scores are then reported as having a “high signal-to-noise ratio” and being “well isolated”. These subjective reports do not permit comparison of data quality across different studies and unfortunately are predisposed to personal bias.

In this article we propose objective measures to assess the quality of spike detection and sorting. Our measures quantify two different aspects of the data:

  1. Quality of the recording, by calculating SNR (Section 3.1). We present and discuss two calculations of the SNR that differ in their noise estimation. The first is based on the noise when an action potential occurs and the second is based on the noise between action potentials.

  2. Clustering quality. We introduce an isolation score for quantifying the overlap between the spike and the noise (non-spike) clusters (Section 3.2). We then present classification error scores that estimate the fraction of events that were misclassified as spikes (false positive errors) or misclassified as noise-events (false negative errors) (Section 3.3).
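The two SNR calculations in the first item differ only in where the noise is measured. The following is a minimal sketch of that distinction, not the definition from Section 3.1, which is not reproduced in this excerpt. It assumes that the spike amplitude is the peak-to-peak range of the mean waveform and that noise is a standard deviation; the function name `snr_two_ways`, the window parameter `half_width`, and the synthetic signal are all hypothetical.

```python
import numpy as np

def snr_two_ways(signal, spike_idx, half_width=16):
    """Return (snr_during_spikes, snr_between_spikes) for one unit.

    Assumed definitions (illustrative only): amplitude is the peak-to-peak
    range of the mean waveform; noise is a standard deviation.
    """
    signal = np.asarray(signal, dtype=float)
    spike_idx = np.asarray(spike_idx)
    # Keep only spikes whose full window fits inside the recording.
    ok = (spike_idx >= half_width) & (spike_idx < len(signal) - half_width)
    spike_idx = spike_idx[ok]
    offsets = np.arange(-half_width, half_width)
    windows = spike_idx[:, None] + offsets          # (n_spikes, window) sample indices
    waveforms = signal[windows]
    mean_wave = waveforms.mean(axis=0)
    amplitude = mean_wave.max() - mean_wave.min()

    # Noise estimate 1: residuals around the mean waveform (noise while spikes occur).
    noise_during = (waveforms - mean_wave).std()

    # Noise estimate 2: all samples that lie outside every spike window.
    in_spike = np.zeros(len(signal), dtype=bool)
    in_spike[windows.ravel()] = True
    noise_between = signal[~in_spike].std()

    return amplitude / noise_during, amplitude / noise_between

# Synthetic example: unit-variance noise plus spikes whose amplitude varies.
rng = np.random.default_rng(0)
signal = rng.normal(0.0, 1.0, 20000)
k = np.arange(-16, 16)
template = -10.0 * np.exp(-0.5 * (k / 3.0) ** 2)
spike_idx = np.arange(200, 19800, 400)
scales = rng.uniform(0.6, 1.4, size=len(spike_idx))
for i, s in zip(spike_idx, scales):
    signal[i - 16:i + 16] += s * template

snr_during, snr_between = snr_two_ways(signal, spike_idx)
print(snr_during, snr_between)
```

In this synthetic case the waveform amplitude varies from spike to spike, so the noise measured during spikes is larger than the noise measured between them and the two SNR values diverge. This is the kind of waveform modulation that the stationarity discussion in the Introduction refers to.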

To validate these measures, we simulate spike-sorting errors (Section 3.4.1) and test the isolation score and classification error scores as a function of the fraction of simulated errors for different units. We check the scores under different conditions by applying several clustering algorithms (Section 3.4.2). We use real data from the basal ganglia and simulated errors to investigate the score parameter space (Section 3.4.4). Finally, we compare the results of the different scores (Section 3.4.5).
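The validation idea, implanting a known fraction of errors and checking that the score degrades accordingly, can be sketched on synthetic feature vectors. The score below is one plausible soft nearest-neighbor formulation, written as an assumption; the exact definition belongs to Section 3.2 and is not part of this excerpt. The names `isolation_score`, `implant_false_positives`, and the parameter `lam` are hypothetical, as are the two Gaussian clusters standing in for spike and noise events.

```python
import numpy as np

def isolation_score(spikes, noise, lam=10.0):
    """Illustrative isolation score in [0, 1]: the mean, over spike events, of
    the similarity-weighted share of each event's neighbors that are spikes."""
    spikes = np.asarray(spikes, dtype=float)
    noise = np.asarray(noise, dtype=float)
    events = np.vstack([spikes, noise])
    n = len(spikes)
    # Euclidean distance from every spike event to every event.
    d = np.linalg.norm(spikes[:, None, :] - events[None, :, :], axis=2)
    # Normalize by the mean spike-to-spike distance (self-distances excluded).
    d0 = d[:, :n][~np.eye(n, dtype=bool)].mean()
    sim = np.exp(-lam * d / d0)
    sim[np.arange(n), np.arange(n)] = 0.0           # ignore self-similarity
    p_spike = sim[:, :n].sum(axis=1) / sim.sum(axis=1)
    return p_spike.mean()

def implant_false_positives(spikes, noise, fraction, rng):
    """Relabel randomly chosen noise events as spikes (simulated sorting errors)."""
    n_fp = int(round(fraction * len(spikes)))
    picked = rng.choice(len(noise), size=n_fp, replace=False)
    keep = np.ones(len(noise), dtype=bool)
    keep[picked] = False
    return np.vstack([spikes, noise[picked]]), noise[keep]

rng = np.random.default_rng(1)
spikes = rng.normal([6.0, 0.0], 1.0, size=(200, 2))   # hypothetical spike cluster
noise = rng.normal([0.0, 0.0], 1.0, size=(400, 2))    # hypothetical noise cluster

scores = []
for fraction in (0.0, 0.1, 0.3):
    s, n = implant_false_positives(spikes, noise, fraction, rng)
    scores.append(isolation_score(s, n))
print(scores)
```

With well-separated clusters the clean score is close to 1 and drops as more noise events are mislabeled as spikes. That monotonic relation between the implanted error fraction and the score is the property the validation is meant to check.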

Section snippets

Neuronal recording procedures

The data were collected from experiments performed on two vervet monkeys (monkey Cu: Cercopithecus aethiops, female, weighing 3.5–4 kg and monkey T, female, weighing 3 kg) and two Macaca fascicularis (monkey Y, male, weighing 5 kg and monkey P, female, weighing 3 kg). Details of the behavior of the monkeys and animal care are described elsewhere (Heimer et al., 2002, Morris et al., 2004, Elias et al., 2007). Recordings were made in the external segment of the globus pallidus (GPe), a central

Results

Spike detection and sorting quality depends first on the recording quality and then on the quality of the clustering algorithm. To evaluate recording quality we used the signal-to-noise ratio (SNR) (Section 3.1). Although the SNR can be used for initial estimation of recording quality and a high SNR is usually a necessary condition for good unit isolation, the SNR is not a direct measure of the isolation of a single unit. Sorting of recordings with a high SNR may nonetheless result in a spike

Discussion

We quantified the quality of spike detection and sorting using signal-to-noise ratios (SNR), isolation scores, and classification error scores. We then simulated errors for validating the scores, compared several spike sorting algorithms, and investigated the parameter space of the scores.
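As a rough illustration of how a nearest-neighbors rule can yield classification error scores, the sketch below takes a majority vote among each event's k nearest neighbors and counts events whose assigned label disagrees with the vote. This is an assumption about the general approach, not the procedure of Section 3.3. The function name `knn_error_fractions`, the choice k = 5, and the normalization (each error count divided by the number of events carrying that label) are hypothetical.

```python
import numpy as np

def knn_error_fractions(features, is_spike, k=5):
    """Estimate (false_positive_fraction, false_negative_fraction) by k-NN vote."""
    features = np.asarray(features, dtype=float)
    is_spike = np.asarray(is_spike, dtype=bool)
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                   # an event is not its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest neighbors
    spike_votes = is_spike[nearest].sum(axis=1)
    looks_like_spike = spike_votes > k / 2
    # Labeled as spike, but the neighborhood is mostly noise.
    false_pos = np.sum(is_spike & ~looks_like_spike) / is_spike.sum()
    # Labeled as noise, but the neighborhood is mostly spikes.
    false_neg = np.sum(~is_spike & looks_like_spike) / (~is_spike).sum()
    return false_pos, false_neg

rng = np.random.default_rng(2)
spikes = rng.normal([6.0, 0.0], 1.0, size=(200, 2))   # hypothetical spike cluster
noise = rng.normal([0.0, 0.0], 1.0, size=(400, 2))    # hypothetical noise cluster
features = np.vstack([spikes, noise])
labels = np.r_[np.ones(200, dtype=bool), np.zeros(400, dtype=bool)]
labels[:10] = False                                   # implant 10 false negatives

fp, fn = knn_error_fractions(features, labels)
print(fp, fn)
```

Because the ten relabeled events still sit inside the spike cluster, the vote flags most of them as false negatives while the false positive estimate stays near zero.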

Acknowledgement

This study was partly supported by a Center of Excellence grant administered by the ISF and HUNA's “Fighting against Parkinson” grant.

References (24)

  • S. Elias et al. Statistical properties of pauses of the high-frequency discharge neurons in the external segment of the globus pallidus. J Neurosci (2007)
  • M.S. Fee et al. Variability of extracellular spike waveforms of cortical neurons. J Neurophysiol (1996)