Automatic Cell Segmentation by Adaptive Thresholding (ACSAT) for Large-Scale Calcium Imaging Datasets

Abstract Advances in calcium imaging have made it possible to record from an increasingly larger number of neurons simultaneously. Neuroscientists can now routinely image hundreds to thousands of individual neurons. An emerging technical challenge that parallels the advancement in imaging a large number of individual neurons is the processing of correspondingly large datasets. One important step is the identification of individual neurons. Traditional methods rely mainly on manual or semimanual inspection, which cannot be scaled for processing large datasets. To address this challenge, we focused on developing an automated segmentation method, which we refer to as automated cell segmentation by adaptive thresholding (ACSAT). ACSAT works with a time-collapsed image and includes an iterative procedure that automatically calculates global and local threshold values during successive iterations based on the distribution of image pixel intensities. Thus, the algorithm is capable of handling variations in morphological details and in fluorescence intensities in different calcium imaging datasets. In this paper, we demonstrate the utility of ACSAT by testing it on 500 simulated datasets, two wide-field hippocampus datasets, a wide-field striatum dataset, a wide-field cell culture dataset, and a two-photon hippocampus dataset. For the simulated datasets with truth, ACSAT achieved >80% recall and precision when the signal-to-noise ratio was no less than ∼24 dB.


Introduction
The ability to record from a large population of single neurons during behavior greatly facilitates the investigation of the contribution of individual neurons to neuronal network dynamics. Extracellular single-unit recording has traditionally been a method of choice in neurophysiological analyses of single neurons in the brain. Recent improvements, such as the new generation of genetically encoded calcium sensors GCaMP6 (Chen et al., 2013, have made it possible to observe hundreds to thousands of individual neurons simultaneously (Ohki et al., 2005;Andermann et al., 2010;Huber et al., 2012;Ziv et al., 2013;Issa et al., 2014;Mohammed et al., 2016). Though indirect, these calcium indicators have been sensitive enough to monitor neuronal activity with high spatiotemporal precision in behaving animals, allowing researchers to examine the activity of populations of a specific cell type (Hofer et al., 2011;Wachowiak et al., 2013;Pinto and Dan, 2015;Allen et al., 2017) or the same cell over an extended period of time (Poort et al., 2015).
As the performance of genetically encoded calcium indicators has improved, wide-field microscopy has become feasible for recording the activity of a large population of neurons over an extended anatomical area (Lütcke et al., 2013;Wilt et al., 2013). Although lacking the spatial subcellular resolution of a multiphoton microscope, wide-field microscopes can operate at a higher speed, allowing simultaneous recording of increasingly larger populations (Ghosh et al., 2011;Ziv et al., 2013;Kim et al., 2016;Mohammed et al., 2016). Advanced microfabrication techniques further miniaturized the widefield microscope to a microendoscope capable of monitoring neural activity in freely-moving animals (Ghosh et al., 2011;Ziv et al., 2013).
An emerging technical challenge that parallels advances in calcium imaging is the processing of large datasets (Hamel et al., 2015). During data analysis, an important step is to identify regions of interest (ROIs) corresponding to individual neurons. As data grows rapidly both spatially and temporally, the traditional laborintensive approach of manual inspection has to be automated. Principal component analysis (PCA) and independent component analysis (ICA) methods are natural and frequently used candidates for automating ROI iden-tification (Mukamel et al., 2009). However, if its assumption of statistical independence between neurons is violated, which is often the case in real neural recordings, then the method relies on user selection of parameters for spatial segmentation.
Threshold-based methods represent a promising and intuitive alternative for automatic ROI identification. However, several challenges need to be overcome, including variability in recording conditions or fluorescence signal strength across structures, recording subjects, and the imaging field. For example, one of the most referenced thresholding methods, Otsu's method, which automatically selects the optimal threshold value that minimizes the intraclass variance among ROI pixels and among background pixels, would only successfully segment some of the highest-intensity ROIs (Otsu, 1979;Sezgin and Sankur, 2004). Additionally, the multiclass Otsu's method is limited because uneven lighting may result in separate background classes. A waterfall-thresholding approach addresses uneven lighting by iterative thresholding to capture all intensity peaks, but its selection of a threshold value is ad hoc, making it dataset-dependent and user-dependent (Mellen and Tuong, 2009). A feedback loop-based approach for segmenting bacteria cells optimizes the threshold value from the distribution of pixel intensities, but its assumption that the total ROI area remains constant over time does not hold for calciumimaging datasets because neurons change in brightness (Shen et al., 2015). A recent machine learning-based algorithm uses image gradients and pixel traces to optimize threshold values, but it still requires a user's subjective input in selecting a background removal factor based on each dataset (Fantuzzo et al., 2017). Other approaches based on edge detection have trouble due to weak fluorescence signal strength in comparison with the background pixels (Sadeghian et al., 2009). Generally, most segmentation methods require a high level of tuning to each individual dataset.
To overcome these challenges of diverse imaging datasets, we introduce a new automated cell segmentation by adaptive thresholding (ACSAT) algorithm. ACSAT dynamically and automatically determines global and local threshold values based on the distribution of pixel intensities within a time-collapsed image of a recorded image sequence. We demonstrate the utility of ACSAT on simulated datasets, cell culture datasets, and in vivo widefield and two-photon datasets. For the simulated datasets with truth, ACSAT achieved Ͼ80% recall and precision when the signal-to-noise ratio was no less than ϳ24 dB. ACSAT also captured ϳ80% of human-identified ROIs in datasets from mouse hippocampus and striatum and was even able to detect low-intensity neurons that were initially undetected by human referees. SV40 virus (titer: ϳ6e12 GC/mL, University of Pennsylvania Vector Core). AAV was delivered either into the dorsal CA1 (AP: -2, ML: 1.4, DV: -1.6), or into the dorsal striatum (AP: 0.5, ML: 1.8, DV: -1.6) regions. Injections were performed with a 10-L syringe (World Precision Instruments) coupled with a 33-gauge needle (NF33BL, World Precision Instruments) at a speed of 40 nL/min, controlled by a microsyringe pump (UltraMicroPump 3-4, World Precision Instruments). Upon complete recovery, a custom imaging chamber with glass coverslip was surgically implanted on top of the viral injection site by removing the overlying cortical tissue. The imaging chamber was assembled by fitting a circular coverslip (size 0; OD: 3 mm) to a stainless steel cannula (OD: 3.17 mm, ID: 2.36 mm) using a UV-curable optical adhesive (Norland Products). During surgery, a custom aluminum headplate was also attached to the skull, which allowed head fixation during the imaging session.
Imaging data were acquired with a custom wide-field microscope coupled with a scientific CMOS camera (ORCA-Flash 4.0, C11440-42U, Hamamatsu), controlled by the commercial software package HCImageLive (Hamamatsu). The wide-field microscope consisted of a Leica N Plan 10ϫ 0.25 PH1 objective lens, an excitation filter (HQ 470/50), a dichroic mirror (FF506-Di02), an emission filter (FF01-536/40), a commercial SLR lens as the tube lens (Nikon Zoom-NIKKOR 80 -200 mm f/4 AI-s), and a 5W LED (LZ1-00B200, 460 nm; LedEngin). Data acquisition was performed at 20 Hz, at a resolution of 1024 ϫ 1024 pixels, with 16 bits per pixel, for ϳ10 -20 min. With 10ϫ objective lens, the microscope provided a field of view of 1.343 ϫ 1.343 mm 2 (1.312 ϫ 1.312 m 2 /pixel) of brain tissue. Imaging data were streamed from the camera to RAM of a custom computer (dual Intel Xeon processors, 128 GB RAM, and a GeForce GTX Titan video card) to ensure temporal precision. After each imaging session, data were moved from RAM to hard drive and saved in multipage tagged image file format.
Two hippocampus datasets (A and B) were collected from two mice [dataset A was previously reported by Mohammed et al. (2016)]. The mice were trained to perform a trace conditioning task known to involve hippocampal neural activity (Solomon et al., 1986;Moyer et al., 1990;Tseng et al., 2004;Sakamoto et al., 2005). In this task, the animal was trained to associate a conditioned stimulus (a 350-ms-long tone) with an unconditioned stimulus (a gentle 100-ms air puff to one eye). There was a 250-ms trace interval between two stimuli. During each recording session, the animal was head-fixed and performed 40 trials with a randomized 31-36-s intertrial interval. The hippocampus datasets (1024 ϫ 1024 pixels/frame, 2047 frames, ϳ100 s, ϳ4 GB size) analyzed in this study were part of larger recording sessions (ϳ50 GB size).
The striatum dataset was collected from a head-fixed animal running on a spherical treadmill system. The treadmill system consisted of a Styrofoam ball floated by air pressure in a 3D-printed bowl designed as described in Dombeck et al. (2007) that allowed the animal to move its limbs freely while head-fixed. The mouse was first han-dled for several days before being head-fixed to the spherical treadmill. Habituation to running on the spherical treadmill while head-fixed occurred over 3-4 days/ week at the same time of day as subsequent recording sessions (8 -12 h after lights-on), for several weeks. Single imaging sessions took ϳ25 min. Sampling occurred at ϳ20 Hz, and exposure time was fixed at 20 ms. The striatum dataset (ϳ100 s, ϳ4 GB size) contains 2047 frames with 1024 ϫ 1024 pixels per frame and was also part of a larger dataset (ϳ25 GB size).
Two human referees manually identified ROIs in the hippocampus dataset A and in the striatum dataset to create a set of human-generated ROIs for comparison with ACSAT's segmentation results. This manual selection was done by viewing the image sequence and segmenting ROIs that had fluorescence traces compatible with neuronal dynamics and/or by selecting ROIs from a composite image created from the video sequence and confirming that fluorescence traces were compatible with neuronal dynamics.

Wide-field cell culture dataset
The primary neuron cell dataset was collected from a 10-day-old neuron culture, infected with AAV9-Syn-GCaMP6.WPRE.SV40 virus. Seven days after infection, neurons were imaged at 20 Hz for 60 s. The primary neuron culture dataset contains 1201 frames, 1024 ϫ 1024 pixels per frame, recorded with the same imaging setup as for the hippocampus and striatum datasets described above.

Signal-to-noise ratio (SNR) calculation
We calculated the SNR in decibels (dB) as SNR ϭ 20 ϫ log ROI background .
For the simulated datasets, ROI is the mean intensity value of all pixels belonging to all ROIs in the timecollapsed image I 0 , and similarly, background is the standard deviation of background pixel intensity values, i.e., all pixels that do not belong to an ROI. For the hippocampus dataset A and the striatum dataset, ROI is the maximumintensity value of an ROI trace, and background is the standard deviation of the background trace. The ROI trace value at each time point is the averaged intensity values of all pixels belonging to that ROI, and similarly for the background trace, which uses all pixels not belonging to any ROI. Note that the SNR for the simulated datasets describes the whole time-collapsed image I 0 , whereas the SNRs for the hippocampus and striatum datasets describe an individual ROI.

Simulated datasets
We tested ACSAT's segmentation performance on 500 simulated datasets with varying SNRs (between ϳ19 and ϳ29 dB) and numbers of ROIs (between 300 and 700). Fig.   2B shows some examples of the simulated time-collapsed image, i.e., the input image I 0 to ACSAT in Fig. 1A. The simulation gives us the true locations of all ROIs so that we can accurately assess ACSAT's segmentation performance. The input is the time-collapsed image I 0 , and the output is a collection of automatically segmented ROIs. In each iteration, the Global FIBAT step identifies potential ROIs ͕ROIs͖ n = by applying FIBAT, described in B, to the entire image I n ; and the Local FIBAT step, described in C, splits overlapping ROIs. B, Flowchart of the FIBAT algorithm. The input image is segmented using each of the test threshold values 1 , 2 , ѧ, T . The search range for the optimal threshold value ( 1 , T ) is iteratively narrowed to contain the test threshold value which results in the maximum number of ROIs. C, Local FIBAT procedure. FIBAT, described in B, is recursively applied to each potential ROI until the resulting ROIs can no longer be separated by FIBAT.
Our simulated datasets were obtained by a procedure adapted from Zhou et al. (2018). We used the model I 0 ϭ E ϩ AC to generate the simulated datasets, where E represents noise, A represents the shapes of each ROI, and C adjusts each ROI's intensity to simulate uneven lighting.
The pixel noise values in E were randomly sampled from the background pixel values in the time-collapsed image for the hippocampus dataset. This noise is unlikely to be Gaussian, because the time-collapsing procedure subtracts the mean value from the maximum value of each pixel such that the time-collapsed image is biased toward higher pixel values.
The centroid location of each ROI represented in A was randomly selected with weights C 2 . The pixel values comprising the body of each ROI was modeled deterministically by the bivariate Gaussian probability density function, with widths randomly selected according to Zhou et al. (2018).
The image C is also used to amplify each ROI's pixel values to reflect uneven lighting conditions across the imaging field. C was generated by applying heavy Gaussian filtering to the time-collapsed image of the hippocampus dataset until no individual ROIs are detectable.

ACSAT overview
Fluorescence imaging data obtained in the form of image sequences is processed offline using a custom Matlab algorithm. Image sequences were first motioncorrected as described in Mohammed et al. (2016) to remove micromotion of the imaged area caused by breathing and other movements of the animal. ACSAT (Fig. 1A) is then applied to a time-collapsed image that represents the image sequences, to automatically identify individual neurons as ROIs.
The input image sequence is first loaded into Matlab as a 3D matrix (height ϫ width ϫ time) and then timecollapsed to produce a representative two-dimensional image (height ϫ width, I 0 in Fig. 1A), where each pixel in I 0 is represented by the maximum-intensity value of that pixel across the entire image sequence with its mean value removed. This time-collapsed image I 0 is then used for the rest of the algorithm. Pixels with low-intensity values would correspond to static background, whereas pixels with high-intensity values would correspond to neurons with GCaMP6 expression. In general, neurons with GCaMP6 expression appear in I 0 as a cluster of adjacent pixels with high-intensity values and with size similar to that of a neuron. Meanwhile, it is improbable for random background noise to generate clusters with similar properties. Thus, the time-collapsed image I 0 is expected to contain sufficient information to separate neurons from the background.
Next, ACSAT iteratively generates ROIs ͕ROIs͖ n from the time-collapsed image I n for iterations n ϭ 1, 2, ѧ, starting with I 1 ϭ I 0 . Before each subsequent iteration, I n is generated by cumulatively clearing previously segmented ROIs, ͕ROIs͖ nϪ1 , from I nϪ1 by setting each ROI's pixels in I nϪ1 to blank values of 0 and dilating the cleared area. As described later, each iteration consists of both adaptive thresholding at the global level (Global FIBAT in Fig. 1A), using the automatically selected threshold value n ‫ء‬ (Fig.  1B), and adaptive thresholding at the local level (Local FIBAT in Fig. 1A). When the change in global threshold value Խ nϩ1 ‫ء‬ Ϫ n ‫ء‬ Խ is insignificant, further iterations are likely to contribute more false positives than true positives. Thus, the ACSAT algorithm terminates at iteration n if where 1 ‫ء‬ acts as a normalizing factor. Accordingly, the final output of ACSAT is the union of the segmented ROIs from each iteration, ͕ROIs͖ 1 ഫ· · ·ഫ͕ROIs͖ n .

Global and local adaptive thresholding in ACSAT
Each iteration n of ACSAT contributes a set of newly segmented ROIs ͕ROIs͖ n from I n by applying our fluorescence intensity based adaptive thresholding (FIBAT) algorithm, at the global and local levels (Global/Local FIBAT in Global adaptive thresholding is the first step in the nth iteration of ACSAT (Fig. 1A). This step applies FIBAT directly to the whole image ͑I n ¡ I͒ to identify potential These potential ROIs ͕ROIs͖ n Ј may include groups of adjacent neurons or overlapping neurons because neurons located above and below the focal plane could be captured in the same frame during wide-field imaging. Such overlap, however, is unlikely to occur in two-photon datasets or in cell culture datasets. The local adaptive thresholding step (Fig. 1C) recursively separates any potentially overlapping ROIs within ͕ROIs͖ n Ј to output ͕ROIs͖ n . Specifically, each ROI in ͕ROIs͖ n Ј is individually dilated and then inputted to the local FIBAT (ROI ¡ I) in Fig. 1B to obtain a set of separated ROIs ͕ROIs Խ ‫ء‬ ͖. If any outputted set ͕ROIs Խ ‫ء‬ ͖ contains more than one separated ROI, then each ROI in the set ͕ROIs Խ ‫ء‬ ͖ is further separated using the same procedure, thus forming a recursive loop. Otherwise, if any outputted set ͕ROIs Խ ‫ء‬ ͖ contains only one ROI, then the recursion terminates. The final output of the local thresholding step ͕ROIs͖ n is the union of all such sets containing one ROI that cannot be further separated.

FIBAT
As described, FIBAT (Fig. 1B) is used in both the global and local adaptive thresholding steps of each iteration of ACSAT to identify potential ROIs in the time-collapsed image I ϭ I n or to separate potentially overlapping neurons within I which is an element of ͕ROIs͖ n Ј , respectively. In either case, an optimal pixel intensity threshold value ‫ء‬ separates ROIs from the background. FIBAT selects ‫ء‬ by searching for the threshold value that maximizes the number of resulting ROIs that are larger in area than A min and smaller in area than A max .
The search is performed recursively over a pixel intensity range ͑ 1 , T ͒, where initially 1 is the minimum pixel intensity value in I and T is the maximum pixel intensity value in I. From this search range, T test threshold values 1 , 2 , ѧ, TϪ1 , T are uniformly selected. A larger T will decrease the probability of skipping the optimal threshold value, but it will result in more computation time that may not be necessary. Because the threshold value is refined by a recursive process until it reaches the optimal value that produces the maximum number of ROIs, the value of T should have little to no effect on ACSAT's segmentation results. We chose T ϭ 12. Each of these test threshold values 1 , ѧ, T is applied to the image I by assigning each pixel a 1 (a true calcium event) if its value is greater than the threshold or a 0 (a false calcium event) otherwise. Morphological operations are then performed to refine the thresholded images. Specifically, these operations fill in holes (0s surrounded by 1s) and remove spur pixels that may be due to noise. The operations also break H-connected ROIs before splitting overlapping cells. ROIs are finally collected with 8-connectivity (Matlab function bwlabel or bwconncomp) to generate a set of segmented ROIs for each test threshold value: ͕ROIs Խ 1 ͖, ѧ, ͕ROIs Խ T ͖.
Since ROIs represent real neurons that are roughly spherical in shape and are ϳ5-20 m in diameter, some realistic criteria can be used to eliminate false ROIs that are not possibly actual neurons. Accordingly, FIBAT removes ROIs from ͕ROIs Խ 1 ͖, ѧ, ͕ROIs Խ T ͖ if their centroid is outside the ROI, or if their area is less than A min or greater than A max , or if their solidity (i.e., the area ratio between the convex hull of a ROI and the ROI itself) is greater than approximately the golden ratio.
The next search range is selected based on the results of the test thresholds. A relationship of the test threshold values 1 , ѧ, T versus the numbers of resulting ROIs Խ ͕ROIs Խ 1 ͖ Խ , ѧ, Խ ͕ROIs Խ T ͖ Խ can be generated ( Fig. 1B). If the test threshold value c resulted in the most ROIs, i.e., c ϭ argmax cԽ ͕ROIs Խ c ͖ Խ , then the next search range is set to ͑ max͕1, cϪ1͖ , min͕T, cϩ1͖ ͒ to include c inside the search range. If more than one test threshold value c 1 , c 2 , ѧ resulted in the same maximum number of ROIs, then the next search range is similarly set to ͑ max͕1, min͕c 1 , c 2 , ѧ͖Ϫ1͖ , min͕T, max͕c 1 , c 2 , ѧ͖ϩ1͖ ͒ to contain all c 1 , c 2 , ѧ. This search is terminated when further refinement of the search range produces little improvement in the number of detected ROIs: either the new search range Խ cϩ1 Ϫ cϪ1Խ is less than or the new range overlaps the previous range by at least ␣. We chose ␣ ϭ 90% and ϭ the smallest nonzero intensity difference between every pair of adjacent pixels in whole image I. As such, is determined automatically and does not require user input. On termination, the optimal threshold value is set to ‫ء‬ ϭ 1 / 2 ͑min͕ c 1 , c 2 , ѧ͖ ϩ max͕ c 1 , c 2 , ѧ͖͒, and the segmented ROIs ͕ROIs Խ ‫ء‬ ͖ includes ROIs whose area exceeds A max .

Code accessibility
The code/software described in the paper is freely available online at https://github.com/sshen8/acsat. The code is available in Extended Data 1.

Results
We tested ACSAT on 500 simulated datasets, two wide-field hippocampus datasets, a wide-field striatum dataset, a wide-field cell culture dataset, and a twophoton hippocampus dataset. The simulated datasets with known ground truth allowed us to accurately assess the segmentation performance of ACSAT in different conditions of SNR and number of ROIs. For the hippocampus dataset A and the striatum dataset, in which the ground truth is unknown, we used human-generated ROIs as a reference. For the cell culture dataset, hippocampus dataset B, and two-photon dataset, we provide the ACSAT segmented ROIs that can be inspected and interpreted by users.

ACSAT performance on simulated datasets with various SNRs and numbers of ROIs
To evaluate the performance of ACSAT, we simulated 500 time-collapsed images I 0 with various numbers of ROIs (between 300 and 700) at random locations and different SNRs (between ϳ19 and ϳ29 dB). The exact locations of ROIs are known and served as the ground truth to provide an accurate evaluation of the performance of ACSAT. For all 500 datasets, we used the parameters ␦ ϭ 10%, A min ϭ 50 px Ϸ 86 m 2 and A max ϭ 300 px Ϸ 516 m 2 for the global adaptive thresholding step, and A min ϭ 20 px Ϸ 34 m 2 and A max ϭ ϱ for the local adaptive thresholding step because ROIs tend to shrink in size after repeatedly applying FIBAT.
The recall and precision results for each of these simulated datasets are shown as dots in Fig. 2A1 and Fig.  2A2, respectively. Fig. 2B shows examples of the simulated time-collapsed images, and each example corresponds to a dot in Fig. 2A1 and Fig. 2A2. At SNR greater than ϳ24 dB, ACSAT shows a stable performance with generally higher than 80% recall. The precision rate remains stable at generally higher than 80% when SNR is greater than ϳ21 dB. However, the performance of AC-SAT falls when SNR is below ϳ20 dB.

ACSAT performance on hippocampus dataset A and striatum dataset
We used ACSAT (Fig. 1A) to automatically segment ROIs from a hippocampus wide-field imaging dataset and a striatum wide-field imaging dataset. Before the application of the ACSAT, the image sequences were timecollapsed as shown in Figs. 3 and 4 (top rows) for the hippocampus A and the striatum datasets, respectively. These time-collapsed images show high-intensity areas resembling neural morphology. The final segmented ROIs outputted by ACSAT are illustrated in Figs. 3 and 4 (bottom row), respectively.
For both datasets, we initiated ACSAT using the same parameters as for the simulated datasets (␦ ϭ 10%, A min ϭ 50 px Ϸ 86 m 2 and A max ϭ 300 px Ϸ 516 m 2 for the global adaptive thresholding step, and A min ϭ 20 px Ϸ 34 m 2 and A max ϭ ϱ for the local adaptive thresholding step). To obtain the results as shown in Figs. 3 and 4, it took approximately 1 min per iteration on a Xeon E5-1650 v4 at 3.6 GHz with 128 GB DDR4 RAM, but it used Ͻ30 MB RAM. As such, the RAM size had little effect on the speed.

ACSAT performance compared to human-generated ROIs
To assess the performance of the ACSAT algorithm, we compared the ACSAT segmentation results with ROIs   ). B, ACSAT-determined ROIs from multiple iterations overlying on the time-collapsed image (red, yellow, green, and blue outline corresponds to iteration 1, 2, 3, and 4, respectively). The fourth iteration (blue) is shown for comparison although ACSAT terminated at iteration 3 (red, yellow, and green). generated by human inspection (human-generated ROIs). This set of human-generated ROIs contained 423 ROIs for the hippocampus dataset A and 91 ROIs for the striatum dataset. We first compared the ACSAT-generated ROIs for the hippocampus A and striatum datasets with the ROIs in the human-generated ROIs. We consider a pair of ROIs to correspond to the same neuron if they had centroids that were Ͻ50 px Ϸ 65.6 m apart and had a mutual overlap Ͼ60%. We calculated the mutual overlap as the average of the percentages of the overlapping area against the areas of both ROIs. When there were multiple ROIs sharing overlapping areas, we selected the pair with highest mutual overlap as the matched ROIs.
For the hippocampus dataset A, ACSAT identified 445 ROIs after three iterations. Among these 445 ROIs, 317 ROIs were matched in the human-generated ROIs (Match), and 128 ROIs were not in the human-generated ROIs (ACSAT-only). Additionally, 106 ROIs in humangenerated ROIs were not identified by ACSAT (Humanonly). This result gave us a precision rate of 71.2% (317 out of 445) and a recall rate of 74.9% (317 out of 423). For the striatum dataset, ACSAT was terminated after one iteration and identified a total of 135 ROIs: 69 Match ROIs, 66 ACSAT-only ROIs, and 22 Human-only ROIs (precision rate: 51.1%, recall rate: 75.8%).
We further examined the fluorescence traces of ROIs from the ACSAT-only, Human-only, and Match groups. Representative traces are shown in Fig. 5A1,B1, respectively, for the hippocampus A and striatum datasets, and all traces are available in Extended Data 2. The value of each ROI fluorescence trace at each time point is the average intensity value of all pixels belonging to that ROI. In Fig. 5A1,B1, each trace is normalized by subtracting the mean value of that trace over time and then dividing the difference by that mean value. We calculated the SNR for every ROI in each group. In both the hippocampus A and striatum datasets, the Match ROIs exhibit a broad range of SNR, indicating that both ACSAT and humans are capable of identifying ROIs with various intensities in the time-collapsed image (Fig. 5A2,B2).
We further examined the individual ROIs identified by ACSAT that were not identified by humans. This secondary manual inspection found that some of the ACSAT-only ROIs were actually true neurons (i.e., with fluorescence traces compatible with neuronal dynamics) that were missed in the initial human-generated ROIs because of human error. This means that ACSAT was able to segment ROIs that were difficult to identify by human experts. Specifically, for the hippocampus A dataset, 70 (54.7%) out of 128 ROIs initially labeled as ACSAT-only were later determined to be actual neurons, and for the striatum dataset, 31 (47%) ROIs were true neurons. After correction, of the total 445 ACSAT ROIs from the hippocampus dataset A, 387 segmented ROIs corresponded to true neurons (Match), and 58 segmented ROIs did not correspond to true neurons determined by human inspection (ACSAT-only). Additionally, 106 true ROIs were not segmented (Human-only). This corresponds to a precision rate of 87% and a recall rate of 78.5%. Similarly, for the striatum dataset, which resulted in 135 ACSAT ROIs, there were 100 Match ROIs, 35 ACSAT-only ROIs, and 22 Human-only ROIs after correction. This corresponds to a precision rate of 74.1% and a recall rate of 82%. Although neurons in the hippocampus and striatum have different morphology and fluorescence intensity, ACSAT was consistently effective for both datasets, and it was able to  ). B, ACSAT-determined ROIs from multiple iterations overlying on the time-collapsed image (red, yellow, green, and blue outline corresponds to iteration 1, 2, 3, and 4, respectively). The second (yellow), third (green), and fourth (blue) iterations are shown for comparison although ACSAT terminated at iteration 1 (red). detect low-intensity neurons that were initially undetected by human referees. As such, our results demonstrate the robustness and effectiveness of the algorithm.
The result from the hippocampus dataset A shows that ACSAT successfully identified true ROIs of diverse sizes (Fig. 6, red). In general, the false-positive ROIs had relatively smaller areas (Fig. 6, yellow), similar to the ROIs missed by human referees (Fig. 6, green). This indicates that ACSAT is more likely to recognize intensity changes in small areas, thereby outperforming human referees under such challenging detection conditions. Additionally, ACSAT missed a small portion of true ROIs, which shares similar sizes with those identified (Fig. 6, blue).

Number of iterations in using ACSAT
For the hippocampus dataset A, ACSAT was terminated at iteration n ϭ 3 when the change in global threshold value For the striatum dataset, ACSAT was terminated at iteration n ϭ 1 when the change in global threshold value To evaluate how ACSAT performs when terminated at different iteration numbers, we ran ACSAT up to nine iterations on both datasets, and calculated several major performance indicators after each iteration (Fig. 7): cumulative number of ROIs, global threshold value, recall, falsenegative rate, and false discovery rate (which is equal to 1 -precision) compared to the human-generated ROIs be- fore secondary manual inspection of false positives. The cumulative number of ROIs, recall, and false discovery rate increased with the iteration number, but at different speed. While the cumulative number of ROIs and the false discovery rate increased steadily, recall rose steeply and reached its plateau within approximately three iterations for the hippocampus dataset and after the first iteration for the striatum dataset. Both the global threshold value and the false-negative rate dropped as iterations progressed, indicating that ACSAT dynamically adjusted the threshold to capture potential ROIs with lower intensity in later iterations. This dynamic adjustment of the threshold value at each iteration was possible only because of the removal of segmented ROIs before each iteration. Overall, the changes in these performance indicators over iterations suggested that most true ROIs were identified during the early iterations: n Յ 3 for the hippocampus dataset and n ϭ 1 for the striatum dataset, which are consistent with when the ACSAT termination criterion described by ␦ was met. ROIs segmented during later iterations were mostly false positive.

FIBAT global and local thresholding
In Fig. 8, we demonstrate how FIBAT (Fig. 1B) determines the threshold value that achieves optimal segmentation results by sampling the distribution of threshold values versus the number of ROIs. Each trace of Fig. 8 plots the number of ROIs that results from each sampled threshold value in the global thresholding step during the first four iterations of ACSAT (Fig. 1A) on the hippocampus dataset A. In each iteration, FIBAT (Fig. 1B) first samples the threshold values across the entire intensity range at coarse resolution to identify the potential search range that may result in the maximum number of ROIs. FIBAT further resamples threshold values within the new search range with a finer resolution, until it reaches a threshold value that gives the maximum number of ROIs. This design allows FIBAT to determine the optimal threshold value with a fine resolution without actually sampling the whole intensity range at the fine scale, and, as a result, reduces the processing time.
After performing global thresholding to identify potential ROIs ͕ROIs͖ n Ј (Fig. 1A), ACSAT further applies FIBAT locally to each identified ROI in ͕ROIs͖ n Ј to refine the segmentation results (Fig. 9). When neurons are densely labeled with GCaMP6, using the global thresholding step alone may lead to one or more large clusters of adjacent neurons being segmented as a single ROI (Fig. 9A). For each such cluster, FIBAT (Fig. 1B) determines and applies a new threshold value to the local ROI area. With local thresholding, the example cluster is further segmented into five new ROIs (Fig. 9B), which would not otherwise be separated by applying the global threshold. Because further local thresholding produces the same result (Fig. 9C), the local thresholding step of ACSAT concludes that these five ROIs cannot be further separated, exits the recursive loop, and outputs these ROIs.
To evaluate the efficacy of local thresholding, we examined the hippocampus dataset A at each iteration be- Figure 6. Distribution of ROI size for hippocampus dataset A. ROIs identified by ACSAT and human (red) with various size. The ACSAT-only ROIs (yellow) and those missed by human experts (green) tend to have small areas, while the areas of human-only ROIs (blue) appear slightly larger. fore and after the local thresholding step (Fig. 10, left and right bars, respectively). Local thresholding refined the ROIs detected by global thresholding and captured more true ROIs at every iteration. It is also worth noting that, at later iterations, local thresholding was still able to identify true ROIs that were missed by global thresholding alone (Fig. 10, iteration 4).

ACSAT performance on two-photon dataset
We applied ACSAT to the two-photon dataset Neurofinder 03.00 (Fig. 11C). Genetically Encoded Calcium Indicators are generally not expressed in the nuclei (Tian et al., 2009), and because of the optical sectioning technique that two-photon imaging provides, in this dataset the nuclei appear dark. Additionally, this dataset had high speckle noise. Thus, the time-collapsed image generated by ACSAT using max minus mean pixel values shows bright nuclei. The truth file provided by Neurofinder contains 621 ROIs, most of which are nuclei. Since the features of this dataset are the nuclei, which are smaller, we used the parameters ␦ ϭ 5%, A min ϭ 20 px Ϸ 6.9 m 2 and A max ϭ 1000 px Ϸ 34.6 m 2 for the global adaptive thresholding step, and A min ϭ 20 px Ϸ 6.9 m 2 and A max ϭ ϱ for the local adaptive thresholding step.
ACSAT identified 571 ROIs. Among these, 442 ROIs were matched with the truth (true positive), and 179 ROIs Figure 8. Convergence of the FIBAT optimal global threshold value for the hippocampus dataset A. FIBAT first sampled at a coarse scale across a wide intensity range, and then focused on a small potential intensity range with a fine scale to identify the optimal global threshold value that generated most ROIs. The vertical lines indicate the final optimal global threshold value determined by FIBAT for each iteration.  Figure 10. Local thresholding improves ACSAT performance for hippocampus dataset A. The ROIs identified by ACSAT at each iteration before local thresholding (left) and after (right). Local thresholding separated overlapping ROIs and thus helped identify more ROIs, including those identified by human (black) or missed by human (red).
were not in the truth (false positive). Additionally, 129 ROIs in truth were not identified by ACSAT (false negative). This result gave us a recall rate of 71.2% (442 out of 621) and a precision rate of 77.4% (442 out of 571).
We further inspected the time-collapsed image I 0 and observed that the right side of the time-collapsed image I 0 had different patterns of texture than the left side. To use the new texture information for ROI detection by ACSAT, we extracted the right side of I 0 that is rich in texture information to generate I Ј 0 as input to ACSAT. The I Ј 0 was generated by change detection between the original image and its Gaussian-filtered counterpart. Thus, ACSAT identified an additional 157 ROIs, of which 95 were true positives, and 62 were false positives. Combining these additional ROIs with the ROIs identified by direct application of ACSAT results in a recall rate of 82.8% (514 out of 621) and a precision rate of 70.6% (514 out of 728).

ACSAT performance on cell culture and hippocampus B dataset
Finally, we used ACSAT to detect ROIs in the dataset of the primary neuron culture expressing GCaMP6f (Fig.  11B). Qualitatively, it appears ACSAT successfully identified the cell bodies of the majority of neurons in early iterations, and neurites in later iterations. We also used ACSAT to detect ROIs in the hippocampus dataset B (Fig.  11A). For both datasets, we used the parameters ␦ ϭ 10%, A min ϭ 50 px Ϸ 86 m 2 and A max ϭ 300 px Ϸ 516 m 2 for the global adaptive thresholding step, and A min ϭ 20 px Ϸ 34 m 2 and A max ϭ ϱ for the local adaptive thresholding step because ROIs tend to shrink in size after repeatedly applying FIBAT.

Discussion
In this study, we presented our automated cell segmentation by adaptive thresholding (ACSAT) method that adaptively selects threshold values based on image pixel intensity with two iterative steps at the global and local levels using a time-collapsed image. As such, the algorithm is capable of handling morphological variations in fluorescence intensity in neurons and is robust against luminance condition changes across datasets. When applied to two datasets collected from the hippocampus and the striatum in mice, ACSAT resulted in ϳ80% recall rate of ROIs containing individual neurons (78.5% for the hippocampus A dataset and 82% for the striatum dataset), and ϳ80% precision rate (87% for the hippocampus dataset and 74.1% for the striatum dataset). ACSAT was also able to detect low-intensity ROIs that were initially undetected by human referees. When applied to 500 simulated datasets, ACSAT achieved recall and precision rates higher than 80% when SNR was no less than ϳ24 dB. However, the performance of ACSAT falls when SNR reaches below ϳ20 dB.
The ACSAT algorithm is an intuitive thresholding method that uses global and local schemes to address variations in fluorescence intensity levels of GCaMP6 flu- . For all three datasets, the majority of ROIs were identified during the first two iterations. Red, yellow, green, and blue ROI outline corresponds to iteration 1, 2, 3, and 4, respectively. orescence even within the same image field. Simply applying a lower global threshold value would result in few large ROIs containing multiple neurons within one ROI. On the other hand, with a high global threshold value, only a small number of neurons with high intensity would be found. As such, applying a single high or low threshold value would generate inadequate results of either few or excessive ROIs, which is a universal limitation of thresholding methods. Our algorithm efficiently addresses this challenge in two ways.
First, it cumulatively excludes previously segmented ROIs from the time-collapsed image I n after each iteration so that in the following iteration, ACSAT could detect new ROIs that require distinct thresholds to separate but were missed with previous thresholds. Therefore, the global threshold value n ‫ء‬ (Fig. 1) used by ACSAT usually decreases after each iteration, and ROIs with high intensity were segmented before those with low intensity, as shown in Figs. 3 and 4. Because ACSAT is based on adaptive thresholding, it allows us to objectively and robustly segment ROIs with low intensity relative to the background. These low-intensity areas often pose challenges to human experts when manually detecting ROIs, as our results showed that about half of the ROIs initially labeled as false positive were actually true neurons (Fig.  10).
Second, ACSAT uses fluorescence intensity based adaptive thresholding (FIBAT) locally to separate overlapping ROIs. This approach directly addresses the issue of heterogeneity in recorded neural signals when the intensities of pixels surrounding an ROI can vary. However, because a higher thresholding value is usually required to separate adjoining neurons, the output sub-ROIs after local FIBAT are often smaller than the corresponding true neurons. Thus, a simple dilation step was applied during the local FIBAT step. This correction is useful to prevent real ROIs from falling below the minimum area criterion A min and thus being removed. Although the interleaving process of global FIBAT and local FIBAT has been effective in addressing overlapping neurons, a potential problem still arises if two neurons with similar intensities have significant overlap with each other in the time-collapsed image such that there is no trough between them. Then ACSAT may identify them as a single ROI. Conversely, if there is a neuron with multiple hotspots (Pnevmatikakis et al., 2016), then this may be identified as multiple neurons by ACSAT. Such a scenario, however, can be minimized by the minimum area criterion A min and the maximum area criterion A max . Spatial overlap is profound for wide-field imaging, but not for two-photon imaging or in vitro cell culture imaging with single cell layer. With increasing improvement wide-field imaging, such as volumetric imaging (Shain et al., 2018;Xiao et al., 2018), such significant overlap may be better eliminated during data acquisition step.
ACSAT has three sets of free parameters that can be rationally chosen or otherwise are not sensitive: ␦, which describes the termination condition for ACSAT; ␣, which describes a termination condition for FIBAT; and A min and A max , which describe the allowed sizes of ROIs.
The termination condition for ACSAT, described by ␦, can be explained by the tendencies of ACSAT. Specifically, running ACSAT for more iterations increases the number of ROIs segmented, especially the number of low-intensity ROIs, as the global threshold value n ‫ء‬ gradually decreases (Fig. 7). While many of the added ROIs are true ROIs, the proportion of false-positive ROIs added increases as iteration number increases (Fig. 7). This increasing proportion of outputted false positives in later iterations can be attributed to the higher probability of a spurious collection of adjacent background pixels meeting the criteria to be an ROI. Also, the added false positives can be related to the step which clears previously segmented ROIs from the time-collapsed image at the start of each iteration of ACSAT. Due to the scattering of light in brain tissue, ROI removal may leave a few small fragments of bright pixels around removed areas, which could be identified as ROIs during the next iteration. ACSAT tries to avoid this problem by dilating the cleared area, which makes sure the whole ROI is cleared rather than only the brighter center. Besides dilation, these misidentified ROIs were also discarded either because of their small size or because they do not meet the solidity criteria; however, occasionally they may pass the size criteria and become the false-positive ROIs. As a result, the majority of false positives tend to have small size (Fig.  6, yellow).
To balance the effects of simultaneous increase in true ROIs and false positive ROIs, ACSAT stops when a decrease of global threshold value becomes relatively small between iterations, i.e., At that stage, most true ROIs have been detected and removed from the time-collapsed image. Thus, the global threshold values n ‫ء‬ of any further iterations are similar, so most ROIs detected at this stage are false positives. For the hippocampus dataset A, iteration n ϭ 3 is when the increase in false positives begins to outweigh the increase in true positives, and for the striatum dataset, nearly all true ROIs segmented by ACSAT were outputted at iteration n ϭ 1 (Fig. 7). Qualitatively, the time-collapsed image I 0 for hippocampus has a higher density of neurons with a greater variety of pixel intensities than the I 0 for striatum, so it may take more iterations for ACSAT to perform at the same rate on the hippocampus dataset than on the striatum dataset. ACSAT's performance under the diverse conditions of these two datasets suggests that our choice of ␦ ϭ 10% provides a robust and rational termination condition for ACSAT that can be generalized to other datasets, namely the 500 simulated datasets and the cell culture dataset, as well. In fact, changing the termination condition from ␦ ϭ 10% to ␦ ϭ 5% only affected the segmentation results in Ͻ17% of the 500 simulated datasets. For the two-photon dataset, our reported results are using the termination condition ␦ ϭ 5%. In general, users can choose ␦ to be between 5% and 10% based on the needs of their application: if recall is more important, then users should choose a smaller ␦, and if precision is more important, then users should choose a larger ␦.
Additionally, the final segmentation results generated by ACSAT are not sensitive to the termination conditions for FIBAT described by ␣ and . FIBAT is terminated if the threshold search range has minimal change over an iteration, which we determine in two ways. One way this condition would be satisfied is when all threshold values within the search range result in the same, optimal number of ROIs. This is equivalent to setting the criterion ␣ ϭ 100%. For the practical purpose of reducing FIBAT run time, we allow termination if the change in the search range is Ͻ 1 Ϫ ␣ ϭ 10%. This condition is also easily met when FIBAT is used in the local thresholding step because, by definition, ROIs that cannot be separated by FIBAT will return exactly one ROI no matter what threshold value is used. Additionally, we terminate FI-BAT if the search range is smaller than , the smallest difference between any pair of adjacent pixels in I, which can be objectively and automatically determined from I. If FIBAT were to continue refining the threshold value, then the gained precision beyond that defined by would be useless due to the discrete step in pixel intensity values in I.
The last set of parameters A min and A max should be chosen based on how large neurons are expected to be using information including neuron size, image resolution, magnification, imaging method, etc. In our wide-field datasets, the boundaries of neurons may not be as well defined as those collected with two-photon microscope, and the size will appear larger than the size of a neuron due to light scattering in wide-field conditions. This effect is consistent with our observation that the minimum size of the human-generated ROIs was 38 px Ϸ 64.6 m 2 for the hippocampus A dataset and 66 px Ϸ 112.2 m 2 for the striatum dataset. Thus, our minimum ROI criteria for the wide-field datasets may be larger than a typical neuron size.
The images I 0 used by ACSAT are time-collapsed, and therefore do not contain any temporal information. With the flexibility of ACSAT, the framework of ACSAT can be used as long as a single image can be generated to represent the ROIs within the image sequence. For example, an input image I Ј 0 can be generated where the value of each pixel represents the time of its maximum intensity. This image I Ј 0 would allow ACSAT to separate adjoined ROIs that have similar intensity values in I 0 but reach their maximum intensity at different time points, which is described by I Ј 0 . Other ways to generate the single representative image include correlations with nearby pixels, intensity dynamics such as standard deviation or variance over time, texture of the time-collapsed image (for example, as used for the two-photon dataset), and a combination of various parameters. Overall, by taking advantage of adaptively determining the threshold value at both the global level and the local level, ACSAT can theoretically perform segmentation on any image containing ROIs with nonhomogenous intensity as long as it has sufficient contrast between ROIs and the background.