Abstract
There is a growing imperative to understand the neurophysiological impact of our rapidly changing and diverse technological, social, chemical, and physical environments. Untangling these multidimensional and interacting effects requires data at scale across diverse populations, which means taking measurement out of the controlled lab environment and into the field. Electroencephalography (EEG), which has correlates with various environmental factors as well as cognitive and mental health outcomes, has the advantages of portability and cost-effectiveness for this purpose. However, with numerous field researchers spread across diverse locations, data quality issues and researcher idle time due to insufficient participants can quickly become unmanageable and expensive problems. In programs we have established in India and Tanzania, we demonstrate that with appropriate training, structured teams, and daily automated analysis and feedback on data quality, non-specialists can reliably collect EEG data alongside various surveys and assessments with consistently high throughput and quality. Over a 30-week period, research teams maintained an average of 25.6 subjects per week, collecting data from a diverse sample of 7,933 participants ranging from Hadzabe hunter-gatherers to office workers. Furthermore, data quality, computed on the first 5,831 records using two common methods, PREP and FASTER, was comparable to that of benchmark datasets from controlled lab conditions. Altogether this resulted in a cost per subject of under $50, a fraction of the typical cost of such data collection, opening up the possibility of large-scale programs, particularly in low- and middle-income countries.
Significance Statement
With wide human diversity, a rapidly changing environment, and growing rates of neurological and mental health disorders, there is an imperative for large-scale neuroimaging studies across diverse populations that can deliver high-quality data and be affordably sustained. Here we demonstrate, across two large-scale field data acquisition programs operating in India and Tanzania, that with appropriate systems it is possible to generate high-throughput EEG data of quality comparable to that from controlled lab settings. With effective costs of under $50 per subject, this opens new possibilities for low- and middle-income countries to implement large-scale programs, and to do so at scales that previously could not be considered.
Footnotes
The authors declare no competing financial interests.
This work was supported by funding from Sapien Labs, USA and the Sapien Labs Foundation, India.
We thank members of the Sapien Labs team for their assistance with data management. We are grateful to all participants for their contribution to the project.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.





