ERP Reliability Analysis Toolbox
The ERP Reliability Analysis (ERA) Toolbox is an open-source MATLAB program that uses generalizability (G) theory to evaluate the reliability of ERP data. The purpose of the toolbox is to characterize the dependability (the G-theory analog of reliability) of ERP scores, to make these estimates easy to calculate on a study-by-study basis, and to increase how often they are reported.
The ERA Toolbox reports the minimum number of trials needed for dependable ERP scores and describes the overall dependability of ERP measurements (Clayson & Miller, 2017a). All information provided by the toolbox is stratified by group and condition so that the user can directly compare dependability (e.g., one group may require more trials than another to achieve an acceptable level of dependability). The toolbox’s code is based on the formulas discussed in Baldwin, Larson, and Clayson (2015), and the algorithms and their application are covered in detail in Clayson and Miller (2017a).
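To give a rough sense of the quantity involved (this is only a sketch of the textbook single-facet, persons-by-trials case, not the toolbox’s exact models, which are specified in the papers cited above): when ERP scores are averages of k trials, dependability can be written as

\Phi(k) = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_e / k},

where \sigma^2_p is the between-person variance and \sigma^2_e is the trial-level error variance. Rearranging for a target dependability \Phi^\ast gives the minimum number of trials, k \geq \frac{\Phi^\ast}{1 - \Phi^\ast} \cdot \frac{\sigma^2_e}{\sigma^2_p}, which makes clear that the number of trials needed depends on the variance components of the data in hand rather than on the ERP component in the abstract.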
You can also view the wiki and the code on GitHub.
Why Another Toolbox?
Reliability is a property of scores (the data in hand), not a property of measures. This means that the P3, error-related negativity (ERN), late positive potential (LPP), or (insert your favorite ERP component here) is not reliable in some “universal” sense (Clayson & Miller, 2017b). Since reliability is context dependent, demonstrating the reliability of LPP scores in undergraduates at UCLA does not mean that LPP scores recorded from children in New York can be assumed to be reliable. Measurement reliability needs to be demonstrated on a population-by-population, study-by-study, component-by-component basis.
The purpose of the ERA Toolbox is to facilitate the calculation of dependability estimates that characterize observed ERP scores. Published ERP psychometric studies have been useful for suggesting trial cutoffs and for characterizing the overall reliability of ERP scores in the samples they examined. When designing a study, that information can help guide decisions about, for example, how many trials to present to participants from a given population. However, just because the observed data meet a previously recommended trial cutoff does not mean that the ERP scores are necessarily reliable (Clayson & Miller, 2017b). ERP score reliability cannot be inferred from trial counts.
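As a purely hypothetical illustration (the numbers are invented for this example, not drawn from any study): with k = 8 trials, a sample with \sigma^2_p = 1 and \sigma^2_e = 4 would give \Phi = 1/(1 + 4/8) \approx 0.67, whereas a sample with the same trial count but \sigma^2_e = 12 would give \Phi = 1/(1 + 12/8) = 0.40. The same trial cutoff can thus correspond to quite different score dependability in different samples, which is why the estimates need to come from the data in hand.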
My hope is that the ERA Toolbox will make it easier to demonstrate the reliability of ERP scores on a study-by-study basis. I’ve recently extended the toolbox’s capabilities by adding estimates of test-retest reliability (Clayson, Carbine, Baldwin, Olsen, & Larson, 2021; see also Carbine, Clayson, Baldwin, LeCheminant, & Larson, 2021, for a more applied treatment) and estimates of subject-level reliability (Clayson, Brush, & Hajcak, 2021).
Mismeasurement of ERPs leads to misunderstood phenomena and mistaken conclusions (Clayson & Miller, 2017b). Poor ERP score reliability stemming from mismeasurement compromises validity. Improving ERP measurement by ensuring score reliability can strengthen our trust in the inferences drawn from observed scores and increase the likelihood that our findings will replicate.
A Little History
I started this project in December 2015. What began as some in-house code turned into a flexible GUI that I hope can help others. After all, there’s no need to rewrite the wheel!
Latest Release
You can download the latest release of the toolbox from GitHub. Feel free to let me know if you have any suggestions!