Sensory and motor evoked potentials (SEP; MEP) can be used to quantify the extent of delayed signal conduction in multiple sclerosis (MS). They may be useful to monitor disease course and even serve as biomarkers in clinical trials (Hardmeier et al., 2017). Here we evaluate physiological and rater-related variability to determine the minimal significant change between two measurements in the same individual. Fifteen healthy subjects were evaluated twice within 30 days with median and tibial SEP and upper (UL) and lower limb (LL) MEP in three centers according to a common standardized protocol in keeping with IFCN recommendations. Four neurophysiologists (FJ, MH, PA, PF) independently marked all curves, blinded to their previous ratings, using a web-based tool (EPMark; HB, MH, PF). In SEP, N13 and N20 (median) and N22 and P40 (tibial) were marked; in MEP, cortico-muscular (CML) and spinal-muscular latencies were marked. N20, P40, shortest and mean CML, and central (motor) conduction times (CCT; CMCT) were analyzed. Mixed-effects models were calculated using results of (a) 1st and 2nd rating of identical baseline curves (model 1), and (b) 1st baseline and follow-up rating (model 2) as combined outcomes. Based on model 2, confidence intervals (CI) were calculated for the difference of a repeated measurement of the same curve. The intraclass correlation coefficient (ICC) for intra- and inter-rater reliability (model 1) was very high in median SEP (N20: 0.97; CCT: 0.85) and tibial SEP (P40: 0.95; CCT: 0.89). In MEP, ICC was higher when mean CML (UL: 0.94; LL: 0.90) or mean CMCT (UL: 0.88; LL: 0.91) was used instead of shortest CML (UL: 0.80; LL: 0.78) or shortest CMCT (UL: 0.65; LL: 0.81). As CCT and CMCT showed lower ICC, only CML, N20 and P40 were further analyzed. Total variance in model 2 ranged from 0.9 to 6.4 ms (N20: 0.9, P40: 6.4; CML-UL [shortest/mean]: 4.8/4.1; CML-LL: 5.8/3.7), mainly accounted for by inter-subject variability (64–79%); estimation of center effects was unreliable.
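How variance components relate to an ICC and to a confidence interval for the difference between two measurements can be illustrated with a minimal sketch. It rests on simplifying assumptions that the abstract does not state: the ICC is taken as the share of inter-subject variance in total variance, and the two measurements are assumed to carry independent, normally distributed errors with equal within-subject variance. The function names and example values are hypothetical, and the sketch is not meant to reproduce the reported figures, which come from the full mixed-effects models.

```python
import math
from statistics import NormalDist


def icc_from_variances(between_subject_var: float, total_var: float) -> float:
    """ICC as the share of total variance explained by inter-subject variance
    (a simplified, assumed definition)."""
    if total_var <= 0:
        raise ValueError("total_var must be positive")
    return between_subject_var / total_var


def difference_ci_half_width(within_var: float, level: float = 0.95) -> float:
    """Half-width of the CI for the difference between two measurements.

    Assumes both measurements have independent errors with variance
    `within_var`, so the difference has variance 2 * within_var.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided normal quantile
    return z * math.sqrt(2.0 * within_var)


# Hypothetical variance components (in ms^2), for illustration only.
example_icc = icc_from_variances(between_subject_var=3.0, total_var=4.0)
example_half_width = difference_ci_half_width(within_var=0.5, level=0.95)
```

Under these assumptions, a change between two measurements larger than the half-width would be unlikely to arise from measurement variability alone at the chosen confidence level.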
The 80% CI ranged from 0.4 to 1.5 ms (N20: 0.4, P40: 1.3; CML-UL: 1.5/1.1; CML-LL: 1.1/0.9), the 95% CI from 0.7 to 3.0 ms (N20: 0.7, P40: 2.6; CML-UL: 3.0/2.2; CML-LL: 2.2/1.7). Main SEP components (N20, P40) and MEP cortico-muscular latencies show higher reliability than central conduction times. Using mean instead of shortest CML further improves reliability. Intra-subject differences in individual tracts exceeding 0.4 to 3.0 ms (depending on test and CI level) most likely reflect true underlying changes. These numbers may be used to define responders to remyelinating therapies in MS. However, these confidence intervals may be wider in patients and have to be validated in a larger sample. In group-level analyses, variability is much less important, as over- and underestimation counterbalance each other. Supported by an unconditional research grant from Biogen Inc., MA, USA, which had no influence on the planning of the study or the data analysis.
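The reported limits could be applied to an individual tract roughly as in the following sketch. The threshold values are the 80% and 95% CI figures given above; the dictionary keys, the function name and the example latencies are hypothetical. As noted above, the limits stem from healthy subjects and may be wider in patients, so this is an illustration rather than a validated responder definition.

```python
# Thresholds in ms, taken from the 80% and 95% CI values reported above.
THRESHOLDS_MS = {
    "N20": {80: 0.4, 95: 0.7},
    "P40": {80: 1.3, 95: 2.6},
    "CML_UL_shortest": {80: 1.5, 95: 3.0},
    "CML_UL_mean": {80: 1.1, 95: 2.2},
    "CML_LL_shortest": {80: 1.1, 95: 2.2},
    "CML_LL_mean": {80: 0.9, 95: 1.7},
}


def classify_change(measure, baseline_ms, follow_up_ms, ci_level=95):
    """Return (change_ms, exceeds_limit) for one measure in one subject.

    change_ms is follow-up minus baseline, so a negative value means a
    shorter latency at follow-up. exceeds_limit is True when the absolute
    change is larger than the reported limit for the chosen CI level.
    """
    limit = THRESHOLDS_MS[measure][ci_level]
    change_ms = follow_up_ms - baseline_ms
    return change_ms, abs(change_ms) > limit


# Hypothetical latencies: a 1.5 ms shortening of mean upper-limb CML.
change, exceeds_95 = classify_change("CML_UL_mean", 22.0, 20.5, ci_level=95)
_, exceeds_80 = classify_change("CML_UL_mean", 22.0, 20.5, ci_level=80)
```

In this example the 1.5 ms shortening exceeds the 80% limit (1.1 ms) but not the 95% limit (2.2 ms), which shows how the choice of CI level changes the classification.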