Purpose: This study evaluated whether polysomnography (PSG) inter-scorer reliability (ISR) across sleep centres could be improved by external proficiency testing (EPT), or by EPT combined with method alignment training. Methods: Experienced scorers from 15 sleep centres were randomised to one of the following: (1) a control group, (2) a group that received a self-directed intervention of EPT reports (EPTPassive) or (3) a group that received an active intervention of method alignment training and EPT reports (EPTActive). Respiratory, arousal and sleep scoring ISR from 16 PSG fragments was compared between groups across time. Results: Among 30 scorers, there were no ISR changes in controls between baseline (BL) and 6 months (6 m). Both EPT groups showed ISR improvement from BL to 6 m for respiratory, arousal and sleep scoring (p < 0.05). Respiratory scoring back-transformed mean (95% CI) proportion of specific agreement (PSA) for the EPTPassive group improved from 0.78 (0.72–0.84) to 0.80 (0.74–0.86) and for the EPTActive group from 0.80 (0.74–0.85) to 0.82 (0.76–0.88). Arousal scoring PSA for the EPTPassive group improved from 0.72 (0.66–0.77) to 0.74 (0.69–0.79) and for the EPTActive group from 0.71 (0.65–0.76) to 0.77 (0.72–0.82). Sleep scoring kappa for the EPTPassive group improved from 0.64 (0.58–0.69) to 0.73 (0.68–0.77) and for the EPTActive group from 0.75 (0.71–0.80) to 0.80 (0.76–0.85). Overall, poorer performers achieved greater improvement. Conclusion: External proficiency testing produced modest, statistically significant PSG inter-scorer reliability improvements among experienced scorers across sleep centres, with potential to improve clinical management of individual patients and increase research study statistical power.