Shim, J. S., Song, I.‐S., Jee, G., Kwak, Y.‐S., Tsagouri, I., Goncharenko, L., McInerney, J., Vitt, A., Rastaetter, L., Yue, J., Chou, M., Codrescu, M., Coster, A. J., Fedrizzi, M., Fuller‐Rowell, T. J., Ridley, A. J., Solomon, S. C., and Habarulema, J. B.
Assessing space weather modeling capability is a key element in improving existing models and developing new ones. To track model improvement and to investigate how forcing from the lower atmosphere below and from the magnetosphere above affects the performance of ionosphere‐thermosphere models, we expand our previous assessment of the 2013 March storm event (Shim et al., 2018, https://doi.org/10.1029/2018SW002034). In this study, we evaluate new simulations from upgraded models (the Coupled Thermosphere Ionosphere Plasmasphere Electrodynamics (CTIPe) model version 4.1 and the Global Ionosphere Thermosphere Model (GITM) version 21.11) and from the NCAR Whole Atmosphere Community Climate Model with thermosphere and ionosphere extension (WACCM‐X) version 2.2, together with eight simulations from the previous study. A simulation from the NCAR Thermosphere‐Ionosphere‐Electrodynamics General Circulation Model version 2 (TIE‐GCM 2.0) is also included for comparison with WACCM‐X. Changes in Total Electron Content (TEC) and F2‐layer critical frequency (foF2) from the quiet‐time background are used to evaluate how well the models capture storm impacts. For evaluation, we employ four skill scores: correlation coefficient (CC), root‐mean‐square error (RMSE), ratio of the modeled to observed maximum percentage changes (Yield), and timing error (TE). The models tend to underestimate the storm‐time enhancements of foF2 and TEC, and to predict foF2 and/or TEC better in North America but worse in the Southern Hemisphere. The ensemble simulation for TEC is comparable to results from a data assimilation model (Utah State University‐Global Assimilation of Ionospheric Measurements (USU‐GAIM)), with differences in skill score of less than 3% and 6% for CC and RMSE, respectively.
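The four skill scores named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: percentage change is taken relative to a quiet‐time background value, Yield is the ratio of the modeled to the observed peak percentage change, and TE is the time of the modeled peak minus the time of the observed peak. All function names and the sample TEC values are hypothetical and chosen only for illustration.

```python
import math


def percentage_change(storm, quiet):
    """Percentage change of storm-time values from a quiet-time background (assumed definition)."""
    return [100.0 * (s - q) / q for s, q in zip(storm, quiet)]


def correlation_coefficient(obs, mod):
    """Pearson correlation coefficient (CC) between observed and modeled series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_mod = sum(mod) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(obs, mod))
    norm_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    norm_mod = math.sqrt(sum((m - mean_mod) ** 2 for m in mod))
    return cov / (norm_obs * norm_mod)


def rmse(obs, mod):
    """Root-mean-square error (RMSE) of the modeled series against the observed one."""
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))


def yield_ratio(obs_pct, mod_pct):
    """Yield: modeled maximum percentage change divided by the observed maximum."""
    return max(mod_pct) / max(obs_pct)


def timing_error(times, obs_pct, mod_pct):
    """TE: time of the modeled peak minus time of the observed peak (assumed sign convention)."""
    t_obs = times[obs_pct.index(max(obs_pct))]
    t_mod = times[mod_pct.index(max(mod_pct))]
    return t_mod - t_obs


# Hypothetical example: TEC sampled every 3 hours (illustrative values, not real data).
times_h = [0, 3, 6, 9, 12]
quiet_tec = [10.0, 10.0, 10.0, 10.0, 10.0]
obs_tec = [10.0, 12.0, 16.0, 14.0, 11.0]
mod_tec = [10.0, 11.0, 12.0, 13.0, 11.0]

obs_pct = percentage_change(obs_tec, quiet_tec)
mod_pct = percentage_change(mod_tec, quiet_tec)

cc = correlation_coefficient(obs_pct, mod_pct)
err = rmse(obs_pct, mod_pct)
yld = yield_ratio(obs_pct, mod_pct)            # 0.5: the model captures half the observed peak
te = timing_error(times_h, obs_pct, mod_pct)   # 3: the modeled peak arrives 3 hours late
```

Under these assumed definitions, a Yield below 1 corresponds to the underestimated storm‐time enhancement reported in the abstract, and a positive TE means the modeled peak lags the observed one.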
Plain Language Summary: The Earth's ionosphere‐thermosphere (IT) system, which lies between the lower atmosphere and the magnetosphere, is highly variable due to external forcing from below and above as well as internal forcing mainly associated with ion‐neutral coupling processes. The variability of the IT system can adversely affect our daily lives, so accurate and reliable space weather forecasts are needed to mitigate the harmful effects of space weather events. To track the improvement of the predictive capabilities of space weather models for the IT system, and to investigate the impacts of the forcing on the performance of IT models, we evaluate new simulations from upgraded models (Coupled Thermosphere Ionosphere Plasmasphere Electrodynamics model version 4.1 and Global Ionosphere Thermosphere Model version 21.11) and from the NCAR Whole Atmosphere Community Climate Model with thermosphere and ionosphere extension (WACCM‐X) version 2.2, together with eight simulations from the previous study. A simulation from the NCAR Thermosphere‐Ionosphere‐Electrodynamics General Circulation Model version 2 is also included for comparison with WACCM‐X. Quantitative evaluation is performed using four skill scores: correlation coefficient, root‐mean‐square error, ratio of the modeled to observed maximum percentage changes (Yield), and timing error. The findings of this study will provide a baseline for future validation studies of new and improved models.
Key Points:
F2‐layer critical frequency/Total Electron Content (foF2/TEC) and their changes during a storm predicted by ionosphere‐thermosphere coupled models are evaluated against Global Ionosphere Radio Observatory foF2 and GPS TEC measurements.
Model simulations tend to underestimate the storm‐time enhancements of foF2 and TEC and to predict them better in the Northern Hemisphere.
Ensemble of all simulations for TEC is comparable to the data assimilation model (USU‐GAIM).
[ABSTRACT FROM AUTHOR]