Channel prediction is a vital technique for supporting adaptive transmission and mitigating the feedback delay of channel state information (CSI) in wireless communications, especially in frequency division duplex (FDD) systems. In this paper, we focus on channel prediction in multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) systems. First, we introduce a general channel prediction framework based on the temporal and spatial correlations in MIMO-OFDM systems. Second, we combine the broad learning idea with recurrent computation and introduce the broad echo state network (BESN) for channel prediction in MIMO-OFDM systems. Third, to estimate the output weight matrix, we offer two versions of the BESN, i.e., the basic BESN (B-BESN) and the group forward variable selection (GFVS)-based BESN (GFVS-BESN). In the latter, we develop the GFVS strategy to extract further useful information from the features collected in the BESN, making the BESN better able to process the CSI samples. Fourth, inspired by deep learning, we adopt the conjugate gradient descent backpropagation (CGDBP) technique to fine-tune the random weights and biases in the BESN and give the related derivations. Finally, we prove the echo state property of the BESN and analyze its computational complexity. In the simulation section, we comprehensively evaluate the prediction performance on the standard Extended Vehicular A (EVA) and Extended Typical Urban (ETU) models under different signal-to-noise ratios (SNRs), antenna configurations, spatial correlations, maximum Doppler shifts, and channel prediction methods. The simulation results indicate that the BESN has excellent prediction performance in MIMO-OFDM systems. [ABSTRACT FROM AUTHOR]
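
For readers unfamiliar with echo state networks, the following is a minimal sketch of a plain echo state network used as a one-step-ahead channel predictor. It is not the BESN described above: it has no broad-learning feature groups, no GFVS selection, and no CGDBP fine-tuning, and it fits the output weight matrix by ordinary ridge regression. The synthetic single-tap, sum-of-sinusoids fading series stands in for real CSI samples and is not the EVA or ETU model. All function names, sizes, and parameter values (`make_reservoir`, `fd_ts`, the reservoir size of 100, the spectral radius of 0.9) are assumptions chosen for illustration.

```python
import numpy as np

def make_reservoir(n_res, n_in, rho=0.9, seed=0):
    """Random input and recurrent weights (kept fixed, never trained here)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the spectral radius equals rho < 1, a common heuristic
    # associated with the echo state property.
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, U):
    """Drive the reservoir with inputs U (T x n_in); return states (T x n_res)."""
    x = np.zeros(W.shape[0])
    states = np.empty((U.shape[0], W.shape[0]))
    for t, u in enumerate(U):
        x = np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

def fit_readout(features, targets, ridge=1e-6):
    """Ridge-regression readout: solve (F^T F + ridge * I) W_out = F^T Y."""
    A = features.T @ features + ridge * np.eye(features.shape[1])
    return np.linalg.solve(A, features.T @ targets)

def synthetic_fading(T, fd_ts=0.01, n_paths=16, seed=1):
    """Toy complex fading tap as a sum of sinusoids (assumption, not EVA/ETU).

    fd_ts is the maximum Doppler shift times the sampling interval.
    Returns a T x 2 real array holding the real and imaginary parts.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(0.0, 2.0 * np.pi, n_paths)  # arrival angles
    phi = rng.uniform(0.0, 2.0 * np.pi, n_paths)    # initial phases
    t = np.arange(T)[:, None]
    phase = 2.0 * np.pi * fd_ts * np.cos(alpha) * t + phi
    h = np.exp(1j * phase).sum(axis=1) / np.sqrt(n_paths)
    return np.column_stack([h.real, h.imag])

# One-step-ahead prediction: the input is h(t), the target is h(t + 1).
H = synthetic_fading(1200)
U, Y = H[:-1], H[1:]

W_in, W = make_reservoir(n_res=100, n_in=2)
S = run_reservoir(W_in, W, U)

# Readout features: reservoir state, current input, and a bias column.
F = np.hstack([S, U, np.ones((len(U), 1))])

washout, split = 100, 900  # discard the initial transient, then train/test split
W_out = fit_readout(F[washout:split], Y[washout:split])

Y_hat = F[split:] @ W_out
nmse = np.sum((Y_hat - Y[split:]) ** 2) / np.sum(Y[split:] ** 2)
spectral_radius = np.max(np.abs(np.linalg.eigvals(W)))
print(f"test NMSE: {nmse:.3e}, spectral radius: {spectral_radius:.3f}")
```

Only `W_out` is learned in this sketch; the input and recurrent weights stay at their random initial values. In the paper's method, by contrast, these random weights and biases are fine-tuned with CGDBP.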