Large-scale integration of wind power makes the safe and economical operation of power grids challenging because wind generation is intermittent and fluctuating. Accurate wind power prediction is critical to addressing these concerns. This study proposes a novel hybrid encoder–decoder model that combines a bidirectional gated recurrent unit, multi-head attention mechanisms, and an ensemble technique for multi-step ultra-short-term power prediction of wind farms. In the encoder, the bidirectional gated recurrent unit captures the complex temporal dependencies of the input sequence and outputs an encoded vector. To focus on features that contribute more to the output, the decoder uses two types of multi-head attention, self-attention and cross-attention, to decode the encoded vector and obtain the forecast wind power sequence. Furthermore, an ensemble technique integrates the forecasts of several individual predictors, which reduces the uncertainty of individual prediction results and improves predictive accuracy. The input data comprise historical information from the wind farm and future information from numerical weather prediction. The forecast model was validated using actual data, and the results show that it achieves higher accuracy and stability than other existing models in four multi-step prediction scenarios (1-, 2-, 3-, and 4-h prediction).
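
As an illustration of two of the building blocks named above, the following minimal Python sketch shows scaled dot-product attention for a single head (the operation that multi-head attention repeats in parallel over several learned projections) and a step-wise average of multi-step forecasts from several predictors. It is a sketch under stated assumptions and not the implementation used in this study: the function names `attention` and `ensemble_mean` are hypothetical, the learned projection matrices are omitted, and simple averaging is assumed as the ensemble rule because the combination scheme is not specified here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for one head (no learned projections).

    queries: list of d-dimensional vectors (e.g. decoder steps)
    keys, values: equally long lists of vectors (e.g. encoder steps)
    Returns one context vector per query.
    """
    d = len(keys[0])
    contexts = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the value vectors.
        context = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        contexts.append(context)
    return contexts


def ensemble_mean(forecasts):
    """Average several multi-step forecasts step by step.

    Assumption: plain averaging stands in for the ensemble technique.
    forecasts: list of equally long forecast sequences, one per predictor.
    """
    n = len(forecasts)
    return [sum(step) / n for step in zip(*forecasts)]
```

In cross-attention the queries would come from the decoder and the keys and values from the encoded sequence, whereas in self-attention all three come from the same sequence. A call such as `ensemble_mean([forecast_a, forecast_b])`, with hypothetical per-predictor forecast lists, returns one averaged value per prediction step.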