Background

Geomagnetic storms are common features of space weather that pose a threat to ground- and space-based technology systems. Most intense geomagnetic storms are caused by fast coronal mass ejections (CMEs), which induce disturbances in the solar wind (SW). Geomagnetic storms occur as a result of energy transfer from the SW to the Earth’s magnetosphere via magnetic reconnection. Hence, changes in the SW plasma and the interplanetary magnetic field (IMF) are important factors to consider when developing magnetic storm forecast models. At present, the physics of the magnetosphere and the interplanetary medium is not completely understood, and there is still no comprehensive model of the solar-terrestrial environment. Current geomagnetic storm prediction tools are dominated by empirical models, which rely mostly on observable storm precursors in the SW (Fox and Murdin 2001). Various functional relationships have been proposed for magnetic storm prediction, such as the models of Burton et al. (1975) and Temerin and Li (2002) for predicting the disturbance storm time (Dst) index from SW parameters. Empirical prediction models include, among others, neural network (NN) models, which learn from cases and can handle complex nonlinear physical phenomena. This NN capability in space weather-related predictions has been demonstrated in various studies (e.g. Uwamahoro et al. 2012; Watthanasangmechai et al. 2012). In the domain of the geomagnetic field, Segarra and Curto (2012) recently applied NNs to the automatic detection of sudden commencements. Various other NN-based models for predicting geomagnetic storms using SW and IMF data as inputs have also been developed and used (Lundstedt and Wintoft 1994). In particular, Elman NN-based algorithms by Wu and Lundstedt (1996), Wu and Lundstedt (1997), and Lundstedt et al. (2002) demonstrated the ability to improve the Dst forecast.

Other than the prediction of the Dst index, models for predicting the geomagnetic Kp index (from SW and IMF input parameters) have also been developed (Boberg et al. 2000; Costello 1997; Wing et al. 2005). The difficulties related to the prediction of the Kp index during storm periods (with Kp > 5) were noted by Wing et al. (2005). A few models have also been developed for the prediction of the locally measured K index, including the work by Virjanen et al. (2008) and Kutiev et al. (2009). In their study, Virjanen et al. (2008) described the problems encountered when predicting the storm time K index on the basis of previous K index values only and suggested that SW parameters need to be considered as model inputs. The study described in this paper explored the application of Elman NN techniques for predicting the locally measured geomagnetic K index at the Hermanus Magnetic Observatory. The results obtained are compared to the prediction performance for the global Kp index.

The motivation behind this study lies in the importance of the K and related planetary Kp indices in space weather modelling. The ability to predict the K index can find application, for example, in predicting geomagnetically induced currents (GICs) (Virjanen et al. 2008). On the other hand, the K-derived planetary Kp index plays a key role in magnetospheric and ionospheric modelling (Wing et al. 2005). Regional ionospheric models, e.g. TEC prediction models and the South African Bottomside Ionospheric Model (SABIM), take into account the local magnetic conditions by using the a index, which is directly derived from the locally recorded K index (Habarulema 2010; McKinnell 2002). An accurate model to predict the local storm time K index might, therefore, make a significant contribution towards improving ionospheric and other regional space weather models that consider magnetic activity as input.

Methods

The data sets

Geomagnetic K and Kp indices

The geomagnetic K index is a quasi-logarithmic local index of geomagnetic activity. The K index quantifies disturbances in the H component of the Earth’s magnetic field with an integer in the range of 0 to 9, with 1 indicating calm conditions and 5 or more indicating a storm. The K index is derived from the maximum fluctuations of the H component observed on a magnetometer during a 3-h interval. Ground stations (magnetometers) throughout the world monitor geomagnetic activity, each providing a local K index. There is a close link between the local K and global Kp indices. The planetary-scale Kp index (Menvielle and Berthelier 1991) is derived from the average of fractional K indices at 13 subauroral observatories. The Kp index is based primarily on data from magnetic observatories at middle latitudes and its values are generated with a time resolution of 3 h. This index represents a quasi-logarithmic measure of the disturbance range, also having values between 0 (very quiet) and 9 (very disturbed). While the K index is a measure of the local magnetic disturbance, the Kp index is a good measure of the global magnetic activity (Prölss 2004).
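As a rough illustration of how a quasi-logarithmic K value follows from the range of the H component in one 3-h interval, the sketch below maps a range in nT onto the classes 0 to 9. The class limits are the widely quoted Niemegk values and are an assumption made for illustration only; the limits applied at Hermanus are not given in the text. The removal of the regular daily variation that precedes the range measurement in practice is also omitted, and the function name `k_from_range` is hypothetical.

```python
from bisect import bisect_right

# Lower limits (nT) of the range classes K = 0..9. These are the widely quoted
# Niemegk values, used for illustration only; the Hermanus limits are an
# assumption not taken from the text.
ILLUSTRATIVE_LIMITS_NT = [0, 5, 10, 20, 40, 70, 120, 200, 330, 500]


def k_from_range(h_values_nt, limits=ILLUSTRATIVE_LIMITS_NT):
    """Return the K class for H-component samples from one 3-h interval."""
    h_range = max(h_values_nt) - min(h_values_nt)
    # bisect_right counts the lower limits that are <= h_range;
    # subtracting 1 converts that count into the class number 0..9.
    return bisect_right(limits, h_range) - 1


# Hypothetical H samples (nT) for a quiet and a disturbed interval.
print(k_from_range([0.0, 3.0, 1.0]))
print(k_from_range([10.0, 95.0, 40.0]))
```

Because the class widths grow roughly geometrically, a doubling of the disturbance range raises K by about one unit, which is what makes the index quasi-logarithmic.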

This paper mainly focusses on the predictability of the storm time K index recorded at the Hermanus Magnetic Observatory (34° 25.5 S; 19° 13.5 E) in South Africa. The observatory is part of the South African National Space Agency (SANSA) at Hermanus and is also an active participant in the worldwide network of magnetic observatories (INTERMAGNET), monitoring and modelling variations of the Earth’s magnetic field. The K index represents a measure of the local geomagnetic activity response to solar and associated SW disturbances (http://spaceweather.sansa.org.za/). Figure 1 illustrates an intense magnetic storm on 3 to 5 August 2010 (K = 6 recorded at Hermanus) following a coronal mass ejection launched from the Sun on 1 August 2010 at 13:42 UT.

Figure 1

Hermanus K index response to solar storms. This figure indicates the K index level recorded at the Hermanus Magnetic Observatory during an intense magnetic storm on 3 to 5 August 2010. This storm followed a solar CME that erupted on 1 August 2010.

Solar wind input parameters

Geomagnetic disturbances are closely linked to the IMF fluctuations, both in magnitude and direction (Schwenn et al. 2005). An interconnection between the long-duration southward IMF Bz component and the Earth’s magnetic field allows SW energy transport into the Earth’s magnetosphere (Gonzalez et al. 1994). Several studies, including a recent work by Kissinger et al. (2011), have indicated the role of the SW speed in magnetic storm generation. Indeed, sustained and enhanced SW speed and southward and northward IMF Bz components are commonly associated with interplanetary shocks and ejecta known to be important causes of storms (Gosling et al. 1990). Enhanced SW number density N is also an important parameter which often affects the storm strength (Crooker 2000). An increase in the SW density can compress the dayside magnetopause, which drives an increase in the magnetopause current, field-aligned currents and cross-tail currents. Many papers, e.g. Wang et al. (2003) and Xie et al. (2008), have described the link between high SW dynamic pressure and geomagnetic storms. Correlations between the Kp index and various SW parameters were previously established (e.g. Papitashvili et al. 2000). Figure 2 indicates the correlation between various SW parameters and the Hermanus (Her) K index. From the figure, it is clear that the IMF Bz is more strongly correlated with the K index than any other SW parameter.

Figure 2

Relationship between the variability of K index and various solar wind parameters. This figure combines scatter plots showing the correlation between various solar wind parameters and the K index.

For the model described in this paper, the input parameters used were the SW speed V, the IMF Bt and Bz components as well as the SW particle number density N. The Bz used here is in the Geocentric Solar Magnetospheric (GSM) system because it maximizes the correlation with geomagnetic activity (Kivelson and Russell 1995). Figure 3 illustrates the disturbances in the SW parameters and associated geomagnetic response as measured by the local K and the global Kp indices during a storm period. The figure clearly indicates how higher values of magnetic indices are directly associated with the abrupt changes in the SW parameters.

Figure 3

Geomagnetic K and Kp indices response to solar wind storms. This figure shows the disturbances in the SW parameters and associated geomagnetic response as measured by the local K and the global Kp indices during a storm period. In the figure, the blue broken lines indicate the variability of Bz, V and the corresponding geomagnetic Kp response. The variability of the Np, Bz parameters and the corresponding K response are represented by the solid lines.

The model was developed using hourly OMNI-2 SW and IMF parameters (Bt, Bz, V and N) for both the network training and testing sets. These data come from various spacecraft and are provided by the National Space Science Data Center, available online on its OMNIWEB http://omniweb.gsfc.nasa.gov/html. The Kp index data used are provided by the National Geophysical Data Center (NGDC) and are also available online at ftp://ftp.ngdc.noaa.gov/STP/GEOMANGETIC_DATA/.

An introduction to neural network prediction techniques

A neural network is an information processing system consisting of a large number of simple processing elements called neurons. NNs are characterised by (1) the pattern of connection between the neurons, (2) the method of determining the weights on the connections (training or learning algorithm) and (3) the activation function (Fausett 1994). For the NN models used for predictions, three types of neurons (or units) are defined: (i) input units, which are set to represent values within the time series, (ii) output units, which store the output values corresponding to a given set of input values and produce the results of the NN processing and (iii) hidden units, which keep the internal representation of the mapping.

Units in layers are connected by weights, which hold the knowledge of the network and govern the influence each input has on each output. Weights are adjusted by a learning process which involves the comparison of network calculations with input-output data for known cases. The process of adjusting weights is known as network training. During the training, weights are determined so that the network properly relates inputs to desired outputs. Hence, the network learns to predict outcomes from experience rather than from using causal laws (Macpherson et al. 1995).

A unique feature of NNs lies in their ability not only to learn the training data but also to generalise by predicting unseen patterns within the boundaries given by the training set. In general, solving a nonlinear problem with the NN technique requires (1) choosing a convenient network architecture, (2) selecting a large database of input-output pairs (patterns) that contains sufficient historical information about the time series, and (3) training the network to relate the inputs to the corresponding outputs. Several NN types have been proposed (Bishop 1995; Fausett 1994; Haykin 1994), including the feed forward NN (FFNN) and the Elman neural network. To develop the local K and global Kp index prediction model, the Elman neural network algorithm was used. The Elman NN (Elman 1990) belongs to the class of recurrent NNs and is commonly known as the Elman recurrent network (ERN). It consists of an input layer, a hidden layer and an output layer. It also has an additional context layer that always stores the output from the hidden layer and relays this information in the next iteration. Therefore, context neurons form a sort of short-term memory, which is very useful for improving the prediction of sequences. This means that the state of the whole network at a given time depends on an aggregate of the previous states, as well as on the current inputs (Pallocchia et al. 2006). A simplified mathematical description of the ERN can be found in the literature, including a recent paper by Cai et al. (2010).
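The role of the context layer can be made concrete with a minimal forward-pass sketch. It is not the implementation used in this study: the tanh hidden activation, the linear output unit, the omission of bias terms, the weight initialisation range and the class name `ElmanNetwork` are all assumptions made for illustration.

```python
import math
import random


class ElmanNetwork:
    """Minimal Elman recurrent network: input -> hidden (+ context) -> output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            # Small random starting weights (the range is an assumption).
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = rand(n_hidden, n_in)       # input -> hidden weights
        self.w_ctx = rand(n_hidden, n_hidden)  # context -> hidden weights
        self.w_out = rand(n_out, n_hidden)     # hidden -> output weights
        self.context = [0.0] * n_hidden        # copy of the previous hidden output

    def step(self, x):
        """Process one input pattern and update the context layer."""
        hidden = []
        for j in range(len(self.w_in)):
            s = sum(w * xi for w, xi in zip(self.w_in[j], x))
            s += sum(w * c for w, c in zip(self.w_ctx[j], self.context))
            hidden.append(math.tanh(s))  # assumed hidden activation
        self.context = hidden            # relayed to the next time step
        # Assumed linear output unit(s).
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]


# Hypothetical scaled inputs (V, Bt, Bz, N) fed twice to an untrained network.
net = ElmanNetwork(n_in=4, n_hidden=5, n_out=1)
pattern = [0.4, 0.2, -0.3, 0.1]
first = net.step(pattern)
second = net.step(pattern)
print(first, second)
```

Feeding the same pattern twice gives two different outputs, because the second step also sees the hidden state stored from the first. This is the short-term memory that distinguishes the ERN from a plain feed forward network.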

Development of the NN model

The training and testing data sets consist of storm periods selected within solar cycle (SC) 23 [1996-2006]. The database was constructed based on storm events with Kp (K) ≥ 5. Each storm period was defined as having K ≥ 5 (Kp ≥ 5) at least once and was preceded and followed by a quiet magnetic period of at least 12 h. The amount of quiet time data varied from one storm event to the other since it depended on the storm behaviour. However, for each storm event, at least a day (eight data points) of Kp (K) ≤ 5 was included before and after the storm time. Based on these criteria, the training database (1996 to 2003 and 2006) consisted of 4,930 data points. Selected storm periods during the years 2004 and 2005 (688 data points) were excluded from the training process and were used to test the performance of the model. Note that both the observed and predicted local K and global Kp indices are three-hourly indices. Therefore, an input row (pattern) is made up of four SW parameter values (V, Bt, Bz, N), each one being the average of the three preceding hourly values. Figure 4 shows a schematic illustration of the network architecture used in this study. The m values were 5 and 6 (indicating that there were five and six hidden nodes in the hidden layer) for the K index and Kp index models, respectively. The output of the NN is a three-hourly K (Kp) index. During the training process, the optimal number of hidden nodes was determined systematically by varying the number of hidden nodes. At the start of the training process, weights are chosen randomly for the ERN within both the input and context layers. Training is done iteratively, and the mean square errors for the training and testing patterns were monitored. As long as the error on the testing patterns decreased, the training process was allowed to proceed; it was terminated only when the error started increasing since, at this point, the network is believed to have achieved convergence/generalisation.
The root mean square error (RMSE) and the correlation coefficient (CC) were the statistical measures used to characterise the prediction performance of the model. The optimum performance was reached with the NN structure configurations 4:5:1 for the K index and 4:6:1 for the Kp index. The numbers in a configuration such as 4:5:1 represent the input, hidden and output nodes, respectively. Table 1 shows the various network configurations that were tried and the corresponding RMSE and CC.
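The construction of a three-hourly input row from hourly SW data can be sketched as follows. This is a minimal illustration only: the function name `three_hourly_patterns`, the sample values and the assumption that the hourly series starts at the beginning of a 3-h index interval are hypothetical, and the handling of data gaps is not addressed.

```python
def three_hourly_patterns(hourly_rows):
    """Average consecutive groups of three hourly rows of (V, Bt, Bz, N).

    Each returned row is one network input pattern. A trailing incomplete
    group of fewer than three hours is dropped.
    """
    patterns = []
    for start in range(0, len(hourly_rows) - 2, 3):
        block = hourly_rows[start:start + 3]
        # zip(*block) regroups the three rows into four per-parameter columns.
        patterns.append([sum(column) / 3.0 for column in zip(*block)])
    return patterns


# Hypothetical hourly values: V (km/s), Bt (nT), Bz (nT), N (cm^-3).
hourly = [
    [400.0, 5.0, -2.0, 4.0],
    [430.0, 7.0, -4.0, 6.0],
    [460.0, 9.0, -6.0, 8.0],
    [500.0, 10.0, -8.0, 10.0],
    [500.0, 10.0, -8.0, 10.0],
    [500.0, 10.0, -8.0, 10.0],
]
for row in three_hourly_patterns(hourly):
    print(row)
```

Six hourly rows yield two patterns, so the input series has the same 3-h cadence as the K and Kp targets.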

Figure 4

The Elman recurrent neural network (ERN) used. The m values were 5 and 6 (indicating that there were five and six hidden nodes in the hidden layer) for the K index and Kp index models, respectively.

Table 1 Different NN configurations investigated with corresponding prediction performance

Results and discussion

Table 1 shows the different NN configurations that were investigated with the corresponding prediction performance, evaluated by calculating the RMSE and CC over the whole validation data set. From Table 1, it is clear that the developed model performs better when predicting the global Kp index than it does for the local Hermanus K index. One possible reason for this difference in prediction performance is that the global Kp index is derived from the K indices of several observatories, averaged and corrected for the observatories’ respective magnetic latitudes. The results from the developed model indicate a CC of 0.76 between the predicted and observed Hermanus K index and a CC of 0.88 for the Kp index. Even though the data set used is not the same, this model prediction is comparable with the previous Kp index predictions by Wing et al. (2005) and Boberg et al. (2000). However, it is important to note that the latter models considered all Kp values as input data, while the current model was developed using Kp data for selected storm events only.
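The two performance measures can be computed as in the sketch below. Pearson's linear correlation coefficient is assumed, since the text does not state which form of CC was used, and the sample index values are hypothetical rather than results from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def correlation(observed, predicted):
    """Pearson linear correlation coefficient (assumed form of the CC)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted three-hourly K values for one storm.
k_observed = [2, 3, 5, 7, 6, 4, 3, 2]
k_predicted = [2.4, 3.1, 4.2, 6.1, 6.3, 4.8, 3.5, 2.2]
print(rmse(k_observed, k_predicted), correlation(k_observed, k_predicted))
```

The two measures are complementary: a prediction that is offset from the observations by a constant still has a CC of 1 but a non-zero RMSE, which is why both are reported.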

The prediction performance of this model was tested on four selected intense storms which were part of the validation data set (not included in the training data set). Two of the four storms occurred in 2004 and the other two in 2005. The two selected storms in 2004 were both long-duration intense storms, each behaving like two storms in one with two peak maxima. Figure 5 shows the performance of the model in predicting the local Hermanus K index during the selected storm periods. The storm in Figure 5a lasted for 5 days (from 7 to 12 November 2004), reaching two peak maxima of K = 7 within the storm period. The global Kp index reached a peak maximum of Kp = 8.7 on 8 November at 03:00 UT and on 9 November at 18:00 UT, as shown in Figure 6a. The storm in July 2004 (Figure 5b) also had two peak K index maxima, one on 25 July at 06:00 UT (K = 6) and another on 27 July at 21:00 UT (K = 7). The two test storms in 2005, shown in Figure 5c,d, are the most recent and greatest magnetic storms of SC 23. Both storms reached a peak Hermanus K index value of 8. Figure 6 is similar to Figure 5 but for the global Kp index, and indicates a slightly improved performance when the model is applied to the prediction of the global Kp index. Table 2 presents a statistical summary of the model’s prediction performance on the selected individual storms and shows that the selected storms are well predicted, with an average CC of 0.80 for the K index and 0.90 for the Kp index.

Figure 5

Model performance on K index. This figure shows the K index prediction performance of the developed model on the four selected individual storms.

Figure 6

Model performance on Kp index. Illustration of the Kp index prediction performance on the four selected individual storms. The Kp prediction in this figure shows some improvement compared with the K index prediction.

Table 2 The model’s K and Kp indices prediction performance on the four selected individual storms

Conclusions

An NN-based model for predicting the storm time local K and global Kp indices using SW and IMF input parameters has been developed. Many previous studies focused on developing models to predict magnetic storms as measured by the Dst index. However, some findings (e.g. Borovsky and Denton 2006) suggest that the Dst alone can, in some cases, be a poor indicator of the properties of a storm. The primary aim of this study was to explore the NN predictability of the locally measured storm time K index from SW and IMF parameters. The results obtained compare well with previous Kp (closely related to the K index) predictions by Wing et al. (2005) and Boberg et al. (2000), noting, however, that contrary to these previous models, the current model involved Kp data for selected storm events only. The results obtained from the developed NN models are in line with what is already known about the SW control of geomagnetic activity. With a knowledge of the SW velocity and density, as well as the IMF strength and orientation, it is possible to predict well the energisation of the ring current and reproduce accurately the magnetic measurements recorded by ground-based magnetometers (Russell 1986). The developed model constitutes a step towards achieving real-time forecasts of the locally (Hermanus) measured K index. If achieved, the real-time prediction of the K index will contribute significantly to improving regional ionospheric modelling as well as other regional space weather models that consider the locally measured magnetic activity as input.