
Journal of Modern Mathematics and Statistics

ISSN (Print): 1994-5388

A Sequential Monte Carlo Approach for Online Stock Market Prediction Using Hidden Markov Model

E. Ahani and O. Abass
Page: 73-77 | Received 21 Sep 2022, Published online: 21 Sep 2022


Abstract

This study develops a Sequential Monte Carlo (SMC) prediction algorithm based on the joint probability distribution of a Hidden Markov Model (HMM). SMC methods, a general class of Monte Carlo methods, are mostly used for sampling from sequences of distributions. Simple examples of these algorithms are used extensively in the tracking and signal-processing literature, and recent developments indicate that the techniques have much more general applicability and can be applied very effectively to statistical inference problems. First, because of the difficulty of estimating the parameters of an HMM directly, the HMM is represented as a state space model and the SMC method is applied. Second, the researchers make predictions using the SMC method in the HMM and develop the corresponding on-line algorithm. Finally, the daily stock prices in the banking sector of the Nigerian Stock Exchange (NSE) (price index from 1 January 2005 to 31 December 2008) are analyzed and the experimental results reveal that the proposed method is effective.


INTRODUCTION

State space or hidden Markov models are convenient means of statistically modeling a process that varies in time. The state space model (Doucet and Johansen, 2009) of a hidden Markov model is given by the following two equations:

x_t ~ f(x_t | x_{t-1})    (1)

y_t ~ g(y_t | x_t)    (2)

The state variables x_t and observations y_t may be continuous-valued, discrete-valued or a combination of the two. Equation 1 gives the state (transition) density f(x_t | x_{t-1}), the probability density associated with moving from x_{t-1} to x_t and Eq. 2 gives the observation density g(y_t | x_t). In practice, the x's are the unseen true signals in signal processing (Liu and Chen, 1995), the actual words in speech recognition (Rabiner, 1989), the target features in a multitarget tracking problem (Avitzour, 1995; Gordon et al., 1993, 1995), the image characteristics in computer vision (Isard and Blake, 1996), the gene indicator in a DNA sequence analysis (Churchill, 1989) or the underlying volatility in an economic time series (Pitt and Shephard, 1997). Hidden Markov models have demonstrated the applicability of dynamic state space models in DNA and protein sequence analysis (Krogh et al., 1994; Liu et al., 1997).
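
As a concrete illustration of Eq. 1 and 2, the following sketch simulates a generic state-space model; the Gaussian random-walk transition and Gaussian observation noise are illustrative assumptions for exposition, not the paper's specification:

```cpp
// Simulate T steps of a state-space model x_t ~ f(. | x_{t-1}),
// y_t ~ g(. | x_t), here a Gaussian random walk observed in Gaussian
// noise (both densities are illustrative assumptions).
#include <random>
#include <utility>
#include <vector>

std::pair<std::vector<double>, std::vector<double>>
simulate(int T, double sigma_x, double sigma_y, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> fx(0.0, sigma_x);  // transition noise
    std::normal_distribution<double> gy(0.0, sigma_y);  // observation noise
    std::vector<double> x(T), y(T);
    x[0] = fx(gen);                 // x_1 drawn from the initial density
    y[0] = x[0] + gy(gen);          // y_1 ~ g(. | x_1)
    for (int t = 1; t < T; ++t) {
        x[t] = x[t - 1] + fx(gen);  // Eq. 1: x_t depends only on x_{t-1}
        y[t] = x[t] + gy(gen);      // Eq. 2: y_t depends only on x_t
    }
    return {x, y};
}
```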

Using the functions provided by C++ to develop an on-line prediction algorithm for the hidden Markov model, this study takes its impetus from Johansen (2009), SMCTC: Sequential Monte Carlo in C++. Further support was derived from results on predicted and actual data of monthly national air passengers in America. Cheng applied SMC methodology to tackle the problems of optimal filtering and smoothing in hidden Markov models. SMC methods have also stirred great interest in the engineering and statistical literature (see Doucet et al., 2000, for a summary of the state of the art). Lately, SMC methods have been applied to resolve a marginal Maximum Likelihood problem (Johansen et al., 2008). The application of SMC to optimal filtering was first offered in Gordon et al. (1993). Here, an SMC method is developed for prediction of the state by estimating the predictive density of the state given the past observations.

Hidden Markov model: Although initially introduced and studied as far back as 1957 and the early 1970s, the recent popularity of statistical methods based on the HMM is not in question. An HMM is a bivariate discrete-time process where {Xk}k≥0 is a homogeneous Markov chain which is not directly observed but can only be observed through {Yk}k≥0, the sequence of observations. {Yk}k≥0 is a sequence of conditionally independent random variables such that the conditional distribution of Yk depends only on Xk. The underlying Markov chain {Xk}k≥0 is called the state. In general, the random variables Xk, Yk can be of any dimension and of any domain such as discrete, real or complex.

The researchers collect K elements of Xk and Yk for k = 1, 2,...., K to construct the vectors X1:K and Y1:K, respectively. Because of the Markov assumption, the probability of the current true state given the immediately previous one is conditionally independent of the other earlier states:

p(x_k | x_1,...,x_{k-1}) = p(x_k | x_{k-1})

Similarly, the measurement at the kth time step depends only upon the current state, so it is conditionally independent of all other states given the current state:

p(y_k | x_1,...,x_k, y_1,...,y_{k-1}) = p(y_k | x_k)

Using these assumptions, the probability distribution over all states of the HMM can be written simply as:

p(x_{1:K}, y_{1:K}) = p(x_1) p(y_1 | x_1) ∏_{k=2}^{K} p(x_k | x_{k-1}) p(y_k | x_k)

which is reflected graphically in Fig. 1. Given the filtering density p(x_{k-1} | y_{1:k-1}) at the previous step, the current filtering density is obtained using the following prediction and update steps:

p(x_k | y_{1:k-1}) = ∫ p(x_k | x_{k-1}) p(x_{k-1} | y_{1:k-1}) dx_{k-1}    (prediction)

p(x_k | y_{1:k}) ∝ p(y_k | x_k) p(x_k | y_{1:k-1})    (update)

In this case, we use numerical integration, which becomes computationally complex when the number of states of x_k is large. One particular Monte Carlo-based approach to solving this for the HMM is the SMC method.
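
For a discrete-state HMM, the integral in the prediction step reduces to a sum over states, so one step of the recursion is a few loops. A minimal sketch in C++, where the 3x3 transition matrix and emission likelihoods used below are illustrative assumptions, not the paper's estimates:

```cpp
// One prediction + update step of the filtering recursion for a
// discrete-state HMM with S states (e.g. bull, even, bear).
#include <array>

constexpr int S = 3;
using Dist = std::array<double, S>;

// A[j][i] = p(x_k = i | x_{k-1} = j); emit_lik[i] = p(y_k | x_k = i).
Dist step(const Dist& prior, const std::array<Dist, S>& A,
          const Dist& emit_lik) {
    Dist pred{}, post{};
    // Prediction: p(x_k = i | y_{1:k-1}) = sum_j A[j][i] * prior[j]
    for (int i = 0; i < S; ++i)
        for (int j = 0; j < S; ++j)
            pred[i] += A[j][i] * prior[j];
    // Update: p(x_k | y_{1:k}) ∝ p(y_k | x_k) p(x_k | y_{1:k-1})
    double norm = 0.0;
    for (int i = 0; i < S; ++i) {
        post[i] = emit_lik[i] * pred[i];
        norm += post[i];
    }
    for (int i = 0; i < S; ++i) post[i] /= norm;  // normalize
    return post;
}
```

With S states, each step costs O(S^2), which is the computational burden the text refers to when the state space is large.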

Sequential Monte Carlo methods: Since their pioneering contribution in 1993 (Gordon et al., 1993), SMC methods have become a well-known class of numerical methods for the solution of optimal estimation problems in non-linear, non-Gaussian scenarios. The key idea of the SMC method is to represent the posterior density function at time k-1 by samples and associated weights and to compute estimates based on these samples and weights.

Fig. 1: Probability distribution over all states of the HMM

As the number of samples becomes very large, this Monte Carlo characterization develops into an equivalent representation of the functional description of the posterior probability density function (Sanjeev et al., 2002). If we let:

{x_k^i, w_k^i}, i = 1,...,N

be a set of particles with associated weights approximating the density function p(x_k | y_{1:k}), with the weights normalized so that:

Σ_{i=1}^{N} w_k^i = 1

then the density function is approximated by:

p(x_k | y_{1:k}) ≈ Σ_{i=1}^{N} w_k^i δ(x_k - x_k^i)

where δ(x) signifies the Dirac delta function. When a new observation y_k arrives, the density function is obtained recursively in two stages:

Drawing samples
Updating weights with the principle of importance sampling (Doucet et al., 2000; Sanjeev et al., 2002)

The particles are propagated over time by Monte Carlo simulation to get new particles and weights (usually as new information is received), hence forming a series of PDF approximations over time. The reason that it works can be understood from the theory of (recursive) importance sampling.
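
The role of the weights can be illustrated with plain (self-normalized) importance sampling on a static problem before turning to the sequential case; the target and proposal densities below are arbitrary choices for illustration only:

```cpp
// Self-normalized importance sampling: estimate E[X] under the target
// N(2,1) using draws from the proposal N(0,3).  Both densities are
// illustrative assumptions, unrelated to the paper's model.
#include <cmath>
#include <random>

double normal_pdf(double x, double mu, double sd) {
    const double pi = 3.14159265358979323846;
    double z = (x - mu) / sd;
    return std::exp(-0.5 * z * z) / (sd * std::sqrt(2.0 * pi));
}

double is_mean(int N, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> proposal(0.0, 3.0);
    double num = 0.0, den = 0.0;
    for (int i = 0; i < N; ++i) {
        double x = proposal(gen);
        // Importance weight: target density over proposal density.
        double w = normal_pdf(x, 2.0, 1.0) / normal_pdf(x, 0.0, 3.0);
        num += w * x;
        den += w;
    }
    return num / den;  // converges to the target mean, 2, as N grows
}
```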

Procedural functions: This is how it works. We consider a particular algorithm for SMC, known as Sampling Importance Resampling (SIR) (Gordon et al., 1993; Carpenter et al., 1999; Johansen, 2009). The algorithm can be summarized as follows. It is initiated by setting k = 1, for which we define:

x_1^i ~ p(x_1), w_1^i = 1/N, i = 1,...,N

Prediction (for step k): Draw N samples x_k^i from the distribution p(x_k | x_{k-1}^i). The weight w_k^i is:

w_k^i = p(y_k | x_k^i) / Σ_{j=1}^{N} p(y_k | x_k^j)

where p(y_k | x_k^i) is calculated from the conditional PDF of the observation given the state, evaluated at the observation y_k.

Resample (for step k): Resample the random measure obtained in the prediction procedure to get:

{x_k^{i*}, 1/N}, i = 1,...,N

which has uniform weights. The importance of the prediction step is made clear by establishing the following results. Using an importance function q satisfying the property:

q(x_{0:k} | y_{1:k}) = q(x_k | x_{0:k-1}, y_{1:k}) q(x_{0:k-1} | y_{1:k-1})

{x_{0:k}^i, w_k^i} is the random measure for estimating p(x_{0:k} | y_{1:k}), where x_{0:k}^i is the trajectory for particle i and w_k^i is the normalized weight of particle i at time k, which can be calculated recursively. Let:

w_k^i ∝ w_{k-1}^i p(y_k | x_k^i) p(x_k^i | x_{k-1}^i) / q(x_k^i | x_{0:k-1}^i, y_{1:k})

According to this argument, at the kth step, the density function estimate for p(x_k | y_{1:k}) is:

p(x_k | y_{1:k}) ≈ Σ_{i=1}^{N} w_k^i δ(x_k - x_k^i)

After the density function has been estimated, the observation prediction can be made with a set of samples and associated weights. Accordingly, p(x_{k+1} | y_{1:k}) is approximated by a new set of samples and the observation prediction equation is:

p(y_{k+1} | y_{1:k}) ≈ Σ_{i=1}^{N} w_k^i p(y_{k+1} | x_{k+1}^i)
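
The prediction, weighting and resampling steps above can be sketched as a compact SIR filter. The Gaussian random-walk transition and Gaussian likelihood are illustrative assumptions; the paper's state and emission densities for the market states are not reproduced here:

```cpp
// Minimal SIR particle filter sketch for a Gaussian random-walk model
// (illustrative assumption, not the authors' market model).
#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

struct SIRFilter {
    std::mt19937 gen;
    std::vector<double> particles;  // x_k^i
    std::vector<double> weights;    // w_k^i
    double sigma_x, sigma_y;

    SIRFilter(int N, double sx, double sy, unsigned seed)
        : gen(seed), particles(N), weights(N, 1.0 / N),
          sigma_x(sx), sigma_y(sy) {
        std::normal_distribution<double> init(0.0, 1.0);
        for (auto& p : particles) p = init(gen);  // k = 1: x_1^i ~ p(x_1)
    }

    // One prediction + weight + resample step for observation y_k;
    // returns the filtered state estimate sum_i w_k^i x_k^i.
    double step(double y) {
        std::normal_distribution<double> noise(0.0, sigma_x);
        double norm = 0.0;
        for (std::size_t i = 0; i < particles.size(); ++i) {
            particles[i] += noise(gen);           // x_k^i ~ p(x_k | x_{k-1}^i)
            double z = (y - particles[i]) / sigma_y;
            weights[i] = std::exp(-0.5 * z * z);  // w_k^i ∝ p(y_k | x_k^i)
            norm += weights[i];
        }
        double est = 0.0;
        for (std::size_t i = 0; i < particles.size(); ++i) {
            weights[i] /= norm;                   // normalize the weights
            est += weights[i] * particles[i];
        }
        // Resample with probability w_k^i, then reset to uniform 1/N.
        std::discrete_distribution<int> pick(weights.begin(), weights.end());
        std::vector<double> next(particles.size());
        for (auto& p : next) p = particles[pick(gen)];
        particles.swap(next);
        std::fill(weights.begin(), weights.end(), 1.0 / particles.size());
        return est;
    }
};
```

Feeding observations one at a time is what makes the algorithm on-line: each call to `step` uses only the new observation and the current particle set.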

Data description: The method described earlier is applied to the data sets of daily stock prices in the banking sector of the Nigerian Stock Exchange (price index from 1 January 2005 to 31 December 2008). Three hidden states are studied: bull, bear and even. These hidden states, along with the observable sequences of large rise, small rise, no change, large drop and small drop, were used to develop the hidden Markov model (Fig. 2).

The sequence of observations is obtained by subtracting the prior price from the current price; the percentage change then gives the classification of the sequence of observations.

Let Pt be the price of an asset at time t; the daily price relative/log return is calculated as rt = log(Pt/Pt-1). Stock prices regularly alter in stock markets, as seen in the price index on Tuesday, February 5th 2006, when it fell by >100% (Fig. 3). There is no infallible system that indicates the precise movement of stock prices. Instead, the stock price is subject to the influence of various factors such as company fundamentals, external factors and market behaviour. These decide the state of the market, which may be in the bull, even or bear state.
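
A sketch of computing the log returns and mapping them to the five observation symbols; the ±2% thresholds below are hypothetical, since the paper does not state the cut-offs it used:

```cpp
// Log return r_t = log(P_t / P_{t-1}) and a five-symbol classification
// of the observation sequence.  The 2% thresholds are assumptions.
#include <cmath>
#include <string>
#include <vector>

std::string classify(double r) {
    if (r > 0.02)  return "large rise";
    if (r > 0.0)   return "small rise";
    if (r == 0.0)  return "no change";
    if (r > -0.02) return "small drop";
    return "large drop";
}

std::vector<std::string> observations(const std::vector<double>& prices) {
    std::vector<std::string> obs;
    for (std::size_t t = 1; t < prices.size(); ++t)
        obs.push_back(classify(std::log(prices[t] / prices[t - 1])));
    return obs;
}
```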

Fig. 2: Daily stock prices in the banking sector of the Nigerian Stock Exchange (price index between the years 1st January 2005 to 31st December 2008)

 

Fig. 3: Daily stock prices in the banking sector of the Nigerian Stock Exchange (red line represents predicted stock price while blue line represents actual stock price)

 

 

Table 1: Predicted daily stock price in the banking sector of the NSE
MAPE (%): 0.068285

 

The market evolves over time through different market states, which are the hidden states. The state of the market can be regarded as a Markovian process and is modeled with an HMM.

Experimental outcome: Utilizing the functions provided by C++, this study develops an on-line prediction algorithm for the hidden Markov model according to the analysis of sections 2 and 3, drawing motivation from Johansen (2009), SMCTC: Sequential Monte Carlo in C++. The on-line prediction using SMC begins with states producing signals that follow the normal distribution. The hidden states of the Markov chain are defined as Bull (state 1), Even (state 2) and Bear (state 3). Figure 3 shows the predicted and actual daily stock prices and Table 1 shows the predicted representative prices of the NSE and the prediction errors.

The stock price is modeled in the HMM and prediction is made based on the available observations. Due to the strong statistical foundation of the HMM and the SMC method, it can track the actual price pattern proficiently (Fig. 3). From Table 1, we can observe that the Mean Absolute Percentage Error (MAPE) is 0.068%. Hence, the predictive accuracy is high.
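
The MAPE reported in Table 1 is the average absolute relative error between actual and predicted prices, scaled to a percentage; a minimal sketch:

```cpp
// Mean Absolute Percentage Error, as used to summarize Table 1.
#include <cmath>
#include <vector>

double mape(const std::vector<double>& actual,
            const std::vector<double>& predicted) {
    double sum = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i)
        sum += std::fabs((actual[i] - predicted[i]) / actual[i]);
    return 100.0 * sum / actual.size();  // percentage
}
```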

CONCLUSION

In this study, an on-line sequential Monte Carlo method is applied for prediction in a Hidden Markov model. A C++ template class library, SMCTC: Sequential Monte Carlo in C++ (Johansen, 2009), enabled us to develop the on-line sequential Monte Carlo prediction.

The basic theory of HMM and SMC method was introduced. Then we approximated the density function with a set of random samples with associated weights.

Lastly, the data sets of daily stock prices in the banking sector of the Nigerian Stock Exchange (price index from 1 January 2005 to 31 December 2008) were analyzed and the experimental results revealed that the on-line algorithm is effective.

How to cite this article:

E. Ahani and O. Abass. A Sequential Monte Carlo Approach for Online Stock Market Prediction Using Hidden Markov Model.
DOI: https://doi.org/10.36478/jmmstat.2010.73.77
URL: https://www.makhillpublications.co/view-article/1994-5388/jmmstat.2010.73.77