by Joseph Rickert
In a previous post, I showed how some elementary properties of discrete time Markov Chains could be calculated, mostly with functions from the markovchain package. In this post, I would like to show a little more of the functionality available in that package by fitting a Markov Chain to some data. In this first block of code, I load the gold data set from the forecast package, which contains daily morning gold prices in US dollars from January 1, 1985 through March 31, 1989. Next, since there are a few missing values in the sequence, I impute them with a simple “ad hoc” process: substituting the previous day’s price for one that is missing. There are two statements in the loop because there are a number of instances where two values in a row are missing. Note that some kind of imputation is necessary because I will want to compute the autocorrelation of the series and, like many R functions, acf() does not like NAs. (It doesn’t make sense to compute with NAs.)
library(forecast)
library(markovchain)
data(gold) # Load gold time series
# Impute missing values
gold1 <- gold
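for (i in seq_along(gold)) {
  # Carry the previous day's price forward when today's is missing
  gold1[i] <- ifelse(is.na(gold[i]), gold[i-1], gold[i])
  # Second statement covers two missing values in a row
  gold1[i] <- ifelse(is.na(gold1[i]), gold1[i-1], gold1[i])
}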
plot(gold1, xlab = "Days", ylab = "US$", main = "Gold prices 1/1/85 - 3/31/89")
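The previous-value imputation done by the loop above can also be written without a loop. The following is only a minimal sketch on a toy vector: `locf()` is a hypothetical helper name introduced here for illustration (it is not part of forecast or markovchain), and the prices are made up.

```r
# locf() is a hypothetical helper, not a function from any package.
# It carries the last observed value forward over runs of NAs.
locf <- function(x) {
  # Index of each non-missing value, 0 where the value is missing
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  # A leading NA has no earlier value to copy, so it stays NA
  idx[idx == 0] <- NA
  x[idx]
}

prices <- c(300, NA, NA, 305, NA, 310)  # made-up prices, not the gold data
filled <- locf(prices)
print(filled)  # 300 300 300 305 305 310
```

Unlike the loop, this handles runs of any length in one pass, but subsetting returns a plain vector, so the time series attributes of an object like gold would need to be restored before plotting it as a series.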
This is an interesting series with over 1,000 points, but it is definitely not stationary, so it is not a good candidate for modeling as a Markov Chain. The series produced by taking first differences is more reasonable. It is flat, oscillating about a mean (0.07) slightly above zero, and the autocorrelation trails off as one might expect for a stationary series.
# Take first differences to try and get stationary series
goldDiff <- diff(gold1)
par(mfrow = c(1,2))
plot(goldDiff, ylab = "", main = "1st differences of gold")
acf(goldDiff)
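To see why differencing helps, here is a small sketch on simulated data rather than the gold series. It assumes a random walk whose steps are normal with mean 0.07 (the mean is borrowed from the text above; the seed, length, and unit variance are arbitrary choices). The level series wanders, while diff() recovers the underlying steps, whose lag-1 autocorrelation should be small.

```r
set.seed(42)                     # arbitrary seed, for reproducibility only
n <- 1000
steps <- rnorm(n, mean = 0.07)   # simulated daily changes (not gold data)
walk <- cumsum(steps)            # non-stationary level series

d <- diff(walk)                  # first differences: walk[t] - walk[t-1]
manual <- walk[-1] - walk[-n]    # the same subtraction written out by hand

# acf() returns lag 0 in element 1, so element 2 is the lag-1 autocorrelation
lag1 <- acf(d, plot = FALSE)$acf[2]
```

Here d and manual agree, and d has one fewer point than walk, which is why goldDiff is one observation shorter than gold1.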
Next, we set up for modeling by constructing a series of labels.
Source: http://revolutionanalytics.com