“order is essentially the arrival of redundancy in a system, a reduction of possibilities”
— Von Foerster (2003)
In this course we will not discuss the type of linear time series models known as autoregressive models (e.g. AR, ARMA, ARIMA, ARFIMA), summarised on the Wikipedia page on time series. There are many extensions to these linear models; check the CRAN Task View on Time Series Analysis to learn more (e.g. about the packages zoo and forecast). We will in fact be discussing many of the methods listed in a book the Wiki page refers to for 'Further references on non-linear time series analysis': Nonlinear Time Series Analysis by Kantz & Schreiber. You do not need to buy the book, but it can be a helpful reference if you want to go beyond the formal level (= mathematics) used in this course. Some of the packages we use are based on TISEAN, the software accompanying the book, which is written in C and Fortran and can be called from the command line (Windows / Linux).
We are going to analyse three different time series using a variety of techniques. The goal is to describe the dynamics, the temporal patterns (if any), and perhaps other interesting characteristics of these time series.
Importing data
There are two ways to import the data:

A. By downloading series.xlsx, saving it in your R working directory, and importing it using the code below:

library(rio)
series <- rio::import("series.xlsx")

B. By importing it directly from Github: right-click the Download button on Github to copy the url, and pass it to rio::import(url):

library(rio)
series <- rio::import("https://github.com/complexity-methods/CSA-assignments/raw/master/assignment_data/BasicTSA_arma/series.xlsx")
The file series.xlsx contains three time series: TS_1, TS_2 and TS_3.

- Describe the three time series using some common measures of central tendency and dispersion (e.g. mean(), sd(), round()).
- Examine whether there are any trends or level changes in the time series using the function ts_levels(). Look at the examples in the manual entry of this function to see how to plot the level data.
- Do the plots match what you expected based on the summary statistics (e.g. summary(), sd())?
library(casnet)
# TS1
round(mean(series$TS_1),2)
## [1] 0.51
round(sd(series$TS_1),2)
## [1] 0.35
# TS2
round(mean(series$TS_2),2)
## [1] 0.51
round(sd(series$TS_2),2)
## [1] 0.35
# TS3
round(mean(series$TS_3),2)
## [1] 0.51
round(sd(series$TS_3),2)
## [1] 0.35
# Similar means and SDs rounded to 2 decimal points
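Because the means and SDs are nearly identical, any differences between the series must lie in their temporal structure, so the next step is to plot the raw data. A small sketch using plotTS_multi() from casnet (also used further below); plain plot(ts(...)) calls per series would work just as well:

# Sketch: plot the three raw series in a single figure
plotTS_multi(series[, c("TS_1", "TS_2", "TS_3")])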
Examine the trends/levels using ts_levels(). If you think a series contains level changes, adjust the arguments of ts_levels() so it will return this level.

lvl1 <- ts_levels(series$TS_1)
plot(ts(lvl1$pred$y))
lines(lvl1$pred$p, col="red3", lwd=3)
lvl2 <- ts_levels(series$TS_2)
plot(ts(lvl2$pred$y))
lines(lvl2$pred$p, col="red3", lwd=3)
lvl3 <- ts_levels(series$TS_3)
plot(ts(lvl3$pred$y))
lines(lvl3$pred$p, col="red3", lwd=3)
### Time series 2
### You can set 'changeSensitivity' to a stricter value.
lvl2a <- ts_levels(series$TS_2, changeSensitivity = .1)
plot(ts(lvl2a$pred$y))
lines(lvl2a$pred$p, col="red3", lwd=3)
Correlation functions are intuitive tools for quantifying the temporal structure in a time series. As you know, the correlation measure can only quantify linear regularities between variables, which is why we discuss them here as basic tools for time series analysis. So what are the variables? In the simplest case, the two variables between which we calculate a correlation are the value of the time series at time t and the value separated from it in time by some lag. For example, if you calculate the correlation in a lag-1 return plot, you have calculated the 1st value of the correlation function (actually, it is the 2nd value; the 1st value is the correlation of the time series with itself, the lag-0 correlation, which is of course \(r = 1\)).
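As a minimal sketch (not part of the assignment), the lag-1 value of the autocorrelation function can be reproduced 'by hand' as the correlation between the series and a copy of itself shifted by one observation. The two numbers will agree closely but not exactly, because acf() normalises using the overall mean and N:

# Sketch: lag-1 autocorrelation 'by hand' versus acf()
y <- series$TS_1
r_byhand <- cor(y[-length(y)], y[-1])                 # correlate y(t) with y(t+1)
r_acf    <- acf(y, lag.max = 1, plot = FALSE)$acf[2]  # lag-1 value from acf()
round(c(byhand = r_byhand, acf = r_acf), 3)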
You can do the analyses in SPSS, R or JAMOVI. This analysis is very common, so you will find functions called acf, pacf and ccf in many other statistical software packages. In package casnet you can use the function plotRED_acf() (plot REDundancies).
Examine the autocorrelation function and partial autocorrelation function of the three time series (use acf(), pacf(), and try plotRED_acf()).
library(casnet)
# TS1
acf(series$TS_1, lag.max = 50)
acf(series$TS_1, lag.max = 50, type = "partial")
plotRED_acf(y = series$TS_1, Lmax = 50, returnCorFun = FALSE)

# TS2
acf(series$TS_2, lag.max = 50)
acf(series$TS_2, lag.max = 50, type = "partial")
plotRED_acf(y = series$TS_2, Lmax = 50, returnCorFun = FALSE)

# TS3
acf(series$TS_3, lag.max = 50)
acf(series$TS_3, lag.max = 50, type = "partial")
plotRED_acf(y = series$TS_3, Lmax = 50, returnCorFun = FALSE)
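The cross-correlation function ccf() mentioned above works in exactly the same way, but between two different series. A minimal sketch (not one of the assignment questions):

# Sketch: cross-correlation between TS_1 and TS_2 at lags -50 ... 50
ccf(series$TS_1, series$TS_2, lag.max = 50)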
Many non-linear analyses can be considered "descriptive" techniques; that is, the aim is not to fit the parameters of a model, but to describe, quantitatively, some aspects of how one value changes into another value over time.

A qualitative description of the fractal dimension of a time series (or 1D curve) can be given by deciding whether the curve looks/behaves more like a line or more like a plane. As can be seen in the figure below, if slow processes (low frequencies) dominate the signal, it is more line-like and will have a fractal dimension closer to 1. If fast processes (high frequencies) dominate the signal, it is more plane-like and will have a fractal dimension closer to 2.
library(casnet)
library(plyr)
N <- 512
noises <- round(seq(-3, 3, by = .5), 1)
yy <- llply(noises, function(a){cbind(noise_powerlaw(alpha = a, N = N, seed = 1234))})
names(yy) <- noises
tmp <- data.frame(yy, check.names = FALSE)
plotTS_multi(tmp, ylabel = "Scaling exponent alpha")
Relative Roughness is calculated using the following formula:
\[\begin{equation} RR = 2\left[1 - \frac{\gamma_1(x_i)}{Var(x_i)}\right] \tag{2.1} \end{equation}\]
The numerator of the fraction in the formula stands for the lag-1 auto-covariance of the time series \(x_i\); this is the unstandardised lag-1 autocorrelation. Check the function acf() to figure out how to get it. The denominator stands for the (global) variance of \(x_i\), which all statistics packages can calculate. Another way to describe the variance is: the lag-0 auto-covariance.
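A minimal sketch of how acf() exposes both quantities: with type = "covariance" the first two values it returns are the lag-0 autocovariance (the variance) and the lag-1 autocovariance, which is all Equation (2.1) needs:

# Sketch: Relative Roughness 'by hand' for TS_1
covs   <- acf(series$TS_1, lag.max = 1, type = "covariance", plot = FALSE)
gamma0 <- covs$acf[1]   # lag-0 autocovariance = global variance
gamma1 <- covs$acf[2]   # lag-1 autocovariance
2 * (1 - gamma1 / gamma0)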
Calculate the Relative Roughness of the three time series. Hint: check the manual entry of acf(), in particular the argument type. There is also a function in casnet that will do everything for you: fd_RR(). Use Figure 2.1 to look up which value of \(RR\) corresponds to which type of dynamics.
To see how to calculate RR 'by hand', you could look at the code of the function fd_RR(). If you select the function name in R and press F2, you should be able to see the code, or you can just type fd_RR (without parentheses):

fd_RR
## function(y){
## # lag.max = n gives autocovariance of lags 0 ... n,
## VAR <- stats::acf(y, lag.max = 1, type = 'covariance', plot=FALSE)
## # RR formula
## RelR <- 2*(1-VAR$acf[2] / VAR$acf[1])
## # Add some attributes to the output
## attributes(RelR) <- list(localAutoCoVariance = VAR$acf[2], globalAutoCoVariance = VAR$acf[1])
##
## return(RelR)
## }
## <bytecode: 0x7ff395fcae70>
## <environment: namespace:casnet>
# TS1
fd_RR(series$TS_1)
## [1] 2.101952
## attr(,"localAutoCoVariance")
## [1] -0.006144102
## attr(,"globalAutoCoVariance")
## [1] 0.1205297
# TS2
fd_RR(series$TS_2)
## [1] 0.3564002
## attr(,"localAutoCoVariance")
## [1] 0.09851098
## attr(,"globalAutoCoVariance")
## [1] 0.1198722
# TS3
fd_RR(series$TS_3)
## [1] 2.054631
## attr(,"localAutoCoVariance")
## [1] -0.003322233
## attr(,"globalAutoCoVariance")
## [1] 0.1216251
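To connect these values to Figure 2.1, the hedged sketch below computes RR for noise series with a known temporal structure, generated with noise_powerlaw() exactly as in the figure code above (assuming alpha is the spectral slope: negative values mean slow processes dominate, positive values mean fast processes dominate). The reference values belong to Figure 2.1; the sketch only shows the direction in which RR moves:

# Sketch: RR for generated noise with known temporal structure
# alpha =  0 -> white noise: lag-1 autocovariance near 0, so RR close to 2
# alpha = -2 -> brown noise: strong positive lag-1 autocovariance, RR well below 2
# alpha =  2 -> violet noise: negative lag-1 autocovariance, RR above 2
alphas <- c(white = 0, brown = -2, violet = 2)
round(sapply(alphas, function(a){
  as.numeric(fd_RR(noise_powerlaw(alpha = a, N = 512, seed = 1234)))
}), 2)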
Use the sample_entropy() function in package pracma.

- Calculate the Sample Entropy of the three time series using an embedding dimension edim of 3 data points and a tolerance range r equal to the sd() of the series (1*sd(ts)).
- Do the outcomes change if you change the edim parameter? If so, how does the outcome change, and why?
- Do changes in the SampEn value due to different parameter settings change the relative order of the entropy for the three time series?

library(pracma)
sample_entropy(series$TS_1, edim = 3, r = sd(series$TS_1))
## [1] 0.6525828
sample_entropy(series$TS_2, edim = 3, r = sd(series$TS_2))
## [1] 0.1958882
sample_entropy(series$TS_3, edim = 3, r = sd(series$TS_3))
## [1] 0.5281159
# Change the edim parameter.
sample_entropy(series$TS_1, edim = 6, r = sd(series$TS_1))
## [1] 0.6760045
sample_entropy(series$TS_2, edim = 6, r = sd(series$TS_2))
## [1] 0.1678013
sample_entropy(series$TS_3, edim = 6, r = sd(series$TS_3))
## [1] 0.5276779
# Change the r parameter.
# This affects the relative order.
sample_entropy(series$TS_1, edim = 3, r = .2*sd(series$TS_1))
## [1] 2.202697
sample_entropy(series$TS_2, edim = 3, r = .2*sd(series$TS_2))
## [1] 1.519537
sample_entropy(series$TS_3, edim = 3, r = .2*sd(series$TS_3))
## [1] 0.6279321
Changing edim keeps the relative order of the three series; changing r for the same edim does not.
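To see this more systematically, a small optional sketch that tabulates SampEn for each series at a few tolerance values, with edim fixed at 3:

# Sketch: SampEn per series for several tolerance ranges, expressed in SD units
radii <- c(0.2, 0.5, 1.0)
out <- sapply(series[, c("TS_1", "TS_2", "TS_3")], function(y){
  sapply(radii, function(f) sample_entropy(y, edim = 3, r = f * sd(y)))
})
rownames(out) <- paste0("r = ", radii, " * sd")
round(out, 2)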
On Day 1 we looked at return plots of simulated time series and discussed what kinds of shapes to expect based on the type of change process that generated the data.
library(dplyr)
plot(dplyr::lag(series$TS_1), series$TS_1, pch = ".")
plot(dplyr::lag(series$TS_2), series$TS_2, pch = ".")
plot(dplyr::lag(series$TS_3), series$TS_3, pch = ".")
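As a small optional sketch, the three return plots can also be placed side by side with square panels and matching axis limits per panel, which makes their shapes easier to compare:

# Sketch: lag-1 return plots for all three series in one row
op <- par(mfrow = c(1, 3), pty = "s")
for(nm in c("TS_1", "TS_2", "TS_3")){
  y <- series[[nm]]
  plot(dplyr::lag(y), y, pch = ".", main = nm,
       xlab = "y(t-1)", ylab = "y(t)", xlim = range(y), ylim = range(y))
}
par(op)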