There are several packages in R
you could use to do (C)RQA
analyses, but you can also find a Matlab
toolbox for (C)RQA
here: CRP toolbox, there are Python
libraries as well, see this page for an overview of software. We will use the functions in package casnet
.
crqa()
was mainly designed to run categorical Cross-Recurrence Quantification Analysis (see Coco & Dale (2014) and for R code see appendices in Coco & Dale (2013)). The article is a good reference work for details on how to conduct RQA analysis.
We’ll use thecasnet
:
# Install casnet if necessary: https://github.com/FredHasselman/casnet
# !!Warning!! Very Beta...
# library(devtools)
# devtools::install_github("FredHasselman/casnet")
library(casnet)
library(rio)
Package casnet
has 2 functions that will perform auto- and cross-recurrence quantification analyses:
rp_cl()
uses a precompiled program that is run from the command line (see command line recurrence plot tools). It will take as input one or two timeseries and construct a recurrence matrix and is very fast compared to other methods. The first time you run the command rp_cl()
the casnet
package will try to download the correct executables for your operating system, look for messages in the console to find out if the script succeeded. The output of rp_cl()
will be similar to the CRP toolbox and Python
libraries by Norbert Marwan.
rp()
and rp_measures()
rp_cl()
fails on OS that allows 32 bit programs, this is usually because you do not have enough rights to execute a command line program on your machine. In that case, figure out how to get those rights, or use rp()
and rp_measures()
.rp_measures()
assumes you have already contructed a recurrence matrix using function rp()
. The output of rp_measures()
contains more measures than rp_cl()
some of which are still experimental.Let’s start with a small time series… open and/or download this Google sheet: https://docs.google.com/spreadsheets/d/1k4FYDbszfCReCM5_wcw6NIEgSFtTPIy3qmf-kFiZjy8/edit?usp=sharing
The spreadsheet displays a short categorical time series of 5 observed categories labelled A
through E
.
Below the time series a matrix is displayed with zeroes in the upper triangle. The columns and rows of the matrix are labelled according to the categorical time series y in B2:B13
, starting in the lower left corner. The Line Of Incidence (LOI) is the diagonal of the matrix going from the lower left corner to the upper right corner. Like the diagonal of a correlation matrix, the LOI in Auto-RQA will contain only recurrent points (1
), because that’s where the column labels will be identical to the row labels.
Because this is Auto-RQA and we are comparing the time series to itself, the recurrence matrix will be symmetrical, so we only have to look at the upper triangle.
0
into a 1
C29
(A) and look upwards (C27
to C17
). If you see the value A recurring, change the 0
into a 1
D29
and find recurrences of the value B in the future (in D26
to D17
)1
s, the recurrent points (RN
) and put the value in cell B32
(Total)B33
and B34
respectively.Q17
to Q20
V22
V21
Q24
to Q27
V27
V21
X17
?RR
)L
, or MEAN_dl
)L_max
, or MAX_dl
)L_entr
, or ENT_dl
)X17
is the size of the matrix (12*12) divided by 2 (because we are only looking at the upper triangle), minus the length of the diagonal, because these aren’t actually recurrent points.y
is randomised?
RR
) - No change, none of the values are changed, so the number of values that will recur is the unchanged.L
, or MEAN_dl
) - Likely to be lower, shuffling will likely break up line structures, but if there are just a few categories and/or the time series is short, it could be the same or higher.L_max
, or MAX_dl
) - Very likely to be lower, shuffling will likely break up line structures, but if there are just a few categories and/or the time series is short, it could be the same or higher.L_entr
, or ENT_dl
) - - Difficult to predict, shuffling will break up line structures and could increase entropy (more different line lengths), but it is also possible that ther will be only lines of length 2 or 3 left after randomisation, this will be a very homogeneous distribution which has a high entropy. This is more likely if there are just a few categories and/or the time series is short.casnet
comes with a vignette demonstrating the usage of rp_cl()
, see https://fredhasselman.github.io/casnet/articles/cl_RQA.html, or the manual pages.
Here is a great video summary of the phase space reconstruction technique: https://youtu.be/6i57udsPKms
R-packages for Phase Space Reconstruction
Package casnet
relies on packages fractal
and rgl
to reconstruct phase space. Package rgl
is used for plotting and can cause problems on some windows machines. On windows You’ll need to check if you have the X Window System for interactive 3D plotting. This Linux desktop system comes installed in some way or form on most Mac and Windows systems. You can test if it is present by running rgl::open3d()
in R
, which will try to open an interactive plotting device.
plot3D
which is not interactive function
Package fractal
includes the 3 dimensions of the Lorenz system in the chaotic regime. Run this code rgl::plot3d(lorenz,type="l")
with all three packages loaded to get an interactive 3D plot of this strange attractor.
We’ll reconstruct the attractor based on just dimension X
of the system using the data from package fractal
, and functions from package casnet
. Don’t forget: always look at the manual pages of a function you are running.
fractal
and package nonlinearTseries
use functions with similar names, perhapp better to not load them together, or always use fractal::
and nonlinearTseries::
if you want to call the functions direclty.
lx <- lorenz[1:1024,1]
to reconstruct the phase space based on lx
.
est_parameters()
.nnThres = .01
. This is the proportion of remaining nearest neighbours below which we consider the number of dimensions optimal. Because this time series was generated by a simulation of a system, we can use a much lower threshold. For real data, which is much noisier, you can use 0.1
or 0.2
.1024
, but use e.g. 512
.This est_parameters()
function will call several other functions (e.g. fractal::timeLag()
with method = "mutual"
and nonlinearTseries::findAllNeighbours()
for the false nearest neighbourhood search). Look at the manual page, you can run the est_parameters()
function with the defaults settings, it will show a diagnostic plot. Because we have to search over a range of parameters it can take a while to get the results…
casnet
functions this is the argument emLag
). Remember, this is just an optimisation choice. Sometimes there is no first minimum, or no global minimum.ts_embed()
(you can also use fractal::embedSeries()
).rgl::plot3d()
to plot the reconstructed space. Plot the reconstructed phase space. (You’ll need to use as.matrix()
on the object created by fractal::embedSeries()
)lx <- lorenz[1:1024,1]
to reconstruct the phase space based on lx
.
est_parameters()
.## Loading required package: splus2R
## Loading required package: ifultools
library(invctr)
library(casnet)
# The X data
lx <- lorenz[1:1024,1]
# Search for the parameters,
params <- est_parameters(lx)
# For X
lx_emb <- ts_embed(y=lx,emLag = emLag, emDim = emDim)
rgl::plot3d(lx_emb, type="l")
# For Y
ly <- lorenz[1:1024,2]
# Search for the parameters,
params <- est_parameters(ly)
Perform an RQA on the reconstructed state space of the Lorenz system. You’ll need a radius (also called: threshold, or \(\epsilon\)) in order to decide which points are close together (recurrent). You have already seen casnet
provides a function which will automatically select the best parameter settings for phase space reconstruction: est_parameters()
.
First get a radius that gives you a fixed recurrence rate use est_radius()
look at the manual pages. The function rp()
will also call est_radius
if you provide emRad = NA
.
Best way to ensure you are using the same parameters in each function is to create some lists with parameter settings (check the rp_cl()
or rp()
manual to figure out what these parameters mean)
library(fractal)
library(casnet)
# Lorenz X
lx <- lorenz[1:1024,1]
emLag = 17
emDim = 3
(emRad <- est_radius(y1 = lx, emLag = emLag, emDim = emDim)$Radius)
##
## Auto-recurrence: Setting diagonal to (1 + max. distance) for analyses
## lower and upper are both 0 (no band, just diagonal)
## using: diag(mat) <- 37.9296...
##
## Searching for a radius that will yield 0.05 for RR
##
## Converged! Found an appropriate radius...
## [1] 3.718435
# RQA analysis
#(out <- rp_cl(lx,emDim = emDim, emLag = emLag, emRad = emRad))
RMx <- rp(y1 = lx, emDim = emDim, emLag = emLag, emRad = emRad)
out <- rp_measures(RMx)
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
We can plot the recurrence matrix…
library(casnet)
# Unthresholded matrix (no radius applied)
RM <- rp(y1 = lx, emDim = emDim, emLag = emLag)
# plot it
rp_plot(RM)
# Thresholded by the radius
RMth <- di2bi(RM,emRad = emRad)
# You cam also call rp wth emRad = NA
RMth <- rp(y1 = lx, emDim = emDim, emLag = emLag, emRad = NA)
##
## Auto-recurrence: Setting diagonal to (1 + max. distance) for analyses
## lower and upper are both 0 (no band, just diagonal)
## using: diag(mat) <- 37.9296...
##
## Searching for a radius that will yield 0.05 for RR
##
## Converged! Found an appropriate radius...
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
RQA promises to be able to distinguish between different dynamical regimes, for example dynamics due to different parameter settings of the logistic map, see e.g. https://en.wikipedia.org/wiki/Recurrence_quantification_analysis#Time-dependent_RQA
RR = 0.05
(either by looking at the color bar or by estimating with a function) and plot the recurrence plot.# Unthresholded
RM1 <- rp(y1 = TS1, emDim = emDim, emLag = emLag)
rp_plot(RM1, plotDimensions = TRUE)
##
## Auto-recurrence: Setting diagonal to (1 + max. distance) for analyses
## lower and upper are both 0 (no band, just diagonal)
## using: diag(mat) <- 3.6818...
##
## Searching for a radius that will yield 0.05 for RR
##
## Converged! Found an appropriate radius...
# Unthresholded
RM3 <- rp(y1 = TS3, emDim = emDim, emLag = emLag)
rp_plot(RM3, plotDimensions = TRUE)
##
## Auto-recurrence: Setting diagonal to (1 + max. distance) for analyses
## lower and upper are both 0 (no band, just diagonal)
## using: diag(mat) <- 2.9897...
##
## Searching for a radius that will yield 0.05 for RR
##
## Converged! Found an appropriate radius...
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
##
## ~~~o~~o~~casnet~~o~~o~~~
##
## Global Measures
## Global Max.rec.points N.rec.points Recurrence.Rate Singular.points
## 1 Recurrence Matrix 252506 12626 0.05000277 11774
## Divergence Repetitiveness Anisotropy
## 1 0.1 2.096244 1
##
##
## Line-based Measures
## Line.based N.lines N.points.on.lines Measure Rate Mean Max
## 1 Diagonal 288 852 Determinism 0.0674798 2.958333 10
## 2 Vertical 764 1786 V Laminarity 0.1414541 2.337696 8
## 3 Horizontal 764 1786 H Laminarity 0.1414541 2.337696 8
## Entropy.of.lengths Relative.entropy CoV.of.lengths
## 1 1.3338255 0.2144897 0.4423578
## 2 0.7315297 0.1176358 0.3339008
## 3 0.7315297 0.1176358 0.3339008
##
## ~~~o~~o~~casnet~~o~~o~~~
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
## <sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
##
## ~~~o~~o~~casnet~~o~~o~~~
##
## Global Measures
## Global Max.rec.points N.rec.points Recurrence.Rate Singular.points
## 1 Recurrence Matrix 252506 12626 0.05000277 9004
## Divergence Repetitiveness Anisotropy
## 1 0.04545455 0.2611817 1
##
##
## Line-based Measures
## Line.based N.lines N.points.on.lines Measure Rate Mean Max
## 1 Diagonal 982 3622 Determinism 0.28686837 3.688391 22
## 2 Vertical 466 946 V Laminarity 0.07492476 2.030043 3
## 3 Horizontal 466 946 H Laminarity 0.07492476 2.030043 3
## Entropy.of.lengths Relative.entropy CoV.of.lengths
## 1 1.7511245 0.28159465 0.59913633
## 2 0.1348913 0.02169159 0.08417993
## 3 0.1348913 0.02169159 0.08417993
##
## ~~~o~~o~~casnet~~o~~o~~~
We’ll analyse a time series recorded from a subject who is tracing a circle using a computer mouse: circle tracing data.
# Reload the data
library(rio)
mouseXY <- rio::import("https://raw.githubusercontent.com/FredHasselman/The-Complex-Systems-Approach-Book/master/assignments/assignment_data/RQA_circletrace/mouse_circle_xy.csv")
fractal
contains a function surrogate
. This will create so-called constrained realisations of the time series. You can look at the rqa
vignette in package casnet
(also available on Github) Look at the help pages of the function, or study the Surrogates Manual of the TISEAN software and create two surrogate series, one based on phase
and one on aaft
.
“For a minimal significance requirement of 95% , we thus need at least 19 or 39 surrogate time series for one- and two-sided tests, respectively. The conditions for rank based tests with more samples can be easily worked out. Using more surrogates can increase the discrimination power.”
Have a look at the package rEDM
, you can use it to do a convergent cross mapping analysis