Package 'texmex'

Title:	Statistical Modelling of Extreme Values
Description:	Statistical extreme value modelling of threshold excesses, maxima and multivariate extremes. Univariate models for threshold excesses and maxima are the Generalised Pareto, and Generalised Extreme Value model respectively. These models may be fitted by using maximum (optionally penalised-)likelihood, or Bayesian estimation, and both classes of models may be fitted with covariates in any/all model parameters. Model diagnostics support the fitting process. Graphical output for visualising fitted models and return level estimates is provided. For serially dependent sequences, the intervals declustering algorithm of Ferro and Segers (2003) <doi:10.1111/1467-9868.00401> is provided, with diagnostic support to aid selection of threshold and declustering horizon. Multivariate modelling is performed via the conditional approach of Heffernan and Tawn (2004) <doi:10.1111/j.1467-9868.2004.02050.x>, with graphical tools for threshold selection and to diagnose estimation convergence.
Authors:	Harry Southworth [aut, cre], Janet E. Heffernan [aut], Paul D. Metcalfe [aut], Yiannis Papastathopoulos [ctb], Alec Stephenson [ctb], Stuart Coles [ctb]
Maintainer:	Harry Southworth <[email protected]>
License:	GPL (>=2)
Version:	2.4.9
Built:	2025-03-19 05:18:17 UTC
Source:	https://github.com/harrysouthworth/texmex

Help Index

Extreme value modelling
Accurately compute (exp(x) - 1) / x
Accurately compute log(1-exp(x))
Accurately compute log(1 + x) / x
Compute pmax(x y, -1) in such a way that zeros in x beat infinities in y.
Annotate a threshold selection ggplot
Information Criteria
Bootstrap a conditional multivariate extreme values model
Measures of extremal dependence
Calculate the copula of a matrix of variables
Cross-validation for a model object
Cross-validation for the shape parameter in an extreme values model
Density, cumulative density, quantiles and random number generation for the extended generalized Pareto distribution 3
Density, cumulative density, quantiles and random number generation for the generalized extreme value distribution
Density, cumulative density, quantiles and random number generation for the generalized Pareto distribution
The Gumbel distribution
Compute empirical distribution function
Estimate the EGP3 distribution power parameter over a range of thresholds
Calculate upper end point for a fitted extreme value model
Extreme value modelling
Bootstrap an evmOpt fit
MCMC simulation around an evmOpt fit
Set the seed from a fitted evmSim object.
Extremal index estimation and automatic declustering
Fancy plotting for copulas
Diagnostic plots for an declustered object
Diagnostic plots for the replicate estimated parameter values in an evmBoot object
Diagnostic plots for an evm object
Diagnostic plots for the Markov chains in an evmSim object
Conditional multivariate extreme values modelling
Fit multiple independent generalized Pareto models
Plotting function for return level estimation
Profile likelihood based confidence intervals for GPD
Estimate generalized Pareto distribution parameters over a range of values
Joint exceedance curves
Liver related laboratory data
Log-likelihood for evmOpt objects
Provide full marginal reference distribution for for maringal transformation
Multivariate conditional Spearman's rho
Estimate the dependence parameters in a conditional multivariate extreme values model
Simulation from dependence models
Estimate dependence parameters in a conditional multivariate extreme values model over a range of thresholds.
Change values of parameters in a migpd object
Mean residual life plot
Plot copulas
Plots for evmOpt objects
Plots for evmSim objects
Predict return levels from extreme value models, or obtain the linear predictors.
Return levels
Print evmOpt objects
Rain, wavesurge, portpirie and nidd datasets.
Extreme Value random process generation.
Generalized logistic distribution
Extreme Value random process generation.
Simulate from a fitted evm object
Air pollution data, separately for summer and winter months
Create families of distributions
Process Metropolis output from extreme value model fitting to discard unwanted observations.

Extreme value modelling

Description

Extreme values modelling, including the conditional multivariate approach of Heffernan and Tawn (2004).

Details

The package was originally called ‘texmex’ for Threshold EXceedances and Multivariate EXtremes. However, it is no longer the case that only threshold excess models are implemented, so the ‘tex’ bit doesn't make sense. So, the package is called ‘texmex’ because it used to be called ‘texmex’.

evm: Fit extreme value distributions to data, possibly with covariates. Use maximum likelihood estimation, maximum penalized likelihood estimation, simulate from the posterior distribution or run a parametric bootstrap. Extreme value families include the generalized Pareto distribution (gpd) and generalized extreme value (gev) distribution.

mex: Fit multiple, independent generalized Pareto models to the the upper tails of the columns of a data set, and estimate the conditional dependence structure between the columns using the method of Heffernan and Tawn.

bootmex: Bootstrap estimation for parameters in generalized Pareto models and in the dependence structure.

declust: Estimation of extremal index and subsequent declustering of dependent sequences using the intervals estimator of Ferro and Segers.

Author(s)

Harry Southworth, Janet E. Heffernan, Paul D. Metcalfe

Maintainer: Harry Southworth <[email protected]>

URL: https://github.com/harrysouthworth/texmex

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004.

C.A.T Ferro and J. Segers, Inference for Clusters of Extreme Values, Journal of the Royal Statistical society B, 65, 545 – 556, 2003.

Examples


# Analyse the winter data used by Heffernan and Tawn
mymex <- mex(winter, mqu = .7, penalty="none", dqu=.7, which = "NO")
plot(mymex)
# Only do 10 replicates to keep CRAN checks happy. Do many more in any
# real application
myboot <- bootmex(mymex, R=10)
plot(myboot)
mypred <- predict(myboot,  pqu=.95)
summary(mypred , probs = c( .025, .5, .975 ))

# Analyse the liver data included in the package
library(MASS) # For the rlm function

liver <- liver[liver$ALP.M > 1,] # Get rid of outlier
liver$ndose <- as.numeric(liver$dose)

alt <- resid(rlm(log(ALT.M) ~ log(ALT.B) + ndose, data=liver, method="MM"))
ast <- resid(rlm(log(AST.M) ~ log(AST.B) + ndose, data=liver, method="MM"))
alp <- resid(rlm(log(ALP.M) ~ log(ALP.B) + ndose, data=liver, method="MM"))
tbl <- resid(rlm(log(TBL.M) ~ log(TBL.B) + ndose, data=liver, method="MM"))

r <- data.frame(alt=alt, ast=ast, alp=alp, tbl=tbl)

Amex <- mex(r[liver$dose == "A",], mqu=.7)
Bmex <- mex(r[liver$dose == "B",], mqu=.7)
Cmex <- mex(r[liver$dose == "C",], mqu=.7)
Dmex <- mex(r[liver$dose == "D",], mqu=.7)

par(mfcol=c(3,3))
plot(Amex)

plot(Dmex, col="blue")

## Take a closer look at the marginal behaviour of ALT

r$ndose <- liver$ndose

altmod1 <- evm(alt, qu=.7, phi = ~ ndose, xi = ~ ndose, data=r)
altmod2 <- evm(alt, qu=.7, phi = ~ ndose, data=r)
altmod3 <- evm(alt, qu=.7, xi = ~ ndose, data=r)
altmod4 <- evm(alt, qu=.7, data=r)

# Prefer model 3, with term for xi on basis of AIC

balt3 <- evm(alt, qu=.7, xi = ~ ndose, data=r, method="simulate")
par(mfrow=c(3,3))
plot(balt3)

# use longer burn-in and also thin the output

balt3 <- thinAndBurn(balt3,burn=1000,thin=5)
plot(balt3)

# Get some simulated values for dose D

DParam <- predict(balt3,type="lp",newdata=data.frame(ndose=4),all=TRUE)$obj$link[[1]]

simD <- rgpd(nrow(DParam), sigma=exp(DParam[,"phi"]), xi=DParam[,"xi"], u=quantile(alt, .7))

# These are simulated residuals. Get some baselines and transform all
# to raw scale

b <- sample(log(liver$ALT.M), size=nrow(balt3$param), replace=TRUE)
res <- exp(b + simD)

# estimate quantiles on raw scale
quantile(res, prob=c(.5, .75, .9, .95, .99))

# estimate proportion exceeding 3*upper limit of normal mean(res >
# 36 * 3) # 36 is the upper limit of normal for ALT

# Analyse the winter data used by Heffernan and Tawn
mymex <- mex(winter, mqu = .7, penalty="none", dqu=.7, which = "NO")
plot(mymex)
# Only do 10 replicates to keep CRAN checks happy. Do many more in any
# real application
myboot <- bootmex(mymex, R=10)
plot(myboot)
mypred <- predict(myboot,  pqu=.95)
summary(mypred , probs = c( .025, .5, .975 ))

# Analyse the liver data included in the package
library(MASS) # For the rlm function

liver <- liver[liver$ALP.M > 1,] # Get rid of outlier
liver$ndose <- as.numeric(liver$dose)

alt <- resid(rlm(log(ALT.M) ~ log(ALT.B) + ndose, data=liver, method="MM"))
ast <- resid(rlm(log(AST.M) ~ log(AST.B) + ndose, data=liver, method="MM"))
alp <- resid(rlm(log(ALP.M) ~ log(ALP.B) + ndose, data=liver, method="MM"))
tbl <- resid(rlm(log(TBL.M) ~ log(TBL.B) + ndose, data=liver, method="MM"))

r <- data.frame(alt=alt, ast=ast, alp=alp, tbl=tbl)

Amex <- mex(r[liver$dose == "A",], mqu=.7)
Bmex <- mex(r[liver$dose == "B",], mqu=.7)
Cmex <- mex(r[liver$dose == "C",], mqu=.7)
Dmex <- mex(r[liver$dose == "D",], mqu=.7)

par(mfcol=c(3,3))
plot(Amex)

plot(Dmex, col="blue")

## Take a closer look at the marginal behaviour of ALT

r$ndose <- liver$ndose

altmod1 <- evm(alt, qu=.7, phi = ~ ndose, xi = ~ ndose, data=r)
altmod2 <- evm(alt, qu=.7, phi = ~ ndose, data=r)
altmod3 <- evm(alt, qu=.7, xi = ~ ndose, data=r)
altmod4 <- evm(alt, qu=.7, data=r)

# Prefer model 3, with term for xi on basis of AIC

balt3 <- evm(alt, qu=.7, xi = ~ ndose, data=r, method="simulate")
par(mfrow=c(3,3))
plot(balt3)

# use longer burn-in and also thin the output

balt3 <- thinAndBurn(balt3,burn=1000,thin=5)
plot(balt3)

# Get some simulated values for dose D

DParam <- predict(balt3,type="lp",newdata=data.frame(ndose=4),all=TRUE)$obj$link[[1]]

simD <- rgpd(nrow(DParam), sigma=exp(DParam[,"phi"]), xi=DParam[,"xi"], u=quantile(alt, .7))

# These are simulated residuals. Get some baselines and transform all
# to raw scale

b <- sample(log(liver$ALT.M), size=nrow(balt3$param), replace=TRUE)
res <- exp(b + simD)

# estimate quantiles on raw scale
quantile(res, prob=c(.5, .75, .9, .95, .99))

# estimate proportion exceeding 3*upper limit of normal mean(res >
# 36 * 3) # 36 is the upper limit of normal for ALT

Accurately compute (exp(x) - 1) / x

Description

Accurately compute (exp(x) - 1) / x

Usage

.exprel(x)
.exprel(x)

Arguments

`x`	numeric vector

Value

numeric vector

Accurately compute log(1-exp(x))

Description

Accurately compute log(1-exp(x))

Usage

.log1mexp(x)
.log1mexp(x)

Arguments

`x`	numeric vector

Value

a numeric vector

Accurately compute log(1 + x) / x

Description

Accurately compute log(1 + x) / x

Usage

.log1prel(x)
.log1prel(x)

Arguments

`x`	numeric vector

Value

numeric vector

Compute pmax(x y, -1) in such a way that zeros in x beat infinities in y.

Description

This is a common pattern in much of the distribution code, so it's worth factoring out.

Usage

.specfun.safe.product(x, y)
.specfun.safe.product(x, y)

Arguments

`x`	a numeric vector
`y`	a numeric vector

Value

an appropriate numeric vector

Annotate a threshold selection ggplot

Description

Annotate a threshold selection ggplot with the number of exceedances of various thresholds.

Usage

addExcesses(p, x, y, data, textsize)
addExcesses(p, x, y, data, textsize)

Arguments

`p`	An object produced by ggplot
`x`	Horizontal axis data containing the full range.
`y`	Verticle axis data containing the full range.
`data`	The actual data being considered for GPD modelling.
`textsize`	The size of the text in the annotations.

Information Criteria

Description

Compute AIC and (approximate) DIC for evmOpt objects

Usage

## S3 method for class 'evmOpt'
AIC(object, penalized = FALSE, nsamp = 1000, DIC, WAIC, ..., k = 2)
## S3 method for class 'evmOpt'
AIC(object, penalized = FALSE, nsamp = 1000, DIC, WAIC, ..., k = 2)

Arguments

`object`	fit model object
`penalized`	whether to use the penalized log-likelihood
`nsamp`	Number of approximate Gaussian sample to use in computing DIC. Defaults to `nsamp=1e3`. Only used when the object has class 'evmOpt'.
`DIC`	Logical. Whether to compute DIC. Defaults to `DIC = TRUE`. Only applicable to objects of class 'evmSim'.
`WAIC`	Logical. Whether to compute WAIC. Defaults to `WAIC = TRUE`. Only applicable to objects of class 'evmSim'.
`...`	other arguments currently ignored
`k`	numeric, the penalty per parameter to be used; the default `k = 2` is the classical AIC.

Details

If the object has class 'evmOpt', nsamp random draws are made from the Gaussian distribution with mean and covariance inferred from the model object. The result will be an approximate DIC. Note that AIC should not be trusted if priors are not flat. For example, if you use a regularizing prior on xi, say xi ~ N(0, 0.25), AIC can be misleading and DIC should be preferred. If the object has class 'evmSim', the actual posterior draws are used in the computation. Also note that sometimes the optimizer returns an approximatae covariance that is not postive-semidefinite, in which case the DIC will be reported as NA.

Value

The AIC and DIC

Bootstrap a conditional multivariate extreme values model

Description

Bootstrap a conditional multivariate extreme values model following the method of Heffernan and Tawn, 2004.

Usage

bootmex(x, R = 100, nPass=3, trace=10,referenceMargin=NULL)

## S3 method for class 'bootmex'
plot(x, plots = "gpd", main = "", ...)
## S3 method for class 'bootmex'
print(x, ...)
bootmex(x, R = 100, nPass=3, trace=10,referenceMargin=NULL)

## S3 method for class 'bootmex'
plot(x, plots = "gpd", main = "", ...)
## S3 method for class 'bootmex'
print(x, ...)

Arguments

`x`	An object of class "mex" as returned by function `mex`.
`R`	The number of bootstrap runs to perform. Defaults to `R`=100.
`nPass`	An integer. Sometimes, particularly with small samples, the estimation process fails with some bootstrap samples. The function checks which runs fail and takes additional bootstrap samples in an attempt to get parameter estimates. By default, it has nPass=3 attempts at this before giving up.
`trace`	How often to inform the user of progress. Defaults to `trace=10`.
`referenceMargin`	Optional set of reference marginal distributions to use for marginal transformation if the data's own marginal distribution is not appropriate (for instance if only data for which one variable is large is available, the marginal distributions of the other variables will not be represented by the available data). This object can be created from a combination of datasets and fitted GPDs using the function `makeReferenceMarginalDistribution`.
`plots`	What type of diagnostic plots to produce. Defaults to "gpd" in which case gpd parameter estimate plots are produced otherwise plots are made for the dependence parameters.
`main`	Title for plots.
`...`	Further arguments to be passed to methods.

Details

Details of the bootstrap method are given by Heffernan and Tawn (2004). The procedure is semi-parametric.

Firstly, values of all variables are simulated independently from the parametric Gumbel or Laplace distributions (depending on the choice of margins in the original call to mex). The sample size and data dimension match that of the original data set. Then an empirical bootstrap sample is generated from the original data after its transformation to the Gumbel/Laplace scale. Again, sample size and structure match the original data set. The empirical bootstrap samples from each margin are then sorted, and then replaced by their corresponding values from the sorted Gumbel/Laplace samples. This procedure preserves the dependence structure of the empirical bootstrap sample while ensuring the marginal properties of the resulting semi-parametric bootstrap sample are those of the parametric Gumbel/Laplace distribution.

The simulated, ordered Laplace/Gumbel sample is then transformed to the scale of the original data by using the Probability Integral Transform. Values beneath the original thresholds for fitting of the GPD tail models are transformed by using the empirical distribution functions and for values above these thresholds, the fitted GPDs are used. This completes the semi-parametric bootstrap from the data.

Parameter estimation is then carried out as follows: The parameters in the generalized Pareto distributions are estimated by using the bootrap data, these data are then transformed to the Laplace/Gumbel scale using the orginal threshold, their empirical distribution function and these estimated GPD parameters. The variables in the dependence structure of these variables are then estimated.

Note that maximum likelihood estimation will often fail for small samples when the generalized Pareto distribution is being fit. Therefore it will often be useful to use penalized likelihood estimation. The function bootmex does whatever was done in the call to migpd or mex that generated the object with which it is being called.

Also note that sometimes (again, usually with small data sets) all of the simulated Laplace/Gumbel random numbers will be beneath the threshold for the conditioning variable. Such samples are abandoned by bootmex and a new sample is generated. This probably introduces some bias into the resulting bootstrap distributions.

The plot method produces histograms of bootstrap gpd parameters (the default) or scatterplots of dependence parameters with the point estimates for the original data shown.

By design, there is no coef method. The bootstrapping is done to account for uncertainty. It is not obvious that adjusting the parameters for the mean bias is the correct thing to do.

Value

An object of class 'bootmex'. Print and plot functions are available.

Author(s)

Harry Southworth

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004

Examples



mymex <- mex(winter , mqu = .7, dqu = .7, which = "NO")
myboot <- bootmex(mymex)
myboot
plot(myboot,plots="gpd")
plot(myboot,plots="dependence")

mymex <- mex(winter , mqu = .7, dqu = .7, which = "NO")
myboot <- bootmex(mymex)
myboot
plot(myboot,plots="gpd")
plot(myboot,plots="dependence")

Measures of extremal dependence

Description

Compute measures of extremal dependence for 2 variables.

Usage

chi(data, nq = 100, qlim = NULL, alpha = 0.05, trunc = TRUE)

## S3 method for class 'chi'
summary(object, ...)

## S3 method for class 'summary.chi'
print(x, digits=3, ...)

## S3 method for class 'chi'
print(x, ...)

## S3 method for class 'chi'
plot(x, show=c("Chi"=TRUE,"ChiBar"=TRUE), lty=1,
cilty=2, col=1, spcases=TRUE, cicol=1, xlim=c(0, 1), ylimChi =
c(-1, 1), ylimChiBar = c(-1, 1), mainChi = "Chi", mainChiBar =
"Chi Bar", xlab = "Quantile", ylabChi =
expression(chi(u)), ylabChiBar = expression(bar(chi)(u)),
ask, ...)

## S3 method for class 'chi'
ggplot(data=NULL, mapping, xlab = "Quantile", 
ylab=c("ChiBar" = expression(bar(chi)(u)), "Chi" = expression(chi(u))),
main=c("ChiBar" = "Chi Bar",       "Chi" = "Chi"),
xlim = c(0, 1), ylim =list("Chi" = c(-1, 1),"ChiBar" = c(-1, 1)),
ptcol="blue",fill="orange",show=c("ChiBar"=TRUE,"Chi"=TRUE),
spcases = TRUE,plot., ..., environment)
chi(data, nq = 100, qlim = NULL, alpha = 0.05, trunc = TRUE)

## S3 method for class 'chi'
summary(object, ...)

## S3 method for class 'summary.chi'
print(x, digits=3, ...)

## S3 method for class 'chi'
print(x, ...)

## S3 method for class 'chi'
plot(x, show=c("Chi"=TRUE,"ChiBar"=TRUE), lty=1,
cilty=2, col=1, spcases=TRUE, cicol=1, xlim=c(0, 1), ylimChi =
c(-1, 1), ylimChiBar = c(-1, 1), mainChi = "Chi", mainChiBar =
"Chi Bar", xlab = "Quantile", ylabChi =
expression(chi(u)), ylabChiBar = expression(bar(chi)(u)),
ask, ...)

## S3 method for class 'chi'
ggplot(data=NULL, mapping, xlab = "Quantile", 
ylab=c("ChiBar" = expression(bar(chi)(u)), "Chi" = expression(chi(u))),
main=c("ChiBar" = "Chi Bar",       "Chi" = "Chi"),
xlim = c(0, 1), ylim =list("Chi" = c(-1, 1),"ChiBar" = c(-1, 1)),
ptcol="blue",fill="orange",show=c("ChiBar"=TRUE,"Chi"=TRUE),
spcases = TRUE,plot., ..., environment)

Arguments

`data`	A matrix containing 2 numeric columns.
`nq`	The number of quantiles at which to evaluate the dependence measures.
`qlim`	The minimum and maximum quantiles at which to do the evaluation.
`alpha`	The size of the confidence interval to be used. Defaults to `alpha = 0.05`.
`trunc`	Logical flag indicating whether the estimates should be truncated at their theoretical bounds. Defaults to `trunc = TRUE`.
`x`, `object`	An object of class `chi`.
`digits`	Number of digits for printing.
`show`	Logical, of length 2, names "Chi" and "ChiBar". Defaults to `c("Chi" = TRUE, "ChiBar" = TRUE)`.
`lty`, `cilty`, `col`, `cicol`	Line types and colours for the the estimated quantities and their confidence intervals.
`xlim`, `ylimChi`, `ylimChiBar`	Limits for the axes.
`mainChi`, `mainChiBar`	Main titles for the plots.
`xlab`, `ylabChi`, `ylabChiBar`	Axis labels for the plots.
`mapping`, `ylab`, `main`, `ylim`, `ptcol`, `fill`, `environment`	Arguments to ggplot methods.
`spcases`	Whether or not to plot special cases of perfect (positive and negative) dependence and indpenendence. Defaults to `FALSE`.
`plot.`	whether to plot to active graphics device.
`ask`	Whether or not to ask before reusing the graphics device.
`...`	Further arguments to be passed to methods.

Details

Computes the functions chi and chi-bar described by Coles, Heffernan and Tawn (1999). The limiting values of these functions as the quantile approaches 1 give an empirical measure of the type and strength of tail dependendce exhibited by the data.

A limiting value of ChiBar equal to 1 indicates Asymptotic Dependence, in which case the limiting value of Chi gives a measure of the strength of dependence in this class. A limiting value of ChiBar of less than 1 indicates Asymptotic Independence in which case Chi is irrelevant and the limiting value of ChiBar gives a measure of the strength of dependence.

The plot and ggplot methods show the ChiBar and Chi functions. In the case of the confidence interval for ChiBar excluding the value 1 for all of the largest quantiles, the plot of the Chi function is shown in grey.

Value

An object of class chi containing the following.

`chi`	Values of chi and their estimated upper and lower confidence limits.
`chibar`	Values of chibar and their estimated upper and lower confidence limits.
`quantile`	The quantiles at which chi and chi-bar were evaluated.
`chiulb`, `chibarulb`	Upper and lower bounds for chi and chi-bar.

Note

When the data contain ties, the values of chi and chibar are calculated by assigning distinct ranks to tied values using the rank function with argument ties.method = "first". This results in the values of chi and chibar being sensitive to the order in which the tied values appear in the data.

The code is a fairly simple reorganization of code written by Janet E. Heffernan and Alec Stephenson and which appears in the chiplot function in the evd package.

Author(s)

Janet E. Heffernan, Alec Stephenson, Harry Southworth

References

S. Coles, J. E. Heffernan and J. A. Tawn, Dependence measures for extreme values analyses, Extremes, 2, 339 – 365, 1999.

A. G. Stephenson. evd: Extreme Value Distributions, R News, 2, 2002.

Examples



D <- liver[liver$dose == "D",]
chiD <- chi(D[, 5:6])
par(mfrow=c(1,2))
ggplot(chiD)

A <- liver[liver$dose == "A",]
chiA <- chi(A[, 5:6])
# here the limiting value of chi bar(u) lies away from one so the chi plot is
# not relevant and is plotted in grey
ggplot(chiA) 



D <- liver[liver$dose == "D",]
chiD <- chi(D[, 5:6])
par(mfrow=c(1,2))
ggplot(chiD)

A <- liver[liver$dose == "A",]
chiA <- chi(A[, 5:6])
# here the limiting value of chi bar(u) lies away from one so the chi plot is
# not relevant and is plotted in grey
ggplot(chiA)

Calculate the copula of a matrix of variables

Description

Returns the copula of several random variables.

Usage

copula(x, na.last = NA, ...)

## Default S3 method:
copula(x, na.last = NA, ...)

## S3 method for class 'data.frame'
copula(x, na.last = NA, ...)

## S3 method for class 'matrix'
copula(x, na.last = NA, ...)
copula(x, na.last = NA, ...)

## Default S3 method:
copula(x, na.last = NA, ...)

## S3 method for class 'data.frame'
copula(x, na.last = NA, ...)

## S3 method for class 'matrix'
copula(x, na.last = NA, ...)

Arguments

`x`	A matrix or data.frame containing numeric variables.
`na.last`	How to treat missing values. See `rank` for details.
`...`	further arguments

Details

The result is obtained by applying edf to each column of x in turn.

Print and plot methods are available for the copula class.

Value

A matrix with the same dimensions as x, each column of which contains the quantiles of each column of x. This object is of class copula.

Methods (by class)

copula(default): default method
copula(data.frame): data frame method
copula(matrix): matrix method

Author(s)

Harry Southworth

Examples


  D <- liver[liver$dose == "D",]
  Dco <- copula(D)
  plot(Dco)

D <- liver[liver$dose == "D",]
  Dco <- copula(D)
  plot(Dco)

Cross-validation for a model object

Description

Cross-validation for a model object

Usage

cv(object, folds = 10, ...)

## S3 method for class 'cv'
print(x, ...)

## S3 method for class 'cv'
summary(object, ...)

## S3 method for class 'cv'
plot(x, y, ...)

## S3 method for class 'cv'
ggplot(data, mapping = NULL, ..., environment = parent.frame())
cv(object, folds = 10, ...)

## S3 method for class 'cv'
print(x, ...)

## S3 method for class 'cv'
summary(object, ...)

## S3 method for class 'cv'
plot(x, y, ...)

## S3 method for class 'cv'
ggplot(data, mapping = NULL, ..., environment = parent.frame())

Arguments

`object`	A model object.
`folds`	The number of cross-validation folds to use. Defaults to `folds = 10`.
`...`	Other arguments to be passed through to methods.
`x`, `y`	Arguments to plot method.
`data`, `mapping`, `environment`	Arguments ggplot method.

Details

The function is generic. At present, only objects of class 'evmOpt', as returned by texmex::evm can be used.

Cross-validation for the shape parameter in an extreme values model

Description

Cross-validation for the shape parameter in an extreme values model

Usage

## S3 method for class 'evmOpt'
cv(
  object,
  folds = 10,
  ...,
  penalty = "gaussian",
  range = seq(1, 25, length.out = 25),
  shape = NULL
)
## S3 method for class 'evmOpt'
cv(
  object,
  folds = 10,
  ...,
  penalty = "gaussian",
  range = seq(1, 25, length.out = 25),
  shape = NULL
)

Arguments

`object`	An object of class 'evmOpt' as returned by `evm`.
`folds`	Integer giving the number of cross-validation folds to use. Defaults to `folds = 10`.
`...`	Not used.
`penalty`	String specifying the type of penalty to use. Defaults to `penalty = "gaussian"` which is equivalent to using a quadratic penalty. The other allowed value is `penalty = "lasso"` and an L1 penalty is used.
`range`	A sequence of values for the penalty parameter. Defaults to `range = seq(1, 25, length.out = 25)`. The values are taken to be the reciprocals of the prior variance so must be strictly positive.
`shape`	String giving the name of the shape parameter. Defaults to `shape = NULL` and the function tries to guess.

Details

Only the shape parameter is assumed to be penalized. The penalty can be thought of in terms of the variance of a prior distribution, which is equivalent to a quadratic penalty. Because the shape parameter will usually be between -1/2 and 1/2, a prior N(0, 1/16) distribution will likely be a good starting point, so values that span 16 will usually be appropriate.

Note that the procedure appears to frequently prefer larger penalties over smaller ones, effectively driving the shape parameter to zero. However, if you are fitting distributions that can model long tails, there is probably a good reason for that and you should use your prior knowledge to determine if non-zero values of the shape are plausible, rather than rely solely on an automated procedure.

Also note that small numbers of observations can have a big impact on parameter estimates. Because cross-validation involves randomly assigning values to folds, the results are generally different from one run to the next. These to features combined can produce quite big differences between cross-validation runs and it is advisable to use either leave-one-out (by setting folds to be the same as the length of the data), or to run the procedure several times and average over them.

@note At present, only models without covariates are implemented.

Density, cumulative density, quantiles and random number generation for the extended generalized Pareto distribution 3

Description

Density, cumulative density, quantiles and random number generation for the EGP3 distribution of Papastathopoulos and Tawn

Usage

degp3(x, kappa = 1, sigma, xi, u = 0, log.d = FALSE)

pegp3(q, kappa = 1, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

qegp3(p, kappa = 1, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

regp3(n, kappa = 1, sigma, xi, u = 0)
degp3(x, kappa = 1, sigma, xi, u = 0, log.d = FALSE)

pegp3(q, kappa = 1, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

qegp3(p, kappa = 1, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

regp3(n, kappa = 1, sigma, xi, u = 0)

Arguments

`x`, `q`, `p`	Value, quantile or probability respectively.
`kappa`	The power parameter (Papastathopoulos and Tawn call it the shape parameter and call what we call the shape parameter the tail index.)
`sigma`	Scale parameter.
`xi`	Shape parameter.
`u`	Threshold
`log.d`, `log.p`	Whether or not to work on the log scale.
`lower.tail`	Whether to return the lower tail.
`n`	Number of random numbers to simulate.

Author(s)

Harry Southworth

References

I. Papastathopoulos and J. A. Tawn, Extended generalized Pareto modles for tail estimation, Journal of Statistical Planning and Inference, 143, 131 – 143, 2013

Examples


  x <- regp3(1000, kappa=2, sigma=1, xi=.5)
  hist(x)
  x <- regp3(1000, kappa=2, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pegp3(x, kappa=2, sigma=1, xi=.5))

x <- regp3(1000, kappa=2, sigma=1, xi=.5)
  hist(x)
  x <- regp3(1000, kappa=2, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pegp3(x, kappa=2, sigma=1, xi=.5))

Density, cumulative density, quantiles and random number generation for the generalized extreme value distribution

Description

Density, cumulative density, quantiles and random number generation for the generalized extreme value distribution

Usage

dgev(x, mu, sigma, xi, log.d = FALSE)

pgev(q, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

qgev(p, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

rgev(n, mu, sigma, xi)
dgev(x, mu, sigma, xi, log.d = FALSE)

pgev(q, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

qgev(p, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

rgev(n, mu, sigma, xi)

Arguments

`x`, `q`, `p`	Value, quantile or probability respectively.
`mu`	Location parameter.
`sigma`	Scale parameter.
`xi`	Shape parameter.
`log.d`, `log.p`	Whether or not to work on the log scale.
`lower.tail`	Whether to return the lower tail.
`n`	Number of random numbers to simulate.

Details

Random number generation is done as a transformation of the Gumbel distribution; Gumbel random variates are generated as the negative logarithm of standard exponentials.

Author(s)

Harry Southworth

Examples


  x <- rgev(1000, mu=0, sigma=1, xi=.5)
  hist(x)
  x <- rgev(1000, mu=0, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pgev(x, mu=0, sigma=1, xi=.5))

x <- rgev(1000, mu=0, sigma=1, xi=.5)
  hist(x)
  x <- rgev(1000, mu=0, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pgev(x, mu=0, sigma=1, xi=.5))

Density, cumulative density, quantiles and random number generation for the generalized Pareto distribution

Description

Density, cumulative density, quantiles and random number generation for the generalized Pareto distribution

Usage

dgpd(x, sigma, xi, u = 0, log.d = FALSE)

pgpd(q, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

qgpd(p, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

rgpd(n, sigma, xi, u = 0)
dgpd(x, sigma, xi, u = 0, log.d = FALSE)

pgpd(q, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

qgpd(p, sigma, xi, u = 0, lower.tail = TRUE, log.p = FALSE)

rgpd(n, sigma, xi, u = 0)

Arguments

`x`, `q`, `p`	Value, quantile or probability respectively.
`sigma`	Scale parameter.
`xi`	Shape parameter.
`u`	Threshold
`log.d`, `log.p`	Whether or not to work on the log scale.
`lower.tail`	Whether to return the lower tail.
`n`	Number of random numbers to simulate.

Details

Random number generation is done by transformation of a standard exponential.

Author(s)

Janet E Heffernan, Paul Metcalfe, Harry Southworth

Examples


  x <- rgpd(1000, sigma=1, xi=.5)
  hist(x)
  x <- rgpd(1000, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pgpd(x, sigma=1, xi=.5))

x <- rgpd(1000, sigma=1, xi=.5)
  hist(x)
  x <- rgpd(1000, sigma=exp(rnorm(1000, 1, .25)), xi=rnorm(1000, .5, .2))
  hist(x)
  plot(pgpd(x, sigma=1, xi=.5))

The Gumbel distribution

Description

Density, distribution and quantile functions, and random number generation for the Gumbel distribution

Usage

dgumbel(x, mu, sigma, log.d = FALSE)

rgumbel(n, mu, sigma)

pgumbel(q, mu, sigma, lower.tail = TRUE, log.p = FALSE)

qgumbel(p, mu, sigma, lower.tail = TRUE, log.p = FALSE)
dgumbel(x, mu, sigma, log.d = FALSE)

rgumbel(n, mu, sigma)

pgumbel(q, mu, sigma, lower.tail = TRUE, log.p = FALSE)

qgumbel(p, mu, sigma, lower.tail = TRUE, log.p = FALSE)

Arguments

`x`, `q`, `p`	Vectors of quantiles or probabilities.
`mu`	The location parameter.
`sigma`	The scale parameter.
`log.d`, `log.p`	Whether to return logged values, or to treat probabilities/densities as being logged.
`n`	The number of observations.
`lower.tail`	Whether to return the lower tail. If `lower.tail=FALSE`, the upper tail is returned.

Compute empirical distribution function

Description

Compute the empirical distribution function

Usage

edf(x, na.last = NA)
edf(x, na.last = NA)

Arguments

`x`	A numeric vector
`na.last`	How to treat missing values. See `rank` for details.

Value

A vector of quantiles relating to the observations in x.

Author(s)

Harry Southworth

Examples


plot(winter$NO, edf(winter$NO))

plot(winter$NO, edf(winter$NO))

Estimate the EGP3 distribution power parameter over a range of thresholds

Description

Estimate extended generalized Pareto distribution power parameter over a range of values, using maximum (penalized) likelihood.

Usage

egp3RangeFit(data, umin=quantile(data, .05), umax=quantile(data,
.95), nint = 10, penalty = "gaussian", priorParameters = NULL, alpha=0.05)
## S3 method for class 'egp3RangeFit'
print(x, ...)
## S3 method for class 'egp3RangeFit'
plot(x, xlab = "Threshold", ylab = "kappa", main = NULL, addNexcesses=TRUE, log.="", ...)
## S3 method for class 'egp3RangeFit'
ggplot(data, mapping, xlab = "Threshold", ylab = expression(kappa),
main=NULL,fill="orange", col="blue",addNexcesses=TRUE, textsize=4, ..., environment)
egp3RangeFit(data, umin=quantile(data, .05), umax=quantile(data,
.95), nint = 10, penalty = "gaussian", priorParameters = NULL, alpha=0.05)
## S3 method for class 'egp3RangeFit'
print(x, ...)
## S3 method for class 'egp3RangeFit'
plot(x, xlab = "Threshold", ylab = "kappa", main = NULL, addNexcesses=TRUE, log.="", ...)
## S3 method for class 'egp3RangeFit'
ggplot(data, mapping, xlab = "Threshold", ylab = expression(kappa),
main=NULL,fill="orange", col="blue",addNexcesses=TRUE, textsize=4, ..., environment)

Arguments

`data`	The data vector to be modelled.
`umin`	The minimum threshold above which to estimate the parameters.
`umax`	The maximum threshold above which to estimate the parameters.
`nint`	The number of thresholds at which to perform the estimation.
`penalty`	The type of penalty to be used in the maximum penalized likelihood estimation. Should be either "gaussian" or "none". Defaults to "gaussian".
`priorParameters`	Parameters to be used for the penalty function. See the help for `evm` for more informaiton.
`alpha`	100(1 - alpha)% confidence intervals will be plotted with the point estimates. Defaults to `alpha = 0.05`.
`x`	Argument to the `print` functions.
`xlab`	Label for the x-axis.
`ylab`	Label for the y-axis.
`main`	The main title.
`textsize`	Size of text for annotation showing number of threshold excesses.
`addNexcesses`	Annotate top axis with numbers of threshold excesses arising with the corresponding values of threshold on the bottom axis.
`log.`	Argument passed through to `plot`. Can take values "x" for plotting the x-axis on the log scale, "y" for plotting the y-axis on the log scale, "xy" for both, or "" (the default) for neither.
`mapping`, `fill`, `col`, `environment`	Arguments to ggplot method.
`...`	Arguments to `plot`.

Details

Papastathopoulos and Tawn present 3 extended versions of the generalized Pareto distribution. Using the egp3 texmex family object, the power parameter in the EGP3 distribution is estimated on the log scale, a confidence interval is calculated and the result is transformed back to the scale of the power parameter and returned to the user.

When the power paramer, kappa, is equal to 1, the EPG3 distribution is identical to the generalized Pareto distribution. Therefore, the plot of the estimated parameter over a range of thresholds provides a diagnostic for threshold selection: the lowest value of kappa whose confidence interval includes 1 is suggested as the threshold for generalized Pareto modelling.

If lower thresholds are used and the EGP3 distribution itself is used for modelling, some care should be taken to ensure the model provides a reasonable degree of fit to the data. Limited experience suggests that such models seldom fit well and the main value of the EGP3 distribution is as a diagnostic for threshold selection as described here.

Note this function does not extend to assessing model fit when there are covariates included in the model.

Author(s)

Harry Southworth

References

I. Papastathopoulos and J. A. Tawn, Extended generalized Pareto modles for tail estimation, Journal of Statistical Planning and Inference, 143, 131 – 143, 2013

Examples


 # because of the time it takes to run
erf <- egp3RangeFit(rain)
plot(erf)
ggplot(erf)


# because of the time it takes to run
erf <- egp3RangeFit(rain)
plot(erf)
ggplot(erf)

Calculate upper end point for a fitted extreme value model

Description

Calculate upper end point for fitted extreme value model

Usage

endPoint(y, verbose=TRUE, .unique=TRUE, ...)

## S3 method for class 'evmOpt'
endPoint(y,verbose=TRUE, .unique=TRUE, ...)
## S3 method for class 'evmSim'
endPoint(y,verbose=TRUE, .unique=TRUE, ...)
endPoint(y, verbose=TRUE, .unique=TRUE, ...)

## S3 method for class 'evmOpt'
endPoint(y,verbose=TRUE, .unique=TRUE, ...)
## S3 method for class 'evmSim'
endPoint(y,verbose=TRUE, .unique=TRUE, ...)

Arguments

`y`	Object of class `evmOpt` or `evmSim`, as returned by `evm`.
`verbose`	Whether to print output.
`.unique`	Whether or not to use only unique values of `y`.
`...`	further arguments to be passed to the `signif` function.

Value

In cases where the fitted shape parameter is negative, the fitted finite upper endpoint of the extreme value model.

Author(s)

Janet E. Heffernan

Extreme value modelling

Description

Likelihood based modelling and inference for extreme value models, possibly with explanatory variables.

Usage

evm(y, data = NULL, ...)

evmReal(y, data)

evm.default(
  y,
  data = NULL,
  family = gpd,
  th = -Inf,
  qu,
  ...,
  penalty = NULL,
  prior = "gaussian",
  method = "optimize",
  cov = "observed",
  start = NULL,
  priorParameters = NULL,
  maxit = 10000,
  trace = NULL,
  iter = 40500,
  burn = 500,
  thin = 4,
  chains = 1,
  proposal.dist = c("gaussian", "cauchy"),
  jump.cov,
  jump.const = NULL,
  R = 1000,
  cores = NULL,
  export = NULL,
  verbose = TRUE,
  call = NULL
)
evm(y, data = NULL, ...)

evmReal(y, data)

evm.default(
  y,
  data = NULL,
  family = gpd,
  th = -Inf,
  qu,
  ...,
  penalty = NULL,
  prior = "gaussian",
  method = "optimize",
  cov = "observed",
  start = NULL,
  priorParameters = NULL,
  maxit = 10000,
  trace = NULL,
  iter = 40500,
  burn = 500,
  thin = 4,
  chains = 1,
  proposal.dist = c("gaussian", "cauchy"),
  jump.cov,
  jump.const = NULL,
  R = 1000,
  cores = NULL,
  export = NULL,
  verbose = TRUE,
  call = NULL
)

Arguments

`y`	Either a numeric vector, the name of a variable in `data` or a string representing the name of a variable in `data`. NOTE THAT the use of non-standard evaluation is likely to be removed from future versions of texmex.
`data`	A data frame containing `y` and any covariates.
`...`	In `evm`, formulae for the parameters in the family, e.g. `phi = ~ x`. If none are specified, they all default to `~1`.
`family`	An object of class 'texmexFamily'. Defaults to `family=gpd` and a generalized Pareto distribution (GPD) is fit to the data. Alternatively the family could be `gev`, `weibull` or `gumbel`, resulting in a generalized extreme value distribution, Weibull or Gumbell distribution being fit. Family `cgpd` fits the generalized Pareto distribution but with the shape parameter constrained to be > 0.5 by using the link function suggested by Yee and Stephenson (2007), $\eta$ = log( $\xi$ + 0.5). Family `egp3` fits the extended GP family 3 of Papastathopoulos and Tawn (2013). No other families are currently available in texmex, but users may write their own.
`th`	For threshold excess models (such as when `family=gpd`), the threshold for `y`, exceedances above which will be used to fit the upper tail model. Note that if you have already thresholded your data and want to model all of `y`, you still need to specify `th`.
`qu`	An alternative to `th`, a probability defined such that `quantile(y,qu)` equals `th`.
`penalty`	How to penalize the likelhood. Currently, either "none"", "gaussian"" or "lasso"" are the only allowed values. If `penalty` is "gaussian" or "lasso" then the parameters for the penalization are specified through the `priorParameters` argument. See below. Defaults to `penalty=NULL` and applies maximum likelihood estimation.
`prior`	If `method = "optimize"`, just an alternative way of specifying the penalty, and only one or neither of `penalty` and `prior` should be given. If `method = "simulate"`, prior must be "gaussian" because no other prior distributions have been implemented.
`method`	Should be either "optimize" (the default), "simulate" or "bootstrap". The first letter or various abbreviations will do. If 'optimize' is used, the (penalized) likelihood is directly optimized using `optim` and point estimates (either ML or MAP estimates) are returned with other information. If "simulate", a Metropolis algorithm is used to simulate from the joint posterior distribution of the parameters. If "bootstrap", a parametric boostrap is performed.
`cov`	How to compute the covariance matrix of the parameters. Defaults to `cov = "observed"` in which case the observed information matrix is used, if the `info` element of the `texmexFamily` object is present. Note that currently, this is not implemented for `gev`. Alternatives are `cov = "numeric"` in which case a numerical approximation of the Hessian is used (see the help for `optim`), or `cov = "sandwich"` if the `sandwich` element of the `texmexFamily` object is implemented. The `cov = "sandwich"` method implements the Huber sandwich correction to the covariance matrix for data which are not independent and in which case the likelihood function no longer has the interpretation of a joint likelihood, but instead should be interpreted as a pseudo-likelihod. In some cases, particularly with small samples, the numerical approximation can be quite different from the closed form (`cov="observed"`) result, and the value derived from the observed information should be preferred. However, in either case, since the underlying log-likelihood may be far from quadratic for small samples, the resulting estimates of standard errors are liable to approximate poorly the true standard errors. Also see the comments in the Details section, below.
`start`	Starting values for the parameters, to be passed to `optim`. If not provided, the function will use the `start` element of the `texmexFamily` object if it exists.
`priorParameters`	A list with two components. The first should be a vector of means, the second should be a covariance matrix if the penalty/prior is "gaussian" or "quadratic" and a diagonal precision matrix if the penalty/prior is "lasso", "L1" or "Laplace". If `method = "simulate"` then these represent the parameters in the Gaussian prior distribution. If `method = 'optimize'` then these represent the parameters in the penalty function. If not supplied: all default prior means are zero; all default prior variances are $10^4$ ; all covariances are zero.
`maxit`	The number of iterations allowed in `optim`.
`trace`	Whether or not to print progress to screen. If `method = "optimize"`, the argument is passed into `optim` – see the help for that function. If `method = "simulate"`, the argument determines at how many steps of the Markov chain the function should tell the user, and in this case it defaults to `trace = 10000`.
`iter`	Number of simulations to generate under `method = "simulate"`. Defaults to 40500.
`burn`	The number of initial steps to be discarded. Defaults to 500.
`thin`	The degree of thinning of the resulting Markov chains. Defaults to 4 (one in every 4 steps is retained).
`chains`	The number of Markov chains to run. Defaults to 1. If you run more than 1, the function tries to figure out how to do it in parallel using as many cores as there are chains.
`proposal.dist`	The proposal distribution to use, either multivariate gaussian or a multivariate Cauchy.
`jump.cov`	Covariance matrix for proposal distribution of Metropolis algorithm. This is scaled by `jump.const`.
`jump.const`	Control parameter for the Metropolis algorithm.
`R`	The number of parametric bootstrap samples to run when `method = "bootstrap"` is requested. Defaults to 1000.
`cores`	The number of cores to use when bootstrapping. Defaults to `cores=NULL` and the function guesses how many cores are available and uses them all.
`export`	Character vector of names of objects to export if parallel processing is being used and you are using objects from outside of texmex. It it passed to `parallel::clusterExport` and used by `texmex::evmBoot`.
`verbose`	Whether or not to print progress to screen. Defaults to `verbose=TRUE`.
`call`	Used internally, defaults to `call = NULL`.

Details

The main modelling function is evm (extreme value model) and the distribution to be used is specified by passing an object of class texmexFamily to the family argument.

The default texmexFamily object used by evm is gpd. Currently, the other texmexFamily objects available are gev which results in fitting a generalized extreme value (GEV) distribution to the data, gpdIntCensored which can be used to fit the GPD to data which has been rounded to a given number of decimal places by recognisiing the data as interval censored, and egp3 which fits the extended generalized Pareto distribution version 3 of Papastathopoulos and Tawn (2013).

See Coles (2001) for an introduction to extreme value modelling and the GPD and GEV models.

For the GPD model, we use the following parameterisation of evm:

$P(Y \le y) = 1 - (1 + \xi y / \sigma)^{-1/\xi}$

for $y \ge 0$ and $1 + \xi y / \sigma \ge 0.$

For the GEV model, we use:

$P(Y \le y) = exp (-(1 + \xi (y - \mu) / \sigma) ^{-1/\xi})$

In each case, the scale parameter is sigma ( $\sigma$ ) and the shape parameter is xi ( $\xi$ ). The GEV distribution also has location parameter mu ( $\mu$ ). See Papastathopoulos and Tawn (2013) for specification of the EGP3 model.

Working with the log of the scale parameter improves the stability of computations, makes a quadratic penalty more appropriate and enables the inclusion of covariates in the model for the scale parameter, which must remain positive. We therefore work with $\phi$ =log( $\sigma$ ). All specification of priors or penalty functions refer to $\phi$ rather than $\sigma$ . A quadratic penalty can be thought of as a Gaussian prior distribution, whence the terminology of the function.

Parameters of the evm are estimated by using maximum (penalized) likelihood (method = "optimize"), or by simulating from the posterior distribution of the model parameters using a Metropolis algorithm (method = "simulate"). In the latter case, start is used as a starting value for the Metropolis algorithm; in its absence, the maximum penalized likelhood point estimates are computed and used.

A boostrap approach is also available (method = "bootstrap"). This runs a parametric bootstrap, simulating from the model fit by optimization.

When method = "simulate" the print and summary functions give posterior means and standard deviations. Posterior means are also returned by the coef method. Depending on what you want to do and what the posterior distributions look like (use plot method) you might want to work with quantiles of the posterior distributions instead of relying on standard errors.

When method = "bootstrap", summaries of the bootstrap distribution and the bootstrap estimate of bias are displayed.

Value

If method = "optimize", an object of class evmOpt:

`call`	The call to `evmSim` that produced the object.
`data`	The original data (above and below the threshold for fitting if a distribution for threshold excesses has been used). In detail, `data` is a list with elements `y` and `D`. `y` is the response variable and `D` is a list containing the design matrices implied by any formlae used in the call to `evm`.
`convergence`	Output from `optim` relating to whether or not the optimizer converged.
`message`	A message telling the user whether or not convergence was achieved.
`threshold`	The threshold of the data above which the evmSim model was fit.
`penalty`	The type of penalty function used, if any.
`coefficients`	The parameter estimates as computed under maximum likelihood or maximum penalized likelihood.
`rate`	The proportion of observations above the threshold. If the model is not a threshold exceedance model (e.g. the GEV model), the rate will be 1.
`priorParameters`	See above.
`residuals`	Residuals computed using the residual function in the `texmexFamily` object, if any. These are used primarly for producing QQ and PP plots via `plot.evmOpt` or `ggplot.evmOpt`. The residuals are transformed values of the raw data, accounting for the parameter estimates: see the `residuals` component of the `texmexFamily` object for the calculations. For the generalized Pareto family, they are (if the model fits well) standard exponential variates; for the GEV family, standard Gumbel variates.
`ploglik`	The value of the optimized penalized log-likelihood.
`loglik`	The value of the optimized (unpenalized) log-likelihood. If `penalty='none'` is used, this will be identical to `ploglik`, above.
`cov`	The estimated covariance of the parameters in the model.
`se`	The estimated standard errors of the parameters in the model.
`xlevels`	A named list containing a named list for each design matrix (main parameter) in the model. Each list contians an element named after each factor in the linear predictor for the respective design matrix. These are used by the `predict` method to ensure all factor levels are known, even if they don't appear in `newdata`.

If method = "simulate", an object of class evmSim:

`call`	The call to `evmSim` that produced the object.
`threshold`	The threshold above which the model was fit.
`map`	The point estimates found by maximum penalized likelihood and which were used as the starting point for the Markov chain. This is of class `evmOpt` and methods for this class (such as resid and plot) may be useful.
`burn`	The number of steps of the Markov chain that are to be treated as the burn-in and not used in inferences.
`thin`	The degree of thinning used.
`chains`	The entire Markov chain generated by the Metropolis algorithm.
`y`	The response data above the threshold for fitting.
`seed`	The seed used by the random number generator.
`param`	The remainder of the chain after deleting the burn-in and applying any thinning.

If method = "bootstrap", an object of class evmBoot:

`call`	The call to `evmBoot` that produced the object.
`replicates`	The parameter estimates from the bootstrap fits.
`map`	The fit by by maximum penalized likelihood to the orginal data. This is of class `evmOpt` and methods for this class (such as resid and plot) may be useful.

There are summary, plot, print, residuals and coefficients methods available for these classes.

Note

For both GPD and GEV models, when there are estimated values of $\xi \le -0.5$ , the regularity conditions of the likelihood break down and inference based on approximate standard errors cannot be performed. In this case, the most fruitful approach to inference appears to be by the bootstrap. It might be possible to simulate from the posterior, but finding a good proposal distribution might be difficult and you should take care to get an acceptance rate that is reasonably high (around 40% when there are no covariates, lower otherwise). To constrain the parameter space of the GP shape parameter, use family = cgpd in the call to evm and the transformation $\eta$ = log( $\xi$ + 0.5) is used, as suggested by Yee and Stephenson (2007).

Author(s)

Janet E. Heffernan, Harry Southworth. Some of the internal code is based on the gpd.fit function in the ismev package and is due to Stuart Coles.

References

S. Coles. An Introduction to Statistical Modelling of Extreme Values. Springer, 2001.

I. Papastathopoulos and J. A. Tawn, Extended generalised Pareto models for tail estimation, Journal of Statistical Planning and Inference, 143, 131 - 143, 2013.

T. W. Yee and A. G. Stephenson, Vector generalized linear and additive extreme value models, Extremes, 10, 1 – 19, 2007.

Examples


  
  #mod <- evm(rain, th=30)
  #mod
  #par(mfrow=c(2, 2))
  #plot(mod)
  

  
  mod <- evm(rain, th=30, method="sim")
  par(mfrow=c(3, 2))
  plot(mod)
  

  
  mod <- evm(SeaLevel, data=portpirie, family=gev)
  mod
  plot(mod)
  

  
  mod <- evm(SeaLevel, data=portpirie, family=gev, method="sim")
  par(mfrow=c(3, 3))
  plot(mod)
  


#mod <- evm(rain, th=30)
  #mod
  #par(mfrow=c(2, 2))
  #plot(mod)
  

  
  mod <- evm(rain, th=30, method="sim")
  par(mfrow=c(3, 2))
  plot(mod)
  

  
  mod <- evm(SeaLevel, data=portpirie, family=gev)
  mod
  plot(mod)
  

  
  mod <- evm(SeaLevel, data=portpirie, family=gev, method="sim")
  par(mfrow=c(3, 3))
  plot(mod)

Bootstrap an evmOpt fit

Description

This runs a parametric bootstrap simulating from an optimized model.

Usage

evmBoot(o, R=1000, trace=100, cores=NULL, export=NULL, theCall)
## S3 method for class 'evmBoot'
summary(object,...)
## S3 method for class 'evmBoot'
plot(x,col=4,border=NULL,...)
## S3 method for class 'evmBoot'
coef(object,...)
## S3 method for class 'summary.evmBoot'
print(x,...)
## S3 method for class 'evmBoot'
print(x,...)
evmBoot(o, R=1000, trace=100, cores=NULL, export=NULL, theCall)
## S3 method for class 'evmBoot'
summary(object,...)
## S3 method for class 'evmBoot'
plot(x,col=4,border=NULL,...)
## S3 method for class 'evmBoot'
coef(object,...)
## S3 method for class 'summary.evmBoot'
print(x,...)
## S3 method for class 'evmBoot'
print(x,...)

Arguments

`o`	a fit `evmOpt` object
`R`	the number of parametric bootstrap samples to run
`trace`	the frequency of trace messages
`cores`	The number of coresto use when bootstrapping. Defaults to `cores=NULL` and the function guesses how many cores are available and uses them all.
`export`	Character vector of names of variables to export. See the help file for `parallel::export`. Defaults to `export = NULL` and most users will never need to use it.
`theCall`	(for internal use)
`x`	an `evmBoot` object
`col`	colour used to fill histogram
`border`	the colour of the border around the bars
`object`	a `evmBoot` object
`...`	other arguments passed to internal functions

Value

An object of class evmBoot; a list with

`call`	The call to `evmBoot` that produced the object.
`replicates`	The parameter estimates from the bootstrap fits.
`map`	The fit by by maximum penalized likelihood to the original data.

Note

It is not expected that a user will need to call this function directly; you are directed to evm.

MCMC simulation around an evmOpt fit

Description

MCMC simulation around an evmOpt fit

Usage

evmSim(
  o,
  priorParameters,
  prop.dist,
  jump.const,
  jump.cov,
  iter,
  start,
  thin,
  burn,
  chains,
  export = NULL,
  verbose,
  trace,
  theCall,
  ...
)
evmSim(
  o,
  priorParameters,
  prop.dist,
  jump.const,
  jump.cov,
  iter,
  start,
  thin,
  burn,
  chains,
  export = NULL,
  verbose,
  trace,
  theCall,
  ...
)

Arguments

`o`	a fit `evmOpt` object
`priorParameters`	A list with two components. The first should be a vector of means, the second should be a covariance matrix if the penalty/prior is "gaussian" or "quadratic" and a diagonal precision matrix if the penalty/prior is "lasso", "L1" or "Laplace". If `method = "simulate"` then these represent the parameters in the Gaussian prior distribution. If `method = 'optimize'` then these represent the parameters in the penalty function. If not supplied: all default prior means are zero; all default prior variances are $10^4$ ; all covariances are zero.
`prop.dist`	The proposal distribution to use, either multivariate gaussian or a multivariate Cauchy.
`jump.const`	Control parameter for the Metropolis algorithm.
`jump.cov`	Covariance matrix for proposal distribution of Metropolis algorithm. This is scaled by `jump.const`.
`iter`	Number of simulations to generate
`start`	Starting values for the chain; if missing, defaults to the MAP/ML estimates in `o`.
`thin`	The degree of thinning of the resulting Markov chains.
`burn`	The number of initial steps to be discarded.
`chains`	The number of Markov chains to run. Defaults to 1. If you run more, the function will try to figure out how to do it in parallel using the same number of cores as chains.
`export`	Character vector of names of variables to export. See the help file for `parallel::export`. Defaults to `export = NULL` and most users will never need to use it. Only matters on Windows.
`verbose`	Whether or not to print progress to screen. Defaults to `verbose=TRUE`.
`trace`	How frequently to talk to the user
`theCall`	(internal use only)
`...`	ignored

Value

an object of class evmSim:

`call`	The call to `evmSim` that produced the object.
`threshold`	The threshold above which the model was fit.
`map`	The point estimates found by maximum penalized likelihood and which were used as the starting point for the Markov chain. This is of class `evmOpt` and methods for this class (such as resid and plot) may be useful.
`burn`	The number of steps of the Markov chain that are to be treated as the burn-in and not used in inferences.
`thin`	The degree of thinning used.
`chains`	The entire Markov chain generated by the Metropolis algorithm.
`y`	The response data above the threshold for fitting.
`seed`	The seed used by the random number generator.
`param`	The remainder of the chain after deleting the burn-in and applying any thinning.

Note

it is not expected that the user should call this directly

Set the seed from a fitted evmSim object.

Description

Set the seed from a fitted evmSim object to ensure reproducibility of output.

Usage

evmSimSetSeed(x)
evmSimSetSeed(x)

Arguments

`x`	An object of class `evmSim`, as returned by `evm` using `method = "simulate"`.

Details

Sets the seed to the value used to fit the model.

Author(s)

Harry Southworth

Examples

  
 data <- rnorm(1000)
 mod <- evm(data, qu=.7, method="simulate")
 evmSimSetSeed(mod)
 mod1 <- evm(data, qu=.7, method="simulate") # this produces the same MCMC output as mod


data <- rnorm(1000)
 mod <- evm(data, qu=.7, method="simulate")
 evmSimSetSeed(mod)
 mod1 <- evm(data, qu=.7, method="simulate") # this produces the same MCMC output as mod

Extremal index estimation and automatic declustering

Description

Given a threshold which defines excesses above that threshold, estimate the extremal index of a dependent sequence by using the method of Ferro and Segers, 2003. The extremal index estimate can then be used to carry out automatic declustering of the sequence to identify independent clusters and estimate the GPD for cluster maxima. Graphical diagnostics of model fit are available.

Usage

extremalIndex(y, data = NULL, threshold)

extremalIndexRangeFit(y, data = NULL, umin = quantile(y,.5), umax =
quantile(y, 0.95), nint = 10, nboot = 100, alpha = .05, estGPD=TRUE,
verbose = TRUE, trace = 10, ...)

bootExtremalIndex(x)

declust(y, r=NULL, data = NULL, ...)

## S3 method for class 'extremalIndex'
declust(y, r=NULL,...)

## S3 method for class 'declustered'
plot(x, ylab = "Data",...)

## S3 method for class 'declustered'
evm(y, data=NULL, family=gpd, ...)

## S3 method for class 'extremalIndexRangeFit'
plot(x,addNexcesses=TRUE,estGPD=TRUE,...)

## S3 method for class 'extremalIndex'
print(x,...)

## S3 method for class 'declustered'
print(x,...)

## S3 method for class 'extremalIndexRangeFit'
ggplot(data=NULL, mapping, xlab, ylab, main,
ylim = "auto",ptcol="dark blue",col="dark blue",fill="orange",
textsize=4,addNexcesses=TRUE,estGPD=TRUE,..., environment)
extremalIndex(y, data = NULL, threshold)

extremalIndexRangeFit(y, data = NULL, umin = quantile(y,.5), umax =
quantile(y, 0.95), nint = 10, nboot = 100, alpha = .05, estGPD=TRUE,
verbose = TRUE, trace = 10, ...)

bootExtremalIndex(x)

declust(y, r=NULL, data = NULL, ...)

## S3 method for class 'extremalIndex'
declust(y, r=NULL,...)

## S3 method for class 'declustered'
plot(x, ylab = "Data",...)

## S3 method for class 'declustered'
evm(y, data=NULL, family=gpd, ...)

## S3 method for class 'extremalIndexRangeFit'
plot(x,addNexcesses=TRUE,estGPD=TRUE,...)

## S3 method for class 'extremalIndex'
print(x,...)

## S3 method for class 'declustered'
print(x,...)

## S3 method for class 'extremalIndexRangeFit'
ggplot(data=NULL, mapping, xlab, ylab, main,
ylim = "auto",ptcol="dark blue",col="dark blue",fill="orange",
textsize=4,addNexcesses=TRUE,estGPD=TRUE,..., environment)

Arguments

`y`	Argument to function `extremalIndex`: either a numeric vector or the name of a variable in `data`.
`data`	A data frame containing `y` and any covariates. In `evm.declustered`, it should be NULL and is included to match the arguments of generic `evm`.
`threshold`	The threshold for `y`, exceedances above which will be used to estimate the extremal index and carry out automatic declustering.
`family`	The type of extreme value model. The user should not change this from its default in `evm.declustered`.
`x`	Objects passed to methods.
`r`	Positivie integer: run length to be used under "runs" declustering. If specified then so-called "runs" declustering will be carried out, otherwise defaults to NULL in which case the automatic "intervals" declustering method of Ferro and Segers is used.
`umin`	The minimum threshold above which to esimate the parameters.
`umax`	The maximum threshold above which to esimate the parameters.
`nint`	The number of thresholds at which to perform the estimation.
`nboot`	Number of bootstrap samples to simulate at each threshold for estimation.
`alpha`	100(1 - alpha)% confidence intervals will be plotted with the point estimates. Defaults to `alpha = 0.05`.
`xlab`	Label for the x-axis (ggplot).
`ylab`	Label for the y-axis (ggplot).
`addNexcesses`	Whether to annotate the top axis of plots with the number of excesses above the corresponding threhsold. Defaults to `TRUE`.
`estGPD`	Whether to estimate GPD parameters at each choice of thereshold – defaults to `TRUE` in which case the GPD parameters are estimated.
`verbose`	Whether to report on progress in RangeFit calculations. Defaults to `TRUE`.
`trace`	How frequently to report bootstrap progress in RangeFit calculations. Defaults to 10.
`mapping`, `main`, `ylim`, `ptcol`, `col`, `fill`, `textsize`, `environment`	Further arguments to ggplot method.
`...`	Further arguments to methods.

Details

The function extremalIndex estimates the extremal index of a dependent series of observations above a given threshold threshold, returning an object of class "extremalIndex". Plot and print methods are available for this class. A graphical diagnostic akin to Figure 1 in Ferro and Segers (2003) is produced by the plot method for this class. This plot is used to test the model assumption underpinning the estimation, with good fit being indicated by interexceedance times which correspond to inter-cluster times lying close to the diagonal line indicated.

In addition to good model fit, an appropriate choice of threshold is one above which the estimated extremal index is stable over further, higher thresholds (up to estimation uncertainty). This can be assessed by using the function extremalIndexRangeFit, which examines a range of threshold values. At each threshold, the extremal index is estimated; that estimate is used to decluster the series and the parameters of the GPD are optionally estimated for the resulting declustered series. Uncertainty in the estimation of the extremal index and GPD parameters is assessed by using a bootstrap scheme which accounts for uncertainty in the extremal index estimation, and the corresponding uncertainty in the declustering of the series. There are plot and ggplot methods for output of this function, which is of class extremalIndexRangeFit.

The function declust returns an object of class "declustered", identifying independent clusters in the original series. Print, plot and show methods are available for this class. The GPD model can be fitted to objects of this class, including the use of covariates in the linear predictors for the parameters of the GPD. See examples below.

Value

The function extremalIndex returns a list of class "extremalIndex":

`EIintervals`	Estimate of the extremal index by using the intervals estimator of Ferro and Segers.
`threshold`	threshold for declustering and estimation
`TotalN`	length of original data series
`nExceed`	number of exceedances of `threshold` in original series.
`thExceedanceProb`	probablity of threshold exceedance in original series.
`call`	the original function call
`interExceedTimes`	times between threshold exceedances
`thExceedances`	observation from the original series which are above `threshold`
`exceedanceTimes`	times of occurrance of threshold exceedances
`y`	original dependent series
`data`	data frame or NULL

The function declust returns a list of type "declustered":

`clusters`	integer labels assigning threshold exceedances to clusters
`sizes`	number of exceedances in each cluster
`clusterMaxima`	vector made up of the largest observation from each distinct cluster. In the case of ties, the first value is taken.
`isClusterMax`	logical; length equal to number of threshold exceedances, value is `TRUE` for threshold exceedances which correspond to cluster maxima
`y`	see entry for object of class "extremalIndex" above
`data`	see entry for object of class "extremalIndex" above
`threshold`	see entry for object of class "extremalIndex" above
`EIintervals`	see entry for object of class "extremalIndex" above
`call`	see entry for object of class "extremalIndex" above
`InterExceedTimes`	times between threshold exceedances, length is one less than the number of threshold exceedances
`InterCluster`	logical: indicates inter exceedance times larger than `r` the run length used for declustering
`thExceedances`	see entry for object of class "extremalIndex" above
`exceedanceTimes`	see entry for object of class "extremalIndex" above
`r`	run length used for declustering
`nClusters`	Number of indenendent clusters identified
`method`	Method used for declustering (either "intervals" or "runs")

The function bootExtremalIndex return a single vector corersponding to a bootstrap sample from the original series: observations are censored at threshold so that values below this threshold are indicated by the value -1.

The method evm for class "declustered" returns an object of type "evmOpt" or "evmSim" depending on the precise function call - see documentation for evm.

Author(s)

Janet E. Heffernan

References

Ferro, C.A.T. and Segers, J., (2003) "Inference for clusters of Extreme Values", JRSS B 65, Part 2, pp 545–556.

Examples


par(mfrow=c(2,2));
extremalIndexRangeFit(summer$O3,nboot=10)
ei <- extremalIndex(summer$O3,threshold=45)
plot(ei)
d <- declust(ei)
plot(d)
evm(d)

## fitting with covariates:

so2 <- extremalIndex(SO2,data=winter,threshold=15)
plot(so2)
so2 <- extremalIndex(SO2,data=winter,threshold=20)
plot(so2) ## fits better

so2.d <- declust(so2)
par(mfrow=c(1,1)); plot(so2.d)
so2.d.gpd <- evm(so2.d) # AIC 661.1

evm(so2.d,phi=~NO)
evm(so2.d,phi=~NO2)
evm(so2.d,phi=~O3) # better AIC 651.9
evm(so2.d,phi=~PM10)

so2.d.gpd.o3 <- evm(so2.d,phi=~O3)

par(mfrow=c(2,2)); plot(so2.d.gpd.o3)

par(mfrow=c(2,2));
extremalIndexRangeFit(summer$O3,nboot=10)
ei <- extremalIndex(summer$O3,threshold=45)
plot(ei)
d <- declust(ei)
plot(d)
evm(d)

## fitting with covariates:

so2 <- extremalIndex(SO2,data=winter,threshold=15)
plot(so2)
so2 <- extremalIndex(SO2,data=winter,threshold=20)
plot(so2) ## fits better

so2.d <- declust(so2)
par(mfrow=c(1,1)); plot(so2.d)
so2.d.gpd <- evm(so2.d) # AIC 661.1

evm(so2.d,phi=~NO)
evm(so2.d,phi=~NO2)
evm(so2.d,phi=~O3) # better AIC 651.9
evm(so2.d,phi=~PM10)

so2.d.gpd.o3 <- evm(so2.d,phi=~O3)

par(mfrow=c(2,2)); plot(so2.d.gpd.o3)

Fancy plotting for copulas

Description

Fancy plotting for copulas

Usage

## S3 method for class 'copula'
ggplot(
  data,
  mapping = aes(),
  color = "blue",
  alpha = 0.7,
  jitter = FALSE,
  jitter.factor = 0.05,
  point.size = 1,
  smooth = FALSE,
  smooth.method = "auto",
  smooth.se = TRUE,
  smooth.level = 0.95,
  smooth.formula = y ~ x,
  legend.position = "none",
  legend.title = ggplot2::waiver(),
  diag = FALSE,
  lower = TRUE,
  ticks = TRUE,
  ...,
  environment = parent.frame()
)
## S3 method for class 'copula'
ggplot(
  data,
  mapping = aes(),
  color = "blue",
  alpha = 0.7,
  jitter = FALSE,
  jitter.factor = 0.05,
  point.size = 1,
  smooth = FALSE,
  smooth.method = "auto",
  smooth.se = TRUE,
  smooth.level = 0.95,
  smooth.formula = y ~ x,
  legend.position = "none",
  legend.title = ggplot2::waiver(),
  diag = FALSE,
  lower = TRUE,
  ticks = TRUE,
  ...,
  environment = parent.frame()
)

Arguments

`data`	A data.frame.
`mapping`	Not used.
`color`	Defaults to `color = "blue"`.
`alpha`	Defaults to `alpha = 0.7`.
`jitter`	If `jitter=TRUE`, the values are jittered before plotting. Defaults to `jitter. = FALSE`.
`jitter.factor`	How much jittering to use. Defaults to `jitter.factor = .05`.
`point.size`	Defaults to `point.size = 1`.
`smooth`	Defaults to `smooth = FALSE`.
`smooth.method`	Defaults to `smooth.method = "auto"` and is passed to `geom_smooth` only when `smooth = TRUE`.
`smooth.se`	Defaults to `smooth.se = TRUE` and is used only when `smooth = TRUE`.
`smooth.level`	Defaults to `smooth.level = 0.95` and is used only when `smooth = TRUE`.
`smooth.formula`	A formula, defaulting to `smooth.formula = y ~ x` to be passed as the `formula` argument to `geom_smooth`.
`legend.position`	Passed into `theme`, defaults to `legend.position="none"`.
`legend.title`	Passed into `theme`. Defaults to `legend.title = waiver()`.
`diag`	Defaults to `diag = FALSE` and panels on the diagonal are not produced.
`lower`	Defaults to `lower = TRUE` and only the lower triangle is plotted.
`ticks`	Defaults to `ticks = TRUE` and ticks and their labels are put on the axes. Otherwise, no tick or labels are used.
`...`	Not used.
`environment`	Not used.

Diagnostic plots for an declustered object

Description

Create and display diagnostic plots for a declustered object.

Usage

## S3 method for class 'declustered'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = c("blue", "orange"),
  col = "light blue",
  plot. = TRUE,
  ...,
  environment
)

## S3 method for class 'extremalIndex'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = "blue",
  col = "light blue",
  plot. = TRUE,
  ...,
  environment
)
## S3 method for class 'declustered'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = c("blue", "orange"),
  col = "light blue",
  plot. = TRUE,
  ...,
  environment
)

## S3 method for class 'extremalIndex'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = "blue",
  col = "light blue",
  plot. = TRUE,
  ...,
  environment
)

Arguments

`data`	An object of class `declustered` or `extremalIndex`.
`mapping`	Not used.
`xlab`	Label for the x-axis.
`ylab`	Label for the y-axis.
`main`	Plot title.
`ptcol`	Colour for points. Defaults to `ptcol="blue"`.
`col`	Colour for lines. Defaults to `col="light blue"`.
`plot.`	Whether or not to display the output. Defaults to `plot.=TRUE`.
`...`	Other arguments passed through to underlying plot functions.
`environment`	Not used.

Diagnostic plots for the replicate estimated parameter values in an evmBoot object

Description

Diagnostic plots for the replicate estimated parameter values in an evmBoot object

Usage

## S3 method for class 'evmBoot'
ggplot(
  data = NULL,
  mapping,
  denscol = "light blue",
  histcol = "dark blue",
  linecol = "orange",
  plot.it = TRUE,
  ...,
  environment
)
## S3 method for class 'evmBoot'
ggplot(
  data = NULL,
  mapping,
  denscol = "light blue",
  histcol = "dark blue",
  linecol = "orange",
  plot.it = TRUE,
  ...,
  environment
)

Arguments

`data`	An object of class 'evmBoot'.
`mapping`, `environment`	ignored
`denscol`	Colour for the densities. Defaults to 'light blue'.
`histcol`	Colour for the histograms. Defaults to 'dark blue'.
`linecol`	Colour for the point estimate lines. Decaults to 'orange'.
`plot.it`	Whether or not to actually print the plots. Defaults to `plot.it=TRUE`. If `plot.it=FALSE`, you might want to control the layout. Do this with `do.call("grid.arrange", c(plots, ncol=2))`, for example, where `plots` is the objected returned by `ggplot.evmBoot`.
`...`	Additional arguments to `ggplot`, currently unused.

Diagnostic plots for an evm object

Description

Create and display diagnostic plots for an evm object. See plot.evmOpt for further details on what is being plotted.

Usage

## S3 method for class 'evmOpt'
ggplot(
  data,
  mapping,
  which = 1:4,
  main = rep(NULL, 4),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  jitter.width = 0,
  jitter.height = 0,
  ptcol = "blue",
  span = 2/3,
  col = "light blue",
  fill = "orange",
  plot. = TRUE,
  ncol = 2,
  nrow = 2,
  ...,
  environment
)
## S3 method for class 'evmOpt'
ggplot(
  data,
  mapping,
  which = 1:4,
  main = rep(NULL, 4),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  jitter.width = 0,
  jitter.height = 0,
  ptcol = "blue",
  span = 2/3,
  col = "light blue",
  fill = "orange",
  plot. = TRUE,
  ncol = 2,
  nrow = 2,
  ...,
  environment
)

Arguments

`data`	An object of class `evm`.
`mapping`, `environment`	ignored
`which`	Which plots to produce. Defaults to `which=1:4`.
`main`	Main titles. Should have length 4.
`xlab`	Labels for x-axes.
`nsim`	Number of simulated datasets to create to form tolerance regions.
`alpha`	Used to compute coverage of pointwise confidence intervals.
`jitter.width`, `jitter.height`	Used to control the amount of jittering of points in the plots of the residuals versus covariates (when covariates are in the model). Defaults to `jitter.width=0, jitter.height = 0`.
`ptcol`	Colour for points. Defaults to `ptcol="blue"`.
`span`	Passed to the loess smoother and defaults to `span=2/3`. Sometimes this choice is poor: if the loess smoother looks wrong, try `span=1`.
`col`	Colour for lines. Defaults to `col="light blue"`.
`fill`	Colour for confidence regions. Defaults to `fill="orange"`
`plot.`	Whether or not to display the output. Defaults to `plot.=TRUE`. If the display doesn't have the desired row and column layout, the user should specify `plot.=FALSE`, asign the output to an object, and use `grid.arrange` to display it.
`ncol`	The number of columns wanted in the resulting plot. Defaults to `ncol=2`. This argument is passed into `grid.arrange`.
`nrow`	The number of rows wanted in the resulting plot. Defaults to `nrow=2`. This argument is passed into `grid.arrange`.
`...`	Other arguments passed through to underlying plot functions.

Details

The function attempts to arrange the plots nicely. If the output isn't what was wanted, the function returns the graphs to the user as a list so that the user can use grid.arrange directly. Also, if you have one or more covariates in the model and the loess smoother looks wrong, try setting span=1.

Diagnostic plots for the Markov chains in an evmSim object

Description

Diagnostic plots for the Markov chains in an evmSim object

Usage

## S3 method for class 'evmSim'
ggplot(
  data = NULL,
  mapping,
  which.plots = 1:3,
  chain = 1,
  denscol = "dark blue",
  acfcol = "light blue",
  plot.it = TRUE,
  ...,
  environment
)
## S3 method for class 'evmSim'
ggplot(
  data = NULL,
  mapping,
  which.plots = 1:3,
  chain = 1,
  denscol = "dark blue",
  acfcol = "light blue",
  plot.it = TRUE,
  ...,
  environment
)

Arguments

`data`	An object of class 'evmSim'.
`mapping`, `environment`	ignored
`which.plots`	Which plots to produce. Density plots correspond to 1, trace plots of the Markov chains to 2 and autocorrelation function plots to 3.
`chain`	An integer indicating which chain to plot (only relevant if there is more than 1 chain). Defaults to 1. If you ran multiple chains, you should look at diagnostics for all of them.
`denscol`	Colour for the density plots. Defaults to 'dark blue'.
`acfcol`	Colour for the ACF plots. Defaults to 'light blue'.
`plot.it`	Whether or not to actually print the plots. Defaults to `plot.it=TRUE`. If `plot.it=FALSE`, you might want to control the layout. Do this with `do.call("grid.arrange", c(plots, ncol=2))`, for example, where `plots` is the objected returned by `ggplot.evmSim`.
`...`	Additional arguments to `ggplot`, currently unused.

Conditional multivariate extreme values modelling

Description

Fit the conditional multivariate extreme value model of Heffernan and Tawn

Usage

## S3 method for class 'mex'
ggplot(
  data = NULL,
  mapping,
  ptcol = "blue",
  col = "cornflowerblue",
  fill = "orange",
  plot. = TRUE,
  quantiles = seq(0.1, by = 0.2, len = 5),
  ...,
  environment
)

mex(
  data,
  which,
  mth,
  mqu,
  dqu,
  cov = "numeric",
  family = gpd,
  margins = "laplace",
  constrain = TRUE,
  v = 10,
  penalty = "gaussian",
  maxit = 10000,
  trace = 0,
  verbose = FALSE,
  priorParameters = NULL
)

mexAll(data, mqu, dqu)

## S3 method for class 'mexList'
print(x, ...)

## S3 method for class 'mex'
plot(x, quantiles = seq(0.1, by = 0.2, len = 5), col = "grey", ...)

## S3 method for class 'predict.mex'
plot(x, pch = c(1, 3, 20), col = c(2, 8, 3), cex = c(1, 1, 1), ask = TRUE, ...)

## S3 method for class 'predict.mex'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = c("grey", "dark blue", "orange"),
  col = "dark blue",
  fill = "orange",
  shape = 16:18,
  size = rep(1, 3),
  plot. = TRUE,
  ...,
  environment
)

## S3 method for class 'mex'
predict(
  object,
  which,
  pqu = 0.99,
  nsim = 1000,
  trace = 10,
  smoothZdistribution = FALSE,
  ...
)

## S3 method for class 'predict.mex'
summary(object, mth, probs = c(0.05, 0.5, 0.95), ...)
## S3 method for class 'mex'
ggplot(
  data = NULL,
  mapping,
  ptcol = "blue",
  col = "cornflowerblue",
  fill = "orange",
  plot. = TRUE,
  quantiles = seq(0.1, by = 0.2, len = 5),
  ...,
  environment
)

mex(
  data,
  which,
  mth,
  mqu,
  dqu,
  cov = "numeric",
  family = gpd,
  margins = "laplace",
  constrain = TRUE,
  v = 10,
  penalty = "gaussian",
  maxit = 10000,
  trace = 0,
  verbose = FALSE,
  priorParameters = NULL
)

mexAll(data, mqu, dqu)

## S3 method for class 'mexList'
print(x, ...)

## S3 method for class 'mex'
plot(x, quantiles = seq(0.1, by = 0.2, len = 5), col = "grey", ...)

## S3 method for class 'predict.mex'
plot(x, pch = c(1, 3, 20), col = c(2, 8, 3), cex = c(1, 1, 1), ask = TRUE, ...)

## S3 method for class 'predict.mex'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ptcol = c("grey", "dark blue", "orange"),
  col = "dark blue",
  fill = "orange",
  shape = 16:18,
  size = rep(1, 3),
  plot. = TRUE,
  ...,
  environment
)

## S3 method for class 'mex'
predict(
  object,
  which,
  pqu = 0.99,
  nsim = 1000,
  trace = 10,
  smoothZdistribution = FALSE,
  ...
)

## S3 method for class 'predict.mex'
summary(object, mth, probs = c(0.05, 0.5, 0.95), ...)

Arguments

`data`	A numeric matrix or data.frame, the columns of which are to be modelled.
`col`	In `plot` method for objects of class `mex`, the colour for points on scatterplots of residuals and original data respectively. In `plot` method for objects of class `predict.mex`, the colours of points for observed, and simulated data (conditioning variable not the largest) and simulated data (conditioning variable is the largest) respectively.
`quantiles`	A vector of quantiles taking values between 0 and 1 specifying the quantiles of the conditional distributions which will be plotted.
`...`	Further arguments to be passed to methods.
`which`	The variable on which to condition. This can be either scalar, indicating the column number of the conditioning variable, or character, giving the column name of the conditioning variable.
`mth`	Marginal thresholds. In `mex`, the threshold above which to fit generalized Pareto distributions. If this is a vector of length 1, the same threshold will be used for each variable. Otherwise, it should be a vector whose length is equal to the number of columns in `data`. In `summary.predict.mex`, the thresholds over which to simulate data from the fitted multivariate model. If not supplied, it is taken to be the thresholds that were used to fit the dependence model on the scale of the original data.
`mqu`	Marginal quantiles As an alternative to specifying the marginal GPD fitting thresholds via `mth`, you can specify the quantile (a probability) above which to fit generalized Pareto distributions. If this is a vector of length 1, the same quantile will be used for each variable. Otherwise, it should be a vector whose length is equal to the number of columns in `data`.
`dqu`	Dependence quantile. Used to specify the quantile at which to threshold the conditioning variable data when estimating the dependence parameters. For example `dqu=0.7` will result in the data with the highest 30% of values of the conditioning variable being used to estimate the dependence parameters. The same threshold will be used for each dependent variable. If not supplied then the default is to set `dqu=mqu[which]` the quantile corresponding to the threshold used to fit the marginal model to the tail of the conditioning variable. Note that there is no requirement for the quantiles used for marginal fitting (`mqu`) and dependence fitting (`dqu`) to be the same, or for them to be ordered in any way.
`cov`	String, passed through to `evm`: how to estimate the covariance. Defaults to `cov = "observed"`.
`family`	An object of class "texmexFamily". Should be either `family = gpd` or `family = cgpd` and defaults to the first of those.
`margins`	See documentation for `mexDependence`.
`constrain`	See documentation for `mexDependence`.
`v`	See documentation for `mexDependence`.
`penalty`	How to penalize the likelihood when estimating the marginal generalized Pareto distributions. Defaults to “gaussian”. See the help file for `evm` for more information.
`maxit`	The maximum number of iterations to be used by the optimizer. defaults to `maxit = 10000`.
`trace`	Passed internally to `optim`. Whether or not to inform the user of the progress of the optimizer. Defaults to 0, indicating no trace.
`verbose`	Whether or not to keep the user informed of progress. Defaults to `verbose = FALSE`.
`priorParameters`	Parameters of prior/penalty used for estimation of the GPD parameters. This is only used if `penalty = "gaussian"`. It is a named list, each element of which contains two components: the first component should be a vector of length 2 corresponding to the location of the Gaussian distribution; the second a 2x2 matrix corresponding to the covariance matrix of the distribution. The names should match the names of the columns of `data`. If not provided, the default priors are independent normal, centred at zero, with variance 10000 for phi=log(sigma) and 0.25 for xi. See the details section.
`x`, `object`	Object of class `mex` or `summary.mex` as returned by these functions respectively.
`pch`, `cex`	Plotting characters: colours and symbol expansion. The observed and simulated data are plotted using different symbols, controlled by these arguments and `col`, each of which should be of length 2.
`ask`	Whether or not to ask before changing the plot. Defaults to `ask = TRUE`.
`shape`, `size`, `mapping`, `ptcol`, `fill`, `plot.`, `environment`, `xlab`, `ylab`, `main`	Further arguments to plot and ggplot methods.
`pqu`	Prediction quantile. Argument to `predict` method. The quantile of the conditioning variable above which it will be simulated for importance sampling based prediction. Defaults to `pqu = .99` .
`nsim`	Argument to `predict` method. The number of simulated observations to be generated for prediction.
`smoothZdistribution`	In `predict.mex`, whether or not to sample from the smoothed distribution of the underlying residuals. Defaults to `FALSE`, in which case no smoothing is carried out. If `TRUE` then each margin of the underlying multivariate residual is smoothed independently, by using kernel smoothing with a normal kernel, and bandwith chosen using the `bw.nrd` function. This can be useful for removing "stripeyness" in importance samples which have few values in the conditional tails.
`probs`	In `summary` method for objects of class `predict.mex`: the quantiles of the conditional distribution(s) to calculate. Defaults to 5%, 50% and 95%.

Details

The function mex works as follows. First, Generalized Pareto distributions (GPD) are fitted to the upper tails of each of the marginal distributions of the data: the GPD parameters are estimated for each column of the data in turn, independently of all other columns. Then, the conditional multivariate approach of Heffernan and Tawn is used to model the dependence between variables. The returned object is of class "mex".

This function is a wrapper for calls to migpd and mexDependence, which estimate parameters of the marginal and dependence components of the Heffernan and Tawn model respectively. See documentation of these functions for details of modelling issues including the use of penalties / priors, threshold choice and checking for convergence of parameter estimates.

The plot method produces diagnostic plots for the fitted dependence model described by Heffernan and Tawn, 2004. The plots are best viewed by using the plotting area split by par(mfcol=c(.,.)) rather than mfrow, see examples below. Three diagnostic plots are produced for each dependent variable:

1) Scatterplots of the residuals Z from the fitted model of Heffernan and Tawn (2004) are plotted against the quantile of the conditioning variable, with a lowess curve showing the local mean of these points. 2) The absolute value of Z-mean(Z) is also plotted, again with the lowess curve showing the local mean of these points. Any trend in the location or scatter of these variables with the conditioning variable indicates a violation of the model assumption that the residuals Z are indepenendent of the conditioning variable. This can be indicative of the dependence threshold used being too low. 3) The final plots show the original data (on the original scale) and the fitted quantiles (specified by quantiles) of the conditional distribution of each dependent variable given the conditioning variable. A model that fits well will have good agreement between the distribution of the raw data (shown by the scatter plot) and the fitted quantiles. Note that the raw data are a sample from the joint distribution, whereas the quantiles are those of the estimated conditional distribution given the value of the conditioning variable, and while these two distributions should move into the same part of the sample space as the conditioning variable becomes more extreme, they are not the same thing!

The predict method for mex works as follows. The returned object has class "predict.mex". Simulated values of the dependent variables are created, given that the conditioning variable is above its 100pqu quantile. If predict.mex is passed an object object of class "mex" then the simulated values are based only on the point estimate of the dependence model parameters, and the original data. If predict.mex is passed an object object of class "bootmex" then the returned value additionally contains simulated replicate data sets corresponding to the bootstrap model parameter estimates. In both cases, the simulated values based on the original data and point estimates appear in component object$data$simulated. The simulated data from the bootstrap estimates appear in object$replicates.

The plot method for class "predict.mex" displays both the original data and the simulated data generated above the threshold for prediction; it shows the threshold for prediction (vertical line) and also the curve joining equal quantiles of the marginal distributions – this is for reference: variables that are perfectly dependent will lie exactly on this curve. Original data are shown with one plotting character and simulated data with another; colours of simulated point distinguish those points which have the conditioning variable as the largest (on a quantile scale) or not the largest.

The function mexAll fits a collection of GPD and conditional dependence models, the same fitted GPD being used for all of the dependence model fits. This can be used in turn to generate Monte Carlo samples from the entire sample space usign the collected dependence models.

Value

A call to mex returns an list of class mex containing the following three items:

`margins`	An object of class `migpd`.
`dependence`	An object of class `mexDependence`.
`call`	This matches the original function call.

There are plot, summary, coef and predict methods for this class.

A call to predict.mex does the importance sampling for prediction, and returns a list of class "predict.mex" for which there are print and plot methods available. The summary method for this class of object is intended to be used following a call to the predict method, to estimate quantiles or probabilities of threshold excesses for the fitted conditional distributions given the conditioning variable above the threshold for prediction. See examples below.

There are print, summary and plot methods available for the class "predict.mex".

Note

The package texmex is equipped to fit GPD models to the upper marginal tails only, not the lower tails. This is appropriate for extrapolating into the tails of any dependent variable when dependence between this variable and the conditioning variable is positive. In the case of negative dependence between the conditioning variable and any dependent variable, estimation of the conditional distribution of the dependent variable for extreme values of the conditioning variable would naturally visit the lower tail of the dependent variable. Extrapolation beyond the range of the observed lower tail is not supported in the current version of texmex. In cases where negative dependence is observed and extrapolation is required into the lower tail of the dependent variable, the situation is trivially resolved by working with a reflection of the dependent variable (Y becomes -Y and so the upper and lower tails are swapped). Results can be calculated for the reflected variable then reflected back to the correct scale. This is satisfactory when only the pair of variables (the conditioning and single dependent variable) are of interest, but when genuine multivariate (as opposed to simply bivariate) structure is of interest, this approach will destroy the dependence structure between the reflected dependent variable and the remaining dependent variables.

Author(s)

Harry Southworth, Janet E. Heffernan

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical Society B, 66, 497 - 546, 2004

Examples


w <- mex(winter, mqu=.7, dqu=0.7, which="O3")
w
par(mfcol=c(3, 2))
plot(w)

par(mfcol=c(2,2))
p <- predict(w)
summary(p)
summary(p,probs=c(0.01,0.2,0.5,0.8,0.99))
summary(p,probs=0.5,mth=c(40,50,150,25,50))
p
plot(p)


w <- mex(winter, mqu=.7, dqu=0.7, which="O3")
w
par(mfcol=c(3, 2))
plot(w)

par(mfcol=c(2,2))
p <- predict(w)
summary(p)
summary(p,probs=c(0.01,0.2,0.5,0.8,0.99))
summary(p,probs=0.5,mth=c(40,50,150,25,50))
p
plot(p)

Fit multiple independent generalized Pareto models

Description

Fit multiple independent generalized Pareto models as the first step of conditional multivariate extreme values modelling following the approach of Heffernan and Tawn, 2004.

Usage

## S3 method for class 'migpd'
ggplot(
  data,
  mapping = NULL,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...,
  environment
)

migpd(
  data,
  mth,
  mqu,
  penalty = "gaussian",
  maxit = 10000,
  trace = 0,
  verbose = FALSE,
  priorParameters = NULL,
  cov = "observed",
  family = gpd
)

## S3 method for class 'migpd'
plot(
  x,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...
)
## S3 method for class 'migpd'
ggplot(
  data,
  mapping = NULL,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...,
  environment
)

migpd(
  data,
  mth,
  mqu,
  penalty = "gaussian",
  maxit = 10000,
  trace = 0,
  verbose = FALSE,
  priorParameters = NULL,
  cov = "observed",
  family = gpd
)

## S3 method for class 'migpd'
plot(
  x,
  main = c("Probability plot", "Quantile plot", "Return level plot",
    "Histogram and density"),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...
)

Arguments

`data`	A matrix or data.frame, each column of which is to be modelled.
`mapping`, `environment`	Further arguments to ggplot method.
`main`	Character vector of length four: titles for plots produced by `plot` and `ggplot` methods.
`xlab`	As `main` but for x-axes labels.
`nsim`	Number of simulations on which to base tolerance envelopes in `plot` and `ggplot` methods.
`alpha`	Significance level for tolerance and confidence intervals in `plot` and `ggplot` methods.
`...`	Further arguments to be passed to methods.
`mth`	Marginal thresholds. Thresholds above which to fit the models. Only one of `mth` and `mqu` should be supplied. Length one (in which case a common threshold is used) or length equal to the number of columns of `data` (in which case values correspond to thresholds for each of the columns respectively).
`mqu`	Marginal quantiles. Quantiles above which to fit the models. Only one of `mth` and `mqu` should be supplied. Length as for `mth` above.
`penalty`	How the likelihood should be penalized. Defaults to "gaussian". See documentation for `evm`.
`maxit`	The maximum number of iterations to be used by the optimizer.
`trace`	Whether or not to tell the user how the optimizer is getting on. The argument is passed into `optim` – see the help for that function.
`verbose`	Controls whether or not the function prints to screen every time it fits a model. Defaults to FALSE.
`priorParameters`	Only used if `penalty = 'gaussian'`. A named list, each element of which contains two components: the first should be a vector of length 2 corresponding to the location of the Gaussian distribution; the second should be 2x2 matrix corresponding to the covariance matrix of the distribution. The names should match the names of the columns of `data`. If not provided, it defaults to independent priors being centred at zero, with variance 10000 for log(sigma) and 0.25 for xi. See the details section.
`cov`	String, passed through to `evm`: how to estimate the covariance. Defaults to `cov = "observed"`.
`family`	An object of class "texmexFamily". Should be either `family = gpd` or `family = cgpd` and defaults to the first of those.
`x`	Object of class `migpd` as returned by function `migpd`.

Details

The parameters in the generalized Pareto distribution are estimated for each column of the data in turn, independently of all other columns. Note, covariate modelling of GPD parameters is not supported.

Maximum likelihood estimation often fails with generalized Pareto distributions because of the likelihood becoming flat (see, for example, Hosking et al, 1985). Therefore the function allows penalized likelihood estimation, which is the same as maximum a posteriori estimation from a Bayesian point of view.

By default quadratic penalization is used, corresponding to using a Gaussian prior. If no genuine prior information is available, the following argument can be used. If xi = -1, the generalized Pareto distribution corresponds to the uniform distribution, and if xi is 1 or greater, the expectation is infinite. Thefore, xi is likely to fall in the region (-1, 1). A Gaussian distribution centred at zero and with standard deviation 0.5 will have little mass outside of (-1, 1) and so will often be a reasonable prior for xi. For log(sigma) a Gaussian distribution, centred at zero and with standard deviation 100 will often be vague. If a Gaussian penalty is specified but no parameters are given, the function will assume such indpendent priors.

Note that internally the function works with log(sigma), not sigma. The reasons are that quadratic penalization makes more sense for phi=log(sigma) than for sigma (because the distribution of log(sigma) will be more nearly symmetric), and because it was found to stabilize computations.

The associated coef, print and summary functions exponentiate the log(sigma) parameter to return results on the expected scale. If you are accessesing the parameters directly, however, take care to be sure what scale the results are on.

Threshold selection can be carried out with the help of functions mrl and gpdRangeFit.

Value

An object of class "migpd". There are coef, print, plot, ggplot and summary functions available.

Note

You are encourage to use the mqu argument and not mth. If you use mth, the quantiles then need to be estimated. There are, at the time of writing, 9 methods of estimating quantiles build into the quantile function. Tiny differences can cause problems in later stages of the analysis if functions try to simulate in an area that is legitimate according to the numerical value of the threshold, but not according to the estimated quantile.

Author(s)

Harry Southworth

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004

J. R. M. Hosking and J. R. Wallis, Parameter and quantile estimation for the generalized Pareto distribution, Technometrics, 29, 339 – 349, 1987

Examples



mygpd <- migpd(winter, mqu=.7, penalty = "none")
mygpd
summary(mygpd)
plot(mygpd)
g <- ggplot(mygpd)


mygpd <- migpd(winter, mqu=.7, penalty = "none")
mygpd
summary(mygpd)
plot(mygpd)
g <- ggplot(mygpd)

Plotting function for return level estimation

Description

Plotting function for return level estimation

Usage

## S3 method for class 'rl.evmOpt'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ylim = "auto",
  ptcol = "blue",
  col = "light blue",
  fill = "orange",
  alpha = 0.5,
  ...,
  environment
)
## S3 method for class 'rl.evmOpt'
ggplot(
  data = NULL,
  mapping,
  xlab,
  ylab,
  main,
  ylim = "auto",
  ptcol = "blue",
  col = "light blue",
  fill = "orange",
  alpha = 0.5,
  ...,
  environment
)

Arguments

`data`	An object of class `rl.evmOpt`, `rl.evmBoot`, `rl.evmSim`, `lp.evmOpt`, `lp.evmBoot` or `lp.evmSim`, .
`mapping`	Not used.
`xlab`	Label for the x-axis.
`ylab`	Label for the y-axis.
`main`	Plot title.
`ylim`	Plot limits for y-axis.
`ptcol`	Colour for points. Defaults to `ptcol="blue"`.
`col`	Colour for lines. Defaults to `col="light blue"`.
`fill`	Colour for shading polygons.
`alpha`	Transparency.
`...`	Other arguments passed through to underlying plot functions.
`environment`	Not used.

Profile likelihood based confidence intervals for GPD

Description

Calculates profile likelilhood based confidence intervals for a given fitted GPD model – this is only implemented for two parameter GPD with no covariates in the model.

Usage

gpd.prof(
  z,
  m,
  xmax,
  xlow,
  conf = 0.95,
  nint = 50,
  PlotIt = FALSE,
  mult = 2,
  priorParameters = NULL
)
gpd.prof(
  z,
  m,
  xmax,
  xlow,
  conf = 0.95,
  nint = 50,
  PlotIt = FALSE,
  mult = 2,
  priorParameters = NULL
)

Arguments

`z`	a fitted `evmOpt` object
`m`	return period : units are number of observations
`xmax`	point estimate of the return level, this is used to bracket the roots of the equation used to calculate the ends of the profile likelihood based confidence interval. The value need not be exact.
`xlow`	value lower than the lower end of the confidence interval, for bracketing in root finding
`conf`	confidence level, defaults to 0.95
`nint`	used for plotting if required, number of points at which to calculate the profile likelihood for plotting, defaults to 50
`PlotIt`	logical, whether or not to plot the profile likelihood, defaults to `FALSE`
`mult`	used to calculate the starting point for the root finding for solving to find the upper end of the confidence interval. The starting point is `mult*xmax` minus the lower end point. If this starting point is beyond the estimated upper endpoint of the fitted distribution then this can cause an error, and the value of `mult` should be reduced
`priorParameters`	optional, value of prior/penalty parameters used for penalised likelihood estimation, default to NULL

Value

Numeric vector of length two, with lower and upper ends of the estiamted confidence intervals respectively.

Estimate generalized Pareto distribution parameters over a range of values

Description

Estimate generalized Pareto distribution parameters over a range of values, using maximum (penalized) likelihood.

Usage

gpdRangeFit(data, umin=quantile(data, .05), umax=quantile(data, .95),
nint = 10, penalty = "gaussian", priorParameters = NULL, alpha=0.05,
cov="observed")
## S3 method for class 'gpdRangeFit'
print(x, ...)
## S3 method for class 'gpdRangeFit'
summary(object, ...)
## S3 method for class 'summary.gpdRangeFit'
print(x, ...)
## S3 method for class 'gpdRangeFit'
plot(x, xlab = "Threshold", ylab = NULL, main = NULL, addNexcesses=TRUE, ...)
## S3 method for class 'gpdRangeFit'
ggplot(data, mapping, xlab="Threshold", ylab=NULL,
main=NULL, fill="orange", col="blue", addNexcesses = TRUE, textsize=4, ...,
environment)
gpdRangeFit(data, umin=quantile(data, .05), umax=quantile(data, .95),
nint = 10, penalty = "gaussian", priorParameters = NULL, alpha=0.05,
cov="observed")
## S3 method for class 'gpdRangeFit'
print(x, ...)
## S3 method for class 'gpdRangeFit'
summary(object, ...)
## S3 method for class 'summary.gpdRangeFit'
print(x, ...)
## S3 method for class 'gpdRangeFit'
plot(x, xlab = "Threshold", ylab = NULL, main = NULL, addNexcesses=TRUE, ...)
## S3 method for class 'gpdRangeFit'
ggplot(data, mapping, xlab="Threshold", ylab=NULL,
main=NULL, fill="orange", col="blue", addNexcesses = TRUE, textsize=4, ...,
environment)

Arguments

`data`	The data vector to be modelled.
`umin`	The minimum threshold above which to estimate the parameters.
`umax`	The maximum threshold above which to estimate the parameters.
`nint`	The number of thresholds at which to perform the estimation.
`penalty`	The type of penalty to be used in the maximum penalized likelihood estimation. Should be either "gaussian" or "none". Defaults to "gaussian".
`priorParameters`	Parameters to be used for the penalty function. See the help for `evm` for more informaiton.
`alpha`	100(1 - alpha)% confidence intervals will be plotted with the point estimates. Defaults to `alpha = 0.05`.
`cov`	How to compute the covariance matrix of the parameters. Defaults to `cov = "observed"` in which case the observed information matrix is used, if the `info` element of the `texmexFamily` object is present. See more detailed documentation of this argument in `evm`.
`x`, `object`	Arguments to `print` and `summary` functions.
`xlab`	Label for the x-axis.
`ylab`	Label for the y-axis.
`main`	The main title.
`addNexcesses`	Annotate top axis with numbers of threshold excesses arising with the corresponding values of threshold on the bottom axis.
`col`	Colour of the line on the threshold stability plot.
`fill`	Colour of the pointwise confidence region on the threshold stability plots.
`textsize`	Size of text on the plot (ggplot). Defaults to `textsize=4`.
`...`	Arguments to `plot`.
`mapping`, `environment`	Not used.

Details

This is Stuart Coles' gpd.fitrange, as it appears in the ismev package, refactored into a function that does the computations, and method functions. The function uses evm internally and uses the default options for that function.

Note this function does not extend to assessing model fit when there are covariates included in the model.

Author(s)

Stuart Coles, Janet E Heffernan, Harry Southworth

Examples


par(mfrow=c(1,2))
plot(gpdRangeFit(rain))

par(mfrow=c(1,2))
plot(gpdRangeFit(rain))

Joint exceedance curves

Description

Calculate bivariate joint exceedance curves

Usage

JointExceedanceCurve(Sample, ExceedanceProb,...)
## S3 method for class 'jointExcCurve'
print(x,...)

## Default S3 method:
JointExceedanceCurve(Sample, ExceedanceProb, n = 50, x = NULL, ...)

## S3 method for class 'mexMC'
JointExceedanceCurve(
  Sample,
  ExceedanceProb,
  n = 50,
  x = NULL,
  which = 1:2,
  ...
)

## S3 method for class 'predict.mex'
JointExceedanceCurve(
  Sample,
  ExceedanceProb,
  n = 50,
  x = NULL,
  which = 1:2,
  ...
)

calcJointExceedanceCurve(Sample, ExceedanceProb, n = 50, x = NULL)

## S3 method for class 'jointExcCurve'
print(x, ...)

geom_jointExcCurve(x, ...)
JointExceedanceCurve(Sample, ExceedanceProb,...)
## S3 method for class 'jointExcCurve'
print(x,...)

## Default S3 method:
JointExceedanceCurve(Sample, ExceedanceProb, n = 50, x = NULL, ...)

## S3 method for class 'mexMC'
JointExceedanceCurve(
  Sample,
  ExceedanceProb,
  n = 50,
  x = NULL,
  which = 1:2,
  ...
)

## S3 method for class 'predict.mex'
JointExceedanceCurve(
  Sample,
  ExceedanceProb,
  n = 50,
  x = NULL,
  which = 1:2,
  ...
)

calcJointExceedanceCurve(Sample, ExceedanceProb, n = 50, x = NULL)

## S3 method for class 'jointExcCurve'
print(x, ...)

geom_jointExcCurve(x, ...)

Arguments

`Sample`	Monte Carlo (or other) sample from which to calculate joint exceedance curve
`ExceedanceProb`	Takes values between 0 and 1, constant value of joint exceedance probability for which the curve will be calculated
`...`	Further aguments to be passed to methods
`n`	If `x=NULL` then this is HALF the number of points at which the curve will be estimated (ie the curve is calculated at 2n locations)
`x`	If specified by the user, the values of in the first dimension of `Sample` at which to calculate the curve. Defaults to `NULL` otherwise should be a numeric vector within the range of the first dimension of `Sample`.
`which`	Vector length two identifying which margins to use for joint exceedance curve estimation. Can be integer vector, giving column numbers of original data matrix, or character vector identifying variables by name (these must match column names in original data).

Details

Calculates pairs of points (x,y) for which the point exceedance probability P(X>x and Y>y) is constant. This is available only in two dimensions: for higher dimensional data, the bivariate margin will be used and other variables ignored. Takes as input either a two column matrix of observations, output from mexMonteCarlo (in which case samples from all fitted models are used to calculate curves) or output from a call to the predict method for an object of class mex (in which case just the single fitted model is used for estimation, with the importance sample generated in the call to predict being used to calculate the joint exceedance curve).

Value

Returns an object of class jointExcCurve. This is a list of length two, one for each variable for which the curve is calculated. Each item of the list is a vector of coordinate values for the variable in question. Attributes include names and the exceedance probability used to calculate the curve ExceedanceProb.

The curve is calculated by finding pairs of points (x,y) for which the empirical probability P(X>x, Y>y) of both variables exceeding their corresponding value is equal to the specified ExceedanceProb. Note that when this is calculated for an object of class predict.mex (returned by a call to the predict method for an object of class mex) then the exceedance probability is interpreted as the UNCONDITIONAL exceedance probability of the importance sample, ie the probability of sampled values occurring from the original modelled joint distribution, and NOT the conditional distribution used to generate the importance sample.

Estimated curve can be added to a ggplot of the data (and/or importance sample) by using the function geom_jointExcCurve, see examples below.

Examples


 # for data frame of raw data
 Sigma <- matrix(c(1, .5, .5, 1), ncol=2)
 m1 <- rmvnorm(5000,sigma=Sigma)
 m1 <- as.data.frame(m1)
 j1 <- JointExceedanceCurve(m1,0.01)
 j2 <- JointExceedanceCurve(m1,0.005)
 j3 <- JointExceedanceCurve(m1,0.001)
 ggplot(m1,aes(V1,V2)) + geom_point(colour="dark blue",alpha=0.5) +
 geom_jointExcCurve(j1,colour="orange") +
 geom_jointExcCurve(j2,colour="orange") +
 geom_jointExcCurve(j3,colour="orange")

 # using importance sample generated by call to predict for object of class mex
 m <- mex(winter,mqu=0.7,dqu=0.7,which="NO")
 m2 <- predict(m,nsim=5000,pqu=0.999)
 g <- ggplot(m2,plot.=FALSE)
 j4 <- JointExceedanceCurve(m2,0.0005,which=c("NO","NO2"))
 j5 <- JointExceedanceCurve(m2,0.0002,which=c("NO","NO2"))
 j6 <- JointExceedanceCurve(m2,0.0001,which=c("NO","NO2"))
 g[[2]] +
     geom_jointExcCurve(j4,aes(NO,NO2),col="orange") +
     geom_jointExcCurve(j5,aes(NO,NO2),col="orange") +
     geom_jointExcCurve(j6,aes(NO,NO2),col="orange")

# for augmented dataset, generated by MC sampling from collection of fitted H+T models
 m <- mexAll(winter,mqu=0.7,dqu=rep(0.7,5))
 m3 <- mexMonteCarlo(nSample=5000,mexList=m)
 j7 <- JointExceedanceCurve(m3,0.05,which=c("NO","NO2"))
 j8 <- JointExceedanceCurve(m3,0.02,which=c("NO","NO2"))
 j9 <- JointExceedanceCurve(m3,0.01,which=c("NO","NO2"))
 ggplot(as.data.frame(m3$MCsample[,c("NO","NO2")]),aes(NO,NO2)) +
     geom_point(col="light blue",alpha=0.5) +
     geom_jointExcCurve(j7,col="orange") +
     geom_jointExcCurve(j8,col="orange") +
     geom_jointExcCurve(j9,col="orange")

# for data frame of raw data
 Sigma <- matrix(c(1, .5, .5, 1), ncol=2)
 m1 <- rmvnorm(5000,sigma=Sigma)
 m1 <- as.data.frame(m1)
 j1 <- JointExceedanceCurve(m1,0.01)
 j2 <- JointExceedanceCurve(m1,0.005)
 j3 <- JointExceedanceCurve(m1,0.001)
 ggplot(m1,aes(V1,V2)) + geom_point(colour="dark blue",alpha=0.5) +
 geom_jointExcCurve(j1,colour="orange") +
 geom_jointExcCurve(j2,colour="orange") +
 geom_jointExcCurve(j3,colour="orange")

 # using importance sample generated by call to predict for object of class mex
 m <- mex(winter,mqu=0.7,dqu=0.7,which="NO")
 m2 <- predict(m,nsim=5000,pqu=0.999)
 g <- ggplot(m2,plot.=FALSE)
 j4 <- JointExceedanceCurve(m2,0.0005,which=c("NO","NO2"))
 j5 <- JointExceedanceCurve(m2,0.0002,which=c("NO","NO2"))
 j6 <- JointExceedanceCurve(m2,0.0001,which=c("NO","NO2"))
 g[[2]] +
     geom_jointExcCurve(j4,aes(NO,NO2),col="orange") +
     geom_jointExcCurve(j5,aes(NO,NO2),col="orange") +
     geom_jointExcCurve(j6,aes(NO,NO2),col="orange")

# for augmented dataset, generated by MC sampling from collection of fitted H+T models
 m <- mexAll(winter,mqu=0.7,dqu=rep(0.7,5))
 m3 <- mexMonteCarlo(nSample=5000,mexList=m)
 j7 <- JointExceedanceCurve(m3,0.05,which=c("NO","NO2"))
 j8 <- JointExceedanceCurve(m3,0.02,which=c("NO","NO2"))
 j9 <- JointExceedanceCurve(m3,0.01,which=c("NO","NO2"))
 ggplot(as.data.frame(m3$MCsample[,c("NO","NO2")]),aes(NO,NO2)) +
     geom_point(col="light blue",alpha=0.5) +
     geom_jointExcCurve(j7,col="orange") +
     geom_jointExcCurve(j8,col="orange") +
     geom_jointExcCurve(j9,col="orange")

Liver related laboratory data

Description

Liver related laboratory data from a randomized, blind, parallel group clinical trial with 4 doses of a drug.

Usage

data(liver)
data(liver)

Format

A data frame with 606 observations on the following 9 variables.

ALP.B: Alkaline phosphatase at baseline. A numeric vector.
ALT.B: Alanine aminotransferase at baseline. A numeric vector.
AST.B: Aspartate aminotransferase at baseline. A numeric vector.
TBL.B: Total bilirubin at baseline. A numeric vector.
ALP.M: Alkaline phosphatase after treatment. A numeric vector.
ALT.M: Alanine aminotransferase after treatment. A numeric vector.
AST.M: Aspartate aminotransferase after treatment. A numeric vector.
TBL.M: Total bilirubin after treatment. A numeric vector.
dose: The treatment group (i.e. dose group). A factor with levels A B C D

Details

Dose A is the lowest dose, dose, B the next, C the next, and D the highest dose. The baseline values were taken prior to any treatment being received, and the clinical trial had a single post-baseline visit.

Source

AstraZeneca data on file.

Log-likelihood for evmOpt objects

Description

Return the log-likelihood or penalized log-likelihood for evmOpt objects.

Usage

## S3 method for class 'evmOpt'
logLik(object, penalized = FALSE, ...)
## S3 method for class 'evmOpt'
logLik(object, penalized = FALSE, ...)

Arguments

`object`	fit model object
`penalized`	whether to return the penalized log-likelihood
`...`	some methods need more arguments

Value

an object of class logLik

Provide full marginal reference distribution for for maringal transformation

Description

This gives the option of providing a set of reference marginal distributions to use for marginal transformation if the data's own marginal distribution is not appropriate (for instance if only data for which one variable is large is available, the marginal distributions of the other variables will not be represented by the available data). In such situations, the user can supply the full marginal information of the non-thresholded variables which are necessary to transform these variables correctly from the original margins to Gumbel/Laplace for estimation of dependence model parameters.

Usage

makeReferenceMarginalDistribution(x, r, whichNoChange = NULL)
makeReferenceMarginalDistribution(x, r, whichNoChange = NULL)

Arguments

`x`	output from migpd fit to the original data which does not represent at least one marginal distribution
`r`	output from migpd fit to the reference data which does represent the correct marginal distribution of the variable with incomplete representation in `x`
`whichNoChange`	Margins which are not to use the supplied reference distribution `r` have numeric indices, giving column numbers in original dataframe, listed in `whichNoChange`.

Value

An object of class "migpd".

Multivariate conditional Spearman's rho

Description

Compute multivariate conditional Spearman's rho over a range of quantiles.

Usage

MCS(X, p = seq(0.1, 0.9, by = 0.1))

## S3 method for class 'MCS'
plot(x, xlab = "p", ylab = "MCS", ...)

## S3 method for class 'MCS'
ggplot(data, mapping, main = "", ..., environment)

bootMCS(X, p = seq(0.1, 0.9, by = 0.1), R = 100, trace = 10)

## S3 method for class 'bootMCS'
ggplot(data, mapping, main = "", alpha = 0.05, ylim, ..., environment)

## S3 method for class 'bootMCS'
plot(x, xlab = "p", ylab = "MCS", alpha = 0.05, ylim, ...)

## S3 method for class 'bootMCS'
summary(object, alpha = 0.05, ...)

## S3 method for class 'summary.bootMCS'
print(x, ...)
MCS(X, p = seq(0.1, 0.9, by = 0.1))

## S3 method for class 'MCS'
plot(x, xlab = "p", ylab = "MCS", ...)

## S3 method for class 'MCS'
ggplot(data, mapping, main = "", ..., environment)

bootMCS(X, p = seq(0.1, 0.9, by = 0.1), R = 100, trace = 10)

## S3 method for class 'bootMCS'
ggplot(data, mapping, main = "", alpha = 0.05, ylim, ..., environment)

## S3 method for class 'bootMCS'
plot(x, xlab = "p", ylab = "MCS", alpha = 0.05, ylim, ...)

## S3 method for class 'bootMCS'
summary(object, alpha = 0.05, ...)

## S3 method for class 'summary.bootMCS'
print(x, ...)

Arguments

`X`	A matrix of numeric variables.
`p`	The quantiles at which to evaluate.
`x`, `object`	An object of class `MCS` or `bootMCS`.
`xlab`, `ylab`	Axis labels.
`...`	Optional arguments to be passed into methods.
`data`, `mapping`, `main`, `environment`	Arguments to ggplot method.
`R`	The number of bootstrap samples to run. Defaults to `R = 100`.
`trace`	How often to inform the user of progress. Defaults to `trace = 10`.
`alpha`	A 100(1 - alpha)% pointwise confidence interval will be produced. Defaults to `alpha = 0.05`.
`ylim`	Plotting limits for bootstrap plot.

Details

The method is described in detail by Schmid and Schmidt (2007). The main code was written by Yiannis Papastathopoulos, wrappers written by Harry Southworth.

When the result of a call to bootMCS is plotted, simple quantile bootstrap confidence intervals are displayed.

Value

MCS returns an object of class MCS. There are plot and print methods available for this class.

`MCS`	The estimated correlations.
`p`	The quantiles at which the correlations were evaluated at
`call`	The function call used.

bootMCS returns an object of class bootMCS. There are plot and summary methods available for this class.

`replicates`	Bootstrap replicates.
`p`	The quantiles at which the correlations were evaluated at
`R`	Number of bootstrap samples.
`call`	The function call used.

Author(s)

Yiannis Papastathopoulos, Harry Southworth

References

F. Schmid and R. Schmidt, Multivariate conditional versions of Spearman's rho and related measures of tail dependence, Journal of Multivariate Analysis, 98, 1123 – 1140, 2007

Examples


D <- liver[liver$dose == "D",]
plot(D)

Dmcs <- bootMCS(D[, 5:6])
Dmcs
plot(Dmcs)

D <- liver[liver$dose == "D",]
plot(D)

Dmcs <- bootMCS(D[, 5:6])
Dmcs
plot(Dmcs)

Estimate the dependence parameters in a conditional multivariate extreme values model

Description

Estimate the dependence parameters in a conditional multivariate extreme values model using the approach of Heffernan and Tawn, 2004.

Usage

mexDependence(x, which, dqu, margins="laplace",
    constrain=TRUE, v = 10, maxit=1000000, start=c(.01, .01),
    marTransform="mixture", referenceMargin = NULL, nOptim = 1,
    PlotLikDo=FALSE, PlotLikRange=list(a=c(-1,1),b=c(-3,1)),
    PlotLikTitle=NULL)
mexDependence(x, which, dqu, margins="laplace",
    constrain=TRUE, v = 10, maxit=1000000, start=c(.01, .01),
    marTransform="mixture", referenceMargin = NULL, nOptim = 1,
    PlotLikDo=FALSE, PlotLikRange=list(a=c(-1,1),b=c(-3,1)),
    PlotLikTitle=NULL)

Arguments

`x`	An object of class "migpd" as returned by `migpd`.
`which`	The name of the variable on which to condition. This is the name of a column of the data that was passed into `migpd`.
`dqu`	See documentation for this argument in `mex`.
`margins`	The form of margins to which the data are transformed for carrying out dependence estimation. Defaults to "laplace", with the alternative option being "gumbel". The choice of margins has an impact on the interpretation of the fitted dependence parameters. Under Gumbel margins, the estimated parameters a and b describe only positive dependence, while c and d describe negative dependence in this case. For Laplace margins, only parameters a and b are estimated as these capture both positive and negative dependence.
`constrain`	Logical value. Defaults to `constrain=TRUE` although this will subsequently be changed to FALSE if `margins="gumbel"` for which constrained estimation is not implemented. If `margins="laplace"` and `constrain=TRUE` then the dependence parameter space is constrained to allow only combinations of parameters which give the correct stochastic ordering between (positively and negatively) asymptotically dependent variables and variables which are asymptotically independent.
`v`	Scalar. Tuning parameter used to carry out constrained estimation of dependence structure under `constrain=TRUE`. Takes positive values greater than 1; values between 2 and 10 are recommended.
`maxit`	The maximum number of iterations to be used by the optimizer. Defaults to `maxit = 1000000`.
`start`	Optional starting value for dependence estimation. This can be: a vector of length two, with values corresponding to dependence parameters a and b respectively, and in which case `start` is used as a starting value for numerical estimation of each of the dependence models to be estimated; a matrix with two rows corresponding to dependence parameters a and b respectively and number of columns equal to the number of dependence models to be estimated (the ordering of the columns will be as in the original data matrix); or a previously estimated object of class "mex" whose dependence parameter estimates are used as a starting point for estimation. Note that under `constrain=TRUE`, if supplied, `start` must lie within the permitted area of the parameter space.
`marTransform`	Optional form of transformation to be used for probability integral transform of data from original to Gumbel or Laplace margins. Takes values `marTransform="mixture"` (the default) or `marTransform="empirical"`. When `marTransform="mixture"`, the rank transform is used below the corresponding GPD fitting threshold used in `x`, and the fitted gpd tail model is used above this threshold. When `marTransform="empirical"` the rank transform is used for the entire range of each marginal distribution.
`referenceMargin`	Optional set of reference marginal distributions to use for marginal transformation if the data's own marginal distribution is not appropriate (for instance if only data for which one variable is large is available, the marginal distributions of the other variables will not be represented by the available data). This object can be created from a combination of datasets and fitted GPDs using the function `makeReferenceMarginalDistribution`.
`nOptim`	Number of times to run optimiser when estimating dependence model parameters. Defaults to 1. In the case of `nOptim > 1` the first call to the optimiser uses the value `start` as a starting point, while subsequent calls to the optimiser are started at the parameter value to which the previous call converged.
`PlotLikDo`	Logical value: whether or not to plot the profile likelihood surface for dependence model parameters under constrained estimation.
`PlotLikRange`	This is used to specify a region of the parameter space over which to plot the profile log-likelihood surface. List of length 2; each item being a vector of length two corresponding to the plotting ranges for dependence parameters a and b respectively. If this argument is not missing, then `PlotLikDo` is set equal to TRUE.
`PlotLikTitle`	Used only if `PlotLikDo=TRUE`. Character string. Optional title added to the profile log-likelihood surface plot.

Details

Estimates the extremal dependence structure of the data in x. The precise nature of the estimation depends on the value of margins. If margins="laplace" (the default) then dependence parameters a and b are estimated after transformation of the data to Laplace marginal distributions. These parameters can describe both positive and negative dependence. If margins="gumbel" then the parameters a, b, c and d in the dependence structure described by Heffernan and Tawn (2004) are estimated in the following two steps: first, a and b are estimated; then, if a=0 and b is negative, parameters c and d are estimated (this is the case of negative dependence). Otherwise c and d will be fixed at zero (this is the case of positive dependence).

If margins="laplace" then the option of constrained parameter estimation is available by setting argument constrain=TRUE. The default is to constrain the values of the parameters (constrain=TRUE). This constrained estimation ensures validity of the estimated model, and enforces the consistency of the fitted dependence model with the strength of extremal dependence exhibited by the data. More details are given in Keef et al. (2013). The effect of this constraint is to limit the shape of the dependence parameter space so that its boundary is curved rather than following the original box constraints suggested by Heffernan and Tawn (2004). The constraint brings with it some performance issues for the optimiser used to estimate the dependence parameters, in particular sensitivity to choice of starting value which we describe now.

The dependence parameter estimates returned by this function can be particularly sensitive to the choice of starting value used for the optimisation. This is especially true when margins="laplace" and constrain=TRUE, in which case the maximum of the objective function can lie on the edge of the (possibly curved) constrained parameter space. It is therefore up to the user to check that the reported parameter estimates really do correspond to the maximum of the profile lilkelihood surface. This is easily carried out by using the visual diagnostics invoked by setting PlotLikDo=TRUE and adjusting the plotting area by using the argument PlotLikRange to focus on the region containing the surface maximum. See an example below which illustrates the use of this diagnostic.

Value

An object of class mex which is a list containing the following three objects:

`margins`	An object of class `migpd`.
`dependence`	An object of class `mexDependence`.
`call`	This matches the original function call.

Author(s)

Harry Southworth, Janet E. Heffernan

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004.

C. Keef, I. Papastathopoulos and J. A. Tawn. Estimation of the conditional distribution of a multivariate variable given that one of its components is large: Additional constraints for the Heffernan and Tawn model, Journal of Multivariate Analysis, 115, 396 – 404, 2013

Examples


data(winter)
mygpd <- migpd(winter , mqu=.7, penalty="none")
mexDependence(mygpd , which = "NO", dqu=.7)

# focus on 2-d example with parameter estimates on boundary of constrained parameter space:
NO.NO2 <- migpd(winter[,2:3] , mqu=.7, penalty="none")

# starting value gives estimate far from true max:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.01,0.01),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"))

# zoom in on plotting region containing maximum:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.01,0.01),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"),
              PlotLikRange = list(a=c(0,0.8),b=c(-0.2,0.6)))

# try different starting value:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.1,0.1),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"),
              PlotLikRange = list(a=c(0,0.8),b=c(-0.2,0.6)))


data(winter)
mygpd <- migpd(winter , mqu=.7, penalty="none")
mexDependence(mygpd , which = "NO", dqu=.7)

# focus on 2-d example with parameter estimates on boundary of constrained parameter space:
NO.NO2 <- migpd(winter[,2:3] , mqu=.7, penalty="none")

# starting value gives estimate far from true max:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.01,0.01),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"))

# zoom in on plotting region containing maximum:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.01,0.01),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"),
              PlotLikRange = list(a=c(0,0.8),b=c(-0.2,0.6)))

# try different starting value:
mexDependence(NO.NO2, which = "NO",dqu=0.7,start=c(0.1,0.1),
              PlotLikDo=TRUE,PlotLikTitle=c("NO2 | NO"),
              PlotLikRange = list(a=c(0,0.8),b=c(-0.2,0.6)))

Simulation from dependence models

Description

Simulate Monte Carlo sample from a collection of fitted conditional dependence models.

Usage

mexMonteCarlo(nSample,mexList,mult=10)
mexMonteCarlo(nSample,mexList,mult=10)

Arguments

`nSample`	Required sample size.
`mexList`	List of fitted dependence models (returned by `mexAll`).
`mult`	Integer specifying what multiple of the total number of points should be generated for rejection sample

Details

Generates a Monte Carlo sample of the required size from a collection of conditional multivariate extreme values model of Heffernan and Tawn, 2004. For each marginal variable, the model that conditions on that margin is used to simulate values in the part of the sample space for which that margin is the largest of all marignal variables (measured on a quantile scale).

Value

A list with the following components:

`nR`	For each margin, number of original Monte Carlo points replaced by points generated under the corresponding conditional model.
`MCsample`	Matrix contiaining the Monte Carlo sample, dimension `nSample` by dimension of original dataset.
`whichMax`	Vector of indices indicating which variable is largest (on the quantile scale)
`whichMaxAboveThresh`	Logical vector indicating which of the variables identified by `whichMax` are additionally above the corresponding threshold for dependence estimation.

Author(s)

Harry Southworth, Janet E. Heffernan

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004

Examples


  mAll <- mexAll(winter,mqu=0.7,dqu=c(0.7,0.7,0.7,0.7,0.7))
  mexMC <- mexMonteCarlo(5000,mAll)
  pairs(mexMC$MCsample)

mAll <- mexAll(winter,mqu=0.7,dqu=c(0.7,0.7,0.7,0.7,0.7))
  mexMC <- mexMonteCarlo(5000,mAll)
  pairs(mexMC$MCsample)

Estimate dependence parameters in a conditional multivariate extreme values model over a range of thresholds.

Description

Diagnostic tool to aid the choice of threshold to be used for the estimation of the dependence parameters in the conditional multivariate extreme values model of Heffernan and Tawn, 2004.

Usage

mexRangeFit(x, which, quantiles = seq(0.5, 0.9, length = 9),
start=c(.01, .01), R = 10, nPass=3, trace=10, margins = "laplace", constrain
= TRUE, v = 10, referenceMargin=NULL)
mexRangeFit(x, which, quantiles = seq(0.5, 0.9, length = 9),
start=c(.01, .01), R = 10, nPass=3, trace=10, margins = "laplace", constrain
= TRUE, v = 10, referenceMargin=NULL)

Arguments

`x`	An object of class `mex` or `migpd`.
`which`	The variable on which to condition.
`quantiles`	A numeric vector specifying the quantiles of the marginal distribution of the conditioning variable at which to fit the dependence model.
`start`	See documentation for this argument in `mexDependence`.
`R`	The number of bootstrap runs to perform at each threshold. Defaults to `R`=10.
`nPass`	Argument passed to function `bootmex`.
`trace`	Argument passed to function `bootmex`.
`margins`	Argument passed to function `mexDependence`.
`constrain`	Argument passed to function `mexDependence`.
`v`	Argument passed to function `mexDependence`.
`referenceMargin`	Optional set of reference marginal distributions to use for marginal transformation if the data's own marginal distribution is not appropriate (for instance if only data for which one variable is large is available, the marginal distributions of the other variables will not be represented by the available data). This object can be created from a combination of datasets and fitted GPDs using the function `makeReferenceMarginalDistribution`.

Details

Dependence model parameters are estimated using a range of threshold values. The sampling variability of these estimates is characterised using the bootstrap. Point estimates and bootstrap estimates are finally plotted over the range of thresholds. Choice of threshold should be made such that the point estimates at the chosen threshold and beyond are constant, up to sampling variation.

Value

NULL.

Author(s)

Harry Southworth, Janet E. Heffernan

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004

Examples



  w <- migpd(winter, mqu=.7)
  w
  par(mfrow=c(4,2))
  plot(mexRangeFit(w, which=1),main="Winter data, Heffernan and Tawn 2004",cex=0.5)


w <- migpd(winter, mqu=.7)
  w
  par(mfrow=c(4,2))
  plot(mexRangeFit(w, which=1),main="Winter data, Heffernan and Tawn 2004",cex=0.5)

Change values of parameters in a migpd object

Description

Change the values of parameters in a migpd object. You might want to do this after modelling marginal distributions as functions of covariates.

Usage

migpdCoefs(object, which, coefs)
migpdCoefs(object, which, coefs)

Arguments

`object`	An object of class `migpd`.
`which`	Which models in the `migpd` object you want to change.
`coefs`	The coefficients that you want to change to. If `which` has length 1, `coefs` can be a vector of parameters. Otherwise, it should be a list of vectors, and the list should have the same length as `which`

Value

A migpd object. See the help for migpd.

Author(s)

Harry Southworth

Examples


library(MASS)
liver <- liver
liver$ndose <- as.numeric(liver$dose)
d <- data.frame(alt = resid(rlm(log(ALT.M) ~ log(ALT.B) + ndose, data=liver)),
                ast = resid(rlm(log(AST.M) ~ log(AST.B) + ndose, data=liver)),
                alp = resid(rlm(log(ALP.M) ~ log(ALP.B) + ndose, data=liver)),
                tbl = resid(rlm(log(TBL.M) ~ log(TBL.B) + ndose, data=liver)))

Dgpds <- migpd(d[liver$dose == "D", 1:4], mqu=.7)

d$ndose <- liver$ndose
galt <- evm("alt", data=d, qu=.7, xi = ~ ndose)
gast <- evm("ast", data=d, qu=.7, xi = ~ ndose)
galp <- evm("alp", data=d, qu=.7, xi = ~ ndose)

altco <- predict(galt,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]
astco <- predict(gast,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]
alpco <- predict(galp,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]

Dgpd <- migpdCoefs(Dgpds, which=c("alt", "ast", "alp"),
                   coefs=list(altco, astco, alpco))

summary(Dgpd)
summary(Dgpds)


library(MASS)
liver <- liver
liver$ndose <- as.numeric(liver$dose)
d <- data.frame(alt = resid(rlm(log(ALT.M) ~ log(ALT.B) + ndose, data=liver)),
                ast = resid(rlm(log(AST.M) ~ log(AST.B) + ndose, data=liver)),
                alp = resid(rlm(log(ALP.M) ~ log(ALP.B) + ndose, data=liver)),
                tbl = resid(rlm(log(TBL.M) ~ log(TBL.B) + ndose, data=liver)))

Dgpds <- migpd(d[liver$dose == "D", 1:4], mqu=.7)

d$ndose <- liver$ndose
galt <- evm("alt", data=d, qu=.7, xi = ~ ndose)
gast <- evm("ast", data=d, qu=.7, xi = ~ ndose)
galp <- evm("alp", data=d, qu=.7, xi = ~ ndose)

altco <- predict(galt,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]
astco <- predict(gast,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]
alpco <- predict(galp,type="lp",newdata=data.frame(ndose=4))$obj$link[1:2]

Dgpd <- migpdCoefs(Dgpds, which=c("alt", "ast", "alp"),
                   coefs=list(altco, astco, alpco))

summary(Dgpd)
summary(Dgpds)

Mean residual life plot

Description

Calculate mean residual life and plot it to aid the identification of a threshold over which to fit a generalized Pareto distribution

Usage

mrl(data, umin = min(data), umax = max(data) - 0.1, nint = 100,
alpha=.050)
## S3 method for class 'mrl'
print(x, ...)
## S3 method for class 'summary.mrl'
print(x, ...)
## S3 method for class 'mrl'
summary(object, ...)
## S3 method for class 'mrl'
plot(x, xlab="Threshold", ylab="Mean excess", ...)
## S3 method for class 'mrl'
ggplot(data, mapping, xlab = "Threshold",
  ylab = "Mean excess", main=NULL,fill="orange", col="blue",
  rug=TRUE, addNexcesses=TRUE, textsize=4, ..., environment)
mrl(data, umin = min(data), umax = max(data) - 0.1, nint = 100,
alpha=.050)
## S3 method for class 'mrl'
print(x, ...)
## S3 method for class 'summary.mrl'
print(x, ...)
## S3 method for class 'mrl'
summary(object, ...)
## S3 method for class 'mrl'
plot(x, xlab="Threshold", ylab="Mean excess", ...)
## S3 method for class 'mrl'
ggplot(data, mapping, xlab = "Threshold",
  ylab = "Mean excess", main=NULL,fill="orange", col="blue",
  rug=TRUE, addNexcesses=TRUE, textsize=4, ..., environment)

Arguments

`data`	A numeric vector.
`umin`	The minimum value over which to threshold the data.
`umax`	The maximum value over which to threshold the data.
`nint`	The number of points at which to compute the plot.
`alpha`	Used to determine coverage of confidence interval to plot. Defaults to plotting a 95% interval.
`x`, `object`	Arguments to print, summary and plot functions.
`xlab`	Label for the x-axis. Defaults to `xlab="Threshold"`.
`ylab`	Label for the y-axis. Defaults to `ylab="Mean excess"`.
`...`	Optional arguments to `plot`.
`col`	Colour of the line on the MRL plot.
`rug`	Whether to add raw data as a rug along axis of plot.
`fill`	Colour of the pointwise confidence region on the MRL plot.
`main`	Main title.
`addNexcesses`	Whether to annotate the plot with the numbers of excesses over increasing thresholds. Defaults to `addNexcesses=TRUE`.
`textsize`	Size of text on the plot (ggplot). Defaults to `textsize=4`.
`mapping`, `environment`	Not used.

Details

Threshold choice for the fitting of the GPD is guided by the shape of the Mean Residual Life plot. A threshold which is suitably high will have a corresponding mrl plot which is approximately linear in shape above the threshold (up to sampling variation).

Value

A list with two components. data is the original data, mrl is a matrix containing information to produce the mean residual life plot.

Note

The function was originally written by Stuart Coles and appears in the ismev package. This version modified by Harry Southworth to allow more control over the appearance of the plot.

Author(s)

Janet E. Heffernan, Harry Southworth

References

S. Coles, An Introduction to Statistical Modeling of Extreme Values, Springer, 2001

Plot copulas

Description

Plot copulas

Usage

## S3 method for class 'copula'
plot(x, jitter. = FALSE, jitter.factor = 1, ...)
## S3 method for class 'copula'
plot(x, jitter. = FALSE, jitter.factor = 1, ...)

Arguments

`x`	A copula object
`jitter.`	If `jitter=TRUE`, the values are jittered before plotting. Defaults to `jitter. = FALSE`.
`jitter.factor`	How much jittering to use. Defaults to `jitter.factor = 1`.
`...`	Other arguments to pass through to `plot`.

Plots for evmOpt objects

Description

Various plots for evmOpt objects. These differ depending on whether or not there are covariates in the model. If there are no covariates then the diagnostic plots are PP- and QQ-plots, a return level plot (produced by plotrl.evmSim) and a histogram of the data with superimposed density estimate. These are all calculated using the data on the original scale. If there are covariates in the model then the diagnostics consist of PP- and QQ- plots calculated by using the model residuals (which will be standard exponential devaiates under the GPD model and standard Gumbel deviates under the GEV model), and plots of residuals versus fitted model parameters.

Usage

## S3 method for class 'evmOpt'
plot(
  x,
  main = rep(NULL, 4),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...
)
## S3 method for class 'evmOpt'
plot(
  x,
  main = rep(NULL, 4),
  xlab = rep(NULL, 4),
  nsim = 1000,
  alpha = 0.05,
  ...
)

Arguments

`x`	an object of class `evmOpt`
`main`	titles for diagnostic plots. Should be a vector of length 4, with values corresponding to the character strings to appear on the titles of the pp, qq, return level, and density estimate plots respectively.
`xlab`	As for `main` but labels for x-axes rather than titles.
`nsim`	The number of replicates to be simulated to produce the simulated tolerance intervals.
`alpha`	A $100(1 - \alpha)\%$ simulation envelope is produced.
`...`	FIXME

Details

The PP- and QQ-plots show simulated pointwise tolerance intervals. The region is a $100(1 - \alpha)\%$ region based on nsim simulated samples.

Plots for evmSim objects

Description

This function produces diagnostic plots for the Markov chains used to simulate from the posterior distributions for the model parameters. If the chains have converged on the posterior distributions, the trace plots should look like "fat hairy caterpillars" and their cumulative means should converge rapidly. Moreover, the autocorrelation functions should converge quickly to zero.

Usage

## S3 method for class 'evmSim'
plot(
  x,
  which.plots = 1:3,
  chain = 1,
  density.adjust = 2,
  print.seed = FALSE,
  ...
)
## S3 method for class 'evmSim'
plot(
  x,
  which.plots = 1:3,
  chain = 1,
  density.adjust = 2,
  print.seed = FALSE,
  ...
)

Arguments

`x`	an object of class `evmSim`
`which.plots`	Which plots to produce. Option 1 gives kernel density estimates, 2 gives traces of the Markov chains with superimposed cumulative means, 3 gives autocorrelation functions.
`chain`	Which chain to use in the trace and ACF plots. Only used if more than one chain was run. Defaults to `chain = 1`. If you ran multiple chains, you'll want to look at them all.
`density.adjust`	In `plot` method for class `evmSim`. Passed into `density`. Controls the amount of smoothing of the kernel density estimate.
`print.seed`	Whether or not to print the seed used in the simulations, or to annotate the plots with it.
`...`	ignored

Predict return levels from extreme value models, or obtain the linear predictors.

Description

Predict return levels from extreme value models, or obtain the linear predictors.

Usage

## S3 method for class 'lp.evmOpt'
plot(
  x,
  main = NULL,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 1,
  polycol = 15,
  plot. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  full.cov = FALSE,
  ...
)

linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmSim'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmSim'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'lp.evmOpt'
print(x, digits = 3, ...)
## S3 method for class 'lp.evmOpt'
plot(
  x,
  main = NULL,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 1,
  polycol = 15,
  plot. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  full.cov = FALSE,
  ...
)

linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmSim'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmSim'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
predict(
  object,
  M = 1000,
  newdata = NULL,
  type = "return level",
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
linearPredictors(
  object,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'lp.evmOpt'
print(x, digits = 3, ...)

Arguments

`x`	An object of class `lp.evmOpt`, `lp.evmSim` or `lp.evmBoot`, to be passed to methods for these classes.
`main`, `pch`, `ptcol`, `cex`, `linecol`, `cicol`, `polycol`, `plot`, `plot.`	Further arguments to plot methods.
`...`	Further arguments to methods.
`object`	An object of class `evmOpt`, `evmSim` or `evmBoot`.
`M`	The return period: units are number of observations. Defaults to `M = 1000`. If a vector is passed, a list is returned, with items corresponding to the different values of the vector `M`.
`newdata`	The new data that you want to make the prediction for. Defaults in `newdata = NULL` in which case the data used in fitting the model will be used. Column names must match those of the original data matrix used for model fitting.
`type`	For the predict methods, the type of prediction, either "return level" (or "rl") or "link" (or "lp"). Defaults to `type = "return level"`. When a return level is wanted, the user can specify the associated return period via the `M` argument. If `type = "link"` the linear predictor(s) for `phi` and `xi` (or whatever other parameters are in your `texmexFamily` are returned. For the plot methods for simulation based estimation of underlying distributions i.e. objects derived from "evmSim" and "evmBoot" classes, whether to use the sample median `type="median"` or mean `type="mean"` estimate of the parameter.
`se.fit`	Whether or not to return the standard error of the predicted value. Defaults to `se.fit = FALSE` and is not implemented for `predict.evmSim` or `predict.evmBoot`.
`ci.fit`	Whether or not to return a confidence interval for the predicted value. Defaults to `ci.fit = FALSE`. For objects of class `evmOpt`, if set to `TRUE` then the confidence interval is a simple symmetric confidence interval based on the estimated approximate standard error. For the `evmSim` and `evmBoot` methods, the confidence interval represents quantiles of the simulated distribution of the parameters.
`alpha`	If `ci.fit = TRUE`, a 100(1 - alpha)% confidence interval is returned. Defaults to `alpha = 0.050`.
`unique.`	If `unique. = TRUE`, predictions for only the unique values of the linear predictors are returned, rather than for every row of `newdata`. Defaults to `unique. = TRUE`.
`full.cov`	Should the full covariance matrix be returned as part of a `list` object. This is used internally and not intended for direct use. Defaults to `full.cov = FALSE`
`all`	For the `evmSim` and `evmBoot` methods, if `all = TRUE`, the predictions are returned for every simulated parameter vector. Otherwise, only a summary of the posterior/bootstrap distribution is returned. Defaults to `all = FALSE`.
`sumfun`	For the `evmSim` and `evmBoot` methods, a summary function can be passed in. If `sumfun = FALSE`, the default, the summary function used returns the estimated mean and median, and quantiles implied by `alpha`.
`digits`	Number of digits to show when printing objects.

Details

By default, return levels predicted from the unique values of the linear predictors are returned. For the evmBoot method, estimates of confidence intervals are simply quantiles of the bootstrap sample. The evmBoot method is just a wrapper for the evmSim method.

Value

A list with two entries: the first being the call and the second being a further list with one entry for each value of M.

Note

At present, the confidence intervals returned for an object of class evmOpt are simple confidence intervals based on assumptions of normality that are likely to be far from the truth in many cases. A better approach would be to use profile likelihood, and we intend to implement this method at a future date. Alternatively, the credible intervals returned by using Bayesian estimation and the predict method for class "evmSim" will tend to give a better representation of the asymmetry of the estimated intervals around the parameter point estimates.

Author(s)

Harry Southworth and Janet E. Heffernan

Return levels

Description

Computation of return levels and confidence intervals for extreme value models.

Usage

## S3 method for class 'rl.evmOpt'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

## S3 method for class 'rl.evmSim'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

## S3 method for class 'rl.evmBoot'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmSim'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'rl.evmOpt'
print(x, digits = 3, ...)
## S3 method for class 'rl.evmOpt'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

## S3 method for class 'rl.evmSim'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

## S3 method for class 'rl.evmBoot'
plot(
  x,
  xlab,
  ylab,
  main,
  pch = 1,
  ptcol = 2,
  cex = 0.75,
  linecol = 4,
  cicol = 0,
  polycol = 15,
  smooth = FALSE,
  sameAxes = TRUE,
  type = "median",
  ylim = NULL,
  plot. = TRUE,
  ...
)

rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmOpt'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  ...
)

## S3 method for class 'evmSim'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'evmBoot'
rl(
  object,
  M = 1000,
  newdata = NULL,
  se.fit = FALSE,
  ci.fit = FALSE,
  alpha = 0.05,
  unique. = TRUE,
  all = FALSE,
  sumfun = NULL,
  ...
)

## S3 method for class 'rl.evmOpt'
print(x, digits = 3, ...)

Arguments

`x`	Object passed to plot and print methods.
`xlab`, `ylab`, `main`, `pch`, `ptcol`, `cex`, `linecol`, `cicol`, `polycol`, `smooth`, `sameAxes`, `ylim`	Further arguments to plot methods.
`type`	For calls to plot methods for objects of class `rl.evmSim` or `rl.evmBoot`, specifies whether to use the sample mean (`type="mean"`) or median (`type="median"`) estimate of the return levels.
`plot.`	Parameter for plot method, whether to produce plots.
`...`	Further arguments to be passed to methods.
`object`	An object of class `evmOpt`, `evmSim` or `evmBoot`.
`M`	The M-observation return level is computed by the function. Defaults to `M = 1000`.
`newdata`	Data from which to calculate the return level. If not provided, the original data used to fit the model is used. Column names must match those of original data matrix used for model fitting.
`se.fit`	Whether or not to return the standard error of the predicted value. Defaults to `se.fit = FALSE`.
`ci.fit`	Whether or not to return a confidence interval for the predicted value. Defaults to `ci.fit = FALSE`. For objects of class `evmOpt`, if set to `TRUE` then the confidence interval is a simple symmetric confidence interval based on the estimated approximate standard error. For the `evmSim` and `evmBoot` methods, the confidence interval represents quantiles of the simulated distribution of the parameters.
`alpha`	If `ci.fit = TRUE`, a 100(1 - alpha)% confidence interval is returned. Defaults to `alpha = 0.050`.
`unique.`	If `unique. = TRUE`, predictions for only the unique values of the linear predictors are returned, rather than for every row of the original dataframe or of `newdata` if this latter is specified. Defaults to `unique. = TRUE`.
`all`	For the `evmSim` and `evmBoot` methods, if `all = TRUE`, the predictions are returned for every simulated parameter vector. Otherwise, only a summary of the posterior/bootstrap distribution is returned. Defaults to `all = FALSE`.
`sumfun`	For the `evmSim` and `evmBoot` methods, a summary function can be passed in. If `sumfun = FALSE`, the default, the summary function used returns the estimated mean and median, and quantiles implied by `alpha`.
`digits`	Number of digits to show when printing output.

Details

The M-observation return level is defined as the value that is expected to be exceeded only once every M observations. Thus, it is an estimate of a high quantile of the fitted distribution.

In models fit by the evm family of functions with family=gpd, only a fraction of the data is actually included in the model; the fitted GPD is a conditional model, conditioning on the threshold having been exceeded. This consideration is taken into account by rl which calculates unconditional return levels from the entire distribution of observations above and below the GPD fitting threshold.

Examples

mod <- evm(rain, qu=.8) # daily rainfall observations
rl(mod, M=100*365) # 100-year return level
mod <- evm(rain, qu=.8) # daily rainfall observations
rl(mod, M=100*365) # 100-year return level

Print evmOpt objects

Description

Print evmOpt objects

Usage

## S3 method for class 'evmOpt'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'evmOpt'
print(x, digits = max(3, getOption("digits") - 3), ...)

Arguments

`x`	a fit evmOpt object
`digits`	number of digits used for printing
`...`	further arguments passed to `format`

Rain, wavesurge, portpirie and nidd datasets.

Description

Rainfall, wave-surge, Port Pirie and River Nidd data sets.

Format

The format of the rain data is: num [1:17531] 0 2.3 1.3 6.9 4.6 0 1 1.5 1.8 1.8 ...

The wave-surge data is bivariate and is used for testing functions in texmex.

The Port Pirie data has two columns: 'Year' and 'SeaLevel'.

The River Nidd data represents 154 measurements of the level of the River Nidd at Hunsingore Weir (Yorkshire, UK) between 1934 and 1969. Each measurement breaches the threshold of $65 m^3/2$. Various authors have analysed this dataset, as described by Papastathopoulos and Tawn~egp, there being some apparent difficulty in identifying a threshold above which GPD models are suitable.

Details

The rain, wave-surge and Port Pirie datasets are used by Coles and appear in the ismev package. The River Nidd data appear in the evir package.

Source

Copied from the ismev package and the evir package

References

S. Coles, An Introduction to Statistical Modeling of Extreme Values, Springer, 2001

I. Papastathopoulos and J. A. Tawn, Extended Generalised Pareto Models for Tail Estimation, Journal of Statistical Planning and Inference, 143, 134 – 143, 2011

Extreme Value random process generation.

Description

Extreme Value random process generation.

Usage

rFrechet(n)
rFrechet(n)

Arguments

`n`	Number of samples to generate.

Details

Generation of samples from unit Frechet processes.

Examples

rFrechet(1000)
rFrechet(1000)

Generalized logistic distribution

Description

Density, distribution and quantile functions, and random number generation for the Generalized logistic distribution

Usage

rglo(n, mu, sigma, xi)

dglo(x, mu, sigma, xi, log.d = FALSE)

pglo(q, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

qglo(p, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)
rglo(n, mu, sigma, xi)

dglo(x, mu, sigma, xi, log.d = FALSE)

pglo(q, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

qglo(p, mu, sigma, xi, lower.tail = TRUE, log.p = FALSE)

Arguments

`n`	Number of random numbers to generate.
`mu`	Location parameter.
`sigma`	Scale parameter.
`xi`	Shape parameter.
`x`, `q`, `p`	Value, quantile or probability respectively.
`log.d`, `log.p`	Whether to work on the log scale.
`lower.tail`	Whether to return the lower tail.

Extreme Value random process generation.

Description

Extreme Value random process generation.

Usage

rMaxAR(n,theta)
rMaxAR(n,theta)

Arguments

`n`	Number of samples to generate.
`theta`	Parameter of the MAX AR process, takes values between 0 and 1.

Details

Generation of samples from Max AR(theta) processes.

Examples

rMaxAR(1000,0.2)
rMaxAR(1000,0.2)

Simulate from a fitted evm object

Description

Simulate random numbers from a fitted evm object

Usage

## S3 method for class 'evmOpt'
simulate(object, nsim = 1, seed = NULL, param = NULL, ...)

## S3 method for class 'evmSim'
simulate(object, nsim = 1, seed = NULL, ...)

## S3 method for class 'evmBoot'
simulate(object, nsim = 1, seed = NULL, ...)
## S3 method for class 'evmOpt'
simulate(object, nsim = 1, seed = NULL, param = NULL, ...)

## S3 method for class 'evmSim'
simulate(object, nsim = 1, seed = NULL, ...)

## S3 method for class 'evmBoot'
simulate(object, nsim = 1, seed = NULL, ...)

Arguments

`object`	A fitted evm object having class 'evmOpt', 'evmSim' or 'evmBoot'.
`nsim`	The number of simulations to perform. Defaults to `nsim=1`. A single simulation involves simulating a new set of responses from the data that was provided to `evm` (after thresholding if thresholding is performed.)
`seed`	An integer to be passed to `set.seed`. Defaults to `seed=NULL`.
`param`	Parameters to use in the random number generator. Defaults to `param=NULL` in which case the parameters from the fitted model are used. For `simulate.evmSim` and `simulate.evmBoot`, this argument is not available and the simulated parameters or replicates are used.
`...`	Unused.

Details

For simulate.evmSim and simulate.evmBoot, the parameters from the Markov chains or bootstrap replicates are randomly permuted prior to each set of simulated responses being computed. In this way, reusing the same set of values is avoided.

Value

If nsim=1, a vector or random numbers simulated from the fitted model object. If nsim > 1, a matrix with each column being a set of simulated responses.

Author(s)

Paul Metcalfe, Harry Southworth

Examples


mod <- evm(rain, qu=.95)
hist(simulate(mod, 100))

mod <- evm(rain, qu=.95)
hist(simulate(mod, 100))

Air pollution data, separately for summer and winter months

Description

Air pollution data from Leeds (U.K.) city centre, collected from 1994 to 1998. The summer data set corresponds to the months of April to July inclusive. The winter data set corresponds to the months of November to February inclusive. Some outliers have been removed, as discussed by Heffernan and Tawn, 2004.

Format

Data frames with 578 (summer) and 532 (winter) observations on the following 5 variables.

O3: Daily maximum ozone in parts per billion.
NO2: Daily maximum NO2 in parts per billion.
NO: Daily maximum NO in parts per billion.
SO2: Daily maximum SO2 in parts per billion.
PM10: Daily maximum PM10 in micrograms/metre^3

Source

Provided as online supplementary material to Heffernan and Tawn, 2004:

http://www.blackwellpublishing.com/rss/Readmefiles/heffernan.htm

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical society B, 66, 497 – 546, 2004

Examples


data(summer)
data(winter)

data(summer)
data(winter)

Create families of distributions

Description

Create families of distributions for use with extreme value modelling.

Usage

texmexFamily(name, log.lik, param, info = NULL, sandwich =
    NULL, start = NULL, resid = NULL, rl, delta, endpoint, density,
    rng, prob, quant)
    ## S3 method for class 'texmexFamily'
print(x,...)
    ## S3 method for class 'texmexFamily'
summary(object,...)
    ## S3 method for class 'summary.texmexFamily'
print(x,...)
texmexFamily(name, log.lik, param, info = NULL, sandwich =
    NULL, start = NULL, resid = NULL, rl, delta, endpoint, density,
    rng, prob, quant)
    ## S3 method for class 'texmexFamily'
print(x,...)
    ## S3 method for class 'texmexFamily'
summary(object,...)
    ## S3 method for class 'summary.texmexFamily'
print(x,...)

Arguments

`name`	The name of the distribution.
`log.lik`	The distribution's log-likelihood function.
`param`	The names of the parameters in the model.
`info`	Function to compute the information matrix. If not provided, the modelling functions will work with a numerical approximation.
`sandwich`	Function to compute the filling in the Huber sandwich estimator of the covariance matrix of parameter estimates, used for dependent data. Only implemented in family `gpd`.
`start`	Function to compute starting parameters for the model. If not provided, the modelling functions will try to guess.
`resid`	Function to compute residuals for the model.
`rl`	Function to compute return levels.
`delta`	Function to compute adjustments for covariance for return levels.
`endpoint`	Function to compute the upper or lower endpoint of the fitted distribution.
`density`	Function to compute the density.
`rng`	Function for random number generation.
`prob`	Function to compute cumulative probabilities.
`quant`	Function to compute quantiles.
`...`	Additional arguments to the print and summary methods.
`x`, `object`	An object of class 'texmexFamily'.

Details

The density, rng, prob and quant functions can be simple wrappers for the usual d, r, p and q functions. They should take a matrix with number of columns equal to the number of parameters, and a fitted model object even if the model object is not used by the function.

Examples of "texmexFamily" objects are gpd, gev, glo, gpdIntCensored, weibull, gumbel and egp3. Take a look at those objects to see how the functions should be constructed.

The functions are used by the modelling functions to create diagnostic plots, predictions, etc..

Value

A object of class "texmexFamily", which is essentially a list containing the input arguments. If info, sandwich, start, resid are not provided, they default to NULL.

Note

The gpd, gev, weibull, generalised logistic (glo), gumbel, gpdIntCensored and egp3 families are provided. The evm function defaults to using the gpd family.

Author(s)

Harry Southworth

Process Metropolis output from extreme value model fitting to discard unwanted observations.

Description

Process observations from Metropolis fitting of extreme value models, to thin the output and discard observations from burn-in period.

Usage

## S3 method for class 'evmSim'
thinAndBurn(object, burn, thin)
## S3 method for class 'evmSim'
thinAndBurn(object, burn, thin)

Arguments

`object`	Object of class 'evmSim' as returned by `evm` called with `method="simulate"`.
`burn`	The number of observations from the simulated Markov Chain to be discarded as burn-in. Must be a non-negative integer, for no burn-in use `burn=0`.
`thin`	`thin` or its reciprocal must be a positive integer. If integer valued, this specifies the frequency of observations from the simulated Markov Chain which will be retained. If specified as a proportion, this is the proportion of values which will be retained. For no thinning use `thin=1`.

Value

Object of class evmSim. See Value returned by evm using method = "simulate" for details.

Note that the original chain is not discarded when this function is called: thinAndBurn can be called recursively on the original object with different values of burn and thin without the object getting progressively smaller!

Author(s)

Harry Southworth, Janet E. Heffernan

Examples


  x <- rnorm(1000)
  # For the values of burn and thin below, we should do many more iterations.
  # The number of iterations is kept low here due to the run time allowed
  # by CRAN.
  mod <- evm(x, qu=.7, method="sim", iter=11000)
  mod
  par(mfrow=c(3, 2))
  plot(mod)
  mod1 <- thinAndBurn(mod,burn=1000, thin=5)
  plot(mod1)

x <- rnorm(1000)
  # For the values of burn and thin below, we should do many more iterations.
  # The number of iterations is kept low here due to the run time allowed
  # by CRAN.
  mod <- evm(x, qu=.7, method="sim", iter=11000)
  mod
  par(mfrow=c(3, 2))
  plot(mod)
  mod1 <- thinAndBurn(mod,burn=1000, thin=5)
  plot(mod1)

Package 'texmex'

Help Index

Extreme value modelling

Description

Details

Author(s)

References

Examples

Accurately compute (exp(x) - 1) / x

Description

Usage

Arguments

Value

Accurately compute log(1-exp(x))

Description

Usage

Arguments

Value

Accurately compute log(1 + x) / x

Description

Usage

Arguments

Value

Compute pmax(x y, -1) in such a way that zeros in x beat infinities in y.

Description

Usage

Arguments

Value

Annotate a threshold selection ggplot

Description

Usage

Arguments

Information Criteria

Description

Usage

Arguments

Details

Value

See Also

Bootstrap a conditional multivariate extreme values model

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Examples

Measures of extremal dependence

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Examples

Calculate the copula of a matrix of variables

Description

Usage

Arguments

Details

Value

Methods (by class)

Author(s)

See Also

Examples

Cross-validation for a model object

Description

Usage

Arguments

Details

See Also

Cross-validation for the shape parameter in an extreme values model

Description

Usage

Arguments