Title: | Statistical Tolerance Intervals and Regions |
---|---|
Description: | Statistical tolerance limits provide the limits between which we can expect to find a specified proportion of a sampled population with a given level of confidence. This package provides functions for estimating tolerance limits (intervals) for various univariate distributions (binomial, Cauchy, discrete Pareto, exponential, two-parameter exponential, extreme value, hypergeometric, Laplace, logistic, negative binomial, negative hypergeometric, normal, Pareto, Poisson-Lindley, Poisson, uniform, and Zipf-Mandelbrot), Bayesian normal tolerance limits, multivariate normal tolerance regions, nonparametric tolerance intervals, tolerance bands for regression settings (linear regression, nonlinear regression, nonparametric regression, and multivariate regression), and analysis of variance tolerance intervals. Visualizations are also available for most of these settings. |
Authors: | Derek S. Young [aut, cre] |
Maintainer: | Derek S. Young <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.0.0 |
Built: | 2025-02-12 03:10:35 UTC |
Source: | https://github.com/dsy109/tolerance |
Statistical tolerance limits provide the limits between which we can expect to find a specified proportion of a sampled population with a given level of confidence. This package provides functions for estimating tolerance limits (intervals) for various univariate distributions (binomial, Cauchy, discrete Pareto, exponential, two-parameter exponential, extreme value, hypergeometric, Laplace, logistic, negative binomial, negative hypergeometric, normal, Pareto, Poisson-Lindley, Poisson, uniform, and Zipf-Mandelbrot), Bayesian normal tolerance limits, multivariate normal tolerance regions, nonparametric tolerance intervals, tolerance bands for regression settings (linear regression, nonlinear regression, nonparametric regression, and multivariate regression), and analysis of variance tolerance intervals. Visualizations are also available for most of these settings.
Package: | tolerance |
Type: | Package |
Version: | 2.0.0 |
Date: | 2020-02-04 |
Imports: | MASS, rgl, stats4 |
License: | GPL (>= 2) |
Derek S. Young, Ph.D.
Maintainer: Derek S. Young <[email protected]>
Hahn, G. J. and Meeker, W. Q. (1991), Statistical Intervals: A Guide for Practitioners, Wiley-Interscience.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Patel, J. K. (1986), Tolerance Intervals - A Review, Communications in Statistics - Theory and Methodology, 15, 2719–2762.
Young, D. S. (2010), tolerance:
An R
Package for Estimating Tolerance Intervals, Journal of Statistical Software, 36(5), 1–39.
Young, D. S. (2014), Computing Tolerance Intervals and Regions in R
. In M. B. Rao and C. R. Rao (eds.), Handbook of Statistics, Volume 32: Computational Statistics with R
, 309–338. North-Holland, Amsterdam.
Provides an upper bound on the number of acceptable rejects or nonconformities in a process. This is similar to a 1-sided upper tolerance bound for a hypergeometric random variable.
acc.samp(n, N, alpha = 0.05, P = 0.99, AQL = 0.01, RQL = 0.02)
acc.samp(n, N, alpha = 0.05, P = 0.99, AQL = 0.01, RQL = 0.02)
n |
The sample size to be drawn from the inventory. |
N |
The total inventory (or lot) size. |
alpha |
|
P |
The proportion of items in the inventory which are to be accountable. |
AQL |
The acceptable quality level, which is the largest proportion of defects in a process considered
acceptable. Note that |
RQL |
The rejectable quality level, which is the largest proportion of defects in an independent lot
that one is willing to tolerate. Note that |
acc.samp
returns a matrix with the following quantities:
acceptance.limit |
The number of items in the sample which may be unaccountable, yet still be able to
attain the desired confidence level |
lot.size |
The total inventory (or lot) size |
confidence |
The confidence level |
P |
The proportion of accountable items specified by the user. |
AQL |
The acceptable quality level as specified by the user. If the sampling were to be repeated numerous times as a process, then
this quantity specifies the proportion of missing items considered acceptable from the process as a whole. Conditioning on the
calculated value for |
RQL |
The rejectable quality level as specified by the user. This is the proportion of individual items in a sample one is willing
to tolerate missing. Conditioning on the calculated value for |
sample.size |
The sample size drawn as specified by |
prod.risk |
The producer's risk at the specified |
cons.risk |
The consumer's risk at the specified |
Montgomery, D. C. (2005), Introduction to Statistical Quality Control, Fifth Edition, John Wiley & Sons, Inc.
## A 90%/90% acceptance sampling plan for a sample of 450 ## drawn from a lot size of 960. acc.samp(n = 450, N = 960, alpha = 0.10, P = 0.90, AQL = 0.07, RQL = 0.10)
## A 90%/90% acceptance sampling plan for a sample of 450 ## drawn from a lot size of 960. acc.samp(n = 450, N = 960, alpha = 0.10, P = 0.90, AQL = 0.07, RQL = 0.10)
Tolerance intervals for each factor level in a balanced (or nearly-balanced) ANOVA.
anovatol.int(lm.out, data, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
anovatol.int(lm.out, data, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
lm.out |
An object of class |
data |
A data frame consisting of the data fitted in |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
anovatol.int
returns a list where each element is a data frame corresponding to each main effect (i.e.,
factor) tested in the ANOVA and the rows of each data frame are the levels of that factor. The columns of each data
frame report the following:
mean |
The mean for that factor level. |
n |
The effective sample size for that factor level. |
k |
The k-factor for constructing the respective factor level's tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Howe, W. G. (1969), Two-Sided Tolerance Limits for Normal Populations - Some Improvements, Journal of the American Statistical Association, 64, 610–620.
Weissberg, A. and Beatty, G. (1969), Tables of Tolerance Limit Factors for Normal Distributions, Technometrics, 2, 483–500.
K.factor
, normtol.int
, lm
, anova
## 90%/95% 2-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. attach(warpbreaks) lm.out <- lm(breaks ~ wool + tension) out <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 2, method = "HE") out plottol(out, x = warpbreaks)
## 90%/95% 2-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. attach(warpbreaks) lm.out <- lm(breaks ~ wool + tension) out <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 2, method = "HE") out plottol(out, x = warpbreaks)
Provides 1-sided or 2-sided Bayesian tolerance intervals under the conjugate prior for data distributed according to a normal distribution.
bayesnormtol.int(x = NULL, norm.stats = list(x.bar = NA, s = NA, n = NA), alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, hyper.par = list(mu.0 = NULL, sig2.0 = NULL, m.0 = NULL, n.0 = NULL))
bayesnormtol.int(x = NULL, norm.stats = list(x.bar = NA, s = NA, n = NA), alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, hyper.par = list(mu.0 = NULL, sig2.0 = NULL, m.0 = NULL, n.0 = NULL))
x |
A vector of data which is distributed according to a normal distribution. |
norm.stats |
An optional list of statistics that can be provided in-lieu of the full dataset. If provided, the user must specify all three components: the sample mean ( |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
hyper.par |
A list consisting of the hyperparameters for the conjugate prior: the hyperparameters for the mean ( |
Note that if one considers the non-informative prior distribution, then the Bayesian tolerance intervals are the same as the classical solution, which can be obtained by using normtol.int
.
bayesnormtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
x.bar |
The sample mean. |
1-sided.lower |
The 1-sided lower Bayesian tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper Bayesian tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower Bayesian tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper Bayesian tolerance bound. This is given only if |
Aitchison, J. (1964), Bayesian Tolerance Regions, Journal of the Royal Statistical Society, Series B, 26, 161–175.
Guttman, I. (1970), Statistical Tolerance Regions: Classical and Bayesian, Charles Griffin and Company.
Young, D. S., Gordon, C. M., Zhu, S., and Olin, B. D. (2016), Sample Size Determination Strategies for Normal Tolerance Intervals Using Historical Data, Quality Engineering, 28, 337–351.
## 95%/85% 2-sided Bayesian normal tolerance limits for ## a sample of size 100. set.seed(100) x <- rnorm(100) out <- bayesnormtol.int(x = x, alpha = 0.05, P = 0.85, side = 2, method = "EXACT", hyper.par = list(mu.0 = 0, sig2.0 = 1, n.0 = 10, m.0 = 10)) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "Normal Data")
## 95%/85% 2-sided Bayesian normal tolerance limits for ## a sample of size 100. set.seed(100) x <- rnorm(100) out <- bayesnormtol.int(x = x, alpha = 0.05, P = 0.85, side = 2, method = "EXACT", hyper.par = list(mu.0 = 0, sig2.0 = 1, n.0 = 10, m.0 = 10)) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "Normal Data")
Provides 1-sided or 2-sided tolerance intervals for binomial random variables. From a statistical quality control perspective, these limits use the proportion of defective (or acceptable) items in a sample to bound the number of defective (or acceptable) items in future productions of a specified quantity.
bintol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("LS", "WS", "AC", "JF", "CP", "AS", "LO", "PR", "PO", "CL", "CC", "CWS"), a1 = 0.5, a2 = 0.5)
bintol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("LS", "WS", "AC", "JF", "CP", "AS", "LO", "PR", "PO", "CL", "CC", "CWS"), a1 = 0.5, a2 = 0.5)
x |
The number of defective (or acceptable) units in the sample. Can be a vector of length |
n |
The size of the random sample of units selected for inspection. |
m |
The quantity produced in future groups. If |
alpha |
The level chosen such that |
P |
The proportion of the defective (or acceptable) units in future samples of size |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the lower and upper confidence bounds, which are used in the calculation
of the tolerance bounds. The default method is |
a1 |
This specifies the first shape hyperparameter when using Jeffreys' method. |
a2 |
This specifies the second shape hyperparameter when using Jeffreys' method. |
bintol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of defective (or acceptable) units in future samples of size |
p.hat |
The proportion of defective (or acceptable) units in the sample, calculated by |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Brown, L. D., Cai, T. T., and DasGupta, A. (2001), Interval Estimation for a Binomial Proportion, Statistical Science, 16, 101–133.
Hahn, G. J. and Chandra, R. (1981), Tolerance Intervals for Poisson and Binomial Variables, Journal of Quality Technology, 13, 100–110.
Newcombe, R. G. (1998), Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods, Statistics in Medicine, 17, 857–872.
## 85%/90% 2-sided binomial tolerance intervals for a future ## lot of 2500 when a sample of 230 were drawn from a lot of ## 1000. All methods but Jeffreys' method are compared ## below. bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "LS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "WS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "AC") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CP") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "AS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "LO") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "PR") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "PO") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CL") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CC") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CWS") ## Using Jeffreys' method to construct the 85%/90% 1-sided ## binomial tolerance limits. The first calculation assumes ## a prior on the proportion of defects which places greater ## density on values near 0. The second calculation assumes ## a prior on the proportion of defects which places greater ## density on values near 1. bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 1, method = "JF", a1 = 2, a2 = 10) bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 1, method = "JF", a1 = 5, a2 = 1)
## 85%/90% 2-sided binomial tolerance intervals for a future ## lot of 2500 when a sample of 230 were drawn from a lot of ## 1000. All methods but Jeffreys' method are compared ## below. bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "LS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "WS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "AC") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CP") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "AS") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "LO") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "PR") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "PO") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CL") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CC") bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 2, method = "CWS") ## Using Jeffreys' method to construct the 85%/90% 1-sided ## binomial tolerance limits. The first calculation assumes ## a prior on the proportion of defects which places greater ## density on values near 0. The second calculation assumes ## a prior on the proportion of defects which places greater ## density on values near 1. bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 1, method = "JF", a1 = 2, a2 = 10) bintol.int(x = 230, n = 1000, m = 2500, alpha = 0.15, P = 0.90, side = 1, method = "JF", a1 = 5, a2 = 1)
This function allows the user to control what proportion of the population is to be in the tails of the given distribution for a 2-sided tolerance interval. The result is a conservative approximation based on Bonferroni's inequality.
bonftol.int(fn, P1 = 0.005, P2 = 0.005, alpha = 0.05, ...)
bonftol.int(fn, P1 = 0.005, P2 = 0.005, alpha = 0.05, ...)
fn |
The function name for the 2-sided tolerance interval to be calculated. |
P1 |
The proportion of the population not covered in the lower tail of the distribution. |
P2 |
The proportion of the population not covered in the upper tail of the distribution. |
alpha |
The level chosen such that |
... |
Additional arguments passed to |
The results for the 2-sided tolerance interval procedure are reported. See the corresponding help file for fn
about
specific output. Note that the (minimum) proportion of the population to be covered by this interval is 1 - (P1 + P2)
.
This function can be used with any 2-sided tolerance interval function, including the regression tolerance interval functions.
Jensen, W. A. (2009), Approximations of Tolerance Intervals for Normally Distributed Data, Quality and Reliability Engineering International, 25, 571–580.
Patel, J. K. (1986), Tolerance Intervals - A Review, Communications in Statistics - Theory and Methodology, 15, 2719–2762.
## 95%/97% tolerance interval for normally distributed ## data controlling 1% of the data is in the lower tail ## and 2% of the data in the upper tail. set.seed(100) x <- rnorm(100, 0, 0.2) bonftol.int(normtol.int, x = x, P1 = 0.01, P2 = 0.02, alpha = 0.05, method = "HE")
## 95%/97% tolerance interval for normally distributed ## data controlling 1% of the data is in the lower tail ## and 2% of the data in the upper tail. set.seed(100) x <- rnorm(100, 0, 0.2) bonftol.int(normtol.int, x = x, P1 = 0.01, P2 = 0.02, alpha = 0.05, method = "HE")
Provides 1-sided or 2-sided tolerance intervals for Cauchy distributed data.
cautol.int(x, alpha = 0.05, P = 0.99, side = 1)
cautol.int(x, alpha = 0.05, P = 0.99, side = 1)
x |
A vector of data which is Cauchy distributed. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
cautol.int
returns a data.frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Bain, L. J. (1978), Statistical Analysis of Reliability and Life-Testing Models, Marcel Dekker, Inc.
## 95%/90% 2-sided Cauchy tolerance interval for a sample ## of size 1000. set.seed(100) x <- rcauchy(1000, 100000, 10) out <- cautol.int(x = x, alpha = 0.05, P = 0.90, side = 2) out plottol(out, x, plot.type = "both", x.lab = "Cauchy Data")
## 95%/90% 2-sided Cauchy tolerance interval for a sample ## of size 1000. set.seed(100) x <- rcauchy(1000, 100000, 10) out <- cautol.int(x = x, alpha = 0.05, P = 0.90, side = 2) out plottol(out, x, plot.type = "both", x.lab = "Cauchy Data")
Provides 1-sided tolerance limits for the difference between two independent normal random variables. If the ratio of the variances is known, then an exact calculation is performed. Otherwise, approximation methods are implemented.
diffnormtol.int(x1, x2, var.ratio = NULL, alpha = 0.05, P = 0.99, method = c("HALL", "GK", "RG"))
diffnormtol.int(x1, x2, var.ratio = NULL, alpha = 0.05, P = 0.99, method = c("HALL", "GK", "RG"))
x1 |
A vector of sample data which is distributed according to a normal distribution (sample 1). |
x2 |
Another vector of sample data which is distributed according to a normal distribution (sample 2). It can be of a different sample size than
the sample specified by |
var.ratio |
A specified, known value of the variance ratio (i.e., the ratio of the variance for population 1 to the variance of population 2).
If |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance limits. |
method |
The method for estimating the variance ratio. This only needs to be specified in the case when
|
Satterthwaite's approximation for the degrees of freedom is used when the variance ratio is unknown.
diffnormtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
diff.bar |
The difference between the sample means. |
1-sided.lower |
The 1-sided lower tolerance bound. |
1-sided.upper |
The 1-sided upper tolerance bound. |
Unlike other tolerance interval functions, the output from diffnormtol.int
cannot be passed to plottol
.
Guo, H. and Krishnamoorthy, K. (2004), New Approximate Inferential Methods for the Reliability Parameter in a Stress-Strength Model: The Normal Case, Communications in Statistics - Theory and Methods, 33, 1715–1731.
Hall, I. J. (1984), Approximate One-Sided Tolerance Limits for the Difference or Sum of Two Independent Normal Variates, Journal of Quality Technology, 16, 15–19.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Reiser, B. J. and Guttman, I. (1986), Statistical Inference for Pr(Y < X): The Normal Case, Technometrics, 28, 253–257.
## 90%/99% tolerance limits for the difference between two ## simulated normal data sets. This data is taken from ## Krishnamoorthy and Mathew (2009). Note that there is a ## calculational error in their example, which yields different ## results with the output below. x1 <- c(10.166, 5.889, 8.258, 7.303, 8.757) x2 <- c(-0.204, 2.578, 1.182, 1.892, 0.786, -0.517, 1.156, 0.980, 0.323, 0.437, 0.397, 0.050, 0.812, 0.720) diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "HALL") diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "GK") diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "RG") diffnormtol.int(x1, x2, var.ratio = 3.8, alpha = 0.10, P = 0.99)
## 90%/99% tolerance limits for the difference between two ## simulated normal data sets. This data is taken from ## Krishnamoorthy and Mathew (2009). Note that there is a ## calculational error in their example, which yields different ## results with the output below. x1 <- c(10.166, 5.889, 8.258, 7.303, 8.757) x2 <- c(-0.204, 2.578, 1.182, 1.892, 0.786, -0.517, 1.156, 0.980, 0.323, 0.437, 0.397, 0.050, 0.812, 0.720) diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "HALL") diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "GK") diffnormtol.int(x1, x2, alpha = 0.10, P = 0.99, method = "RG") diffnormtol.int(x1, x2, var.ratio = 3.8, alpha = 0.10, P = 0.99)
Density (mass), distribution function, quantile function, and random generation for the difference between two proportions. This is determined by taking the difference between two independent beta distributions.
ddiffprop(x, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, log = FALSE, ...) pdiffprop(q, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, lower.tail = TRUE, log.p = FALSE, ...) qdiffprop(p, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, lower.tail = TRUE, log.p = FALSE, ...) rdiffprop(n, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5)
ddiffprop(x, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, log = FALSE, ...) pdiffprop(q, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, lower.tail = TRUE, log.p = FALSE, ...) qdiffprop(p, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5, lower.tail = TRUE, log.p = FALSE, ...) rdiffprop(n, k1, k2, n1, n2, a1 = 0.5, a2 = 0.5)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
The number of observations. If |
k1 , k2
|
The number of successes drawn from groups 1 and 2, respectively. |
n1 , n2
|
The sample sizes for groups 1 and 2, respectively. |
a1 , a2
|
The shift parameters for the beta distributions. For the fiducial approach, we know that the lower and upper limits are set at |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
... |
Additional arguments passed to the Appell |
The difference between two proportions distribution has a fairly complicated functional form. Please see the article by Chen and Luo (2011), who corrected a typo in the article by Nadarajah and Kotz (2007), for the functional form of this distribution.
ddiffprop
gives the density (mass), pdiffprop
gives the distribution function, qdiffprop
gives the quantile function, and rdiffprop
generates random deviates.
Chen, Y. and Luo, S. (2011), A Few Remarks on 'Statistical Distribution of the Difference of Two Proportions', Statistics in Medicine, 30, 1913–1915.
Nadarajah, S. and Kotz, S. (2007), Statistical Distribution of the Difference of Two Proportions, Statistics in Medicine, 26, 3518–3523.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the difference between ## two proportions distribution. set.seed(100) x <- rdiffprop(n = 100, k1 = 2, k2 = 10, n1 = 17, n2 = 13) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- ddiffprop(x = x.1, k1 = 2, k2 = 10, n1 = 17, n2 = 13) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pdiffprop(q = x.1, k1 = 2, k2 = 10, n1 = 17, n2 = 13), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qdiffprop(p = 0.20, k1 = 2, k2 = 10, n1 = 17, n2 = 13, lower.tail = FALSE) qdiffprop(p = 0.80, k1 = 2, k2 = 10, n1 = 17, n2 = 13)
## Randomly generated data from the difference between ## two proportions distribution. set.seed(100) x <- rdiffprop(n = 100, k1 = 2, k2 = 10, n1 = 17, n2 = 13) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- ddiffprop(x = x.1, k1 = 2, k2 = 10, n1 = 17, n2 = 13) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pdiffprop(q = x.1, k1 = 2, k2 = 10, n1 = 17, n2 = 13), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qdiffprop(p = 0.20, k1 = 2, k2 = 10, n1 = 17, n2 = 13, lower.tail = FALSE) qdiffprop(p = 0.80, k1 = 2, k2 = 10, n1 = 17, n2 = 13)
Density (mass), distribution function, quantile function, and random generation for the discrete Pareto distribution.
ddpareto(x, theta, log = FALSE) pdpareto(q, theta, lower.tail = TRUE, log.p = FALSE) qdpareto(p, theta, lower.tail = TRUE, log.p = FALSE) rdpareto(n, theta)
ddpareto(x, theta, log = FALSE) pdpareto(q, theta, lower.tail = TRUE, log.p = FALSE) qdpareto(p, theta, lower.tail = TRUE, log.p = FALSE) rdpareto(n, theta)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
The number of observations. If |
theta |
The shape parameter, which must be greater than 0 and less than 1. |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
The discrete Pareto distribution has mass
where and
is the shape parameter.
ddpareto
gives the density (mass), pdpareto
gives the distribution function, qdpareto
gives the quantile function, and rdpareto
generates random deviates for the specified distribution.
Krishna, H. and Pundir, P. S. (2009), Discrete Burr and Discrete Pareto Distributions, Statistical Methodology, 6, 177–188.
Young, D. S., Naghizadeh Qomi, M., and Kiapour, A. (2019), Approximate Discrete Pareto Tolerance Limits for Characterizing Extremes in Count Data, Statistica Neerlandica, 73, 4–21.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the discrete Pareto ## distribution. set.seed(100) x <- rdpareto(n = 150, theta = 0.2) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- ddpareto(x = x.1, theta = 0.2) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pdpareto(q = x.1, theta = 0.2), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qdpareto(p = 0.80, theta = 0.2, lower.tail = FALSE) qdpareto(p = 0.95, theta = 0.2)
## Randomly generated data from the discrete Pareto ## distribution. set.seed(100) x <- rdpareto(n = 150, theta = 0.2) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- ddpareto(x = x.1, theta = 0.2) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pdpareto(q = x.1, theta = 0.2), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qdpareto(p = 0.80, theta = 0.2, lower.tail = FALSE) qdpareto(p = 0.95, theta = 0.2)
When providing two of the three quantities n
, alpha
, and P
, this function solves for the
third quantity in the context of distribution-free tolerance intervals.
distfree.est(n = NULL, alpha = NULL, P = NULL, side = 1)
distfree.est(n = NULL, alpha = NULL, P = NULL, side = 1)
n |
The necessary sample size to cover a proportion |
alpha |
1 minus the confidence level attained when it is desired to cover a proportion |
P |
The proportion of the population to be covered with confidence |
side |
Whether a 1-sided or 2-sided tolerance interval is assumed (determined by |
When providing two of the three quantities n
, alpha
, and P
, distfree.est
returns the
third quantity. If more than one value of a certain quantity is specified, then a table will be returned.
Natrella, M. G. (1963), Experimental Statistics: National Bureau of Standards - Handbook No. 91, United States Government Printing Office, Washington, D.C.
## Solving for 1 minus the confidence level. distfree.est(n = 59, P = 0.95, side = 1) ## Solving for the sample size. distfree.est(alpha = 0.05, P = 0.95, side = 1) ## Solving for the proportion of the population to cover. distfree.est(n = 59, alpha = 0.05, side = 1) ## Solving for sample sizes for many tolerance specifications. distfree.est(alpha = seq(0.01, 0.05, 0.01), P = seq(0.80, 0.99, 0.01), side = 2)
## Solving for 1 minus the confidence level. distfree.est(n = 59, P = 0.95, side = 1) ## Solving for the sample size. distfree.est(alpha = 0.05, P = 0.95, side = 1) ## Solving for the proportion of the population to cover. distfree.est(n = 59, alpha = 0.05, side = 1) ## Solving for sample sizes for many tolerance specifications. distfree.est(alpha = seq(0.01, 0.05, 0.01), P = seq(0.80, 0.99, 0.01), side = 2)
Performs maximum likelihood estimation for the parameter of the discrete Pareto distribution.
dpareto.ll(x, theta = NULL, ...)
dpareto.ll(x, theta = NULL, ...)
x |
A vector of raw data which is distributed according to a Poisson-Lindley distribution. |
theta |
Optional starting value for the parameter. If |
... |
Additional arguments passed to the |
The discrete Pareto distribution is a discretized of the continuous Type II Pareto distribution (also called the Lomax distribution).
See the help file for mle
to see how the output is structured.
Krishna, H. and Pundir, P. S. (2009), Discrete Burr and Discrete Pareto Distributions, Statistical Methodology, 6, 177–188.
Young, D. S., Naghizadeh Qomi, M., and Kiapour, A. (2019), Approximate Discrete Pareto Tolerance Limits for Characterizing Extremes in Count Data, Statistica Neerlandica, 73, 4–21.
## Maximum likelihood estimation for randomly generated data ## from the discrete Pareto distribution. set.seed(100) dp.data <- rdpareto(n = 500, theta = 0.2) out.dp <- dpareto.ll(dp.data) stats4::coef(out.dp) stats4::vcov(out.dp)
## Maximum likelihood estimation for randomly generated data ## from the discrete Pareto distribution. set.seed(100) dp.data <- rdpareto(n = 500, theta = 0.2) out.dp <- dpareto.ll(dp.data) stats4::coef(out.dp) stats4::vcov(out.dp)
Provides 1-sided or 2-sided tolerance intervals for data distributed according to the discrete Pareto distribution.
dparetotol.int(x, m = NULL, alpha = 0.05, P = 0.99, side = 1, ...)
dparetotol.int(x, m = NULL, alpha = 0.05, P = 0.99, side = 1, ...)
x |
A vector of raw data which is distributed according to a discrete Pareto distribution. |
m |
The number of observations in a future sample for which the tolerance limits will be calculated. By default, |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
... |
Additional arguments passed to the |
The discrete Pareto is a discretized of the continuous Type II Pareto distribution (also called the Lomax distribution). Discrete Pareto distributions are heavily right-skewed distributions and potentially good models for discrete lifetime data and extremes in count data. For most practical applications, one will typically be interested in 1-sided upper bounds.
dparetotol.int
returns a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
theta |
MLE for the shape parameter |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Young, D. S., Naghizadeh Qomi, M., and Kiapour, A. (2019), Approximate Discrete Pareto Tolerance Limits for Characterizing Extremes in Count Data, Statistica Neerlandica, 73, 4–21.
## 95%/95% 1-sided tolerance intervals for data assuming ## the discrete Pareto distribution. set.seed(100) x <- rdpareto(n = 500, theta = 0.5) out <- dparetotol.int(x, alpha = 0.05, P = 0.95, side = 1) out
## 95%/95% 1-sided tolerance intervals for data assuming ## the discrete Pareto distribution. set.seed(100) x <- rdpareto(n = 500, theta = 0.5) out <- dparetotol.int(x, alpha = 0.05, P = 0.95, side = 1) out
Provides 1-sided or 2-sided tolerance intervals for data distributed according to a 2-parameter exponential distribution. Data with Type II censoring is permitted.
exp2tol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("GPU", "DUN", "KM"), type.2 = FALSE)
exp2tol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("GPU", "DUN", "KM"), type.2 = FALSE)
x |
A vector of data which is distributed according to the 2-parameter exponential distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for how the upper tolerance bound is approximated. |
type.2 |
Select |
exp2tol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Dunsmore, I. R. (1978), Some Approximations for Tolerance Factors for the Two Parameter Exponential Distribution, Technometrics, 20, 317–318.
Engelhardt, M. and Bain, L. J. (1978), Tolerance Limits and Confidence Limits on Reliability for the Two-Parameter Exponential Distribution, Technometrics, 20, 37–39.
Guenther, W. C., Patil, S. A., and Uppuluri, V. R. R. (1976), One-Sided -Content Tolerance Factors
for the Two Parameter Exponential Distribution, Technometrics, 18, 333–340.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
## 95%/90% 1-sided 2-parameter exponential tolerance intervals ## for a sample of size 50. set.seed(100) x <- r2exp(50, 6, shift = 55) out <- exp2tol.int(x = x, alpha = 0.05, P = 0.90, side = 1, method = "DUN", type.2 = FALSE) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "2-Parameter Exponential Data")
## 95%/90% 1-sided 2-parameter exponential tolerance intervals ## for a sample of size 50. set.seed(100) x <- r2exp(50, 6, shift = 55) out <- exp2tol.int(x = x, alpha = 0.05, P = 0.90, side = 1, method = "DUN", type.2 = FALSE) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "2-Parameter Exponential Data")
Provides 1-sided or 2-sided tolerance intervals for data distributed according to an exponential distribution. Data with Type II censoring is permitted.
exptol.int(x, alpha = 0.05, P = 0.99, side = 1, type.2 = FALSE)
exptol.int(x, alpha = 0.05, P = 0.99, side = 1, type.2 = FALSE)
x |
A vector of data which is distributed according to an exponential distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
type.2 |
Select |
exptol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
lambda.hat |
The mean of the data (i.e., |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Blischke, W. R. and Murthy, D. N. P. (2000), Reliability: Modeling, Prediction, and Optimization, John Wiley & Sons, Inc.
## 95%/99% 1-sided exponential tolerance intervals for a ## sample of size 50. set.seed(100) x <- rexp(100, 0.004) out <- exptol.int(x = x, alpha = 0.05, P = 0.99, side = 1, type.2 = FALSE) out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Exponential Data")
## 95%/99% 1-sided exponential tolerance intervals for a ## sample of size 50. set.seed(100) x <- rexp(100, 0.004) out <- exptol.int(x = x, alpha = 0.05, P = 0.99, side = 1, type.2 = FALSE) out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Exponential Data")
Provides 1-sided or 2-sided tolerance intervals for data distributed according to either a Weibull distribution or extreme-value (also called Gumbel) distributions.
exttol.int(x, alpha = 0.05, P = 0.99, side = 1, dist = c("Weibull", "Gumbel"), ext = c("min", "max"), NR.delta = 1e-8)
exttol.int(x, alpha = 0.05, P = 0.99, side = 1, dist = c("Weibull", "Gumbel"), ext = c("min", "max"), NR.delta = 1e-8)
x |
A vector of data which is distributed according to either a Weibull distribution or an extreme-value distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
dist |
Select either |
ext |
If |
NR.delta |
The stopping criterion used for the Newton-Raphson algorithm when finding the maximum likelihood estimates of the Weibull or extreme-value distribution. |
Recall that the relationship between the Weibull distribution and the extreme-value distribution for the minimum is that if the
random variable is distributed according to a Weibull distribution, then the random variable
is
distributed according to an extreme-value distribution for the minimum.
If dist = "Weibull"
, then the natural logarithm of the data are taken so that a Newton-Raphson algorithm can
be employed to find the MLEs of the extreme-value distribution for the minimum and then the data and MLEs are transformed back appropriately.
No transformation is performed if dist = "Gumbel"
. The Newton-Raphson algorithm is initialized by the method of moments
estimators for the parameters.
exttol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
shape.1 |
MLE for the shape parameter if |
shape.2 |
MLE for the scale parameter if |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Bain, L. J. and Engelhardt, M. (1981), Simple Approximate Distributional Results for Confidence and Tolerance Limits for the Weibull Distribution Based on Maximum Likelihood Estimators, Technometrics, 23, 15–20.
Coles, S. (2001), An Introduction to Statistical Modeling of Extreme Values, Springer.
## 85%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, side = 1, dist = "Weibull") out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Weibull Data")
## 85%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, side = 1, dist = "Weibull") out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Weibull Data")
The Appell function of the first kind, which is a two variable extension of the hypergeometric distribution.
F1(a, b, b.prime, c, x, y, ...)
F1(a, b, b.prime, c, x, y, ...)
a , b , b.prime , c
|
Appropriate parameters for this function. |
x , y
|
The inputted values to evaluate this function such that each is less than 1 in absolute value. |
... |
Additional arguments passed to the |
F1
returns the simple integral result for the Appell function of the first kind with the arguments specified above.
This function is solved by using a simple integral representation for real numbers. While all four of the Appell functions can be extended to the complex plane, this is not an option for this code.
Bailey, W. N. (1935), Generalised Hypergeometric Series, Cambridge University Press.
## Sample calculation. F1(a = 3, b = 4, b.prime = 5, c = 13, x = 0.2, y = 0.4)
## Sample calculation. F1(a = 3, b = 4, b.prime = 5, c = 13, x = 0.2, y = 0.4)
Provides 1-sided or 2-sided tolerance intervals for the function of two binomial proportions using fiducial quantities.
fidbintol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
fidbintol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
x1 |
A value of observed "successes" from group 1. |
x2 |
A value of observed "successes" from group 2. |
n1 |
The total number of trials for group 1. |
n2 |
The total number of trials for group 2. |
m1 |
The total number of future trials for group 1. If |
m2 |
The total number of future trials for group 2. If |
FUN |
Any reasonable (and meaningful) function taking exactly two arguments that we are interested in constructing a tolerance interval on. |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
K |
The number of fiducial quantities to be generated. The number of iterations should be at least as large as the default value of 1000. See |
B |
The number of iterations used for the Monte Carlo algorithm which determines the tolerance limits. The number of iterations should be at least as large as the default value of 1000. |
If is observed from a
distribution, then the fiducial quantity for
is
.
fidbintol.int
returns a list with two items. The first item (tol.limits
) is a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
fn.est |
A point estimate of the functional form of interest using the maximum likelihood estimates calculated with the inputted values of |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
The second item (fn
) simply returns the functional form specified by FUN
.
Clopper, C. J. and Pearson, E. S. (1934), The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial, Biometrika, 26, 404–413.
Krishnamoorthy, K. and Lee, M. (2010), Inference for Functions of Parameters in Discrete Distributions Based on Fiducial Approach: Binomial and Poisson Cases, Journal of Statistical Planning and Inference, 140, 1182–1192.
Mathew, T. and Young, D. S. (2013), Fiducial-Based Tolerance Intervals for Some Discrete Distributions, Computational Statistics and Data Analysis, 61, 38–49.
fidnegbintol.int
, fidpoistol.int
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the difference between binomial proportions. set.seed(100) p1 <- 0.2 p2 <- 0.4 n1 <- n2 <- 200 x1 <- rbinom(1, n1, p1) x2 <- rbinom(1, n2, p2) fun.ti <- function(x, y) x - y fidbintol.int(x1, x2, n1, n2, m1 = 500, m2 = 500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidbintol.int(x1, x2, n1, n2, m1 = 500, m2 = 500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the difference between binomial proportions. set.seed(100) p1 <- 0.2 p2 <- 0.4 n1 <- n2 <- 200 x1 <- rbinom(1, n1, p1) x2 <- rbinom(1, n2, p2) fun.ti <- function(x, y) x - y fidbintol.int(x1, x2, n1, n2, m1 = 500, m2 = 500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidbintol.int(x1, x2, n1, n2, m1 = 500, m2 = 500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
Provides 1-sided or 2-sided tolerance intervals for the function of two negative binomial proportions using fiducial quantities.
fidnegbintol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
fidnegbintol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
x1 |
A value of observed "failures" from group 1. |
x2 |
A value of observed "failures" from group 2. |
n1 |
The target number of successes for group 1. |
n2 |
The target number of successes for group 2. |
m1 |
The total number of future trials for group 1. If |
m2 |
The total number of future trials for group 2. If |
FUN |
Any reasonable (and meaningful) function taking exactly two arguments that we are interested in constructing a tolerance interval on. |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
K |
The number of fiducial quantities to be generated. The number of iterations should be at least as large as the default value of 1000. See |
B |
The number of iterations used for the Monte Carlo algorithm which determines the tolerance limits. The number of iterations should be at least as large as the default value of 1000. |
If is observed from a
distribution, then the fiducial quantity for
is
.
fidnegbintol.int
returns a list with two items. The first item (tol.limits
) is a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
fn.est |
A point estimate of the functional form of interest using the maximum likelihood estimates calculated with the inputted values of |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
The second item (fn
) simply returns the functional form specified by FUN
.
Cai, Y. and Krishnamoorthy, K. (2005), A Simple Improved Inferential Method for Some Discrete Distributions, Computational Statistics and Data Analysis, 48, 605–621.
Clopper, C. J. and Pearson, E. S. (1934), The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial, Biometrika, 26, 404–413.
Krishnamoorthy, K. and Lee, M. (2010), Inference for Functions of Parameters in Discrete Distributions Based on Fiducial Approach: Binomial and Poisson Cases, Journal of Statistical Planning and Inference, 140, 1182–1192.
Mathew, T. and Young, D. S. (2013), Fiducial-Based Tolerance Intervals for Some Discrete Distributions, Computational Statistics and Data Analysis, 61, 38–49.
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the ratio of odds ratios for negative binomial proportions. set.seed(100) p1 <- 0.6 p2 <- 0.2 n1 <- n2 <- 50 x1 <- rnbinom(1, n1, p1) x2 <- rnbinom(1, n2, p2) fun.ti <- function(x, y) x * (1 - y) / (y * (1 - x)) fidnegbintol.int(x1, x2, n1, n2, m1 = 50, m2 = 50, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidnegbintol.int(x1, x2, n1, n2, m1 = 50, m2 = 50, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the ratio of odds ratios for negative binomial proportions. set.seed(100) p1 <- 0.6 p2 <- 0.2 n1 <- n2 <- 50 x1 <- rnbinom(1, n1, p1) x2 <- rnbinom(1, n2, p2) fun.ti <- function(x, y) x * (1 - y) / (y * (1 - x)) fidnegbintol.int(x1, x2, n1, n2, m1 = 50, m2 = 50, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidnegbintol.int(x1, x2, n1, n2, m1 = 50, m2 = 50, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
Provides 1-sided or 2-sided tolerance intervals for the function of two Poisson rates using fiducial quantities.
fidpoistol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
fidpoistol.int(x1, x2, n1, n2, m1 = NULL, m2 = NULL, FUN, alpha = 0.05, P = 0.99, side = 1, K = 1000, B = 1000)
x1 |
A value of observed counts from group 1. |
x2 |
A value of observed counts from group 2. |
n1 |
The length of time that |
n2 |
The length of time that |
m1 |
The total number of future trials for group 1. If |
m2 |
The total number of future trials for group 2. If |
FUN |
Any reasonable (and meaningful) function taking exactly two arguments that we are interested in constructing a tolerance interval on. |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
K |
The number of fiducial quantities to be generated. The number of iterations should be at least as large as the default value of 1000. See |
B |
The number of iterations used for the Monte Carlo algorithm which determines the tolerance limits. The number of iterations should be at least as large as the default value of 1000. |
If is observed from a
distribution, then the fiducial quantity for
is
.
fidpoistol.int
returns a list with two items. The first item (tol.limits
) is a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
fn.est |
A point estimate of the functional form of interest using the maximum likelihood estimates calculated with the inputted values of |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
The second item (fn
) simply returns the functional form specified by FUN
.
Cox, D. R. (1953), Some Simple Approximate Tests for Poisson Variates, Biometrika, 40, 354–360.
Krishnamoorthy, K. and Lee, M. (2010), Inference for Functions of Parameters in Discrete Distributions Based on Fiducial Approach: Binomial and Poisson Cases, Journal of Statistical Planning and Inference, 140, 1182–1192.
Mathew, T. and Young, D. S. (2013), Fiducial-Based Tolerance Intervals for Some Discrete Distributions, Computational Statistics and Data Analysis, 61, 38–49.
fidbintol.int
, fidnegbintol.int
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the ratio of two Poisson rates. set.seed(100) lambda1 <- 10 lambda2 <- 2 n1 <- 3000 n2 <- 3250 x1 <- rpois(1, n1 * lambda1) x2 <- rpois(1, n2 * lambda2) fun.ti <- function(x, y) x / y fidpoistol.int(x1, x2, n1, n2, m1 = 2000, m2 = 2500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidpoistol.int(x1, x2, n1, n2, m1 = 2000, m2 = 2500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
## 95%/99% 1-sided and 2-sided tolerance intervals for ## the ratio of two Poisson rates. set.seed(100) lambda1 <- 10 lambda2 <- 2 n1 <- 3000 n2 <- 3250 x1 <- rpois(1, n1 * lambda1) x2 <- rpois(1, n2 * lambda2) fun.ti <- function(x, y) x / y fidpoistol.int(x1, x2, n1, n2, m1 = 2000, m2 = 2500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 1) fidpoistol.int(x1, x2, n1, n2, m1 = 2000, m2 = 2500, FUN = fun.ti, alpha = 0.05, P = 0.99, side = 2)
Provides 1-sided or 2-sided tolerance intervals for data distributed according to either a gamma distribution or log-gamma distribution.
gamtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, log.gamma = FALSE)
gamtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, log.gamma = FALSE)
x |
A vector of data which is distributed according to either a gamma distribution or a log-gamma distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
log.gamma |
If |
Recall that if the random variable is distributed according to a log-gamma distribution, then the random variable
is
distributed according to a gamma distribution.
gamtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Krishnamoorthy, K., Mathew, T., and Mukherjee, S. (2008), Normal-Based Methods for a Gamma Distribution: Prediction and Tolerance Intervals and Stress-Strength Reliability, Technometrics, 50, 69–78.
## 99%/99% 1-sided gamma tolerance intervals for a sample ## of size 50. set.seed(100) x <- rgamma(50, 0.30, scale = 2) out <- gamtol.int(x = x, alpha = 0.01, P = 0.99, side = 1) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "Gamma Data")
## 99%/99% 1-sided gamma tolerance intervals for a sample ## of size 50. set.seed(100) x <- rgamma(50, 0.30, scale = 2) out <- gamtol.int(x = x, alpha = 0.01, P = 0.99, side = 1) out plottol(out, x, plot.type = "both", side = "upper", x.lab = "Gamma Data")
Provides 1-sided or 2-sided tolerance intervals for hypergeometric random variables. From a sampling without replacement perspective, these limits use the proportion of units from group A (e.g., "black balls" in an urn) in a sample to bound the number of potential units drawn from group A in a future sample taken from the universe.
hypertol.int(x, n, N, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EX", "LS", "CC"))
hypertol.int(x, n, N, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EX", "LS", "CC"))
x |
The number of units from group A in the sample. Can be a vector, in which case the sum of |
n |
The size of the random sample of units selected. |
N |
The population size. |
m |
The quantity of units to be sampled from the universe for a future study. If |
alpha |
The level chosen such that |
P |
The proportion of units from group A in future samples of size |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the lower and upper confidence bounds, which are used in the calculation
of the tolerance bounds. The default method is |
hypertol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of units from group A in future samples of size |
rate |
The sampling rate determined by |
p.hat |
The proportion of units in the sample from group A, calculated by |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
As this methodology is built using large-sample theory, if the sampling rate is less than 0.05, then a warning is generated stating that the results are not reliable. Also, compare the functionality of this procedure with the acc.samp
procedure, which is to determine a minimal acceptance limit for a particular sampling plan.
Brown, L. D., Cai, T. T., and DasGupta, A. (2001), Interval Estimation for a Binomial Proportion, Statistical Science, 16, 101–133.
Eichenberger, P., Hulliger, B., and Potterat, J. (2011), Two Measures for Sample Size Determination, Survey Research Methods, 5, 27–37.
Young, D. S. (2014), Tolerance Intervals for Hypergeometric and Negative Hypergeometric Variables, Sankhya: The Indian Journal of Statistics, Series B, 77(1), 114–140.
## 90%/95% 1-sided and 2-sided hypergeometric tolerance ## intervals for a future sample of 30 when the universe ## is of size 100. hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 1, method = "LS") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 1, method = "CC") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 2, method = "LS") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 2, method = "CC")
## 90%/95% 1-sided and 2-sided hypergeometric tolerance ## intervals for a future sample of 30 when the universe ## is of size 100. hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 1, method = "LS") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 1, method = "CC") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 2, method = "LS") hypertol.int(x = 15, n = 50, N = 100, m = 30, alpha = 0.10, P = 0.95, side = 2, method = "CC")
Estimates k-factors for tolerance intervals based on normality.
K.factor(n, f = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
K.factor(n, f = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
n |
The (effective) sample size. |
f |
The number of degrees of freedom associated with calculating the estimate of the population standard deviation.
If |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
K.factor
returns the k-factor for tolerance intervals based on normality with the arguments specified above.
For larger sample sizes, there may be some accuracy issues with the 1-sided calculation since it depends on the noncentral t-distribution.
The code is primarily intended to be used for moderate values of the noncentrality parameter. It will not be highly accurate, especially in the tails, for large values.
See TDist
for further details.
Ellison, B. E. (1964), On Two-Sided Tolerance Intervals for a Normal Distribution, Annals of Mathematical Statistics, 35, 762–772.
Howe, W. G. (1969), Two-Sided Tolerance Limits for Normal Populations - Some Improvements, Journal of the American Statistical Association, 64, 610–620.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Odeh, R. E. and Owen, D. B. (1980), Tables for Normal Tolerance Limits, Sampling Plans, and Screening, Marcel-Dekker.
Owen, D. B. (1964), Controls of Percentages in Both Tails of the Normal Distribution, Technometrics, 6, 377-387.
Wald, A. and Wolfowitz, J. (1946), Tolerance Limits for a Normal Distribution, Annals of the Mathematical Statistics, 17, 208–215.
Weissberg, A. and Beatty, G. (1969), Tables of Tolerance Limit Factors for Normal Distributions, Technometrics, 2, 483–500.
integrate
, K.table
, normtol.int
, TDist
## Showing the k-factor under the Howe, Weissberg-Beatty, ## and exact estimation methods. K.factor(10, P = 0.95, side = 2, method = "HE") K.factor(10, P = 0.95, side = 2, method = "WBE") K.factor(10, P = 0.95, side = 2, method = "EXACT", m = 20)
## Showing the k-factor under the Howe, Weissberg-Beatty, ## and exact estimation methods. K.factor(10, P = 0.95, side = 2, method = "HE") K.factor(10, P = 0.95, side = 2, method = "WBE") K.factor(10, P = 0.95, side = 2, method = "EXACT", m = 20)
Estimates k-factors for simultaneous tolerance intervals based on normality.
K.factor.sim(n, l = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EXACT", "BONF"), m = 50)
K.factor.sim(n, l = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EXACT", "BONF"), m = 50)
n |
If |
l |
The number of normal populations for which the k-factors will be constructed simultaneously.
If |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance interval. |
side |
Whether a k-factor for a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. |
m |
The maximum number of subintervals to be used in the |
K.factor
returns the k-factor for simultaneous tolerance intervals based on normality with the arguments specified above.
For larger combinations of n
and l
when side = 2
and method = "EXACT"
, the calculation can be slow. For larger sample sizes when method = "BONF"
, there may be some accuracy issues with the 1-sided calculation since it depends on the noncentral t-distribution.
The code is primarily intended to be used for moderate values of the noncentrality parameter. It will not be highly accurate, especially in the tails, for large values.
See TDist
for further details.
Thanks to Andrew Landgraf for providing the basic code for the method = "EXACT"
procedure.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Mee, R. W. (1990), Simultaneous Tolerance Intervals for Normal Populations with Common Variance, Technometrics, 32, 83-92.
## Reproducing part of Table B5 from Krishnamoorthy and ## Mathew (2009). n_sizes <- c(2:20, seq(30, 100, 10)) l_sizes <- 2:10 KM_table <- sapply(1:length(l_sizes), function(i) sapply(1:length(n_sizes), function(j) round(K.factor.sim(n = n_sizes[j], l = l_sizes[i], side=1, alpha = 0.1, P = 0.9),3))) dimnames(KM_table) <- list(n = n_sizes, l = l_sizes) KM_table
## Reproducing part of Table B5 from Krishnamoorthy and ## Mathew (2009). n_sizes <- c(2:20, seq(30, 100, 10)) l_sizes <- 2:10 KM_table <- sapply(1:length(l_sizes), function(i) sapply(1:length(n_sizes), function(j) round(K.factor.sim(n = n_sizes[j], l = l_sizes[i], side=1, alpha = 0.1, P = 0.9),3))) dimnames(KM_table) <- list(n = n_sizes, l = l_sizes) KM_table
Tabulated summary of k-factors for tolerance intervals based on normality. The user can specify multiple values for each of the three inputs.
K.table(n, alpha, P, side = 1, f = NULL, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, by.arg = c("n", "alpha", "P"))
K.table(n, alpha, P, side = 1, f = NULL, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, by.arg = c("n", "alpha", "P"))
n |
A vector of (effective) sample sizes. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. Can be a vector. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
f |
The number of degrees of freedom associated with calculating the estimate of the population standard deviation.
If |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
by.arg |
How you would like the output organized. If |
The method used for estimating the k-factors is that due to Howe as it is generally viewed as more accurate than the Weissberg-Beatty method.
K.table
returns a list with a structure determined by the argument by.arg
described above.
Howe, W. G. (1969), Two-Sided Tolerance Limits for Normal Populations - Some Improvements, Journal of the American Statistical Association, 64, 610–620.
Weissberg, A. and Beatty, G. (1969), Tables of Tolerance Limit Factors for Normal Distributions, Technometrics, 2, 483–500.
## Tables generated for each value of the sample size. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "n") ## Tables generated for each value of the confidence level. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "alpha") ## Tables generated for each value of the coverage proportion. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "P")
## Tables generated for each value of the sample size. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "n") ## Tables generated for each value of the confidence level. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "alpha") ## Tables generated for each value of the coverage proportion. K.table(n = seq(50, 100, 10), alpha = c(0.01, 0.05, 0.10), P = c(0.90, 0.95, 0.99), by.arg = "P")
Provides 1-sided or 2-sided tolerance intervals for data distributed according to a Laplace distribution.
laptol.int(x, alpha = 0.05, P = 0.99, side = 1)
laptol.int(x, alpha = 0.05, P = 0.99, side = 1)
x |
A vector of data which is distributed according to a Laplace distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
laptol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Bain, L. J. and Engelhardt, M. (1973), Interval Estimation for the Two Parameter Double Exponential Distribution, Technometrics, 15, 875–887.
## First generate data from a Laplace distribution with location ## parameter 70 and scale parameter 3. set.seed(100) tmp <- runif(40) x <- rep(70, 40) - sign(tmp - 0.5)*rep(3, 40)* log(2*ifelse(tmp < 0.5, tmp, 1-tmp)) ## 95%/90% 1-sided Laplace tolerance intervals for the sample ## of size 40 generated above. out <- laptol.int(x = x, alpha = 0.05, P = 0.90, side = 1) out plottol(out, x, plot.type = "hist", side = "two", x.lab = "Laplace Data")
## First generate data from a Laplace distribution with location ## parameter 70 and scale parameter 3. set.seed(100) tmp <- runif(40) x <- rep(70, 40) - sign(tmp - 0.5)*rep(3, 40)* log(2*ifelse(tmp < 0.5, tmp, 1-tmp)) ## 95%/90% 1-sided Laplace tolerance intervals for the sample ## of size 40 generated above. out <- laptol.int(x = x, alpha = 0.05, P = 0.90, side = 1) out plottol(out, x, plot.type = "hist", side = "two", x.lab = "Laplace Data")
Provides 1-sided or 2-sided tolerance intervals for data distributed according to a logistic or log-logistic distribution.
logistol.int(x, alpha = 0.05, P = 0.99, log.log = FALSE, side = 1, method = c("HALL", "BE"))
logistol.int(x, alpha = 0.05, P = 0.99, log.log = FALSE, side = 1, method = c("HALL", "BE"))
x |
A vector of data which is distributed according to a logistic or log-logistic distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
log.log |
If |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the tolerance limits. |
Recall that if the random variable is distributed according to a log-logistic distribution, then the random variable
is
distributed according to a logistic distribution. For
method = "HALL"
, the method due to Hall (1975) is implemented. This, however, can have
numerical instabilities due to taking square roots of negative numbers in the calculation, thus leading to two-sided tolerance limits where the upper
tolerance limit is smaller than the lower tolerance limit. method = "BE"
calculates the limits using the method due to Bain and Englehardt (1991),
which tends to be more reliable.
logistol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Bain, L. and Englehardt, M. (1991), Statistical Analysis of Reliability and Life Testing Models: Theory and Methods, Second Edition, Marcel Dekker, Inc.
Balakrishnan, N. (1992), Handbook of the Logistic Distribution, Marcel Dekker, Inc.
Hall, I. J. (1975), One-Sided Tolerance Limits for a Logistic Distribution Based on Censored Samples, Biometrics, 31, 873–880.
## 90%/95% 1-sided logistic tolerance intervals for a sample ## of size 20. set.seed(100) x <- rlogis(20, 5, 1) out <- logistol.int(x = x, alpha = 0.10, P = 0.95, log.log = FALSE, side = 1) out plottol(out, x, plot.type = "control", side = "two", x.lab = "Logistic Data")
## 90%/95% 1-sided logistic tolerance intervals for a sample ## of size 20. set.seed(100) x <- rlogis(20, 5, 1) out <- logistol.int(x = x, alpha = 0.10, P = 0.95, log.log = FALSE, side = 1) out plottol(out, x, plot.type = "control", side = "two", x.lab = "Logistic Data")
Determines the appropriate tolerance factor for computing multivariate (multiple) linear regression tolerance regions based on Monte Carlo simulation.
mvregtol.region(mvreg, new.x = NULL, alpha = 0.05, P = 0.99, B = 1000)
mvregtol.region(mvreg, new.x = NULL, alpha = 0.05, P = 0.99, B = 1000)
mvreg |
A multivariate (multiple) linear regression fit, having class |
new.x |
An optional data frame of new values for which to approximate k-factors. This must be a data frame with named columns that match those in the data frame used for the |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance region. |
B |
The number of iterations used for the Monte Carlo algorithm which determines the tolerance factor. The number of iterations should be at least as large as the default value of 1000. |
A basic sketch of how the algorithm works is as follows:
(1) Generate independent chi-square random variables and Wishart random matrices.
(2) Compute the eigenvalues of the randomly generated Wishart matrices.
(3) Iterate the above steps to generate a set of B
sample values such that the 100(1-alpha)
-th percentile is an approximate tolerance factor.
mvregtol.region
returns a matrix where the first column is the k-factor, the next q
columns are the estimated responses
from the least squares fit, and the final m
columns are the predictor values. The first n
rows of the matrix pertain to the raw data
as specified by y
and x
. If values for new.x
are specified, then there is one additional row appended to this output for each
row in the matrix new.x
.
As of tolerance version 2.0.0, the arguments to this function have changed. This function no longer depends on inputted y
and x
matrices or an int
argument. Instead, the function requires mvreg
, which is of class "mlm", and provides all of the necessary components for the way the output is formatted. Also, new.x
must now be a data frame with columns matching those from the data frame used in the mvreg
fitted object.
Anderson, T. W. (2003) An Introduction to Multivariate Statistical Analysis, Third Edition, Wiley.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Krishnamoorthy, K. and Mondal, S. (2008), Tolerance Factors in Multiple and Multivariate Linear Regressions, Communications in Statistics - Simulation and Computation, 37, 546–559.
## 95%/95% multivariate regression tolerance factors using ## a fertilizer data set presented in Anderson (2003, p. 374). grain <- c(40, 17, 9, 15, 6, 12, 5, 9) straw <- c(53, 19, 10, 29, 13, 27, 19, 30) fert <- c(24, 11, 5, 12, 7, 14, 11, 18) DF <- data.frame(grain,straw,fert) new.x <- data.frame(fert = c(10, 15, 20)) mvreg <- lm(cbind(grain, straw) ~ fert + I(fert^2), data = DF) set.seed(100) out <- mvregtol.region(mvreg, new.x = new.x, alpha = 0.05, P = 0.95, B = 5000) out
## 95%/95% multivariate regression tolerance factors using ## a fertilizer data set presented in Anderson (2003, p. 374). grain <- c(40, 17, 9, 15, 6, 12, 5, 9) straw <- c(53, 19, 10, 29, 13, 27, 19, 30) fert <- c(24, 11, 5, 12, 7, 14, 11, 18) DF <- data.frame(grain,straw,fert) new.x <- data.frame(fert = c(10, 15, 20)) mvreg <- lm(cbind(grain, straw) ~ fert + I(fert^2), data = DF) set.seed(100) out <- mvregtol.region(mvreg, new.x = new.x, alpha = 0.05, P = 0.95, B = 5000) out
Determines the appropriate tolerance factor for computing multivariate normal tolerance regions based on Monte Carlo methods or other approximations.
mvtol.region(x, alpha = 0.05, P = 0.99, B = 1000, M = 1000, method = c("KM", "AM", "GM", "HM", "MHM", "V11", "HM.V11", "MC"))
mvtol.region(x, alpha = 0.05, P = 0.99, B = 1000, M = 1000, method = c("KM", "AM", "GM", "HM", "MHM", "V11", "HM.V11", "MC"))
x |
An |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance region. A vector of |
B |
The number of iterations used for the Monte Carlo algorithms (i.e., when |
M |
The number of iterations used for the inner loop of the Monte Carlo algorithm specified through |
method |
The method for estimating the tolerance factors. |
All of the methods are outlined in the references that we provided. In practice, we recommend using the Krishnamoorthy-Mondal approach. A basic sketch of how the Krishnamoorthy-Mondal algorithm works is as follows:
(1) Generate independent chi-square random variables and Wishart random matrices.
(2) Compute the eigenvalues of the randomly generated Wishart matrices.
(3) Iterate the above steps to generate a set of B
sample values such that the 100(1-alpha)
-th percentile is an approximate tolerance factor.
mvtol.region
returns a matrix where the rows pertain to each confidence level 1-alpha
specified and the columns
pertain to each proportion level P
specified.
Krishnamoorthy, K. and Mathew, T. (1999), Comparison of Approximation Methods for Computing Tolerance Factors for a Multivariate Normal Population, Technometrics, 41, 234–249.
Krishnamoorthy, K. and Mondal, S. (2006), Improved Tolerance Factors for Multivariate Normal Distributions, Communications in Statistics - Simulation and Computation, 35, 461–478.
## 90%/90% bivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x <- cbind(x1, x2) out1 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out1 plottol(out1, x) ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000, method = "KM") out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out2 plottol(out2, x)
## 90%/90% bivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x <- cbind(x1, x2) out1 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out1 plottol(out1, x) ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000, method = "KM") out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out2 plottol(out2, x)
Provides 1-sided or 2-sided tolerance intervals for negative binomial random variables. From a statistical quality control perspective, these limits use the number of failures that occur to reach n
successes to bound the number of failures for a specified amount of future successes (m
).
negbintol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("LS", "WU", "CB", "CS", "SC", "LR", "SP", "CC"))
negbintol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("LS", "WU", "CB", "CS", "SC", "LR", "SP", "CC"))
x |
The total number of failures that occur from a sample of size |
n |
The target number of successes (sometimes called size) for each trial. |
m |
The target number of successes in a future lot for which the tolerance limits will be calculated. If |
alpha |
The level chosen such that |
P |
The proportion of the defective (or acceptable) units in future samples of size |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the lower and upper confidence bounds, which are used in the calculation
of the tolerance bounds. The default method is |
This function takes the approach for Poisson and binomial random variables developed in Hahn and Chandra (1981) and applies it to the negative binomial case.
negbintol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of defective (or acceptable) units in future samples of size |
pi.hat |
The probability of success in each trial, calculated by |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Recall that the geometric distribution is the negative binomial distribution where the size is 1. Therefore, the case when n = m = 1
will provide tolerance limits for a geometric distribution.
Casella, G. and Berger, R. L. (1990), Statistical Inference, Duxbury Press.
Hahn, G. J. and Chandra, R. (1981), Tolerance Intervals for Poisson and Binomial Variables, Journal of Quality Technology, 13, 100–110.
Tian, M., Tang, M. L., Ng, H. K. T., and Chan, P. S. (2009), A Comparative Study of Confidence Intervals for Negative Binomial Proportions, Journal of Statistical Computation and Simulation, 79, 241–249.
Young, D. S. (2014), A Procedure for Approximate Negative Binomial Tolerance Intervals, Journal of Statistical Computation and Simulation, 84, 438–450.
## Comparison of 95%/99% 1-sided tolerance limits with ## 50 failures before 10 successes are reached. negbintol.int(x = 50, n = 10, side = 1, method = "LS") negbintol.int(x = 50, n = 10, side = 1, method = "WU") negbintol.int(x = 50, n = 10, side = 1, method = "CB") negbintol.int(x = 50, n = 10, side = 1, method = "CS") negbintol.int(x = 50, n = 10, side = 1, method = "SC") negbintol.int(x = 50, n = 10, side = 1, method = "LR") negbintol.int(x = 50, n = 10, side = 1, method = "SP") negbintol.int(x = 50, n = 10, side = 1, method = "CC") ## 95%/99% 1-sided tolerance limits and 2-sided tolerance ## interval for the same setting above, but when we are ## interested in a future experiment that requires 20 successes ## be reached for each trial. negbintol.int(x = 50, n = 10, m = 20, side = 1) negbintol.int(x = 50, n = 10, m = 20, side = 2)
## Comparison of 95%/99% 1-sided tolerance limits with ## 50 failures before 10 successes are reached. negbintol.int(x = 50, n = 10, side = 1, method = "LS") negbintol.int(x = 50, n = 10, side = 1, method = "WU") negbintol.int(x = 50, n = 10, side = 1, method = "CB") negbintol.int(x = 50, n = 10, side = 1, method = "CS") negbintol.int(x = 50, n = 10, side = 1, method = "SC") negbintol.int(x = 50, n = 10, side = 1, method = "LR") negbintol.int(x = 50, n = 10, side = 1, method = "SP") negbintol.int(x = 50, n = 10, side = 1, method = "CC") ## 95%/99% 1-sided tolerance limits and 2-sided tolerance ## interval for the same setting above, but when we are ## interested in a future experiment that requires 20 successes ## be reached for each trial. negbintol.int(x = 50, n = 10, m = 20, side = 1) negbintol.int(x = 50, n = 10, m = 20, side = 2)
Density, distribution function, quantile function, and random generation for the negative hypergeometric distribution.
dnhyper(x, m, n, k, log = FALSE) pnhyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE) qnhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE) rnhyper(nn, m, n, k)
dnhyper(x, m, n, k, log = FALSE) pnhyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE) qnhyper(p, m, n, k, lower.tail = TRUE, log.p = FALSE) rnhyper(nn, m, n, k)
x , q
|
Vector of quantiles representing the number of trials until |
m |
The number of successes in the population (e.g., the number of white balls in the urn). |
n |
The population size (e.g., the total number of balls in the urn). |
k |
The number of successes (e.g., white balls) to achieve with the sample. |
p |
Vector of probabilities, which must be between 0 and 1. |
nn |
The number of observations. If |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
A negative hypergeometric distribution (sometimes called the inverse hypergeometric distribution) models the total number of trials until k
successes occur. Compare this to the negative binomial distribution, which models the number of failures that occur until a specified number of successes has been reached. The negative hypergeometric distribution has density
for .
dnhyper
gives the density, pnhyper
gives the distribution function, qnhyper
gives the quantile
function, and rnhyper
generates random deviates.
Invalid arguments will return value NaN
, with a warning.
Wilks, S. S. (1963), Mathematical Statistics, Wiley.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the negative hypergeometric ## distribution. set.seed(100) x <- rnhyper(nn = 1000, m = 15, n = 40, k = 10) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 = sort(x) y <- dnhyper(x = x.1, m = 15, n = 40, k = 10) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pnhyper(q = x.1, m = 15, n = 40, k = 10), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qnhyper(p = 0.20, m = 15, n = 40, k = 10, lower.tail = FALSE) qnhyper(p = 0.80, m = 15, n = 40, k = 10)
## Randomly generated data from the negative hypergeometric ## distribution. set.seed(100) x <- rnhyper(nn = 1000, m = 15, n = 40, k = 10) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 = sort(x) y <- dnhyper(x = x.1, m = 15, n = 40, k = 10) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pnhyper(q = x.1, m = 15, n = 40, k = 10), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qnhyper(p = 0.20, m = 15, n = 40, k = 10, lower.tail = FALSE) qnhyper(p = 0.80, m = 15, n = 40, k = 10)
Provides 1-sided or 2-sided tolerance intervals for negative hypergeometric random variables. When sampling without replacement, these limits are on the total number of expected draws in a future sample in order to achieve a certain number from group A (e.g., "black balls" in an urn).
neghypertol.int(x, n, N, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EX", "LS", "CC"))
neghypertol.int(x, n, N, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("EX", "LS", "CC"))
x |
The number of units drawn in order to achieve |
n |
The target number of successes in the sample drawn (e.g., the number of "black balls" you are to draw in the sample). |
N |
The population size (e.g., the total number of balls in the urn). |
m |
The target number of successes to be sampled from the universe for a future study. If |
alpha |
The level chosen such that |
P |
The proportion of units from group A in future samples of size |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the lower and upper confidence bounds, which are used in the calculation
of the tolerance bounds. The default method is |
neghypertol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of units from group A in future samples of size |
rate |
The sampling rate determined by |
p.hat |
The proportion of units in the sample from group A, calculated by |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
As this methodology is built using large-sample theory, if the sampling rate is less than 0.05, then a warning is generated stating that the results are not reliable.
Khan, R. A. (1994), A Note on the Generating Function of a Negative Hypergeometric Distribution, Sankhya: The Indian Journal of Statistics, Series B, 56, 309–313.
Young, D. S. (2014), Tolerance Intervals for Hypergeometric and Negative Hypergeometric Variables, Sankhya: The Indian Journal of Statistics, Series B, 77(1), 114–140.
## 90%/95% 2-sided negative hypergeometric tolerance ## intervals for a future number of 20 successes when ## the universe is of size 100. The estimates are ## based on having drawn 50 in another sample to achieve ## 20 successes. neghypertol.int(50, 20, 100, m = 20, alpha = 0.05, P = 0.95, side = 2, method = "LS")
## 90%/95% 2-sided negative hypergeometric tolerance ## intervals for a future number of 20 successes when ## the universe is of size 100. The estimates are ## based on having drawn 50 in another sample to achieve ## 20 successes. neghypertol.int(50, 20, 100, m = 20, alpha = 0.05, P = 0.95, side = 2, method = "LS")
Provides 1-sided or 2-sided nonlinear regression tolerance bounds.
nlregtol.int(formula, xy.data = data.frame(), x.new = NULL, side = 1, alpha = 0.05, P = 0.99, maxiter = 50, new = FALSE, ...)
nlregtol.int(formula, xy.data = data.frame(), x.new = NULL, side = 1, alpha = 0.05, P = 0.99, maxiter = 50, new = FALSE, ...)
formula |
A nonlinear model formula including variables and parameters. |
xy.data |
A data frame in which to evaluate the formulas in |
x.new |
Any new levels of the predictor(s) for which to report the tolerance bounds. The number of columns must be
1 less than the number of columns for |
side |
Whether a 1-sided or 2-sided tolerance bound is required (determined by |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance bound(s). |
maxiter |
A positive integer specifying the maximum number of iterations that the nonlinear least squares routine ( |
new |
When |
... |
Optional arguments passed to |
It is highly recommended that the user specify starting values for the nls
routine.
npregtol.int2
returns a list with items:
tol |
Data frame of original response varible |
alpha.P.side |
Model specifications of critical level, content level and side. |
reg.type |
Type of regression model. |
model |
The linear regression model fitted. |
newdata |
X values of new data for prediction. |
xy.data.original |
Original data frame |
Wallis, W. A. (1951), Tolerance Intervals for Linear Regression, in Second Berkeley Symposium on Mathematical Statistics and Probability, ed. J. Neyman, Berkeley: University of CA Press, 43–51.
Young, D. S. (2013), Regression Tolerance Intervals, Communications in Statistics - Simulation and Computation, 42, 2040–2055.
## 95%/95% 2-sided nonlinear regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) formula <- as.formula(y ~ b1 + (0.49 - b1)*exp(-b2*(x - 8))) out1 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x)), x.new=c(10,20), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) out1 ######### set.seed(100) x1 <- runif(50, 5, 45) x2 <- rnorm(50, 0, 10) f1 <- function(x1, x2, b1, b2) {(0.49 - b1)*exp(-b2*(x1 + x2 - 8)) + rnorm(50, sd = 0.01)} y <- f1(x1 , x2 , 0.25 , 0.39) formula <- as.formula(y ~ (0.49 - b1)*exp(-b2*(x1 + x2 - 8))) out2 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x1 , x2)), x.new=cbind(c(10,20) , c(47 , 53)), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) out2
## 95%/95% 2-sided nonlinear regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) formula <- as.formula(y ~ b1 + (0.49 - b1)*exp(-b2*(x - 8))) out1 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x)), x.new=c(10,20), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) out1 ######### set.seed(100) x1 <- runif(50, 5, 45) x2 <- rnorm(50, 0, 10) f1 <- function(x1, x2, b1, b2) {(0.49 - b1)*exp(-b2*(x1 + x2 - 8)) + rnorm(50, sd = 0.01)} y <- f1(x1 , x2 , 0.25 , 0.39) formula <- as.formula(y ~ (0.49 - b1)*exp(-b2*(x1 + x2 - 8))) out2 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x1 , x2)), x.new=cbind(c(10,20) , c(47 , 53)), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) out2
Provides OC-type curves to illustrate how values of the k-factors for normal tolerance intervals, confidence levels, and content levels change as a function of the sample size.
norm.OC(k = NULL, alpha = NULL, P = NULL, n, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
norm.OC(k = NULL, alpha = NULL, P = NULL, n, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50)
k |
If wanting OC curves where the confidence level or content level is on the y-axis, then a single positive value of |
alpha |
The set of levels chosen such that |
P |
The set of content levels to be considered. If wanting OC curves where the confidence level is being calculated, then each curve will correspond to a level in the set of |
n |
A sequence of sample sizes to consider. This must be a vector of at least length 2 since all OC curves are constructed as functions of |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
norm.OC
returns a figure with the OC curves constructed using the specifications in the arguments.
Young, D. S. (2016), Normal Tolerance Interval Procedures in the tolerance Package, The R Journal, 8, 200–212.
## The three types of OC-curves that can be constructed ## with the norm.OC function. norm.OC(k = 4, alpha = NULL, P = c(0.90, 0.95, 0.99), n = 10:20, side = 1) norm.OC(k = 4, alpha = c(0.01, 0.05, 0.10), P = NULL, n = 10:20, side = 1) norm.OC(k = NULL, P = c(0.90, 0.95, 0.99), alpha=c(0.01,0.05,0.10), n = 10:20, side = 1)
## The three types of OC-curves that can be constructed ## with the norm.OC function. norm.OC(k = 4, alpha = NULL, P = c(0.90, 0.95, 0.99), n = 10:20, side = 1) norm.OC(k = 4, alpha = c(0.01, 0.05, 0.10), P = NULL, n = 10:20, side = 1) norm.OC(k = NULL, P = c(0.90, 0.95, 0.99), alpha=c(0.01,0.05,0.10), n = 10:20, side = 1)
Provides minimum sample sizes for a future sample size when constructing normal tolerance intervals. Various strategies are available for determining the sample size, including strategies that incorporate known specification limits.
norm.ss(x = NULL, alpha = 0.05, P = 0.99, delta = NULL, P.prime = NULL, side = 1, m = 50, spec = c(NA, NA), hyper.par = list(mu.0 = NULL, sig2.0 = NULL, m.0 = NULL, n.0 = NULL), method = c("DIR", "FW", "YGZO"))
norm.ss(x = NULL, alpha = 0.05, P = 0.99, delta = NULL, P.prime = NULL, side = 1, m = 50, spec = c(NA, NA), hyper.par = list(mu.0 = NULL, sig2.0 = NULL, m.0 = NULL, n.0 = NULL), method = c("DIR", "FW", "YGZO"))
x |
A vector of current data that is distributed according to a normal distribution. This is only required for |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
delta |
The precision measure for the future tolerance interval as specified under the Faulkenberry-Weeks method. |
P.prime |
The proportion of the population (greater than |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
m |
The maximum number of subintervals to be used in the |
spec |
A vector of length 2 given known specification limits. These are required when |
hyper.par |
Necessary parameter values for the different methods. If |
method |
The method for performing the sample size determination. |
norm.ss
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
delta |
The user-specified or calculated precision measure. Not returned if |
P.prime |
The user-specified or calculated closeness measure. Not returned if |
n |
The minimum sample size determined using the conditions specified for this function. |
Faulkenberry, G. D. and Weeks, D. L. (1968), Sample Size Determination for Tolerance Limits, Technometrics, 10, 343–348.
Young, D. S., Gordon, C. M., Zhu, S., and Olin, B. D. (2016), Sample Size Determination Strategies for Normal Tolerance Intervals Using Historical Data, Quality Engineering, 28, 337–351.
bayesnormtol.int
, Normal
, normtol.int
## Sample size determination for 95%/95% 2-sided normal ## tolerance intervals using the direct method. set.seed(100) norm.ss(alpha = 0.05, P = 0.95, side = 2, spec = c(-3, 3), method = "DIR", hyper.par = list(mu.0 = 0, sig2.0 = 1))
## Sample size determination for 95%/95% 2-sided normal ## tolerance intervals using the direct method. set.seed(100) norm.ss(alpha = 0.05, P = 0.95, side = 2, spec = c(-3, 3), method = "DIR", hyper.par = list(mu.0 = 0, sig2.0 = 1))
Provides 1-sided or 2-sided tolerance intervals for data distributed according to either a normal distribution or log-normal distribution.
normtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, log.norm = FALSE)
normtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, log.norm = FALSE)
x |
A vector of data which is distributed according to either a normal distribution or a log-normal distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
log.norm |
If |
Recall that if the random variable is distributed according to a log-normal distribution, then the random variable
is
distributed according to a normal distribution.
normtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
x.bar |
The sample mean. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Howe, W. G. (1969), Two-Sided Tolerance Limits for Normal Populations - Some Improvements, Journal of the American Statistical Association, 64, 610–620.
Wald, A. and Wolfowitz, J. (1946), Tolerance Limits for a Normal Distribution, Annals of Mathematical Statistics, 17, 208–215.
Weissberg, A. and Beatty, G. (1969), Tables of Tolerance Limit Factors for Normal Distributions, Technometrics, 2, 483–500.
## 95%/95% 2-sided normal tolerance intervals for a sample ## of size 100. set.seed(100) x <- rnorm(100, 0, 0.2) out <- normtol.int(x = x, alpha = 0.05, P = 0.95, side = 2, method = "HE", log.norm = FALSE) out plottol(out, x, plot.type = "both", side = "two", x.lab = "Normal Data")
## 95%/95% 2-sided normal tolerance intervals for a sample ## of size 100. set.seed(100) x <- rnorm(100, 0, 0.2) out <- normtol.int(x = x, alpha = 0.05, P = 0.95, side = 2, method = "HE", log.norm = FALSE) out plottol(out, x, plot.type = "both", side = "two", x.lab = "Normal Data")
For given values of m
, alpha
, and P
, this function solves the necessary sample size such that the
r
-th (or (n-s+1
)-th) order statistic is the [100(1-alpha)%, 100(P)%]
lower (or upper) tolerance
limit (see the Details section below for further explanation). This function can also report all combinations of order
statistics for 2-sided intervals.
np.order(m, alpha = 0.05, P = 0.99, indices = FALSE)
np.order(m, alpha = 0.05, P = 0.99, indices = FALSE)
m |
See the Details section below for how |
alpha |
1 minus the confidence level attained when it is desired to cover a proportion |
P |
The proportion of the population to be covered with confidence |
indices |
An optional argument to report all combinations of order statistics indices for the upper and lower limits
of the 2-sided intervals. Note that this can only be calculated when |
For the 1-sided tolerance limits, m=s+r
such that the probability is at least 1-alpha
that at least the
proportion P
of the population is below the (n-s+1
)-th order statistic for the upper limit or above the r
-th order statistic
for the lower limit. This means for the 1-sided upper limit that r=1
, while for the 1-sided lower limit it means that s=1
.
For the 2-sided tolerance intervals, m=s+r
such that the probability is at least 1-alpha
that at least the
proportion P
of the population is between the r
-th and (n-s+1
)-th order statistics. Thus, all combinations of r>0 and
s>0 such that m=s+r
are considered.
If indices = FALSE
, then a single number is returned for the necessary sample size such that the
r
-th (or (n-s+1
)-th) order statistic is the [100(1-alpha)%, 100(P)%]
lower (or upper) tolerance
limit. If indices = TRUE
, then a list is returned with a single number for the necessary sample size and a matrix
with 2 columns where each row gives the pairs of indices for the order statistics for all permissible [100(1-alpha)%, 100(P)%]
2-sided tolerance intervals.
Hanson, D. L. and Owen, D. B. (1963), Distribution-Free Tolerance Limits Elimination of the Requirement That Cumulative Distribution Functions Be Continuous, Technometrics, 5, 518–522.
Scheffe, H. and Tukey, J. W. (1945), Non-Parametric Estimation I. Validation of Order Statistics, Annals of Mathematical Statistics, 16, 187–192.
## Only requesting the sample size. np.order(m = 5, alpha = 0.05, P = 0.95) ## Requesting the order statistics indices as well. np.order(m = 5, alpha = 0.05, P = 0.95, indices = TRUE)
## Only requesting the sample size. np.order(m = 5, alpha = 0.05, P = 0.95) ## Requesting the order statistics indices as well. np.order(m = 5, alpha = 0.05, P = 0.95, indices = TRUE)
Provides 1-sided or 2-sided nonparametric (i.e., distribution-free) beta-expectation tolerance intervals for any continuous data set. These are equivalent to nonparametric prediction intervals based on order statistics.
npbetol.int(x, Beta = 0.95, side = 1, upper = NULL, lower = NULL)
npbetol.int(x, Beta = 0.95, side = 1, upper = NULL, lower = NULL)
x |
A vector of data which no distributional assumptions are made. The data is only assumed to come from a continuous distribution. |
Beta |
The confidence level. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
upper |
The upper bound of the data. When |
lower |
The lower bound of the data. When |
nptol.int
returns a data frame with items:
Beta |
The specified confidence level. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Beran, R. and Hall, P. (1993), Interpolated Nonparametric Prediction Intervals and Confidence Intervals, Journal of the Royal Statistical Society, Series B, 55, 643–652.
distfree.est
, npregtol.int
, nptol.int
## Nonparametric 90%-expectation tolerance intervals ## for a sample of size 100. set.seed(100) x <- rexp(100, 5) out <- npbetol.int(x = x, Beta = 0.90, side = 2, upper = NULL, lower = NULL) out
## Nonparametric 90%-expectation tolerance intervals ## for a sample of size 100. set.seed(100) x <- rexp(100, 5) out <- npbetol.int(x = x, Beta = 0.90, side = 2, upper = NULL, lower = NULL) out
Provides depth-based multivariate central or semi-space nonparametric tolerance regions. These can be calculated for any continuous multivariate data set. Either (P, 1-alpha) tolerance regions or beta-expectation tolerance regions can be specified.
npmvtol.region(x, alpha = NULL, P = NULL, Beta = NULL, depth.fn, adjust = c("no", "floor", "ceiling"), type = c("central", "semispace"), semi.order = list(lower = NULL, center = NULL, upper = NULL), L = -Inf, U = Inf, ...)
npmvtol.region(x, alpha = NULL, P = NULL, Beta = NULL, depth.fn, adjust = c("no", "floor", "ceiling"), type = c("central", "semispace"), semi.order = list(lower = NULL, center = NULL, upper = NULL), L = -Inf, U = Inf, ...)
x |
An |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. Note that if a (P, 1-alpha) tolerance region is required, then both |
Beta |
The confidence level for a beta-expectation tolerance region. Note that if a beta-expectation tolerance region is required, then |
depth.fn |
The data depth function used to perform the ordering of the multivariate data. Thus function must be coded in such a way that the first argument is multivariate data for which to calculate the depth values and the second argument is the original multivariate sample, |
adjust |
Whether an adjustment should be made during an intermediate calculation for determining the number of points that need to be included in the multivariate region. If |
type |
The type of multivariate hyperrectangular region to calculate. If |
semi.order |
If |
L |
If |
U |
If |
... |
Additional arguments passed to the |
npmvtol.region
returns a p
x2
matrix where the columns give the lower and upper limits, respectively, of the multivariate hyperrectangular tolerance region.
Young, D. S. and Mathew, T. (2020), Nonparametric Hyperrectangular Tolerance and Prediction Regions for Setting Multivariate Reference Regions in Laboratory Medicine, Statistical Methods in Medical Research, 29, 3569–3585.
distfree.est
, mvtol.region
, npregtol.int
## 90%/95% semi-space tolerance region for a sample ## of size 20 generated from a multivariate normal ## distribution. The mdepth function below is not ## a true depth function, but used only for ## illustrative purposes. mdepth <- function(pts, x){ mahalanobis(pts, center = rep(0, 3), cov = diag(1, 3)) } set.seed(100) x <- cbind(rnorm(100), rnorm(100), rnorm(100)) out <- npmvtol.region(x = x, alpha = 0.10, P = 0.95, depth.fn = mdepth, type = "semispace", semi.order = list(lower = 2, center = 3, upper = 1)) out
## 90%/95% semi-space tolerance region for a sample ## of size 20 generated from a multivariate normal ## distribution. The mdepth function below is not ## a true depth function, but used only for ## illustrative purposes. mdepth <- function(pts, x){ mahalanobis(pts, center = rep(0, 3), cov = diag(1, 3)) } set.seed(100) x <- cbind(rnorm(100), rnorm(100), rnorm(100)) out <- npmvtol.region(x = x, alpha = 0.10, P = 0.95, depth.fn = mdepth, type = "semispace", semi.order = list(lower = 2, center = 3, upper = 1)) out
Provides 1-sided or 2-sided nonparametric regression tolerance bounds.
npregtol.int(x, y, y.hat, side = 1, alpha = 0.05, P = 0.99, method = c("WILKS", "WALD", "HM"), upper = NULL, lower = NULL, new = FALSE)
npregtol.int(x, y, y.hat, side = 1, alpha = 0.05, P = 0.99, method = c("WILKS", "WALD", "HM"), upper = NULL, lower = NULL, new = FALSE)
x |
A vector of values for the predictor variable. Currently, this function is only capable of handling a single predictor. |
y |
A vector of values for the response variable. |
y.hat |
A vector of fitted values extracted from a nonparametric smoothing routine. |
side |
Whether a 1-sided or 2-sided tolerance bound is required (determined by |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance bound(s). |
method |
The method for determining which indices of the ordered residuals will be used for the tolerance bounds.
|
upper |
The upper bound of the data. When |
lower |
The lower bound of the data. When |
new |
When |
npregtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by the tolerance bound(s). |
x |
The values of the predictor variable. |
y |
The values of the response variable. |
y.hat |
The predicted value of the response for the fitted nonparametric smoothing routine. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Young, D. S. (2013), Regression Tolerance Intervals, Communications in Statistics - Simulation and Computation, 42, 2040–2055.
## 95%/95% 2-sided nonparametric regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) y.hat <- loess(y~x)$fit out <- npregtol.int(x = x, y = y, y.hat = y.hat, side = 2, alpha = 0.05, P = 0.95, method = "WILKS", new = TRUE) out library(plotly) plotly_regtol(tol.out = out , x = x , y = y)
## 95%/95% 2-sided nonparametric regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) y.hat <- loess(y~x)$fit out <- npregtol.int(x = x, y = y, y.hat = y.hat, side = 2, alpha = 0.05, P = 0.95, method = "WILKS", new = TRUE) out library(plotly) plotly_regtol(tol.out = out , x = x , y = y)
Provides 1-sided or 2-sided nonparametric (i.e., distribution-free) tolerance intervals for any continuous data set.
nptol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("WILKS", "WALD", "HM", "YM"), upper = NULL, lower = NULL)
nptol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("WILKS", "WALD", "HM", "YM"), upper = NULL, lower = NULL)
x |
A vector of data which no distributional assumptions are made. The data is only assumed to come from a continuous distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for determining which indices of the ordered observations will be used for the tolerance intervals.
|
upper |
The upper bound of the data. When |
lower |
The lower bound of the data. When |
For the Young-Mathew method, interpolation or extrapolation is performed. When side = 1
, two intervals are given: one based on linear interpolation/extrapolation of order statistics (OS-Based
) and one based on fractional order statistics (FOS-Based
). When side = 2
, only an interval based on linear interpolation/extrapolation of order statistics is given.
nptol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Bury, K. (1999), Statistical Distributions in Engineering, Cambridge University Press.
Hahn, G. J. and Meeker, W. Q. (1991), Statistical Intervals: A Guide for Practitioners, Wiley-Interscience.
Wald, A. (1943), An Extension of Wilks' Method for Setting Tolerance Limits, The Annals of Mathematical Statistics, 14, 45–55.
Wilks, S. S. (1941), Determination of Sample Sizes for Setting Tolerance Limits, The Annals of Mathematical Statistics, 12, 91–96.
Young, D. S. and Mathew, T. (2014), Improved Nonparametric Tolerance Intervals Based on Interpolated and Extrapolated Order Statistics, Journal of Nonparametric Statistics, 26, 415–432.
## 90%/95% 2-sided nonparametric tolerance intervals for a ## sample of size 200. set.seed(100) x <- rlogis(200, 5, 1) out <- nptol.int(x = x, alpha = 0.10, P = 0.95, side = 1, method = "WILKS", upper = NULL, lower = NULL) out plottol(out, x, plot.type = "both", side = "two", x.lab = "X")
## 90%/95% 2-sided nonparametric tolerance intervals for a ## sample of size 200. set.seed(100) x <- rlogis(200, 5, 1) out <- nptol.int(x = x, alpha = 0.10, P = 0.95, side = 1, method = "WILKS", upper = NULL, lower = NULL) out plottol(out, x, plot.type = "both", side = "two", x.lab = "X")
Provides 1-sided or 2-sided tolerance intervals for data distributed according to either a Pareto distribution or a power distribution (i.e., the inverse Pareto distribution).
paretotol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("GPU", "DUN"), power.dist = FALSE)
paretotol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("GPU", "DUN"), power.dist = FALSE)
x |
A vector of data which is distributed according to either a Pareto distribution or a power distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for how the upper tolerance bound is approximated when transforming to utilize the relationship with the 2-parameter
exponential distribution. |
power.dist |
If |
Recall that if the random variable is distributed
according to a Pareto distribution, then the random variable
is distributed according to a 2-parameter exponential
distribution. Moreover, if the random variable
is
distributed according to a power distribution, then the random
variable
is distributed according to a Pareto
distribution, which in turn means that the random variable
is distributed according to a 2-parameter exponential
distribution.
paretotol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Dunsmore, I. R. (1978), Some Approximations for Tolerance Factors for the Two Parameter Exponential Distribution, Technometrics, 20, 317–318.
Engelhardt, M. and Bain, L. J. (1978), Tolerance Limits and Confidence Limits on Reliability for the Two-Parameter Exponential Distribution, Technometrics, 20, 37–39.
Guenther, W. C., Patil, S. A., and Uppuluri, V. R. R. (1976), One-Sided -Content Tolerance Factors
for the Two Parameter Exponential Distribution, Technometrics, 18, 333–340.
Krishnamoorthy, K., Mathew, T., and Mukherjee, S. (2008), Normal-Based Methods for a Gamma Distribution: Prediction and Tolerance Intervals and Stress-Strength Reliability, Technometrics, 50, 69–78.
TwoParExponential
, exp2tol.int
## 95%/99% 2-sided Pareto tolerance intervals ## for a sample of size 500. set.seed(100) x <- exp(r2exp(500, rate = 0.15, shift = 2)) out <- paretotol.int(x = x, alpha = 0.05, P = 0.99, side = 2, method = "DUN", power.dist = FALSE) out plottol(out, x, plot.type = "both", side = "two", x.lab = "Pareto Data")
## 95%/99% 2-sided Pareto tolerance intervals ## for a sample of size 500. set.seed(100) x <- exp(r2exp(500, rate = 0.15, shift = 2)) out <- paretotol.int(x = x, alpha = 0.05, P = 0.99, side = 2, method = "DUN", power.dist = FALSE) out plottol(out, x, plot.type = "both", side = "two", x.lab = "Pareto Data")
Plot tolerance intervals for each factor level in a balanced (or nearly-balanced) ANOVA
plotly_anovatol(tol.out, x, factors = NULL, side = c("two","upper", "lower"), range.min = NULL, range.max = NULL, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, x.col = NULL, x.cex = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), tol.lower.col = NULL, tol.lower.lwd = NULL, tol.lower.line.type = c("dash","dot","dashdot","solid"), tol.upper.col = NULL, tol.upper.lwd = NULL, tol.upper.line.type = c("dash","dot","dashdot","solid"), title = NULL, title.position.x = NULL, title.position.y = NULL, title.size = NULL)
plotly_anovatol(tol.out, x, factors = NULL, side = c("two","upper", "lower"), range.min = NULL, range.max = NULL, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, x.col = NULL, x.cex = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), tol.lower.col = NULL, tol.lower.lwd = NULL, tol.lower.line.type = c("dash","dot","dashdot","solid"), tol.upper.col = NULL, tol.upper.lwd = NULL, tol.upper.line.type = c("dash","dot","dashdot","solid"), title = NULL, title.position.x = NULL, title.position.y = NULL, title.size = NULL)
tol.out |
Output from any ANOVA tolerance interval procedure. |
x |
A data frame consisting of the data fitted in lm.out. Note that data must have one column for each main effect (i.e., factor) that is analyzed in lm.out and that these columns must be of class factor. |
factors |
Specify certain factor(s) to present. The name(s) of the factor(s) needs to be consistent with the name(s) in the original dataset. |
side |
|
range.min |
Minimum value on the y-axis. If actual lower limit is greater than |
range.max |
Maximum value on the y-axis. If actual upper limit is smaller than |
x.lab |
Label of the x-axis. |
x.lab.size |
Size of label of the x-axis. |
y.lab |
Label of the y-axis. |
y.lab.size |
Size of label of the y-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
x.col |
Color of original data points. |
x.cex |
Size of original data points. |
tol.col |
Color of the tolerance intervals when |
tol.lwd |
Width of the tolerance intervals when |
tol.line.type |
Line type of the tolerance intervals when |
tol.lower.col |
Color of the lower tolerance interval when |
tol.lower.lwd |
Width of the lower tolerance interval when |
tol.lower.line.type |
Line type of lower tolerance interval when |
tol.upper.col |
Color of the upper tolerance interval when |
tol.upper.lwd |
Width of the upper tolerance interval when |
tol.upper.line.type |
Line type of upper tolerance interval when |
title |
The main title on top of the plot |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
title.size |
Size of the title. |
plotly_anovatol
returns box plots as well as corresponding tolerance intervals for each main effect of an ANOVA.
Howe, W. G. (1969), Two-Sided Tolerance Limits for Normal Populations - Some Improvements, Journal of the American Statistical Association, 64, 610–620.
Weissberg, A. and Beatty, G. (1969), Tables of Tolerance Limit Factors for Normal Distributions, Technometrics, 2, 483–500.
anovatol.int
, plottol
, K.factor
,
normtol.int
, lm
, anova
## 90%/95% 1-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. attach(warpbreaks) lm.out <- lm(breaks ~ wool + tension) out.1 <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 1, method = "HE") out.1 plotly_anovatol(out.1, x = warpbreaks , factors = 'wool' , x.lab = "Wool" , side="two") ## 90%/95% 2-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. out.2 <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 2, method = "HE") out.2 plotly_anovatol(out.2, x = warpbreaks , range.min = 20 , range.max = 60)
## 90%/95% 1-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. attach(warpbreaks) lm.out <- lm(breaks ~ wool + tension) out.1 <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 1, method = "HE") out.1 plotly_anovatol(out.1, x = warpbreaks , factors = 'wool' , x.lab = "Wool" , side="two") ## 90%/95% 2-sided tolerance intervals for a 2-way ANOVA ## using the "warpbreaks" data. out.2 <- anovatol.int(lm.out, data = warpbreaks, alpha = 0.10, P = 0.95, side = 2, method = "HE") out.2 plotly_anovatol(out.2, x = warpbreaks , range.min = 20 , range.max = 60)
Provides interactive control charts for tolerance bounds on continuous data.
plotly_controltol(tol.out , x , side = c("two","upper", "lower"), x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, x.col = NULL, x.cex = NULL, fit.col = NULL, fit.lwd = NULL, fit.line.type = c("solid","dash","dot","dashdot"), tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), title.position.x = NULL, title.position.y = NULL, title.size = NULL, title = NULL)
plotly_controltol(tol.out , x , side = c("two","upper", "lower"), x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, x.col = NULL, x.cex = NULL, fit.col = NULL, fit.lwd = NULL, fit.line.type = c("solid","dash","dot","dashdot"), tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), title.position.x = NULL, title.position.y = NULL, title.size = NULL, title = NULL)
tol.out |
Output from any continuous tolerance interval procedure. |
x |
Data from a continuous distribution. |
side |
|
x.lab |
Label of the x-axis. |
x.lab.size |
Size of label of the x-axis. |
y.lab |
Label of the y-axis. |
y.lab.size |
Size of label of the y-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
x.col |
Color of original data points. |
x.cex |
Size of original data points. |
fit.col |
Color of fitted line. |
fit.lwd |
Width of fitted line. |
fit.line.type |
Type of the fitted line. |
tol.col |
Color of the tolerance intervals when |
tol.lwd |
Width of the tolerance intervals when |
tol.line.type |
Line type of tolerance intervals. |
title |
The main title on top of the plot. |
title.size |
Size of the title. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
plotly_controltol
can return boxplots as well as corresponding tolerance intervals for any continuous data.
Montgomery, D. C. (2005), Introduction to Statistical Quality Control, Fifth Edition, John Wiley & Sons, Inc.
## 95%/85% 2-sided Bayesian normal tolerance limits for ## a sample of size 100. set.seed(100) x <- rnorm(100) out <- bayesnormtol.int(x = x, alpha = 0.05, P = 0.85, side = 2, method = "EXACT", hyper.par = list(mu.0 = 0, sig2.0 = 1, n.0 = 10, m.0 = 10)) out plotly_controltol(out, x, x.lab = "Normal Data")
## 95%/85% 2-sided Bayesian normal tolerance limits for ## a sample of size 100. set.seed(100) x <- rnorm(100) out <- bayesnormtol.int(x = x, alpha = 0.05, P = 0.85, side = 2, method = "EXACT", hyper.par = list(mu.0 = 0, sig2.0 = 1, n.0 = 10, m.0 = 10)) out plotly_controltol(out, x, x.lab = "Normal Data")
Provides interactive tolerance intervals for continous data based on its histogram.
plotly_histtol(tol.out, x, side = c("two","upper", "lower"), x.lab = NULL, x.lab.size = NULL, x.tick.size = NULL, y.lab.size = NULL, y.tick.size = NULL, title = NULL, title.size = NULL, title.position.x = NULL, title.position.y = NULL, bin.col = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"))
plotly_histtol(tol.out, x, side = c("two","upper", "lower"), x.lab = NULL, x.lab.size = NULL, x.tick.size = NULL, y.lab.size = NULL, y.tick.size = NULL, title = NULL, title.size = NULL, title.position.x = NULL, title.position.y = NULL, bin.col = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"))
tol.out |
Output from any continuous tolerance interval procedure. |
x |
Data from a continuous distribution. |
side |
|
x.lab |
Label of the x-axis. |
x.lab.size |
Size of label of the x-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.lab.size |
Size of label of the y-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
title |
The main title on top of the histogram. |
title.size |
Size of the title. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
bin.col |
Color of the bins. |
tol.col |
Color of the tolerance interval(s). |
tol.lwd |
Width of the tolerance interval(s). |
tol.line.type |
Line type of the tolerance interval(s). |
plotly_histtol
can return histograms as well as corresponding tolerance intervals for any continuous data.
Montgomery, D. C. (2005), Introduction to Statistical Quality Control, Fifth Edition, John Wiley & Sons, Inc.
## 90%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, dist = "Weibull" , side = 1) out plotly_histtol(out, x, side = "lower", x.lab = "Weibull Data" , tol.lwd = 3)
## 90%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, dist = "Weibull" , side = 1) out plotly_histtol(out, x, side = "lower", x.lab = "Weibull Data" , tol.lwd = 3)
Provides interactive tolerance region on multivariate continuous data.
plotly_multitol(tol.out, x, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, z.lab = NULL, z.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, z.tick.size = NULL, x.col = NULL, x.cex = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), title = NULL, title.position.x = NULL, title.position.y = NULL, title.size = NULL)
plotly_multitol(tol.out, x, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, z.lab = NULL, z.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, z.tick.size = NULL, x.col = NULL, x.cex = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), title = NULL, title.position.x = NULL, title.position.y = NULL, title.size = NULL)
tol.out |
Output from |
x |
Multivariate data from continuous distributions. |
x.lab |
Label of the x-axis. |
x.lab.size |
Size of label of the x-axis. |
y.lab |
Label of the y-axis. |
y.lab.size |
Size of label of the y-axis. |
z.lab |
Label of the z-axis. |
z.lab.size |
Size of label of the z-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
z.tick.size |
Size of tick marks on the z-axis. |
x.col |
Color of original data points. |
x.cex |
Size of original data points. |
tol.col |
Color of the tolerance region. |
tol.lwd |
Width of boundary of the tolerance region when data is bivariate. |
tol.line.type |
Line type of the tolerance region for bivariate data. |
title |
The main title on top of the plot. |
title.size |
Size of the title. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
plotly_multitol
returns tolerance regions for both bivariate and trivariate continuous data.
Krishnamoorthy, K. and Mathew, T. (1999), Comparison of Approximation Methods for Computing Tolerance Factors for a Multivariate Normal Population, Technometrics, 41, 234–249.
Krishnamoorthy, K. and Mondal, S. (2006), Improved Tolerance Factors for Multivariate Normal Distributions, Communications in Statistics - Simulation and Computation, 35, 461–478.
## 90%/90% bivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x <- cbind(x1, x2) out1 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out1 plotly_multitol(out1, x , x.lab = "X1" , y.lab = "X2") ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000, method = "KM") out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out2 plotly_multitol(out2, x , x.lab = "X1" , y.lab = "X2" , z.lab = "X3", title.position.x = 0.57)
## 90%/90% bivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x <- cbind(x1, x2) out1 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out1 plotly_multitol(out1, x , x.lab = "X1" , y.lab = "X2") ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000, method = "KM") out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000, method = "KM") out2 plotly_multitol(out2, x , x.lab = "X1" , y.lab = "X2" , z.lab = "X3", title.position.x = 0.57)
plotly
version of norm.OC
)plotly_normOC
is an updated function rooted in norm.OC
.
plotly_normOC(k = NULL, alpha = NULL, P = NULL, n, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, range.min = NULL, range.max = NULL, x.lab.size = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, title = NULL, title.size = NULL, title.position.x = NULL, title.position.y = NULL, legend.size = NULL, x.cex = NULL, line.width = NULL, line.type = c("solid","dash","dot","dashdot"))
plotly_normOC(k = NULL, alpha = NULL, P = NULL, n, side = 1, method = c("HE", "HE2", "WBE", "ELL", "KM", "EXACT", "OCT"), m = 50, range.min = NULL, range.max = NULL, x.lab.size = NULL, y.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, title = NULL, title.size = NULL, title.position.x = NULL, title.position.y = NULL, legend.size = NULL, x.cex = NULL, line.width = NULL, line.type = c("solid","dash","dot","dashdot"))
k |
If wanting OC curves where the confidence level or content level is on the y-axis, then a single positive value of |
alpha |
The set of levels chosen such that |
P |
The set of content levels to be considered. If wanting OC curves where the confidence level is being calculated, then each curve will correspond to a level in the set of |
n |
A sequence of sample sizes to consider. This must be a vector of at least length 2 since all OC curves are constructed as functions of |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the k-factors. The k-factor for the 1-sided tolerance intervals
is performed exactly and thus is the same for the chosen method. |
m |
The maximum number of subintervals to be used in the |
range.min |
The minimum value of the y-axis. |
range.max |
The maximum value of the y-axis. |
x.lab.size |
Size of label of the x-axis. |
y.lab.size |
Size of label of the y-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.tick.size |
Sze of tick marks on the y-axis. |
title |
The main title on top of the plot. |
title.size |
Size of the title. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
legend.size |
Size of the legend. |
x.cex |
Size of data points. |
line.width |
Width of lines connecting data points. |
line.type |
The type of lines connection data points. |
norm.OC
returns a figure with the OC curves constructed using the specifications in the arguments.
Young, D. S. (2016), Normal Tolerance Interval Procedures in the tolerance Package, The R Journal, 8, 200–212.
K.factor
, normtol.int
, norm.OC
## The three types of OC-curves that can be constructed ## with the ggnorm.OC function. plotly_normOC(k = 4, alpha = NULL, P = c(0.90, 0.95, 0.99), n = 10:20, side = 1) plotly_normOC(k = 4, alpha = c(0.01, 0.05, 0.10), P = NULL, n = 10:20, side = 1) plotly_normOC(k = NULL, P = c(0.90, 0.95, 0.99), alpha=c(0.01,0.05,0.10), n = 10:20, side = 1)
## The three types of OC-curves that can be constructed ## with the ggnorm.OC function. plotly_normOC(k = 4, alpha = NULL, P = c(0.90, 0.95, 0.99), n = 10:20, side = 1) plotly_normOC(k = 4, alpha = c(0.01, 0.05, 0.10), P = NULL, n = 10:20, side = 1) plotly_normOC(k = NULL, P = c(0.90, 0.95, 0.99), alpha=c(0.01,0.05,0.10), n = 10:20, side = 1)
plotly_npmvtol
is plotting function for nonparametric multivaraite hyperrectangular tolerance region. The function takes the outcome of npmvtol.region
as an input and provides visualzation for hypperrectangular tolerance regions between two variables.
plotly_npmvtol(tol.out, x, var.names = NULL, title = NULL, x.col = "#4298B5", x.cex = 6, x.shape = "dot", outlier.col = "#A6192E", outlier.cex = 8, outlier.shape = "triangle-up", tol.col = "#D1DDE6", tol.opacity = 0.4, x.lab.size = 12, x.tick.size = 12, y.lab.size = 12, y.tick.size = 12, title.position.x = 0.5, title.position.y = 0.98, title.size = 12, show.bound = TRUE, bound.type = c("dash", "dot", "solid", "longdash", "dashdot", "longdashdot"), bound.col = "#000000", bound.lwd = 1 )
plotly_npmvtol(tol.out, x, var.names = NULL, title = NULL, x.col = "#4298B5", x.cex = 6, x.shape = "dot", outlier.col = "#A6192E", outlier.cex = 8, outlier.shape = "triangle-up", tol.col = "#D1DDE6", tol.opacity = 0.4, x.lab.size = 12, x.tick.size = 12, y.lab.size = 12, y.tick.size = 12, title.position.x = 0.5, title.position.y = 0.98, title.size = 12, show.bound = TRUE, bound.type = c("dash", "dot", "solid", "longdash", "dashdot", "longdashdot"), bound.col = "#000000", bound.lwd = 1 )
tol.out |
Output from |
x |
Data frame for different variables. Columns of |
var.names |
Labels of variable names. The dimension of |
title |
The main title on top of the plot. The length of |
x.col |
Color of original data points, excluding outliers. |
x.cex |
Size of original data points, excluding outliers. |
x.shape |
Shape of original data points, excluding outliers. |
outlier.col |
Color of outliers. |
outlier.cex |
Size of outliers. |
outlier.shape |
Shape of outliers. |
tol.col |
Color of tolerance region. |
tol.opacity |
Opacity of tolerance region. |
x.lab.size |
Size of label of the x-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.lab.size |
Size of label of the y-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
title.size |
Size of the title. |
show.bound |
Logical indicating to show rectanglular boundaries. Default is |
bound.type |
Line type of the rectangle boundaries. |
bound.col |
Color of the rectangle boundaries. |
bound.lwd |
Width of the rectangle boundaries. |
plotly_npmvtol
returns figures of hypperectangular tolerance regions between two random variable generated by npmvtol.region
.
Young, D. S., & Mathew, T. (2020), Nonparametric Hyperrectangular Tolerance and Prediction Regions for Setting Multivariate Reference Regions in Laboratory Medicine. Statistical Methods in Medical Research, 29, 3569–3585.
library(plotly) mdepth <- function(pts, x){ mahalanobis(pts, center = rep(0, 3), cov = diag(1, 3)) } set.seed(100) x <- cbind(X1=rnorm(300), X2=rnorm(300), X3=rnorm(300)) out <-npmvtol.region(x = x, alpha = 0.10, P = 0.90, depth.fn = mdepth, type = "semispace", semi.order = list(lower = 2, center = 3, upper = 1)) gg.out <- plotly_npmvtol(tol.out = out , x = x)
library(plotly) mdepth <- function(pts, x){ mahalanobis(pts, center = rep(0, 3), cov = diag(1, 3)) } set.seed(100) x <- cbind(X1=rnorm(300), X2=rnorm(300), X3=rnorm(300)) out <-npmvtol.region(x = x, alpha = 0.10, P = 0.90, depth.fn = mdepth, type = "semispace", semi.order = list(lower = 2, center = 3, upper = 1)) gg.out <- plotly_npmvtol(tol.out = out , x = x)
Provides interactive tolerance intervals for regression data. More specifically, plotly_regtol
presents tolerance bounds for linear regression, nonlinear regression, and nonparametric regression models. In addtion, this updated function is capable of showing tolerance plane for trivariate regression models.
plotly_regtol(tol.out, x, new.x = NULL, y, side = c("two","upper", "lower"), rect = FALSE, smooth = 4, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, z.lab = NULL, z.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, z.tick.size = NULL, x.col = NULL, x.cex = NULL, fit.col = NULL, fit.lwd = NULL, fit.line.type = c("dash","dot","dashdot","solid"), fit.opacity = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), tol.opacity = NULL, title.position.x = NULL, title.position.y = NULL, title = NULL, title.size = NULL)
plotly_regtol(tol.out, x, new.x = NULL, y, side = c("two","upper", "lower"), rect = FALSE, smooth = 4, x.lab = NULL, x.lab.size = NULL, y.lab = NULL, y.lab.size = NULL, z.lab = NULL, z.lab.size = NULL, x.tick.size = NULL, y.tick.size = NULL, z.tick.size = NULL, x.col = NULL, x.cex = NULL, fit.col = NULL, fit.lwd = NULL, fit.line.type = c("dash","dot","dashdot","solid"), fit.opacity = NULL, tol.col = NULL, tol.lwd = NULL, tol.line.type = c("dash","dot","dashdot","solid"), tol.opacity = NULL, title.position.x = NULL, title.position.y = NULL, title = NULL, title.size = NULL)
tol.out |
Output from |
x |
Data frame for explanatory variables. If there are more than one explanatory variables, columns of |
new.x |
An optional data frame in which to look for variables with which to predict. |
y |
Data frame for response variable. |
side |
|
rect |
This argument is used for plotting tolerance plane(s) of multivariate regression region. When |
smooth |
The smooth parameter for the x1-x2 plane when |
x.lab |
Label of the x-axis. |
x.lab.size |
Size of label of the x-axis. |
y.lab |
Label of the y-axis. |
y.lab.size |
Size of label of the y-axis. |
z.lab |
Label of the z-axis. |
z.lab.size |
Size of label of the z-axis. |
x.tick.size |
Size of tick marks on the x-axis. |
y.tick.size |
Size of tick marks on the y-axis. |
z.tick.size |
Size of tick marks on the z-axis. |
x.col |
Color of original data points. |
x.cex |
Size of original data points. |
fit.col |
Color of fitted line or fitted plane. |
fit.lwd |
Width of fitted line or fitted plane. |
fit.line.type |
Type of fitted line or fitted plane. |
fit.opacity |
Opacity of fitted line or fitted plane. |
tol.col |
Color of tolerance intervals or tolerance plane. |
tol.lwd |
Width of tolerance intervals. |
tol.line.type |
Line type of tolerance intervals |
tol.opacity |
Opacity of tolerance region. |
title.position.x |
Horizontal position of the title. |
title.position.y |
Vertical position of the title. |
title |
The main title on top of the plot. |
title.size |
Size of the title. |
plotly_regtol
returns tolerance intervals for linear regression, nonlinear regression, nonparametric regression, as well as tolerance planes for multivariate (multiple) linear regression models.
Montgomery, D. C. (2005), Introduction to Statistical Quality Control, Fifth Edition, John Wiley & Sons, Inc.
plottol
, regtol.int
, regtol.int
, nlregtol.int
,
npregtol.int
, npregtol.int
,mvregtol.region
## 95%/95% 1-sided linear regression tolerance bounds ## for a sample of size 100. library(plotly) set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out1 <- regtol.int(reg = lm(y ~ x), new.x = c(3,6,20), new=TRUE , side = 1, alpha = 0.05, P = 0.95) plotly_regtol(tol.out = out1 , x=x , y=y , new.x = c(6,20), side = "two" , fit.line.type = "dash" , tol.line.type = "solid") ######################## set.seed(100) x1 <- runif(100, 0, 10) x2 <- rpois(100 , 5) y <- 20 + 5*x1 + 3*x2 + rnorm(100, 0, 3) x1.new <- runif(10 , 0 , 10) x2.new <- rpois(10 , 5) out2 <- regtol.int(reg = lm(y ~ x1 + x2), new.x = cbind(x1.new , x2.new), new=TRUE, side = 1, alpha = 0.05, P = 0.95) plotly_regtol(tol.out = out2 , y=y , x=cbind(x1,x2) , new.x = cbind(x1.new , x2.new) , rect = TRUE , side = "two") ########################### ## 95%/95% 2-sided nonlinear regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) formula <- as.formula(y ~ b1 + (0.49 - b1)*exp(-b2*(x - 8))) out1 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x)), x.new=c(10,20,50), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) plotly_regtol(tol.out = out1 , x=x , y=y , new.x = c(20,50) , side = "two", fit.line.type = "dot") ############### ## 95%/95% 1-sided nonparametric regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) y.hat <- loess(y~x)$fit out1 <- npregtol.int(x = x, y = y, y.hat = y.hat, side = 1, alpha = 0.05, P = 0.95, method = "WILKS" , new = TRUE) plotly_regtol(tol.out = out1 , x=x , y=y , side = "two" , fit.line.type = "dash") ############ set.seed(100) x1 <- runif(50, 5, 45) x2 <- rnorm(50 , 0 , 1) f1 <- function(x1 , x2 , b1, b2) {b1 + (0.49 - b1)*exp(-b2*(x1 + x2 - 8)) + rnorm(50, sd = 0.01)} y <- f1(x1 , x2 , 0.39, 0.11) y.hat <- loess(y~ x1 + x2)$fit out2 <- npregtol.int(x = cbind(x1 , x2), y = y, y.hat = y.hat, side = 1, alpha = 0.05, P = 0.95, method = "WILKS" , new = TRUE) plotly_regtol(tol.out = out2 , y=y , x=cbind(x1,x2) , rect = TRUE , smooth = 100 , side = "two")
## 95%/95% 1-sided linear regression tolerance bounds ## for a sample of size 100. library(plotly) set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out1 <- regtol.int(reg = lm(y ~ x), new.x = c(3,6,20), new=TRUE , side = 1, alpha = 0.05, P = 0.95) plotly_regtol(tol.out = out1 , x=x , y=y , new.x = c(6,20), side = "two" , fit.line.type = "dash" , tol.line.type = "solid") ######################## set.seed(100) x1 <- runif(100, 0, 10) x2 <- rpois(100 , 5) y <- 20 + 5*x1 + 3*x2 + rnorm(100, 0, 3) x1.new <- runif(10 , 0 , 10) x2.new <- rpois(10 , 5) out2 <- regtol.int(reg = lm(y ~ x1 + x2), new.x = cbind(x1.new , x2.new), new=TRUE, side = 1, alpha = 0.05, P = 0.95) plotly_regtol(tol.out = out2 , y=y , x=cbind(x1,x2) , new.x = cbind(x1.new , x2.new) , rect = TRUE , side = "two") ########################### ## 95%/95% 2-sided nonlinear regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) formula <- as.formula(y ~ b1 + (0.49 - b1)*exp(-b2*(x - 8))) out1 <- nlregtol.int(formula = formula, xy.data = data.frame(cbind(y, x)), x.new=c(10,20,50), side = 2, alpha = 0.05, P = 0.95 , new = TRUE) plotly_regtol(tol.out = out1 , x=x , y=y , new.x = c(20,50) , side = "two", fit.line.type = "dot") ############### ## 95%/95% 1-sided nonparametric regression tolerance bounds ## for a sample of size 50. set.seed(100) x <- runif(50, 5, 45) f1 <- function(x, b1, b2) b1 + (0.49 - b1)*exp(-b2*(x - 8)) + rnorm(50, sd = 0.01) y <- f1(x, 0.39, 0.11) y.hat <- loess(y~x)$fit out1 <- npregtol.int(x = x, y = y, y.hat = y.hat, side = 1, alpha = 0.05, P = 0.95, method = "WILKS" , new = TRUE) plotly_regtol(tol.out = out1 , x=x , y=y , side = "two" , fit.line.type = "dash") ############ set.seed(100) x1 <- runif(50, 5, 45) x2 <- rnorm(50 , 0 , 1) f1 <- function(x1 , x2 , b1, b2) {b1 + (0.49 - b1)*exp(-b2*(x1 + x2 - 8)) + rnorm(50, sd = 0.01)} y <- f1(x1 , x2 , 0.39, 0.11) y.hat <- loess(y~ x1 + x2)$fit out2 <- npregtol.int(x = cbind(x1 , x2), y = y, y.hat = y.hat, side = 1, alpha = 0.05, P = 0.95, method = "WILKS" , new = TRUE) plotly_regtol(tol.out = out2 , y=y , x=cbind(x1,x2) , rect = TRUE , smooth = 100 , side = "two")
Provides control charts and/or histograms for tolerance bounds on continuous data as well as tolerance ellipses for data distributed according to bivariate and trivariate normal distributions. Scatterplots with regression tolerance bounds and interval plots for ANOVA tolerance intervals may also be produced.
plottol(tol.out, x, y = NULL, y.hat = NULL, side = c("two", "upper", "lower"), plot.type = c("control", "hist", "both"), x.lab = NULL, y.lab = NULL, z.lab = NULL, ...)
plottol(tol.out, x, y = NULL, y.hat = NULL, side = c("two", "upper", "lower"), plot.type = c("control", "hist", "both"), x.lab = NULL, y.lab = NULL, z.lab = NULL, ...)
tol.out |
Output from any continuous (including ANOVA) tolerance interval procedure or from a regression tolerance bound procedure. |
x |
Either data from a continuous distribution or the predictors for a regression model. If this is a design matrix
for a linear regression model, then it must be in matrix form AND include a column of 1's if there is to be an intercept. Note
that multiple predictors are only allowed if considering polynomial regression. If the output for |
y |
The response vector for a regression setting. Leave as |
y.hat |
The fitted values from a nonparametric smoothing routine if plotting nonparametric regression tolerance bounds. Otherwise,
leave as |
side |
|
plot.type |
|
x.lab |
Specify the label for the x-axis. |
y.lab |
Specify the label for the y-axis. |
z.lab |
Specify the label for the z-axis. |
... |
Additional arguments passed to the plotting function used for the control charts or regression scatterplots. |
plottol
can return a control chart, histogram, or both for continuous data along with the calculated tolerance intervals.
For regression data, plottol
returns a scatterplot along with the regression tolerance bounds. For ANOVA output, plottol
returns an interval plot for each factor.
Montgomery, D. C. (2005), Introduction to Statistical Quality Control, Fifth Edition, John Wiley & Sons, Inc.
## 90%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, dist = "Weibull") out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Weibull Data") ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000) out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000) out2 plottol(out2, x) ## 95%/95% 2-sided linear regression tolerance bounds ## for a sample of size 100. set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out3 <- regtol.int(reg = lm(y ~ x), new.x = data.frame(x = c(3, 6, 9)), side = 2, alpha = 0.05, P = 0.95) plottol(out3, x = cbind(1, x), y = y, side = "two", x.lab = "X", y.lab = "Y")
## 90%/90% 1-sided Weibull tolerance intervals for a sample ## of size 150. set.seed(100) x <- rweibull(150, 3, 75) out <- exttol.int(x = x, alpha = 0.15, P = 0.90, dist = "Weibull") out plottol(out, x, plot.type = "both", side = "lower", x.lab = "Weibull Data") ## 90%/90% trivariate normal tolerance region. set.seed(100) x1 <- rnorm(100, 0, 0.2) x2 <- rnorm(100, 0, 0.5) x3 <- rnorm(100, 5, 1) x <- cbind(x1, x2, x3) mvtol.region(x = x, alpha = c(0.10, 0.05, 0.01), P = c(0.90, 0.95, 0.99), B = 1000) out2 <- mvtol.region(x = x, alpha = 0.10, P = 0.90, B = 1000) out2 plottol(out2, x) ## 95%/95% 2-sided linear regression tolerance bounds ## for a sample of size 100. set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out3 <- regtol.int(reg = lm(y ~ x), new.x = data.frame(x = c(3, 6, 9)), side = 2, alpha = 0.05, P = 0.95) plottol(out3, x = cbind(1, x), y = y, side = "two", x.lab = "X", y.lab = "Y")
Performs maximum likelihood estimation for the parameter of the Poisson-Lindley distribution.
poislind.ll(x, theta = NULL, ...)
poislind.ll(x, theta = NULL, ...)
x |
A vector of raw data which is distributed according to a Poisson-Lindley distribution. |
theta |
Optional starting value for the parameter. If |
... |
Additional arguments passed to the |
The discrete Poisson-Lindley distribution is a compound distribution that, potentially, provides a better fit for count data relative to the traditional Poisson and negative binomial distributions.
See the help file for mle
to see how the output is structured.
Ghitany, M. E. and Al-Mutairi, D. K. (2009), Estimation Methods for the Discrete Poisson-Lindley Distribution, Journal of Statistical Computation and Simulation, 79, 1–9.
Sankaran, M. (1970), The Discrete Poisson-Lindley Distribution, Biometrics, 26, 145–149.
## Maximum likelihood estimation for randomly generated data ## from the Poisson-Lindley distribution. set.seed(100) pl.data <- rpoislind(n = 500, theta = 0.5) out.pl <- poislind.ll(pl.data) stats4::coef(out.pl) stats4::vcov(out.pl)
## Maximum likelihood estimation for randomly generated data ## from the Poisson-Lindley distribution. set.seed(100) pl.data <- rpoislind(n = 500, theta = 0.5) out.pl <- poislind.ll(pl.data) stats4::coef(out.pl) stats4::vcov(out.pl)
Provides 1-sided or 2-sided tolerance intervals for data distributed according to the Poisson-Lindley distribution.
poislindtol.int(x, m = NULL, alpha = 0.05, P = 0.99, side = 1, ...)
poislindtol.int(x, m = NULL, alpha = 0.05, P = 0.99, side = 1, ...)
x |
A vector of raw data which is distributed according to a Poisson-Lindley distribution. |
m |
The number of observations in a future sample for which the tolerance limits will be calculated. By default, |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
... |
Additional arguments passed to the |
The discrete Poisson-Lindley distribution is a compound distribution that, potentially, provides a better fit for count data relative to the traditional Poisson and negative binomial distributions. Poisson-Lindley distributions are heavily right-skewed distributions. For most practical applications, one will typically be interested in 1-sided upper bounds.
poislindtol.int
returns a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
theta |
MLE for the shape parameter |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Naghizadeh Qomi, M., Kiapour, A., and Young, D. S. (2015), Approximate Tolerance Intervals for the Discrete Poisson-Lindley Distribution, Journal of Statistical Computation and Simulation, 86, 841–854.
## 90%/90% 1-sided tolerance intervals for data assuming ## the Poisson-Lindley distribution. x <- c(rep(0, 447), rep(1, 132), rep(2, 42), rep(3, 21), rep(4, 3), rep(5, 2)) out <- poislindtol.int(x, alpha = 0.10, P = 0.90, side = 1) out
## 90%/90% 1-sided tolerance intervals for data assuming ## the Poisson-Lindley distribution. x <- c(rep(0, 447), rep(1, 132), rep(2, 42), rep(3, 21), rep(4, 3), rep(5, 2)) out <- poislindtol.int(x, alpha = 0.10, P = 0.90, side = 1) out
Density (mass), distribution function, quantile function, and random generation for the Poisson-Lindley distribution.
dpoislind(x, theta, log = FALSE) ppoislind(q, theta, lower.tail = TRUE, log.p = FALSE) qpoislind(p, theta, lower.tail = TRUE, log.p = FALSE) rpoislind(n, theta)
dpoislind(x, theta, log = FALSE) ppoislind(q, theta, lower.tail = TRUE, log.p = FALSE) qpoislind(p, theta, lower.tail = TRUE, log.p = FALSE) rpoislind(n, theta)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
The number of observations. If |
theta |
The shape parameter, which must be greater than 0. |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
The Poisson-Lindley distribution has mass
where and
is the shape parameter.
dpoislind
gives the density (mass), ppoislind
gives the distribution function, qpoislind
gives the quantile function, and rpoislind
generates random deviates for the specified distribution.
Ghitany, M. E. and Al-Mutairi, D. K. (2009), Estimation Methods for the Discrete Poisson-Lindley Distribution, Journal of Statistical Computation and Simulation, 79, 1–9.
Sankaran, M. (1970), The Discrete Poisson-Lindley Distribution, Biometrics, 26, 145–149.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the Poisson-Lindley ## distribution. set.seed(100) x <- rpoislind(n = 150, theta = 0.5) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dpoislind(x = x.1, theta = 0.5) lines(x.1, y, col = 2, lwd = 2) plot(x.1, ppoislind(q = x.1, theta = 0.5), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qpoislind(p = 0.20, theta = 0.5, lower.tail = FALSE) qpoislind(p = 0.80, theta = 0.5)
## Randomly generated data from the Poisson-Lindley ## distribution. set.seed(100) x <- rpoislind(n = 150, theta = 0.5) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dpoislind(x = x.1, theta = 0.5) lines(x.1, y, col = 2, lwd = 2) plot(x.1, ppoislind(q = x.1, theta = 0.5), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qpoislind(p = 0.20, theta = 0.5, lower.tail = FALSE) qpoislind(p = 0.80, theta = 0.5)
Provides 1-sided or 2-sided tolerance intervals for Poisson random variables. From a statistical quality control perspective, these limits bound the number of occurrences (which follow a Poisson distribution) in a specified future time period.
poistol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("TAB", "LS", "SC", "CC", "VS", "RVS", "FT", "CSC"))
poistol.int(x, n, m = NULL, alpha = 0.05, P = 0.99, side = 1, method = c("TAB", "LS", "SC", "CC", "VS", "RVS", "FT", "CSC"))
x |
The number of occurrences of the event in time period |
n |
The time period of the original measurements. |
m |
The specified future length of time. If |
alpha |
The level chosen such that |
P |
The proportion of occurrences in future time lengths of size |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
method |
The method for calculating the lower and upper confidence bounds, which are used in the calculation
of the tolerance bounds. The default method is |
poistol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of occurrences in future time periods of length |
lambda.hat |
The mean occurrence rate per unit time, calculated by |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Barker, L. (2002), A Comparison of Nine Confidence Intervals for a Poisson Parameter When the Expected Number of Events Is , The American Statistician, 56, 85–89.
Freeman, M. F. and Tukey, J. W. (1950), Transformations Related to the Angular and the Square Root, Annals of Mathematical Statistics, 21, 607–611.
Hahn, G. J. and Chandra, R. (1981), Tolerance Intervals for Poisson and Binomial Variables, Journal of Quality Technology, 13, 100–110.
## 95%/90% 1-sided Poisson tolerance limits for future ## occurrences in a period of length 3. All seven methods ## are presented for comparison. poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "TAB") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "LS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "SC") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "CC") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "VS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "RVS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "FT") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "CSC") ## 95%/90% 2-sided Poisson tolerance intervals for future ## occurrences in a period of length 15. All seven methods ## are presented for comparison. poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "TAB") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "LS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "SC") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "CC") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "VS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "RVS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "FT") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "CSC")
## 95%/90% 1-sided Poisson tolerance limits for future ## occurrences in a period of length 3. All seven methods ## are presented for comparison. poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "TAB") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "LS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "SC") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "CC") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "VS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "RVS") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "FT") poistol.int(x = 45, n = 9, m = 3, alpha = 0.05, P = 0.90, side = 1, method = "CSC") ## 95%/90% 2-sided Poisson tolerance intervals for future ## occurrences in a period of length 15. All seven methods ## are presented for comparison. poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "TAB") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "LS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "SC") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "CC") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "VS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "RVS") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "FT") poistol.int(x = 45, n = 9, m = 15, alpha = 0.05, P = 0.90, side = 2, method = "CSC")
Provides 1-sided or 2-sided (multiple) linear regression tolerance bounds. It is also possible to fit a regression through the origin model.
regtol.int(reg, new.x = NULL, side = 1, alpha = 0.05, P = 0.99, new = FALSE)
regtol.int(reg, new.x = NULL, side = 1, alpha = 0.05, P = 0.99, new = FALSE)
reg |
An object of class |
new.x |
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used. |
side |
Whether a 1-sided or 2-sided tolerance bound is required (determined by |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by the tolerance bound(s). |
new |
When |
regtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by the tolerance bound(s). |
y |
The value of the response given on the left-hand side of the model in |
y.hat |
The predicted value of the response for the fitted linear regression model. This data frame is sorted by this value. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Wallis, W. A. (1951), Tolerance Intervals for Linear Regression, in Second Berkeley Symposium on Mathematical Statistics and Probability, ed. J. Neyman, Berkeley: University of CA Press, 43–51.
Young, D. S. (2013), Regression Tolerance Intervals, Communications in Statistics - Simulation and Computation, 42, 2040–2055.
## 95%/95% 2-sided linear regression tolerance bounds ## for a sample of size 100. set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out <- regtol.int(reg = lm(y ~ x), new.x = data.frame(x = c(3, 6, 9)), side = 2, alpha = 0.05, P = 0.95) out plottol(out, x = cbind(1, x), y = y, side = "two", x.lab = "X", y.lab = "Y")
## 95%/95% 2-sided linear regression tolerance bounds ## for a sample of size 100. set.seed(100) x <- runif(100, 0, 10) y <- 20 + 5*x + rnorm(100, 0, 3) out <- regtol.int(reg = lm(y ~ x), new.x = data.frame(x = c(3, 6, 9)), side = 2, alpha = 0.05, P = 0.95) out plottol(out, x = cbind(1, x), y = y, side = "two", x.lab = "X", y.lab = "Y")
Provides confidence intervals, one-sided prediction limits, and one-sided tolerance limits for semicontinuous data — either zero-inflated gamma (ZIG) or zero-inflated lognormal (ZILN) distribution — using a generalized fiducial framework.
semiconttol.int(x, alpha = 0.05, P = 0.99, N = 1000)
semiconttol.int(x, alpha = 0.05, P = 0.99, N = 1000)
x |
A vector of semicontinuous data. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
N |
The number of fiducial samples to generate. |
semiconttol.int
returns a list with items:
ZIG.CI |
The generalized confidence interval under a ZIG distribution. |
ZIG.PI |
The generalized (upper) prediction limit under a ZIG distribution. |
ZIG.TI |
The generalized (upper) tolerance limit under a ZIG distribution. |
ZIG.TI.appx |
The generalized (upper) tolerance limit under a ZIG distribution based on the Wilson-Hilferty approximation. |
ZILN.CI |
The generalized confidence interval under a ZILN distribution. |
ZILN.PI |
The generalized (upper) prediction limit under a ZILN distribution. |
ZILN.TI |
The generalized (upper) tolerance limit under a ZILN distribution. |
ZILN.TI.appx |
The generalized (upper) tolerance limit under a ZILN distribution based on an approximation used in Hasan and Krishnamoorthy (2018). |
`NA` |
The number of times generalized fiducial quantities could not be calculated due to unlucky samples being drawn; e.g., a sample with all 0s. This will happen rarely and usually only when there is a very large proportion of zeros. |
Hasan, M. S. and Krishnamoorthy, K. (2018), Confidence Intervals for the Mean and a Percentile Based on Zero-Inflated Lognormal Data, Journal of Statistical Computation and Simulation, 88, 1499–1514.
Zou, Y. and Young, D. S. (2024), Fiducial-Based Statistical Intervals for Zero-Inflated Gamma Data, Journal of Statistical Theory and Practice, 18, 1–20.
fidbintol.int
, fidnegbintol.int
, fidpoistol.int
## Generalized intervals assuming 95% confidence and ## 95% content for a dataset analyzed in Hasan and ## Krishnamoorthy (2018). x <- c(6, 0, 6, 9, 6.5, 0, 0, 0, 1, 0.5, 2, 2, 0, 0, 1) set.seed(1) out <- semiconttol.int(x, P = 0.95, alpha = 0.05, N = 500) out
## Generalized intervals assuming 95% confidence and ## 95% content for a dataset analyzed in Hasan and ## Krishnamoorthy (2018). x <- c(6, 0, 6, 9, 6.5, 0, 0, 0, 1, 0.5, 2, 2, 0, 0, 1) set.seed(1) out <- semiconttol.int(x, P = 0.95, alpha = 0.05, N = 500) out
Provides simultaneous 1-sided or 2-sided tolerance intervals for data distributed according to either a normal distribution or log-normal distribution.
simnormtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("EXACT", "BONF"), m = 50, log.norm = FALSE)
simnormtol.int(x, alpha = 0.05, P = 0.99, side = 1, method = c("EXACT", "BONF"), m = 50, log.norm = FALSE)
x |
Either a matrix or list of vectors of the data. If a matrix, then the columns are the samples from the different normal (or log-normal) populations. If |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether simultaneous 1-sided or 2-sided tolerance intervals are required (determined by |
method |
The method for calculating the k-factors. |
m |
The maximum number of subintervals to be used in the |
log.norm |
If |
Recall that if the random variable is distributed according to a log-normal distribution, then the random variable
is
distributed according to a normal distribution.
normtol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
x.bar |
The sample means. |
1-sided.lower |
The simultaneous 1-sided lower tolerance bounds. This is given only if |
1-sided.upper |
The simultaneous 1-sided upper tolerance bounds. This is given only if |
2-sided.lower |
The simultaneous 2-sided lower tolerance bounds. This is given only if |
2-sided.upper |
The simultaneous 2-sided upper tolerance bounds. This is given only if |
The code for this functions is built upon code provided by Andrew Landgraf.
Krishnamoorthy, K. and Mathew, T. (2009), Statistical Tolerance Regions: Theory, Applications, and Computation, Wiley.
Mee, R. W. (1990), Simultaneous Tolerance Intervals for Normal Populations with Common Variance, Technometrics, 32, 83-92.
## 95%/95% simultaneous 1-sided normal tolerance ## intervals for two samples of unequal size. set.seed(100) x <- list(rnorm(5,1),rnorm(7,1,2)) out <- simnormtol.int(x = x, alpha = 0.05, P = 0.95, side = 1, method = "BONF") out
## 95%/95% simultaneous 1-sided normal tolerance ## intervals for two samples of unequal size. set.seed(100) x <- list(rnorm(5,1),rnorm(7,1,2)) out <- simnormtol.int(x = x, alpha = 0.05, P = 0.95, side = 1, method = "BONF") out
Density, distribution function, quantile function, and random generation for the 2-parameter
exponential distribution with rate equal to rate
and shift equal to shift
.
d2exp(x, rate = 1, shift = 0, log = FALSE) p2exp(q, rate = 1, shift = 0, lower.tail = TRUE, log.p = FALSE) q2exp(p, rate = 1, shift = 0, lower.tail = TRUE, log.p = FALSE) r2exp(n, rate = 1, shift = 0)
d2exp(x, rate = 1, shift = 0, log = FALSE) p2exp(q, rate = 1, shift = 0, lower.tail = TRUE, log.p = FALSE) q2exp(p, rate = 1, shift = 0, lower.tail = TRUE, log.p = FALSE) r2exp(n, rate = 1, shift = 0)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
The number of observations. If |
rate |
Vector of rates. |
shift |
Vector of shifts. |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
If rate
or shift
are not specified, then they assume the default values of 1 and 0, respectively.
The 2-parameter exponential distribution has density
where ,
is the shift parameter, and
is the scale parameter.
d2exp
gives the density, p2exp
gives the distribution function, q2exp
gives the quantile
function, and r2exp
generates random deviates.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the 2-parameter exponential ## distribution. set.seed(100) x <- r2exp(n = 500, rate = 3, shift = -10) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 = sort(x) y <- d2exp(x = x.1, rate = 3, shift = -10) lines(x.1, y, col = 2, lwd = 2) plot(x.1, p2exp(q = x.1, rate = 3, shift = -10), type = "l", xlab = "x", ylab = "Cumulative Probabilities") q2exp(p = 0.20, rate = 3, shift = -10, lower.tail = FALSE) q2exp(p = 0.80, rate = 3, shift = -10)
## Randomly generated data from the 2-parameter exponential ## distribution. set.seed(100) x <- r2exp(n = 500, rate = 3, shift = -10) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 = sort(x) y <- d2exp(x = x.1, rate = 3, shift = -10) lines(x.1, y, col = 2, lwd = 2) plot(x.1, p2exp(q = x.1, rate = 3, shift = -10), type = "l", xlab = "x", ylab = "Cumulative Probabilities") q2exp(p = 0.20, rate = 3, shift = -10, lower.tail = FALSE) q2exp(p = 0.80, rate = 3, shift = -10)
Provides uniformly most accurate upper tolerance limits for the binomial, negative binomial, and Poisson distributions.
umatol.int(x, n = NULL, dist = c("Bin", "NegBin", "Pois"), N, alpha = 0.05, P = 0.99)
umatol.int(x, n = NULL, dist = c("Bin", "NegBin", "Pois"), N, alpha = 0.05, P = 0.99)
x |
A vector of data which is distributed according to one of the binomial, negative binomial, or Poisson distributions.
If the length of |
n |
The sample size of the data. If |
dist |
The distribution for the data given by |
N |
Must be specified for the binomial and negative binomial distributions. If |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
umatol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
p.hat |
The maximum likelihood estimate for the probability of success in each trial; reported if |
nu.hat |
The maximum likelihood estimate for the probability of success in each trial; reported if |
lambda.hat |
The maximum likelihood estimate for the rate of success; reported if |
1-sided.upper |
The 1-sided upper tolerance limit. |
Zacks, S. (1970), Uniformly Most Accurate Tolerance Limits for Monotone Likelihood Ratio Families of Discrete Distributions, Journal of the American Statistical Association, 65, 307–316.
Binomial
, NegBinomial
, Poisson
## Examples from Zacks (1970). umatol.int(25, n = 4, dist = "Bin", N = 10, alpha = 0.10, P = 0.95) umatol.int(13, n = 10, dist = "NegBin", N = 2, alpha = 0.10, P = 0.95) umatol.int(37, n = 10, dist = "Pois", alpha = 0.10, P = 0.95)
## Examples from Zacks (1970). umatol.int(25, n = 4, dist = "Bin", N = 10, alpha = 0.10, P = 0.95) umatol.int(13, n = 10, dist = "NegBin", N = 2, alpha = 0.10, P = 0.95) umatol.int(37, n = 10, dist = "Pois", alpha = 0.10, P = 0.95)
Provides 1-sided or 2-sided tolerance intervals for data distributed according to a uniform distribution.
uniftol.int(x, alpha = 0.05, P = 0.99, upper = NULL, lower = NULL, side = 1)
uniftol.int(x, alpha = 0.05, P = 0.99, upper = NULL, lower = NULL, side = 1)
x |
A vector of data which is distributed according to a uniform distribution. |
alpha |
The level chosen such that |
P |
The proportion of the population to be covered by this tolerance interval. |
upper |
The upper bound of the data. When |
lower |
The lower bound of the data. When |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
uniftol.int
returns a data frame with items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
Faulkenberry, G. D. and Weeks, D. L. (1968), Sample Size Determination for Tolerance Limits, Technometrics, 10, 343–348.
## 90%/90% 1-sided uniform tolerance intervals for a sample ## of size 50 with a known lower bound of 0. set.seed(100) x <- runif(50, 0, 50) out <- uniftol.int(x = x, alpha = 0.10, P = 0.90, lower = 0, side = 1) out plottol(out, x, plot.type = "hist", side = "two", x.lab = "Uniform Data")
## 90%/90% 1-sided uniform tolerance intervals for a sample ## of size 50 with a known lower bound of 0. set.seed(100) x <- runif(50, 0, 50) out <- uniftol.int(x = x, alpha = 0.10, P = 0.90, lower = 0, side = 1) out plottol(out, x, plot.type = "hist", side = "two", x.lab = "Uniform Data")
Density (mass), distribution function, quantile function, and random generation for the Zipf, Zipf-Mandelbrot, and zeta distributions.
dzipfman(x, s, b = NULL, N = NULL, log = FALSE) pzipfman(q, s, b = NULL, N = NULL, lower.tail = TRUE, log.p = FALSE) qzipfman(p, s, b = NULL, N = NULL, lower.tail = TRUE, log.p = FALSE) rzipfman(n, s, b = NULL, N = NULL)
dzipfman(x, s, b = NULL, N = NULL, log = FALSE) pzipfman(q, s, b = NULL, N = NULL, lower.tail = TRUE, log.p = FALSE) qzipfman(p, s, b = NULL, N = NULL, lower.tail = TRUE, log.p = FALSE) rzipfman(n, s, b = NULL, N = NULL)
x , q
|
Vector of quantiles. |
p |
Vector of probabilities. |
n |
The number of observations. If |
s , b
|
The shape parameters, both of which must be greater than 0. |
N |
The number of categories, which must be integer-valued for Zipf and Zipf-Mandelbrot distributions. For a zeta distribution, |
log , log.p
|
Logical vectors. If |
lower.tail |
Logical vector. If |
The Zipf-Mandelbrot distribution has mass
where ,
s,b>0
are shape parameters, and N
is the number of distinct categories. The Zipf distribution is just a special case of the Zipf-Mandelbrot distribution where the second shape parameter b=0
. The zeta distribution has mass
where ,
s>1
is the shape parameter, and is the Riemann zeta function given by:
Note that the zeta distribution is just a special case of the Zipf distribution where s>1
and N
goes to infinity.
dzipfman
gives the density (mass), pzipfman
gives the distribution function, qzipfman
gives the quantile function, and rzipfman
generates random deviates for the specified distribution.
These functions may be updated in a future version of the package so as to allow greater flexibility with the inputs.
Mandelbrot, B. B. (1965), Information Theory and Psycholinguistics. In B. B. Wolman and E. Nagel, editors. Scientific Psychology, Basic Books.
Young, D. S. (2013), Approximate Tolerance Limits for Zipf-Mandelbrot Distributions, Physica A: Statistical Mechanics and its Applications, 392, 1702–1711.
Zipf, G. K. (1949), Human Behavior and the Principle of Least Effort, Hafner.
Zornig, P. and Altmann, G. (1995), Unified Representation of Zipf Distributions, Computational Statistics and Data Analysis, 19, 461–473.
runif
and .Random.seed
about random number generation.
## Randomly generated data from the Zipf distribution. set.seed(100) x <- rzipfman(n = 150, s = 2, N = 100) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 2, N = 100) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 2, N = 100), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 2, N = 100, lower.tail = FALSE) qzipfman(p = 0.80, s = 2, N = 100) ## Randomly generated data from the Zipf-Mandelbrot distribution. set.seed(100) x <- rzipfman(n = 150, s = 2, b = 3, N = 100) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 2, b = 3, N = 100) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 2, b = 3, N = 100), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 2, b = 3, N = 100, lower.tail = FALSE) qzipfman(p = 0.80, s = 2, b = 3, N = 100) ## Randomly generated data from the zeta distribution. set.seed(100) x <- rzipfman(n = 100, s = 1.3, N = Inf) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 1.3, N = Inf) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 1.3, N = Inf), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 1.3, lower.tail = FALSE, N = Inf) qzipfman(p = 0.80, s = 1.3, N = Inf)
## Randomly generated data from the Zipf distribution. set.seed(100) x <- rzipfman(n = 150, s = 2, N = 100) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 2, N = 100) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 2, N = 100), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 2, N = 100, lower.tail = FALSE) qzipfman(p = 0.80, s = 2, N = 100) ## Randomly generated data from the Zipf-Mandelbrot distribution. set.seed(100) x <- rzipfman(n = 150, s = 2, b = 3, N = 100) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 2, b = 3, N = 100) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 2, b = 3, N = 100), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 2, b = 3, N = 100, lower.tail = FALSE) qzipfman(p = 0.80, s = 2, b = 3, N = 100) ## Randomly generated data from the zeta distribution. set.seed(100) x <- rzipfman(n = 100, s = 1.3, N = Inf) hist(x, main = "Randomly Generated Data", prob = TRUE) x.1 <- sort(x) y <- dzipfman(x = x.1, s = 1.3, N = Inf) lines(x.1, y, col = 2, lwd = 2) plot(x.1, pzipfman(q = x.1, s = 1.3, N = Inf), type = "l", xlab = "x", ylab = "Cumulative Probabilities") qzipfman(p = 0.20, s = 1.3, lower.tail = FALSE, N = Inf) qzipfman(p = 0.80, s = 1.3, N = Inf)
Provides 1-sided or 2-sided tolerance intervals for data distributed according to Zipf, Zipf-Mandelbrot, and zeta distributions.
zipftol.int(x, m = NULL, N = NULL, alpha = 0.05, P = 0.99, side = 1, s = 1, b = 1, dist = c("Zipf", "Zipf-Man", "Zeta"), ties = FALSE, ...)
zipftol.int(x, m = NULL, N = NULL, alpha = 0.05, P = 0.99, side = 1, s = 1, b = 1, dist = c("Zipf", "Zipf-Man", "Zeta"), ties = FALSE, ...)
x |
A vector of raw data or a table of counts which is distributed according to a Zipf, Zipf-Mandelbrot, or zeta distribution. Do not supply a vector of counts! |
m |
The number of observations in a future sample for which the tolerance limits will be calculated. By default, |
N |
The number of categories when |
alpha |
The level chosen such that 1-alpha is the confidence level. |
P |
The proportion of the population to be covered by this tolerance interval. |
side |
Whether a 1-sided or 2-sided tolerance interval is required (determined by |
s |
The initial value to estimate the shape parameter in the |
b |
The initial value to estimate the second shape parameter in the |
dist |
Options are |
ties |
How to handle if there are other categories with the same frequency as the category at the estimated tolerance limit. The default is |
... |
Additional arguments passed to the |
Zipf-Mandelbrot models are commonly used to model phenomena where the frequencies of categorical data are approximately inversely proportional to its rank in the frequency table. Zipf-Mandelbrot distributions are heavily right-skewed distributions with a (relatively) large mass placed on the first category. For most practical applications, one will typically be interested in 1-sided upper bounds.
zipftol.int
returns a data frame with the following items:
alpha |
The specified significance level. |
P |
The proportion of the population covered by this tolerance interval. |
s.hat |
MLE for the shape parameter |
b.hat |
MLE for the shape parameter |
1-sided.lower |
The 1-sided lower tolerance bound. This is given only if |
1-sided.upper |
The 1-sided upper tolerance bound. This is given only if |
2-sided.lower |
The 2-sided lower tolerance bound. This is given only if |
2-sided.upper |
The 2-sided upper tolerance bound. This is given only if |
This function may be updated in a future version of the package so as to allow greater flexibility with the inputs.
Mandelbrot, B. B. (1965), Information Theory and Psycholinguistics. In B. B. Wolman and E. Nagel, editors. Scientific Psychology, Basic Books.
Young, D. S. (2013), Approximate Tolerance Limits for Zipf-Mandelbrot Distributions, Physica A: Statistical Mechanics and its Applications, 392, 1702–1711.
Zipf, G. K. (1949), Human Behavior and the Principle of Least Effort, Hafner.
Zornig, P. and Altmann, G. (1995), Unified Representation of Zipf Distributions, Computational Statistics and Data Analysis, 19, 461–473.
## 95%/99% 1-sided tolerance intervals for the Zipf, ## Zipf-Mandelbrot, and zeta distributions. set.seed(100) s <- 2 b <- 5 N <- 50 zipf.data <- rzipfman(n = 150, s = s, N = N) zipfman.data <- rzipfman(n = 150, s = s, b = b, N = N) zeta.data <- rzipfman(n = 150, s = s, N = Inf) out.zipf <- zipftol.int(zipf.data, dist = "Zipf") out.zipfman <- zipftol.int(zipfman.data, dist = "Zipf-Man") out.zeta <- zipftol.int(zeta.data, N = Inf, dist = "Zeta") out.zipf out.zipfman out.zeta
## 95%/99% 1-sided tolerance intervals for the Zipf, ## Zipf-Mandelbrot, and zeta distributions. set.seed(100) s <- 2 b <- 5 N <- 50 zipf.data <- rzipfman(n = 150, s = s, N = N) zipfman.data <- rzipfman(n = 150, s = s, b = b, N = N) zeta.data <- rzipfman(n = 150, s = s, N = Inf) out.zipf <- zipftol.int(zipf.data, dist = "Zipf") out.zipfman <- zipftol.int(zipfman.data, dist = "Zipf-Man") out.zeta <- zipftol.int(zeta.data, N = Inf, dist = "Zeta") out.zipf out.zipfman out.zeta
Performs maximum likelihood estimation for the parameters of the Zipf, Zipf-Mandelbrot, and zeta distributions.
zm.ll(x, N = NULL, s = 1, b = 1, dist = c("Zipf", "Zipf-Man", "Zeta"), ...)
zm.ll(x, N = NULL, s = 1, b = 1, dist = c("Zipf", "Zipf-Man", "Zeta"), ...)
x |
A vector of raw data or a table of counts which is distributed according to a Zipf, Zipf-Mandelbrot, or zeta distribution. Do not supply a vector of counts! |
N |
The number of categories when |
s |
The initial value to estimate the shape parameter, which is set to 1 by default. If a poor initial value is specified, then a |
b |
The initial value to estimate the second shape parameter when |
dist |
Options are |
... |
Additional arguments passed to the |
Zipf-Mandelbrot models are commonly used to model phenomena where the frequencies of categorical data are approximately inversely proportional to its rank in the frequency table.
See the help file for mle
to see how the output is structured.
This function may be updated in a future version of the package so as to allow greater flexibility with the inputs.
Mandelbrot, B. B. (1965), Information Theory and Psycholinguistics. In B. B. Wolman and E. Nagel, editors. Scientific Psychology, Basic Books.
Zipf, G. K. (1949), Human Behavior and the Principle of Least Effort, Hafner.
Zornig, P. and Altmann, G. (1995), Unified Representation of Zipf Distributions, Computational Statistics and Data Analysis, 19, 461–473.
## Maximum likelihood estimation for randomly generated data ## from the Zipf, Zipf-Mandelbrot, and zeta distributions. set.seed(100) s <- 2 b <- 5 N <- 50 zipf.data <- rzipfman(n = 500, s = s, N = N) out.zipf <- zm.ll(zipf.data, N = N, dist = "Zipf") stats4::coef(out.zipf) stats4::vcov(out.zipf) zipfman.data <- rzipfman(n = 500, s = s, b = b, N = N) out.zipfman <- zm.ll(zipfman.data, N = N, dist = "Zipf-Man") stats4::coef(out.zipfman) diag(stats4::vcov(out.zipfman)) zeta.data <- rzipfman(n = 200, s = s, N = Inf) out.zeta <- zm.ll(zeta.data, N = Inf, dist = "Zeta") stats4::coef(out.zeta) stats4::vcov(out.zeta)
## Maximum likelihood estimation for randomly generated data ## from the Zipf, Zipf-Mandelbrot, and zeta distributions. set.seed(100) s <- 2 b <- 5 N <- 50 zipf.data <- rzipfman(n = 500, s = s, N = N) out.zipf <- zm.ll(zipf.data, N = N, dist = "Zipf") stats4::coef(out.zipf) stats4::vcov(out.zipf) zipfman.data <- rzipfman(n = 500, s = s, b = b, N = N) out.zipfman <- zm.ll(zipfman.data, N = N, dist = "Zipf-Man") stats4::coef(out.zipfman) diag(stats4::vcov(out.zipfman)) zeta.data <- rzipfman(n = 200, s = s, N = Inf) out.zeta <- zm.ll(zeta.data, N = Inf, dist = "Zeta") stats4::coef(out.zeta) stats4::vcov(out.zeta)