Transform pdqr-functions with form_*() and base operations

The idea behind form functions is to take one or more pdqr-functions and return a transformed pdqr-function. The method argument selects a function-specific computation algorithm.

Transformation of pdqr-functions can be done with form_*() family or with base R operators (implemented with S3 group generic functions).

form_*() family

There are several form_*() functions. Here are some important examples; for more information, see the package documentation.


form_trans()

form_trans() performs a transformation of input pdqr-function(s), which are assumed to represent independent distributions. The default method is “random”, which works by random simulation: it generates sample(s) from the input pdqr-function(s), calls the transformation function on them, and creates the output pdqr-function from the transformation output.

The other method is “bruteforce”, which converts input function(s) to type “discrete”, applies the transformation function to all possible combinations of their “x” values, creates a pdqr-function from the output, and possibly converts it back to “continuous” type. This method is very time consuming and is likely to be useful only for “discrete” functions with few combinations of “x” values.
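The two methods can be sketched as follows. This is a minimal illustration, not the package's reference example: the inputs d_norm and d_unif are assumed to be the standard normal and the uniform distribution on [0, 1], and trans_fun and d_dice are names made up for this sketch.

```r
library(pdqr)
set.seed(101)

# Assumed inputs: standard normal and uniform on [0, 1]
d_norm <- as_d(dnorm)
d_unif <- as_d(dunif)

# Illustrative transformation function; it should be vectorized
trans_fun <- function(x, y) {
  sin(x * y)
}

# Default method "random": simulate from inputs, transform, estimate output.
# Support may extend slightly beyond [-1, 1] because of density estimation.
norm_unif <- form_trans(list(d_norm, d_unif), trans = trans_fun)
meta_support(norm_unif)

# Method "bruteforce" on small "discrete" inputs: sum of two fair dice,
# computed over all 36 combinations of "x" values
d_dice <- new_d(1:6, type = "discrete")
dice_sum <- form_trans(list(d_dice, d_dice), trans = `+`, method = "bruteforce")
dice_sum(7)
```

The “bruteforce” call is cheap here only because each input has six “x” values; the number of combinations grows multiplicatively with every extra input.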

form_resupport() and form_tails()

form_resupport() transforms a distribution to have support defined by one or both edges. This is useful for dealing with the “extending property” of density() when value boundaries are known (the default method “reflect” suits this case best).
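A minimal sketch of this use case, assuming a hypothetical sample x that is known to lie in [0, 1]:

```r
library(pdqr)
set.seed(102)

# Hypothetical sample with known value boundaries 0 and 1
x <- runif(1000)

# A density()-based estimate "extends" the support beyond [0, 1]
d_x <- new_d(x, type = "continuous")
meta_support(d_x)

# Default method "reflect" returns a distribution supported on the known range
d_x_reflect <- form_resupport(d_x, support = c(0, 1))
meta_support(d_x_reflect)

# Only one edge can be predefined: NA keeps the current edge
d_x_left <- form_resupport(d_x, support = c(0, NA))
meta_support(d_x_left)
```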

form_tails() also modifies the support of the input distribution, by removing its tail(s). The amount removed is defined by level, the total probability of a tail. This function is useful for computing robust versions of distributions.
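As a rough illustration, the following sketch trims a standard normal distribution (an input chosen for this example). It assumes the default settings of form_tails() remove both tails.

```r
library(pdqr)

# Assumed input: standard normal distribution
d_norm <- as_d(dnorm)

# Remove tails with total probability 0.05 each (both tails are assumed
# to be the default direction)
d_trimmed <- form_tails(d_norm, level = 0.05)
meta_support(d_trimmed)

# Remove only the right tail
d_right <- form_tails(d_norm, level = 0.05, direction = "right")
meta_support(d_right)
```

The edges of the trimmed support should be close to the corresponding quantiles of the input, here qnorm(0.05) and qnorm(0.95).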

form_recenter() and form_respread()

In hypothesis testing there is often a need to alter an existing distribution to have a certain center and/or spread. Functions form_recenter() and form_respread() implement linear transformations that accomplish this goal.
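A minimal sketch, assuming the default center is the mean and the default spread is the standard deviation; the input distribution and target values are chosen only for illustration:

```r
library(pdqr)

# Assumed input: uniform distribution on [0, 1]
d_unif <- as_d(dunif)

# Shift the distribution so that its center becomes 10
d_centered <- form_recenter(d_unif, to = 10)
summ_mean(d_centered)

# Rescale it around the center so that its spread becomes 2
d_spread <- form_respread(d_centered, to = 2)
c(summ_mean(d_spread), summ_sd(d_spread))
```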


form_mix()

form_mix() produces a mixture of distributions. Inputs can be pdqr-functions of both types. The output is “discrete” only if all inputs have “discrete” type. If at least one input pdqr-function has “continuous” type, all “discrete” inputs are converted to “continuous” type (with form_retype() and method “dirac”), so that they enter the mixture as dirac-like “continuous” functions. Note that the output can have many piecewise-linear intervals; to reduce their number (with possible loss of accuracy), use form_regrid().
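The type rules can be checked with a short sketch. The inputs and weights below are illustrative choices, not taken from the package documentation.

```r
library(pdqr)

# Illustrative inputs
d_norm <- as_d(dnorm)
d_dis_1 <- new_d(c(-1, 0, 1), type = "discrete")
d_dis_2 <- new_d(c(10, 20), type = "discrete")

# All inputs are "discrete", so the mixture is "discrete"
mix_dis <- form_mix(list(d_dis_1, d_dis_2))
meta_type(mix_dis)

# One "continuous" input makes the mixture "continuous"
mix_con <- form_mix(list(d_norm, d_dis_1), weights = c(0.7, 0.3))
meta_type(mix_con)

# Reduce the number of piecewise-linear intervals (possibly losing accuracy)
mix_small <- form_regrid(mix_con, n_grid = 50)
c(nrow(meta_x_tbl(mix_con)), nrow(meta_x_tbl(mix_small)))
```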


form_estimate()

form_estimate() computes the distribution of a sample estimate. For example, what is the distribution of the mean of 20 sample elements from a uniform distribution? To answer this question, form_estimate() uses simulation. One random element of the target distribution is obtained by generating 20 elements from the input distribution and computing the desired statistic. This is repeated many times, and one of the new_*() functions is then called on the resulting sample from the target distribution. Note that this algorithm is usually quite time consuming.
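The question about the mean of 20 uniform elements can be sketched as follows, assuming the uniform distribution on [0, 1] as input. Results vary between runs because the computation is simulation based; the seed is set only for reproducibility.

```r
library(pdqr)
set.seed(103)

# Assumed input: uniform distribution on [0, 1]
d_unif <- as_d(dunif)

# Distribution of the mean of 20 sample elements
d_mean <- form_estimate(d_unif, stat = mean, sample_size = 20)

# Center should be near 0.5 and spread near sqrt(1 / (12 * 20))
summ_mean(d_mean)
summ_sd(d_mean)
```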

Base operations

Almost all basic R operations (implemented with S3 group generic functions) have methods for pdqr-functions. Operations are done as if applied to independent random variables with distributions represented by the input pdqr-function(s). Many of the methods are random in nature and are implemented with form_trans(), but include small tweaks that make using them directly preferable to calling form_trans().

Methods for Math are mostly implemented with simulation:

# Exponent of uniform distribution
exp(d_unif)
#> Density function of continuous type
#> Support: ~[1, 2.71828] (517 intervals)

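Methods for Ops may be divided into two parts: arithmetic operators (for example, *), whose result is the distribution of the computed value, and comparison operators (for example, >), whose result is a boolean pdqr-function.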

# Distribution of the transformation function used in the `form_trans()`
# section. Note the correct support [-1, 1] without the effect of the
# "extending property" of `density()`. Here the default method of
# `form_resupport()` is used.
sin(d_norm * d_unif)
#> Density function of continuous type
#> Support: [-1, 1] (517 intervals)

# Comparing random variables results in a boolean random variable represented
# by a boolean pdqr-function.
# Here it means that a random value of `d_norm` will be greater than a random
# value of `d_unif` with probability around 0.316.
d_norm > d_unif
#> Probability mass function of discrete type
#> Support: [0, 1] (2 elements, probability of 1: ~0.31563)

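Methods for Summary may also be divided into conceptually the same two parts as in Ops: numeric summaries such as max() return the distribution of the computed value, while logical summaries such as all() return a boolean pdqr-function.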

Function range() doesn’t make sense in this setup because it returns two numbers instead of one.

# Distribution of maximum of three random variables
max(d_norm, d_norm, d_norm)
#> Density function of continuous type
#> Support: ~[-2.34473, 4.46546] (511 intervals)

# Probability that all inequalities are true
summ_prob_true(all(d_norm > d_unif, d_norm > 2*d_unif))
#> [1] 0.06161823