Articles Random Number Generation and Sampling Methods by Peter Occil

emailx45 · Jun 24, 2020

[SHOWTOGROUPS=4,20]

Hyperexponential distribution: See Mixtures of Distributions.
Hypergeometric distribution: See Hypergeometric Distribution.
Hypoexponential distribution: See Transformations of Random Numbers.
Inverse chi-squared distribution†: df / (GammaDist(df * 0.5, 2)), where df is the number of degrees of freedom. The scale parameter (sigma) is usually 1.0 / df.
Inverse Gaussian distribution (Wald distribution): Generate n = mu + (mu*mu*y/(2*lamda)) - mu * sqrt(4 * mu * lamda * y + mu * mu * y * y) / (2 * lamda), where y = pow(Normal(0, 1), 2), then return n if RNDU01OneExc() <= mu / (mu + n), or mu * mu / n otherwise. mu is the mean and lamda is the scale; both parameters are greater than 0. Based on method published in (Devroye 1986)(14).
kth-order statistic distribution: BetaDist(k, n+1-k). Returns the kth smallest out of n uniform random numbers. See also (Devroye 1986, p. 210)(14).
Kumaraswamy distribution⬦: pow(1-pow(RNDU01ZeroExc(),1.0/b),1.0/a), where a and b are shape parameters.
Landau distribution: See stable distribution.
Lévy distribution†: 0.5 / GammaDist(0.5, 1). The scale parameter (sigma) is also called dispersion.
Logarithmic logistic distribution: See beta prime distribution.
Logarithmic series distribution: Generate n = NegativeBinomialInt(1, py - px, py)+1 (where px/py is a parameter in (0,1)), then return n if ZeroOrOne(1, n) == 1, or repeat this process otherwise (Flajolet et al., 2010)(9). If the application can trade accuracy for speed, the following can be used instead: floor(1.0 - Expo(log1p(-pow(1.0 - p, RNDU01ZeroOneExc())))), where p is the parameter in (0, 1); see (Devroye 1986)(14).
Logistic distribution†: (ln(x)-log1p(-x)) (logit function), where x is RNDU01ZeroOneExc().
Log-multinormal distribution: See Multivariate Normal (Multinormal) Distribution.
Max-of-uniform distribution: BetaDist(n, 1). Returns a number that simulates the largest out of n uniform random numbers. See also (Devroye 1986, p. 675)(14).
Maxwell distribution†: sqrt(GammaDist(1.5, 2)).
Min-of-uniform distribution: BetaDist(1, n). Returns a number that simulates the smallest out of n uniform random numbers. See also (Devroye 1986, p. 210)(14).
Moyal distribution: See the Python sample code.
Multinomial distribution: See Multinomial Distribution.
Multivariate Poisson distribution: See the Python sample code.
Multivariate t-copula: See the Python sample code.
Multivariate t-distribution: See the Python sample code.
Negative binomial distribution (NegativeBinomial(successes, p)): See Negative Binomial Distribution. The negative binomial distribution can take a successes value other than an integer; in that case, a negative binomial (successes, p) random number is Poisson(GammaDist(successes, (1 - p) / p)).
Negative multinomial distribution: See the Python sample code.
Noncentral beta distribution⬦: BetaDist(a + Poisson(nc), b), where nc (a noncentrality), a, and b are greater than 0.
Parabolic distribution⬦: BetaDist(2, 2) (Saucier 2000, p. 30).
Pascal distribution: NegativeBinomial(successes, p) + successes, where successes and p have the same meaning as in the negative binomial distribution, except successes is always an integer.
Pearson VI distribution: GammaDist(v, 1) / GammaDist(w, 1), where v and w are shape parameters greater than 0 (Saucier 2000, p. 33; there, an additional b parameter is defined, but that parameter is canceled out in the source code).
Piecewise constant distribution: See Weighted Choice With Replacement.
Piecewise linear distribution: See Continuous Weighted Choice.
Pólya–Aeppli distribution: See Transformations of Random Numbers: Additional Examples.
Power distribution: BetaDist(alpha, 1) / b, where alpha is the shape and b is the domain. Nominally in the interval (0, 1).
Power law distribution: pow(RNDRANGE(pow(mn,n+1),pow(mx,n+1)), 1.0 / (n+1)), where n is the exponent, mn is the minimum, and mx is the maximum. Reference.
Power lognormal distribution: See the Python sample code.
Power normal distribution: See the Python sample code.
Product copula: See Gaussian and Other Copulas.
Rice distribution: See Multivariate Normal (Multinormal) Distribution.
Rice–Norton distribution: See Multivariate Normal (Multinormal) Distribution.
Singh–Maddala distribution: See beta prime distribution.
Skellam distribution: Poisson(mean1) - Poisson(mean2), where mean1 and mean2 are the means of the two Poisson random numbers.
Skewed normal distribution: Normal(0, x) + mu + alpha * abs(Normal(0, x)), where x is sigma / sqrt(alpha * alpha + 1.0), mu and sigma have the same meaning as in the normal distribution, and alpha is a shape parameter.
Snedecor's (Fisher's) F-distribution: GammaDist(m * 0.5, n) / (GammaDist(n * 0.5 + Poisson(sms * 0.5)) * m, 1), where m and n are the numbers of degrees of freedom of two random numbers with a chi-squared distribution, and if sms is other than 0, one of those distributions is noncentral with sum of mean squares equal to sms.
Stable distribution: See Stable Distribution. Four-parameter stable distribution: Stable(alpha, beta) * sigma + mu, where mu is the mean and sigma is the scale; if alpha and beta are 1, the result is a Landau distribution. "Type 0" stable distribution: Stable(alpha, beta) * sigma + (mu - sigma * beta * x), where x is ln(sigma)*2.0/pi if alpha is 1, and tan(pi*0.5*alpha) otherwise.
Standard complex normal distribution: See Multivariate Normal (Multinormal) Distribution.
Suzuki distribution: See Rayleigh distribution.
Tukey lambda distribution: (pow(x, lamda)-pow(1.0-x,lamda))/lamda, where x is RNDU01() and lamda is a shape parameter.
Twin-t distribution (Baker and Jackson 2018)(76): Generate x, a random Student's t-distributed number (not a noncentral one). Accept x if RNDU01OneExc() < pow((1 + y) / ((1 + y * y) + y), (df + 1) * 0.5), where y = x * x / df and df is the degrees of freedom used to generate the number; repeat this process otherwise.
von Mises distribution: See von Mises Distribution.
Waring–Yule distribution: See beta negative binomial distribution.
Wigner (semicircle) distribution†: (BetaDist(1.5, 1.5)*2-1). The scale parameter (sigma) is the semicircular radius.
Yule–Simon distribution: See beta negative binomial distribution.
Zeta distribution: Generate n = floor(pow(RNDU01ZeroOneExc(), -1.0 / r)), and if d / pow(2, r) < (d - 1) * RNDU01OneExc() * n / (pow(2, r) - 1.0), where d = pow((1.0 / n) + 1, r), repeat this process. The parameter r is greater than 0. Based on method described in (Devroye 1986)(14). A zeta distribution truncated by rejecting random values greater than some positive integer is called a Zipf distribution or Estoup distribution. (Note that Devroye uses "Zipf distribution" to refer to the untruncated zeta distribution.)
Zipf distribution: See zeta distribution.

[/SHOWTOGROUPS]

emailx45 · Jun 24, 2020

[SHOWTOGROUPS=4,20]
Geometric Sampling
Requires random real numbers.

This section contains ways to choose independent uniform random points in or on geometric shapes.

Random Points Inside a Box
To generate a random point inside an N-dimensional box, generate RNDRANGEMaxExc(mn, mx) for each coordinate, where mn and mx are the lower and upper bounds for that coordinate. For example—

to generate a random point inside a rectangle bounded in [0, 2) along the X axis and [3, 6) along the Y axis, generate [RNDRANGEMaxExc(0,2), RNDRANGEMaxExc(3,6)], and
to generate a complex number with real and imaginary parts bounded in [0, 1], generate [RNDRANGE(0, 1), RNDRANGE(0, 1)].

Random Points Inside a Simplex
The following pseudocode generates a random point inside an n-dimensional simplex (simplest convex figure, such as a line segment, triangle, or tetrahedron). It takes one parameter, points, a list consisting of the n plus one vertices of the simplex, all of a single dimension n or greater. See also (Grimme 2015)(77), which shows MATLAB code for generating a random point uniformly inside a simplex just described, but in a different way.

Code:

METHOD RandomPointInSimplex(points):
   ret=NewList()
   if size(points) > size(points[0])+1: return error
   if size(points)==1 // Return a copy of the point
     for i in 0...size(points[0]): AddItem(ret,points[0][i])
     return ret
   end
   gammas=NewList()
   // Sample from a Dirichlet distribution
   for i in 0...size(points): AddItem(gammas, Expo(1))
   tsum=0
   for i in 0...size(gammas): tsum = tsum + gammas[i]
   tot = 0
   for i in 0...size(gammas) - 1
       gammas[i] = gammas[i] / tsum
       tot = tot + gammas[i]
   end
   tot = 1.0 - tot
   for i in 0...size(points[0]): AddItem(ret, points[0][i]*tot)
   for i in 1...size(points)
      for j in 0...size(points[0])
         ret[j]=ret[j]+points[i][j]*gammas[i-1]
      end
   end
   return ret
END METHOD

Random Points on the Surface of a Hypersphere
The following pseudocode shows how to generate a random N-dimensional point on the surface of an N-dimensional hypersphere, centered at the origin, of radius radius (if radius is 1, the result can also serve as a unit vector in N-dimensional space). Here, Norm is given in the appendix. See also (Weisstein)(78).

Code:

METHOD RandomPointInHypersphere(dims, radius)
  x=0
  while x==0
    ret=[]
    for i in 0...dims: AddItem(ret, Normal(0, 1))
    x=Norm(ret)
  end
  invnorm=radius/x
  for i in 0...dims: ret[i]=ret[i]*invnorm
  return ret
END METHOD

Note: The Python sample code contains an optimized method for points on the edge of a circle.
Example: To generate a random point on the surface of a cylinder running along the Z axis, generate random X and Y coordinates on the edge of a circle (2-dimensional hypersphere) and generate a random Z coordinate by RNDRANGE(mn, mx), where mn and mx are the highest and lowest Z coordinates possible.

Random Points Inside a Ball, Shell, or Cone
To generate a random N-dimensional point on or inside an N-dimensional ball, centered at the origin, of radius R, either—

generate a random (N+2)-dimensional point on the surface of an (N+2)-dimensional hypersphere with that radius (e.g., using RandomPointInHypersphere), then discard the last two coordinates (Voelker et al., 2017)(79), or
follow the pseudocode in RandomPointInHypersphere, except replace Norm(ret) with sqrt(S + Expo(1)), where S is the sum of squares of the numbers in ret.

To generate a random point on or inside an N-dimensional spherical shell (a hollow ball), centered at the origin, with inner radius A and outer radius B (where A is less than B), either—

generate a random point for a ball of radius B until the norm of that point is A or greater (see the appendix), or
generate a random point on the surface of an N-dimensional hypersphere with radius equal to pow(RNDRANGE(pow(A, N), pow(B, N)), 1.0 / N)(80).

To generate a random point on or inside a cone with height H and radius R at its base, running along the Z axis, generate a random Z coordinate by Z = max(max(RNDRANGE(0, H), RNDRANGE(0, H)), RNDRANGE(0, H)), then generate random X and Y coordinates inside a disc (2-dimensional ball) with radius equal to max(RNDRANGE(0,R*Z/H), RNDRANGE(0,R*Z/H))(81).

Example: To generate a random point inside a cylinder running along the Z axis, generate random X and Y coordinates inside a disc (2-dimensional ball) and generate a random Z coordinate by RNDRANGE(mn, mx), where mn and mx are the highest and lowest Z coordinates possible.
Note: The Python sample code contains a method for generating a random point on the surface of an ellipsoid modeling the Earth.

Random Latitude and Longitude
To generate a random point on the surface of a sphere in the form of a latitude and longitude (in radians with west and south coordinates negative)(82)—

generate the longitude RNDRANGEMaxExc(-pi, pi), where the longitude is in the interval [-π, π), and
generate the latitude atan2(sqrt(1 - x * x), x) - pi / 2, where x = RNDRANGE(-1, 1) and the latitude is in the interval [-π/2, π/2] (the interval excludes the poles, which have many equivalent forms; if poles are not desired, generate x until neither -1 nor 1 is generated this way).

[/SHOWTOGROUPS]

emailx45 · Jun 24, 2020

[SHOWTOGROUPS=4,20]
Acknowledgments
I acknowledge the commenters to the CodeProject version of this page, including George Swan, who referred me to the reservoir sampling method.

I also acknowledge Christoph Conrads, who gave suggestions in parts of this article.

Notes
(1) For the definition of an RNG, it is irrelevant—

how hard it is to predict the numbers the item produces,
how well the item passes statistical randomness tests,
whether the item is initialized automatically or not,
whether the item uses only its input and its state to produce numbers, or
whether the item extracts random bits from one or more noise sources.

If the generator produces numbers with unequal probabilities, but is otherwise an RNG as defined here, then randomness extraction (which is outside the scope of this document) can make it produce numbers with closer to equal probabilities.

(2) Pedersen, K., "Reconditioning your quantile function", arXiv:1704.07949v3 [stat.CO], 2018.

(3) For an exercise solved by the RNDINT pseudocode, see A. Koenig and B. E. Moo, Accelerated C++, 2000; see also a blog post by Johnny Chan. In addition, M. O'Neill discusses various methods, both biased and unbiased, for generating random integers in a range with an RNG in a blog post from July 2018. Finally, a post in the Math Forum ("Probability and Random Numbers", Feb. 29, 2004) and Mennucci, A.C.G., "Bit Recycling for Scaling Random Number Generators", arXiv:1012.4290 [cs.IT], 2018, independently show a method for batching and recycling random bits to produce random integers in a range.

(4) D. Lemire, "A fast alternative to the modulo reduction", Daniel Lemire's blog, 2016.

(5) Lumbroso, J., "Optimal Discrete Uniform Generation from Coin Flips, and Applications", arXiv:1304.1916 [cs.DS].

(6) A naïve RNDINTEXC implementation often seen in certain languages like JavaScript is the idiom floor(Math.random() * maxExclusive), where Math.random() is any method that outputs an independent uniform random number in the interval [0, 1). However, no implementation of Math.random() can choose from all real numbers in [0, 1), so this idiom can bias some results over others depending on the value of maxExclusive. For example, if Math.random() is implemented as RNDINT(X - 1)/X and X is not divisible by maxExclusive, the result will be biased. Also, an implementation might pre-round Math.random() * maxExclusive (before the floor) to the closest number it can represent; in rare cases, that might be maxExclusive for certain rounding modes. If an application is concerned about these issues, it should treat the Math.random() implementation as the underlying RNG for RNDINT and implement RNDINTEXC through RNDINT instead.

(7) Canonne, C., Kamath, G., Steinke, T., "The Discrete Gaussian for Differential Privacy", arXiv:2004.00010v2 [cs.DS], 2020.

(8) Keane, M. S., and O'Brien, G. L., "A Bernoulli factory", ACM Transactions on Modeling and Computer Simulation 4(2), 1994.

(9) Flajolet, P., Pelletier, M., Soria, M., "On Buffon machines and numbers", arXiv:0906.5560v2 [math.PR], 2010.

(10) Jeff Atwood, "The danger of naïveté", Dec. 7, 2007.

(11) Bacher, A., Bodini, O., et al., "MergeShuffle: A Very Fast, Parallel Random Permutation Algorithm", arXiv:1508.03167 [cs.DS], 2015.

(12) Merlini, D., Sprugnoli, R., Verri, M.C., "An Analysis of a Simple Algorithm for Random Derangements", 2007.

(13) If the strings identify database records, file system paths, or other shared resources, special considerations apply, including the need to synchronize access to those resources. For uniquely identifying database records, alternatives to random strings include auto-incrementing or sequentially assigned row numbers. The choice of underlying RNG is important when it comes to unique random strings; see my RNG recommendation document.

(14) Devroye, L., Non-Uniform Random Variate Generation, 1986.

(15) See also the Stack Overflow question "Random index of a non zero value in a numpy array".

(16) S. Linderman, "A Parallel Gamma Sampling Implementation", Laboratory for Independent Probabilistic Systems Blog, Feb. 21, 2013, illustrates one example, a GPU-implemented sampler of gamma-distributed random numbers.

(17) Brownlee, J. "A Gentle Introduction to the Bootstrap Method", Machine Learning Mastery, May 25, 2018.

(18) Propp, J.G., Wilson, D.B., "Exact sampling with coupled Markov chains and applications to statistical mechanics", 1996.

(19) E. N. Gilbert, "Random Graphs", Annals of Mathematical Statistics 30(4), 1959.

(20) V. Batagelj and U. Brandes, "Efficient generation of large random networks", Phys.Rev. E 71:036113, 2005.

(21) Penschuck, M., et al., "Recent Advances in Scalable Network Generation", arXiv:2003.00736v1 [cs.DS], 2020.

(22) Jon Louis Bentley and James B. Saxe, "Generating Sorted Lists of Random Numbers", ACM Trans. Math. Softw. 6 (1980), pp. 359-364, describes a way to generate random numbers in sorted order, but it's not given here because it relies on generating real numbers in the interval [0, 1], which is inherently imperfect because computers can't choose among all random numbers between 0 and 1, and there are infinitely many of them.

(23) Efraimidis, P. and Spirakis, P. "Weighted Random Sampling (2005; Efraimidis, Spirakis)", 2005.

(24) Efraimidis, P. "Weighted Random Sampling over Data Streams", arXiv:1012.0256v2 [cs.DS], 2015.

(25) T. Vieira, "Gumbel-max trick and weighted reservoir sampling", 2014.

(26) T. Vieira, "Faster reservoir sampling by waiting", 2019.

(27) Duffield, N., Lund, C., Thorup, M., "Priority sampling for estimation of arbitrary subset sums", October 2007.

(28) The Python sample code includes a ConvexPolygonSampler class that implements this kind of sampling for convex polygons; unlike other polygons, convex polygons are trivial to decompose into triangles.

(29) That article also mentions a critical-hit distribution, which is actually a mixture of two distributions: one roll of dice and the sum of two rolls of dice.

(30) An affine transformation is one that keeps straight lines straight and parallel lines parallel.

(31) Farach-Colton, M. and Tsai, M.T., 2015. Exact sublinear binomial sampling. Algorithmica 73(4), pp. 637-651.

(32) von Neumann, J., "Various techniques used in connection with random digits", 1951.

(33) Karney, C.F.F., "Sampling exactly from the normal distribution", arXiv:1303.6257v2 [physics.comp-ph], 2014.

(34) Smith and Tromble, "Sampling Uniformly from the Unit Simplex", 2004.

(35) Durfee, et al., "l1 Regression using Lewis Weights Preconditioning and Stochastic Gradient Descent", Proceedings of Machine Learning Research 75(1), 2018.

(36) The NVIDIA white paper "Floating Point and IEEE 754 Compliance for NVIDIA GPUs", and "Floating-Point Determinism" by Bruce Dawson, discuss issues with non-integer numbers in much more detail.

(37) Note that RNDU01OneExc() corresponds to Math.random() in Java and JavaScript.

(38) This includes integers if e is limited to 0, and fixed-point numbers if e is limited to a single exponent less than 0.

(39) Downey, A. B. "Generating Pseudo-random Floating Point Values", 2007.

(40) Ideally, X is the highest integer p such that all multiples of 1/p in the interval [0, 1] are representable in the number format in question. For example, X is 2^53 (9007199254740992) for binary64, and 2^24 (16777216) for binary32.

(41) Goualard, F., "Generating Random Floating-Point Numbers by Dividing Integers: a Case Study", 2020.

(42) Monahan, J.F., "Accuracy in Random Number Generation", Mathematics of Computation 45(172), 1985.

(43) Spall, J.C., "An Overview of the Simultaneous Perturbation Method for Efficient Optimization", Johns Hopkins APL Technical Digest 19(4), 1998, pp. 482-492.

(44) P. L'Ecuyer, "Tables of Linear Congruential Generators of Different Sizes and Good Lattice Structure", Mathematics of Computation 68(225), January 1999, with errata.

(45) Harase, S., "A table of short-period Tausworthe generators for Markov chain quasi-Monte Carlo", arXiv:2002.09006 [math.NA], 2020.

(46) Brassard, G., Devroye, L., Gravel, C., "Exact Classical Simulation of the Quantum-Mechanical GHZ Distribution", IEEE Transactions on Information Theory 62(2), February 2016. Note that that paper defines a Bernoulli trial as 0 for success and 1 for failure, rather than the other way around, as in this document.

[/SHOWTOGROUPS]

emailx45 · Jun 24, 2020

[SHOWTOGROUPS=4,20]

(47) D. Revuz, M. Yor, "Continuous Martingales and Brownian Motion", 1999.

(48) Alexopoulos, Goldsman, "Random Variate Generation", 2017.

(49) Saucier, R. "Computer Generation of Statistical Distributions", March 2000.

(50) Other references on density estimation include a Wikipedia article on multiple-variable kernel density estimation, and a blog post by M. Kay.

(51) "Jitter", as used in this step, follows a distribution formally called a kernel, of which the normal distribution is one example. Bandwidth should be set so that the estimated distribution fits the data and remains smooth. A more complex kind of "jitter" (for multi-component data points) consists of a point generated from a multinormal distribution with all the means equal to 0 and a covariance matrix that, in this context, serves as a bandwidth matrix. "Jitter" and bandwidth are not further discussed in this document.

(52) More formally—

the PDF is the derivative (instantaneous rate of change) of the distribution's CDF (that is, PDF(x) = CDF′(x)), and
the CDF is also defined as the integral ("area under the curve") of the PDF,

provided the PDF's values are all 0 or greater and the area under the PDF's curve is 1.

(53) A discrete distribution is a distribution that associates one or more items with a separate probability. This page assumes (without loss of generality) that these items are integers. A discrete distribution can produce non-integer values (e.g., x/y with probability x/(1+y)) as long as the values can be converted to and from integers. Two examples:

- A rational number in lowest terms can be converted to an integer by interleaving the bits of the numerator and denominator.
Integer-quantized numbers (popular in "deep-learning" neural networks) take a relatively small number of bits (usually 8 bits or even smaller). An 8-bit quantized number format is effectively a "look-up table" that maps 256 integers to real numbers.

(54) If those values are inexact floating-point numbers, their relative error will generally be smaller the closer they are to 0 (see also Walter, 2019).

(55) This includes integers if FPExponent is limited to 0, and fixed-point numbers if FPExponent is limited to a single exponent less than 0.

(56) Based on a suggestion by F. Saad in a personal communication (Mar. 26, 2020).

(57) Saad, F.A., et al., "Optimal Approximate Sampling from Discrete Probability Distributions", arXiv:2001.04555 [cs.DS], 2020. See also the associated source code.

(58) Devroye, L., Gravel, C., "Sampling with arbitrary precision", arXiv:1502.02539v5 [cs.IT]

(59) Bringmann, K., and Friedrich, T., 2013, July. Exact and efficient generation of geometric random variates and random graphs, in International Colloquium on Automata, Languages, and Programming (pp. 267-278).

(60) Walter, M., "Sampling the Integers with Low Relative Error", in International Conference on Cryptology in Africa, Jul. 2019, pp. 157-180.

(61) Devroye, L., "Non-Uniform Random Variate Generation". In Handbooks in Operations Research and Management Science: Simulation, Henderson, S.G., Nelson, B.L. (eds.), 2006, p.83.

(62) Nguyen, N. and Ökten, G., "The acceptance-rejection method for low-discrepancy sequences", arXiv:1403.5599v1 [q-fin.CP], 2014; also in Monte Carlo Methods and Applications 22(2), pp. 133-148, 2016.

(63) Tran, K.H., "A Common Derivation for Markov Chain Monte Carlo Algorithms with Tractable and Intractable Targets", arXiv:1607.01985v5 [stat.CO], 2018, gives a common framework for describing many MCMC algorithms, including Metropolis–Hastings, slice sampling, and Gibbs sampling.

(64) The KVectorSampler class uses algorithms described in Arnas, D., Leake, C., Mortari, D., "Random Sampling using k-vector", Computing in Science & Engineering 21(1) pp. 94-107, 2019, and Mortari, D., Neta, B., "k-Vector Range Searching Techniques".

(65) Gerhard Derflinger, Wolfgang Hörmann, and Josef Leydold, "Random variate generation by numerical inversion when only the density is known", ACM Transactions on Modeling and Computer Simulation 20(4) article 18, October 2010.

(66) Devroye, L., Gravel, C., "The expected bit complexity of the von Neumann rejection algorithm", arXiv:1511.02273v2 [cs.IT], 2016/2018

(67) Sainudiin, Raazesh, and Thomas L. York. "An Auto-Validating, Trans-Dimensional, Universal Rejection Sampler for Locally Lipschitz Arithmetical Expressions," Reliable Computing 18 (2013): 15-54.

(68) Devroye, L., 1996, December, "Random variate generation in one line of code" In Proceedings Winter Simulation Conference (pp. 265-272). IEEE.

(69) McGrath, E.J., Irving, D.C., "Techniques for Efficient Monte Carlo Simulation, Volume II", Oak Ridge National Laboratory, April 1975.

(70) Devroye, L., "Expected Time Analysis of a Simple Recursive Poisson Random Variate Generator", Computing 46, pp. 165-173, 1991.

(71) Giammatteo, P., and Di Mascio, T., "Wilson-Hilferty-type approximation for Poisson Random Variable", Advances in Science, Technology and Engineering Systems Journal 5(2), 2020.

(72) Bailey, R.W., "Polar generation of random variates with the t distribution", Mathematics of Computation 62 (1984).

(73) Stein, W.E. and Keblis, M.F., "A new method to simulate the triangular distribution", Mathematical and Computer Modelling 49(5-6), 2009, pp.1143-1147.

(74) Saha, M., et al., "The extended xgamma distribution", arXiv:1909.01103 [math.ST], 2019.

(75) Sen, S., et al., "The xgamma distribution: statistical properties and application", 2016.

(76) Baker, R., Jackson, D., "A new distribution for robust least squares", arXiv:1408.3237 [stat.ME], 2018.

(77) Grimme, C., "Picking a Uniformly Random Point from an Arbitrary Simplex", 2015.

(78) Weisstein, Eric W. "Hypersphere Point Picking". From MathWorld—A Wolfram Web Resource.

(79) Voelker, A.R., Gosmann, J., Stewart, T.C., "Efficiently sampling vectors and coordinates from the n-sphere and n-ball", Jan. 4, 2017.

(80) See the Mathematics Stack Exchange question titled "Random multivariate in hyperannulus", questions/1885630.

(81) See the Stack Overflow question "Uniform sampling (by volume) within a cone", questions/41749411. Square and cube roots replaced with maximums.

(82) Reference: "Sphere Point Picking" in MathWorld (replacing inverse cosine with atan2 equivalent).

(83) Describing differences between SQL dialects is outside the scope of this document, but Flourish SQL describes many such differences, including those concerning RNGs.

(84) Mironov, I., "On Significance of the Least Significant Bits For Differential Privacy", 2012.

(85) For example, see Balcer, V., Vadhan, S., "Differential Privacy on Finite Computers", Dec. 4, 2018; as well as Micciancio, D. and Walter, M., "Gaussian sampling over the integers: Efficient, generic, constant-time", in Annual International Cryptology Conference, August 2017 (pp. 455-485).

(86) Saad, F.A., Freer C.E., et al. "The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions", in AISTATS 2020: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research 108, Palermo, Sicily, Italy, 2020.

(87) Klundert, B. van de, "Efficient Generation of Discrete Random Variates", Master thesis, Universiteit Utrecht, 2019.

(88) Kschischang, Frank R. "A Trapezoid-Ziggurat Algorithm for Generating Gaussian Pseudorandom Variates." (2019).
[/SHOWTOGROUPS]

emailx45 · Jun 24, 2020

[SHOWTOGROUPS=4,20]

Appendix

Mean and Variance Calculation
The following method calculates the mean and the bias-corrected sample variance of a list of real numbers, using the Welford method presented by J. D. Cook. The method returns a two-item list containing the mean and that kind of variance in that order. (Sample variance is the estimated variance of a population or distribution assuming list is a random sample of that population or distribution.) The square root of the variance calculated here is what many APIs call a standard deviation (e.g. Python's statistics.stdev).

Code:

METHOD MeanAndVariance(list)
if size(list)==0: return [0, 0]
if size(list)==1: return [list[0], 0]
xm=list[0]
xs=0
i=1
while i < size(list)
c = list[i]
i = i + 1
cxm = (c - xm)
xm = xm + cxm *1.0/ i
xs = xs + cxm * (c - xm)
end
return [xm, xs*1.0/(size(list)-1)]
END METHOD

Note: The population variance (or biased sample variance) is found by dividing by size(list) rather than (size(list)-1), and the standard deviation of the population or a sample of it is the square root of that variance.

Norm Calculation
The following method calculates the norm of a vector (list of numbers).

Code:

METHOD Norm(vec)
ret=0
rc=0
for i in 0...size(vec)
rc=vec[i]*vec[i]-rc
rt=rc+ret
rc=(rt-ret)-rc
ret=rt
end
return sqrt(ret)
END METHOD

License
This article, along with any associated source code and files, is licensed under A Public Domain dedication

[/SHOWTOGROUPS]

Articles Random Number Generation and Sampling Methods by Peter Occil

emailx45

Бывалый

emailx45

Бывалый

emailx45

Бывалый

emailx45

Бывалый

emailx45

Бывалый