Generating random numbers with a skewed distribution with many possible values

Writer Sophia Terry
$\begingroup$

Possible duplicate, but not sure: Generating random numbers with skewed distribution

So, I would like to generate integers between $x$ and $y$ with a skewed distribution of $n$. What is the best way to do this?

I believe it is different from the link above, because that question only chooses between two numbers, $1$ and $46$, whereas I want a larger range of values (e.g., the integers from 1 to 100), and I don't know how to do that.

$\endgroup$

1 Answer

$\begingroup$

Just to get you started:

Let $X$ have density function $f(x) = 2(1-x),$ for $0 < x < 1.$ That is a right-skewed continuous distribution taking values in $(0,1).$ The distribution of $X$ is called $\mathsf{Beta}(1,2);$ you can look at the Wikipedia article to see the general form of the family of distributions.

Then $Y = 100X$ has a continuous distribution taking values in $(0,100).$ You can use the ceiling function, $\lceil Y \rceil,$ to get a discrete distribution taking integer values $1, 2, \dots, 100.$

How to generate values of $X:$ The CDF of $X$ is $F(x) = 1 - (1-x)^2,$ for $0 < x < 1.$ Suppose you have a pseudorandom generator that essentially gives you random observations from $U \sim \mathsf{Unif}(0,1).$ Then you can set $U = F(X)$ and solve for $X$ in terms of $U$ to get $X = F^{-1}(U) = 1 - \sqrt{1-U}.$ Because $1-U$ is also $\mathsf{Unif}(0,1),$ the simpler formula $X = 1 - \sqrt{U}$ gives the same distribution.

In R statistical software, the function runif(10^6) generates a vector of a million observations that are essentially sampled from $\mathsf{Unif}(0,1),$ so the program below generates a million pseudo-realizations of $X$ and plots their histogram. [In R this could be done more easily by using the built-in function for generating beta random variables: rbeta(m, 1, 2).]

m = 10^6; u = runif(m)
x = 1 - sqrt(u)
par(mfrow=c(1,2)) # enables 2-panel plots
hist(u, prob=T, col="skyblue2", main="Realizations of UNIF(0,1)")
curve(dunif(x), -.2, 1.2, col="blue", lwd=2, n=1001, add=T)
hist(x, prob=T, col="skyblue2", main="Realizations of BETA(1,2)")
curve(dbeta(x,1,2), -.2, 1.2, col="blue", lwd=2, n=1001, add=T)
par(mfrow=c(1,1)) # returns to single panel plots
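
If you prefer a numerical check to a histogram, here is a minimal sketch that compares sample quantiles of $1 - \sqrt{U}$ with the exact $\mathsf{Beta}(1,2)$ quantiles from qbeta. The seed, the sample size, and the names p, emp, theo and max_gap are my own choices for illustration, not part of the method.

```r
set.seed(2024)                  # arbitrary seed, only for reproducibility
m <- 10^5
u <- runif(m)
x <- 1 - sqrt(u)                # inverse-transform draws, as described above

# Compare sample quantiles with the exact Beta(1,2) quantiles.
p    <- seq(0.1, 0.9, by = 0.1)
emp  <- quantile(x, probs = p, names = FALSE)
theo <- qbeta(p, 1, 2)          # closed form: 1 - sqrt(1 - p)

max_gap <- max(abs(emp - theo)) # largest discrepancy over the nine quantiles
print(round(cbind(p, emp, theo), 3))
```

With a sample this large, the two quantile columns should agree to roughly two decimal places.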


Then a discrete distribution can be obtained by taking ceilings, as mentioned above. For a graph that will display clearly at the resolution available here, I have used values 1 through 20 (instead of 1 through 100).

k = 20
y = ceiling(k*x)
cutp=seq(0, k, by=.1)
hist(y, prob=T, br=cutp, main="Right-Skewed Discrete Distribution")
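
To cover an arbitrary integer range, as asked in the question, the same ceiling trick can be wrapped in a small function. This is only a sketch: the function name rskewint and its arguments lo, hi, shape1 and shape2 are hypothetical names of mine, and the clamping step is a precaution for the (practically negligible) case that rbeta returns exactly 0.

```r
# Hypothetical helper (not a built-in R function): draw n integers in lo..hi
# from a discretized Beta(shape1, shape2) distribution.
rskewint <- function(n, lo, hi, shape1 = 1, shape2 = 2) {
  stopifnot(lo <= hi, lo == round(lo), hi == round(hi))
  k <- hi - lo + 1                  # number of possible integer values
  x <- rbeta(n, shape1, shape2)     # continuous draws in (0, 1)
  j <- ceiling(k * x)               # integers 1..k (0 only if x is exactly 0)
  lo - 1 + pmin(pmax(j, 1), k)      # clamp edge cases, then shift to lo..hi
}

set.seed(1)                         # arbitrary seed, only for reproducibility
y <- rskewint(10^5, lo = 1, hi = 100)
range(y)                            # smallest and largest values drawn
mean(y)                             # right skew pulls the mean below the midpoint 50.5
```

Increasing shape2 (with shape1 = 1) concentrates more of the probability near lo.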


If you want a more extreme skew, you can use a different member of the beta family of distributions. [But then the relationship to uniform random variables is not so transparent.]

m = 10^6; x = rbeta(m, 1, 4); y = ceiling(20*x); cutp=seq(0, 20, by=.1)
hist(y, prob=T, br=cutp, main="Right-Skewed Discrete Distribution")
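
For any member of the beta family, the exact probabilities of the discretized values follow from the CDF: $P(Y = j) = F(j/k) - F((j-1)/k),$ for $j = 1, \dots, k.$ The sketch below computes them with pbeta for $\mathsf{Beta}(1,4)$ and $k = 20,$ and compares them with simulated frequencies. The seed, the sample size, and the names pmf, emp and max_gap are my own choices for illustration.

```r
k <- 20
j <- 1:k

# Exact probabilities: differences of the Beta(1,4) CDF at the cut points j/k.
pmf <- pbeta(j / k, 1, 4) - pbeta((j - 1) / k, 1, 4)

set.seed(7)                          # arbitrary seed, only for reproducibility
y   <- ceiling(k * rbeta(10^5, 1, 4))
emp <- as.numeric(table(factor(y, levels = j))) / length(y)

max_gap <- max(abs(emp - pmf))       # largest gap between simulation and theory
print(round(cbind(j, pmf, emp)[1:5, ], 4))
```

Having the exact probabilities also lets you sample directly with sample(j, size, replace = TRUE, prob = pmf) if you would rather not go through rbeta.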


$\endgroup$
