CDF (cumulative distribution function)

The Cumulative Distribution Function (CDF), sometimes loosely called the cumulative density function, is an important concept in probability theory and statistics. It maps each possible value of a random variable to the probability that the variable takes on a value less than or equal to that value. The CDF fully describes the probability distribution of a random variable and is a fundamental tool in many areas of mathematics and statistics. In this article, we will explain what the CDF is, how it is used, and some of its important properties.

Definition of CDF

The CDF of a random variable X is defined as:

F(x) = P(X ≤ x)

where P(X ≤ x) is the probability that X is less than or equal to x. In other words, the CDF gives the probability that a random variable takes on a value less than or equal to a particular value. The CDF is a non-decreasing function, which means that as x increases, the value of F(x) can only increase or remain constant.

The CDF is defined for both continuous and discrete random variables. For a continuous random variable, the CDF is a continuous function, while for a discrete random variable, the CDF is a step function. In the case of a discrete random variable, the CDF has jumps at the values of the random variable.
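As an illustration of the step-function case, here is a minimal Python sketch (the function name `die_cdf` is hypothetical) of the CDF of a fair six-sided die, which jumps by 1/6 at each of the values 1 through 6 and is constant in between:

```python
import math

def die_cdf(x: float) -> float:
    """CDF of a fair six-sided die: F(x) = floor(x)/6, clamped to [0, 1]."""
    if x < 1:
        return 0.0
    if x >= 6:
        return 1.0
    return math.floor(x) / 6

print(die_cdf(0.5))  # 0.0  (below the support)
print(die_cdf(3))    # 0.5  (P(X <= 3) = 3/6)
print(die_cdf(3.7))  # 0.5  (constant between jumps)
print(die_cdf(6))    # 1.0  (all the mass lies at or below 6)
```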

Properties of CDF

The CDF has several important properties:

  1. Bounded: The CDF always lies between 0 and 1, that is, 0 ≤ F(x) ≤ 1 for all x.
  2. Non-decreasing: The CDF is a non-decreasing function, which means that as x increases, the value of F(x) can only increase or remain constant.
  3. Right-continuous: The CDF is a right-continuous function, which means that the limit of F(x) as x approaches a value a from the right is equal to F(a). Jumps can still occur from the left: for a discrete random variable, F jumps at each value the variable can take.
  4. Limits: The CDF has limits of 0 and 1, that is, as x approaches negative infinity, F(x) approaches 0, and as x approaches positive infinity, F(x) approaches 1.
  5. Probability of an interval: The difference of the CDF between two values of x gives the probability that the random variable falls between them. That is, the probability of the event {a < X ≤ b} is equal to F(b) - F(a).
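These properties can be checked numerically. The sketch below, assuming an exponential distribution with rate λ = 1 purely for illustration, verifies the limits, monotonicity, and interval-probability properties:

```python
import math

def exp_cdf(x: float, lam: float = 1.0) -> float:
    """CDF of an exponential(lam) variable: F(x) = 1 - exp(-lam*x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# Limits (property 4) and monotonicity (property 2):
assert exp_cdf(-10.0) == 0.0            # F(x) -> 0 as x -> -infinity
assert exp_cdf(1.0) <= exp_cdf(2.0)     # non-decreasing
assert exp_cdf(1000.0) > 0.999999       # F(x) -> 1 as x -> +infinity

# Interval probability (property 5): P(1 < X <= 2) = F(2) - F(1) = e^-1 - e^-2
p = exp_cdf(2.0) - exp_cdf(1.0)
print(round(p, 4))  # 0.2325
```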

Applications of CDF

The CDF is a fundamental tool in probability theory and statistics. It is used to describe the probability distribution of a random variable and to calculate various statistical measures such as the mean, median, mode, and variance. Some of the applications of CDF are:

  1. Describing the probability distribution of a random variable: The CDF can be used to describe the probability distribution of a random variable. For example, the CDF of a normal distribution gives the probability that a random variable takes on a value less than or equal to a given value. The CDF can also be used to describe the probability distribution of other types of random variables, such as binomial, Poisson, and exponential.
  2. Calculating the mean, median, mode, and variance: The CDF can be used to calculate various statistical measures of a random variable, such as the mean, median, mode, and variance. For example, the mean of a random variable X can be calculated using the following formula:

μ = ∫x f(x) dx

where f(x) is the probability density function of X (for a continuous random variable, f is the derivative of the CDF, f(x) = F'(x)), and the integral is taken over the range of X. The median of X is the value of x for which F(x) = 0.5. The mode of X is the value of x at which f(x) attains its maximum. The variance of X can be calculated using the following formula:

σ^2 = ∫(x-μ)^2 f(x) dx

where μ is the mean of X, and the integral is taken over the range of X.
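As a concrete sketch, the mean and median can be computed numerically for the exponential distribution with λ = 1 (chosen purely for illustration; its mean is 1 and its median is ln 2), using a simple Riemann sum for the integral and bisection to solve F(x) = 0.5:

```python
import math

def exp_pdf(x: float, lam: float = 1.0) -> float:
    """Density of an exponential(lam) variable: f(x) = lam * exp(-lam*x) for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def exp_cdf(x: float, lam: float = 1.0) -> float:
    """CDF of an exponential(lam) variable: F(x) = 1 - exp(-lam*x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# Mean: mu = integral of x * f(x) dx, approximated by a Riemann sum on [0, 50]
# (the tail beyond 50 is negligible for lam = 1).
dx = 1e-4
mean = sum(i * dx * exp_pdf(i * dx) * dx for i in range(int(50 / dx)))
print(round(mean, 3))  # ~ 1.0, the known mean of exponential(1)

# Median: solve F(x) = 0.5 by bisection on [0, 50].
lo, hi = 0.0, 50.0
for _ in range(60):
    mid = (lo + hi) / 2
    if exp_cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2
print(round(median, 4))  # ~ 0.6931, which is ln 2
```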

  3. Calculating probabilities: The CDF can be used to calculate probabilities of events involving random variables. For example, the probability of X taking on a value between a and b can be calculated as F(b) - F(a).
  4. Hypothesis testing: The CDF is used in hypothesis testing to determine whether a hypothesis is true or false based on the observed data. For example, in a hypothesis test for the mean of a normal distribution, the CDF is used to calculate the p-value, which is the probability of obtaining a test statistic as extreme as or more extreme than the observed value, assuming that the null hypothesis is true.
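For instance, a two-sided p-value for a z statistic can be computed directly from the normal CDF. The sketch below uses Python's `math.erf` and an illustrative observed value z = 1.96:

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of a normal distribution, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Two-sided p-value for an observed z statistic under H0 (standard normal):
# probability of a result at least as extreme as |z| in either tail.
z = 1.96
p_value = 2 * (1 - normal_cdf(abs(z)))
print(round(p_value, 3))  # 0.05
```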

Examples of CDF

Let's consider some examples of CDF.

Example 1: CDF of a uniform distribution

The CDF of a uniform distribution is given by:

F(x) = 0 for x < a
F(x) = (x - a)/(b - a) for a ≤ x < b
F(x) = 1 for x ≥ b

where a and b are the lower and upper limits of the uniform distribution, respectively. The CDF is continuous and piecewise linear: it equals 0 below a, rises linearly from 0 to 1 on [a, b], and equals 1 above b. There are no jumps, because the uniform distribution is continuous.
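A direct Python translation of this piecewise formula (with illustrative limits a = 2 and b = 10):

```python
def uniform_cdf(x: float, a: float = 0.0, b: float = 1.0) -> float:
    """CDF of a uniform distribution on [a, b]: 0 below a, linear on [a, b], 1 above b."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

print(uniform_cdf(1, a=2, b=10))   # 0.0   (below the support)
print(uniform_cdf(5, a=2, b=10))   # 0.375 (= (5 - 2) / (10 - 2))
print(uniform_cdf(12, a=2, b=10))  # 1.0   (above the support)
```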

Example 2: CDF of a normal distribution

The CDF of a normal distribution with mean μ and standard deviation σ is given by:

F(x) = 1/2[1 + erf((x-μ)/(σ√2))]

where erf is the error function. The CDF is an S-shaped (sigmoid) curve; it is the density, not the CDF, that is bell-shaped and symmetric about the mean, which implies F(μ) = 1/2. The area under the density curve from negative infinity to a equals F(a), the probability that the random variable takes on a value less than or equal to a.
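This formula translates directly to Python via `math.erf`. The sketch below checks that F(μ) = 0.5 and evaluates the familiar one-standard-deviation point:

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """F(x) = 1/2 * [1 + erf((x - mu) / (sigma * sqrt(2)))]."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

print(normal_cdf(0))                                 # 0.5: half the mass lies below the mean
print(round(normal_cdf(100, mu=100, sigma=15), 1))   # 0.5 again, for any mu and sigma
print(round(normal_cdf(1.0), 4))                     # 0.8413: the "one sigma" value
```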

Example 3: CDF of an exponential distribution

The CDF of an exponential distribution with parameter λ is given by:

F(x) = 1 - e^(-λx) for x ≥ 0 (and F(x) = 0 for x < 0)

where e is the base of natural logarithms. The CDF is an increasing function that starts at 0 and approaches 1 as x approaches infinity.
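A small sketch of this CDF, which also illustrates a convenient closed-form fact: for any rate λ, the probability of falling at or below the distribution's mean 1/λ is the same constant, 1 - 1/e ≈ 0.6321:

```python
import math

def exp_cdf(x: float, lam: float) -> float:
    """CDF of an exponential(lam) variable: F(x) = 1 - exp(-lam*x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

# F(1/lam) = 1 - e^-1 regardless of the rate lam.
for lam in [0.5, 1.0, 4.0]:
    print(round(exp_cdf(1.0 / lam, lam), 4))  # 0.6321 each time
```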

Conclusion

The Cumulative Distribution Function (CDF) is a fundamental tool in probability theory and statistics. It maps a random variable's possible values to the probability that the variable takes on a value less than or equal to a given value. The CDF is non-decreasing, bounded between 0 and 1, right-continuous, and tends to 0 as x approaches negative infinity and to 1 as x approaches positive infinity. It is used to describe the probability distribution of a random variable, calculate statistical measures such as the median, compute probabilities of events, and perform hypothesis testing. The CDF is defined for both continuous and discrete random variables and has applications in many areas of mathematics and statistics.