MHT (multiple-hypotheses testing)

Multiple-hypotheses testing (MHT) is a statistical approach for testing many hypotheses simultaneously. It arises whenever a study involves a large number of tests, as in genomics or medical research, where thousands or even millions of hypotheses may be tested at once. MHT procedures are designed to control an error rate across the whole family of tests, for example the probability of at least one false positive (the family-wise error rate, FWER) or the expected proportion of false positives among the rejected hypotheses (the false discovery rate, FDR). In this essay, we will discuss the concepts and procedures related to multiple-hypotheses testing.

The basics of multiple-hypotheses testing

In a standard hypothesis testing scenario, we test one hypothesis with a single significance level. However, in MHT, we test multiple hypotheses simultaneously, which can increase the likelihood of false positives. Therefore, to control the false positive rate, we need to adjust the significance level for each individual test.

The two main types of errors in hypothesis testing are false positives (Type I error) and false negatives (Type II error). A false positive occurs when we reject a null hypothesis that is actually true. A false negative occurs when we fail to reject a null hypothesis that is actually false. The significance level is the probability of making a Type I error when the null hypothesis is true. The power is the probability of correctly rejecting a false null hypothesis, that is, one minus the probability of a Type II error.

When we perform multiple tests, the probability of making at least one false positive error grows with the number of tests. To account for this, we can use multiple testing correction procedures, which adjust either the significance level or the p-values. There are several methods for multiple testing correction, and we will discuss some of the most common methods in the following sections.
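To make the growth in false positives concrete, the following minimal sketch computes the probability of at least one false positive when every null hypothesis is true. It assumes the tests are independent, which is a simplification made here for illustration; the function name is hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m tests.

    Assumes all m null hypotheses are true and the tests are independent,
    each run at significance level alpha.
    """
    # P(no false positive in one test) = 1 - alpha, so for m independent
    # tests P(no false positive at all) = (1 - alpha) ** m.
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 10, 100):
    print(m, round(familywise_error_rate(0.05, m), 3))
# prints:
# 1 0.05
# 10 0.401
# 100 0.994
```

With 100 uncorrected tests at an alpha of 0.05, at least one false positive is almost certain under these assumptions.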

Methods for multiple testing correction

Bonferroni correction

The Bonferroni correction is one of the most widely used methods for multiple testing correction. It is a simple method that involves dividing the significance level (alpha) by the number of tests performed. For example, if we are testing 100 hypotheses with an alpha of 0.05, we would divide 0.05 by 100 to get a corrected alpha of 0.0005. This corrected alpha is then used as the significance level for each individual test.
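The division described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the function name and the example p-values are made up for the purpose.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected.

    Each p-value is compared with the corrected level alpha / m,
    where m is the number of tests.
    """
    m = len(p_values)
    if m == 0:
        return []
    corrected_alpha = alpha / m
    return [p <= corrected_alpha for p in p_values]


# Five illustrative p-values: the corrected level is 0.05 / 5 = 0.01.
example = [0.0001, 0.004, 0.02, 0.3, 0.9]
print(bonferroni_reject(example))
# prints: [True, True, False, False, False]
```

Note that 0.02 would be significant at the uncorrected level of 0.05 but is not rejected after the correction.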

The Bonferroni correction controls the family-wise error rate, which is the probability of making at least one false positive across all tests. It does not require the tests to be independent and remains valid under any dependence structure. However, it is conservative, and it becomes increasingly so when the number of tests is large or the tests are positively correlated, leading to a loss of power.

False discovery rate (FDR) correction

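The false discovery rate (FDR) correction is another commonly used method for multiple testing correction. Unlike the Bonferroni correction, which guards against even a single false positive, the FDR correction tolerates some false positives as long as their share among the rejections stays bounded. The FDR is defined as the expected proportion of false positives among all rejected hypotheses, with the proportion taken to be zero when nothing is rejected.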

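The FDR correction is often reported through a q-value for each hypothesis, which is the minimum FDR level at which that hypothesis would be rejected. In its most common form (the adjusted p-value of the Benjamini-Hochberg procedure), the q-value is calculated by ranking the p-values from smallest to largest and multiplying each p-value by the total number of hypotheses divided by its rank. Then, working from the largest p-value down, each value is replaced by the smallest scaled value found at its rank or any higher rank, so that the q-values never decrease with the p-values, and the result is capped at 1. A hypothesis is rejected when its q-value is at or below the chosen FDR level.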
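The ranking-and-scaling computation can be sketched as follows. This is a minimal sketch of the Benjamini-Hochberg adjusted p-values, which are one common meaning of "q-value"; other definitions exist that also estimate the share of true null hypotheses. The sketch includes the monotonicity step, in which each scaled value is replaced by the smallest scaled value at its rank or higher. The function name is hypothetical.

```python
def bh_adjusted_p_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps every adjusted value at 1
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # keeping the smallest scaled value seen so far.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        scaled = p_values[idx] * m / rank
        running_min = min(running_min, scaled)
        adjusted[idx] = running_min
    return adjusted


q_values = bh_adjusted_p_values([0.01, 0.04, 0.03, 0.005])
print([round(q, 4) for q in q_values])
# prints: [0.02, 0.04, 0.04, 0.02]
```

A hypothesis is then rejected at FDR level 0.05 when its adjusted value is at most 0.05, which here is all four.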

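The FDR correction is less conservative than the Bonferroni correction, which means that it has greater power. The price is that some false positives are accepted by design: at an FDR level of 0.05, up to 5% of the rejected hypotheses are expected to be false positives, so the absolute number of false positives can grow as the number of rejections grows.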

Benjamini-Hochberg (BH) procedure

The Benjamini-Hochberg (BH) procedure is the standard and most widely used procedure for FDR correction. It is designed to control the FDR at a specific level q, which is set by the researcher. The procedure involves ranking the m p-values from smallest to largest and comparing the p-value of rank i to the critical value (i/m) × q.

The procedure then finds the largest rank k whose p-value is at or below its critical value and rejects the null hypotheses of ranks 1 through k, even if some of the p-values among them exceed their own critical values. If no p-value meets its critical value, nothing is rejected. The BH procedure is less conservative, and therefore more powerful, than the Bonferroni correction. Its FDR guarantee holds when the tests are independent and under certain forms of positive dependence. Because it controls the FDR rather than the family-wise error rate, a bounded share of the rejections is expected to be false positives.
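The rejection rule can be sketched as below. This is a minimal illustration of the step-up form of the BH procedure: it looks for the largest rank whose p-value is at or below its critical value and rejects every hypothesis up to that rank. The function name and the example p-values are made up for illustration.

```python
def benjamini_hochberg_reject(p_values, q=0.05):
    """Return one boolean per hypothesis: True if BH rejects it at FDR level q."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        critical_value = rank / m * q
        if p_values[idx] <= critical_value:
            largest_passing_rank = rank
    # Reject every hypothesis ranked at or below the largest passing rank.
    reject = [False] * m
    for idx in order[:largest_passing_rank]:
        reject[idx] = True
    return reject


# Critical values for m = 5 and q = 0.05 are 0.01, 0.02, 0.03, 0.04, 0.05.
example = [0.001, 0.03, 0.035, 0.039, 0.9]
print(benjamini_hochberg_reject(example))
# prints: [True, True, True, True, False]
```

In this example the second and third p-values exceed their own critical values but are still rejected, because the fourth p-value (0.039) falls below its critical value (0.04).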

Permutation testing

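Permutation testing is a non-parametric method that can be used for multiple testing correction. It involves randomly shuffling the data (for example, the group labels) many times and recalculating the test statistics for each permutation. The resulting permutation distribution is then used to calculate the p-value for each hypothesis. To correct for multiple testing, a common choice is to record the largest test statistic across all hypotheses in each permutation and compare each observed statistic to the distribution of these maxima, which accounts for correlation among the tests.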
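As an illustration of the shuffle-and-recompute idea, the sketch below uses one common variant, the single-step max-statistic approach: in each permutation the group labels are shuffled, the largest statistic across all hypotheses is recorded, and each observed statistic is compared with the distribution of those maxima. The two-group design, the absolute difference in means as the test statistic, and all names are assumptions chosen for the example.

```python
import random


def mean_diff_stats(rows, labels):
    """Absolute difference in group means for each feature (column)."""
    n_features = len(rows[0])
    sums = [[0.0] * n_features, [0.0] * n_features]
    counts = [0, 0]
    for row, group in zip(rows, labels):
        counts[group] += 1
        for j, value in enumerate(row):
            sums[group][j] += value
    return [abs(sums[1][j] / counts[1] - sums[0][j] / counts[0])
            for j in range(n_features)]


def max_stat_adjusted_p_values(rows, labels, n_permutations=1000, seed=0):
    """Permutation p-values adjusted with the maximum statistic.

    rows: one list of feature values per sample.
    labels: 0 or 1 per sample (both groups must be present).
    """
    rng = random.Random(seed)
    observed = mean_diff_stats(rows, labels)
    shuffled = list(labels)
    exceed = [0] * len(observed)
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and data
        max_stat = max(mean_diff_stats(rows, shuffled))
        for j, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[j] += 1
    # Adding 1 to numerator and denominator counts the observed labelling
    # itself, so a p-value can never be exactly zero.
    return [(1 + count) / (n_permutations + 1) for count in exceed]


# Simulated data: 20 samples, 3 features, only feature 0 differs by group.
data_rng = random.Random(1)
labels = [0] * 10 + [1] * 10
rows = [[5.0 * g + data_rng.gauss(0, 1), data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
        for g in labels]
print(max_stat_adjusted_p_values(rows, labels))
```

In this simulated example the first feature should receive a much smaller adjusted p-value than the other two.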

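Permutation testing is particularly useful when the data do not follow a normal distribution or when the assumptions of parametric tests are violated. It is also more flexible than other methods of multiple testing correction because it does not require a specific parametric form for the distribution of the data. It does, however, assume that the observations are exchangeable under the null hypothesis, and it can be computationally expensive when the number of tests or permutations is large.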

Bayesian methods

Bayesian methods are a relatively new approach to multiple testing correction. They involve assigning prior probabilities to each hypothesis and then updating these probabilities based on the data. The posterior probability of each hypothesis is then used to determine whether to reject or accept the null hypothesis.
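As a rough illustration of the prior-to-posterior update, the sketch below applies Bayes' rule to a single test statistic under a simple two-component model. Everything specific here is an assumption made for the example: the statistic is a z-score that follows a standard normal distribution under the null hypothesis and a wider normal distribution under the alternative, and the prior probability of the null (0.9) and the alternative standard deviation (3.0) are arbitrary illustrative values.

```python
import math


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null_probability(z, prior_null=0.9, alt_sd=3.0):
    """Posterior probability that the null hypothesis is true given z.

    Assumed model (hypothetical): z ~ N(0, 1) under the null and
    z ~ N(0, alt_sd ** 2) under the alternative.
    """
    null_part = prior_null * normal_pdf(z, 1.0)
    alt_part = (1.0 - prior_null) * normal_pdf(z, alt_sd)
    return null_part / (null_part + alt_part)


for z in (0.0, 2.0, 4.0):
    print(z, round(posterior_null_probability(z), 3))
```

Under these assumptions, a z-score near zero leaves the null hypothesis more probable than it was under the prior, while a large z-score makes it very improbable. A hypothesis could then be flagged when its posterior null probability falls below a chosen threshold.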

Bayesian methods are particularly useful when there is prior information available about the hypotheses. They can also be used to incorporate information from multiple sources, such as different studies or datasets.

Applications of multiple-hypotheses testing

Multiple-hypotheses testing is used in a wide range of fields, including genomics, medical research, psychology, and economics. In genomics, for example, researchers may want to test thousands of genes to identify those that are associated with a particular disease. In medical research, researchers may want to test multiple treatments to identify the most effective one.

MHT is also useful in psychology and social science research, where researchers may want to test multiple hypotheses about behavior, attitudes, or personality traits. In economics, MHT can be used to test multiple hypotheses about the effects of different policies or interventions.

Conclusion

Multiple-hypotheses testing addresses the problem of testing many hypotheses simultaneously, as in genomics or medical studies. Its procedures control an error rate defined over the whole family of tests, such as the family-wise error rate or the false discovery rate, rather than the error rate of each test taken alone. There are several methods for multiple testing correction, including the Bonferroni correction, FDR correction with the Benjamini-Hochberg procedure, permutation testing, and Bayesian methods. Each method has its own advantages and disadvantages, and researchers should choose the method that is best suited to their data and research question.