Abstract

As the interim editor of The Journal of Foot & Ankle Surgery, I look forward to carrying on the tradition established by my predecessor, John M. Schuberth, DPM, of publishing informative and timely articles covering the full scope of foot and ankle surgery. In fact, I see JFAS expanding its role as an outlet for meaningful biomedical information that translates to quality patient care. I also believe that one of the pillars of quality patient care is the publication of meaningful clinical investigations that, in the hands of discerning readers, convey information that can be used to the benefit of patients. Critical to this process is the ability of the reader to interpret and understand the results and conclusions presented in the biomedical literature. With this in mind, I would like, from time to time, to share with our readers some of my thoughts related to critical appraisal and interpretation of the medical literature. In time, I believe that these efforts will heighten the level of scrutiny that our readership applies to the articles published in JFAS. In return, a more critical readership will elevate the quality and usefulness of the clinical investigations, from case reports to clinical trials, submitted and selected for publication. In the long run, these efforts will increase the impact that our Journal has on the foot and ankle surgery community, as well as the biomedical community in general.

To begin, I would like to make a few points about the importance of understanding the distinctions between different types of clinical data and the statistical tests that are most appropriate for analyzing such data. Readers, as well as investigators, should realize that use of an inappropriate statistical test, or one that does not “fit” the data, may yield a result that is statistically significant yet clinically useless and invalid. The key to avoiding this type of analytical mistake is an understanding of the relationship between data type, data distribution, and statistical testing. This information should be useful to readers whenever they encounter the description of a statistical plan, the results of a clinical investigation, or the term “statistically significant” in a published article.

Data Type

The data collected in a clinical investigation serve as the numeric representations of the variables that researchers are interested in. Independent variables represent exposures that are considered to have an influence on the dependent, or outcome, variable. Variables can be continuous, such as numeric data measured on an interval scale, or discrete, such as categorical data that represent a nonquantifiable designation such as gender. Variables are further classified as nominal, ordinal, interval, or ratio. Nominal variables qualitatively classify data into specific categories without any indication of quantity or rank order; typical examples used in clinical research include an individual’s race and gender. Ordinal variables enable investigators to rank data into ordered categories such as mild, moderate, and severe; however, the magnitude of the difference between any two categories cannot be quantified. Interval variables enable investigators to both rank data and quantify the magnitude of differences between values. A typical example of interval data is body temperature measured on the Celsius (or Fahrenheit) scale, or a laboratory value measured on a fixed and ordered quantitative scale. Ratio variables, such as those used to represent measurements of space and time, share the characteristics of interval data with the additional feature of an established zero value, which enables investigators to compare the relative magnitudes of two quantities (a quotient).
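To make these four scales concrete, the following is a minimal sketch in Python (assuming the pandas library is available; the column names and values are hypothetical) showing how nominal and ordinal variables can be encoded as categorical data while interval and ratio variables remain numeric:

```python
# A hypothetical clinical dataset illustrating the four variable types.
import pandas as pd

df = pd.DataFrame({
    "gender":   ["F", "M", "F", "M"],                    # nominal: categories, no order
    "severity": ["mild", "severe", "moderate", "mild"],  # ordinal: ordered categories
    "temp_c":   [36.8, 38.2, 37.1, 36.9],                # interval: differences meaningful, no true zero
    "los_days": [2, 5, 3, 4],                            # ratio: true zero, quotients meaningful
})

# Nominal: an unordered categorical; equality is the only meaningful comparison.
df["gender"] = pd.Categorical(df["gender"])

# Ordinal: an ordered categorical, so rank comparisons (<, >) are defined,
# but the "distance" between mild and moderate is not quantified.
df["severity"] = pd.Categorical(
    df["severity"], categories=["mild", "moderate", "severe"], ordered=True
)

# Interval: differences are meaningful (38.2 - 36.8 = 1.4 degrees C),
# but ratios are not (38 C is not "twice as hot" as 19 C).
print(df["temp_c"].max() - df["temp_c"].min())

# Ratio: a true zero makes quotients meaningful (here, 5 / 2 = 2.5).
print(df["los_days"].max() / df["los_days"].min())
```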
Data Distribution

A data distribution is described by the shape of the plot that is observed when the data are graphed. Once plotted, the shape is assessed for skewness (whether the data tail off to the left or right side of the plot) and kurtosis (the sharpness of the peak). Some of the more common distributions observed in continuous datasets include the normal, exponential, and uniform distributions. The normal, or Gaussian, distribution is a symmetrical (parametric) distribution commonly referred to as a bell-shaped distribution; it is usually associated with a large sample size and an equal chance of success or failure in regard to the outcome of interest. The exponential distribution arises in reliability calculations; it is akin to the Poisson distribution (see below) in that it describes infrequent events, with the probability of the outcome of interest decreasing as the number of trials increases. The uniform distribution displays a constant success rate at certain intervals and a null, or zero, success rate at other intervals.

Some of the more common distributions observed in discrete (categorical) datasets include the binomial, Poisson, and hypergeometric distributions. The binomial distribution, which extends the Bernoulli distribution to a series of trials, describes a random sampling process wherein every outcome is categorized as either yes or no (success or failure, heads or tails, binary data), and the probability of either outcome remains constant and independent from one trial to another. The Poisson distribution describes a random sampling process in which the outcome of interest occurs at a regular but infrequent rate. The hypergeometric distribution describes a random sampling process in which a sample is selected from a larger population and never placed back into the sampling pool, thereby reducing the total number of available samples by one after each trial; this is also referred to as sampling without replacement.
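As an illustration, the following sketch (assuming numpy and scipy are available; the parameter values are arbitrary) draws samples from each of the distributions named above and checks the continuous ones for skewness and kurtosis:

```python
# Sampling the common continuous and discrete distributions and checking shape.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10_000

# Continuous: normal, exponential, uniform.
normal = rng.normal(loc=0.0, scale=1.0, size=n)
expon  = rng.exponential(scale=1.0, size=n)
unif   = rng.uniform(low=0.0, high=1.0, size=n)

# Discrete: binomial, Poisson, hypergeometric (sampling without replacement).
binom   = rng.binomial(n=10, p=0.5, size=n)
poisson = rng.poisson(lam=2.0, size=n)
hyper   = rng.hypergeometric(ngood=7, nbad=13, nsample=5, size=n)

# Skewness near 0 and excess kurtosis near 0 indicate a normal shape;
# the exponential sample is strongly right-skewed.
for name, x in [("normal", normal), ("exponential", expon), ("uniform", unif)]:
    print(f"{name:12s} skew={stats.skew(x):+.2f}  kurtosis={stats.kurtosis(x):+.2f}")
```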
Data Analysis and Statistical Significance

To determine whether an observed difference in outcome between treatment groups is due to the intervention being tested, or simply due to chance, mathematical tests of statistical significance must be calculated. Nowadays, these statistical tests are rapidly computed with any number of readily available statistical software programs. One of the keys to making valid conclusions about the results of a clinical investigation, however, lies in choosing a statistical test that is appropriate for the type and distribution of the data being analyzed. These computations test the hypothesis that there is no difference (the null hypothesis) between the sets of data collected for the different treatment groups; the tests are based on probability theory and make certain assumptions about the data, including, among other characteristics, whether the data are random or nonrandom, independent or linked, and continuous or categorical.

Random, or chance, differences in the data serve as the foundation for the scientific analysis of the influence that an intervention has on the outcome of interest. The randomness of any outcome observed in a study can be affected by systematic or measurement biases that influence the chance that such an outcome will occur. Well-designed scientific investigations use bias-reducing methods, such as blinding of participants and outcome assessors, and random allocation to a specific treatment group, to minimize the influence that bias has on the results. Independent data have no reasonable connection between measured values; in the classic sense, they fit two different (unpaired) sets of study participants receiving different treatments. Dependent data are linked by some characteristic variable, such as data associated with the same study participant (classically, pretreatment and posttreatment data collected on the same patient, referred to as paired data), the same treatment center, the same surgeon, or some other variable common to the participants in the investigation.

Statistical analyses undertaken to discover the essential meaning of the measurements made in a clinical investigation entail descriptive and inferential methods. Collected data should be described in regard to central tendency (average) and dispersion (variance or range). For data that are normally distributed (parametric; displaying a bell-shaped distribution when plotted), the mean serves as the measure of central tendency and the standard deviation (the square root of the variance) serves as the measure of dispersion. For data that are non-normally distributed (nonparametric; not displaying a bell-shaped curve), the median serves as the measure of central tendency and the range serves as the measure of dispersion. Because the mean is sensitive to outlying values, it is not an appropriate measure of central tendency when outliers skew the distribution curve in one direction or the other; with skewed data, mean-based statistical methods are likewise inappropriate, and the median, along with median-based statistical tests, should be used in the analysis. The reason that different statistical tests and methods of estimation are sensitive to the distribution of the data relates back to probability theory, which serves as the foundation of statistical methodology. In essence, the mathematical procedures used in statistical analyses make certain assumptions about the type and distribution of the data, and the calculations differ depending on the values of the parameters (average and dispersion) that represent the data. If an inappropriate statistical computation is used, then an investigator’s conclusion about the statistical significance of the results of a study could be (and often is) incorrect. This problem, in general, diminishes as the sample size becomes large and the distribution of the data approaches normality.
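The outlier sensitivity of the mean is easy to demonstrate. Here is a minimal sketch (the recovery-time values are hypothetical) contrasting the mean and standard deviation with the median and range on a small skewed dataset:

```python
# Why the mean is a poor summary when an outlier skews the data.
import numpy as np

# Hypothetical recovery times (days); one outlier drags the mean upward.
days = np.array([12, 14, 13, 15, 14, 13, 90])

print("mean:  ", days.mean())        # ~24.4, pulled toward the outlier
print("std:   ", days.std(ddof=1))   # inflated by the outlier
print("median:", np.median(days))    # 14, robust to the outlier
print("range: ", days.max() - days.min())
```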
Inferential statistical tests are undertaken to distinguish whether observed differences between treatment groups were due to the interventions being tested or simply due to chance. To say that a statistically significant difference was observed means nothing more than that, at the desired level of significance (typically the 5% level), the observed difference was not likely due to chance but, more likely, was due to the intervention being tested. In other words, a “statistically significant” result is simply one that is unlikely to be explained by chance alone.

Inferential statistical tests that are based on the normal distribution, or on distributions derived from or related to the normal distribution, are appropriate when the data meet the normality assumption; these include the Student t test, the F test, and the chi-square tests. Problems with conclusions may occur, however, when a normal distribution-based (parametric) test is used to analyze data from variables that are themselves not normally distributed. Nonparametric (distribution-free) tests are often used in place of their parametric counterparts when certain assumptions about the underlying population are questionable. For example, when comparing two independent samples, the Wilcoxon Mann-Whitney U test does not assume that the difference between the samples is normally distributed, whereas its parametric counterpart, the 2-sample t test, does make this assumption. Nonparametric tests may be, and often are, more powerful in detecting population differences when such assumptions are not satisfied, and this difference in the ability of a specific test to detect a nonchance difference between the observed groups is the reason that choosing an appropriate statistical test matters to investigators and readers alike: one test may identify a statistically significant difference, whereas the other may not. All tests involving ordered (ordinal, ranked) data, such as those defined by mild, moderate, and severe categories, are nonparametric. Some commonly used nonparametric tests include the sign test, the Wilcoxon Mann-Whitney U test, the Wilcoxon signed-rank test, and the Kruskal-Wallis test.
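To see how the choice of test can change the verdict, the following sketch (assuming scipy is available; the group data are hypothetical and deliberately drawn from skewed distributions) runs the 2-sample t test and the Mann-Whitney U test on the same data:

```python
# Parametric vs nonparametric comparison of two independent samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical skewed outcome scores for two treatment groups.
group_a = rng.exponential(scale=10.0, size=25)
group_b = rng.exponential(scale=16.0, size=25)

# Parametric: the 2-sample t test assumes approximately normal data.
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Nonparametric: the Mann-Whitney U test makes no normality assumption.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"2-sample t test: p = {t_p:.3f}")
print(f"Mann-Whitney U:  p = {u_p:.3f}")
# With skewed data the two tests can disagree on "statistical significance".
```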
In summary, investigators and readers of the medical literature need to be aware of the differences between data types and data distributions, and of the bearing that type and distribution have on the appropriateness of a specific statistical test. Data that are continuous and normally distributed, derived from large datasets without outlying values, are suitably described by parameters such as the mean and standard deviation, and can be confidently analyzed with statistical methods that assume a normal distribution. Categorical data, as well as small datasets and datasets containing outliers, are best described by the median and range, and are most suited to nonparametric statistical methods that make no assumptions about the distribution of the data. Failure to appreciate these important distinctions can lead to the use of an inappropriate statistical test, which, in turn, may lead to an invalid conclusion about the results of a clinical investigation.
