Even with very large sample sizes, few datasets are perfectly normal. Nevertheless, many statistical tests assume normality. To apply a statistical test that assumes normality (e.g., Student’s t test (Biostatistics Text)), it is necessary to determine whether the data are “sufficiently normal”. In the medical literature, the Kolmogorov–Smirnov test is used most frequently for this purpose; however, it is less powerful for testing normality than the Shapiro–Wilk test (also known as the W test) or the Anderson–Darling test.
Programming in MATLAB
To visually inspect the distribution of data, enter the following command:
hist(x, n)
where x is a column of data containing all datapoints and n is the number of bins (columns) in the histogram.
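The visual check described above can be tried on simulated data. The following is a minimal sketch, not a prescribed workflow: the sample x, its size, its mean and standard deviation, and the bin count n are all made-up values for illustration. It assumes the Statistics and Machine Learning Toolbox is installed, since histfit and qqplot come from that toolbox.

```matlab
% Hypothetical sample: 200 draws from a normal distribution (mean 5, SD 2).
rng(1);                      % fix the random seed so the figures are repeatable
x = 5 + 2*randn(200, 1);     % column vector of datapoints
n = 20;                      % number of histogram bins (illustrative choice)

figure;
hist(x, n);                  % plain histogram with n bins
xlabel('Value');
ylabel('Count');

figure;
histfit(x, n);               % histogram with a fitted normal curve overlaid

figure;
qqplot(x);                   % normal quantile-quantile plot
```

For approximately normal data, the histogram should look roughly bell-shaped and the points in the quantile-quantile plot should fall close to the reference line. Such plots are only a visual aid and do not replace a formal test.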
To formally test a dataset for normality with a one-sample Kolmogorov–Smirnov test in MATLAB, use the following command:
h = kstest(x)
where x is a column of data containing all datapoints. By default, the null hypothesis is that x comes from a standard normal distribution (mean 0, standard deviation 1), so data on any other scale should be standardized before testing. The function kstest returns 1 if the test rejects the null hypothesis at the 5% significance level (i.e., the data differ significantly from a standard normal distribution) and 0 if it fails to reject the null hypothesis. A result of 0 does not prove that the data are normally distributed; it only means that no significant departure from normality was detected. Information on one-sided and two-sided K-S tests is available at the MATLAB hypothesis testing website.
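Because kstest compares the data to a standard normal distribution by default, raw measurements are usually standardized first. The sketch below illustrates this on a made-up sample (the data, sample size, and variable names are assumptions for illustration only) and also runs adtest, the Anderson–Darling test from the Statistics and Machine Learning Toolbox, as one of the more powerful alternatives mentioned earlier.

```matlab
% Hypothetical sample: 200 draws from a normal distribution (mean 5, SD 2).
rng(1);                           % fix the random seed for repeatability
x = 5 + 2*randn(200, 1);

% Standardize so the sample has mean 0 and standard deviation 1,
% matching the standard normal distribution that kstest uses by default.
z = (x - mean(x)) / std(x);

% One-sample Kolmogorov-Smirnov test; h = 1 means "reject normality" at 5%.
[h_ks, p_ks] = kstest(z);

% Anderson-Darling test; by default it estimates the mean and variance
% from the data, so x does not need to be standardized.
[h_ad, p_ad] = adtest(x);

fprintf('K-S: h = %d, p = %.3f\n', h_ks, p_ks);
fprintf('A-D: h = %d, p = %.3f\n', h_ad, p_ad);
```

Note that standardizing with the sample mean and standard deviation makes the K-S test conservative, because the reference distribution was fitted to the same data. MATLAB's lillietest (the Lilliefors test) is designed for that situation and may be a better choice when the parameters are estimated from the sample.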