20 Jul 2011

Non-normal data and SEM

There are a number of interesting discussions going on in the Doctorate Support Group on Facebook. One of the more recent discussions was started by a member who had normality issues with her data.

Normality is an issue because it is one of the basic assumptions of structural equation modelling (SEM) analysis (Byrne, B. M. (2010). Structural equation modeling with AMOS: Basic concepts, applications, and programming. New York: Routledge, Taylor and Francis Group).

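Normality means that the data follow a normal distribution, i.e. a symmetric, bell-shaped curve centred on the mean. (A mean of 0 and a standard deviation of 1 describe the standard normal distribution, which is one special case; normal data can have any mean and standard deviation.) Normally the skewness and kurtosis measures are checked: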

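  • Skewness: as a rule of thumb, the value should be within the range ±1 (a normal distribution has skewness 0).
  • Kurtosis: as a rule of thumb, the value should be within the range ±3. A range centred on zero implies excess kurtosis, which is 0 for a normal distribution.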
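The check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample scores are made up for the example, and it uses the simple population-moment formulas. Statistical packages often apply small-sample corrections, so their figures may differ slightly.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population moment formulas."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # a normal distribution gives 0
    return skew, excess_kurt


def within_rule_of_thumb(values, skew_limit=1.0, kurt_limit=3.0):
    """Apply the +/-1 (skewness) and +/-3 (kurtosis) rule of thumb."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) <= skew_limit and abs(kurt) <= kurt_limit


# Hypothetical questionnaire scores, purely for illustration.
scores = [3, 4, 4, 5, 5, 5, 6, 6, 7, 9]
skew, kurt = skewness_and_excess_kurtosis(scores)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
print("within rule of thumb:", within_rule_of_thumb(scores))
```

In practice each observed variable in the model would be checked this way, ideally alongside a histogram, since the cut-offs are only a rule of thumb.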

So what happens when your data is non-normal?

You don't have to worry unless the departure from normality is very severe, for the following reasons:

1. Several studies have shown that most data in the social sciences are not normally distributed.

Bentler, P.M., & Chou, C.-P. (1987). Practical issues in structural modeling. Sociological Methods & Research, 16, 78-117.

Barnes, J., Cote, J., Cudeck, R. & Malthouse, E. (2001). Checking assumptions of normality before conducting factor analyses. Journal of Consumer Psychology, 10(1/2), 79-81.

2. The maximum likelihood (ML) estimator is considered relatively robust to violations of the normality assumption.

Diamantopoulos, A. & Siguaw, J. A. (2000). Introducing LISREL: A guide for the uninitiated. Sage Publications.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

3. Monte Carlo experiments found no major differences in SEM results from the ML estimator across samples of different sizes and with different levels of kurtosis and skewness.

Reinartz, W., Haenlein, M. & Henseler, J. (2009). An empirical comparison of the efficacy of covariance-based and variance-based SEM. International Journal of Research in Marketing, 26, 332-344.

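4. Bootstrapping is increasingly being used to get around this issue, because it estimates standard errors and confidence intervals by resampling the data rather than relying on the normality assumption.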

Preacher, K. J. & Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers, 36, p. 717.

5. A large sample size reduces the problem of multivariate non-normality.

Hair, J. F., Black, W. C., Babin, B. J. & Anderson, R. E. (2010). Multivariate data analysis: A global perspective. Upper Saddle River, NJ: Pearson Education Inc.

Note: Severely non-normal data would probably need an alternative approach.
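To give a feel for the bootstrapping mentioned in point 4, here is a minimal sketch of a percentile bootstrap confidence interval for an indirect effect (a*b) in a simple X -> M -> Y mediation model, the kind of problem Preacher and Hayes address. It is not their SPSS/SAS macro: the function names, the simulated data and the default settings are all assumptions made for illustration, and it uses only the Python standard library.

```python
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _cov(xs, ys):
    """Sample covariance (variance when xs is ys)."""
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def indirect_effect(rows):
    """Return a*b for simple mediation; rows are (x, m, y) tuples."""
    xs, ms, ys = zip(*rows)
    var_x, var_m = _cov(xs, xs), _cov(ms, ms)
    cov_xm, cov_xy, cov_my = _cov(xs, ms), _cov(xs, ys), _cov(ms, ys)
    a = cov_xm / var_x  # slope of M regressed on X
    # Slope of M when Y is regressed on both X and M.
    b = (cov_my * var_x - cov_xm * cov_xy) / (var_x * var_m - cov_xm ** 2)
    return a * b


def percentile_bootstrap_ci(rows, statistic, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample rows with replacement, then take quantiles."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


def simulate_rows(n=300, a=0.5, b=0.7, seed=1):
    """Hypothetical data with a skewed (exponential) predictor."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.expovariate(1.0)
        m = a * x + rng.gauss(0, 1)
        y = b * m + rng.gauss(0, 1)
        rows.append((x, m, y))
    return rows


rows = simulate_rows()
low, high = percentile_bootstrap_ci(rows, indirect_effect)
print(f"indirect effect: {indirect_effect(rows):.3f}")
print(f"95% percentile bootstrap CI: [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the indirect effect is usually treated as significant. The appeal of this approach is that the interval comes from the resampled estimates themselves, so no normal sampling distribution has to be assumed for a*b.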

4 comments:

  1. Thank you very much!
  2. Nor Hisham Haron, 30 April 2013 at 07:02

    Hello...
    I would like to know how we can determine, from the values of skewness and kurtosis, whether the data is moderately non-normal or extremely non-normal?
  3. Hi Nor Hisham, It's quite subjective and researchers use these terms by looking at the graphs. However, I found this link which might be useful. http://www.une.edu.au/WebStat/unit_materials/c4_descriptive_statistics/determine_skew_kurt.html