
Factor Analysis in Quantitative Research Methods


Session Long Project Tutorials:

TUTORIAL ONE:
PRINCIPAL COMPONENT/FACTOR ANALYSIS

1. Purpose of Methodology:
The types of research questions for which principal component and factor analysis are most appropriate are those in which you are attempting to reduce the number of variables and to detect structure in the relationships between variables (also known as classifying variables). In essence, you are attempting to reduce the amount of data.

Example Research Question:
A researcher could distribute a questionnaire to measure job satisfaction. The questionnaire would ask subjects how satisfied they are in their present jobs and how intensely they are pursuing a new job. More than likely, the responses to the two items are highly correlated with one another. Given the high correlation between the two items, we can conclude that they are largely redundant.
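The sketch below (Python, with simulated data) illustrates this kind of redundancy; the item names, sample size, and effect sizes are illustrative assumptions, not values from the article.

import numpy as np

# Simulated responses: a single underlying "job satisfaction" score drives both items.
rng = np.random.default_rng(0)
n = 200
satisfaction = rng.normal(size=n)

# Item 1: satisfaction with the present job; Item 2: intensity of the search for
# a new job (inversely related to satisfaction). Each has its own measurement noise.
item1 = satisfaction + 0.5 * rng.normal(size=n)
item2 = -satisfaction + 0.5 * rng.normal(size=n)

# A strong (here negative) correlation suggests the items carry largely the same information.
print("correlation between the two items:", round(np.corrcoef(item1, item2)[0, 1], 2))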

2. Existing Studies:
The correlation between two variables can be summarized in a scatterplot, and a regression line can then be fitted to represent the most accurate summary of the linear relationship between them. If it were possible to define a variable along that regression line, it would capture the "truth" shared by the two items. Each subject's single score on that new factor, represented by the regression line, could be used in future data analyses to represent the essence of the two items. Thus, the two variables have been reduced to one factor, and the new factor is in fact a linear combination of the two variables.
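As a rough sketch of this reduction (again with simulated data), the first principal component can play the role of the variable on the regression line, giving each subject a single score that summarizes both items:

import numpy as np
from sklearn.decomposition import PCA

# Two correlated items driven by one underlying score (simulated for illustration).
rng = np.random.default_rng(0)
truth = rng.normal(size=200)
item1 = truth + 0.5 * rng.normal(size=200)
item2 = truth + 0.5 * rng.normal(size=200)
X = np.column_stack([item1, item2])

# Reduce the two items to a single factor score per subject.
pca = PCA(n_components=1)
factor_scores = pca.fit_transform(X)

# Share of the total variance captured by that single component.
print(round(pca.explained_variance_ratio_[0], 2))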

3. How and Why it Works?:
Principal components analysis is one data reduction method: it is a method for reducing the number of variables. The decision of when to stop extracting factors depends on when only very little random variability remains; each consecutive factor extracted accounts for less and less of the variability. The researcher typically starts with a correlation matrix in which the variances of all variables are equal to 1.0, so the total variance in the matrix equals the number of variables. If the example research question above concerning job satisfaction included 10 items measuring different aspects of satisfaction at work, the variance accounted for by successive factors could be summarized as follows.

STATISTICA Factor Analysis
Eigenvalues (factor.sta), Extraction: Principal components

Value   Eigenvalue   % Total Variance   Cumulative Eigenvalue   Cumulative %
1       6.118369     61.18369            6.11837                 61.1837
2       1.800682     18.00682            7.91905                 79.1905
3       0.472888      4.72888            8.39194                 83.9194
4       0.407996      4.07996            8.79993                 87.9993
5       0.317222      3.17222            9.11716                 91.1716
6       0.293300      2.93300            9.41046                 94.1046
7       0.195808      1.95808            9.60626                 96.0626
8       0.170431      1.70431            9.77670                 97.7670
9       0.137970      1.37970            9.91467                 99.1467
10      0.085334      0.85334           10.00000                100.0000

The second column (Eigenvalue) above shows the variances extracted by the successive factors. The third column expresses these values as a percentage of the total variance: Factor 1 accounts for about 61 percent of the variance, while Factor 4 accounts for about 4 percent. The sum of the eigenvalues is equal to the number of variables. The last two columns contain the cumulative eigenvalues and the cumulative percentage of variance extracted.
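A sketch of how such a table can be computed, using simulated 10-item data (not the data behind the table above):

import numpy as np

# Simulate 10 intercorrelated items driven by one underlying factor.
rng = np.random.default_rng(1)
factor = rng.normal(size=(300, 1))
items = factor @ rng.normal(size=(1, 10)) + rng.normal(size=(300, 10))

# Eigenvalues of the correlation matrix, sorted from largest to smallest.
R = np.corrcoef(items, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# The eigenvalues sum to the number of variables (here 10).
print("sum of eigenvalues:", round(eigenvalues.sum(), 2))
for i, ev in enumerate(eigenvalues, start=1):
    print(f"factor {i}: eigenvalue {ev:.3f}, {100 * ev / len(eigenvalues):.2f}% of total variance")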


In 1960, Kaiser proposed a criterion concerning eigenvalues: retain only factors with eigenvalues greater than 1. In other words, unless a factor extracts at least as much variance as the equivalent of one original variable, it is dropped. Based on this Kaiser criterion, only 2 of the principal components factors above would be retained.
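Applying this rule to the eigenvalues reported in the table above is straightforward; the sketch below simply counts how many exceed 1:

# Eigenvalues copied from the table above.
eigenvalues = [6.118369, 1.800682, 0.472888, 0.407996, 0.317222,
               0.293300, 0.195808, 0.170431, 0.137970, 0.085334]

# Kaiser criterion: keep only factors that extract more variance than one
# original standardized variable (eigenvalue > 1).
retained = [ev for ev in eigenvalues if ev > 1.0]
print("factors retained:", len(retained))   # prints 2, matching the text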

Another approach is to plot the eigenvalues in a graphical representation referred to as a scree test, first proposed by Cattell in 1966. Cattell suggested finding the place where the decrease in eigenvalues appears to level off to the right of the plot.
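A scree plot of the same eigenvalues, sketched here with matplotlib, shows the "elbow" where the curve levels off:

import matplotlib.pyplot as plt

eigenvalues = [6.118369, 1.800682, 0.472888, 0.407996, 0.317222,
               0.293300, 0.195808, 0.170431, 0.137970, 0.085334]

# Plot eigenvalues against factor number and mark the Kaiser cutoff for reference.
plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, marker="o")
plt.axhline(1.0, linestyle="--", label="eigenvalue = 1 (Kaiser criterion)")
plt.xlabel("Factor number")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.legend()
plt.show()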

The defining characteristic that distinguishes the two factor analytic models is that in principal components analysis we assume that all of the variability in an item should be used in the analysis, while in principal factors analysis we use only the variability an item has in common with the other items. The two approaches typically yield very similar results. However, for data reduction, principal components analysis is more appropriate, while for detecting structure, principal factors analysis is most often used.
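The contrast can be sketched with scikit-learn, which offers both kinds of model (simulated data; the choice of two factors and ten items is an assumption for illustration):

import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

# Ten observed items driven by two latent factors plus item-specific noise.
rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 10)) + rng.normal(size=(300, 10))

# Principal components: decomposes all of the variance (data reduction).
pca = PCA(n_components=2).fit(X)
print("PCA variance ratios:", np.round(pca.explained_variance_ratio_, 2))

# Common factor analysis: models only the shared variance, leaving unique/noise
# variance to a separate term (structure detection).
fa = FactorAnalysis(n_components=2).fit(X)
print("FA estimated unique variances:", np.round(fa.noise_variance_, 2))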

4. Limitations of the Methodology
If the correlation matrix contains variables that are 100% redundant, then the inverse of the matrix cannot be computed. For example, if a variable is the sum of two other variables selected for the analysis, then the correlation matrix of those variables cannot be inverted and the factor analysis cannot even be attempted. This can occur when you are attempting to factor analyze a set of highly intercorrelated variables, as is common with questionnaire items.

A second issue concerns missing data. If the pairwise deletion of missing data does not introduce any systematic bias into the correlation matrix, then the pairwise descriptive statistics for a given variable should all be very similar; if they differ, there is good reason to suspect a bias. For example, suppose the mean (or standard deviation) of the values of variable A that were used in calculating its correlation with variable B is much lower than the mean (or standard deviation) of the values of A that were used in calculating its correlation with variable C. We would then have good reason to suspect that the two correlations (A-B and A-C) are based on different subsets of the data, and thus that there is a bias in the correlation matrix caused by a non-random distribution of missing data.
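The redundancy problem described above is easy to demonstrate: if one variable is the exact sum of two others, the correlation matrix is singular and has no inverse. A minimal sketch with simulated data:

import numpy as np

rng = np.random.default_rng(3)
a = rng.normal(size=100)
b = rng.normal(size=100)
c = a + b   # c is 100% redundant with a and b together

# The correlation matrix of the three variables is singular: its determinant and
# smallest eigenvalue are (numerically) zero, so it cannot be inverted and the
# factor analysis cannot be run.
R = np.corrcoef(np.column_stack([a, b, c]), rowvar=False)
print("determinant of R:", np.linalg.det(R))
print("smallest eigenvalue of R:", np.linalg.eigvalsh(R).min())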


TUTORIAL TWO:
STRUCTURAL EQUATION MODELING

1. Purpose of Methodology:

There are several major applications of structural equation modeling. Some examples include the following:
· Causal modeling, or path analysis, which hypothesizes causal relationships among variables and tests the causal models with a linear equation system. The causal models can involve manifest variables, latent variables, or both (a minimal sketch of a simple path model follows this list).
· Confirmatory factor analysis, which is an extension of factor analysis in which specific hypotheses about the structure of the factor loadings and intercorrelations are tested.
· Second order factor analysis, which is a variation of factor analysis in which the correlation matrix of the common factors is itself factor analyzed to provide second order factors.
· Regression models, which are an extension of linear regression analysis in which regression weights may be constrained to be equal to each other, or to specified numerical values.
· Covariance structure models, which hypothesize that a covariance matrix has a particular form. For example, you can test the hypothesis that a set of variables all have equal variances with this procedure.
· Correlation structure models, which hypothesize that a correlation matrix has a particular form. An example is the hypothesis that the correlation matrix has the structure of a circumplex.
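The sketch below illustrates the first application, causal modeling/path analysis, in its simplest form: a three-variable path model estimated with ordinary least-squares regressions. The variable names and effect sizes are hypothetical, and dedicated SEM software would normally be used instead.

import numpy as np

# Hypothetical path model: test_difficulty -> anxiety -> performance (simulated data).
rng = np.random.default_rng(4)
n = 500
test_difficulty = rng.normal(size=n)
anxiety = 0.6 * test_difficulty + rng.normal(size=n)   # path a
performance = -0.5 * anxiety + rng.normal(size=n)      # path b

def ols_slope(x, y):
    # Slope from a simple regression of y on x (with an intercept term).
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

path_a = ols_slope(test_difficulty, anxiety)
path_b = ols_slope(anxiety, performance)
print(f"estimated paths: a = {path_a:.2f}, b = {path_b:.2f}, indirect effect = {path_a * path_b:.2f}")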

Example Research Question:
A researcher could examine children taking standardized tests and attempt to determine whether the tests cause anxiety in the children. Additionally, it can be hypothesized that test anxiety and general anxiety are correlated in some fashion.

2. Existing Studies:

It is possible to test whether variables are interrelated through a set of linear relationships by examining their variances and covariances. Researchers have developed procedures for testing whether a set of variances and covariances in a covariance matrix fits a specified structure. Structural modeling proceeds as follows (a sketch of these steps appears after the list):

· The researcher states the way that he or she believes the variables are interrelated, often with the use of a path diagram.
· The researcher works out, using a set of complex internal rules, what the implications of the model are for the variances and covariances of the variables.
· The researcher tests whether the observed variances and covariances fit the model.
· The results of the statistical testing, together with parameter estimates and standard errors for the numerical coefficients in the linear equations, are reported.
· The researcher decides, based on this information, whether the model is a good fit to the data.
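A minimal sketch of steps two through four, assuming the simplest possible case: a one-factor model whose implied covariance matrix (loadings times loadings transposed, plus unique variances) is fitted to the observed covariance matrix by least squares. Real analyses would use dedicated SEM software and formal fit statistics; the data here are simulated.

import numpy as np
from scipy.optimize import minimize

# Three observed variables driven by one latent factor (simulated).
rng = np.random.default_rng(5)
latent = rng.normal(size=(400, 1))
data = latent @ np.array([[0.8, 0.7, 0.6]]) + 0.5 * rng.normal(size=(400, 3))

S = np.cov(data, rowvar=False)   # observed covariance matrix

def discrepancy(params):
    # Step 2: covariances implied by the model for given loadings and unique variances.
    loadings, uniques = params[:3], params[3:]
    implied = np.outer(loadings, loadings) + np.diag(uniques)
    # Step 3: how far the observed covariances are from the implied ones.
    return np.sum((S - implied) ** 2)

# Step 4: estimate the coefficients by minimizing the discrepancy.
result = minimize(discrepancy, x0=np.ones(6), method="Nelder-Mead")
print("estimated loadings:", np.round(result.x[:3], 2))
print("remaining discrepancy:", round(result.fun, 4))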


3. How and Why it Works?:

A researcher cannot expect all structural models to fit perfectly, for many reasons. A structural model with linear equations is only an approximation; the true relationships between variables are most likely nonlinear, and many of the statistical assumptions are also somewhat suspect. The most important question a researcher can ask is whether the trends in the data reflect an approximation that is realistic with respect to the real world.

Path diagrams play an important role in structural modeling. Path diagrams are similar to flowcharts: they show variables interconnected with lines (arrows) that indicate causal flow. The purpose of the path diagram is to show which variables are hypothesized to cause changes in other variables.
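As a small illustration, a path diagram can be represented in code as a list of arrows, from which the corresponding linear equations follow directly (the diagram below is hypothetical):

# Each pair means "cause -> effect" in the path diagram.
paths = [
    ("test_difficulty", "test_anxiety"),
    ("general_anxiety", "test_anxiety"),
    ("test_anxiety", "performance"),
]

# Collect, for every effect variable, the variables with arrows pointing at it,
# then print the linear equation the diagram implies for that variable.
equations = {}
for cause, effect in paths:
    equations.setdefault(effect, []).append(cause)

for effect, causes in equations.items():
    terms = " + ".join(f"b_{c}*{c}" for c in causes)
    print(f"{effect} = {terms} + error")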

4. Limitations of the Methodology:
It is important to consider that simply because a model fits the data does not mean that the model is necessarily accurate; one cannot prove that a model is true. For example, the statement "If I am a giraffe, then I have freckles" may be true, but "I have freckles" does not imply that I am a giraffe. Similarly, if a specified causal model is true, then it will fit the data, but the model fitting the data does not imply that the model is the most accurate representation. There may be other models that fit the data equally well.

Causal modeling allows us to examine the extent to which the data fail to agree with one reasonably viable consequence of a model of causality. If the system of linear equations isomorphic to the path diagram fits the data well, that is a good sign, but it is not proof of the truth of the causal model.