NCSS Statistical Software
NCSS.com
NCSS, LLC. All Rights Reserved.

Chapter 415
Multivariate Analysis of Variance (MANOVA)

Introduction

Multivariate analysis of variance (MANOVA) is an extension of common analysis of variance (ANOVA). In ANOVA, differences among various group means on a single response variable are studied. In MANOVA, the number of response variables is increased to two or more. The hypothesis concerns a comparison of vectors of group means. When only two groups are being compared, the results are identical to Hotelling's T² procedure.

The multivariate extension of the F-test is not completely direct. Instead, several test statistics are available, such as Wilks' Lambda and Lawley's trace. The actual distributions of these statistics are difficult to calculate, so we rely on approximations based on the F-distribution.

Technical Details

A MANOVA has one or more factors (each with two or more levels) and two or more dependent variables. The calculations are extensions of the general linear model approach used for ANOVA.

Unlike the univariate situation in which there is only one statistical test available (the F-ratio), the multivariate situation provides several alternative statistical tests. We will describe these tests in terms of two matrices, H and E. H is called the hypothesis matrix and E is the error matrix. These matrices may be computed using a number of methods. In NCSS, we use the standard general linear models (GLM) approach in which a sum of squares and cross-products matrix is computed. This matrix is based on the dependent variables and independent variables generated for each degree of freedom in the model. It may be partitioned according to the terms in the model.

MANOVA Test Statistics

For a particular p-variable multivariate test, assume that the matrices H and E have h and e degrees of freedom, respectively. Four tests may be defined as follows. See Seber (1984) for details. Let $\theta_i$, $\phi_i$, and $\lambda_i$ be the eigenvalues of $H(E+H)^{-1}$, $HE^{-1}$, and $E(E+H)^{-1}$, respectively.

Note that these eigenvalues are related as follows:

$$\theta_i = 1 - \lambda_i = \frac{\phi_i}{1 + \phi_i}$$

$$\phi_i = \frac{\theta_i}{1 - \theta_i} = \frac{1 - \lambda_i}{\lambda_i}$$

$$\lambda_i = 1 - \theta_i = \frac{1}{1 + \phi_i}$$
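The eigenvalue relations above can be checked numerically. The following is a minimal sketch using NumPy, not the NCSS implementation; the function name `manova_eigenvalues` and the example matrices in the usage note are hypothetical and chosen only for illustration.

```python
import numpy as np

def manova_eigenvalues(H, E):
    """Return (theta, phi, lam) for hypothesis matrix H and error matrix E.

    theta: eigenvalues of H (E + H)^-1, sorted in descending order
    phi:   eigenvalues of H E^-1,       sorted in descending order
    lam:   eigenvalues of E (E + H)^-1, sorted in ascending order,
           so that lam[i] pairs with theta[i] and phi[i]
    """
    total_inv = np.linalg.inv(E + H)
    theta = np.sort(np.linalg.eigvals(H @ total_inv).real)[::-1]
    phi = np.sort(np.linalg.eigvals(H @ np.linalg.inv(E)).real)[::-1]
    lam = np.sort(np.linalg.eigvals(E @ total_inv).real)
    return theta, phi, lam
```

For example, with the hypothetical matrices `H = np.array([[8.0, 4.0], [4.0, 6.0]])` and `E = np.array([[20.0, 5.0], [5.0, 15.0]])`, the returned arrays satisfy `theta == 1 - lam` and `theta == phi / (1 + phi)` up to rounding error.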

Wilks' Lambda

Define Wilks' Lambda as follows:

$$\Lambda_{p,h,e} = \frac{|E|}{|E + H|} = \prod_{j=1}^{p} (1 - \theta_j)$$

with $e \ge p$.

The following approximation based on the F-distribution is used to determine significance levels:

$$F_{ph,\; ft - g} = \frac{(ft - g)\,(1 - \Lambda^{1/t})}{ph\,\Lambda^{1/t}}$$

where

$$f = e - \tfrac{1}{2}(p - h + 1)$$

$$g = \frac{ph - 2}{2}$$

$$t = \begin{cases} \sqrt{\dfrac{p^2 h^2 - 4}{p^2 + h^2 - 5}} & \text{if } p^2 + h^2 - 5 > 0 \\ 1 & \text{otherwise} \end{cases}$$

This approximation is exact if $p \le 2$ or $h \le 2$.

Lawley-Hotelling Trace

The trace statistic, $T_g^2$, is defined as follows:

$$T_g^2 = e \sum_{j=1}^{s} \phi_j$$

where

$$s = \min(p, h)$$

The following approximation based on the F-distribution is used to determine significance levels:

$$F_{a,b} = \frac{T_g^2}{c\,e}$$

where

$$a = ph$$

$$b = 4 + \frac{a + 2}{B - 1}$$

$$c = \frac{a(b - 2)}{b(e - p - 1)}$$

$$B = \frac{(e + h - p - 1)(e - 1)}{(e - p - 3)(e - p)}$$
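The two F approximations above translate directly into code. This is a sketch under stated assumptions rather than the NCSS routine: the function names are hypothetical, H and E are assumed to be supplied as p-by-p NumPy arrays, and the Lawley-Hotelling formulas require e > p + 3 so that B is defined.

```python
import math
import numpy as np

def wilks_lambda_f(H, E, h, e):
    """Wilks' Lambda and its F approximation with (ph, ft - g) degrees of freedom."""
    p = H.shape[0]
    lam = np.linalg.det(E) / np.linalg.det(E + H)
    f = e - 0.5 * (p - h + 1)
    g = (p * h - 2) / 2
    d = p**2 + h**2 - 5
    t = math.sqrt((p**2 * h**2 - 4) / d) if d > 0 else 1.0
    df1 = p * h
    df2 = f * t - g
    root = lam ** (1 / t)
    F = df2 * (1 - root) / (df1 * root)
    return lam, F, df1, df2

def lawley_hotelling_f(H, E, h, e):
    """Lawley-Hotelling trace and its F approximation with (a, b) degrees of freedom."""
    p = H.shape[0]
    # The sum of the eigenvalues of H E^-1 equals its trace; eigenvalues
    # beyond the first s = min(p, h) are zero and do not contribute.
    T2 = e * np.trace(H @ np.linalg.inv(E))
    a = p * h
    B = (e + h - p - 1) * (e - 1) / ((e - p - 3) * (e - p))
    b = 4 + (a + 2) / (B - 1)
    c = a * (b - 2) / (b * (e - p - 1))
    F = T2 / (c * e)
    return T2, F, a, b
```

When h = 1, both functions return the same F-ratio with the same degrees of freedom (p and e - p + 1), which is one way to sanity-check an implementation.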

Pillai's Trace

Pillai's trace statistic, $V^{(s)}$, is defined as follows:

$$V^{(s)} = \sum_{j=1}^{s} \theta_j = \operatorname{tr}\left(H(E + H)^{-1}\right)$$

where

$$s = \min(p, h)$$

The following approximation based on the F-distribution is used to determine significance levels:

$$F_{s(2m+s+1),\; s(2n+s+1)} = \frac{(2n + s + 1)\,V^{(s)}}{(2m + s + 1)\,(s - V^{(s)})}$$

where

$$s = \min(p, h)$$

$$m = \frac{|p - h| - 1}{2}$$

$$n = \frac{e - p - 1}{2}$$

Roy's Largest Root

Roy's largest root, $\phi_{\max}$, is defined as the largest of the $\phi_i$'s. The following approximation based on the F-distribution is used to determine significance levels:

$$F_{(2\nu_1 + 2),\,(2\nu_2 + 2)} = \frac{2\nu_2 + 2}{2\nu_1 + 2}\,\phi_{\max}$$

where

$$s = \min(p, h)$$

$$\nu_1 = \frac{|p - h| - 1}{2}$$

$$\nu_2 = \frac{e - p - 1}{2}$$

Which Test to Use

When the hypothesis degrees of freedom, h, is one, all four test statistics will lead to identical results. When h > 1, the four statistics will usually lead to the same result. When they do not, the following guidelines from Tabachnick (1989) may be of some help.

Wilks' Lambda, Lawley's trace, and Roy's largest root are often more powerful than Pillai's trace if h > 1 and one dimension accounts for most of the separation among groups. Pillai's trace is more robust to departures from assumptions than the other three.

Tabachnick (1989) provides the following checklist for conducting a MANOVA. We suggest that you consider these issues and guidelines carefully.
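The formulas for Pillai's trace and Roy's largest root can be sketched in the same way. The code below follows the formulas as written in this chapter; the function names are hypothetical, and H and E are assumed to be p-by-p NumPy arrays. It is an illustration, not the NCSS implementation.

```python
import numpy as np

def pillai_f(H, E, h, e):
    """Pillai's trace V and its F approximation."""
    p = H.shape[0]
    s = min(p, h)
    V = np.trace(H @ np.linalg.inv(E + H))
    m = (abs(p - h) - 1) / 2
    n = (e - p - 1) / 2
    df1 = s * (2 * m + s + 1)
    df2 = s * (2 * n + s + 1)
    F = (2 * n + s + 1) * V / ((2 * m + s + 1) * (s - V))
    return V, F, df1, df2

def roy_f(H, E, h, e):
    """Roy's largest root phi_max and its F approximation."""
    p = H.shape[0]
    phi_max = np.linalg.eigvals(H @ np.linalg.inv(E)).real.max()
    nu1 = (abs(p - h) - 1) / 2
    nu2 = (e - p - 1) / 2
    df1 = 2 * nu1 + 2
    df2 = 2 * nu2 + 2
    F = df2 / df1 * phi_max
    return phi_max, F, df1, df2
```

With a rank-one hypothesis matrix (h = 1), the two functions return the same F-ratio and the same degrees of freedom, which matches the statement that all four tests agree when h is one.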

Assumptions and Limitations

The following assumptions are made when using a MANOVA.

1. The response variables are continuous.
2. The residuals follow the multivariate-normal probability distribution with means equal to zero.
3. The variance-covariance matrices of each group of residuals are equal.
4. The individuals are independent.

Multivariate Normality and Outliers

MANOVA is robust to a modest amount of skewness in the data. A sample size that produces 20 degrees of freedom in the univariate F-test is adequate to ensure robustness. Non-normality caused by the presence of outliers can cause severe problems that even the robustness of the test will not overcome. You should screen your data for outliers and run it through various univariate and multivariate normality tests and plots to determine if the normality assumption is reasonable.

Homogeneity of Covariance Matrices

MANOVA makes the assumption that the within-cell (group) covariance matrices are equal. If the design is balanced so that there is an equal number of observations in each cell, the robustness of the MANOVA tests is guaranteed. If the design is unbalanced, you should test the equality of covariance matrices using Box's M test. If this test is significant at less than .001, there may be severe distortion in the alpha levels of the tests. You should only use Pillai's trace criterion in this situation.

Linearity

MANOVA assumes linear relationships among the dependent variables within a particular cell. You should study scatter plots of each pair of dependent variables using a different color for each level of a factor. Look carefully for curvilinear patterns and for outliers. The occurrence of curvilinear relationships will reduce the power of the MANOVA tests.

Multicollinearity and Singularity

Multicollinearity occurs when one dependent variable is almost a weighted average of the others.
This collinearity may only show up when the data are considered one cell at a time. The R²-Other Y's in the Within-Cell Correlations Analysis report lets you determine if multicollinearity is a problem. If this R² value is greater than .99 for any variable, you should take corrective action (remove one of the variables). To correct for multicollinearity, begin removing the variables one at a time until all of the R²'s are less than .99. Do not remove them all at once! Singularity is the extreme form of multicollinearity in which the R² value is one.

Forms of multicollinearity may show up when you have very small cell sample sizes (when the number of observations is less than the number of variables). In this case, you must reduce the number of dependent variables.
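The .99 screening rule can be illustrated with a short sketch. It assumes that the R²-Other Y's value is the usual squared multiple correlation of each variable with the remaining variables, which can be read off the inverse of the correlation matrix as 1 - 1/(R⁻¹)ᵢᵢ. The function names and the `threshold` parameter are hypothetical; NCSS may compute the quantity differently.

```python
import numpy as np

def r_squared_other(corr):
    """Squared multiple correlation of each variable with all the others.

    corr is a within-cell correlation matrix. A singular matrix (R-squared
    equal to one) cannot be inverted and raises numpy.linalg.LinAlgError.
    """
    inv_diag = np.diag(np.linalg.inv(corr))
    return 1.0 - 1.0 / inv_diag

def flag_multicollinear(corr, names, threshold=0.99):
    """Return the names of variables whose R-squared exceeds the threshold."""
    r2 = r_squared_other(corr)
    return [name for name, value in zip(names, r2) if value > threshold]
```

For two variables with correlation 0.5, each R² is 0.25 and nothing is flagged. With a correlation of 0.999, each R² is about 0.998 and both variables are flagged, in which case the guidance above is to remove one variable and re-check.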

Data Structure

The data must be entered in a format that places the dependent variables and values of each factor side by side. An example of the data for a MANOVA design is shown in the table below. In this example, WRATR and WRATA are the two dependent variables. Treatment and Disability are two factor variables. This database is stored in the file MANOVA1.

[Table: MANOVA1 dataset, with columns WRATR, WRATA, Treatment, and Disability]

Unequal Sample Size and Missing Data

You should begin by screening your data. Pay particular attention to patterns of missing values. When using MANOVA, you should have more observations per factor category than you have dependent variables so that you can test the equality of covariance matrices using Box's M test.

NCSS ignores rows with missing values. If it appears that most of the missing values occur in one or two variables, you might want to leave these out of the analysis in order to obtain more data and hence more power.

NCSS uses the GLM procedure for calculating the hypothesis and error matrices. Each matrix is calculated as if it were fit last in the model. This is the recommended way of obtaining these matrices. This method is valid even when the sample sizes for the various groups are unequal.

Example 1 – Multivariate Analysis of Variance

This section presents an example of how to run an analysis of the data contained in the MANOVA1 dataset.

Setup

To run this example, complete the following steps:

1. Open the MANOVA1 example dataset
   - From the File menu of the NCSS Data window, select Open Example Data.
   - Select MANOVA1 and click OK.
2. Specify the Multivariate Analysis of Variance (MANOVA) procedure options
   - Find and open the Multivariate Analysis of Variance (MANOVA) procedure using the menus or the Procedure Navigator.
   - The settings for this example are listed below and are stored in the Example 1 settings template. To load this template, click Open Example Template in the Help Center or File menu.

   Option                  Value
   Variables Tab
   Response Variables      WRATR-WRATA
   Factor Variable 1       Treatment
   Factor Variable 2       Disability

3. Run the procedure
   - Click the Run button to perform the calculations and generate the output.

Expected Mean Squares Section

Source Term      Denominator Term    Expected Mean Square
A: Treatment     S(AB)               S+bsA
B: Disability    S(AB)               S+asB
AB               S(AB)               S+sAB
S(AB)                                S

[The DF and Term Fixed? columns of this table were not recovered.]

Note: Expected Mean Squares are for the balanced cell-frequency case.

The Expected Mean Square expressions are provided to show the appropriate error term for each factor. The correct error term for a factor is that term that is identical except for the factor being tested.

Source Term
The source of variation or term in the model.

DF
The degrees of freedom. The number of observations "used" by this term.

Term Fixed?
Indicates whether the term is "fixed" or "random."

Denominator Term
Indicates the term used as the denominator in the F-ratio.

Expected Mean Square
This expression represents the expected value of the corresponding mean square if the design was completely balanced. "S" represents the expected value of the mean square error (sigma). The uppercase letters represent either the adjusted sum of squared treatment means if the factor is fixed, or the variance component if the factor is random. The lowercase letter represents the number of levels for that factor, and "s" represents the number of replications of the experimental layout.

These EMS expressions are provided to determine the appropriate error term for each factor. The correct error term for a factor is that term whose EMS is identical except for the factor being tested.

MANOVA Tests Section

[Table: MANOVA Tests Section. For each term (A(1): Treatment, B(2): Disability, and the interaction), the report lists Wilks' Lambda, Hotelling-Lawley Trace, Pillai's Trace, and Roy's Largest Root, followed by the univariate F-tests, with the decision at the 0.05 level. The numeric values were not recovered.]

This report gives the results of the various significance tests. Usually, the four multivariate tests will lead to the same conclusions. When they do not, refer to the discussion of these tests found earlier in this chapter. Once a multivariate test has found a term significant, use the univariate ANOVA to determine which of the variables and factors are "causing" the significance.

Term(DF)
The term in the design model with the degrees of freedom of the term in parentheses.

Test Statistic
The name of the statistical test shown on this row of the report. The four multivariate tests are followed by the univariate F-tests of each variable.

Test Value
The value of the test statistic.

DF1
The numerator degrees of freedom of the F-ratio corresponding to this test.

DF2
The denominator degrees of freedom of the F-ratio corresponding to this test.

F-Ratio
The value of the F-test corresponding to this test. In some cases, this is an exact test. In other cases, this is an approximation to the exact test. See the discussion of each test to determine if it is exact or approximate.

Prob Level
The significance level of the above F-ratio. The probability of an F-ratio larger than that obtained by this analysis. For example, to test at an alpha of 0.05, this probability would have to be less than 0.05 to make the F-ratio significant.

Decision(0.05)
The decision to accept or reject the null hypothesis at the given level of significance. Note that you specify the level of significance when you select Alpha.

Within Correlations\Covariances Section

[Table: Within Correlations\Covariances for WRATR and WRATA. The numeric values were not recovered.]

This report displays the correlations and covariances formed by averaging across all the individual group covariance matrices. The correlations are shown in the lower-left half of the matrix. The within-group covariances are shown on the diagonal and in the upper-right half of the matrix.

Within-Cell Correlations Analysis Section

Variable    Canonical Variate    Eigenvalue    Percent of Total    Cumulative Total
WRATR       1                    1.057231      52.86               52.86
WRATA       2                    0.942769      47.14               100.00

[The R-Squared Other Y's column of this table was not recovered.]

This report analyzes the within-cell correlation matrix. It lets you diagnose multicollinearity problems as well as determine the number of dimensions that are being used. This is useful in determining if Pillai's trace should be used.

R-Squared Other Y's
This is the R-Squared index of this variable with the other variables. When this value is larger than 0.99, severe multicollinearity problems exist. If this happens, you should remove the variable with the largest R-Squared and re-run your analysis.

Canonical Variate
The identification numbers of the canonical variates that are generated during the analysis.
The total number of variates is the smaller of the number of variables and the number of degrees of freedom in the model.

Eigenvalue
The eigenvalues of the within correlation matrix. Note that this value is not associated with the variable at the beginning of the row, but rather with the canonical variate number directly to the left.
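The eigenvalue and percentage columns of the Within-Cell Correlations Analysis report can be reproduced from a correlation matrix with a few lines of NumPy. This is a sketch only; the function name is hypothetical, and the within-cell correlation of 0.057231 used in the usage note is an assumed value, chosen because a two-variable correlation matrix with that off-diagonal entry has percentages of 52.86 and 47.14 like those in the example report.

```python
import numpy as np

def eigen_summary(corr):
    """Eigenvalues of a correlation matrix with percent and cumulative percent.

    Eigenvalues are returned in descending order. Their sum equals the
    number of variables, because the trace of a correlation matrix is
    its dimension.
    """
    eig = np.sort(np.linalg.eigvalsh(corr))[::-1]
    pct = 100.0 * eig / eig.sum()
    return eig, pct, np.cumsum(pct)
```

For example, `eigen_summary(np.array([[1.0, 0.057231], [0.057231, 1.0]]))` gives eigenvalues of 1 + r and 1 - r for the assumed r. Two eigenvalues this close to each other indicate that no single dimension dominates.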

Percent of Total
The percent that the eigenvalue is of the total. Note that the sum of the eigenvalues will equal the number of variates. If the percentage accounted for by the first eigenvalue is relatively large (70 or 80 percent), Pillai's trace will be less powerful than the other three multivariate tests.

Cumulative Total
The cumulative total of the Percent of Total column.

Univariate Analysis of Variance Section

Analysis of Variance Table for WRATR

Source Term          DF    Power (Alpha=0.05)
A: Treatment         1     0.999988
B: Disability        2     0.763859
AB                   2     0.052757
S                    12
Total (Adjusted)     17
Total                18
* Term significant at alpha = 0.05

Analysis of Variance Table for WRATA

Source Term          DF    Power (Alpha=0.05)
A: Treatment         1     0.999519
B: Disability        2     0.981144
AB                   2     0.125682
S                    12
Total (Adjusted)     17
Total                18
* Term significant at alpha = 0.05

[The remaining columns of both tables, from Sum of Squares through the significance level, were not recovered.]

This is the standard ANOVA report as documented in the General Linear Models chapter. A separate report is displayed for each of the dependent variables.

Means and Plots Section

Means and Standard Errors of WRATR

Term                         Count    Mean        Standard Error
All                          18       89.11111
A: Treatment
  1                          9        99.88889    2.234687
  2                          9        78.33334    2.234687
B: Disability                                     2.736922
AB: Treatment, Disability                         3.870592

Means and Standard Errors of WRATA

Term                         Count    Mean        Standard Error
All                          18       87.22222
A: Treatment
  1                          9        96.33334    2.234687
  2                          9        78.11111    2.234687
B: Disability
AB: Treatment, Disability                         3.870592

[The individual level rows for Disability and for the Treatment-by-Disability cells were not recovered; only the standard errors shown survive.]

Plots Section

This report provides the least-squares means and standard errors for each variable. Note that the standard errors are calculated from the mean square error of the ANOVA table. They are not the standard errors that would be calculated from the individual cells.
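The statement that the standard errors come from the mean square error can be checked against the WRATR values reported in this example. The sketch below assumes the usual formula SE = sqrt(MSE / n) for a mean based on n observations and assumes a balanced design (18 rows, so 9 per Treatment level, 6 per Disability level, and 3 per cell). The mean square error itself is not available here, so it is back-calculated from the reported Treatment standard error; the function name is hypothetical.

```python
import math

def se_of_mean(mse, count):
    """Standard error of a mean of `count` observations from a pooled MSE."""
    return math.sqrt(mse / count)

# Back-calculate the pooled MSE from the reported Treatment standard error
# (2.234687 with 9 observations per level). This value is inferred, not reported.
mse = 9 * 2.234687 ** 2

# Assumed balanced counts: 6 observations per Disability level, 3 per cell.
se_disability = se_of_mean(mse, 6)
se_cell = se_of_mean(mse, 3)
```

Under these assumptions, `se_disability` and `se_cell` agree with the reported standard errors of 2.736922 and 3.870592 to about six decimal places, which is consistent with a single pooled error term being used for every mean.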
