
Common method bias in PLS-SEM: A full collinearity assessment approach

Ned Kock

Full reference:
Kock, N. (2015). Common method bias in PLS-SEM: A full collinearity assessment approach. International Journal of e-Collaboration, 11(4), 1-10.

Abstract

We discuss common method bias in the context of structural equation modeling employing the partial least squares method (PLS-SEM). Two datasets were created through a Monte Carlo simulation to illustrate the discussion: one contaminated by common method bias, and the other not contaminated. A practical approach is presented for the identification of common method bias based on variance inflation factors generated via a full collinearity test. Our discussion builds on an illustrative model in the field of e-collaboration, with outputs generated by the software WarpPLS. We demonstrate that the full collinearity test is successful in the identification of common method bias with a model that nevertheless passes standard convergent and discriminant validity assessment criteria based on a confirmatory factor analysis.

KEYWORDS: Partial Least Squares; Structural Equation Modeling; Common Method Bias; Monte Carlo Simulation

Introduction

The method of path analysis was developed by Wright (1934; 1960) to study causal assumptions in the field of evolutionary biology (Kock, 2011), and now provides the foundation on which structural equation modeling (SEM) rests. Both path analysis and SEM rely on the creation of models expressing causal relationships through links among variables.

Two main types of SEM exist today: covariance-based and PLS-based SEM. While the former relies on the minimization of differences between covariance matrices, the latter employs the partial least squares method (PLS) developed by Herman Wold (Wold, 1980). PLS-based SEM is often referred to simply as PLS-SEM, and is widely used in the field of e-collaboration and many other fields.

Regardless of SEM flavor, models expressing causal assumptions include latent variables. These latent variables are measured indirectly through other variables generally known as indicators (Maruyama, 1998; Mueller, 1996). Indicator values are usually obtained from questionnaires where answers are provided on numeric scales, of which the most commonly used are Likert-type scales (Cohen et al., 2003).

Using questionnaires answered on Likert-type scales constitutes an integral part of an SEM study's measurement method. Common method bias is a phenomenon that is caused by the measurement method used in an SEM study, and not by the network of causes and effects among latent variables in the model being studied.

We provide a discussion of common method bias in PLS-SEM, and of a method for its identification based on full collinearity tests (Kock & Lynn, 2012). Our discussion builds on an illustrative model in the field of e-collaboration, with outputs from the software WarpPLS, version 5.0 (Kock, 2015).

The algorithm used to generate latent variable scores based on indicators was PLS Mode A, employing the path weighting scheme. While this is the algorithm-scheme combination most commonly used in PLS-SEM, it is by no means the only combination available.
The recent emergence of factor-based PLS-SEM algorithms further broadened the space of existing combinations (Kock, 2014).

We created two datasets based on a Monte Carlo simulation (Robert & Casella, 2005; Paxton et al., 2001). One of the two datasets was contaminated by common method bias; the other was not. We demonstrate that the full collinearity test is successful in the identification of common method bias with a model that nevertheless passes standard validity assessment criteria based on a confirmatory factor analysis.

In our discussion all variables are assumed to be standardized; i.e., scaled to have a mean of zero and standard deviation of one. This has no impact on the generality of the discussion. Standardization of any variable is accomplished by subtraction of its mean and division by its standard deviation. A standardized variable can be rescaled back to its original scale by reversing these operations.

What is common method bias?

Common method bias, in the context of PLS-SEM, is a phenomenon that is caused by the measurement method used in an SEM study, and not by the network of causes and effects in the model being studied. For example, the instructions at the top of a questionnaire may influence the answers provided by different respondents in the same general direction, causing the indicators to share a certain amount of common variation. Another possible cause of common method bias is the implicit social desirability associated with answering questions in a questionnaire in a particular way, again causing the indicators to share a certain amount of common variation.

A mathematical understanding of common method bias can clarify some aspects of its nature. The adoption of an illustrative model can help reduce the level of abstraction of a mathematical exposition. Therefore, our discussion is based on the illustrative model depicted in Figure 1, which is inspired by an actual empirical study in the field of e-collaboration (Kock, 2005; 2008; Kock & Lynn, 2012). The illustrative model incorporates three latent variables, each measured through six indicators. It assumes that the unit of analysis is the firm.

Figure 1. Illustrative model

The latent variables are: collaborative culture ($F_1$), the perceived degree to which a firm's culture promotes continuous collaboration among its members to improve the firm's productivity and the quality of the firm's products; e-collaboration technology use ($F_2$), the perceived degree of use of e-collaboration technologies by the members of a firm; and competitive advantage ($F_3$), the perceived degree of competitive advantage that a firm possesses when compared with firms that compete with it.

Mathematically, if our model were not contaminated with common method bias, each of the six indicators $x_{ij}$ would be derived from its latent variable $F_i$ (of which there are three in the model) according to (1), where: $\lambda_{ij}$ is the loading of indicator $x_{ij}$ on $F_i$, $\theta_{ij}$ is the standardized indicator error term, and $\omega_{\theta_{ij}}$ is the weight of $\theta_{ij}$ with respect to $x_{ij}$.

$$x_{ij} = \lambda_{ij} F_i + \omega_{\theta_{ij}} \theta_{ij}, \quad i = 1 \ldots 3, \; j = 1 \ldots 6. \tag{1}$$

Since $\theta_{ij}$ and $F_i$ are assumed to be uncorrelated, the value of $\omega_{\theta_{ij}}$ in this scenario can be easily obtained as:

$$\omega_{\theta_{ij}} = \sqrt{1 - \lambda_{ij}^2}.$$

If our model were contaminated with common method bias, each of the six indicators $x_{ij}$ would be derived from its latent variable $F_i$ according to (2), where: $M$ is a standardized variable that represents common method variation, and $\omega_M$ is the common method weight (a.k.a. common method loading, or the positive square root of the common method variance).

$$x_{ij} = \lambda_{ij} F_i + \omega_M M + \omega_{\theta_{ij}} \theta_{ij}, \quad i = 1 \ldots 3, \; j = 1 \ldots 6. \tag{2}$$

In this scenario, the value of $\omega_{\theta_{ij}}$ can be obtained as:

$$\omega_{\theta_{ij}} = \sqrt{1 - \lambda_{ij}^2 - \omega_M^2}.$$

In (2) we assume that the common method weight $\omega_M$ is the same for all indicators. An alternative perspective assumes that the common method weight $\omega_M$ is not the same for all indicators, varying based on a number of factors. Two terms are used to refer to these different perspectives, namely congeneric and noncongeneric, although there is some confusion in the literature as to which term refers to what perspective.

Note that the term $\omega_M M$ introduces common variation that is shared by all indicators in the model. Since latent variables aggregate indicators in PLS-SEM, this shared variation has the effect of artificially increasing the level of collinearity among latent variables. As we will see later, this also has the predictable effect of artificially increasing path coefficients.

Data used in the analysis

We created two datasets of 300 rows of data, equivalent to 300 returned questionnaires, with answers provided on Likert-type scales going from 1 to 7. This was done based on a Monte Carlo simulation (Robert & Casella, 2005; Paxton et al., 2001). The data was created for the three latent variables and the eighteen indicators (six per latent variable) in our illustrative model.

Using this method we started from a "true" model, which is a model for which we know the nature and magnitude of all of the relationships among variables beforehand. One of the two datasets was contaminated by common method bias; the other was not. In both datasets path coefficients and loadings were set as follows:

$$\beta_{21} = \beta_{31} = \beta_{32} = .45.$$

$$\lambda_{ij} = .7, \quad i = 1 \ldots 3, \; j = 1 \ldots 6.$$

That is, all path coefficients were set as .45 and all indicator loadings as .7.
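
The data-generating equations above can be made concrete with a short simulation. The following is a minimal sketch in Python with NumPy, not the procedure actually used to create the paper's datasets: it draws normally distributed latent variables, builds indicators according to (2), and leaves the common method weight as a parameter (a value of zero reproduces (1)). The function names, the seed, and the normality assumption are illustrative choices, and the sketch does not discretize values into 1-to-7 Likert-type answers.

```python
import numpy as np


def simulate_indicators(n=300, loading=0.7, beta=0.45, omega_m=0.0, seed=0):
    """Return an (n, 18) array of standardized indicators, six per latent variable."""
    rng = np.random.default_rng(seed)

    # Structural model: F1 -> F2, F1 -> F3, F2 -> F3, all paths equal to beta.
    f1 = rng.standard_normal(n)
    f2 = beta * f1 + np.sqrt(1 - beta**2) * rng.standard_normal(n)
    # Variance of F3 explained by F1 and F2 (corr(F1, F2) equals beta).
    explained = 2 * beta**2 + 2 * beta**3
    f3 = beta * f1 + beta * f2 + np.sqrt(1 - explained) * rng.standard_normal(n)

    # Common method factor M, shared by every indicator in the model.
    m = rng.standard_normal(n)
    # Error weight that keeps each indicator at unit variance.
    omega_theta = np.sqrt(1 - loading**2 - omega_m**2)

    blocks = []
    for f in (f1, f2, f3):
        theta = rng.standard_normal((n, 6))  # indicator error terms
        blocks.append(loading * f[:, None] + omega_m * m[:, None] + omega_theta * theta)
    return np.hstack(blocks)


def mean_cross_block_corr(x):
    """Average correlation between the indicators of F1 and those of F2."""
    r = np.corrcoef(x, rowvar=False)
    return r[:6, 6:12].mean()
```

With `omega_m` set to zero, the expected correlation between an indicator of $F_1$ and an indicator of $F_2$ is $\lambda^2 \beta_{21} = .49 \times .45 = .2205$. A nonzero common method weight adds $\omega_M^2$ to every between-indicator correlation, which is the shared variation that raises collinearity among the latent variables.
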
In the dataset contaminated by common method bias, the common method weight was set to a value slightly lower than the indicator loadings:

$$\omega_M = .6.$$

In Monte Carlo simulations where samples of finite size are created, true sample coefficients vary. Usually true sample coefficients vary according to a normal distribution centered on the true population value. Given this, and since we created a single sample of simulated data, our true sample coefficients differed from the true population coefficients.

Nevertheless, when we compared certain coefficients obtained via a PLS-SEM analysis for the two datasets, with and without contamination, the effects of common method bias became visible. This is particularly true for path coefficients, which tend to be inflated by common method bias. As noted earlier, path coefficient inflation is a predictable outcome of shared variation among latent variables.

Path coefficient inflation

Table 1 shows the path coefficients for the models not contaminated by common method bias (No CMB) and contaminated (CMB). As we can see, all three path coefficients were greater in the model contaminated by common method bias. The differences between corresponding path coefficients in the two models ranged from a little over 20 to nearly 40 percent.

Table 1. Path coefficients

          $\beta_{21}$   $\beta_{31}$   $\beta_{32}$
No CMB    .447           .409           .357
CMB       .625           .512           .435

Note: CMB = common method bias.

This path coefficient inflation effect is one of the key reasons why researchers are concerned about common method bias, as it may cause type I errors (false positives). Nevertheless, common method bias may also be associated with path coefficient deflation, potentially leading to type II errors (false negatives).

As we can see, the inflation effect can lead to marked differences in path coefficients. In the case of the path coefficient $\beta_{21}$, the difference is approximately 39.82 percent. As noted earlier, path coefficient inflation occurs because common variation is introduced, being shared by all indicators in the model. As latent variables aggregate indicators, they also incorporate the common variation, leading to an increase in the level of collinearity among latent variables. Greater collinearity levels in turn lead to inflated path coefficients.

One of the goals of a confirmatory factor analysis is to assess two main types of validity in a model: convergent and discriminant validity. Acceptable convergent validity occurs when indicators load strongly on their corresponding latent variables.
Acceptable discriminant validity occurs when the correlations among a latent variable and other latent variables in a model are lower than a measure of communality among the latent variable indicators.

Given these expectations underlying acceptable convergent and discriminant validity, one could expect that a confirmatory factor analysis would allow for the identification of common method bias. In fact, many researchers in the past have proposed the use of confirmatory factor analysis as a more desirable alternative to Harman's single-factor test, a widely used common method bias test that relies on exploratory factor analysis. Unfortunately, as we will see in the next section, conducting a confirmatory factor analysis is not a very effective way of identifying common method bias. Models may pass criteria for acceptable convergent and discriminant validity, and still be contaminated by common method bias.

Confirmatory factor analysis

Table 2 is a combined display showing loadings and cross-loadings. Loadings, shown in shaded cells, are unrotated. Cross-loadings are oblique-rotated. Acceptable convergent validity would normally be assumed if the loadings were all above a certain threshold, typically .5. As we can see, all loadings pass this test. This is the case for both models, with and without common method bias contamination. That is, both models present acceptable convergent validity.

These results highlight one interesting aspect of the common method bias phenomenon in the context of PLS-SEM. There appears to be a marked inflation in loadings, similar to what was observed for path coefficients. Since convergent validity relies on the comparison of loadings against a fixed threshold, it follows that common method bias would tend to artificially increase the level of convergent validity of a model.

Table 3 shows correlations among latent variables and square roots of average variances extracted (AVEs). The latter are shown in shaded cells, along diagonals. Acceptable discriminant validity would typically be assumed if the number in the diagonal cell for each column is greater than any of the other numbers in the same column.

Table 2. Assessing convergent validity
[Table body not recoverable from the transcription.]
Notes: CMB = common method bias; loadings are unrotated and cross-loadings are oblique-rotated; loadings shown in shaded cells.

Table 3. Assessing discriminant validity
[Table body not recoverable from the transcription.]
Notes: Square roots of average variances extracted (AVEs) shown in shaded cells along the diagonal.

That is, if the square root of the AVE for a given latent variable is greater than any correlation involving that latent variable, and this applies to all latent variables in a model, then the model presents acceptable discriminant validity. As we can see, this is the case for both of our models, with and without common method bias contamination. Both models can thus be assumed to display acceptable discriminant validity.

Here we see another interesting aspect of the common method bias phenomenon in the context of PLS-SEM. While correlations among latent variables increase, the same happens with the AVEs. This simultaneous increase in correlations and AVEs is what undermines the potential of a discriminant validity check in the identification of common method bias.

In summary, two key elements of a traditional confirmatory factor analysis are a convergent validity test and a discriminant validity test. According to our analysis, neither test seems to be very effective in the identification of common method bias. An analogous analysis was conducted by Kock & Lynn (2012), which prompted them to offer the full collinearity test as an effective alternative for the identification of common method bias.

The full collinearity test

Collinearity has classically been defined as a predictor-predictor phenomenon in multiple regression models. In this traditional perspective, when two or more predictors measure the same underlying construct, or a facet of such construct, they are said to be collinear. This definition is restricted to classic, or vertical, collinearity.

Lateral collinearity is defined as a predictor-criterion phenomenon, whereby a predictor variable measures the same underlying construct, or a facet of such construct, as a variable to which it points in a model. The latter is the criterion variable in the predictor-criterion relationship of interest.

Kock & Lynn (2012) proposed the full collinearity test as a comprehensive procedure for the simultaneous assessment of both vertical and lateral collinearity (see, also, Kock & Gaskins, 2014). Through this procedure, which is fully automated by the software WarpPLS, variance inflation factors (VIFs) are generated for all latent variables in a model.
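
The regression behind a full collinearity VIF can be sketched in a few lines. The snippet below is an illustrative NumPy sketch, not WarpPLS code: it assumes that latent variable scores are already available as the columns of an array (the names `full_collinearity_vifs` and `scores` are hypothetical), and it relies on the standard identity that the VIF of a variable regressed on all the others equals the corresponding diagonal element of the inverse correlation matrix.

```python
import numpy as np


def full_collinearity_vifs(scores):
    """VIF of each column of `scores` when regressed on all other columns.

    `scores` is an (n_observations, n_latent_variables) array.
    """
    # Correlation matrix among the latent variable scores.
    r = np.corrcoef(scores, rowvar=False)
    # The i-th diagonal element of the inverse equals 1 / (1 - R_i^2),
    # where R_i^2 comes from regressing variable i on all the others.
    return np.diag(np.linalg.inv(r))
```

Each value equals $1 / (1 - R^2)$ from regressing that latent variable on all of the others, irrespective of the direction of the arrows in the model, which is why vertical and lateral collinearity are assessed together.
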
The occurrence of a VIF greater than 3.3 is proposed as an indication of pathological collinearity, and also as an indication that a model may be contaminated by common method bias. Therefore, if all VIFs resulting from a full collinearity test are equal to or lower than 3.3, the model can be considered free of common method bias.

Table 4 shows the VIFs obtained for all the latent variables in both of our models, based on a full collinearity test. As we can see, the model contaminated with common method bias includes a latent variable with VIF greater than 3.3, which is shown in a shaded cell. That is, the common method bias test proposed by Kock & Lynn (2012), based on the full collinearity test procedure, seems to succeed in the identification of common method bias.

Table 4. Full collinearity VIFs

          $F_1$    $F_2$    $F_3$
No CMB    1.541    1.472    1.739
CMB       2.619    2.347    3.720

Note: CMB = common method bias.

While it is noteworthy that the full collinearity test was successful in the identification of common method bias in a situation where a confirmatory factor analysis was not, this success is not entirely surprising given our previous discussion based on the mathematics underlying common method bias. That discussion clearly points at an increase in the overall level of collinearity in a model, corresponding to an increase in the full collinearity VIFs for the latent variables in the model, as a clear outcome of common method bias.

Discussion and conclusion

There is disagreement among methodological researchers about the nature of common method bias, how it should be addressed, and even whether it should be addressed at all. Richardson et al. (2009) discuss various perspectives about common method bias, including the perspective put forth by Spector (1987) that common method bias is an "urban legend". Assuming that the problem is real, what can we do to avoid common method bias in the first place? A seminal source in this respect is Podsakoff et al. (2003), who provide a number of suggestions on how to avoid the introduction of common method bias during data collection.

Our discussion focuses on the identification of common method bias based on full collinearity assessment, whereby a model is checked for the existence of both vertical and lateral collinearity (Kock & Gaskins, 2014; Kock & Lynn, 2012). If we find evidence of common method bias, is there anything we can do to eliminate or at least reduce it? The answer is arguably "yes", and, given the focus of our discussion, the steps discussed by Kock & Lynn (2012) for dealing with collinearity are an obvious choice: indicator removal, indicator re-assignment, latent variable removal, latent variable aggregation, and hierarchical analysis.
Readers are referred to that publication for details on how and when to implement these steps.

Full collinearity VIFs tend to increase with model complexity, in terms of number of latent variables in the model, because: (a) the likelihood that questions associated with different indicators will overlap in perceived meaning goes up as the size of a questionnaire increases, which should happen as the number of constructs covered grows; and (b) the likelihood that latent variables will overlap in terms of the facets of the constructs to which they refer goes up as more latent variables are added to a model.

Models found in empirical research studies in the field of e-collaboration typically contain more than three latent variables. This applies to many other fields where path analysis and SEM are employed. Therefore, we can reasonably conclude that our illustration of the full collinearity test of common method bias discussed here is conservative in its demonstration of the likely effectiveness of the test in actual empirical studies.

Kock & Lynn (2012) pointed out that classic PLS-SEM algorithms are particularly effective at reducing model-wide collinearity, because those algorithms maximize the variance explained in latent variables by their indicators. Such maximization is due in part to classic PLS-SEM algorithms not modeling measurement error, essentially assuming that it is zero. As such, the indicators associated with a latent variable always explain 100 percent of the variance in the latent variable.

Nevertheless, one of the key downsides of classic PLS-SEM algorithms is that path coefficients tend to be attenuated (Kock, 2015b). In a sense, they reduce collinearity levels "too much". The recently proposed factor-based PLS-SEM algorithms (Kock, 2014) address this problem.
Given this, one should expect the use of factor-based PLS-SEM algorithms to yield slightly higher full collinearity VIFs than classic PLS-SEM algorithms, with those slightly higher VIFs being a better reflection of the true values.

Consequently, the VIF threshold used in common method bias tests should arguably be somewhat higher than 3.3 when factor-based PLS-SEM algorithms are used. In their discussion of possible thresholds, Kock & Lynn (2012) note that a VIF of 5 could be employed when algorithms that incorporate measurement error are used. Even though they made this remark in reference to covariance-based SEM algorithms, the remark also applies to factor-based PLS-SEM algorithms, as both types of algorithms incorporate measurement error.

Our goal here is to help empirical researchers who need practical and straightforward methodological solutions to assess the overall quality of their measurement frameworks. To that end, we discussed and demonstrated a practical approach whereby researchers can conduct common method bias assessment based on a full collinearity test of a model. Our discussion was illustrated with outputs of the software WarpPLS (Kock, 2015), in the context of e-collaboration research. Nevertheless, our discussion arguably applies to any field where path analysis and SEM can be used.

Acknowledgments

The author is the developer of the software WarpPLS, which has over 7,000 users in more than 33 different countries at the time of this writing, and moderator of the PLS-SEM e-mail distribution list. He is grateful to those users, and to the members of the PLS-SEM e-mail distribution list, for questions, comments, and discussions on topics related to SEM and to the use of WarpPLS.

References

Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences. Mahwah, N.J.: L. Erlbaum Associates.
Kock, N. (2005). What is e-collaboration. International Journal of e-Collaboration, 1(1), 1-7.
Kock, N. (2008). E-collaboration and e-commerce in virtual worlds: The potential of Second Life and World of Warcraft. International Journal of e-Collaboration, 4(3), 1-13.
Kock, N. (2011). A mathematical analysis of the evolution of human mate choice traits: Implications for evolutionary psychologists. Journal of Evolutionary Psychology, 9(3), 219-247.
Kock, N. (2014). A note on how to conduct a factor-based PLS-SEM analysis. Laredo, TX: ScriptWarp Systems.
Kock, N. (2015). WarpPLS 5.0 User Manual. Laredo, TX: ScriptWarp Systems.
Kock, N. (2015b). One-tailed or two-tailed P values in PLS-SEM? International Journal of e-Collaboration, 11(2), 1-7.
Kock, N., & Gaskins, L. (2014). The mediating role of voice and accountability in the relationship between Internet diffusion and government corruption in Latin America and Sub-Saharan Africa. Information Technology for Development, 20(1), 23-43.
Kock, N., & Lynn, G.S. (2012). Lateral collinearity and misleading results in variance-based SEM: An illustration and recommendations. Journal of the Association for Information Systems, 13(7), 546-580.
Maruyama, G.M. (1998). Basics of structural equation modeling. Thousand Oaks, CA: Sage Publications.
Mueller, R.O. (1996). Basic principles of structural equation modeling. New York, NY: Springer.
Paxton, P., Curran, P.J., Bollen, K.A., Kirby, J., & Chen, F. (2001). Monte Carlo experiments: Design and implementation. Structural Equation Modeling, 8(2), 287-312.

Podsakoff, P.M., MacKenzie, S.B., Lee, J.Y., & Podsakoff, N.P. (2003). Common method biases in behavioral research: A critical review of the literature and recommended remedies. Journal of Applied Psychology, 88(5), 879-903.
Richardson, H.A., Simmering, M.J., & Sturman, M.C. (2009). A tale of three perspectives: Examining post hoc statistical techniques for detection and correction of common method variance. Organizational Research Methods, 12(4), 762-800.
Robert, C.P., & Casella, G. (2005). Monte Carlo statistical methods. New York, NY: Springer.
Spector, P.E. (1987). Method variance as an artifact in self-reported affect and perceptions at work: Myth or significant problem? Journal of Applied Psychology, 72(3), 438-443.
Wold, H. (1980). Model construction and evaluation when theoretical knowledge is scarce. In J. Kmenta and J. B. Ramsey (Eds.), Evaluation of econometric models (pp. 47-74). Waltham, MA: Academic Press.
Wright, S. (1934). The method of path coefficients. The Annals of Mathematical Statistics, 5(3), 161-215.
Wright, S. (1960). Path coefficients and path regressions: Alternative or complementary concepts? Biometrics, 16(2), 189-202.
