Abstract
This chapter presents the statistical (econometric) identification of the most important success drivers of the projects in the sample. The chapter first compresses the 41 questionnaire variables (which contain duplication for robustness' sake) into 5 success factors. Then, regressions identify which success factors drive the probability of project completion and which factors drive budget and schedule overruns (for the completed projects).
The economic levers are huge—the regressions identify that moderately improving the success factors can double completion chances and save hundreds of millions of dollars. The role of corruption becomes apparent: not only does it depress completion chances and inflate the budgets, but it also compromises the effectiveness of the other success factors.
This chapter presents the results of the econometric analysis of the questionnaire data. An econometric analysis identifies and interprets patterns in the data that we have collected. The patterns allow identifying causal connections between actions that were taken in projects and project outcomes, namely completion (versus abandonment) and, for the completed projects, schedule and cost performance. We present the fundamental logic of the analysis, including key results in a graphical form. We put the technical content (related to the econometric methods) into the Appendix of this chapter, so that readers who are not interested in the technical details can read this chapter and understand its critical implications, while readers who want to check the rigour and care of the analysis can do so.
As we explained in Chap. 2, we have data for 19 abandoned projects and 19 completed projects, each with 3 respondents—an owner, a supervisor (both civil servants) and a project manager of the main contractor. A total of 38 of 40 targeted projects represent a response rate of 95%. This gives us a total of 114 questionnaires to work with.
We first examine the distributions of the responses. We show that the three respondent types indeed show “biases”, or views of the project from “where they sit”: owners evaluate differently what went well and what did not; for example, a cost overrun that was absorbed by the contractor looks like a problem to the contractor but may not even register as important for the owner. We also check whether the responses actually differ across abandoned and completed projects: we find that they do, which means that our questionnaire variables capture something that is happening differently across abandoned versus completed projects.
Then, we “condense” the 41 variables to a smaller number of 4 “composite variables”, which are called “factors” in the social sciences. We need to do this because each factor captures an underlying dimension of managerial differences that is shared across a number of our variables; the shared dimension represents a common “essence” underlying several variables, of which each variable expresses a piece. (We also do not have enough data points to incorporate all 41 variables separately in a regression to obtain sharp results.) We then show that the four factors (in addition to corruption, the one variable that represents a dimension of its own) are able to statistically explain project completion with success. Finally, we show that the factors also successfully explain the budget and schedule performance of the 19 projects that were completed.
5.1 Variable Distributions and Variable Capability to Detect Differences Across Projects
5.1.1 Each Respondent Type Adds Unique Perspectives and Information
Let us first compare the three different respondent types (owners, supervisors and contractors)—the comparison will give some indication of how different the information and views are that are expressed by the three respondent types (Table 5.1).
First, we see that the responses are positively correlated. A correlation of zero between two variables means that the two have nothing to do with each other: they move independently from each other. A correlation of 1 means that the two variables move in unison (whenever one moves up or down, the other does the same), which means the two variables are the same (possibly scaled by a factor). A modest correlation of up to 0.5 means that the two variables are somewhat related (which is to be expected, as, after all, the three respondents do look at the same project), but they differ significantly.
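To make the correlation scale concrete, here is a minimal sketch that computes a Pearson correlation for two respondents. The answer vectors are hypothetical 7-point Likert values invented for illustration; they are not taken from the questionnaire data, and the function name `pearson` is our own.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical answers of two respondents to the same eight questions
owner = [6, 5, 7, 4, 6, 3, 5, 6]
supervisor = [5, 5, 6, 5, 4, 4, 6, 5]

# A rescaled copy of the same answers moves "in unison" with the original
r_same = pearson(owner, [2 * v + 1 for v in owner])

# Two different respondents: related, but clearly not identical
r_pair = pearson(owner, supervisor)

print(f"rescaled copy: {r_same:.2f}, owner vs supervisor: {r_pair:.2f}")
```

The rescaled copy illustrates the "same variable, possibly scaled by a factor" case, while the second pair lands in the moderate range discussed in the text.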
As the three respondents reported on the same project, their responses should have some commonality, but the commonality is only moderate: each respondent emphasized different characteristics of the project from their own point of view. For the completed projects, the two civil servants (owner and supervisor) agree more, with a correlation of 41%, but for the abandoned projects, the supervisor’s response is as highly correlated with the contractor as it is with the owner (at a lower level of 30%).
Now we examine how much the responses shift between completed and abandoned projects while distinguishing between the three respondents. Figure 5.1 shows the distributions of all answers by project outcome: across all answers, a higher score is “better”; therefore, a shift of distribution to the left means a shift towards—across all management aspects—“lower management performance”. We see in Fig. 5.1 that for abandoned projects (right), compared to completed projects (left), the means of the answers shift left (towards lower performance), and the standard deviations grow (the abandoned projects differ more among themselves in their evaluations than the completed projects). The largest difference among projects is in the abandoned category for the supervisors—the standard deviation is the largest, and this is the only non-unimodal distribution.
Most interestingly, the evaluation shift from completed to abandoned projects is the largest for the supervisors: the mean shifts by 1.27 Likert points (versus only 0.9 and 0.83 for owners and contractors, respectively). We thus observe that the supervisor evaluations are the most sensitive to project outcomes. Specifically, the mean responses are not statistically different across the three respondent types for completed projects, but among abandoned projects the responses differ statistically significantly across respondent types, and this is because supervisors lower their evaluations significantly more for abandoned projects than owners and contractors do.
The difference could be caused by the supervisors being closest to the “mess” of the projects and perceiving the differences in practices and actions more acutely than owners and (senior) contract personnel. On the other hand, the owners and contractors might be more reluctant to admit problems or articulate them. Indeed, the following observation provides some evidence of participants not wanting to talk about weaknesses despite seeing them: we had a private conversation with an experienced project manager who worked for a large, respected international contractor. The person said, “If we were not speaking privately at this unobserved place, I would not be able to openly give you any information.” This suggests that, in the questionnaire responses, the contractors, as well as the owners, may have been somewhat more guarded.
However, the key conclusion from this discussion is that the three respondents for the same project see significant differences in their realities of this project. Each respondent brings unique perspectives and observations to the data. We take this as a justification to treat the three questionnaires of one project as separate data points (each containing information of its own). We therefore perform the key analyses as if we had 38 × 3 = 114 data points, which allows us to identify more subtle patterns (we do check, however, several times whether looking only at one respondent type might invalidate the key patterns, which is not the case; we report this in the Appendix).
5.1.2 The Variables Capture Robust Differences Between Abandoned and Completed Projects
The next question is whether differences between completed and abandoned projects were driven by a few variables, or whether the evaluations differed across many questions. In other words, were the differences between completed and abandoned projects “concentrated” on a few variables? If this were the case, we would see evidence of focused weaknesses or of an inability by respondents to perceive differences across the board. This is examined in the following t-test tables (Tables 5.2 and 5.3). Table 5.2 shows the differences in supervisor responses across project outcomes (the respondent type with the largest distribution shift in Table 5.1), and Table 5.3 shows the differences in contractor responses (the respondent type with the smallest distribution shift in Table 5.1).
Tables 5.2 and 5.3 tell us that the differences between completed and abandoned projects are not focused on a few variables. To illustrate how the key patterns that we observe are robust across the respondent types, we, for now, still distinguish between them—Table 5.2 shows the variable differences across abandoned and completed projects for supervisors, and Table 5.3 for contractors.
Moreover, even for the contractors, 29 out of 40 questions differ significantly (Table 5.3), although the contractors represent the tightest comparison, with their answers “guarded”, as we saw in Fig. 5.1. Therefore, we can conclude that our questions captured systematic differences between abandoned and completed projects and were indeed seen as different across the projects; differences are observable not just across a few questions.
Importantly, even for the contractors, each of the three broad questionnaire areas (governance, initiation and execution) differs at the 1% significance level (bottom of Table 5.3). Almost every question differs statistically significantly between completed and abandoned projects for the supervisors (Table 5.2: the chances that the differences between the responses for abandoned versus completed projects might have arisen “randomly” are below 5% for the vast majority of variables, as the last column indicates).
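For readers who want to see what a single row of such a t-test table involves, the following is a minimal sketch. It assumes Welch's unequal-variance form of the two-sample t-test, which is our choice for illustration; the chapter does not state which variant was used. The two answer lists are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom
    for two independent samples with possibly unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb                          # squared standard error
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical answers to one question, split by project outcome
completed = [6, 5, 6, 7, 5, 6, 6, 5]
abandoned = [4, 3, 5, 2, 4, 6, 3, 4]

t_stat, dof = welch_t(completed, abandoned)
print(f"t = {t_stat:.2f} with about {dof:.1f} degrees of freedom")
```

The resulting t statistic would then be compared against a t distribution with the computed degrees of freedom to obtain the kind of significance level reported in the last column of the tables.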
5.2 Condensing Variables into Aggregated Success Factors
5.2.1 Approach
Examination of the data in the first section of this chapter suggests two further steps. First, there are stable differences in the perspectives between owners, supervisors and contractors, which reflect genuine differences in the information that they possessed and the observations they made. Therefore, it makes sense to treat each questionnaire as a separate data point—although three questionnaires have the same project as their subject, the three questionnaires are not simply redundant “duplications”; in fact, they contain complementary data. We therefore treat our data set as consisting of 114 responses.
Second, as explained earlier, groups of the 41 variables “are related” and “get at” the same underlying characteristic of how a project was managed. The questionnaire started with three variable groups—governance, initiation and execution—and then multiple questions explored the areas. For example, questions G3–G9 all probe for a common “thing”, namely how stable, informed and insightful the oversight committee was (e.g. supervision structure was … G3: stable; G4: regularly in action; G5: giving clear guidance; G6: giving clear approval; G7: kept informed; G8: meeting regularly; G9: examined by initial due diligence). The reason for these “overlaps” is, of course, reliability of getting at the underlying concepts—a respondent may misinterpret or wrongly fill out a single question, but if we “get at” a managerial characteristic with multiple questions, there is a better chance that the responses will be stable and reliable.
In light of the fact that the questions were designed to have overlaps, for reasons of reliability, it makes sense to capture the underlying common (or “essential”) management characteristics by “condensing” the variables. This is accomplished through a statistical approach known as factor analysis. In essence, it is an exercise in testing for commonalities among groups of variables. A factor is an unobserved underlying force, and each variable that is measured in the questionnaire is treated as if it were a linear combination of multiple underlying factors: where in our primary data table, each data point (each questionnaire) is represented by 41 numbers (values on the 41 variables), we now want to represent the questionnaire as a representation of a smaller number, n, of factors. We do not know a priori how many factors will emerge, so we let the data speak and see what factors emerge—we initially thought it might be three, which was why the questionnaire had three sections of governance, initiation and execution. However, when we rigorously searched for meaningful and statistically powerful factors, we found not three but four.
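The "linear combination of underlying factors" can be written compactly. The notation below is our own illustration and does not reproduce the specification in the Appendix:

```latex
x_{ij} \;=\; \mu_j \;+\; \sum_{k=1}^{n} \lambda_{jk}\, f_{ik} \;+\; \varepsilon_{ij},
\qquad j = 1, \dots, p
```

Here $x_{ij}$ is the answer of questionnaire $i$ on variable $j$, $\mu_j$ is the mean of variable $j$, $f_{ik}$ is the score of questionnaire $i$ on factor $k$, $\lambda_{jk}$ is the loading of variable $j$ on factor $k$, and $\varepsilon_{ij}$ is the part of the answer that the factors do not explain. The number of variables $p$ and the number of factors $n$ are left general; in the solution reported in this chapter, $n = 4$.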
The factor analysis was conducted with all variables except for “corruption”, which was treated as a separate concept. The rationale is that corruption does not fall under the managerial characteristics of the project; corruption is part of the project environment and is therefore in its own category.
5.2.2 Identifying the Factors
In exploring different possible factor configurations (“exploratory analysis”), we found that a four-factor model offered the best balance between separating the variables well and having factors that successfully combined several variables and had a managerial meaning.
The set of factors is meaningful: the factors cut across our “pre-named” categories (which stemmed from our review of previous work) of governance, initiation and execution. However, when we examine the variables that attach to each factor, there is a clear interpretation of each one, and we can give each factor a name that reflects the variables that it combines (Table 5.4).
Specifically, we conclude that the first factor captures variables connected to contractor selection and qualification. Only G1, “defined supervision structure”, is a surprise in this context, but it loads strongly, and it may capture that once a supervision committee was in place, a solid contractor selection (in contrast to, for example, a political selection) was enabled.
The second factor connects strongly to variables relating to the project goals—business goals as well as societal goals. The third factor collects variables that relate to resources (funding, personnel and logistics) and planning (stakeholders, timelines and risks). The fourth factor captures elements of the supervision structure and stakeholder involvement. (The reader might wonder whether it would be better to split this factor in two, one on supervision and one on stakeholders. However, it turned out that such a five-factor solution was less statistically robust and had more cross-loadings; in other words, the data suggests that supervision and stakeholder management capability tended to go together.)
Thus, we have consolidated the 41 variables into 5 success factors that approximately summarize the larger number of variables in underlying success factors—contractor selection, project goals, resource provision and planning, governance and stakeholder management, and corruption (remember, this was treated separately, as an “external context” going into the analysis). We will now search for patterns of what explains project completion armed with these aggregated success drivers.
5.3 Econometric Prediction of Project Completion
Armed with the aggregated success factors, or underlying management characteristics, we can now attempt to detect causal patterns that explain why projects were abandoned: we predict the probability of project completion in a logistic regression (a probit model). Completion is a zero-one variable, so we cannot use a normal linear regression with a continuous dependent variable. The dependent variable in the regression is the logarithm of the probability of a project being completed (the formal specification is shown in Appendix 3 of this chapter).
We add one more variable into this logistic regression. The reader may recall that we treat the three responses related to one project (owner, supervisor and contractor) as three different data points. We include a measure of how much the three respondents on one project disagree: if the three respondents disagree strongly, this may reflect problems (for instance, in working together, in agreeing on plans or in agreeing on goals). For any of the variables, respondent disagreement is measured as follows:
1. For each project and variable, take the three responses and average them to create a baseline.
2. For each respondent, take the absolute value of the difference from the baseline (the average of this variable). This is the disagreement for a variable for each respondent, and the average over the three respondents’ disagreement scores is this variable’s disagreement score; averaged over all variables, we get the project’s respondent disagreement score.
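The two steps above can be sketched in a few lines. The function names and the answer values are hypothetical; only the question labels G3 to G5 are borrowed from the questionnaire for readability.

```python
from statistics import mean

def variable_disagreement(responses):
    """Step 1 and 2 for one variable: average the three responses to get
    the baseline, then average the absolute deviations from it."""
    baseline = mean(responses)
    return mean(abs(r - baseline) for r in responses)

def project_disagreement(project):
    """Average the per-variable disagreement over all variables of a project."""
    return mean(variable_disagreement(r) for r in project.values())

# Hypothetical (owner, supervisor, contractor) answers for one project
project = {
    "G3": (6, 6, 6),  # full agreement, disagreement is zero
    "G4": (7, 4, 4),  # one respondent far from the other two
    "G5": (5, 3, 4),  # mild spread
}

score = project_disagreement(project)
print(f"respondent disagreement score: {score:.3f}")
```

A project on which all three respondents give identical answers scores zero; the score grows as the answers spread out around their common average.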
The set of analyses shown in Table 5.5 predicts the logarithm of the probability of project completion as the dependent variable, with the independent variables discussed earlier. The coefficient related to each variable expresses how much the (log of the) completion probability changes if this success variable changes by a small amount (and the standard error of the coefficient expresses how much “noise” is in the data, and thus how reliable this coefficient is). If the standard error is much larger than the coefficient itself, then we cannot be sure whether this coefficient is really even different from zero, in other words, whether this variable even has an effect. This is also expressed by the statistical significance.
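The relation between a coefficient, its standard error and statistical significance can be illustrated with a minimal sketch. It assumes the usual large-sample normal approximation (a Wald z-test), and the coefficient and standard error values are hypothetical, not taken from Table 5.5.

```python
from math import erfc, sqrt

def wald_test(coefficient, std_error):
    """z statistic and two-sided p-value under a normal approximation."""
    z = coefficient / std_error
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical case 1: standard error much smaller than the coefficient
z_sharp, p_sharp = wald_test(1.2, 0.4)

# Hypothetical case 2: standard error much larger than the coefficient
z_noisy, p_noisy = wald_test(0.3, 0.9)

print(f"sharp estimate: z = {z_sharp:.2f}, p = {p_sharp:.4f}")
print(f"noisy estimate: z = {z_noisy:.2f}, p = {p_noisy:.4f}")
```

In the first case the estimate is several standard errors away from zero, so it is unlikely to be noise; in the second case the data cannot distinguish the coefficient from zero.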
Table 5.5 gives us the first core finding of this chapter: the high rate of project abandonment in Nigeria is not mysterious. It can in fact be explained by the managerial characteristics of the projects: all four factors are strongly significant (at the 5% level or better), and their coefficients are of an equal order of magnitude, which means that no one factor dominates, but they all have important influence. In addition, corruption is as important as each of the four managerial factors—this is not surprising, as corruption not only makes a project more expensive but also distorts decisions (as we will quantitatively show later). Finally, disagreement among the respondents (in their answers) is also a significant factor, as it captures the potential for tensions and misalignments among their actions.
All variables are statistically significant, being combined in one model, which implies that they measure different aspects of the project. Finally, the model offers a level of explained variance of 58%. This suggests that the project characteristics that we have measured do not merely capture small influences, but our variables together explain a large part of the probability of a project reaching completion or being abandoned during execution.
In order to examine the robustness of the model, we added an additional control variable: we counted how many times the president and thus the government changed during the life of a project (this varied between 0 and, for three projects, 12 times). The idea behind this variable is that each government change carries with it the danger of disruption and discontinuity (as we will see amply illustrated in the case studies). However, this control variable is not statistically significant (neither alone nor when included together with the other variables), and we therefore do not show it in the reported tables. The effect of discontinuity, while plausible, is so noisy that it cannot be reliably identified in an econometric analysis.
Now we need to discuss the meaning of the parameters in Table 5.5, which represent a “model” that predicts the completion probability of a project depending on its scores of the factors and the corruption and disagreement variables. We show some elements of this model in graphical form in Fig. 5.2, which shows by how much the completion probability changes if the two most influential variables change by one score point up or down. The midpoint of the x-axis in the graph is the completion probability when all factors are at their average: it is 55%. The two curves show how the completion probability changes when one variable changes while the other variables are held constant (we chose the two variables/factors with the largest and smallest regression parameter because they have the greatest effects; the effects of the other variables lie in between).
The two curves in the graph demonstrate powerfully how large the effects of the variables are: if the corruption score can be reduced by one score point from its average (which is 4.89, in a range between 1 and 7) to 3.89, the completion probability increases from 55% to 88%! If, in contrast, corruption deteriorates to a score of 5.89, the completion probability diminishes to 20%. We could not more powerfully confirm our previous prediction that corruption does not just increase project costs but may destroy the chances of completion altogether. The effect is huge: a 30-percentage-point increase in completion probability for a $1B project translates into an expected avoided loss of $300M (assuming the whole budget is spent, which was indeed the case in our case studies)!
Similarly, a one-score-point improvement in the contractor selection score, from its average of 4.74 to 5.74, increases the probability of completion to 95%. Again, we could not demonstrate more powerfully the importance of contractor selection.
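The S-shaped response behind these numbers can be illustrated with a minimal sketch. It assumes a logit link, which is our simplification (the formal specification is in Appendix 3), and it backs out an implied per-point effect from the probabilities quoted in the text instead of using the coefficients of Table 5.5. All function names are hypothetical.

```python
from math import exp, log

BASELINE = 0.55  # completion probability with all variables at their averages (from the text)

def logit(p):
    """Log-odds of a probability."""
    return log(p / (1 - p))

def inv_logit(x):
    """Probability that corresponds to a log-odds value."""
    return 1 / (1 + exp(-x))

def implied_effect(p_after_one_point_better):
    """Log-odds gain per score point implied by one reported probability."""
    return logit(p_after_one_point_better) - logit(BASELINE)

def completion_probability(effect_per_point, points_better):
    """Completion probability after improving one variable by `points_better`
    score points (negative = deterioration), other variables held at average."""
    return inv_logit(logit(BASELINE) + effect_per_point * points_better)

corruption_effect = implied_effect(0.88)  # corruption score lowered by one point
contractor_effect = implied_effect(0.95)  # contractor selection raised by one point

for points in (-1, 0, 1):
    p = completion_probability(corruption_effect, points)
    print(f"corruption {points:+d} point(s) better: {p:.0%}")
```

Under this assumed link, a one-point deterioration in corruption gives roughly 17%, close to but not identical with the 20% read from Fig. 5.2. The gap is a reminder that the sketch only mimics the shape of the model and does not reproduce the estimated one.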
Thus, the econometric analysis is not a theoretical exercise of style; rather, it shows how incredibly important it is to manage the success variables that we have identified and measured. Our data demonstrates that the effect of achieving even moderate improvements can be staggering.
In order to be sure that we are not biasing our results by treating the responses from the three respondent groups as separate data points, we carry out an additional set of analyses in Appendix 4. It examines how each of the three types of respondent explains the success of the project. This analysis uses the same variables as in Table 5.5, but without respondent disagreement (as we now look at only one respondent group). In this analysis, significance levels are lower because the number of data points in each regression is only one-third of the overall regression. However, the qualitative shape of the results stays robust across the three types of respondent.
5.4 Econometric Prediction of Cost and Schedule Overruns for Completed Projects
Having shown that the variables measured in our questionnaire (consolidated into four success factors, plus the corruption measure) explain project completion, we now examine whether our variables can also predict schedule and cost performance for the set of completed projects. We conduct this examination using linear OLS (ordinary least squares) regressions.
5.4.1 Effect of Variables on Budget Overruns
Table 5.6 shows the regression results of how our variables predict cost overruns (measured as a percentage of budget, which normalizes the absolute budget size away).
As in the prediction of project completion, we again find that our (condensed) variables matter, all reducing budget overruns (the signs of their coefficients are negative). All variables are statistically significant, and they explain not just some but a large fraction of the variance in the cost overrun performance measure (69% for the full model). Not only is the explained variance high, but the model overall is also highly statistically significant (the F-statistic for the model is F = 14.612, p < 0.001).
As in the prediction of project completion, the coefficients of the four factors (managerial characteristics) are of similar size. However, the coefficient for corruption is, at −33, four times the size of the coefficients of any of the managerial factors. This strengthens the finding of the completion regression: corruption as an individual variable is very important, especially for the project’s budget compliance, because corruption directly inflates the project budget, in addition to contributing to inefficient decision-making.
Interestingly, the respondent disagreement reduces budget overruns. Disagreement increases the chance of the project being abandoned (Table 5.5), but given that the project was completed, disagreements among respondents are associated with lower overruns. The most plausible explanation of this is that given that the project was completed rather than abandoned, there is a “luxury of different views” associated with lower overruns: when the project goes badly (overruns are high), everyone has to agree that it goes badly. When the project is proceeding adequately (“it is OK”), things are possibly more ambiguous in the sense that people might disagree how well (or badly) things are going.
In order to test the robustness of the statistical results, we again added control variables: first, the number of government changes during the life of the project (the same variable is in the project completion regression), and again, this variable turned out statistically insignificant. Second, the initial budget size of the project was included (we could do this only in the regression with the completed projects as we did not have reliable total budget estimates for the abandoned projects). The initial budget size is a measure of complexity and therefore project difficulty, and one might expect that (percentage) overruns are worse for larger projects. However, this turns out to not be the case—the budget size is (as for the government changes) statistically insignificant. One interpretation is that all the projects in the sample are large enough to be difficult, and the forces that cause them to encounter difficulties are not driven by size.
Similar to Fig. 5.2, Fig. 5.3 demonstrates that the variable effects are large enough to be of strong economic significance. The average budget overrun of the 19 completed projects is 760% of the original budget. This drives home the point that “completed” is not the same as “successful”: an almost eight-fold overrun is not a great performance. However, not all projects had such large overruns, and the econometric model from Table 5.6 predicts that the budget performance can be greatly influenced if the success factors can be changed.
The two curves in the graph again powerfully demonstrate how large the effects of the variables are: if the corruption score can be reduced by 1 from its average of 4.4 (while holding the other variables unchanged), the overruns can be almost halved (however, if the corruption score deteriorates by 1, the overrun increases by almost 50% to 1100%). Reducing the budget overrun by half is worth $370M, on average, over the 19 projects! If the contractor selection process score can be improved by 1 point, overruns diminish by two-thirds, to just over 200% (but if the contractor selection deteriorates by 1 score point, the overrun almost doubles). The impacts of the other variables are in between (closer to the contractor selection variable).
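Because the overrun model is linear, a one-point improvement and a one-point deterioration move the prediction by the same amount in opposite directions. The following minimal sketch shows this arithmetic for corruption. The slope is backed out from the figures quoted in the text (760% on average, 1100% after a one-point deterioration); it is an illustration and not the coefficient of Table 5.6, and the function names are hypothetical.

```python
AVERAGE_OVERRUN = 760.0  # average overrun in % of the original budget (from the text)

def implied_slope(overrun_one_point_worse):
    """Change in overrun (percentage points) per score point, implied by
    the reported overrun after a one-point deterioration."""
    return overrun_one_point_worse - AVERAGE_OVERRUN

def predicted_overrun(slope_per_point, change_in_score):
    """Linear what-if: overrun after the score moves by `change_in_score`
    points away from its average (positive = worse for corruption)."""
    return AVERAGE_OVERRUN + slope_per_point * change_in_score

corruption_slope = implied_slope(1100.0)
better = predicted_overrun(corruption_slope, -1)
worse = predicted_overrun(corruption_slope, +1)

print(f"one point less corruption: {better:.0f}% overrun")
print(f"one point more corruption: {worse:.0f}% overrun")
print(f"remaining share of the average overrun: {better / AVERAGE_OVERRUN:.0%}")
```

The symmetric step of 340 percentage points is what makes the improvement read as "almost halved" and the deterioration as an increase of almost 50%.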
We again verify that these results across all three respondent groups are not caused by one (or dominated by one) respondent group only. We show the cost overrun regressions separately by respondent group in Appendix 5. This analysis shows that the results are representative and similar in each of the respondent groups, with slightly lower significance levels because of a smaller number of data points.
5.4.2 Effect of Variables on Schedule Overruns
Table 5.7 shows the regression results of how our variables predict schedule overruns (measured as a percentage of planned project duration). As for budget overruns, we again find that our (consolidated) variables matter, all reducing schedule overruns as well (the signs of all coefficients are negative). All variables are statistically significant, and they explain not just some but a large fraction of the variance in the schedule overrun performance measure (49% for the full model). Not only is the explained variance high, but the model overall is also highly significant (the F-statistic for the model is F = 9.41, p < 0.05).
We again demonstrate the economic significance of our success drivers (factors and variables) in graphical form in Fig. 5.4. The average schedule overrun among the 19 completed projects was 134% (of the originally planned duration). The highest impact on the schedule lies in project goals and supervision (and we can see in Table 5.7 that the stakeholders’ factor is almost as important): if we could improve the project goals and supervision factor by 1 score point (from its average of 6, while holding the other variables constant), the schedule overrun would be reversed to a schedule acceleration of 50%!
This is, of course, not a “prediction” but an artefact of a linear extrapolation pushed further than is realistic. Once the schedule has been achieved, further improvements will not improve the schedule further, as the pressure to do so disappears. Whatever slack one has created will then be used to improve quality, reduce cost or increase profit. This limit of linear extrapolation is, of course, the reason why we show only “one-score-point changes” in the graphs in the first place. However, the linear regression model still provides an estimation of how powerful the difference made by small improvements can be.
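The limit of the extrapolation can be made explicit with a short worked calculation. It uses only the two numbers quoted above (an average overrun of 134% and a predicted value of −50% after a one-point improvement); the symbols are our own notation.

```latex
\hat{y}(\Delta) = \bar{y} + \beta\,\Delta, \qquad \bar{y} = 134
\\[4pt]
\hat{y}(1) = -50 \;\Longrightarrow\; \beta = -50 - 134 = -184
\\[4pt]
\hat{y}(\Delta^{*}) = 0 \;\Longrightarrow\; \Delta^{*} = \frac{134}{184} \approx 0.73
```

Here $\hat{y}(\Delta)$ is the predicted schedule overrun (in % of the planned duration) after the factor improves by $\Delta$ score points. Under the linear model, the planned schedule would already be met after an improvement of about 0.73 points; anything beyond $\Delta^{*}$ lies outside the range in which the linear approximation is meaningful.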
Interestingly, the schedule is less affected by contractor selection, resources and planning factors. Contractor selection has a dominant effect on budget adherence, as we have seen in Fig. 5.3 (after all, that’s where prices are negotiated), but it does not dominate schedule adherence. Clearly, there is room to look for more fine-grained evidence of this in our case studies.
Moreover, corruption is much less important for schedule adherence—the schedule overrun varies “only” between 100% and 180% for a full two-point change in the corruption score around the average. This is instructive—it gives us tangible evidence that the corrosive effect of corruption lies in bad decisions that can derail a project, which we see in the completion probability graph in Fig. 5.2; and the corrosive effect lies in the budget—corruption directly costs money, as we have seen in Fig. 5.3.
(The respondent disagreement variable has only a small effect, so we do not further discuss it here.)
Finally, we again verify that these results across all three respondent groups are not caused by one (or dominated by one) respondent group only. We show the schedule overrun regressions separately by respondent group in Appendix 6. This analysis shows that the results are representative and similar in each of the respondent groups, with slightly lower significance levels because of a smaller number of data points.
5.5 The Corrosive Effect of Corruption
This chapter establishes an important basis for the conclusions that this book will reach. At first, this study articulated a number of “project success factor” variables, entirely arising from the study of previous expert work on very large projects in other countries, without any consideration of Nigerian special circumstances, and certainly without any “partial interest” input from parties in Nigeria that may prefer certain conclusions over others. These variables were given to 114 professionals who have been actively working on Nigerian projects, but without any explanation of how each variable fits into a predicted framework of project success, and, moreover, ensuring different perspectives by asking respondents from owners and supervisors (civil servants), as well as contractors (employees of private companies). None of the respondents could “censor” their responses in order to influence our findings, because no one knew how the many managerial variables would turn out to have influenced success. (We saw that respondents were possibly a bit more, or less, open in admitting the size of project weaknesses, but there was no “biasing” of our results; directionally, there was agreement.)
Because of this impossibility of external influence on the outcomes of our examination, we can claim that our analysis is “objective”—it is in no way influenced by any opinions of powerful parties who might have had an interest in the direction that our conclusions might take. No one, including ourselves, was able to predict which elements of the framework that we had assembled from previous project success studies in other countries would turn out to be the most important in the Nigerian public project context.
This is what our statistical analysis accomplishes: it identifies four managerial “characteristics” or “success factors” underlying, or “consolidating”, our 41 variables: (1) the way the project goals were articulated and followed up, (2) the selection process of the contractor, (3) the way the project was resourced and planned, and (4) the way a supervision structure was set up and obeyed and stakeholders were taken into account. A critical additional success factor was the absence (or presence) of corruption (a single variable in the questionnaire), which was as important for a project’s completion as any of the other factors. Finally, we saw that disagreements in the responses among the three respondent groups captured some types of underlying miscommunication, or possibly tensions and misalignment, and it therefore also predicted lower project success.
Our statistical analysis strongly demonstrates that all six drivers matter, not only for project completion but also for budget and schedule adherence in those projects that were completed. These findings imply that project success in Nigeria is not mysterious but analysable and understandable, and improvements can be identified and put in place.
One limitation of statistical analysis is that the variables it uses are aggregated and therefore somewhat abstract. In addition, the causal interactions between the four managerial characteristics do not appear in the econometrics: for instance, if the project were not planned well, resources may not be stably and sustainably allocated. This, in turn, disturbs the way the contractors behave—they may walk out at some point or play games in order to cushion their budgets so they do not go bankrupt when funding is disrupted. These interactions will become fully apparent only when we look at the projects in more narrative detail.
However, one causal interaction that we can examine econometrically is the effect of corruption on the decisions in the project. To do this, we include in the project completion regression not only corruption itself but also an interaction term, the product (Factor x) × (Corruption index). If this product is significant in the regression, the effect of Factor x changes (gets larger or smaller) as the extent of corruption changes. In other words, corruption affects the effectiveness of other managerial decisions; this is precisely the “corrosive effect” of corruption that we have previously mentioned.
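To make the idea of an interaction term concrete, the following is a minimal sketch on synthetic data, not the authors' data or code. It assumes a plain linear regression fitted by ordinary least squares; the coefficient values, variable ranges and helper names (`solve_linear`, `ols`, `factor_slope`) are hypothetical and chosen only for illustration.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= ratio * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic data (assumption): a factor score that reduces the overrun,
# corruption that increases it, and a positive interaction between the two.
rng = random.Random(0)
rows, overrun = [], []
for _ in range(400):
    factor = rng.uniform(1, 5)
    corruption = rng.uniform(1, 5)
    rows.append([1.0, factor, corruption, factor * corruption])
    overrun.append(3.0 - 0.8 * factor + 0.5 * corruption
                   + 0.15 * factor * corruption + rng.gauss(0, 0.1))

b_const, b_factor, b_corruption, b_interaction = ols(rows, overrun)

def factor_slope(corruption_level):
    """Estimated effect of one more point of the factor at a given corruption level."""
    return b_factor + b_interaction * corruption_level

print("factor slope at corruption 2:", round(factor_slope(2), 3))
print("factor slope at corruption 3:", round(factor_slope(3), 3))
```

In this sketch the slope of the factor depends on the corruption level, which is exactly what a significant interaction coefficient expresses: the factor's effect is no longer a single number.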
In the completion probability regression, the “corrosive effects” of corruption are not detectable; in other words, corruption directly reduces the completion chance of a project but does not influence the effects of the other variables. However, the interactions are econometrically visible in the cost overrun regression for the completed projects. The result is reported in Table 5.8. Because of the small size of the data set, we could not simply add the interactions into the full regression without losing significance; instead, the table elaborates elements of Table 5.6, showing each factor and its interaction with corruption one at a time.
As in Table 5.6, the coefficients of the factors are negative, which means that increasing the index of, for instance, contractor selection reduces the predicted amount of budget overrun. In contrast, the coefficients of corruption (in each partial regression) are positive, which means that an increase in the index of corruption increases the predicted budget overrun.
The focus of this table is the coefficient of the interaction term (Factor x) × (Corruption index). This coefficient is positive (and significant) in all four partial regressions. Because the coefficient of each factor itself is negative, a positive interaction coefficient means that the factor’s overrun-reducing slope moves toward zero as the corruption index increases, and thus the overrun-reducing effect of the factor is weakened. In other words, increasing corruption weakens the budget-overrun-reducing effects of contractor selection, project goals, resources and planning, and supervision and stakeholder relations. This is graphically illustrated in Fig. 5.5, which adds the interaction to the main effects of Fig. 5.2: an increase in corruption (by one point) flattens the regression coefficient, and thus the slope of the regression curve, of contractor selection, which becomes less effective.
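The reading of the interaction coefficient can be written out in symbols. The following is a sketch that assumes a linear specification; the symbols F (factor index), C (corruption index) and the coefficient names are notation chosen here for illustration and are not taken from Table 5.8.

```latex
\text{Overrun} = \beta_0 + \beta_F\,F + \beta_C\,C + \beta_{FC}\,(F \times C) + \varepsilon
\qquad\Longrightarrow\qquad
\frac{\partial\,\text{Overrun}}{\partial F} = \beta_F + \beta_{FC}\,C
```

With a negative factor coefficient and a positive interaction coefficient, the marginal effect of the factor becomes less negative as C grows. Under this assumed specification, a one-point increase in the corruption index shifts the slope of the factor by exactly the interaction coefficient.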
This illustrates that the corrosive effect of corruption on the important project decisions and practices can be econometrically measured—as we discussed in the overview of existing knowledge in Chap. 2, corruption does not “merely” inflate budgets but weakens the effectiveness of project management practices throughout the project.
This analysis has illustrated how we can quantitatively demonstrate the corrosive effect of corruption. However, the observation is still valid that econometric analysis is somewhat abstract and does not directly demonstrate how the success factors and corruption affect other decisions and project outcomes. The next step in our study is therefore the assembly of 11 case studies: detailed narratives that illustrate what it looks like when budget continuity is not assured, when stakeholders are ignored, when project goals are not articulated and accepted by the public, or when the choice of contractor is not made professionally, based on track record and competence; moreover, the causal interactions among the managerial success drivers will become apparent in the case studies.
The next chapters will in this way connect the econometric results with life on the ground. Then, we will be in a position to identify the core reasons for large public project failure in Nigeria; and, once we have identified the core reasons, we can try to offer sensible and practical recommendations.
Notes
1. The average completion probability of our 38 projects is, of course, 50%, because that is how the sample was constructed. However, as our regression is not linear, the success probability of the average parameter values is not the same as the average success probability; it is slightly offset.
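The offset described in this note follows from the non-linearity of the cumulative normal distribution used in a probit model. A minimal numerical sketch, using made-up index values rather than the study's data, shows that the average of the predicted probabilities differs from the probability predicted at the average index:

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # cumulative standard normal distribution

# Hypothetical probit index values for five projects (not from the study).
index_values = [-3.0, 0.5, 0.5, 0.5, 0.5]

mean_of_probabilities = sum(PHI(z) for z in index_values) / len(index_values)
probability_at_mean = PHI(sum(index_values) / len(index_values))

print("average of predicted probabilities:", round(mean_of_probabilities, 3))
print("probability at the average index:  ", round(probability_at_mean, 3))
```

The two numbers coincide only when the link between index and probability is linear, which it is not here.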
Appendices
Appendix 1 Correlations Among Independent Variables Across All 114 Responses
This appendix contains the customary correlations table, which shows that the variables are only weakly or moderately correlated.
Variable | I22 | I23 | I24 | I25 | E26 | E27 | E28 | E29 | E30 | E31 | E32 | E33 | E34 | E35 | E36 | E37 | E38 | E39 | E40 | E41 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
I22 | 1 | |||||||||||||||||||
I23 | 0.421*** | 1 | ||||||||||||||||||
I24 | 0.467*** | 0.748*** | 1 | |||||||||||||||||
I25 | 0.607*** | 0.608*** | 0.642*** | 1 | ||||||||||||||||
E26 | 0.670*** | 0.459*** | 0.506*** | 0.705*** | 1 | |||||||||||||||
E27 | 0.604*** | 0.415*** | 0.508*** | 0.683*** | 0.897*** | 1 | ||||||||||||||
E28 | 0.598*** | 0.492*** | 0.543*** | 0.616*** | 0.687*** | 0.644*** | 1 | |||||||||||||
E29 | 0.632*** | 0.432*** | 0.412*** | 0.633*** | 0.731*** | 0.699*** | 0.737*** | 1 | ||||||||||||
E30 | 0.695*** | 0.421*** | 0.465*** | 0.674*** | 0.810*** | 0.782*** | 0.646*** | 0.744*** | 1 | |||||||||||
E31 | 0.538*** | 0.179* | 0.225** | 0.509*** | 0.476*** | 0.417*** | 0.392*** | 0.468*** | 0.572*** | 1 | ||||||||||
E32 | 0.592*** | 0.280*** | 0.379*** | 0.551*** | 0.575*** | 0.506*** | 0.472*** | 0.512*** | 0.677*** | 0.834*** | 1 | |||||||||
E33 | 0.577*** | 0.189** | 0.304*** | 0.556*** | 0.503*** | 0.518*** | 0.456*** | 0.544*** | 0.671*** | 0.703*** | 0.789*** | 1 | ||||||||
E34 | 0.382*** | 0.210** | 0.163* | 0.432*** | 0.402*** | 0.372*** | 0.247*** | 0.263*** | 0.446*** | 0.390*** | 0.454*** | 0.453*** | 1 | |||||||
E35 | 0.404*** | 0.247*** | 0.189** | 0.465*** | 0.417*** | 0.426*** | 0.298*** | 0.360*** | 0.490*** | 0.484*** | 0.555*** | 0.544*** | 0.848*** | 1 | ||||||
E36 | 0.403*** | 0.275*** | 0.263*** | 0.520*** | 0.427*** | 0.443*** | 0.369*** | 0.437*** | 0.485*** | 0.277*** | 0.403*** | 0.559*** | 0.546*** | 0.643*** | 1 | |||||
E37 | 0.587*** | 0.372*** | 0.352*** | 0.566*** | 0.564*** | 0.511*** | 0.478*** | 0.473*** | 0.586*** | 0.469*** | 0.600*** | 0.539*** | 0.801*** | 0.819*** | 0.647*** | 1 | ||||
E38 | 0.397*** | 0.250*** | 0.257*** | 0.411*** | 0.326*** | 0.332*** | 0.231** | 0.275*** | 0.364*** | 0.341*** | 0.420*** | 0.382*** | 0.680*** | 0.627*** | 0.441*** | 0.675*** | 1 | |||
E39 | 0.705*** | 0.454*** | 0.502*** | 0.663*** | 0.630*** | 0.615*** | 0.536*** | 0.556*** | 0.634*** | 0.466*** | 0.601*** | 0.598*** | 0.618*** | 0.643*** | 0.594*** | 0.755*** | 0.615*** | 1 | ||
E40 | 0.720*** | 0.383*** | 0.434*** | 0.627*** | 0.660*** | 0.597*** | 0.538*** | 0.607*** | 0.685*** | 0.496*** | 0.634*** | 0.667*** | 0.563*** | 0.606*** | 0.540*** | 0.715*** | 0.526*** | 0.855*** | 1 | |
E41 | 0.626*** | 0.303*** | 0.375*** | 0.628*** | 0.620*** | 0.625*** | 0.466*** | 0.603*** | 0.695*** | 0.470*** | 0.607*** | 0.687*** | 0.571*** | 0.600*** | 0.532*** | 0.679*** | 0.492*** | 0.796*** | 0.899*** | 1 |
There are a number of moderate correlations; for example, G1, the existence of a well-defined supervision structure, is correlated with a number of positive outcomes. The highest correlation is between E39 (a realistic timeline) and I25 (budget risk scenarios), with a value of 0.663. In other words, the variables are different and not just repetitions of one another.
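For readers who want to reproduce a table of this shape, the following minimal sketch computes a lower-triangular matrix of Pearson correlations. The response values are invented for illustration, the variable names are reused only as labels, and the significance stars of the table above are not computed here.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def lower_triangle(columns):
    """Correlation of every variable with itself and all earlier variables."""
    names = list(columns)
    table = {}
    for i, row_name in enumerate(names):
        table[row_name] = {
            col_name: round(pearson(columns[row_name], columns[col_name]), 3)
            for col_name in names[: i + 1]
        }
    return table

# Made-up questionnaire scores for six respondents (assumption).
responses = {
    "I25": [3, 5, 1, 4, 2, 3],
    "E39": [4, 5, 2, 3, 1, 4],
    "E40": [4, 4, 2, 3, 2, 5],
}
corr = lower_triangle(responses)
for name, row in corr.items():
    print(name, row)
```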
Appendix 2 Factor Analysis
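Suppose for now that we have identified n = 3 factors. Thus, the realization of variable i (i runs from 1 to 41, as we have 41 variables) for questionnaire j (j runs from 1 to 114, as we have 114 questionnaires), xij, becomes:

$$x_{ij} = \sum_{k=1}^{3} \alpha_{ik}\,F_{kj} + \varepsilon_{ij}$$

Here we write Fkj for the value of factor k on questionnaire j, αik is the coefficient of variable i on factor k, and εij is the error.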
The factor analysis algorithm chooses a set of (3 × 114) numbers that minimize the “error”, εij, or the deviation from the actual collected numbers caused by representing the data with a smaller number of three new underlying variables. The hope is to find factors where each of the variables is indeed influenced only by a coefficient, αik, belonging to one factor, which then “represents” several variables—if each variable were equally influenced by all factors, we would not be able to condense the regression analysis. The factor analysis approach has two steps:
1. Exploratory factor analysis: We do not want to “presuppose” what the underlying factors are (we want to let the data speak rather than only look for what we thought at the outset might be there). The exploratory factor analysis identifies what number of “candidate factors” makes sense (how many underlying managerial characteristics are there really?) through a structural model (in which each variable is modelled as a linear combination of the factors).
2. Confirmatory factor analysis: This establishes the robustness of the candidate number of factors in the structural model. (More detail on the structural model can be found in Appendix 2.)
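The structure that factor analysis looks for can be illustrated with a small simulation. This is a sketch under stated assumptions and not the estimation procedure used in the study: it generates six hypothetical variables from two uncorrelated factors with made-up loadings, then checks that each variable correlates strongly with its own factor and only weakly with the other. In a real analysis the factor values are unobserved and have to be estimated.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical loadings: variables 1-3 belong to factor A, 4-6 to factor B.
TRUE_LOADINGS = [(0.90, 0.00), (0.80, 0.10), (0.85, 0.00),
                 (0.00, 0.90), (0.10, 0.80), (0.00, 0.85)]

rng = random.Random(1)
N_RESPONSES = 2000  # larger than the 114 questionnaires, to keep noise small
factor_a = [rng.gauss(0, 1) for _ in range(N_RESPONSES)]
factor_b = [rng.gauss(0, 1) for _ in range(N_RESPONSES)]

variables = []
for load_a, load_b in TRUE_LOADINGS:
    # The error term is scaled so that every variable has unit variance.
    noise_sd = math.sqrt(1 - load_a ** 2 - load_b ** 2)
    variables.append([load_a * fa + load_b * fb + rng.gauss(0, noise_sd)
                      for fa, fb in zip(factor_a, factor_b)])

# With standardized variables and uncorrelated factors, the correlation
# between a variable and a factor approximates the loading on that factor.
estimated = [(pearson(v, factor_a), pearson(v, factor_b)) for v in variables]
for i, (est_a, est_b) in enumerate(estimated, start=1):
    print(f"variable {i}: factor A {est_a:+.2f}, factor B {est_b:+.2f}")
```

A clean grouping of this kind, with one high value per row, is what makes it possible to replace several variables by one factor in the later regressions.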
The factor analysis was implemented through a family of statistical techniques known as structural equation modelling (SEM). SEM is widely used in social science research and analyses the structural relationship between the measured variables (the 41 items shown above) and the underlying (latent) constructs. This method is powerful because it estimates the multiple and interrelated dependence across variables and latent factors in a single analysis. The exploratory and confirmatory factor analysis was conducted using the sem package in R. The probit analysis was conducted using the glm function in R. A number of statistical tests were performed in order to ensure robustness of the factor analysis and the regressions.
This four-factor solution was validated through confirmatory factor analysis. Table 5.10 shows the final factor loadings, rather than just showing which variable was explained by the factors in Table 5.4 in the body of this chapter. The reader may remember that factor analysis pretends that each variable is a linear combination of the underlying factors, as shown above in the equation. A “factor loading” then represents the coefficient αik in the earlier equation, which connects this variable to each factor (normalized such that each variable’s factor loadings add up to 1).
Ideally, we want to have each variable represented by only one factor (i.e. all the bold numbers in the table above are 1, and all the non-bold numbers are 0), which would mean that four variable groups were each perfectly “summarized” by one factor. This does not work, of course, because real-life data is never that clean, and it would imply that the multiple variables in each group are all the “same”. They are not, however, as each one captures a separate “flavour” of the underlying factor and is therefore not the same (but the factor “abstracts” these flavour differences away).
Nonetheless, if the variables are indiscriminately determined by all factors, then the “variable consolidation” does not work because the factors do not “group” multiple variables into underlying characteristics. As a rule of thumb, if a variable has a loading of above 0.7 on one factor (which means that its loading on other factors must be low), then it is viewed as strongly expressing this factor. Table 5.4 shows that the final factor loadings from the confirmatory factor analysis are very strong indeed.
This factor model is statistically robust (summary statistics are shown above) and has strong factor loadings. A factor model is weak if many variables attach to more than one factor and thus do not strongly represent one “underlying management characteristic”. However, in this model, few variables attach to more than one factor; “cross-factor loadings” are few, and they are not very strong (only one variable, E41 = risk plan quality, touched upon all factors and had to be taken out). Therefore, we can conclude that this set of factors is robust.
Appendix 3 Specification of the Logistical Regression
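As the outcomes are binary (completed or abandoned), we use a probit model based on the assumption that Prob(Y = 1) = Φ(Xᵀβ), where Y is the vector of outcomes (0s or 1s corresponding to abandonment or completion), X is the vector of independent variables (the factor scores), β is the vector of coefficients, which are the parameters to be estimated, and Φ is the cumulative standard normal distribution. We then estimate β by maximizing the log likelihood function:

$$\ln L(\beta) = \sum_{j} \left[\, y_j \ln \Phi\!\left(x_j^{T}\beta\right) + (1 - y_j) \ln\!\left(1 - \Phi\!\left(x_j^{T}\beta\right)\right) \right]$$

where yj and xj are the outcome and the vector of independent variables of observation j.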
Logarithms of the likelihood are taken in order to turn the product of the individual observations’ likelihoods into a sum, which is easier to maximize. If the value of an independent variable changes, it shifts the index Xᵀβ and thereby, through Φ, the estimated probability of the project being completed.
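As an illustration of how such a likelihood is evaluated and maximized, the following minimal sketch fits a probit model with a single covariate and no intercept by a simple grid search. The eight data points are invented, and the grid search stands in for the numerical optimizer that a statistics package would use; neither reflects the study's data or software.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # cumulative standard normal distribution

def probit_loglik(beta, xs, ys):
    """Log likelihood of a one-covariate probit model without intercept."""
    eps = 1e-12  # guards against log(0) for extreme index values
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(PHI(beta * x), eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

# Made-up factor scores and outcomes (1 = completed, 0 = abandoned).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0, -1.0, 1.0]
ys = [0, 0, 0, 1, 1, 1, 1, 0]

# Grid search over candidate coefficients from -3.00 to 3.00.
grid = [i / 100 for i in range(-300, 301)]
beta_hat = max(grid, key=lambda b: probit_loglik(b, xs, ys))

print("estimated coefficient:", beta_hat)
print("log likelihood at estimate:", round(probit_loglik(beta_hat, xs, ys), 3))
print("log likelihood at zero:    ", round(probit_loglik(0.0, xs, ys), 3))
```

At a coefficient of zero every project is assigned a completion probability of 50%, so the log likelihood there serves as a natural baseline against which the fitted value can be compared.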
Appendix 4 The Logistical Completion Probability Regression by Respondent Group
The three respondent-specific regressions are shown next to one another in Table 5.11.
The contractor regression adds a twist by including the size of the project as an additional variable (which implies that the larger the project gets, the more contractors struggle). The overall levels of explained variance are lower, because one variable is missing (namely, the disagreement among respondents), and significance levels are lower because the data points in each regression are only one-third of the overall regression (as Table 5.12, Factor 2 illustrates). However, the qualitative shape of the results stays robust across the three types of respondent, as demonstrated again by the size of the coefficients of the independent variables being compared.
Because of the lower significance levels, due to smaller numbers of data points, not every factor is statistically significant for every respondent group in Table 5.10. However, each factor is significant for at least two respondent groups, so the overall conclusion remains robust that all four factors matter.
Appendix 5 Robustness Analysis: Cost Overrun Regressions by Respondent Group
Appendix 6 Robustness Analysis: Schedule Overrun Regressions by Respondent Group
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Ibrahim, J., Loch, C., Sengupta, K. (2022). Insights from the Analysis of the Questionnaires. In: How Megaprojects Are Damaging Nigeria and How to Fix It. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-96474-0_5
DOI: https://doi.org/10.1007/978-3-030-96474-0_5
Publisher Name: Palgrave Macmillan, Cham
Print ISBN: 978-3-030-96473-3
Online ISBN: 978-3-030-96474-0
eBook Packages: Business and Management (R0)