| |
Generating the Regression Analysis
- Download the data from http://pubpages.unh.edu/~bte/growth.htm and save it on your computer. Open Microsoft Excel and then open the file you have just downloaded from the internet.
- You need to calculate the average growth rate of GDP per worker between 1960 and 1997 for each country you have in the dataset. You can do this using the following formula;

where g is the growth rate of GDP per worker between 1960 and 1997 (why do you need to use logarithms?, see Jones [2002: 203]) and i denotes the country.
- To create a natural logarithmic of your GDP data, in an open column, hit =LN(column and row). Ex: =ln(D7). Hit return. Click the cell and left drag mouse down to end of column. This will replicate the formula to all cells/countries. Notice that you must repeat this procedure twice: once to calculate the natural logarithm of GDP per worker in 1960 and another to calculate the natural logarithm of GDP per worker in 1997. Label each column accordingly.
- Calculate g in an open column using the formula from step b) and the procedures from step c). Ex: =(G7 - F8)/37. Type the formula and hit return. Click the cell and left drag mouse down to end of column.
- Constructing the Charts. These steps will be done twice, once for the relationship between corruption and the growth rate of GDP and once for the relationship between corruption and the level of income in 1997.
- On Excel, click on a blank cell;
- Click on the INSERT menu;
- Click on CHART
- On Chart Type, choose XY (Scatter)
- Click on NEXT
- Click on the SERIES tab,
- Click on ADD
- Click on the box X Values
- Select the worksheet range that has the data on corruption (Corruption Index must be in the X values)
- Click on the box Y Values
- Select the worksheet range that has the data on GDP (GDP Growth or per worker income must be in the Y values)
- Click on NEXT
- Type the chart title (choose a title)
- Type the X title (corruption Index)
- Type the Y title (growth rate/level of income)
- Click on NEXT
- Click on As new Worksheet
- Click on FINISH
- Note: You must construct two charts: Growth Rate versus Corruption and Income Level versus Corruption
f) Add a Trendline and Equation to chart
- Click on the CHART (only one click)
- On the CHART menu, click ADD TRENDLINE
- On the TYPE tab, click on LINEAR
- On the OPTIONS tab, enable checkbox (click on) “Display equation on chart” and also “Display R-squared value on chart.” Hit OK
- Save file (file, save as, provide a file name, click on OK)
If you are using a different period, you may apply the same formula, but note that the denominator of the formula will change because it is the difference in years between the initial and the final period.
Return to the top of the page |
| |
Generating and Evaluating Statistical Significance
To test the statistical significance of the coefficients you must run a regression in the Excel. Instructions on how to do this are presented below:
- Open the Excel worksheet that contains the required data (level of GDP, growth rate of GDP, and the corruption index)
- Click on TOOLS menu
- Click on DATA ANALYSIS
- Click on REGRESSION and hit OK
- On INPUT Y RANGE, click on
and select the cells in the worksheet that contains the GDP data. Click on to return to regression box;
- On INPUT X RANGE, click on
and select the cells in the worksheet that contains the corruption index. Click on to return to regression box;
- On OUTPUT OPTIONS, click on NEW WORKSHEET and type a name for it.
- Hit OK
- Excel will generate a new worksheet that contains the regression analysis. Note that the first section of the Excel output reports the R-Squared and the last section of the output reports the coefficient of corruption and its standard error, the t-ratio (test statistic), and the confidence interval.
If this option is not available in your computer, do the following:
a) Click on TOOLS, b) Click on ADD-INS, c) click on the blank box associated with the ANALYSIS TOOLPAK ( you must see a check mark in this option), d) Hit OK, e) follow Excel instructions (if any). This will install the regression package in your computer.
Return to the top of the page
|
| |
Interpreting the R-Squared and Conducting Hypothesis Tests
Consider the following simple linear model:
(A1)
Where Y is a variable to be explained, X is an explanatory variable, e is an error term, a is the intercept of the equation, and b is the slope of the equation. Using this simple model, one can evaluate the relationship between X and Y and identify the effects of a change of X on Y. This analysis can be done looking the value of the R-Squared or testing the statistical significance of b.
i. Interpreting the R-Squared
The R-Squared or coefficient of determination is a measure of the fit of the regression and it tells us how well the estimated line fits the data. In other words, it measures how much of the variation in Y is explained by the variation of X. For example, a R-Squared of 0.7 implies that 70% of the variation in Y is explained by variation in X.
ii. Hypothesis Testing
A good rule of thumb for testing the statistical significance of a coefficient is to look if its confidence interval includes the value zero. If this is the case, the coefficient is said to be statistically insignificant (not able to be statistically distinguished from zero) and so X and Y are not linearly related. Otherwise, the coefficient is said to be statistically significant and so X and Y have some relationship. The Excel program provides the confidence interval. Search in the last section of the output for the “LOWER 95%” and “UPPER 95%” values. These are the lower and upper bounds of the confidence interval. If it includes zero, the coefficient is statistically insignificant.
Another way to evaluate whether X and Y are related is to conduct a hypothesis test. The standard approach consists in testing if b is statistically different from zero, since a b equal to zero implies that there is no linear relationship between X and Y. The formal procedure requires stating a null and an alternative hypothesis:
Null hypothesis: b = 0 Þ (no relationship between X and Y)
Alternative hypothesis: b ¹ 0 Þ (X and Y are related)
The decision rule is based on the t-ratio calculated using the estimated coefficient (b) and the value from the t-student or normal distributions. The t-ratio, under the null hypothesis, is given by:
(A2)
where is the estimated value of the coefficient of X and is the standard deviation of . The test statistic has t-student distribution with (n - 2) degrees of freedom. Given a particular level of significance, the decision rule is to reject the null hypothesis if the absolute value of the test statistic (given by equation A2) exceeds the t statistic (critical value) acquired from the table of the t distribution. For example, consider that equation (A1) is estimated using 27 observations and with the following results:
=1.2 and =0.5
Therefore: 
At 5% of significance and for a two-tail test, one will find from a table of the t-student distribution the t statistic equal to 2.06. Since the test statistic (3.2) is greater than the critical value (2.06) we conclude that the null hypothesis (no relationship between X and X) should be rejected at 5% of significance, which implies that X and Y are linearly related.
Return to the top of the page
|
| |
Think Deeply About What You Are Doing
Now that you have some results, start thinking broadly about what corruption is and how it affects incomes and/or growth rates. In order to do this thinking, read chapter 12 in William Easterly’s book The Elusive Quest for Growth. The data on corruption is taken from Transparency International (2002), “Corruption Perceptions Index,” http://www.transparency.org/cpi/index.html .Get on their webpage and check out what they say as an additional way to familiarize yourself with the data and the issue generally.
i. Connecting Theory with Empirical Results
Now that you have some results and some thoughts on the effects of corruption, you are ready to begin your analysis. What you want to do is connect the statistical work in section 3.1 with your knowledge developed from your readings and from the data with the traditional Solow growth model. The first thing you will note is that corruption is not mentioned in the Solow growth model. But all is not lost. Much of the work on corruption suggests that it does its damage by decreasing investment. You should check this for yourself by regressing the investment share against the corruption index. Given this, the effect of corruption can be modeled through its impact on investment. Use this analysis to make a prediction of the effects of an increase in corruption on income levels and on growth rates. In other words, do the analysis in terms of growth versus level effects. This will involve a detailed explanation of the Solow model and the Solow graphs. Next, write up your statistical results. Relate your results to the predictions of the Solow model. How could we get growth effects from corruption?
ii. Developing a Critique of the Analysis
Next think about the model and the statistical results. Based on your knowledge of statistics (econometrics) what is wrong with this analysis? In order for your results to be robust, what needs to be done? In other words, now that you have done all this work, criticize it as best as you can.
iii) The Final Result
In the end, you will have a mini paper on the economics of corruption. It will have the following sections.
1. Introduction - discuss the general issue of how corruption can impact development
2. Corruption in the Solow Framework
3. Data and statistical results
4. Critique of the statistical results
5. Conclusion
Return to the top of the page |