## Wednesday, July 6, 2016

### Impact of SAT and School Size on Graduation Rates in 31 State Universities in California

Question: The table below contains information on graduation rates and SAT scores for large and mid-size state universities in California. Estimate a regression in which the graduation rate is a function of SAT score and a school-size dummy variable. Are students at mid-size schools more or less likely to graduate than students at large schools when SAT score is held constant?

Data:

The data used to analyze the impact of school size and the SAT measure on the odds of graduating from a public university in California are presented in the table below.

Information on Graduation Rates, SAT Performance and School Size for Four-year Public Universities in California

| Public Universities in California | Odds of Graduating On Time | SAT Measure | Large School Dummy |
|---|---|---|---|
| Cal Poly | 2.45 | 620 | 1 |
| UC Berkeley | 10.11 | 677.5 | 1 |
| UCLA | 10.11 | 650 | 1 |
| UCSD | 6.14 | 640 | 1 |
| San Jose State | 0.92 | 515 | 1 |
| UC Davis | 4.26 | 597.5 | 1 |
| UC Irvine | 6.14 | 565 | 1 |
| SDSU | 1.94 | 545 | 1 |
| Cal State Poly | 1.08 | 535 | 1 |
| Cal State Sacramento | 0.72 | 475 | 1 |
| UCSB | 4.00 | 607.5 | 1 |
| San Francisco State University | 0.85 | 497.5 | 1 |
| Cal State Chico | 1.33 | 507.5 | 1 |
| Cal State Long Beach | 1.44 | 510 | 1 |
| Cal State Fullerton | 1.08 | 487.5 | 1 |
| Cal State Los Angeles | 0.56 | 440 | 1 |
| University of California Riverside | 1.94 | 545 | 1 |
| Cal State Northridge | 0.89 | 460 | 1 |
| Cal State San Bernardino | 0.72 | 447.5 | 1 |
| Cal State Fresno | 0.92 | 462.5 | 1 |
| UC Santa Cruz | 2.85 | 547.5 | 1 |
| Cal State East Bay | 0.64 | 455 | 0 |
| Cal State San Marcos | 0.85 | 482.5 | 0 |
| Sonoma State University | 1.17 | 502.5 | 0 |
| Cal State Channel Islands | 1.04 | 477.5 | 0 |
| Cal State Bakersfield | 0.64 | 452.5 | 0 |
| Cal State Dominguez Hills | 0.39 | 425 | 0 |
| Cal State Monterey Bay | 0.61 | 485 | 0 |
| Cal State Stanislaus | 1.00 | 460 | 0 |
| Humboldt State University | 0.67 | 507.5 | 0 |
| University of California Merced | 1.33 | 510 | 0 |

The SAT measure is the average of four numbers: the 25th and 75th percentiles of both the math and verbal SAT scores.

The large school dummy is set to 1 if the school has more than 15,000 undergraduates and to 0 if the school has between 2,000 and 15,000 undergraduates.
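Because the table reports odds while the question is posed in terms of graduation rates, it may help to spell out the conversion. Assuming the usual definition of odds (the post does not state it), with $p$ the graduation rate:

$$
\text{odds} = \frac{p}{1 - p}, \qquad p = \frac{\text{odds}}{1 + \text{odds}}
$$

Under that assumption, odds of 10.11 correspond to $p = 10.11 / 11.11 \approx 0.91$, and odds of 1.00 correspond to $p = 0.50$.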

Regression Results: I ran a regression model in which the dependent variable is the log of the odds that a student graduates within six years of entering school.

The explanatory variables used in the model are the SAT measure and the dummy variable set to 1 if the school has more than 15,000 undergraduates and 0 otherwise.

The regression results are laid out in the table below.

Regression Results for Graduation Rate Equation

| Variable | Coeff. | t-stat |
|---|---|---|
| SAT | 0.005 | 11.9 |
| LARGE | 0.057 | 0.95 |
| CONSTANT | -2.53 | -12.1 |
| R² (%) | 86.5 | |
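A minimal sketch of how this regression could be re-estimated from the data table, in plain Python with no external libraries. It assumes ordinary least squares and base-10 logs (the post does not say which log base was used), and the names `LARGE_SCHOOLS`, `MID_SCHOOLS`, `solve`, and `fit_log_odds` are mine. Its output need not match the table above exactly.

```python
import math

# (odds of graduating, SAT measure) pairs copied from the data table.
LARGE_SCHOOLS = [  # LARGE = 1
    (2.45, 620), (10.11, 677.5), (10.11, 650), (6.14, 640), (0.92, 515),
    (4.26, 597.5), (6.14, 565), (1.94, 545), (1.08, 535), (0.72, 475),
    (4.00, 607.5), (0.85, 497.5), (1.33, 507.5), (1.44, 510), (1.08, 487.5),
    (0.56, 440), (1.94, 545), (0.89, 460), (0.72, 447.5), (0.92, 462.5),
    (2.85, 547.5),
]
MID_SCHOOLS = [  # LARGE = 0
    (0.64, 455), (0.85, 482.5), (1.17, 502.5), (1.04, 477.5), (0.64, 452.5),
    (0.39, 425), (0.61, 485), (1.00, 460), (0.67, 507.5), (1.33, 510),
]
ROWS = [(o, s, 1.0) for o, s in LARGE_SCHOOLS] + [(o, s, 0.0) for o, s in MID_SCHOOLS]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_log_odds(rows):
    """OLS of log10(odds) on [constant, SAT, LARGE]; returns (beta, t-stats, R^2)."""
    X = [[1.0, float(sat), large] for _, sat, large in rows]
    y = [math.log10(odds) for odds, _, _ in rows]  # base 10 is an assumption
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    sse = sum(e * e for e in resid)
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    s2 = sse / (n - k)  # residual variance
    # Diagonal of (X'X)^-1, obtained by solving against each unit vector.
    inv_diag = [
        solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j] for j in range(k)
    ]
    tstats = [b / math.sqrt(s2 * d) for b, d in zip(beta, inv_diag)]
    return beta, tstats, 1.0 - sse / sst


beta, tstats, r2 = fit_log_odds(ROWS)
for name, b, t in zip(["CONSTANT", "SAT", "LARGE"], beta, tstats):
    print(f"{name:8s} coeff={b: .4f}  t={t: .2f}")
print(f"R^2 = {r2:.3f}")
```

Swapping `math.log10` for `math.log` gives the natural-log version; the t-statistics and R² are unaffected by the choice of base, while the coefficients scale by a constant factor.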

SAT score is significantly related to the log of the odds of graduating. Higher SAT scores are associated with higher odds of graduating.

School size is NOT significantly related to graduation rate. The coefficient on LARGE is positive (0.057), but its t-statistic is only 0.95.

The constant term is highly significant and negative. The constant term is the predicted log odds of graduating for mid-size schools when the SAT measure is zero. The SAT measure can never be zero because the minimum value of the SAT is 200. It is therefore difficult to interpret the meaning of the negative constant term in the estimated regression.
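A quick calculation with the reported coefficients illustrates why the constant is hard to read. For a mid-size school (LARGE = 0), the fitted equation is roughly:

$$
\widehat{\log(\text{odds})} = -2.53 + 0.005 \times \text{SAT}
$$

At the SAT floor of 200 this gives $-2.53 + 0.005 \times 200 = -1.53$, and at the lowest SAT measure in the data table (425) it gives $-2.53 + 0.005 \times 425 = -0.405$. These figures use the rounded coefficients from the results table, so they are approximate. The point is only that SAT = 0 lies far outside the observed range of 425 to 677.5, so the constant mostly anchors the fitted line rather than describing any real school.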

Should I remove the constant term from the regression?

The existence of the significant constant term in this regression suggests to me that the model is missing important variables and the results may not be very robust.

The literature on whether regressions should be estimated without a constant term is mixed. Here are some links to this topic.

Since a significant constant term indicates to me that the model may be mis-specified, and in particular that some variables related to the graduation rate may have been omitted, I reran the regression with the constant term omitted. I also omitted the school size dummy (LARGE) because it was not significant in the original regression.

When I reran the model with the constant term and the size variable omitted, I got a positive but insignificant coefficient for the SAT variable.
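For reference, a sketch of what dropping the constant does mechanically. With the size dummy also removed, the model becomes $y_i = \beta x_i + \varepsilon_i$, where $y_i$ is the log odds and $x_i$ the SAT measure (the symbols are mine, not the post's), and the textbook least-squares slope is:

$$
\hat{\beta}_{\text{no constant}} = \frac{\sum_i x_i y_i}{\sum_i x_i^2}
\qquad \text{versus} \qquad
\hat{\beta}_{\text{with constant}} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
$$

The first form forces the fitted line through the point where both the SAT measure and the log odds are zero. Since every school in the sample has an SAT measure between 425 and 677.5, that constraint is imposed far from the data, which is one reason the two specifications can produce very different SAT coefficients.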

Concluding Thoughts: Simply glancing at the data indicates that high-SAT schools have higher graduation rates. However, the model results are not very robust. I believe other variables are as important as the SAT average, including (1) the socio-economic status of the students at the school and (2) the percent of students who attend part time.

Also, the sample used to construct this model is small: only 31 schools.

More work will follow probably in August of 2016.

Author's Note: I have created a new blog devoted to the creation and explanation of policy proposals that will help improve our world. My first policy proposal examines whether allowing private course providers to teach courses inside public schools will help improve educational outcomes.

My post identifies three possible improvements that might be realized by this policy change.

First, the available experience to date suggests that it is very difficult to close poorly performing public schools.   It is more economically and politically feasible to offer a private alternative to the math or reading departments of a school when student math and reading scores are low inside a district.

Second, many school departments offer relatively few foreign language courses.  The expansion of foreign language courses to many school districts is likely to prove to be more cost effective when courses are offered through a private firm rather than through each school district separately.

Third, most school systems lack rigorous courses in computer programming. Many young people from affluent families are learning computer programming in summer camps. As a result, there is a growing STEM knowledge gap between children from affluent families and the rest of society. It would be very difficult for most school systems to offer advanced programming classes. Private companies have proven they can offer good classes and should be allowed to offer these classes inside the public schools.