SAT Scores and Graduation Rates at 21 Large California Schools
Question: What is the relationship between graduation
rate and SAT scores at large public universities in California?
Data: Data on the graduation rate and verbal and math SAT scores for 21 large four-year universities in the state of California are reported in the table below.
SAT Scores for 21 Large Four-Year Public Universities in California

| School | Graduation Rate (%) | Verbal 25 | Verbal 75 | Math 25 | Math 75 |
|---|---|---|---|---|---|
| Cal Poly | 71 | 550 | 650 | 590 | 690 |
| UC Berkeley | 91 | 590 | 720 | 630 | 770 |
| UCLA | 91 | 560 | 680 | 600 | 760 |
| UCSD | 86 | 550 | 660 | 620 | 730 |
| San Jose State | 48 | 440 | 550 | 470 | 600 |
| UC Davis | 81 | 510 | 640 | 560 | 680 |
| UC Irvine | 86 | 460 | 600 | 530 | 670 |
| SDSU | 66 | 480 | 590 | 500 | 610 |
| Cal State Poly | 52 | 460 | 570 | 490 | 620 |
| Cal State Sacramento | 42 | 410 | 520 | 430 | 540 |
| UCSB | 80 | 530 | 650 | 560 | 690 |
| San Francisco State University | 46 | 430 | 550 | 450 | 560 |
| Cal State Chico | 57 | 450 | 550 | 460 | 570 |
| Cal State Long Beach | 59 | 440 | 550 | 460 | 590 |
| Cal State Fullerton | 52 | 450 | 550 | 470 | 480 |
| Cal State Los Angeles | 36 | 380 | 480 | 390 | 510 |
| University of California, Riverside | 66 | 470 | 580 | 500 | 630 |
| Cal State Northridge | 47 | 400 | 510 | 400 | 530 |
| Cal State San Bernardino | 42 | 390 | 490 | 400 | 510 |
| Cal State Fresno | 48 | 400 | 510 | 410 | 530 |
| UC Santa Cruz | 74 | 470 | 610 | 490 | 620 |

The source of the data is the College Scorecard website, as queried on June 2, 2016.
I set state to California, degree type to four-year, school type to public, and size to large.
Discussion of data:
The graduation rate in this study is the six-year graduation rate at schools that offer four-year degrees, for students who were enrolled full time in their first year.
Test scores are shown for all schools that report them. The scores listed are the 25th and 75th percentiles of verbal and math SAT scores.
Descriptive Statistics: Graduation Rates and SAT Scores, California Schools

| Statistic | Graduation Rate (%) | Verbal 25 | Verbal 75 | Math 25 | Math 75 |
|---|---|---|---|---|---|
| Mean | 62.9 | 467.6 | 581.4 | 495.7 | 613.8 |
| STD | 17.8 | 60.3 | 65.9 | 74.1 | 84.6 |
| Min | 36.0 | 380.0 | 480.0 | 390.0 | 480.0 |
| Max | 91.0 | 590.0 | 720.0 | 630.0 | 770.0 |

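For readers who want to check the summary table, here is a minimal Python sketch that recomputes the descriptive statistics from the 21 rows of the data table. It is not the code used for this post. The names `DATA`, `COLUMNS`, `describe`, and `summary` are made up for illustration, the school names use corrected spellings, and the use of the sample standard deviation (dividing by n - 1) is my assumption about how the STD row was computed.

```python
import statistics

# Rows copied from the data table:
# (school, graduation rate %, verbal 25th, verbal 75th, math 25th, math 75th)
DATA = [
    ("Cal Poly", 71, 550, 650, 590, 690),
    ("UC Berkeley", 91, 590, 720, 630, 770),
    ("UCLA", 91, 560, 680, 600, 760),
    ("UCSD", 86, 550, 660, 620, 730),
    ("San Jose State", 48, 440, 550, 470, 600),
    ("UC Davis", 81, 510, 640, 560, 680),
    ("UC Irvine", 86, 460, 600, 530, 670),
    ("SDSU", 66, 480, 590, 500, 610),
    ("Cal State Poly", 52, 460, 570, 490, 620),
    ("Cal State Sacramento", 42, 410, 520, 430, 540),
    ("UCSB", 80, 530, 650, 560, 690),
    ("San Francisco State University", 46, 430, 550, 450, 560),
    ("Cal State Chico", 57, 450, 550, 460, 570),
    ("Cal State Long Beach", 59, 440, 550, 460, 590),
    ("Cal State Fullerton", 52, 450, 550, 470, 480),
    ("Cal State Los Angeles", 36, 380, 480, 390, 510),
    ("UC Riverside", 66, 470, 580, 500, 630),
    ("Cal State Northridge", 47, 400, 510, 400, 530),
    ("Cal State San Bernardino", 42, 390, 490, 400, 510),
    ("Cal State Fresno", 48, 400, 510, 410, 530),
    ("UC Santa Cruz", 74, 470, 610, 490, 620),
]

COLUMNS = ["Graduation Rate", "Verbal 25", "Verbal 75", "Math 25", "Math 75"]


def describe(values):
    """Return mean, sample standard deviation, min, and max of a list."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # assumption: sample std (n - 1)
        "min": min(values),
        "max": max(values),
    }


# Column i of COLUMNS sits at position i + 1 in each row (position 0 is the name).
summary = {
    name: describe([row[i + 1] for row in DATA])
    for i, name in enumerate(COLUMNS)
}

for name, stats in summary.items():
    print(
        f"{name:16s} mean={stats['mean']:.1f} std={stats['std']:.1f} "
        f"min={stats['min']:.1f} max={stats['max']:.1f}"
    )
```

If the standard deviations were instead computed with the population formula, swapping `statistics.stdev` for `statistics.pstdev` would be the only change needed.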

Logistic Regression Results:
I estimated logistic regression models for the graduation rate variable. The dependent variable in the logistic regression model is the log of the odds of graduating, ln(p / (1 - p)), where p is the graduation rate expressed as a proportion. I estimated several models with various SAT scores as explanatory variables. The SAT variable used in the model presented below is the average of four SAT scores: the verbal 25th and 75th percentiles and the math 25th and 75th percentiles.
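To make the two model variables concrete, here is a small Python sketch that builds them for one row of the data table. The function names `average_sat` and `log_odds` are hypothetical helpers for illustration, not part of any library, and the sketch assumes the graduation rate is converted from a percentage to a proportion before taking the log of the odds.

```python
import math


def average_sat(verbal_25, verbal_75, math_25, math_75):
    """Explanatory variable: simple average of the four SAT percentiles."""
    return (verbal_25 + verbal_75 + math_25 + math_75) / 4


def log_odds(graduation_rate_pct):
    """Dependent variable: log of the odds of graduating.

    Assumes the rate is given in percent and lies strictly between 0 and 100.
    """
    p = graduation_rate_pct / 100
    return math.log(p / (1 - p))


# UC Berkeley row from the data table: graduation rate 91,
# verbal 590 / 720, math 630 / 770.
berkeley_sat = average_sat(590, 720, 630, 770)
berkeley_log_odds = log_odds(91)

print(f"average SAT = {berkeley_sat}, log odds = {berkeley_log_odds:.4f}")
```

A graduation rate above 50 percent gives positive log odds and a rate below 50 percent gives negative log odds, so the transformed variable is not bounded the way the raw percentage is.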
Logistic Regression of Graduation Rates on SAT Information

| Variable | Coeff. | t-stat | Adjusted R2 |
|---|---|---|---|
| Average Four SAT Scores | 0.0119542 | 11.09 | 0.8591 |
| Constant Term | -5.807844 | -9.9 | |

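One way to read these coefficients is to turn them back into a predicted graduation rate. The Python sketch below does that for two schools from the data table. It takes the constant term as negative (-5.807844), which is my reading of the results: with a positive constant the model would predict graduation rates near 100 percent at every school. The names `SLOPE`, `INTERCEPT`, and `predicted_graduation_rate` are made up for illustration.

```python
import math

SLOPE = 0.0119542      # coefficient on the average of the four SAT scores
INTERCEPT = -5.807844  # assumption: constant term taken as negative


def predicted_graduation_rate(avg_sat):
    """Invert the log odds: p = 1 / (1 + exp(-(a + b * SAT)))."""
    z = INTERCEPT + SLOPE * avg_sat
    return 1 / (1 + math.exp(-z))


# Average of the four SAT percentiles for two rows of the data table.
berkeley = predicted_graduation_rate((590 + 720 + 630 + 770) / 4)
cal_state_la = predicted_graduation_rate((380 + 480 + 390 + 510) / 4)

# Multiplicative change in the odds of graduating for a 10-point
# increase in the average SAT score.
odds_ratio_per_10_points = math.exp(10 * SLOPE)

print(f"UC Berkeley predicted rate:        {berkeley:.3f}")
print(f"Cal State Los Angeles predicted:   {cal_state_la:.3f}")
print(f"Odds ratio per 10 SAT points:      {odds_ratio_per_10_points:.3f}")
```

Under that assumption the predictions come out close to the observed rates of 91 and 36 percent for these two schools, and a 10-point rise in the average SAT score multiplies the odds of graduating by roughly 1.13.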

Observation on Results: The average of the four SAT scores is highly
significantly related to graduation rate at the 21 large public universities in
California. These results are not
sensitive to the choice of SAT statistic used as an explanatory variable. Results are in fact extremely robust.
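A robustness check of this kind can be sketched in a few lines of Python. The version below regresses the log odds of graduating on each SAT statistic in turn, using ordinary least squares written out by hand. It is a sketch under my own assumptions, not the code used for this post: `ROWS`, `log_odds`, `simple_ols`, `regressors`, and `fits` are illustrative names, and I am assuming the models were fit by least squares on the log odds, which is what the t-statistics and adjusted R2 suggest.

```python
import math

# (graduation rate %, verbal 25th, verbal 75th, math 25th, math 75th),
# in the same order as the data table.
ROWS = [
    (71, 550, 650, 590, 690), (91, 590, 720, 630, 770),
    (91, 560, 680, 600, 760), (86, 550, 660, 620, 730),
    (48, 440, 550, 470, 600), (81, 510, 640, 560, 680),
    (86, 460, 600, 530, 670), (66, 480, 590, 500, 610),
    (52, 460, 570, 490, 620), (42, 410, 520, 430, 540),
    (80, 530, 650, 560, 690), (46, 430, 550, 450, 560),
    (57, 450, 550, 460, 570), (59, 440, 550, 460, 590),
    (52, 450, 550, 470, 480), (36, 380, 480, 390, 510),
    (66, 470, 580, 500, 630), (47, 400, 510, 400, 530),
    (42, 390, 490, 400, 510), (48, 400, 510, 410, 530),
    (74, 470, 610, 490, 620),
]


def log_odds(rate_pct):
    """Log of the odds for a rate given in percent."""
    p = rate_pct / 100
    return math.log(p / (1 - p))


def simple_ols(x, y):
    """One-variable least squares; returns (slope, intercept, adjusted R2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)  # one regressor plus a constant
    return slope, intercept, adj_r2


y = [log_odds(row[0]) for row in ROWS]
regressors = {
    "Verbal 25": [row[1] for row in ROWS],
    "Verbal 75": [row[2] for row in ROWS],
    "Math 25": [row[3] for row in ROWS],
    "Math 75": [row[4] for row in ROWS],
    "Average of four": [sum(row[1:]) / 4 for row in ROWS],
}

fits = {name: simple_ols(x, y) for name, x in regressors.items()}
for name, (slope, intercept, adj_r2) in fits.items():
    print(f"{name:16s} slope={slope:.5f} intercept={intercept:.3f} adj R2={adj_r2:.3f}")
```

Comparing the sign and size of the slope and the adjusted R2 across the five rows of output is a quick way to see how much the conclusion depends on which SAT statistic is used.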
A thought about the
importance of tests: These results
suggest that colleges with smart kids, as measured by SAT performance, also
have high graduation rates. However,
the ability to test well may not be the cause of the higher graduation
rate. The model needs to be expanded to
hold other economic and socioeconomic variables constant in order to say more
about this issue.
Further Work: I am interested in this topic because of
potential policy implications and because it is an interesting statistical
problem.
I would like to extend the model to consider other issues of
potential interest to both policy makers and consumers of education.
How does the impact of SAT scores differ for large schools versus small schools?
How does the relationship between graduation rate and SAT scores at large public schools in Texas compare with that at large public schools in California?
Is the SAT as important a determinant of graduation rate at private universities?
I will use these other issues to teach more about a number of statistical issues, including multicollinearity, interpreting logistic regression models, and hypothesis testing.
More work on these topics will follow.