Should Your Kids Go to a Charter School?

Comparing CAASPP Test Results for Charter and Public School 6th, 7th, and 8th Graders

Problem Summary

Our hypothesis is that 6th-, 7th-, and 8th-grade charter school students perform better than public school students in the same grades on the ELA and mathematics standardized tests. We will measure this by checking whether charter schools have a smaller percentage of students failing to meet standards and a larger percentage of students meeting or exceeding standards on the CAASPP, California's standardized test.

We also implicitly assume that higher performance on the CAASPP indicates greater student learning.

Findings Summary

By conducting independent two-sample t-tests on the percentage of students who did not meet standards and on the percentage who were at or above standards, we found significant evidence of a real difference between the means of public schools and locally funded charter schools. However, there was not the same evidence for the direct-funded charter schools.

Our findings are tentative at this point. We have not separated our data further into subcategories. For instance, can this difference in performance be accounted for by a difference in the gender or income level of attending students? It is possible that there are more locally funded charter schools in higher-income areas, meaning that the difference in performance could be accounted for by family income. Is it possible that 2017 was just a good year for charter schools? How do other years compare? These are all areas for further analysis.

What is a charter school?

There are three types of schools we will be looking at: public schools, direct-funded charter schools, and locally funded charter schools.

A charter school is an independently run public school granted more operational flexibility in return for higher performance accountability. Each school is established by charter. This charter is essentially a contract detailing the school's mission, program, and performance goals.

Charter schools are public schools in the sense that they are free to attend. However, unlike public schools, charter schools are schools of choice, meaning that families choose to send their kids to them. These schools operate with freedom from some of the regulations that are typically imposed upon district schools but are still accountable for their academic results and for upholding the promises made in their charters. Charter schools must also participate in state standardized testing. They must demonstrate performance in academic achievement, financial management, and organizational stability. If a charter school does not meet its performance goals, it can be closed.

How are charter schools funded?

There are two funding types for charter schools. Direct-funded charter schools elect to receive funding directly from the state, whereas locally funded charter schools get their funding from their local education agency/school district.

In [140]:
import pandas as pd
import sqlite3
import numpy as np 
import matplotlib.pyplot as plt 
import matplotlib.patches as mpatches
from scipy.stats import relfreq
from scipy.stats import ttest_ind
from scipy.stats import f_oneway


df_ca = pd.read_csv("ca2017.txt")
df_ca_entities = pd.read_csv("ca2017_entities.txt")

df_ca.columns = df_ca.columns.str.replace(' ','_' ).str.lower()
df_ca_entities.columns = df_ca_entities.columns.str.replace(' ','_' ).str.lower()


#remove entries with missing data 
df_ca = df_ca[df_ca.total_tested_at_entity_level != '*']
df_ca = df_ca[df_ca.percentage_standard_met != '*']

df_ca.head()
Out[140]:
county_code district_code school_code filler test_year subgroup_id test_type total_tested_at_entity_level total_tested_with_scores grade ... area_1_percentage_below_standard area_2_percentage_above_standard area_2_percentage_near_standard area_2_percentage_below_standard area_3_percentage_above_standard area_3_percentage_near_standard area_3_percentage_below_standard area_4_percentage_above_standard area_4_percentage_near_standard area_4_percentage_below_standard
0 0 0 0 NaN 2017 1 B 3209613 3206556 3 ... 38.58 23.54 44.00 32.46 17.72 61.55 20.73 24.16 47.60 28.25
1 0 0 0 NaN 2017 1 B 3220894 3218106 3 ... 34.14 24.96 45.81 29.23 25.92 48.74 25.34 0.00 0.00 0.00
2 0 0 0 NaN 2017 1 B 3209613 3206556 4 ... 30.40 23.01 45.24 31.75 16.17 57.39 26.44 23.80 49.85 26.34
3 0 0 0 NaN 2017 1 B 3220894 3218106 4 ... 42.15 20.13 44.42 35.45 22.43 43.29 34.28 0.00 0.00 0.00
4 0 0 0 NaN 2017 1 B 3209613 3206556 5 ... 32.89 28.13 42.91 28.96 16.70 59.67 23.63 25.43 43.84 30.72

5 rows × 32 columns

In [142]:
#change dtype from object to float
df_ca = df_ca.astype({col: float for col in df_ca.columns[11:]})
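As an alternative to filtering on '*' and converting dtypes afterwards, the missing-value marker can be handled at read time. The sketch below uses a tiny in-memory CSV with made-up values, and it assumes '*' is the only missing-data marker in the file; it is an illustration, not the loading code used in this notebook.

```python
import io
import pandas as pd

# Tiny made-up CSV; '*' stands in for a suppressed value.
raw = io.StringIO(
    "school_code,percentage_standard_met\n"
    "1,23.5\n"
    "2,*\n"
    "3,40.0\n"
)

# na_values turns '*' into NaN, so the column parses as float directly.
df = pd.read_csv(raw, na_values="*")

# Drop rows where the score column is missing.
clean = df.dropna(subset=["percentage_standard_met"])
print(clean.dtypes)
print(clean)
```

With this approach no separate astype step is needed for columns whose only non-numeric entries are the marker.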
In [143]:
df_ca.iloc[:,:-12].describe()
Out[143]:
county_code district_code school_code filler test_year subgroup_id grade test_id caaspp_reported_enrollment students_tested mean_scale_score percentage_standard_exceeded percentage_standard_met percentage_standard_met_and_above percentage_standard_nearly_met percentage_standard_not_met students_with_scores
count 89222.000000 89222.000000 8.922200e+04 0.0 89222.0 89222.0 89222.000000 89222.000000 8.922200e+04 8.922200e+04 68304.000000 89222.000000 89222.000000 89222.000000 89222.000000 89222.000000 8.922200e+04
mean 29.077234 65602.389310 3.919898e+06 NaN 2017.0 1.0 7.278956 1.499809 5.816974e+02 5.664246e+02 2486.498086 17.403756 23.334445 40.738204 25.358098 33.903693 5.659247e+02
std 14.247435 11718.415172 2.772705e+06 NaN 0.0 0.0 3.709132 0.500003 1.753988e+04 1.706150e+04 67.568357 15.329506 10.494026 21.669364 8.691054 19.995055 1.704620e+04
min 0.000000 0.000000 0.000000e+00 NaN 2017.0 1.0 3.000000 1.000000 1.100000e+01 1.100000e+01 2248.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.100000e+01
25% 19.000000 64733.000000 1.216120e+05 NaN 2017.0 1.0 4.000000 1.000000 6.200000e+01 6.100000e+01 2437.200000 5.970000 16.130000 24.090000 20.000000 18.180000 6.100000e+01
50% 30.000000 67199.000000 6.015747e+06 NaN 2017.0 1.0 6.000000 1.000000 1.030000e+02 1.010000e+02 2481.500000 12.845000 23.170000 38.430000 25.600000 32.095000 1.010000e+02
75% 39.000000 69682.000000 6.049710e+06 NaN 2017.0 1.0 11.000000 2.000000 2.710000e+02 2.640000e+02 2531.600000 25.000000 30.300000 56.250000 30.720000 46.857500 2.640000e+02
max 58.000000 77032.000000 6.121081e+06 NaN 2017.0 1.0 13.000000 2.000000 3.305989e+06 3.220894e+06 2776.400000 96.880000 82.350000 100.000000 83.330000 100.000000 3.218106e+06
In [144]:
#Create dbs from caaspp data 
conn = sqlite3.connect('ca2017.db')

df_ca.to_sql('caasp', conn, if_exists = 'replace')
df_ca_entities.to_sql('entities', conn, if_exists = 'replace')
In [145]:
#query schools in grades 6, 7, and 8 where 100% of students either met or did not meet standards

q = """SELECT c.school_code
        , e.county_name
        , e.district_name
        , e.school_name
        , c.percentage_standard_met_and_above
        , c.percentage_standard_not_met
        , c.subgroup_id
        FROM caasp c 
        LEFT JOIN entities e ON c.school_code = e.school_code AND c.district_code = e.district_code
        WHERE  (c.percentage_standard_met_and_above = 100 OR c.percentage_standard_not_met = 100) 
                AND c.grade IN (6, 7, 8) 
        ORDER BY c.percentage_standard_met_and_above DESC;
        """

The columns percentage_standard_not_met and percentage_standard_met_and_above both have a maximum value of 100. This seems suspicious, as it would mean that 100% of a school's tested students scored in the given category. We will look into these schools to see if these values seem reasonable. For example, these could be prep or alternative schools.

In [190]:
schools_hund_percent = pd.read_sql_query(q, conn)
schools_hund_percent.head()

Looking through the schools where 100% of students met or exceeded standards, we see that they are schools known for academic excellence.

On the other hand, schools where 100% of students failed to meet standards are often alternative schools or schools in low-income areas.

We also see that rows with school code 0 are not schools but seem to be test centers for the county. Since these are not schools, we will exclude them from further analysis and keep this in mind while writing our SQL.
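The same exclusion can be sketched in pandas. The frame below is a toy example with made-up values (it is not read from the CAASPP file); only the school_code > 0 condition mirrors the filter used in the SQL in this notebook.

```python
import pandas as pd

# Toy rows: school_code 0 marks a non-school entry; all values are made up.
toy = pd.DataFrame({
    "school_code": [0, 0, 6015747, 121612],
    "percentage_standard_not_met": [33.9, 100.0, 18.2, 46.9],
})

# Keep only rows that correspond to an actual school.
schools_only = toy[toy["school_code"] > 0]
print(schools_only)
```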

In [147]:
df_ca_entities.describe()
Out[147]:
county_code district_code school_code filler test_year type_id
count 11327.000000 11327.000000 1.132700e+04 0.0 11327.0 11327.000000
mean 28.916836 65353.020482 3.871051e+06 NaN 2017.0 7.142491
std 14.283690 12049.756745 2.648044e+06 NaN 0.0 0.817326
min 0.000000 0.000000 0.000000e+00 NaN 2017.0 4.000000
25% 19.000000 64725.000000 1.305090e+05 NaN 2017.0 7.000000
50% 30.000000 67116.000000 6.008239e+06 NaN 2017.0 7.000000
75% 39.000000 69674.000000 6.046308e+06 NaN 2017.0 7.000000
max 58.000000 77032.000000 9.010745e+06 NaN 2017.0 10.000000
In [192]:
#create reference table for labels to id codes
test_types = pd.DataFrame([[1,'ELA'], [2, 'Mathematics']], columns =['test_id', 'test_type']).set_index('test_id')
school_types = pd.DataFrame([[7, 'Public School'], [9, 'Direct Funded Charter'], [10, 'Locally Funded Charter']], 
             columns = ['type_id', 'type']).set_index('type_id')

print('Reference Tables\n\n')
print(test_types, '\n')
print(school_types)
Reference Tables


           test_type
test_id             
1                ELA
2        Mathematics 

                           type
type_id                        
7                 Public School
9         Direct Funded Charter
10       Locally Funded Charter
In [149]:
"""query averages for each performance category for students in the 6th, 7th, and 8th grades. 
    Group results by school type, student grade, and test type

"""

perform_avgs_query = """SELECT COUNT(*) total_schools
                            , e.type_id
                            , c.grade
                            , c.test_id
                            , AVG(c.students_tested)
                            , AVG(c.mean_scale_score)
                            , AVG(c.percentage_standard_exceeded)
                            , AVG(c.percentage_standard_met)
                            , AVG(c.percentage_standard_met_and_above)
                            , AVG(c.percentage_standard_nearly_met)
                            , AVG(c.percentage_standard_not_met)
                            , AVG(c.students_with_scores)
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10)
                        AND c.subgroup_id = 1 AND c.school_code >0
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
                    """
In [150]:
middle_test_avgs = pd.read_sql_query(perform_avgs_query, conn)
In [152]:
middle_test_avgs.loc[middle_test_avgs.grade == 6]
Out[152]:
total_schools type_id grade test_id AVG(c.students_tested) AVG(c.mean_scale_score) AVG(c.percentage_standard_exceeded) AVG(c.percentage_standard_met) AVG(c.percentage_standard_met_and_above) AVG(c.percentage_standard_nearly_met) AVG(c.percentage_standard_not_met) AVG(c.students_with_scores)
0 3497 7 6 1 118.480984 2515.154161 15.060835 30.376940 45.437815 26.928667 27.633566 118.388047
1 474 9 6 1 70.761603 2511.389451 12.904895 29.979198 42.883966 29.114937 28.001181 70.624473
2 146 10 6 1 88.890411 2527.628767 15.881918 34.872260 50.754521 27.503767 21.741918 88.801370
3 3496 7 6 2 119.075801 2503.938616 16.232228 18.448696 34.680878 28.804760 36.514465 119.014588
4 473 9 6 2 70.904863 2496.837421 13.249852 17.613150 30.863446 30.161734 38.974778 70.765328
5 146 10 6 2 89.047945 2516.763699 17.082123 21.746370 38.828904 30.449863 30.721370 88.904110

When considering average standardized test scores for 6th-grade California students:

Direct funded charter schools (type_id 9) perform worse than locally funded charter schools (type_id 10) and public schools (type_id 7) in all categories.

When compared to public schools, locally funded charter schools have a smaller percentage of students failing to meet standards and a comparable percentage of students exceeding them.

In [153]:
middle_test_avgs.loc[middle_test_avgs.grade == 7]
Out[153]:
total_schools type_id grade test_id AVG(c.students_tested) AVG(c.mean_scale_score) AVG(c.percentage_standard_exceeded) AVG(c.percentage_standard_met) AVG(c.percentage_standard_met_and_above) AVG(c.percentage_standard_nearly_met) AVG(c.percentage_standard_not_met) AVG(c.students_with_scores)
6 1958 7 7 1 208.775281 2536.362921 14.081134 32.878514 46.959704 24.052211 28.988131 208.557201
7 469 9 7 1 73.332623 2541.061834 13.836439 34.042303 47.878699 25.859510 26.261343 73.098081
8 147 10 7 1 93.394558 2552.635374 15.171633 38.552313 53.724082 25.204150 21.070408 93.156463
9 1958 7 7 2 209.850868 2518.373544 16.019254 18.638882 34.658059 27.565756 37.776231 209.740552
10 469 9 7 2 73.362473 2518.918337 14.592836 18.973518 33.566397 29.457463 36.976013 73.249467
11 147 10 7 2 93.482993 2531.609524 17.028503 21.154150 38.182925 30.114422 31.702721 93.428571

When considering average standardized test scores for 7th-grade California students:

Direct funded charter schools underperform locally funded charter schools on both tests. Compared with public schools, they have a slightly higher percentage of students meeting or exceeding standards in ELA (47.9% vs. 47.0%) but a slightly lower percentage in mathematics (33.6% vs. 34.7%). Public schools have the highest percentage of students failing to meet standards on both tests (29.0% in ELA and 37.8% in mathematics). The mixed results for direct funded charter schools might point to some of them significantly outperforming others.

Locally funded charter schools outperform the direct funded charter schools and public schools on both tests, with the larger margin in ELA.

In [154]:
middle_test_avgs.loc[middle_test_avgs.grade == 8]
Out[154]:
total_schools type_id grade test_id AVG(c.students_tested) AVG(c.mean_scale_score) AVG(c.percentage_standard_exceeded) AVG(c.percentage_standard_met) AVG(c.percentage_standard_met_and_above) AVG(c.percentage_standard_nearly_met) AVG(c.percentage_standard_not_met) AVG(c.students_with_scores)
12 1986 7 8 1 205.294058 2551.972558 13.756012 32.270987 46.026949 26.669491 27.303510 205.115307
13 450 9 8 1 73.393333 2557.825333 13.462311 34.093000 47.555578 28.336089 24.108711 73.273333
14 150 10 8 1 91.513333 2569.438667 16.678200 36.637267 53.316333 25.787600 20.896400 91.433333
15 1981 7 8 2 206.566381 2532.360020 17.823433 16.056209 33.879460 23.707304 42.412807 206.317516
16 448 9 8 2 73.620536 2530.221652 15.793504 16.245022 32.038571 24.666629 43.294487 73.430804
17 150 10 8 2 91.506667 2540.207333 18.388400 17.317933 35.706600 24.267467 40.025533 91.426667

When considering average standardized test scores for 8th-grade California students:

Direct funded charter schools underperform locally funded charter schools on both tests. Compared with public schools, they have a higher percentage of students meeting or exceeding standards in ELA (47.6% vs. 46.0%) but a lower percentage in mathematics (32.0% vs. 33.9%). Direct funded charter schools have the highest percentage of students failing to meet standards in mathematics.

Locally funded charter schools have the lowest percentage of students failing to meet standards on both tests, while public schools have the highest in ELA.

Locally funded charter schools have the highest percentage of students meeting and exceeding standards.

In [155]:
#Create view for avg test results above and below standards by school type, test type and grade


view = """ CREATE VIEW averages AS
        SELECT c.grade
        , c.test_id
        , e.type_id
        , AVG(c.percentage_standard_not_met) avg_percent_std_not_met
        , AVG(c.percentage_standard_met_and_above) avg_percentage_standard_met_and_above
        FROM caasp c 
        INNER JOIN entities e ON c.school_code = e.school_code 
            AND c.county_code = e.county_code 
            AND c.district_code = e.district_code 
        WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) AND c.subgroup_id = 1 AND c.school_code >0
        GROUP BY e.type_id, c.grade, c.test_id
        ORDER BY c.grade, c.test_id;
        """
       
#query num of schools grouped by type, test type and grade with more than avg percentage of students below stnds

below_avg_query = """SELECT c.grade
                        , c.test_id
                        , e.type_id
                        , a.avg_percent_std_not_met
                        , COUNT(*) num_percent_std_not_met_grter_avg
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    INNER JOIN averages a ON c.grade = a.grade AND c.test_id = a.test_id
                        AND e.type_id = a.type_id
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) 
                        AND  c.percentage_standard_not_met > a.avg_percent_std_not_met 
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
       
                 """
In [156]:
#(re)create the averages view before querying it
conn.execute('DROP VIEW IF EXISTS averages;')
conn.execute(view)
stds_not_met = pd.read_sql_query(below_avg_query, conn)
In [ ]:
def convert_name(num_tup):
    #converts a (grade, test_id, type_id) label into a tuple of names
    return(num_tup[0], test_types.loc[num_tup[1]].values[0], school_types.loc[num_tup[2]].values[0])

#middle_test_avgs has one row per (grade, test_id, type_id) group and is already defined at this point
labels = middle_test_avgs[['grade', 'test_id', 'type_id']].values.tolist()
labels_convert = [convert_name(x) for x in labels]
label_names = [(x[0], x[2]) for x in labels_convert]
colors = ['#5c678a' if x[1] == 'ELA' else '#00b3b3' for x in labels_convert]
x_values = range(middle_test_avgs.shape[0])

Percentage of schools with a higher-than-average percentage of students not meeting standards

In [206]:
stds_not_met['total_schools'] = middle_test_avgs.total_schools
stds_not_met['percentage_greater_than_avg'] = (stds_not_met.num_percent_std_not_met_grter_avg/
                                               stds_not_met.total_schools)

plt.figure(figsize =(8,6))
plt.bar(x_values, stds_not_met.percentage_greater_than_avg, color = colors)
plt.xticks(x_values, label_names, rotation = 'vertical')

#create legend 
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.title('Percentage of schools with a higher-than-average percentage of students not meeting standards')
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

plt.show()

From the above, we see that when grouped by grade, test, and school type, fewer than half of all schools have a higher-than-average percentage of students whose scores do not meet standards.
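One possible reason for this pattern (an assumption that has not been checked against the CAASPP data) is right skew: a few schools with very high percentages pull the mean above the typical school. A toy illustration with made-up percentages:

```python
import numpy as np

# Made-up 'percentage not met' values: most schools are low, a few are very high.
not_met = np.array([10, 12, 15, 18, 20, 22, 25, 60, 85, 95], dtype=float)

mean = not_met.mean()
median = np.median(not_met)

# Fraction of schools whose value lies above the group mean.
share_above_mean = (not_met > mean).mean()

print(f"mean = {mean:.1f}, median = {median:.1f}")
print(f"share of schools above the mean = {share_above_mean:.0%}")
```

When the mean sits well above the median, as in this toy sample, fewer than half of the observations exceed the mean.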

In [158]:
above_avg_query =""" SELECT c.grade
                    , c.test_id
                    , e.type_id
                    , a.avg_percentage_standard_met_and_above
                    , COUNT(*) num_percent_std_met_above_grter_avg
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    INNER JOIN averages a ON c.grade = a.grade AND c.test_id = a.test_id 
                        AND e.type_id = a.type_id
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) 
                        AND  c.percentage_standard_met_and_above > a.avg_percentage_standard_met_and_above 
                        AND  c.school_code > 0 
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
                """
In [159]:
stds_met_above = pd.read_sql_query(above_avg_query, conn)

Percentage of schools with a higher-than-average percentage of students at or above standards

In [209]:
stds_met_above['total_schools'] = middle_test_avgs.total_schools
stds_met_above['percentage_greater_than_avg'] = (stds_met_above.num_percent_std_met_above_grter_avg/
                                                 stds_met_above.total_schools)
plt.figure(figsize =(8,6))
plt.bar(x_values, stds_met_above.percentage_greater_than_avg, color = colors)
plt.xticks(x_values, label_names, rotation = 'vertical')

#create legend 
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.title('Percentage of schools with a higher-than-average percentage of students at or above standards')
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

plt.show()

We also see that most charter schools have a lower-than-average percentage of students who score at or above standards.

In [161]:
avg_not_met_and_met = middle_test_avgs[['grade', 'type_id', 'test_id','AVG(c.percentage_standard_not_met)',
                                        'AVG(c.percentage_standard_met_and_above)' ]].copy()
In [162]:
#baseline columns: the public school (type_id 7) average is the first row of each (grade, test_id) group
avg_diffs_from_pub = avg_not_met_and_met.copy()
grouped = avg_diffs_from_pub.groupby(['grade', 'test_id'])
avg_diffs_from_pub['diff_not_met'] = grouped['AVG(c.percentage_standard_not_met)'].transform('first')
avg_diffs_from_pub['diff_met'] = grouped['AVG(c.percentage_standard_met_and_above)'].transform('first')
        
In [163]:
avg_diffs_from_pub.diff_not_met = (avg_diffs_from_pub['AVG(c.percentage_standard_not_met)']
                                   - avg_diffs_from_pub.diff_not_met)
In [164]:
avg_diffs_from_pub.diff_met = (avg_diffs_from_pub['AVG(c.percentage_standard_met_and_above)'] 
                               - avg_diffs_from_pub.diff_met)

Difference Between Average Charter School Performance and Average Public School Performance

In [165]:
#absolute differences charter avg percentage - public school avg percentage
avg_diffs_from_pub
Out[165]:
grade type_id test_id AVG(c.percentage_standard_not_met) AVG(c.percentage_standard_met_and_above) diff_not_met diff_met
0 6 7 1 27.633566 45.437815 0.000000 0.000000
1 6 9 1 28.001181 42.883966 0.367616 -2.553849
2 6 10 1 21.741918 50.754521 -5.891648 5.316705
3 6 7 2 36.514465 34.680878 0.000000 0.000000
4 6 9 2 38.974778 30.863446 2.460313 -3.817432
5 6 10 2 30.721370 38.828904 -5.793095 4.148026
6 7 7 1 28.988131 46.959704 0.000000 0.000000
7 7 9 1 26.261343 47.878699 -2.726787 0.918996
8 7 10 1 21.070408 53.724082 -7.917723 6.764378
9 7 7 2 37.776231 34.658059 0.000000 0.000000
10 7 9 2 36.976013 33.566397 -0.800218 -1.091663
11 7 10 2 31.702721 38.182925 -6.073510 3.524866
12 8 7 1 27.303510 46.026949 0.000000 0.000000
13 8 9 1 24.108711 47.555578 -3.194798 1.528629
14 8 10 1 20.896400 53.316333 -6.407110 7.289385
15 8 7 2 42.412807 33.879460 0.000000 0.000000
16 8 9 2 43.294487 32.038571 0.881680 -1.840888
17 8 10 2 40.025533 35.706600 -2.387273 1.827140

Here we see the absolute differences in the average percentage of students scoring in two categories: standards not met, and standards met and above.

Each difference is calculated as the charter average percentage minus the public average percentage for that category.

A negative value in the diff_not_met column means that the charter school type had a smaller percentage of students not meeting standards than public schools.

A positive value in the diff_met column means that the charter school type had, on average, a larger percentage of students scoring at or above standard levels than public schools.
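As a worked example of this subtraction, the sketch below recomputes the grade 6 ELA differences from the averages shown in the table. The short column names not_met and met_and_above are shorthand introduced here for illustration.

```python
import pandas as pd

# Grade 6 ELA averages copied from the table above (six decimals).
avgs = pd.DataFrame({
    "type_id": [7, 9, 10],  # 7 = public, 9 = direct funded, 10 = locally funded
    "not_met": [27.633566, 28.001181, 21.741918],
    "met_and_above": [45.437815, 42.883966, 50.754521],
}).set_index("type_id")

# Subtract the public school row (type_id 7) from every row, column by column.
diffs = avgs - avgs.loc[7]
print(diffs.round(3))
```

The locally funded row comes out negative for not_met and positive for met_and_above, matching the signs of diff_not_met and diff_met in the table.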

In [181]:
#plot standards not met
plt.figure(figsize = (12, 6))
plt.bar(x_values, avg_diffs_from_pub['AVG(c.percentage_standard_not_met)'], 
        label = label_names, color= colors)
plt.ylabel('mean percentage')
plt.xticks(range(avg_diffs_from_pub.shape[0]),label_names, rotation = 'vertical')
plt.title('Standards not met')

ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

#build legend patches and add legend
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
plt.show()



#plot standards met and above
plt.figure(figsize = (12, 6))

plt.bar(x_values, avg_diffs_from_pub['AVG(c.percentage_standard_met_and_above)'], 
        label = label_names, color = colors)
plt.ylabel('mean percentage')
plt.xticks(range(avg_diffs_from_pub.shape[0]),label_names, rotation = 'vertical')
plt.title('\n\nStandards met and above')

#clean plots
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)



#add legend 
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False);
In [182]:
percent_stnds_query = """SELECT c.grade
                                , c.test_id
                                , e.type_id
                                , c.percentage_standard_not_met
                                , c.percentage_standard_met_and_above
                            FROM caasp c 
                            INNER JOIN entities e ON c.school_code = e.school_code 
                                AND c.county_code = e.county_code 
                                AND c.district_code = e.district_code 
                            WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10)
                                AND c.subgroup_id = 1
                            ORDER BY c.grade, c.test_id;
           
                            """
In [183]:
results = pd.read_sql_query(percent_stnds_query, conn)
results.head()
Out[183]:
grade test_id type_id percentage_standard_not_met percentage_standard_met_and_above
0 6 1 9 55.56 5.56
1 6 1 9 5.71 80.00
2 6 1 9 38.89 55.56
3 6 1 9 61.90 12.70
4 6 1 9 63.64 14.55
In [187]:
f, ax = plt.subplots(6, 3, figsize = (15, 25), sharey = True)

x = range(0, 100, 10)   #left edges of the ten 10-point bins
i, j = 0,0
for name, group in results.groupby(['grade', 'test_id', 'type_id']):
    #fix the histogram range to 0-100 so the bins line up with x
    res_met = relfreq(group.percentage_standard_met_and_above, numbins = 10, defaultreallimits = (0, 100))
    res_not_met = relfreq(group.percentage_standard_not_met, numbins = 10, defaultreallimits = (0, 100))
    ax[i, j].bar(x, res_met.frequency, width = res_met.binsize, align = 'edge', label = 'meeting and above standard',
                 alpha = .5, color = '#187eba')
    ax[i, j].bar(x, res_not_met.frequency, width = res_not_met.binsize, align = 'edge', label = 'not meeting standard',
                 alpha = .7, color = '#f2a771')
    ax[i, j].set_xlabel(convert_name(name), fontsize = 13)
    ax[i, j].spines['top'].set_visible(False)
    ax[i, j].spines['right'].set_visible(False)
    
    if i == 0 and j == 2:
        ax[i, j].legend(frameon = False, bbox_to_anchor=(1.5, 1.55), fontsize = 12)
    j +=1
    if j ==3:
        i +=1
        j = 0 

f.suptitle("\nRelative frequencies of schools with percentage of students in each scoring category", fontsize = 16);

    

So far we have not separated performance into subgroups (gender, income level, ethnicity, etc.). However, at this point, the averages suggest that locally funded charter schools perform the best of these three categories. The relative frequencies, combined with the previous analysis of averages, show that locally funded charter schools clearly outperform in ELA tests and tend to slightly outperform in mathematics (though this is less clear). To see if the differences between means are statistically significant, we can perform independent-samples t-tests.
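A minimal sketch of such a test on synthetic data is shown below. The group sizes, means, and spreads are made up for illustration; equal_var = False requests Welch's version of the test, which does not assume equal variances in the two groups.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

# Synthetic 'percentage not met' samples; sizes and spreads are made up.
public = rng.normal(loc=28, scale=18, size=500)
charter = rng.normal(loc=22, scale=15, size=150)

# equal_var=False selects Welch's t-test.
res = ttest_ind(public, charter, equal_var=False)

# The same statistic computed by hand from the group means and sample variances.
se = np.sqrt(public.var(ddof=1) / public.size + charter.var(ddof=1) / charter.size)
t_manual = (public.mean() - charter.mean()) / se

print(res.statistic, res.pvalue, t_manual)
```

The sign of the statistic follows the argument order: it is positive when the first sample has the larger mean.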

In [210]:
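def calc_effect_size(s1, s2):
    """calculates an effect size for the independent-samples t-test
        input: samples 1 and 2 
        output: (mean(s1) - mean(s2)) divided by a standard deviation estimate
    """
    #std estimate = sqrt(n1) * standard error of the difference in means
    #             = sqrt(var(s1) + var(s2) * n1/n2), using np.std's default ddof=0
    std_est_s1_s2 = np.sqrt(len(s1))*np.sqrt(np.std(s1)**2/len(s1) + np.std(s2)**2/len(s2))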
    effect_size_s1_s2 = (np.mean(s1) - np.mean(s2))/ std_est_s1_s2
    return effect_size_s1_s2
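For comparison, a commonly used effect size for two independent samples is Cohen's d with a pooled standard deviation. The helper below (cohens_d is a hypothetical name, not part of this notebook) sketches that conventional definition. It is not the formula used in calc_effect_size and can give different numbers.

```python
import numpy as np

def cohens_d(s1, s2):
    """Cohen's d using the pooled sample standard deviation (ddof=1)."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n1, n2 = s1.size, s2.size
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1.var(ddof=1) + (n2 - 1) * s2.var(ddof=1)) / (n1 + n2 - 2)
    return (s1.mean() - s2.mean()) / np.sqrt(pooled_var)

# Made-up samples for illustration only.
d = cohens_d([20, 25, 30, 35], [30, 35, 40, 45])
print(d)
```

As with calc_effect_size, the sign depends on which sample is passed first.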


   

Reporting a significant t-test for independent groups (µ1 ≠ µ2)

Since we are performing two tests against public schools, we will use a Bonferroni correction:

alpha = .05/2 = .025
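The Bonferroni arithmetic can be sketched as follows. The two p-values are rounded from the grade 6 ELA "met and above" results reported further down; the variable and key names are illustrative.

```python
# Bonferroni correction: divide the family-wise alpha by the number of comparisons.
family_alpha = 0.05
n_comparisons = 2  # public vs locally funded, public vs direct funded

alpha_per_test = family_alpha / n_comparisons

# p-values rounded from the grade 6 ELA 'met and above' t-test output.
p_values = {
    "public_vs_locally_funded": 0.000719,
    "public_vs_direct_funded": 0.006718,
}

# A comparison counts as significant only if its p-value is below the corrected alpha.
significant = {name: p < alpha_per_test for name, p in p_values.items()}

print(alpha_per_test)
print(significant)
```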

In [216]:
"""conduct ttest and ANOVA to compare group means"""


for name, g in results.groupby(['grade', 'test_id']):
    not_met_pub = g[g.type_id == 7].percentage_standard_not_met
    not_met_pubchart = g[g.type_id == 9].percentage_standard_not_met
    not_met_loclchart = g[g.type_id == 10].percentage_standard_not_met
    
    met_pub = g[g.type_id == 7].percentage_standard_met_and_above
    met_pubchart = g[g.type_id == 9].percentage_standard_met_and_above
    met_loclchart = g[g.type_id == 10].percentage_standard_met_and_above
    
    
    print(name)
    print("\nPercentage Standard Not Met\n")
    print("\tPublic vs Local Funded Charter:  ", ttest_ind(not_met_pub, not_met_loclchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(not_met_loclchart, not_met_pub))
    
    print("\n\tPublic vs Direct Funded Charter:  ", ttest_ind(not_met_pub, not_met_pubchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(not_met_pubchart, not_met_pub))
    print("\n\tANOVA: ", f_oneway(not_met_loclchart, not_met_pub, not_met_pubchart))
    
    
    print("\n\nPercentage Standard Met and Above:\n")
    print("\tPublic vs Local Funded Charter:  ", ttest_ind(met_pub, met_loclchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(met_loclchart, met_pub))
    
    print("\n\tPublic vs Direct Funded Charter:  ", ttest_ind(met_pub, met_pubchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(met_pubchart, met_pub))
    print("\n\tANOVA: ", f_oneway(met_loclchart, met_pub, met_pubchart), '\n')
    
(6, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=4.863140767762414, pvalue=2.7177216720127894e-06)
	Effect size: -0.40379063498473694

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.46724341845865824, pvalue=0.6404903551564978)
	Effect size: 0.02148139069019021

	ANOVA:  F_onewayResult(statistic=9.310587486899731, pvalue=9.238188714501302e-05)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.448881608300607, pvalue=0.0007188385907717147)
	Effect size: 0.286366383503492

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=2.719517618131738, pvalue=0.006717634561013535)
	Effect size: -0.1250280508310029

	ANOVA:  F_onewayResult(statistic=8.773646809678377, pvalue=0.00015767274808358686) 

(6, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=3.826541906866097, pvalue=0.00018617512679834953)
	Effect size: -0.3177277749292982

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-2.6069266373049267, pvalue=0.00935965794541261)
	Effect size: 0.11998026191211447

	ANOVA:  F_onewayResult(statistic=10.368265031118456, pvalue=3.2243007093792584e-05)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-2.637560827839294, pvalue=0.00917025432933683)
	Effect size: 0.21900102288641626

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=4.131103026420736, pvalue=4.086018921484817e-05)
	Effect size: -0.19012399871666677

	ANOVA:  F_onewayResult(statistic=10.723126305925419, pvalue=2.2652059755816576e-05) 

(7, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=6.176188987257654, pvalue=4.2973809057692475e-09)
	Effect size: -0.5109877160232291

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=3.1926381570932993, pvalue=0.0014683887177268237)
	Effect size: -0.14755349706210535

	ANOVA:  F_onewayResult(statistic=17.3574775499213, pvalue=3.252238949898397e-08)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.952745832416773, pvalue=0.0001129225835691313)
	Effect size: 0.32705293131502317

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.8876149299114162, pvalue=0.3750390724054987)
	Effect size: 0.041023102059530046

	ANOVA:  F_onewayResult(statistic=7.424404647543067, pvalue=0.0006093930968330569) 

(7, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=3.7340762934518206, pvalue=0.00025628969370725796)
	Effect size: -0.30895940576764613

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=0.7953248671919885, pvalue=0.42668638722929164)
	Effect size: -0.03675792788829846

	ANOVA:  F_onewayResult(statistic=6.510682563206843, pvalue=0.0015121078603271049)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-1.9408738435777917, pvalue=0.05394521677420233)
	Effect size: 0.16059299981852093

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=1.0427309321594387, pvalue=0.29741763405780836)
	Effect size: -0.04819217908039646

	ANOVA:  F_onewayResult(statistic=2.7301262091640273, pvalue=0.06540011865816857) 

(8, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=4.505622907527654, pvalue=1.2070756941162392e-05)
	Effect size: -0.36902659054735726

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=3.789355274843333, pvalue=0.00016358778070739595)
	Effect size: -0.1787968958655651

	ANOVA:  F_onewayResult(statistic=14.308425317760989, pvalue=6.608460275900196e-07)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.9644038811673656, pvalue=0.00010835250099670356)
	Effect size: 0.3247141074635852

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-1.4868222292420152, pvalue=0.13751896752976506)
	Effect size: 0.07015548609924911

	ANOVA:  F_onewayResult(statistic=9.415175426800358, pvalue=8.430895373646295e-05) 

(8, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=1.2709213042496321, pvalue=0.20548456316997846)
	Effect size: -0.10409654072398493

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.789090671636335, pvalue=0.430338700611116)
	Effect size: 0.03731666015263003

	ANOVA:  F_onewayResult(statistic=1.2975321345190596, pvalue=0.27338369189580475)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-0.9902081723127987, pvalue=0.3234695081400931)
	Effect size: 0.0811040677886892

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=1.6973061073207827, pvalue=0.09009585414778282)
	Effect size: -0.08026627636293959

	ANOVA:  F_onewayResult(statistic=2.1105303170433203, pvalue=0.12138317090645377) 
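A note on reading the signs in the output above: `ttest_ind(a, b, equal_var=False)` is Welch's t-test, and its statistic is positive when the first sample has the larger mean. The calls shown pass the public-school sample first, so a positive statistic means the public-school mean was higher for that measure. The sketch below reproduces the Welch statistic by hand to make that visible. The helper `welch_t` and the sample numbers are hypothetical and are not taken from the CAASPP data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (variance() divides by n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # The sign follows mean(sample_a) - mean(sample_b).
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Invented percentages for illustration only; these are not CAASPP values.
public = [40.0, 45.0, 50.0, 55.0, 60.0]
charter = [30.0, 32.0, 35.0, 38.0, 40.0]

t_pc, dof_pc = welch_t(public, charter)   # public first: positive statistic
t_cp, dof_cp = welch_t(charter, public)   # order swapped: sign flips
print(t_pc, t_cp)
```

Swapping the argument order flips the sign of the statistic but leaves the degrees of freedom, and therefore the two-sided p-value, unchanged.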

No significant evidence that means are different at alpha = .01 for the following

  • direct charter, 6th, ela, standards not met
  • direct charter, 7th, ela, standards met and above
  • direct charter, 7th, math, standards not met
  • direct charter, 7th, math, standards met and above
  • direct charter, 8th, ela, standards met and above
  • direct charter, 8th, math, standards not met
  • direct charter, 8th, math, standards met and above
  • local charter, 7th, math, standards met and above
  • local charter, 8th, math, standards not met
  • local charter, 8th, math, standards met and above

However, there is evidence that we should reject our null hypothesis that the difference between the population means is zero, at alpha = .01, for the following

  • direct charter, 6th, ela, standards met and above
  • direct charter, 6th, math, standards not met
  • direct charter, 6th, math, standards met and above
  • direct charter, 7th, ela, standards not met
  • direct charter, 8th, ela, standards not met
  • local charter, 6th, ela, standards not met
  • local charter, 6th, ela, standards met and above
  • local charter, 6th, math, standards not met
  • local charter, 6th, math, standards met and above
  • local charter, 7th, ela, standards not met
  • local charter, 7th, ela, standards met and above
  • local charter, 7th, math, standards not met
  • local charter, 8th, ela, standards not met
  • local charter, 8th, ela, standards met and above
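
The grouping above can be cross-checked mechanically against the printed p-values. The sketch below is one way to do that, assuming (as the lists do) that test id 1 is ELA and test id 2 is math. The p-values are copied from the output and rounded; the names `P_VALUES` and `split_by_alpha` are hypothetical, and alpha is left as a parameter so that more than one threshold can be compared.

```python
# p-values copied (rounded) from the t-test output above.
# Keys: (school type, grade, test, measure).
P_VALUES = {
    ("local", 6, "ela", "not met"): 2.72e-06,
    ("direct", 6, "ela", "not met"): 0.640,
    ("local", 6, "ela", "met and above"): 0.000719,
    ("direct", 6, "ela", "met and above"): 0.00672,
    ("local", 6, "math", "not met"): 0.000186,
    ("direct", 6, "math", "not met"): 0.00936,
    ("local", 6, "math", "met and above"): 0.00917,
    ("direct", 6, "math", "met and above"): 4.09e-05,
    ("local", 7, "ela", "not met"): 4.30e-09,
    ("direct", 7, "ela", "not met"): 0.00147,
    ("local", 7, "ela", "met and above"): 0.000113,
    ("direct", 7, "ela", "met and above"): 0.375,
    ("local", 7, "math", "not met"): 0.000256,
    ("direct", 7, "math", "not met"): 0.427,
    ("local", 7, "math", "met and above"): 0.0539,
    ("direct", 7, "math", "met and above"): 0.297,
    ("local", 8, "ela", "not met"): 1.21e-05,
    ("direct", 8, "ela", "not met"): 0.000164,
    ("local", 8, "ela", "met and above"): 0.000108,
    ("direct", 8, "ela", "met and above"): 0.138,
    ("local", 8, "math", "not met"): 0.205,
    ("direct", 8, "math", "not met"): 0.430,
    ("local", 8, "math", "met and above"): 0.323,
    ("direct", 8, "math", "met and above"): 0.0901,
}


def split_by_alpha(p_values, alpha):
    """Split comparison keys into (significant, not significant) at alpha."""
    significant = sorted(k for k, p in p_values.items() if p < alpha)
    not_significant = sorted(k for k, p in p_values.items() if p >= alpha)
    return significant, not_significant


for alpha in (0.01, 0.001):
    sig, not_sig = split_by_alpha(P_VALUES, alpha)
    print(f"alpha = {alpha}: {len(sig)} significant, {len(not_sig)} not")
    for school, grade, test, measure in sig:
        print(f"  {school} charter, grade {grade}, {test}, standards {measure}")
```

Because 24 comparisons are tested, the choice of threshold matters: several direct-charter results fall between .001 and .01 and move from one group to the other depending on alpha.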

Conclusions

The effect size is very small (magnitude below .2) for every comparison between public schools and direct funded charter schools, across all grades and test types.

The effect size is small to medium (approx. .2 to .5) when comparing public school and locally funded charter school results by grade and test type, with two exceptions: 7th-grade math, standards met and above (approx. .16), and 8th-grade math, where the effect size is very small (approx. .1).
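
The effect sizes quoted above come from `calc_effect_size`, whose definition is not shown in this section. As a rough sketch, assuming it computes Cohen's d with a pooled standard deviation, the calculation and the conventional size benchmarks (.2 small, .5 medium, .8 large) could look like the following. The names `cohens_d` and `describe_effect` and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def describe_effect(d):
    """Label the magnitude of d using Cohen's conventional benchmarks."""
    size = abs(d)
    if size < 0.2:
        return "very small"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Invented percentages for illustration only; these are not CAASPP values.
charter_not_met = [30.0, 32.0, 35.0, 38.0, 40.0]
public_not_met = [40.0, 45.0, 50.0, 55.0, 60.0]

d = cohens_d(charter_not_met, public_not_met)
# Negative d: the charter sample has the lower mean "not met" percentage.
print(d, describe_effect(d))

# Labels for two effect sizes reported in the output above.
print(describe_effect(-0.40), describe_effect(0.02))
```

With the charter sample passed first, a negative d on the "not met" measure and a positive d on the "met and above" measure both point in the charter schools' favor.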

At this point, we have good evidence that locally funded charter schools outperform public schools in ELA, where every difference in means is statistically significant. The evidence in math is weaker: the 8th-grade differences are not significant, and the 7th-grade met-and-above difference is borderline (p ≈ .054). However, there is not much evidence that direct funded charter schools perform better than public schools on average.

Locally funded charter schools appear to be a promising alternative for families looking to provide a better education for their children without paying for private schooling.

Avenues for Further Inquiry

Some potential issues with these findings:

Our findings are fairly tentative at this point, but they have allowed us to rule out investigating directly funded charter schools any further.

We have not separated the above data into subcategories. For instance, can this difference in performance be accounted for by a difference in the gender or income level of attending students? It is possible that there are more locally funded charter schools in higher-income areas, meaning that the difference in performance can be accounted for by family income. Is it possible that 2017 was just a good year for charter schools? How do other years compare? These are good areas for further analysis.

Curious parents should check out https://www.greatschools.org/ to learn about the great schools in their area and view their CAASPP results.