Comparing Charter and Public School CAASPP Test Results for 6th, 7th, and 8th Graders¶

Problem Summary¶

Our hypothesis is that charter school students in the 6th, 7th, and 8th grades perform better than public school students in the same grades on the ELA and mathematics standardized tests. We will measure this by comparing the percentage of students failing to meet standards and the percentage meeting or exceeding standards on the CAASPP, California's standardized test.

We will also implicitly be making the assumption that higher performance on the CAASPP indicates greater student learning.

Findings Summary¶

By conducting independent-samples t-tests (Welch's t-tests, which do not assume equal variances) on the percentage of students who did not meet standards and the percentage who met or exceeded standards, we found significant evidence of a real difference between the means of public schools and locally funded charter schools. However, there was not the same evidence for direct-funded charter schools.

Our findings are tentative at this point. We have not separated our data further into subcategories. For instance, can this difference in performance be accounted for by a difference in the gender or income level of attending students? It is possible that there are more locally funded charter schools in higher-income areas, meaning that the difference in performance could be accounted for by family income. Is it possible that 2017 was just a good year for charter schools? How do other years compare? These are all areas for further analysis.

What is a charter school?¶

There are three types of schools we will be looking at: public schools, direct-funded charter schools, and locally funded charter schools.

A charter school is an independently run public school granted more operation flexibility, in return for higher performance accountability. Each school is established by charter. This charter is essentially a contract detailing the school's mission, program, and performance goals.

Charter schools are public schools in the sense that they are free to attend. However, unlike public schools, charter schools are schools of choice, meaning that families choose to send their kids to them. These schools operate with freedom from some of the regulations that are typically imposed upon district schools but are still accountable for their academic results and for upholding the promises made in their charters. Charter schools must also participate in state standardized testing. They must demonstrate performance in academic achievement, financial management, and organizational stability. If a charter school does not meet its performance goals, it can be closed.

How are charter schools funded?¶

There are two funding types for charter schools. Direct-funded charter schools elect to receive funding directly from the state, whereas locally funded charter schools get their funding from their local education agency/school district.

Data Sources¶

More info about charter vs public schools:

https://www.ed-data.org/article/Charter-Schools-in-California

Data can be found at:

https://caaspp.cde.ca.gov/sb2017/ResearchFileList

https://caaspp.cde.ca.gov/sb2017/research_fixfileformat17

In [140]:

import pandas as pd
import sqlite3
import numpy as np 
import matplotlib.pyplot as plt 
import matplotlib.patches as mpatches
from scipy.stats import relfreq
from scipy.stats import ttest_ind
from scipy.stats import f_oneway


df_ca = pd.read_csv("ca2017.txt")
df_ca_entities = pd.read_csv("ca2017_entities.txt")

df_ca.columns = df_ca.columns.str.replace(' ','_' ).str.lower()
df_ca_entities.columns = df_ca_entities.columns.str.replace(' ','_' ).str.lower()


#remove entries with missing data 
df_ca = df_ca[df_ca.total_tested_at_entity_level != '*']
df_ca = df_ca[df_ca.percentage_standard_met != '*']

df_ca.head()

Out[140]:

	filler	test_year	subgroup_id	test_type	total_tested_at_entity_level	total_tested_with_scores	grade	...	area_1_percentage_below_standard	area_2_percentage_above_standard	area_2_percentage_near_standard	area_2_percentage_below_standard	area_3_percentage_above_standard	area_3_percentage_near_standard	area_3_percentage_below_standard	area_4_percentage_above_standard	area_4_percentage_near_standard	area_4_percentage_below_standard
0	NaN	2017	1	B	3209613	3206556	3	...	38.58	23.54	44.00	32.46	17.72	61.55	20.73	24.16	47.60	28.25
1	NaN	2017	1	B	3220894	3218106	3	...	34.14	24.96	45.81	29.23	25.92	48.74	25.34	0.00	0.00	0.00
2	NaN	2017	1	B	3209613	3206556	4	...	30.40	23.01	45.24	31.75	16.17	57.39	26.44	23.80	49.85	26.34
3	NaN	2017	1	B	3220894	3218106	4	...	42.15	20.13	44.42	35.45	22.43	43.29	34.28	0.00	0.00	0.00
4	NaN	2017	1	B	3209613	3206556	5	...	32.89	28.13	42.91	28.96	16.70	59.67	23.63	25.43	43.84	30.72

5 rows × 32 columns
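The file marks suppressed values with `*`, which is why the percentage columns load as strings and need filtering and conversion. As an alternative, here is a minimal sketch of letting `pd.read_csv` treat `*` as missing at load time through its `na_values` parameter. The three sample rows and their column subset are made up for illustration; the real `ca2017.txt` has many more columns.

```python
import io
import pandas as pd

# Hypothetical three-row sample; not taken from the real research file.
sample = io.StringIO(
    "School Code,Grade,Percentage Standard Met\n"
    "101,6,23.5\n"
    "102,6,*\n"
    "103,7,41.0\n"
)

# Treat '*' as missing so the percentage column is parsed as float directly.
df = pd.read_csv(sample, na_values="*")
df.columns = df.columns.str.replace(" ", "_").str.lower()

# Drop the rows whose percentage was suppressed.
df = df.dropna(subset=["percentage_standard_met"])
print(df.dtypes)
print(df)
```

With this approach the later `astype` conversion would no longer be needed for the affected columns.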

In [142]:

#change dtype from object to float
df_ca = df_ca.astype({col: float for col in df_ca.columns[11:]})

In [143]:

df_ca.iloc[:,:-12].describe()

Out[143]:

	county_code	district_code	school_code	filler	test_year	subgroup_id	grade	test_id	caaspp_reported_enrollment	students_tested	mean_scale_score	percentage_standard_exceeded	percentage_standard_met	percentage_standard_met_and_above	percentage_standard_nearly_met	percentage_standard_not_met	students_with_scores
count	89222.000000	89222.000000	8.922200e+04	0.0	89222.0	89222.0	89222.000000	89222.000000	8.922200e+04	8.922200e+04	68304.000000	89222.000000	89222.000000	89222.000000	89222.000000	89222.000000	8.922200e+04
mean	29.077234	65602.389310	3.919898e+06	NaN	2017.0	1.0	7.278956	1.499809	5.816974e+02	5.664246e+02	2486.498086	17.403756	23.334445	40.738204	25.358098	33.903693	5.659247e+02
std	14.247435	11718.415172	2.772705e+06	NaN	0.0	0.0	3.709132	0.500003	1.753988e+04	1.706150e+04	67.568357	15.329506	10.494026	21.669364	8.691054	19.995055	1.704620e+04
min	0.000000	0.000000	0.000000e+00	NaN	2017.0	1.0	3.000000	1.000000	1.100000e+01	1.100000e+01	2248.000000	0.000000	0.000000	0.000000	0.000000	0.000000	1.100000e+01
25%	19.000000	64733.000000	1.216120e+05	NaN	2017.0	1.0	4.000000	1.000000	6.200000e+01	6.100000e+01	2437.200000	5.970000	16.130000	24.090000	20.000000	18.180000	6.100000e+01
50%	30.000000	67199.000000	6.015747e+06	NaN	2017.0	1.0	6.000000	1.000000	1.030000e+02	1.010000e+02	2481.500000	12.845000	23.170000	38.430000	25.600000	32.095000	1.010000e+02
75%	39.000000	69682.000000	6.049710e+06	NaN	2017.0	1.0	11.000000	2.000000	2.710000e+02	2.640000e+02	2531.600000	25.000000	30.300000	56.250000	30.720000	46.857500	2.640000e+02
max	58.000000	77032.000000	6.121081e+06	NaN	2017.0	1.0	13.000000	2.000000	3.305989e+06	3.220894e+06	2776.400000	96.880000	82.350000	100.000000	83.330000	100.000000	3.218106e+06

In [144]:

#Create dbs from caaspp data 
conn = sqlite3.connect('ca2017.db')

df_ca.to_sql('caasp', conn, if_exists = 'replace')
df_ca_entities.to_sql('entities', conn, if_exists = 'replace')

In [145]:

#query schools in grades 6, 7, and 8 where 100% of students either met or did not meet standards

q = """SELECT c.school_code
        , e.county_name
        , e.district_name
        , e.school_name
        , c.percentage_standard_met_and_above
        , c.percentage_standard_not_met
        , c.subgroup_id
        FROM caasp c 
        LEFT JOIN entities e ON c.school_code = e.school_code AND c.district_code = e.district_code
        WHERE  (c.percentage_standard_met_and_above = 100 OR c.percentage_standard_not_met = 100) 
                AND c.grade IN (6, 7, 8) 
        ORDER BY c.percentage_standard_met_and_above DESC;
        """

The percentage of standards not met and the percentage of standards met and above both have a maximum value of 100. This seems suspicious, as it would mean that 100% of the students tested at a school scored in the given category. We will look into these schools to see whether these values seem reasonable. For example, these could be prep or alternative schools.

In [190]:

schools_hund_percent = pd.read_sql_query(q, conn)
schools_hund_percent.head()

Looking through the schools that have 100% of students meeting or above standards, we see that these schools are schools that are known for academic excellence.

On the other hand, schools that have 100% of students failing to meet standards are often alternative schools or schools in low-income areas.

We also see that rows with school code 0 are not schools but seem to be test centers for the county. Since these are not schools we shall not use these entries during further analysis. We will keep this in mind while writing our SQL.

In [147]:

df_ca_entities.describe()

Out[147]:

	county_code	district_code	school_code	filler	test_year	type_id
count	11327.000000	11327.000000	1.132700e+04	0.0	11327.0	11327.000000
mean	28.916836	65353.020482	3.871051e+06	NaN	2017.0	7.142491
std	14.283690	12049.756745	2.648044e+06	NaN	0.0	0.817326
min	0.000000	0.000000	0.000000e+00	NaN	2017.0	4.000000
25%	19.000000	64725.000000	1.305090e+05	NaN	2017.0	7.000000
50%	30.000000	67116.000000	6.008239e+06	NaN	2017.0	7.000000
75%	39.000000	69674.000000	6.046308e+06	NaN	2017.0	7.000000
max	58.000000	77032.000000	9.010745e+06	NaN	2017.0	10.000000

In [192]:

#create reference table for labels to id codes
test_types = pd.DataFrame([[1,'ELA'], [2, 'Mathematics']], columns =['test_id', 'test_type']).set_index('test_id')
school_types = pd.DataFrame([[7, 'Public School'], [9, 'Direct Funded Charter'], [10, 'Locally Funded Charter']], 
             columns = ['type_id', 'type']).set_index('type_id')

print('Reference Tables\n\n')
print(test_types, '\n')
print(school_types)

Reference Tables


           test_type
test_id             
1                ELA
2        Mathematics 

                           type
type_id                        
7                 Public School
9         Direct Funded Charter
10       Locally Funded Charter
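A minimal sketch of how these reference tables could be used to attach readable labels to query results with `Series.map`. The `rows` frame is a hypothetical stand-in for a query result that contains only id codes.

```python
import pandas as pd

test_types = pd.DataFrame(
    [[1, "ELA"], [2, "Mathematics"]], columns=["test_id", "test_type"]
).set_index("test_id")
school_types = pd.DataFrame(
    [[7, "Public School"], [9, "Direct Funded Charter"], [10, "Locally Funded Charter"]],
    columns=["type_id", "type"],
).set_index("type_id")

# Hypothetical result rows containing only id codes.
rows = pd.DataFrame({"grade": [6, 6, 7], "test_id": [1, 2, 1], "type_id": [7, 9, 10]})

# Series.map looks each id up in the index of the reference column.
rows["test_type"] = rows["test_id"].map(test_types["test_type"])
rows["type"] = rows["type_id"].map(school_types["type"])
print(rows)
```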

In [149]:

"""query averages for each performance category for students in the 6th, 7th, and 8th grades. 
    Group results by school type, student grade, and test type

"""

perform_avgs_query = """SELECT COUNT(*) total_schools
                            , e.type_id
                            , c.grade
                            , c.test_id
                            , AVG(c.students_tested)
                            , AVG(c.mean_scale_score)
                            , AVG(c.percentage_standard_exceeded)
                            , AVG(c.percentage_standard_met)
                            , AVG(c.percentage_standard_met_and_above)
                            , AVG(c.percentage_standard_nearly_met)
                            , AVG(c.percentage_standard_not_met)
                            , AVG(c.students_with_scores)
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10)
                        AND c.subgroup_id = 1 AND c.school_code >0
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
                    """

In [150]:

middle_test_avgs = pd.read_sql_query(perform_avgs_query, conn)

In [152]:

middle_test_avgs.loc[middle_test_avgs.grade == 6]

Out[152]:

	total_schools	type_id	grade	test_id	AVG(c.students_tested)	AVG(c.mean_scale_score)	AVG(c.percentage_standard_exceeded)	AVG(c.percentage_standard_met)	AVG(c.percentage_standard_met_and_above)	AVG(c.percentage_standard_nearly_met)	AVG(c.percentage_standard_not_met)	AVG(c.students_with_scores)
0	3497	7	6	1	118.480984	2515.154161	15.060835	30.376940	45.437815	26.928667	27.633566	118.388047
1	474	9	6	1	70.761603	2511.389451	12.904895	29.979198	42.883966	29.114937	28.001181	70.624473
2	146	10	6	1	88.890411	2527.628767	15.881918	34.872260	50.754521	27.503767	21.741918	88.801370
3	3496	7	6	2	119.075801	2503.938616	16.232228	18.448696	34.680878	28.804760	36.514465	119.014588
4	473	9	6	2	70.904863	2496.837421	13.249852	17.613150	30.863446	30.161734	38.974778	70.765328
5	146	10	6	2	89.047945	2516.763699	17.082123	21.746370	38.828904	30.449863	30.721370	88.904110

When considering average standardized test scores for 6th grade California students:

Direct funded charter schools (type_id 9) perform worse than locally funded charter schools (type_id 10) and public schools (type_id 7) in all categories.

When compared to public schools, locally funded charter schools have a smaller percentage of students failing to meet standards and a comparable percentage of students exceeding standards.

In [153]:

middle_test_avgs.loc[middle_test_avgs.grade == 7]

Out[153]:

	total_schools	type_id	grade	test_id	AVG(c.students_tested)	AVG(c.mean_scale_score)	AVG(c.percentage_standard_exceeded)	AVG(c.percentage_standard_met)	AVG(c.percentage_standard_met_and_above)	AVG(c.percentage_standard_nearly_met)	AVG(c.percentage_standard_not_met)	AVG(c.students_with_scores)
6	1958	7	7	1	208.775281	2536.362921	14.081134	32.878514	46.959704	24.052211	28.988131	208.557201
7	469	9	7	1	73.332623	2541.061834	13.836439	34.042303	47.878699	25.859510	26.261343	73.098081
8	147	10	7	1	93.394558	2552.635374	15.171633	38.552313	53.724082	25.204150	21.070408	93.156463
9	1958	7	7	2	209.850868	2518.373544	16.019254	18.638882	34.658059	27.565756	37.776231	209.740552
10	469	9	7	2	73.362473	2518.918337	14.592836	18.973518	33.566397	29.457463	36.976013	73.249467
11	147	10	7	2	93.482993	2531.609524	17.028503	21.154150	38.182925	30.114422	31.702721	93.428571

When considering average standardized test scores for 7th grade California students:

Direct funded charter schools underperform locally funded charter schools in both test categories. Compared to public schools, the results are mixed: a slightly higher percentage of direct funded charter students meet or exceed standards in ELA (47.9% vs. 47.0%), but a slightly lower percentage do so in mathematics (33.6% vs. 34.7%). Direct funded charter schools also have a lower percentage of students failing to meet standards than public schools on both tests. The mixed results might point to some direct funded charter schools significantly outperforming others.

Locally funded charter schools outperform the direct funded charter schools and public schools on both tests, with the larger gap in ELA.

In [154]:

middle_test_avgs.loc[middle_test_avgs.grade == 8]

Out[154]:

	total_schools	type_id	grade	test_id	AVG(c.students_tested)	AVG(c.mean_scale_score)	AVG(c.percentage_standard_exceeded)	AVG(c.percentage_standard_met)	AVG(c.percentage_standard_met_and_above)	AVG(c.percentage_standard_nearly_met)	AVG(c.percentage_standard_not_met)	AVG(c.students_with_scores)
12	1986	7	8	1	205.294058	2551.972558	13.756012	32.270987	46.026949	26.669491	27.303510	205.115307
13	450	9	8	1	73.393333	2557.825333	13.462311	34.093000	47.555578	28.336089	24.108711	73.273333
14	150	10	8	1	91.513333	2569.438667	16.678200	36.637267	53.316333	25.787600	20.896400	91.433333
15	1981	7	8	2	206.566381	2532.360020	17.823433	16.056209	33.879460	23.707304	42.412807	206.317516
16	448	9	8	2	73.620536	2530.221652	15.793504	16.245022	32.038571	24.666629	43.294487	73.430804
17	150	10	8	2	91.506667	2540.207333	18.388400	17.317933	35.706600	24.267467	40.025533	91.426667

When considering average standardized test scores for 8th grade California students:

Direct funded charter schools underperform locally funded charter schools in both test categories. Compared to public schools, a slightly higher percentage of their students meet or exceed standards in ELA (47.6% vs. 46.0%), but a lower percentage do so in mathematics (32.0% vs. 33.9%). Direct funded charter schools have the highest percentage of students failing to meet standards in mathematics.

Public schools have the highest percentage of students failing to meet standards in ELA.

Locally funded charter schools have the highest percentage of students meeting and exceeding standards, and the lowest percentage failing to meet standards, on both tests.

In [155]:

#Create view for avg test results above and below standards by school type, test type and grade


view = """ CREATE VIEW averages AS
        SELECT c.grade
        , c.test_id
        , e.type_id
        , AVG(c.percentage_standard_not_met) avg_percent_std_not_met
        , AVG(c.percentage_standard_met_and_above) avg_percentage_standard_met_and_above
        FROM caasp c 
        INNER JOIN entities e ON c.school_code = e.school_code 
            AND c.county_code = e.county_code 
            AND c.district_code = e.district_code 
        WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) AND c.subgroup_id = 1 AND c.school_code >0
        GROUP BY e.type_id, c.grade, c.test_id
        ORDER BY c.grade, c.test_id;
        """
       
#query num of schools grouped by type, test type and grade with more than avg percentage of students below stnds

below_avg_query = """SELECT c.grade
                        , c.test_id
                        , e.type_id
                        , a.avg_percent_std_not_met
                        , COUNT(*) num_percent_std_not_met_grter_avg
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    INNER JOIN averages a ON c.grade = a.grade AND c.test_id = a.test_id
                        AND e.type_id = a.type_id
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) 
                        AND  c.percentage_standard_not_met > a.avg_percent_std_not_met 
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
       
                 """

In [156]:

#(re)create the averages view; these statements return no rows, so run them with conn.execute
conn.execute('DROP VIEW IF EXISTS averages;')
conn.execute(view)
stds_not_met = pd.read_sql_query(below_avg_query, conn)
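The `averages` view has to exist before `below_avg_query` can join against it. A minimal sketch of the create-then-query pattern with plain `sqlite3`, using a throwaway in-memory database; the `scores` table, the `type_averages` view, and their columns are hypothetical and much smaller than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE scores (type_id INTEGER, pct_not_met REAL)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(7, 20.0), (7, 40.0), (9, 30.0), (9, 50.0)],
)

# Dropping first makes the cell safe to re-run.
conn.execute("DROP VIEW IF EXISTS type_averages")
conn.execute(
    """CREATE VIEW type_averages AS
       SELECT type_id, AVG(pct_not_met) AS avg_not_met
       FROM scores
       GROUP BY type_id"""
)

# The view can now be queried (or joined against) like a table.
view_rows = conn.execute(
    "SELECT type_id, avg_not_met FROM type_averages ORDER BY type_id"
).fetchall()
print(view_rows)
```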

In [ ]:

def convert_name(num_tup):
    #converts a (grade, test_id, type_id) tuple of codes to a tuple of names
    return (num_tup[0], test_types.loc[num_tup[1]].values[0], school_types.loc[num_tup[2]].values[0])

#middle_test_avgs has one row per (grade, test, school type), in the same order as the bars plotted below
labels = middle_test_avgs[['grade', 'test_id', 'type_id']].values.tolist()
labels_convert = [convert_name(x) for x in labels]
label_names = [(x[0], x[2]) for x in labels_convert]
colors = ['#5c678a' if x[1] == 'ELA' else '#00b3b3' for x in labels_convert]
x_values = range(middle_test_avgs.shape[0])

Percentage of schools with a higher than average percentage of students not meeting standards¶

In [206]:

stds_not_met['total_schools'] = middle_test_avgs.total_schools
stds_not_met['percentage_greater_than_avg'] = (stds_not_met.num_percent_std_not_met_grter_avg/
                                               stds_not_met.total_schools)

plt.figure(figsize =(8,6))
plt.bar(x_values, stds_not_met.percentage_greater_than_avg, color = colors)
plt.xticks(x_values, label_names, rotation = 'vertical')

#create legend 
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.title('Percentage of schools with higher than average students not meeting standards')
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

plt.show()

From the above we see that, when grouped by grade, test, and school type, fewer than half of all schools have a greater than average percentage of students whose scores do not meet standards.
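One possible reason the share of schools above their group average can fall below one half is skew: a few schools with very high percentages pull the mean above the median. A minimal pandas sketch with made-up numbers (the `group` and `pct_not_met` columns are hypothetical) that computes the share of schools above their own group's mean:

```python
import pandas as pd

# Made-up percentages; each group has one extreme school.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "pct_not_met": [10, 12, 15, 18, 95, 20, 22, 25, 30, 90],
})

# Broadcast each group's mean back onto its rows, then compare row by row.
group_mean = df.groupby("group")["pct_not_met"].transform("mean")
df["above_mean"] = df["pct_not_met"] > group_mean

# The mean of a boolean column is the share of True values.
share_above = df.groupby("group")["above_mean"].mean()
print(share_above)
```

In both toy groups only one school in five sits above the group mean. Whether this explains the real data would need a look at the actual distributions.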

In [158]:

above_avg_query =""" SELECT c.grade
                    , c.test_id
                    , e.type_id
                    , a.avg_percentage_standard_met_and_above
                    , COUNT(*) num_percent_std_met_above_grter_avg
                    FROM caasp c 
                    INNER JOIN entities e ON c.school_code = e.school_code 
                        AND c.county_code = e.county_code 
                        AND c.district_code = e.district_code 
                    INNER JOIN averages a ON c.grade = a.grade AND c.test_id = a.test_id 
                        AND e.type_id = a.type_id
                    WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10) 
                        AND  c.percentage_standard_met_and_above > a.avg_percentage_standard_met_and_above 
                        AND  c.school_code > 0 
                    GROUP BY e.type_id, c.grade, c.test_id
                    ORDER BY c.grade, c.test_id;
                """

In [159]:

stds_met_above = pd.read_sql_query(above_avg_query, conn)

Percentage of schools with a higher than average percentage of students meeting or exceeding standards¶

In [209]:

stds_met_above['total_schools'] = middle_test_avgs.total_schools
stds_met_above['percentage_greater_than_avg'] = (stds_met_above.num_percent_std_met_above_grter_avg/
                                                 stds_met_above.total_schools)
plt.figure(figsize =(8,6))
plt.bar(x_values, stds_met_above.percentage_greater_than_avg, color = colors)
plt.xticks(x_values, label_names, rotation = 'vertical')

#create legend 
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.title('Percentage of schools with higher than average students meeting or exceeding standards')
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

plt.show()

We also see that most charter schools have less than the average percentage of students who score at or above standards.

In [161]:

#copy the slice so that adding columns later does not raise SettingWithCopyWarning
avg_not_met_and_met = middle_test_avgs[['grade', 'type_id', 'test_id','AVG(c.percentage_standard_not_met)',
                                        'AVG(c.percentage_standard_met_and_above)' ]].copy()

In [162]:

#use the public school (type_id 7) average of each (grade, test) group as the baseline;
#as in the output above, the first row of each group is the public school average
groups = []
for _, g in avg_not_met_and_met.groupby(['grade', 'test_id']):
    g = g.copy()
    g['diff_not_met'] = g.iloc[0, 3]
    g['diff_met'] = g.iloc[0, 4]
    groups.append(g)

#DataFrame.append was removed in pandas 2.0, so collect the groups and concatenate once
avg_diffs_from_pub = pd.concat(groups)

In [163]:

avg_diffs_from_pub.diff_not_met = (avg_diffs_from_pub['AVG(c.percentage_standard_not_met)']
                                   - avg_diffs_from_pub.diff_not_met)

In [164]:

avg_diffs_from_pub.diff_met = (avg_diffs_from_pub['AVG(c.percentage_standard_met_and_above)'] 
                               - avg_diffs_from_pub.diff_met)

Difference Between Average Charter Student Performance and Average Public School Student Performance¶

In [165]:

#signed differences: charter avg percentage - public school avg percentage
avg_diffs_from_pub

Out[165]:

	grade	type_id	test_id	AVG(c.percentage_standard_not_met)	AVG(c.percentage_standard_met_and_above)	diff_not_met	diff_met
0	6	7	1	27.633566	45.437815	0.000000	0.000000
1	6	9	1	28.001181	42.883966	0.367616	-2.553849
2	6	10	1	21.741918	50.754521	-5.891648	5.316705
3	6	7	2	36.514465	34.680878	0.000000	0.000000
4	6	9	2	38.974778	30.863446	2.460313	-3.817432
5	6	10	2	30.721370	38.828904	-5.793095	4.148026
6	7	7	1	28.988131	46.959704	0.000000	0.000000
7	7	9	1	26.261343	47.878699	-2.726787	0.918996
8	7	10	1	21.070408	53.724082	-7.917723	6.764378
9	7	7	2	37.776231	34.658059	0.000000	0.000000
10	7	9	2	36.976013	33.566397	-0.800218	-1.091663
11	7	10	2	31.702721	38.182925	-6.073510	3.524866
12	8	7	1	27.303510	46.026949	0.000000	0.000000
13	8	9	1	24.108711	47.555578	-3.194798	1.528629
14	8	10	1	20.896400	53.316333	-6.407110	7.289385
15	8	7	2	42.412807	33.879460	0.000000	0.000000
16	8	9	2	43.294487	32.038571	0.881680	-1.840888
17	8	10	2	40.025533	35.706600	-2.387273	1.827140

Here we see the differences, in percentage points, in the average percentage of students scoring in the categories standards not met and standards met and above.

Each difference is calculated as charter average percentage - public average percentage for each category.

A negative value in the diff_not_met column means that the charter school type had a smaller average percentage of students not meeting standards than public schools.

A positive value in the diff_met column means that the charter school had a larger percentage of students on average scoring at or above standard levels than public schools.
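A minimal sketch of the same subtraction done with a merge instead of a loop, using the grade 6 ELA averages copied from the table above. The shortened column names (`not_met`, `met_and_above`) and the `_pub` suffix are choices made for this illustration only.

```python
import pandas as pd

# Grade 6 ELA averages copied from the table above.
avgs = pd.DataFrame({
    "grade": [6, 6, 6],
    "test_id": [1, 1, 1],
    "type_id": [7, 9, 10],
    "not_met": [27.633566, 28.001181, 21.741918],
    "met_and_above": [45.437815, 42.883966, 50.754521],
})

# Public school (type_id 7) rows serve as the baseline for each (grade, test_id).
baseline = avgs.loc[avgs["type_id"] == 7, ["grade", "test_id", "not_met", "met_and_above"]]
merged = avgs.merge(baseline, on=["grade", "test_id"], suffixes=("", "_pub"))

# Signed differences: school type average minus public school average.
merged["diff_not_met"] = merged["not_met"] - merged["not_met_pub"]
merged["diff_met"] = merged["met_and_above"] - merged["met_and_above_pub"]
print(merged[["type_id", "diff_not_met", "diff_met"]])
```

Because the inputs here are already rounded to six decimals, the last digit of a difference can differ slightly from the table.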

In [181]:

#plot standards not met
plt.figure(figsize = (12, 6))
plt.bar(x_values, avg_diffs_from_pub['AVG(c.percentage_standard_not_met)'], 
        label = label_names, color= colors)
plt.ylabel('mean percentage')
plt.xticks(range(avg_diffs_from_pub.shape[0]),label_names, rotation = 'vertical')
plt.title('Standards not met')

ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)

#create legend patches and add legend
ela_patch = mpatches.Patch(color='#5c678a', label='ELA')
math_patch = mpatches.Patch(color='#00b3b3', label='Mathematics')

plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False )
plt.show()



#plot standards met and above
plt.figure(figsize = (12, 6))

plt.bar(x_values, avg_diffs_from_pub['AVG(c.percentage_standard_met_and_above)'], 
        label = label_names, color = colors)
plt.ylabel('mean percentage')
plt.xticks(range(avg_diffs_from_pub.shape[0]),label_names, rotation = 'vertical')
plt.title('\n\nStandards met and above')

#clean plots
ax = plt.gca()
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)



#add legend 
plt.legend(handles=[ela_patch, math_patch],bbox_to_anchor=(1.35, 1), frameon = False);

In [182]:

percent_stnds_query = """SELECT c.grade
                                , c.test_id
                                , e.type_id
                                , c.percentage_standard_not_met
                                , c.percentage_standard_met_and_above
                            FROM caasp c 
                            INNER JOIN entities e ON c.school_code = e.school_code 
                                AND c.county_code = e.county_code 
                                AND c.district_code = e.district_code 
                            WHERE c.grade IN (6, 7, 8) AND e.type_id IN (7, 9, 10)
                                AND c.subgroup_id = 1
                            ORDER BY c.grade, c.test_id;
           
                            """

In [183]:

results = pd.read_sql_query(percent_stnds_query, conn)
results.head()

Out[183]:

	grade	test_id	type_id	percentage_standard_not_met	percentage_standard_met_and_above
0	6	1	9	55.56	5.56
1	6	1	9	5.71	80.00
2	6	1	9	38.89	55.56
3	6	1	9	61.90	12.70
4	6	1	9	63.64	14.55

In [187]:

f, ax = plt.subplots(6, 3, figsize = (15, 25), sharey = True)

x = range(0, 100, 10) #left edge of each 10-point bin
i, j = 0,0
for name, group in results.groupby(['grade', 'test_id', 'type_id']):
    #fix the bin range to 0-100 so every panel uses the same bins, aligned with x
    res_met = relfreq(group.percentage_standard_met_and_above, numbins = 10, defaultreallimits = (0, 100))
    res_not_met = relfreq(group.percentage_standard_not_met, numbins = 10, defaultreallimits = (0, 100))
    ax[i, j].bar(x, res_met.frequency, width = res_met.binsize, align = 'edge',
                 label = 'meeting and above standard', alpha = .5, color = '#187eba')
    ax[i, j].bar(x, res_not_met.frequency, width = res_not_met.binsize, align = 'edge',
                 label = 'not meeting standard', alpha = .7, color = '#f2a771')
    ax[i, j].set_xlabel(convert_name(name), fontsize = 13)
    ax[i, j].spines['top'].set_visible(False)
    ax[i, j].spines['right'].set_visible(False)
    
    if i == 0 and j == 2:
        ax[i, j].legend(frameon = False, bbox_to_anchor=(1.5, 1.55), fontsize = 12)
    j +=1
    if j ==3:
        i +=1
        j = 0 

f.suptitle("\nRelative frequencies of schools with percentage of students in each scoring category", fontsize = 16);
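To make the binning behind these panels concrete, here is a minimal sketch of relative frequencies over ten fixed bins from 0 to 100. It swaps in `np.histogram` for `relfreq` purely for illustration, and the six percentages are made up.

```python
import numpy as np

# Hypothetical school-level percentages for one group.
pct_met = np.array([5.6, 12.7, 14.6, 55.6, 80.0, 100.0])

# Ten fixed-width bins over 0-100; the last bin also includes the value 100.
counts, edges = np.histogram(pct_met, bins=10, range=(0, 100))

# Dividing by the total turns counts into relative frequencies that sum to 1.
rel_freq = counts / counts.sum()
print(edges[:-1])  # left edge of each bin
print(rel_freq)
```

Fixing the range keeps the bins identical across groups, so bars in different panels cover the same score intervals.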

So far we have not separated performance into subgroups (gender, income level, ethnicity, etc.). However, at this point, on average, there seems to be evidence pointing to locally funded charter schools performing the best of these three categories. The relative frequencies, combined with the previous analysis of averages, suggest that locally funded charter schools clearly outperform on the ELA tests and tend to slightly outperform in mathematics (though this is less clear). To see whether the differences between means are statistically significant, we can perform independent-samples t-tests.

In [210]:

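def calc_effect_size(s1, s2):
    """calculates a standardized effect size for an independent samples t test:
        the difference in means divided by sqrt(var(s1) + var(s2)*len(s1)/len(s2))
        input: samples 1 and 2 
        output: effect size (positive when the mean of s1 exceeds the mean of s2)
    """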
    std_est_s1_s2 = np.sqrt(len(s1))*np.sqrt(np.std(s1)**2/len(s1) + np.std(s2)**2/len(s2))
    effect_size_s1_s2 = (np.mean(s1) - np.mean(s2))/ std_est_s1_s2
    return effect_size_s1_s2
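For comparison, a more conventional standardized effect size is Cohen's d, which divides the difference in means by the pooled sample standard deviation. The sketch below is not the formula used in `calc_effect_size` and would give different values from those reported in this notebook; the function name and the toy samples are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(s1, s2):
    """Cohen's d: difference in means over the pooled sample standard deviation."""
    n1, n2 = len(s1), len(s2)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n1 - 1) * variance(s1) + (n2 - 1) * variance(s2)) / (n1 + n2 - 2)
    return (mean(s1) - mean(s2)) / sqrt(pooled_var)

# Toy samples: both have sample variance 4, and the means differ by 1.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 3.0, 5.0]
d = cohens_d(group_a, group_b)
print(d)
```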

Reporting a significant t-test for independent groups (µ1 ≠ µ2)¶

Since we are performing two tests against public schools, we will use a Bonferroni correction:

alpha = .05/2 = .025
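A minimal sketch of applying the corrected threshold, using the two grade 6 ELA "standard not met" p-values that appear in the output below. The variable names are illustrative.

```python
family_alpha = 0.05
n_comparisons = 2  # locally funded vs public, direct funded vs public
alpha = family_alpha / n_comparisons

# p-values for grade 6 ELA, percentage standard not met, from the output below.
p_values = {
    "Public vs Local Funded Charter": 2.7177216720127894e-06,
    "Public vs Direct Funded Charter": 0.6404903551564978,
}

# A comparison counts as significant only if its p-value is below the corrected alpha.
significant = {name: p < alpha for name, p in p_values.items()}
print(alpha)
print(significant)
```

Under this threshold the locally funded comparison is significant for this grade and test, while the direct funded comparison is not.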

In [216]:

"""conduct ttest and ANOVA to compare group means"""


for name, g in results.groupby(['grade', 'test_id']):
    not_met_pub = g[g.type_id == 7].percentage_standard_not_met
    not_met_pubchart = g[g.type_id == 9].percentage_standard_not_met
    not_met_loclchart = g[g.type_id == 10].percentage_standard_not_met
    
    met_pub = g[g.type_id == 7].percentage_standard_met_and_above
    met_pubchart = g[g.type_id == 9].percentage_standard_met_and_above
    met_loclchart = g[g.type_id == 10].percentage_standard_met_and_above
    
    
    print(name)
    print("\nPercentage Standard Not Met\n")
    print("\tPublic vs Local Funded Charter:  ", ttest_ind(not_met_pub, not_met_loclchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(not_met_loclchart, not_met_pub))
    
    print("\n\tPublic vs Direct Funded Charter:  ", ttest_ind(not_met_pub, not_met_pubchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(not_met_pubchart, not_met_pub))
    print("\n\tANOVA: ", f_oneway(not_met_loclchart, not_met_pub, not_met_pubchart))
    
    
    print("\n\nPercentage Standard Met and Above:\n")
    print("\tPublic vs Local Funded Charter:  ", ttest_ind(met_pub, met_loclchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(met_loclchart, met_pub))
    
    print("\n\tPublic vs Direct Funded Charter:  ", ttest_ind(met_pub, met_pubchart, equal_var = False))
    print("\tEffect size:", calc_effect_size(met_pubchart, met_pub))
    print("\n\tANOVA: ", f_oneway(met_loclchart, met_pub, met_pubchart), '\n')

(6, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=4.863140767762414, pvalue=2.7177216720127894e-06)
	Effect size: -0.40379063498473694

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.46724341845865824, pvalue=0.6404903551564978)
	Effect size: 0.02148139069019021

	ANOVA:  F_onewayResult(statistic=9.310587486899731, pvalue=9.238188714501302e-05)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.448881608300607, pvalue=0.0007188385907717147)
	Effect size: 0.286366383503492

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=2.719517618131738, pvalue=0.006717634561013535)
	Effect size: -0.1250280508310029

	ANOVA:  F_onewayResult(statistic=8.773646809678377, pvalue=0.00015767274808358686) 

(6, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=3.826541906866097, pvalue=0.00018617512679834953)
	Effect size: -0.3177277749292982

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-2.6069266373049267, pvalue=0.00935965794541261)
	Effect size: 0.11998026191211447

	ANOVA:  F_onewayResult(statistic=10.368265031118456, pvalue=3.2243007093792584e-05)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-2.637560827839294, pvalue=0.00917025432933683)
	Effect size: 0.21900102288641626

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=4.131103026420736, pvalue=4.086018921484817e-05)
	Effect size: -0.19012399871666677

	ANOVA:  F_onewayResult(statistic=10.723126305925419, pvalue=2.2652059755816576e-05) 

(7, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=6.176188987257654, pvalue=4.2973809057692475e-09)
	Effect size: -0.5109877160232291

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=3.1926381570932993, pvalue=0.0014683887177268237)
	Effect size: -0.14755349706210535

	ANOVA:  F_onewayResult(statistic=17.3574775499213, pvalue=3.252238949898397e-08)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.952745832416773, pvalue=0.0001129225835691313)
	Effect size: 0.32705293131502317

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.8876149299114162, pvalue=0.3750390724054987)
	Effect size: 0.041023102059530046

	ANOVA:  F_onewayResult(statistic=7.424404647543067, pvalue=0.0006093930968330569) 

(7, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=3.7340762934518206, pvalue=0.00025628969370725796)
	Effect size: -0.30895940576764613

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=0.7953248671919885, pvalue=0.42668638722929164)
	Effect size: -0.03675792788829846

	ANOVA:  F_onewayResult(statistic=6.510682563206843, pvalue=0.0015121078603271049)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-1.9408738435777917, pvalue=0.05394521677420233)
	Effect size: 0.16059299981852093

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=1.0427309321594387, pvalue=0.29741763405780836)
	Effect size: -0.04819217908039646

	ANOVA:  F_onewayResult(statistic=2.7301262091640273, pvalue=0.06540011865816857) 

(8, 1)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=4.505622907527654, pvalue=1.2070756941162392e-05)
	Effect size: -0.36902659054735726

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=3.789355274843333, pvalue=0.00016358778070739595)
	Effect size: -0.1787968958655651

	ANOVA:  F_onewayResult(statistic=14.308425317760989, pvalue=6.608460275900196e-07)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-3.9644038811673656, pvalue=0.00010835250099670356)
	Effect size: 0.3247141074635852

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-1.4868222292420152, pvalue=0.13751896752976506)
	Effect size: 0.07015548609924911

	ANOVA:  F_onewayResult(statistic=9.415175426800358, pvalue=8.430895373646295e-05) 

(8, 2)

Percentage Standard Not Met

	Public vs Local Funded Charter:   Ttest_indResult(statistic=1.2709213042496321, pvalue=0.20548456316997846)
	Effect size: -0.10409654072398493

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=-0.789090671636335, pvalue=0.430338700611116)
	Effect size: 0.03731666015263003

	ANOVA:  F_onewayResult(statistic=1.2975321345190596, pvalue=0.27338369189580475)


Percentage Standard Met and Above:

	Public vs Local Funded Charter:   Ttest_indResult(statistic=-0.9902081723127987, pvalue=0.3234695081400931)
	Effect size: 0.0811040677886892

	Public vs Direct Funded Charter:   Ttest_indResult(statistic=1.6973061073207827, pvalue=0.09009585414778282)
	Effect size: -0.08026627636293959

	ANOVA:  F_onewayResult(statistic=2.1105303170433203, pvalue=0.12138317090645377)
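
To help read the signs in the output above: passing `equal_var = False` to `ttest_ind` selects Welch's unequal-variance t-test, whose statistic is positive when the first sample (here, the public schools) has the larger mean. The following is a minimal standard-library sketch of that statistic and its Welch-Satterthwaite degrees of freedom. The function name `welch_t` and the sample numbers are made up for illustration and are not part of the analysis.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test of a against b."""
    # Squared standard error of each sample mean (sample variance / n).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    # t is positive when the first sample has the larger mean.
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Made-up percentages, for illustration only (not CAASPP data).
public = [30.0, 35.0, 40.0, 45.0]
charter = [20.0, 25.0, 30.0, 35.0]
t, df = welch_t(public, charter)
print(f"t = {t:.3f}, df = {df:.1f}")
```

Under this convention, a positive statistic in a "Percentage Standard Not Met" comparison means the public schools had the higher average share of students not meeting the standard.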

No significant evidence that the means are different at alpha = .01 for the following:

direct charter, 6th, ela, standards not met
direct charter, 7th, ela, standards met and above
direct charter, 7th, math, standards not met
direct charter, 7th, math, standards met and above
direct charter, 8th, ela, standards met and above
direct charter, 8th, math, standards not met
direct charter, 8th, math, standards met and above

local charter, 7th, math, standards met and above
local charter, 8th, math, standards not met
local charter, 8th, math, standards met and above

However, there is evidence that we should reject our null hypothesis that the difference between the population means is zero, at alpha = .01, for the following:

direct charter, 6th, ela, standards met and above
direct charter, 6th, math, standards not met
direct charter, 6th, math, standards met and above
direct charter, 7th, ela, standards not met
direct charter, 8th, ela, standards not met

local charter, 6th, ela, standards not met
local charter, 6th, ela, standards met and above
local charter, 6th, math, standards not met
local charter, 6th, math, standards met and above
local charter, 7th, ela, standards not met
local charter, 7th, ela, standards met and above
local charter, 7th, math, standards not met
local charter, 8th, ela, standards not met
local charter, 8th, ela, standards met and above
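
Lists like these are easy to get wrong when assembled by hand, so here is a minimal sketch of deriving them from the printed p-values instead. The p-values are copied from the output above and rounded to two significant figures. The sketch assumes that test id 1 is ELA and test id 2 is math, as the lists above do. The names `P_VALUES` and `split_by_alpha` are hypothetical.

```python
# Keys are (school type, grade, test, measure); values are the Welch t-test
# p-values printed above, rounded to two significant figures.
P_VALUES = {
    ("local", 6, "ela", "not met"): 2.7e-06,
    ("local", 6, "ela", "met and above"): 7.2e-04,
    ("local", 6, "math", "not met"): 1.9e-04,
    ("local", 6, "math", "met and above"): 9.2e-03,
    ("local", 7, "ela", "not met"): 4.3e-09,
    ("local", 7, "ela", "met and above"): 1.1e-04,
    ("local", 7, "math", "not met"): 2.6e-04,
    ("local", 7, "math", "met and above"): 0.054,
    ("local", 8, "ela", "not met"): 1.2e-05,
    ("local", 8, "ela", "met and above"): 1.1e-04,
    ("local", 8, "math", "not met"): 0.21,
    ("local", 8, "math", "met and above"): 0.32,
    ("direct", 6, "ela", "not met"): 0.64,
    ("direct", 6, "ela", "met and above"): 6.7e-03,
    ("direct", 6, "math", "not met"): 9.4e-03,
    ("direct", 6, "math", "met and above"): 4.1e-05,
    ("direct", 7, "ela", "not met"): 1.5e-03,
    ("direct", 7, "ela", "met and above"): 0.38,
    ("direct", 7, "math", "not met"): 0.43,
    ("direct", 7, "math", "met and above"): 0.30,
    ("direct", 8, "ela", "not met"): 1.6e-04,
    ("direct", 8, "ela", "met and above"): 0.14,
    ("direct", 8, "math", "not met"): 0.43,
    ("direct", 8, "math", "met and above"): 0.090,
}


def split_by_alpha(p_values, alpha):
    """Split comparisons into (significant, not significant) at alpha."""
    significant = [key for key, p in p_values.items() if p < alpha]
    not_significant = [key for key, p in p_values.items() if p >= alpha]
    return significant, not_significant


# The split depends on the threshold, so print it for more than one alpha.
for alpha in (0.05, 0.01, 0.001):
    sig, not_sig = split_by_alpha(P_VALUES, alpha)
    print(f"alpha = {alpha}: {len(sig)} significant, {len(not_sig)} not significant")
```

Because 24 comparisons are made at once, a stricter threshold or a multiple-comparison correction may be worth considering; that choice is left open here.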

Conclusions¶

The effect size for the independent t-test is very small (<.2) when comparing all public school and direct funded charter school results by grade and test type.

The effect size for the independent t-test is small to medium (roughly .2 to .5 in absolute value) when comparing public school and locally funded charter school results by grade and test type. The exceptions are 7th-grade math for standards met and above (about .16) and 8th-grade math, where the effect size is very small (about .1 or less).
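
The `calc_effect_size` helper is defined earlier in the notebook and is not reproduced here. Assuming it computes something like Cohen's d (the difference in means divided by a pooled standard deviation), a minimal standard-library sketch could look like the following. The function name `cohens_d` and the sample numbers are hypothetical, and the pooled-variance form is one common convention rather than necessarily the one used above.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for a minus b, using the pooled sample standard deviation."""
    n_a, n_b = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(a) + (n_b - 1) * variance(b)
    ) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Made-up percentages, for illustration only (not CAASPP data).
charter = [20.0, 25.0, 30.0, 35.0]
public = [30.0, 35.0, 40.0, 45.0]
d = cohens_d(charter, public)
print(f"d = {d:.3f}")
```

With the charter sample passed first, a negative value means the charter schools have the lower mean, which matches the sign pattern of the "standard not met" effect sizes for locally funded charters in the output.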

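At this point, we have good evidence that locally funded charter schools outperform public schools: most differences in means are statistically significant, with the weakest results in 8th-grade math (p > .2 on both measures) and in the 7th-grade math percentage meeting or exceeding standards (p of about .054). However, there is not much evidence that direct funded charter schools perform better than public schools on average. Most of those comparisons are not significant, and the significant ones point in both directions: direct funded charters have a lower share of students not meeting the 8th-grade ELA standard, but also a lower share meeting or exceeding the 6th-grade math standard.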

Locally funded charter schools appear to be a promising alternative for families looking to provide better education to their children without needing to pay for private schooling.

Avenues for Further Inquiry¶

Some potential issues with these findings:

Our findings are fairly tentative at this point, but they have allowed us to rule out investigating directly funded charter schools any further.

We have not separated the above data into subcategories. For instance, can this difference in performance be accounted for by a difference in gender or income level of attending students? It is possible that there are more locally funded charter schools in higher income areas, meaning that the difference in performance can be accounted for by family income. Is it possible that 2017 was just a good year for charter schools? How do other years compare? These are good areas for further analysis.

Curious parents should check out https://www.greatschools.org/ to learn about the great schools in their area and view their CAASPP results.

Should Your Kids go to Charter School?