Wednesday, December 27, 2017

School Recalibration Report: A First Glance

On December 15, the consulting firm Augenblick, Palaich and Associates delivered a draft of their school-funding recalibration report. It is a 500-plus-page document that no doubt required quite a bit of work on the part of the consultants.

The report itself is true to the job that the legislative Recalibration Committee asked for. As such, it is comprehensive and, like all consulting products in public policy, spends most of its time telling us "what is". The remaining part discusses three different ways to evaluate resource allocation models.

Here is where it gets interesting. Of the three models, the report falls desperately short on one. Before I get to that part, let me express my appreciation for their analysis of the first two models. They have done a good job analyzing them and delivering value for taxpayers' money. The first model, called "Professional Judgment", would distribute school funding as follows, in the words of the Augenblick report:
The professional judgement (PJ) approach relies on the experience and expertise of educators in the state to identify the resources needed to ensure all districts, schools, and students can meet state standards and requirements. In Wyoming, this is specifically the required basket of goods and services, as well as any related requirements (such as assessments and the accountability system). Resources include school-level personnel, non-personnel costs, additional supports and services, technology, and district-level resources. These resources are first identified for students with no identified special needs (which allows for the calculation of a base set of resources) and then separately for special needs students, presented as adjustments, or “weights.” 
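The base-plus-weights structure described in the quote above can be sketched in a few lines of code. Everything here is hypothetical: the dollar figures, the weights, and the function name are my own illustration, not numbers from the report.

```python
# Sketch of a base-plus-weights funding calculation, as the PJ approach
# describes it: a base amount per student, plus weighted adjustments for
# identified special-needs categories. All figures are hypothetical.

def pj_funding(enrollment, base_per_pupil, weighted_counts):
    """Total funding: base for every student, plus 'weight' extra
    shares of the base for each student in a special-needs category.
    weighted_counts: list of (student_count, weight) pairs."""
    base = enrollment * base_per_pupil
    adjustments = sum(count * weight * base_per_pupil
                      for count, weight in weighted_counts)
    return base + adjustments

# Hypothetical district: 1,000 students at a $10,000 base;
# 150 at-risk students at a 0.25 weight, 50 ELL students at a 0.40 weight.
total = pj_funding(1000, 10_000, [(150, 0.25), (50, 0.40)])
print(total)  # 10,575,000: 10,000,000 base + 375,000 + 200,000 in weights
```

The point of the sketch is only to show where the bureaucracy lives: every count and every weight in those pairs has to be identified, negotiated, and maintained by panels of experts.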
The method for assessing the need for funds in individual school districts bears a striking similarity to indicative central planning (not to be confused with Soviet-style teleological planning):

The PJ approach estimates the costs of adequacy by developing representative schools and one or more representative districts. Representative schools are designed using statewide average characteristics to resemble schools across the state. This includes identifying both averages for school sizes and grade configurations, as well as identifying average demographics for at-risk, English Language Learners (ELLs), and special education students. Note that in Wyoming, ELL is currently a part of the at-risk definition but was considered separately during the PJ panels. 
Again, the consultants did a commendable job working through this model. Without getting lost in the details, they have thoroughly explained it as they were asked to do. 

There is only one problem. It is incredibly bureaucratic. It reminds me of the appropriations formula training I got in college when I, as a naive young 19-year-old with wet ears and a head full of mush (kudos El Rushbo...), thought that I wanted to become a government bureaucrat. It really does not matter whether a funding model like this one is used widely around the country; it still requires its workload. The question for our legislators is, of course: does this funding formula deliver something that other, less onerous formulas don't?

The second funding model in the consulting report is called "successful schools" and is explained as follows:

The theory behind the successful districts/schools approach is that the resources used at the base level in the highest achieving districts in a state should be representative of the amount of resources all districts will need to successfully educate students with no special needs. Because these districts often have lower than average numbers of students with special needs, this method is not appropriate for determining adequate funding for special needs students, such as students who are from low-income families, English language learner (ELL) students, and students with disabilities. Using the typical successful districts/schools method, the study is conducted at the district level. Districts that are outperforming their peers, measured using both status performance (e.g. the percentage of student scoring at the proficient level or above on state assessments) and growth (the amount of academic gains students achieve over time) are identified. Expenditure data for these districts are then collected for broad categories of expenditures, such as central office administration and operations, school administration, school instruction, and school operations and maintenance. An overall per-pupil amount for these expenditures is calculated, weighted by district enrollment, and the result represents the estimated adequate per-pupil base spending amount. 
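The enrollment-weighted per-pupil calculation at the end of that quote is simple enough to sketch. The district figures below are hypothetical; note that weighting per-pupil amounts by enrollment is the same as dividing total dollars by total students.

```python
# Sketch of the "successful schools" base calculation: an enrollment-
# weighted per-pupil spending amount across the identified successful
# districts. District figures are hypothetical.

def weighted_per_pupil(districts):
    """districts: list of (enrollment, total_expenditure) pairs for the
    successful districts. Returns the enrollment-weighted per-pupil
    amount, i.e. total dollars divided by total students."""
    total_students = sum(n for n, _ in districts)
    total_dollars = sum(spend for _, spend in districts)
    return total_dollars / total_students

# Three hypothetical successful districts:
base = weighted_per_pupil([(500, 5_500_000), (1200, 11_400_000), (300, 3_600_000)])
print(base)  # 20,500,000 / 2,000 students = 10,250 per pupil
```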
From an administrative-cost viewpoint, this model makes more sense than the "professional judgment" model. It is also more transparent for parents, taxpayers and legislators. One problem is that it still imposes an indicatively planned formula from the top down, with potentially problematic consequences: if successful school districts are lean, budgetwise, but school districts with lots of problems require significant extra resources, this formula does not - at least not by default - capture that need. 

In fairness, the model is more complex than the quote above suggests, but not in a way that would immediately seem to correct for the resource-need problem. In other words, there is a presumption behind the model that the supply of resources correlates, somehow linearly, with student performance. 

Which brings us to the third funding model, called the "statistical approach". The consulting report actually failed on this one, and that is not a criticism of the report or its authors. They explain why:

The accountability system in Wyoming focuses entirely on school-level outcomes. This rules out using district-level data for a cost analysis because there are no district-level outcomes that district officials would be trying to maximize (i.e. the theory underlying the model assumes that the producer is maximizing output, which clearly would not be the case at the district level). Estimating a cost function at the school level is theoretically possible but requires expenditure data tracked down to the school level and assumes that district or school officials are allocating all of those funds in a way that maximizes the measured school-level student performance. While Wyoming’s accounting system does provide a great deal of information on school-level expenditures, there are always some district-level expenditures for centralized services. Since those district-level funds presumably still support student achievement, they should be included in the cost function analysis, but it is not clear how best to allocate those funds down to the school level, nor is it clear that school and district officials are allocating those funds in ways that maximize school-level student performance. In the analysis below, the cost function models were estimated with and without these district-level expenditures. 
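One common way to push district-level central-office spending down to schools is per-pupil proration. To be clear, this is my own illustration of one possible allocation rule, not the method the report uses; as the quote says, it is precisely this allocation step that is unclear.

```python
# Sketch of per-pupil proration: each school gets a share of the
# district's centralized spending proportional to its share of
# enrollment. School names and figures are hypothetical.

def allocate_district_costs(school_enrollments, district_central_spend):
    """Return each school's prorated share of centralized spending."""
    total = sum(school_enrollments.values())
    return {school: district_central_spend * n / total
            for school, n in school_enrollments.items()}

shares = allocate_district_costs({"A": 400, "B": 350, "C": 250}, 1_000_000)
print(shares)  # {'A': 400000.0, 'B': 350000.0, 'C': 250000.0}
```

The trouble the report identifies is that nothing guarantees centralized dollars actually benefit schools in proportion to their headcounts, which is exactly why the cost-function models were estimated both with and without these expenditures.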
This is actually an interesting transparency problem; for all those interested in pursuing more transparency in government functions around the state, the above quote from the Augenblick report is a great example of what difference transparency could make. 

Beyond that, there is also the small-sample problem. Wyoming simply does not offer enough data for a traditional econometric study of school funding and school performance. In fact, the report explicitly says that the "small number of schools and districts makes it unlikely that any statistical model could be estimated with much precision." They deserve respect for this recognition.

The problem, though, is that our lawmakers still need to make decisions on how to fund our schools. The consultants tied their own hands by agreeing to do an econometric study of the statistical funding model. Econometrics (which I abandoned as a method a long time ago - it is useless, for the most part, in public policy) limits the kind of problems you can address. It is like a man who wants to build furniture with a hammer, but has no nails. Instead of using other tools and techniques, he puts the hammer down and lives on without furniture. 

Again, no critique of the econometrician who tried her best in this consulting report. The problem is, ostensibly, that our state government confined the consultants to a single, rather clumsy quantitative method. Now that legislators did not get the decision-making material they needed, the question is: what are they going to do instead? 

Let me offer a couple of examples of how we can reason about statistical facts without using econometrics. One of the selling points of econometrics is that you can isolate variables, one at a time, and examine their presumed causal effect (which is really nothing more than a correlation), and do so with large data volumes within a comfortably short period of time. However, as mentioned, the large-sample requirement for putting a study above the "rigorous" threshold (something they hammer into the heads of econometrics students from day one) limits its applicability in both time and space. 

This, in turn, leads to a problem where econometrics cannot help out with legislative decisions on issues that have never been addressed before, or where the reforms pursued are innovative or even entirely new. For example, suppose we were to introduce a statewide, all-inclusive school voucher system: what statistical material would we use as a basis for a rigorous, econometric analysis of the possible outcomes? 

The simple answer is: there is none. It would, at best, be a speculative study that would fall short of the standards econometricians (correctly) hold themselves to. 

Yet the legislative issue is still there. So how would we make a decision when econometrics cannot help? 

To see how we can work our way to some informed decision-making material without using econometrics, let us assume that we want to know whether or not teacher pay affects student performance. Using data from the Nation's Report Card website, hosted by the U.S. Department of Education, we pick two test scores to compare to teacher salaries across the 50 states and District of Columbia. First, 8th-grade math test scores:

Figure 1

Source: U.S. Department of Education

What can we say about this chart? Given that all other things are equal - in other words, that we consider nothing in this world other than these two variables - it looks like states with higher teacher salaries also have somewhat better 8th-grade math test scores. However, this pattern only emerges if we extract a trend line from the score data; without the trend line, there seems to be no correlation at all. 
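Extracting a trend line of this kind is a least-squares fit, which can be done without any econometrics package. The salary and score numbers below are made up for illustration; the real data come from the Nation's Report Card site.

```python
# Minimal sketch of fitting an ordinary least-squares trend line to
# salary/score pairs. All data points here are hypothetical.

def trend_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

salaries = [45, 50, 55, 60, 65, 70]      # hypothetical avg. salaries, $1,000s
scores = [278, 281, 279, 284, 282, 286]  # hypothetical 8th-grade math scores

slope, intercept = trend_line(salaries, scores)
print(f"{slope:.2f} score points per $1,000 of salary")  # 0.27
```

A slope this small is exactly the situation in Figure 1: the line technically tilts upward, but the scatter around it is what the eye actually sees.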

What do we do to find more information? We look at how teacher salaries compare to other test scores, such as science:

Figure 2

Source: U.S. Department of Education

Here, the presumed correlation is even weaker. 

We now have two sets of observations of how teacher salaries compare to test scores. Would anyone be willing to draw any conclusions about compensation of educational staff and school performance? Reasonably, no. Does this mean that we know nothing about the question at hand? No - we do know something: teacher salaries do not appear to have any visible influence on student performance. 

Obviously, we would want more data, for example test scores in reading, and for other grades. If we still were not satisfied we could repeat the studies for several years and build one gigantic set of time series data for scores and teacher salaries.

When do we have enough data? That is always a judgment call. However, what matters is to remember that statistical analysis can yield a correlative answer as well as a non-correlative answer: either there is a correlative - and thus presumed causal - relationship between two variables, or there is no such relationship at all. In the latter case, it means that, for example, we cannot hope to raise test scores by paying teachers more.
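The distinction between a correlative and a non-correlative answer can be made concrete with the Pearson coefficient, which runs from -1 (perfect negative correlation) through 0 (none) to 1 (perfect positive correlation). Again, the data below are hypothetical stand-ins for the two figures above.

```python
# Sketch: Pearson correlation coefficient, computed by hand, for one
# data set with a visible upward drift and one that is essentially flat.
# All data points are hypothetical.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

salaries = [45, 50, 55, 60, 65, 70]
math_scores = [278, 281, 279, 284, 282, 286]  # upward drift
sci_scores = [150, 154, 148, 152, 149, 153]   # flat noise

print(round(pearson_r(salaries, math_scores), 2))  # ~0.85
print(round(pearson_r(salaries, sci_scores), 2))   # ~0.09
```

A coefficient near zero, as in the second series, is the statistical form of the non-correlative answer: it tells us not to expect higher scores from higher pay.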

Sometimes we have even less data available, yet we might still have to make decisions. For example, should we continue with the education policies of the Obama administration and give more control over K-12 education to the federal government? In this case, there is one small set of data that can help us at least form an opinion on the issue. Consider how U.S. PISA test scores declined under President Obama's attempts to centralize control over our K-12 education system:
  • In 2009, the U.S. average PISA reading literacy score was 500; in 2015, after six years with his education policy, the score was down to 497;
  • In 2009, the U.S. average PISA science literacy score was 502; in 2015, after six years with his education policy, the score was down to 496;
  • In 2009, the U.S. average PISA math literacy score was 487; in 2015, after six years with his education policy, the score was down to 470.
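The declines in the bullet list above are small but consistent across all three subjects, which a trivial calculation makes plain:

```python
# The PISA point declines listed above, computed directly from the
# 2009 and 2015 scores in the bullet list.
pisa = {"reading": (500, 497), "science": (502, 496), "math": (487, 470)}
declines = {subject: y2009 - y2015 for subject, (y2009, y2015) in pisa.items()}
print(declines)  # {'reading': 3, 'science': 6, 'math': 17}
```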
If we asked an econometrician to look at these numbers, he would jump out the window. Nevertheless, suppose this was all the data we had available. Could legislators in Wyoming make a decision based on it? 

Of course they could. The most prudent decision would be to say "we do not have enough information", but if it came down to an up-or-down vote - more centralization or more decentralization of K-12 education - it is reasonable to say that "whatever small set of information we have tells us that centralization is a bad idea". 

In other words: when econometrics does not help, we need to pursue other venues for gathering, processing and interpreting statistical information. 

The Augenblick consulting group had their hands tied when it came to the econometrics part. I actually criticized the specification that the state gave them, precisely because they included an econometric study requirement. Unfortunately, now taxpayers have paid for something they could not get.

There is more to say about this report. Stay tuned. 
