Study: Multi-Year Gates Experiment to Improve Teacher Effectiveness Spent $575 Million, Didn't Make an Impact
A major, long-term experiment in improving teacher performance funded by the Bill & Melinda Gates Foundation failed in its aims, according to a study released today by the RAND Corporation. The intermediate and long-term student outcomes in affected schools were not improved, and new measures of teacher effectiveness devised through the initiative rated almost all teachers highly, the authors found.
The study, conducted by RAND in conjunction with the American Institutes for Research, renders a final verdict on the multi-year reform effort, known as the Intensive Partnerships for Effective Teaching. An earlier report, released two years ago, had given onlookers a preview of the initiative's results, few of which were trending upward.
The Intensive Partnerships undertaking drew on a 2006 paper on teacher effectiveness written by education expert Thomas Kane. Researchers have typically found that teacher quality is among the most important factors in student performance, and that proxies such as educational attainment or teacher experience are poor predictors of performance in the classroom.
Influenced by Kane's research, the Gates team joined with seven partners in a long-term effort to devise measures of teacher effectiveness and create human resources practices to maximize it. Three large school districts were chosen: Pittsburgh Public Schools, Hillsborough County Public Schools in Florida, and Shelby County Public Schools, which merged with the Memphis school district in 2013. Additionally, the foundation selected four California charter management organizations: Alliance College-Ready Public Schools, Aspire Public Schools, Green Dot Public Schools, and Partnership to Uplift Communities Schools.
Between the 2009-10 and 2015-16 school years, the districts and CMOs were awarded roughly $575 million, including $212 million directly from the Gates Foundation. The remaining funds came from a combination of sources, including the districts and CMOs themselves, local philanthropies, and the federal government.
These funds amounted to expenditures ranging from $868 per pupil at Green Dot to $3,541 in Pittsburgh. Using the money, the schools were meant to develop a measure of teacher effectiveness that accounted for both metrics of student achievement (on standardized tests, for example) and in-classroom observations by administrators.
School leaders were then expected to make use of their teacher effectiveness rubric when making decisions about hiring and recruitment, compensation, placement, tenure (where applicable; the CMOs did not grant tenure then or now), and dismissal. Ultimately, the goal was to expose low-income and minority students to better teachers, improving their rates of high school graduation and college attendance by doing so.
The plan didn't work, according to the research team.
"By 2014-2015, student achievement, [low-income minority] students' access to effective teaching, and dropout rates were not dramatically better than they were for similar sites that did not participate in the [Intensive Partnerships] initiative," they write.
One problem, they find, is that the spiffy new teacher effectiveness ratings were difficult to put into practice. After the 2012-13 school year, no more than 2 percent of teachers in any of the seven school systems were rated in the lowest level of teacher effectiveness. Although the schools rated newly hired teachers more and more effective over the course of the study, RAND's researchers found their performance to be no better based on their own calculations of value-added modeling (a statistical evaluation of teacher impact on student progress from year to year).

The rating inflation arose partially from the fact that teachers grew resistant to the new evaluations being used for high-stakes decisions like compensation and firing. Indeed, the Pittsburgh Teachers Union kicked up so much of a fuss over the new criteria that Superintendent Linda Lane had to lower the minimum score for effectiveness, twice, before the issue was resolved.
Unexpected contingencies also arose during the six years the experiment was being conducted. Pennsylvania and California experienced school funding crunches, Hillsborough County jettisoned its superintendent (MaryEllen Elia, now the education commissioner in New York state), and virtually every state in the country decided to change its statewide standardized tests.
Whatever the cause, however, teacher performance was not bolstered by the costly study. The authors note that they will continue to track student outcomes for the next two years in case improvements take longer than expected to manifest.
The difficulty involved in revamping teacher evaluations, which can stoke hostility among teachers, was evident at the same time the Intensive Partnerships initiative was underway. Last year, Bill Gates announced that his foundation would refocus its philanthropic efforts in education away from trying to build a better teacher evaluation and toward funding "networks" of innovative public schools.
Disclosure: The Bill & Melinda Gates Foundation provides financial support to 蜜桃影视.