Steve Ball, executive principal at the East Literature Magnet School in Nashville, arrived at an English class unannounced one day this month and spent 60 minutes taking copious notes as he watched the teacher introduce and explain the concept of irony. "It was a good lesson," Mr. Ball said.
But under Tennessee's new teacher-evaluation system, which is similar to systems being adopted around the country, Mr. Ball said he had to give the teacher a one -- the lowest rating on a five-point scale -- in one of 12 categories: breaking students into groups. Even though Mr. Ball had seen the same teacher, a successful veteran he declined to identify, group students effectively on other occasions, he felt that he had no choice but to follow the strict guidelines of the state's complicated rubric.
"It's not an accurate reflection of her as a teacher," Mr. Ball said.
Spurred by the requirements of the Obama administration's Race to the Top competition, Tennessee is one of more than a dozen states overhauling their evaluation systems to increase the number of classroom observations and to put more emphasis on standardized test scores. But even as New York State finally came to an agreement last week with its teachers' unions on how to design its new system, places like Tennessee that are already carrying out similar plans are struggling with philosophical and logistical problems.
Principals in rural Chester County, Tenn., are staying late and working weekends to complete reviews with more than 100 reference points. In Nashville, teachers are redesigning lessons to meet the myriad criteria -- regardless of whether they think that is the best way to teach. And at Bearden High School in Knoxville, Tenn., physical education teachers are scrambling to incorporate math and writing into activities, since 50 percent of their evaluations will be based on standardized tests, not basketball victories.
In Delaware, under pressure from the teachers' union, the state secretary of education announced last month that teachers would not be assessed on metrics based on how much growth students showed in their classrooms, as planned, because not enough of such data existed. In Maryland, districts were granted an additional year to develop and install evaluation models without the results being counted toward tenure, pay and promotions. And in New York, Thursday's agreement came after a stalemate lasting months in which more than 1,300 principals signed a petition protesting the new evaluations.
States "are racing ahead based on promises made to Washington or local political imperatives that prioritize an unwavering commitment to unproven approaches," said Grover J. Whitehurst, a senior fellow at the Brookings Institution. "There's a lot we don't know about how to evaluate teachers reliably and how to use that information to improve instruction and learning."
Backers of the new approaches say that change takes time. "You have to start the process somewhere," said Daniel Weisberg, executive vice president and general counsel at The New Teacher Project, a nonprofit agency founded in 1997. "If you don't solve the problem of teacher quality, you will continue to have an achievement gap."
Emily Barton, assistant commissioner for curriculum and instruction at the Tennessee Department of Education, acknowledged that the new system had kinks, but said that she heard "a consistent theme that the process is leading to rich conversations about instruction and that teacher performance is improving."
In early 2010, the legislature required that half of a teacher's evaluation be based on annual observations and half on student achievement data. The following year, the state board of education added specifics: each year, principals or evaluators would observe new teachers six times, and tenured ones four times.
Each observation focuses on one or two of four areas: instruction, professionalism, classroom environment and planning. Afterward, the observer scores the teacher according to the state's detailed and computerized system. Instruction, for example, has 12 subcategories, including "motivating students" and "presenting instructional content." Motivating students, in turn, has subcategories like "regularly reinforces and rewards effort." In all, there are 116 subcategories.
"It's one thing to be observing -- I love that, it's my primary role," said Troy Kilzer, the 44-year-old principal of Chester County High School. "But you know when a good lesson is being taught without looking at a rubric." Mr. Kilzer said the new system had led to more precise discussions with teachers about their skills and better lesson planning. But he can hardly keep up with the work.
For principals, it is not just the observations, but also the pre-conference (where teachers explain and show the lesson), the post-conference (where observers explain what teachers might have done better) and four to six hours inputting data. "We are spending a lot of time evaluating people we know are very good teachers," Mr. Kilzer said.
For many principals, the observations mean less time for the kind of spot visits to classrooms that they relish -- and for everything else. "Parents were used to immediate feedback, or they'd stop back for a meeting," said Connie Gwinn, principal of H. G. Hill Middle School in Nashville, who is supportive of the new system overall. "We don't have the opportunity to do that any more."
In November, state officials allowed some observations to be combined. Now, evaluators must measure the same number of data points, but they can do it in fewer visits.
Gera Summerford, president of the Tennessee Education Association, compared the new evaluations to taking your car to the mechanic and making him use all of his tools to fix it, regardless of the problem, and expecting him to do it in an hour.
"It has been counterproductive to the intent -- a noble intent -- of an evaluation system," said Stephen Henry, president of the Metropolitan Nashville Education Association.
Some teachers, though, praised the system.
"I'm definitely a lot more attuned to making my plans," said Morgan Shinlever, a physical education and health and wellness teacher at Bearden High.
Since Mr. Shinlever knows his fate now depends on math and reading scores, he is making his classes more academic. After watching the documentary "Food, Inc." recently, his sophomores wrote essays. Similarly, in Chester County, a gym teacher recently spread playing cards around and had students run to find three that added up to 14.
Tennessee officials say the system will be tweaked but not changed significantly. The legislature is considering bills to exempt this year's evaluations from tenure decisions, and to lower the bar for tenure from scores of four or five to three. And the state recently announced teachers would not find out their ratings until the middle of next year -- at which point, they will be deep into next year's observations and testing.
"It's like building an airplane while it flies across the sky," said Mr. Ball, the magnet school principal in Nashville. "We're building it on the fly."
This article originally appeared in The New York Times.