Does the current push for “Value Added” measures mean that education has finally figured it out, or is this yet another silver bullet that will fail — and perhaps do more harm than good along the way?
While that is an interesting question, a number of prior questions need answers:
1. What exactly do we mean by ‘value’?
2. Who adds ‘value,’ and how do they do it?
3. How can we enable the women and men now teaching to add more ‘value’?
4. How can we attract people who add value to go into teaching?
5. At the end of the day, what do we value?
Recently I was introduced to Masha Tarasyuk, who spoke no English when she immigrated from Ukraine at age 6. Masha told me that one teacher at her public school in the Bronx took her under her wing, supported her when she got down in the dumps and never stopped believing in her. Masha eventually graduated from Barnard and the Fordham School of Education and now is a Teach for America corps member at the High School for Medical Professionals in New York. She’s giving back, helping others just as that teacher helped her. (Masha is in her third year, by the way, even though the TFA term is just two.)
Surely that teacher ‘added value’ to Masha’s education, but, judging from the way Masha told the story, the value had less to do with her academic achievement and more to do with the emotional connection.
I’d like to believe that everyone reading this had at least one teacher like that, someone who made a huge difference in your life. We did a series on it, available at our YouTube channel.
Unless you have been living under a rock, you have to be aware of the recent value-added study by economists from Harvard and Columbia, positing that students who have truly effective teachers for a few years of their education end up making lots more money. Many of their findings are conjecture or at least extrapolation, and the authors were careful to warn against basing policy decisions on their study.
The economists measured ‘value’ with test scores, of course, because that’s what is available. Bubble test results are how we keep score, at least for the moment. And if the kids in Teacher X’s classroom always seem to do well on those tests, while the students in Teacher Z’s classroom always seem to do poorly, why shouldn’t we draw some conclusions about the value each teacher is adding?
It is a stretch to connect better test scores to attending a better college, getting a better job and eventually making more money, but even if the connections are flimsy, we surely need more teachers who can motivate their students to do well.
Nick Kristof, the well-respected columnist for the New York Times, ignored the caveat about policy recommendations and made some: pay effective teachers lots more and fire ineffective ones. But it didn’t take Kristof’s words to energize politicians like New Jersey Governor Chris Christie, New York’s Andrew Cuomo and New York City’s Mayor Michael Bloomberg, all of whom are pushing value-added measurement as a way of doing what Kristof recommends.
That’s a Republican, a Democrat and an Independent, if you’re keeping score, which suggests that ‘value-added’ is either a non-partisan idea whose time has come — or a mad rush to judgment.
But let’s dig deeper. What do truly effective teachers do that adds value? Can those skills be taught?
The father of “value-added” measurement is Dr. William Sanders, now nearly retired in North Carolina but still very much engaged. He has not been fond of some of what I have been writing in this blog about bubble tests and, ever the gentleman, asked if I would be open to having a conversation.
Which we had recently.
A good deal of what follows is based on our 96-minute phone meeting several days ago. What Dr. Sanders wanted to explore was the ‘how’ of value added. What is it that excellent teachers do that adds value to their students’ learning? Can a trained observer see what excellent teachers do that not-so-good teachers do not?
Here’s where it got interesting. Bill told me that teams of observers cataloguing the classroom behaviors of teachers from both groups could not find differences in their behavior. ‘Look again,’ he told them. Still no luck, they reported. ‘Look some more,’ he directed.
Eureka. The truly effective teachers, his observers finally figured out, were able to provide what’s called ‘differentiated instruction’ (treating kids individually according to their needs) and able to disguise what they were doing, so that the children were not aware of the different treatment.
These teachers, Bill said, don’t see a classroom of 25 students; instead, they see 25 different kids and figure out the best ways to reach them. And then they camouflage the different treatment lest some kids feel like Robins and others like Crows in those infamous reading groups.
They do not spend hours or days on test preparation. (Administrators, please read that sentence again!)
Do some teachers intuitively know how important it is to disguise what they are doing? If not, how did they learn to do that? He’s a fan of Teach for America because, he says, the data tells him that those teachers are more likely to be truly effective than teachers from traditional schools of education.
What’s the evidence, I wanted to know. The old Tennessean cited his research in Memphis, where, he said, for three years in a row the cadre of TFA teachers outperformed teachers who attended Vanderbilt, Middle Tennessee and Tennessee, using student achievement scores as the measure of performance.
Bill suggested that it was not the TFA summer training that makes the difference as much as the caliber of their recruits. When society opened up more opportunities for women, he reminded me, the entering ACT scores of those enrolling in education and home economics fell dramatically. Since the late 1960s, he said, talented young women are likely to enroll in other departments. Today, women make up half or more of those studying to be lawyers, doctors and veterinarians.
“TFA is bringing capable people back into the teaching pool,” he told me. If Bill is correct, then one sure-fire way to ‘add value’ in education is to recruit more people like the men and women who apply to Teach for America.
How do we entice them to become classroom teachers? With about one million teachers approaching retirement, TFA’s corps of 15,000 teachers is not the answer. We have to appeal to hundreds of thousands of talented young men and women and convince them that teaching is a respected and rewarding career.
Ask yourself if what’s going on in the public arena now is likely to attract people into teaching. Are the heavy-handed campaigns by politicians like Governors Christie and Cuomo (and the Governors of Wisconsin and Ohio) helpful? Is Mayor Bloomberg’s effort a step in the right direction? Is Michelle Rhee’s campaign to restrict collective bargaining and tenure likely to persuade talented young men and women that teaching is an appealing career? Are union leaders who oppose charter schools ‘on principle’ adding value to the teaching profession? When union leaders insist that teachers cannot be held accountable for student learning, are they elevating the teaching profession?
As the lawyers say, asked and answered.
Surely an important part of the value of an effective teacher is her ability to connect with individual children, her willingness to become emotionally attached to her students as individuals. (I write about this at some length in The Influence of Teachers.)
Those teachers need the time and space to make connections, but today teaching seems to be all about higher test scores. In an earlier piece, we explored the impact of test pressures on young readers.
Maybe it’s time to figure out the impact on young teachers, too?
Because evaluating teachers using student achievement scores is here to stay, it’s in teachers’ interests to argue for better measures of achievement. We need better ways of assessing the value that teachers add to the lives of the children they teach, beyond test scores.
What do we value?
29 thoughts on “The Values Behind ‘Value-Added’”
I always enjoy reading your questions. They often get to the very heart of essential matters.
In addition to what “do” we value, I think you should also ask what “should” we value. There is a huge disconnect between these two. If we value the human spirit and the natural and creative development of the human mind, then clearly we should be taking all kids out of schools altogether. Oddly enough, if we value test scores, we should still be taking them out. What society values is having children removed from society and placed in a babysitting factory so that their parents can work. The abusiveness of the environment has to be masked behind the pretense that schools offer something constructive. When that lie is exposed, great efforts to redefine success must be made. Were anyone to discover something that schools are good at, it would become the thing that is measured. So far no one has been able to do that.
If “evaluating teachers using student achievement scores is here to stay” as you suggest, may I suggest that these tests have to be demonstrated to be honest measures of effective student learning first! I have not seen such a study along with the study’s results presented.
Implicit in that thought is the connection with effective learning! I’m convinced personally that effective learning will lead to better achievement scores for sure. What I’m not clear on is whether better achievement scores require effective learning. To me this is absolutely critical. Your commentary with regard to effective teaching shows that effective teaching leads to (that is, means) effective learning. But only if good achievement scores correlate with effective learning should those scores be used to evaluate teachers!
No argument. What we need are better means of assessment (not the same as tests, of course). And a big first step to getting there is deciding what we value….
This blog covers two of my favorite issues.
If these are keys:
Teaching and learning are too complex to measure value added as a tool for grading teachers. It can only be used effectively in evaluating the progress of a student. Children learn to do things at different ages, when they are ready.
Certainly a key to successful outcomes is connecting students to the educational system. That is more likely measured in survey instruments than standardized tests.
I agree “The truly effective teachers, his observers finally figured out, were able to provide what’s called ‘differentiated instruction’ (treating kids individually according to their needs) and able to disguise what they were doing, so that the children were not aware of the different treatment.” Again, you could measure the outcome but not know what to change in the teaching process to remediate less than desirable advances.
If this is a skill, could it be taught? It took a second, third or fourth look for evaluators to finally see what was going on. Could something that subtle, that innate be taught?
Basing retention decisions on testing will only serve to encourage cheating and raise anxiety levels in children (teachers will stress the importance of the test).
Agreed. I wrote an earlier piece called “Trust but Verify” that argued for ‘grading’ schools, not individual teachers. I hope you saw that either in the Sacramento Bee or on our website. I wrote it with the Learning Matters Board Chair, Esther Wojcicki, a superb classroom teacher.
Much food for thought here. I will just address one element. “Do some teachers intuitively know how important it is to disguise what they are doing? If not, how did they learn to do that?” Often when we show AUGUST TO JUNE some people will comment that it is something about me as a person that made the learning that occurred possible. Of course personality and enthusiasm matter, but much of what I did as a teacher I learned by spending all of my career reflecting with my colleagues about children and teaching. I had the seeds for doing so implanted in my teacher training, and then had the good luck to work in situations where collaboration was basic. Teachers and students both benefit from “learning by doing.” Currently we are filming at Mission Hill Pilot School, and as I watch the purposeful interactions between staff members that are a hallmark of the school that Deb Meier created, it reinforces for me the importance of a supportive professional community.
I am betting that longer apprenticeships in the form of student teaching/interning coupled with quality time for reflection with mentors and colleagues would both improve a teacher’s ‘antennas’ for noticing and addressing different learning styles and needs, and also identify at an earlier point what qualities he or she needs to work on to grow into an effective, empathetic teacher.
What is your response to Bill Sanders’ endorsement of Teach for America and the quality of the men and women that program draws?
On a personal note, please give Debbie Meier, one of my heroes, a hug for me…
It’s not just longer apprenticeships, it’s permanent apprenticeships and collegiality that make the difference. We have all learned important things from bad teachers – if nothing else we learned what NOT to do – and even more from good ones. The “working team” is the model of TFA, and was, for many years, the kind of clinical “practice teaching” that ed schools relied on. Yet they only very, very rarely gave those host teachers adjunct or academic roles, and colleges typically retreated to purely academic standards, while schools demanded purely clinical skills. This isn’t/wasn’t that different in medicine or law, incidentally, but it signaled for all professions a failure to recognize what works best for the continuous improvement of the professions themselves.
While on that theme of failed opportunity – which is the underlying problem of measuring “value added” contributions – I recently realized how and why teachers have been so seriously trapped by bubble tests. It’s not the grade, nor even the value of the test. It is the presumption that there is only ONE right answer of four or five. In fact, there are usually gradients of right, and gradients invite collaborative decisions. Different perspectives improve choice, and mutual respect is the foundation for those differences.
That intellectual twist showed me why so many otherwise wise teachers slip into “if it’s not this one it has to be that one” kind of choices. More often than not they’re wrong, and multiple “right” answers are the case, just as more often than not many flowers bloom and many worlds unite. Yet schools are quite intolerant of paradox, and the kinds of tests they give may well be their greatest failure. The paradox of that paradox, incidentally, is that a bubble test is just fine if it asks questions that really do have one answer of several, and really do demand reasonable amounts of recall, and, probably most important, provide some critical information about the student, teacher, school, or subject matter. They’re fine, but they’re not enough.
Great column, John. It’s sad that the political focus is on hammering teachers and not on improving them. Battle lines don’t allow for much conversation, and as both a former union member and then school leader, there seems to be all too little real dialogue in public education, and all too often in independent education as well.
Good teaching requires significant investment in professional development, but the public has little interest in investing in teachers when truly bad teachers seem protected by their leadership and union contracts which allow teachers with tenure to remain in jobs when they are grossly underperforming.
Unfortunately much of the country still has the mindset it had in the 50s and 60s – teachers won’t trust administrators for fear they will lose their jobs, and the public still thinks one teacher in front of 30 students who teaches from the front of the room works. The world of the American student has fundamentally changed, and we have not caught on as voters and decision-makers. Differentiated instruction, project-based learning, and using brain research to inform instruction all make a difference, a positive difference, but implementing them is labor intensive and we seem not to have the collective will to make them happen for the children. And that is profoundly sad.
There you have “it.” The “it” is getting beyond test scores. Name me a profession that judges success and best practice on a test score? In most professions I can think of the metrics of success and determinations of best practice lie in the “practice” of the profession. And, while you cite “differentiated instruction” as one of the key practices that add value, there is a whole host of other value-adding behaviors that are just as critical.
We do external testing because we don’t trust teachers to measure their performance? Do the folks who call for testing as a measure of performance really understand better than teachers what constitutes the kind of performance desired? Do we trust lawyers, accountants, doctors, and others to create their own metrics of success and good practice because we trust them more? I don’t think so.
What most are proposing in measuring teacher performance using test results would be like judging the performance of an individual football player based on winning the Super Bowl when in fact there are a myriad of circumstances and performances of many players, coaches, etc. that contribute to the success of their enterprise/profession. Tom Brady may get the best value-added scores, but he would not get any if it were not for the other 21+ players and other personnel, many of whom are unknown to most everyone.
Value-added measures have their place but any statistician “worth their salt” will tell you that a methodology that has an independent variable and an outcome which are dimensions of the same thing has limitations. At best, value-added measures are one piece, but only one piece. Whether it is value-added measures or any other for that matter, anytime we ask one measure to be the sole determiner of success and good practice, we are asking the measure to be a “god.”
I hope you saw my earlier piece about evaluating schools and the piece “Trust but Verify.” I think we are heading down a dangerous road with value-added if the measures remain as they are. But isn’t that up to us?
Unfortunately, it’s not up to “us.” Control of education has passed from the professionals who deliver it and increasingly out of local school boards’ hands into the hands of those who control the purse strings. In California, at least, that’s the state and federal government – in other words, politicians.
These are important questions to be asking. As a twenty-five year veteran teacher who is still trying to figure out how to do it better, a few observations:
1. The caliber of the pool of teaching candidates is important. You can’t fix stupid, and, conversely, a good brain seems to lead the owner in a search for excellence, whether it’s a better way to clean the floor, a better vacuum cleaner, or better teaching.
2. Beating up on teachers is NOT the way to encourage your best and brightest to join the profession. Teaching is difficult and demanding under the best circumstances, and the present ones are not the best. We are excoriated for results over which we have little control; have little power to influence working conditions, and none to shape expectations; are at the mercy of politicians and state budgets; and are working in increasingly dangerous, chaotic conditions in a society which pays lip service to the importance of education, while devaluing it in the cultural mainstream.
3. In addition, strong-arm tactics aimed at lowering wages don’t help. Michelle Rhee as much as admitted the financial incentive for getting rid of experienced teachers when she bemoaned the fact that keeping expensive teachers led to larger class sizes. This kind of double speak goes on in all areas of education, and makes teachers completely crazy; it’s very demoralizing.
4. Lastly, it’s difficult to get validation as a teacher. You recount the story of the immigrant who was touched by a teacher’s advocacy. I hope the teacher knows how important she was. The results of our work, whether psychological or educational, often come to fruition over years, long after the student has left us. In the daily scrum of the classroom, it’s pretty hard to tell if you’re doing well. Some days are good, some not. Some kids do well, some don’t. Test scores might help – but I have to say, I think teachers are already cheating, and if I choose to not spend all my time coaching to the test in favor of longer term and larger goals (in my case, teaching college-readiness skills, not CST prep), my scores may well not compare favorably with my more focused colleagues.
As a result, most teachers no longer recommend teaching as a profession. We do not want our children to be teachers. We may still love it, but the conditions are really becoming untenable for even teaching fools like me. If we really believe in the importance of education, we need to commit to making it better for the people who deliver it.
Your last paragraph is heart-wrenching and speaks volumes about the situation we face as a nation. Teachers not wanting their children to become teachers is a bit like a chef not wanting his or her children to eat at his or her restaurant, isn’t it?
John, clearly recruiting well prepared, well motivated, creative and energetic teachers is critically important in producing more teachers who can deliver real value, both core value and value added. Programs like TFA are vitally important and a critical component. One other important way to recruit excellent teachers is to expand the ability to recruit second career teachers such as those participating in the Transition to Teaching program initiated by IBM which hopefully can be expanded to other companies. We offer those of our employees who have been long time school volunteers, and are interested in a second career as a math and science teacher, the opportunity to opt into the IBM Transition to Teaching program. Once they enter the program the company pays 100% of any education course they need to become fully licensed and certified, and they also receive a paid leave of absence to complete a practice teaching assignment. They also receive a mentor, professional development and help in obtaining a position in a school. Since the average retirement age is the mid-50s, participants are looking for at least a 10 year second career. So far our experience and the school’s experience with Transition to Teaching graduates has been very positive. We have a zero attrition rate and the skills and knowledge of the participants have been very much appreciated. The cost to IBM for each participant averages about $15,000 and we have about 100 participants. While the numbers are small for now, were the Fortune 500 to embrace and replicate this model, and the numbers held up for each company, it could be a meaningful way to add to the many other important initiatives such
Improving teacher quality is vitally important. But the reality is that there is no “one size fits all” solution.
IBM has a long tradition of acting responsibly in education, under your leadership, Stan, and that of Lou Gerstner. Glad it’s continuing.
Though many in education fear that the new approaches to teacher evaluation around the country might devolve into using just one test score to make conclusions about teacher effectiveness, those fears are misplaced. What is actually happening across the country in about 34 states is that NOT ONE state is just using a one year value added score; almost all states are using some combination of a measure of instructional practice (like the Danielson Framework or Bob Pianta’s CLASS each of which has several data elements), and then usually two sets of student test scores, one from the state accountability test and one from district short cycle assessments. Usually the instructional practice measure is 50+% of the combined measure and the value added or student growth scores are from 3 years or so. And when all are used together one gets a pretty reliable conclusion about a teacher’s effectiveness level. Further the recent report from the Gates Foundation MET program goes beyond this and also adds the Ron Ferguson student survey. When all are combined, again, one gets a pretty solid score for each individual teacher with the lowest score teachers producing the least impact on student learning growth and the highest score teachers producing the highest impact, sometimes a full 8-9 months when the test is an ambitious one that gets at higher order thinking. Not all the development work has been completed by states, but these are the directions in which they are moving — very ambitious and comprehensive approaches to teacher evaluation.
Compared to traditional teacher evaluation metrics that find 99+% of teachers satisfactory, advanced or accomplished and thus are useless for any management decisions, the new teacher evaluation metrics will be sufficiently valid and reliable for use by school systems to make critical human management decisions such as licensure, tenure, promotion, pay increases and dismissal.
This is good news, but are Messrs Christie, Cuomo, Bloomberg, Walker et alia listening?
The sound bite of “multiple measures” misses the point. Test score growth models can be accurate enough to complement or supplement human evaluations, but they must not drive the process.
And no, the MET shows that when all are combined, again, one CAN GET a pretty solid score for each individual teacher. It says nothing about how it would work in real life. As Charlotte Danielson says, when a district with a “gotcha” culture uses her rubrics, then her rubrics become a gotcha system. In that case, multiple measures just means multiple hoops to jump through.
Even when district leaders claim they trust value-added, some school leaders will have the exceptional moral integrity to still honestly evaluate teachers regardless of the value-added. But when a teacher is fairly or unfairly indicted by a value-added model, in many (or most?) cases that is tantamount to a conviction. In most systems, the game is played the way the game is played. And when principals are evaluated by VAMs, that just increases their need to cover their own rear ends, and play the game the way the bosses want.
As Bruce Baker explains, the percentage of weight given to test scores versus human evaluations is irrelevant when both are seen as hoops that educators have to jump through. When districts send the message that test score growth is paramount, principals understand the unspoken message of how those systems want evaluations to work. Baker astutely concludes, “It [value-added] may be 50% of the protocol, but will drive 100% of the decision.”
You ask how is it that some teachers just know. The two best teachers I ever had were a veteran and a new teacher just in the field three years. What they both shared was an urgent vision of success that they always shared with us. I remember in 8th grade Ms. Howard saying, “no you may not like diagramming sentences (using the books she had harvested from the dumpster) but my role is to teach you to write and do it well.” To this day I can hear her in my head and appreciate the day in, day out focus she had on our future. Same with Ms. Whitty and her daily exhortation on why European history mattered. The weekends she spent prepping us for the AP test were appreciated by many of us who knew college depended on scholarships and other aid and that these tests helped us in our goals. But I also have no doubt that if you looked at the general scores these teachers excelled. What I remember from both of these teachers was both their teaching and their vision. Does value added capture that? I think it would capture part of it, but that is also why evaluations have a personal element.
Be interesting to know what they think of the current climate regarding teachers. Would they become teachers now, do you think? That’s my concern, frankly. Two of my three grown children were public school teachers in New York City, and the way they were treated played a huge part in their decisions to leave the profession. It’s a shame, because both seemed to be the kind of teacher you describe.
The good news about the Chetty, Rockoff, and Friedman study, from my perspective, is that I have read three long pieces by economists defending their results from criticism by education experts. Those three responses, though welcome, also show how little they know about the concrete realities of schools. (Also, Friedman said he was misquoted about the policy implications of his work.)
As usual, their skill is above reproach. But they seem far more concerned about the sweetness of their methodologies than asking the questions necessary for policy decisions. For instance, their sample population was 76% low income and they excluded classrooms where 25% of students are on IEPs.
My low income district doesn’t have a neighborhood secondary school that is only 76% low income, and the difference in the challenges faced by 95%+ schools grows more geometrically than arithmetically. I don’t believe I’ve ever had a regular class with fewer than 1/4th of the students on IEPs. But that isn’t the real problem. I suspect that value-added can do a decent job of controlling for large numbers of kids with learning disabilities, but they make no effort to control for the far greater issue of large concentrations of kids with serious emotional disturbances.
I have no doubt that the types of databases that Sanders and Chetty et al use can give accurate results on a macro level. What I find shocking is that they don’t seem to understand the objection that, in the age of the Big Sort, there is no reason to believe that their aggregates say anything about specific teachers, classes, or even schools.
This exchange, however, seems to have clarified where their misunderstandings come from. The issue isn’t poverty, trauma, bad parenting, or bad schooling. Plenty of kids overcome each of those factors. The issue is the way that they all come together in an age of self-segregation and choice.
Similarly, economists often claim, correctly, that in-school variations in value-added are equal or greater than across districts. Such a fact, however, says absolutely nothing about the causes of those variations. The Big Sort is just as huge of a factor inside individual schools as outside. In the inner city, for instance, there is no comparison as to the challenges faced in an Algebra I class as opposed to the junior Algebra II class next door, after 1/3rd to 1/2 of the students from the freshman class have dropped out.
The question the study should have asked was whether teachers who transferred from highly effective schools to ineffective schools saw a drop in their value-added. But the authors apparently did not realize why that is the much more important policy question. Rather than limit their analysis of transfers to the top and bottom 5% (who also are pretty easy to identify without value added), they should have focused on the 90%. Even then, despite their huge database, the sample would be shockingly small. During my entire career, I never saw an effective teacher transfer from an effective school to my school, much less stick it out for three years, which is what they required for their study.
Here’s the problem. Before a fighter plane is put into service, they test the various components under lab environments. But then they test the entire plane under a variety of circumstances. Value-added is the riskiest educational experiment that I’ve ever heard of. But it is being rushed into service, all across the country, without even having done lab tests on most of its components, and without a single test of how it operates in the real world.
Mike Winerip in the NY Times has a useful analysis. One point: all the data comes from pre-NCLB years, before all the test-prep frenzy.
Exactly. And their study should have tried to give us a handle on what would happen in the next twenty years. NCLB, like the VAMs, did not mandate the testing frenzy, but top education experts and teachers predicted, chapter and verse, how the testing frenzy would unfold. Value-added for evaluations is NCLB on steroids, targeted only at teachers and principals. Can we take another generation of the testing frenzy?
Also, the economic projections were based on old jobs, many of which were destroyed by the same types of number-driven economic policies.
I was a consultant to the Colorado Department of Education back in 1971 and developed a Value Added Measurement system for them at that time. The monograph describing that work is at http://www.edlyell.com/uploads/Easy.pdf
In my model we would be accounting for more than cognitive knowledge gained, including keeping track of changes in methods, socio-economic status, learning styles, etc.
I was part of a panel of the Governor, his cabinet, and education leaders on teacher quality that Dr. Sanders presented to in Denver back in 2000 just after he had been elected. Sanders’s model is good, but it should have been just the beginning of work to move toward my more complete theoretical model. However, instead of moving on to improve his model, most of public education spent all of its time and resources shutting down any attempt to create better value-add models.
When I first presented my model in 1971-2 most did not even understand what I meant. As superintendents, board members, and teacher union leaders understood it, they used their powerful political connections to shut down any movement in that direction for decades.
Now Colorado has a ‘growth model’ that is doing a good job in moving forward on a value add approach with concern for more than just cognitive test scores. This effort is necessary, but not yet sufficient to me at this time.
I have been a teacher in Hillsborough County for 23 years. During those 23 years I have written county curriculums, written county exams, been a trainer at a number of workshops (both for instructional delivery and content knowledge), sponsored numerous extra-curricular activities, been a tutor for a variety of programs, and coached football. I have spent most of my life dedicated to the teaching profession. My grandfather was a teacher and principal for over 40 years and my grandmother was a teacher for over 40 years. I guess teaching is just “in my blood”.
Two years ago when we received the Gates Foundation Grant, I had a meeting with and personally expressed my concerns to the president of HCTA. I explained to her that after researching the Gates Foundation, I realized they have put a lot of money into education, yet have seen little to no results. They have left some school systems in much worse shape than when they started. Since that day, I have spoken with numerous people including the director of the EET program. I have continued to spend hundreds of hours researching information about this grant and its effects. However, all of my efforts have fallen on deaf ears. Thus, I was wondering if you could take a few minutes to listen.
I would first like to address the evaluation process. Even if a teacher is evaluated 5 times a year (most teachers are evaluated less often than that), that is still less than .5% of their instruction time for the year. In addition, many teachers instruct quite differently on days they are observed than the other 99.5% of the time. That is why it is possible to evaluate a teacher’s competence but not his/her effectiveness. Also, we will never get the best people in the mentoring positions because the majority of the top teachers don’t want to leave their classrooms. When I sat on the Prescreening Committee for the peer/mentor positions two years ago, one of the people who was also on the committee was a principal. Near the end of the day he said “I wouldn’t let half these people even be teachers in my school, let alone peers or mentors”. Also, the evaluation system is not equal for all teachers. I would like to give you the data that supports this hypothesis; however as of now, all of my requests to obtain this information have been denied. I have asked for the average score of each evaluator to show that the scores are biased. The mere fact that they are unwilling to provide me with this data (that I am sure is easy to obtain) is a strong indication my theory is correct. In addition, when I spoke with individuals concerning this I was told “Individual summaries for peers, mentors, and principals are not available and would have no meaning. If, for example, one principal had a mean score of 24.3 and another 17.1 it would not be indicative of anything because they are rating different faculties based upon different observations. The same would be true for peers and mentors.” Although this statement is true for principals, it is not true for peers and mentors. Because the peers and mentors go to different schools their population is more of a random sample. Therefore, there should be much more correlation between their scores. 
The fact remains that regardless of the amount of training you provide for evaluators, each evaluator is going to rate with some amount of personal bias. The bias can be reduced by normalizing the evaluators’ scores. You would normalize each evaluator’s scores against that evaluator’s own baseline and then convert the normalized scores back to a scaled score. We have a $200 million grant with all these so-called experts, yet this is not being done.
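The normalization described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `normalize_by_evaluator`, the evaluator labels, and the sample ratings are all hypothetical, and the method (a per-evaluator z-score mapped back onto the overall scale) is one common way to do it, not something the EET program is known to use. It also assumes each evaluator sees a comparable mix of teachers, which is the case argued above for peers and mentors but not for principals.

```python
from statistics import mean, pstdev

def normalize_by_evaluator(ratings):
    """Rescale raw scores so every evaluator has the same mean and spread.

    ratings: list of (evaluator, teacher, raw_score) tuples (hypothetical layout).
    Returns a list of (evaluator, teacher, adjusted_score) tuples.
    """
    all_scores = [score for _, _, score in ratings]
    overall_mean = mean(all_scores)
    overall_sd = pstdev(all_scores)

    # Group raw scores by evaluator to estimate each evaluator's baseline.
    by_evaluator = {}
    for evaluator, _, score in ratings:
        by_evaluator.setdefault(evaluator, []).append(score)

    adjusted = []
    for evaluator, teacher, score in ratings:
        ev_mean = mean(by_evaluator[evaluator])
        ev_sd = pstdev(by_evaluator[evaluator])
        # z-score relative to this evaluator's own habits; 0 if no spread.
        z = 0.0 if ev_sd == 0 else (score - ev_mean) / ev_sd
        # Convert back to the overall rating scale.
        adjusted.append((evaluator, teacher, overall_mean + z * overall_sd))
    return adjusted

# Hypothetical data: evaluator "A" rates harshly, evaluator "B" leniently.
sample = [
    ("A", "t1", 15), ("A", "t2", 17), ("A", "t3", 19),
    ("B", "t4", 23), ("B", "t5", 25), ("B", "t6", 27),
]
adjusted = normalize_by_evaluator(sample)
for evaluator, teacher, score in adjusted:
    print(evaluator, teacher, round(score, 2))
```

In this made-up example, the lowest-rated teacher of the harsh evaluator and the lowest-rated teacher of the lenient evaluator end up with the same adjusted score, even though their raw scores differ by eight points.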
There have been many studies done that question the validity of EET’s evaluation system. For example:
“In the 1997-98 school year, the district adopted a knowledge- and skills-based teacher evaluation and compensation system (Odden, Archibald, Milanowski, and Conti, 2001). The evaluation standards and rubrics in this new system are based on those described in Charlotte Danielson’s Framework for Teaching (Danielson, 1996; Danielson and McGreal, 2000). The results of these analyses indicate that the Coventry teacher evaluation system has some criterion-related validity in reading, though not in math.”
This is from: The Relationship Between Teacher Evaluation Scores and Student Achievement: Evidence from Coventry, RI. Brad White, Consortium for Policy Research in Education (University of Pennsylvania, Harvard University, Stanford University, University of Michigan, University of Wisconsin-Madison), University of Wisconsin-Madison, April 2004.
In terms of the value added method (VAM), there are numerous problems. First of all, there is an obvious problem trying to use the VAM for seniors. If a senior comes to my class in August, takes the first semester exam in January, then exempts their final exam (as most do), it is impossible to calculate what I have taught that student from August to May. Secondly, a lot of my students who are not seniors are still missing pre- and post-tests for a variety of reasons. Last year only 42% of my students had an accurate pre- and post-test. Next, most articles credit the teacher with 15-20% of the learning gains that a student has in a year. Is it really possible to control the other 80% or more so that we can calculate what direct effect a teacher has had? I could continue for pages about what is wrong with the VAM, but instead I want to cite a few excerpts from articles and letters that have been written:
“However, the promise that value-added systems can provide such a precise, meaningful, and comprehensive picture is not supported by the data.”
This is from: Can Teachers be Evaluated by their Students’ Test Scores? Should They Be? The Use of Value-Added Measures of Teacher Effectiveness in Policy and Practice. Sean P. Corcoran, in collaboration with Annenberg Institute research staff. © 2010 Brown University
VAM also raises important technical issues about test scores that are not raised by other uses of those scores. In particular, the statistical procedures assume that a one-unit difference in a test score means the same amount of learning—and the same amount of teaching—for low performing, average, and high-performing students. If this is not the case, then the value-added scores for teachers who work with different types of students will not be comparable. One common version of this problem occurs for students whose achievement levels are too high or too low to be measured by the available tests. For such students, the tests show “ceiling” or “floor” effects and cannot be used to provide a valid measure of growth. It is not possible to calculate valid value-added measures for teachers with students who have achievement levels that are too high or too low to be measured by the available tests. In addition to these unresolved issues, there are a number of important practical difficulties in using value-added measures in an operational, high-stakes program to evaluate teachers and principals in a way that is fair, reliable, and valid. Those difficulties include the following:
1. Estimates of value added by a teacher can vary greatly from year to year, with many teachers moving between high and low performance categories in successive years (McCaffrey, Sass, and Lockwood, 2008).
2. Estimates of value added by a teacher may vary depending on the method used to calculate the value added, which may make it difficult to defend the choice of a particular method (e.g., Briggs, Weeks, and Wiley, 2008).
3. VAM cannot be used to evaluate educators for untested grades and subjects.
4. Most data bases used to support value-added analyses still face fundamental challenges related to their ability to correctly link students with teachers by subject.
5. Students often receive instruction from multiple teachers, making it difficult to attribute learning gains to a specific teacher, even if the data bases were to correctly record the contributions of all teachers.
6. There are considerable limitations to the transparency of VAM approaches for educators, parents and policy makers, among others, given the sophisticated statistical methods they employ.
This is written to the Secretary of Education (The Honorable Arne Duncan) in a Letter Report to the U.S. Department of Education on the Race to the Top Fund from the National Academy of Sciences on October 5, 2009
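The “ceiling” effect mentioned in that excerpt is easy to see with a toy calculation. The sketch below is purely illustrative: the maximum score of 100 and the two students are made-up numbers, and real value-added models are far more elaborate than a simple post-test minus pre-test difference.

```python
MAX_SCORE = 100  # hypothetical top score the test can report

def observed(true_level):
    """The test cannot report anything above its maximum score."""
    return min(true_level, MAX_SCORE)

def measured_gain(pre_true, post_true):
    """Gain as the test sees it: observed post-test minus observed pre-test."""
    return observed(post_true) - observed(pre_true)

# Two hypothetical students who both truly improve by 10 points.
students = {
    "low_scorer": (40, 50),    # stays well below the ceiling
    "high_scorer": (95, 105),  # true post-test level exceeds the ceiling
}
gains = {name: measured_gain(pre, post) for name, (pre, post) in students.items()}
print(gains)
```

Both students learned the same amount, yet the test credits the high scorer with only half the gain, so a teacher with many such students would look less effective for reasons unrelated to teaching.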
There are also a lot of things missing from the way we calculate VAM. What value is given to coaches and sponsors that spend hundreds of extra hours with students at little or no pay? What value is given to the teacher who helps a student through their parents’ divorce or death in the family? What value is given to the teacher who helps other colleagues with their struggles? What value is given to the teacher who helps write county exams and curricula? What value is given to the teacher who has such a good rapport with his students that students advise the teacher of a gun, and the gun is recovered in a student’s book bag? The answer to all these questions is: none. I should know because they are all examples from my personal experience. Sadly, when we devise a system that does not value these types of actions, we are dehumanizing the teaching profession.
If we really wanted to improve the evaluation system we would need to be sure that the evaluation system is reliable, valid, and unbiased before implementing it. There are hundreds of different teaching styles. Although the current system may evaluate how close we are teaching to Charlotte Danielson’s expectations, no studies have measured how effective a teacher is when their teaching style differs from the one the system was designed around. The basic notion that we can assign “a number” to a teacher’s effectiveness with any system is ridiculous. Is it realistic to think we can evaluate a first grade teacher and a high school teacher or a physical education teacher and a math teacher with the same system? That is ludicrous (at best). If we want a true indication of a teacher’s effectiveness, we would need input from: parents, other teachers in the teacher’s content area, department heads, assistant principals, principals, current students, students from the previous year, students from 5-10 years prior, the teachers that have those students the following year, and many others.
In my 46 years on this earth I have seen many attempts at educational reform. Though intentions are often good, there are never results that match those intentions. Unfortunately we continually direct our efforts to either unattainable or unsupported theories. Most people don’t want to acknowledge it, but there are just four simple things that one needs to become an effective teacher. A teacher needs proper knowledge of the curriculum, the ability to explain the material in a clear and concise manner, and a proper learning environment in the classroom, and, most importantly, the teacher must truly care about his/her students. Unfortunately, these things can’t be put into an equation and assigned a number…although we have spent a lot of time and money trying.
Good teachers are teachers who have a passion for education and their students, not ones who can teach to tests and perform “dog and pony” shows 3 times a year.
This system is designed to hold teachers accountable for a student’s desire to learn and to positively affect students by forcing them to do their work. However, it does something completely different that is very detrimental to students. When a student goes to college, there is no one there to make them go to class, let alone stand over them and force them to do work. So in high school we are now asked to stand over students forcing them to do work. Then 3 months later, we send them off to college where for the most part they don’t even have to go to class and certainly don’t have anyone forcing them to do their assignments. That is not preparing students to be successful.
The EET program needs to be quickly removed, before any more damage is done to the Hillsborough County School System. Charlotte Danielson wrote “In addition, the evaluation process must be done in a spirit of respect and professionalism. It can’t be, or appear to teachers to be, punitive or contributing to a “gotcha” effect” (pg. 57, The handbook for enhancing professional practice: using the framework for …By Charlotte Danielson). Yet this is exactly what is happening. In the past 2 years most teachers I have spoken with have one of two thoughts. The first thought is that the principal knows he/she is a better teacher than their evaluation indicates, but is unwilling to speak up on behalf of the teacher. The other thought is that the system is unfair, but there is nothing anyone can do about it. Neither of these feelings is one that would help promote effective teaching.
I would also like to clarify one thing. Several people including students, teachers, and administrators have asked “why not just do the dog and pony show 2-3 times a year?” and to be honest, that would be much easier. I have been an effective teacher and will continue to be one until my last day in a classroom. This letter is not intended to “blame” Hillsborough County because I understand we are being mandated at the state and national level to do these types of things. I’m certainly not intending to offend anybody; I’m trying to do what I have been taught my entire life, and that is to do the right thing. We as teachers need to stand up to the people that are making policies that contain no realistic plan for implementation. We need to do what is right for the teaching profession or our children are going to suffer. The sad truth is that politicians and philanthropists are preying on teachers. Teachers are so dedicated they will do whatever it takes to remain teachers, even if they know the system is fundamentally flawed.
The entire EET program is aimed at making teachers more effective. Yet, my 23 years of teaching have taught me that one of the most important things a student looks for from a teacher is a positive environment in which to learn. However, because this system has such a negative impact on teachers, it is doing just the opposite. So the entire design of the program is counterproductive to its objective.
Wharton High School
Poignant and thoughtful. Thank you for contributing to the conversation. I have written elsewhere about the need for a ‘Trust but Verify’ system. I also feel strongly that we ought to evaluate schools, not individual teachers, and trust those in the school to take care of problems, whether it’s unwilling or incompetent teachers or students who are far behind, unmotivated, et cetera. That way we don’t have to test in every subject, for one thing.
You may be interested in this short cautionary piece on the research about value-added. Ed Haertel, who is a co-author, is the chair of the National Research Council Board on Testing and Assessment, the most prestigious body in the country on these issues. Jesse Rothstein is a former member of the President’s Council of Economic Advisers. The dysfunctional outcomes in Houston that are discussed are based on the Sanders tool.