A Paradox? Or a Genuine Contradiction?


As always, remember that John’s book The Influence of Teachers is for sale at Amazon.

Can something really be good and bad at the same time? How about that delicious but fattening dinner you had last week? It was great, until you added up the calories, right?   Now what about a school? Can it be both good and bad at the same time?  Is educational quality — like beauty — in the eye of the beholder or do test scores say it all?

More precisely, can a school with only 18% of its 4th graders at grade level in reading be considered a good school? Before you say, “Of course not,” please read on.  Because we discovered that the FIRST graders at that school were reading confidently and competently. That’s right: the first graders were readers, but the fourth graders weren’t according to the results of the state test.

Is this a paradox, or a full-blown contradiction?

I’m asking these questions of you because we are asking them of ourselves, in our reporting for the NewsHour. It actually began with a different question: “Are the Reading Wars (phonics versus whole language) over, or do they rage on, but under the radar?”

As a starting point, producer Cat McGrath and I decided to see if we could get into some schools with terrible reading scores.  While a couple of principals turned us down, the principal of PS 1 in the South Bronx in New York City, said, “Come on up. We are a great school.”

“Yeah, right,” we thought. After all,  we had the scores in front of us: not even 18% of the school’s 4th graders were competent readers.

We went up to that high poverty neighborhood, where crime scene tapes proliferate and unemployed men linger on street corners. PS 1 fits right in. It is grim looking from the outside, a fortress-like building with few windows.  Inside is different, however. The classrooms and corridors of PS 1 are bright and full of energy, with student work displayed everywhere.   Jorge Perdomo, who’s led the school for  five years, took us to his first grade classes.  “Our first graders are reading,” he claimed, “and writing too,” pointing to their papers on classroom walls.

Our skepticism did not seem to bother him or diminish his enthusiasm.  “Come on back anytime — with your cameras — and see for yourself.”

We did.  We saw veteran and rookie teachers giving their first graders a strong (and essential) foundation in phonics.  First graders were learning that letters make sounds, that combinations of letters make different sounds, and that, when letters are strung together, they can make words.  They were decoding.

That’s only part of the battle, of course.  Comprehension, actually understanding what the words mean, is a tougher challenge.  To test that skill, I asked the first graders to close their eyes while I wrote a nonsense story on the board: “The blue pancake went swimming in the lake and ate a frog.”

They read it eagerly and confidently.  When I asked what they thought of the story, they said without much enthusiasm, “It’s OK,” but that was because they were just being polite to the white-haired stranger.  When I asked, “Is there anything wrong with that story?” (a question that gave them permission to be critical), they were impossible to contain.  Pancakes aren’t blue, pancakes can’t swim, pancakes don’t have a mouth, and pancakes can’t eat a frog.  The words tumbled out of their mouths.

The principal was right about his first graders, but what about the fourth graders and their 18% competency?

Adults offered several possible explanations.  By the time they’re fourth graders, one teacher said, they are no longer naive. They know that their Dad is in prison, or their Mom has a drinking problem, or maybe they now have to be responsible for their younger siblings. Life has caught up with them, and reading no longer matters.

The test is much harder, several offered.  Now they have to reach conclusions and draw inferences, and that’s much tougher.

We looked over past tests, and, sure enough, the passages were about subjects that poor kids in the south Bronx may not be familiar with (cicadas or dragonflies were two of the subjects, for example). Answering the questions did require inferential leaps, just as we had been told.

So we asked to talk with a couple of fourth graders who were reading below grade level, and here’s where it got complicated.  As you will see in the NewsHour piece (embedded below), both children, one age 9 and the other 11, handled the passages and answered all the questions. Maybe the personal attention helped, but they read easily and drew inferences correctly. We only ‘tested’ a couple of kids, but both were below grade-level, their teacher assured us.

Where does that leave us? Maybe the kids are terrible test takers? Maybe there’s too much stress (there are a couple of weeks of test prep built into the schedule)?  Perhaps there’s a fundamental contradiction between testing reading and reading itself?

I have a theory, but I would love to know what others make of this.

You can view the completed piece from PBS NewsHour (it aired on June 6, 2011) here:


28 thoughts on “A Paradox? Or a Genuine Contradiction?”

  1. John, You’ve just described a very poignant and important example of the struggle over standardized testing. In the comments on your op-ed piece about Joel Klein, someone raged about his being the testing monster, promoting standardized scores as the end-all of school quality. That person would like this story; it humanizes the struggle, by showing some kids whose scores are low but whose competence (measured casually by you) is as expected. Your reporting is not a definitive piece of evidence in the disagreement, but it is a really good example of why the struggle over evidence goes on.


      • I am a 4th grade teacher. What I see when students are unsuccessful on reading tests is this–immaturity. Most of my students who have low scores on standardized reading tests can do well on one article and accompanying questions, especially when I sit with them as they complete it, just as you did in the story. However, they don’t have the stamina (is it lack of maturity?) to complete a test with 3 or 4 articles and a total of 35 to 40 questions. Some have attention issues–one student’s mother abandoned her and has another family in another state. She does remarkably well considering and she is amazingly cheerful. Further, even the very capable students are children, 10 years old; they would rather be playing soccer, reading for fun, playing math games. A high stakes reading test? For whom? Not for them. And what’s the answer on how to grade schools? Perhaps we should look at students’ progress. Are they making a year’s progress each year? After all, there is an 11+ month difference in the age range of a grade level of students. And what measure should we use to gauge their progress? That is the question.


  2. I find this passage amusing because it conveniently ignores any attempt at being self-critical: “Adults offered several possible explanations. By the time they’re fourth graders, one teacher said, they are no longer naive. They know that their Dad is in prison, or their Mom has a drinking problem, or maybe they now have to be responsible for their younger siblings. Life has caught up with them, and reading no longer matters.”

    The problem is described as being “life” as opposed to the oppressive environment that these adults have willfully imposed on children. Denial is an amazing thing to witness in action.

    While clearly there are many potential factors for the decline in reading scores and not enough information has been provided, by fourth grade students fully internalize learned helplessness. The fact that they will never be allowed to pursue their interests within a school environment for them, taken as a whole, is no longer something worth fighting over.

    I doubt indicting the obvious issue of the repressive school environment will play a role in the analysis that will be presented, but it is intellectually dishonest not to consider its impact.


    • Readers might want to google Cevin Soling to learn about the film he has made on this subject–and go see it…


  3. It’s not a matter of repression, but rather one of paradox. Most schools abhor paradox; good schools do their best to accommodate it; and the best revel in it. Kids who read for ideas may or may not reflect those ideas on a multiple choice test: how could one honestly say only one answer is right on a bubble test when the paradox of multiple right answers overwhelms “common core standards?” It’s not just poor kids, but rather all kids, particularly in middle grades, who respond to tests as they respond to other “intrusions” on their culture – from parents to bullies. They do their best to ignore them. In rich kid schools the “culture” of the school celebrates test winning (since, for them it is a game). In poor kid schools that culture is tuned more precisely to questions like that with which you began: questions that give permission to critique, to apply knowledge to solve problems, and to … think. The paradox you found in the South Bronx is the same paradox that gave us Al Smith instead of Franklin Roosevelt: problem solvers are not always as nice as their more polished peers.

    That is also what’s wrong with Joel Klein, incidentally: real educational change comes first from looking at what’s going on, finding what works, and building on success. Klein solved problems rather than rolled with paradox. He didn’t listen, he solved. Like yuppie kids, he wasn’t embarrassed at his solutions, and enjoyed being the smart guy in the room. Real change agents are a lot more cautious, and try to find smart guys to hang with before making waves. And that means that Klein’s solutions will themselves require solutions, again and again. The contrast is really with Al Smith: he found problem solvers already active – like Robert Moses – and gave them authority to build parkways. His parkways made jobs while they built parks while they moved the yuppies to work. What a remarkable paradox for the 1920’s??!!

    We need more paradox, not less; and it’s not just tolerance for paradox that counts, but real, conflicting, contradictory truths. That’s what city kids understand and tests don’t.


    • I had a wonderful professor who taught us about the importance of ‘a high tolerance for ambiguity.’ I have tried to keep that lesson close. My own view in this particular case, however, is that the situation is not a paradox but an actual contradiction. We don’t say that on air, of course, because we respect the viewers (who probably don’t even care what I think)


      • How interesting. In high school, I had a teacher who read aloud to our class the objectives she crafted for the English Department. One of the objectives, I’ll never forget, was to become more comfortable accepting the ambiguities of life. I credit this objective, and her success in incorporating this objective into daily coursework and instruction, to being very influential in shaping my mind.

        As with this situation, most of life’s questions don’t resolve easily into multiple choice questions. All I know is that I’d like my child to benefit from an education that similarly embraces ambiguity as an important lesson to learn when transitioning to adulthood. I fear the message we are sending by putting such an emphasis on multiple choice tests.


  4. In any sort of assessment, multiple measures need to be used, and used both honestly and intelligently. In attempting to measure the effectiveness of teachers at universities, some administrators look primarily at student evaluations, but ignore the students’ responses to the questions about how many classes they missed, and how many hours per week they studied. Is it any wonder that the most common grade “given” to students in such situations has become “A”?

    There is no “litmus test” of good teaching or of student achievement. To effectively pretend otherwise is asking for big trouble.

    Leadership matters whether the assessment is of student achievement, teacher performance, or rating the value of financial instruments.


  5. I think you may have hit the nail on the head when you talked about the topics on the test, which are often not ones that poor kids are familiar with. The tests are really problematic, not only because they limit creativity, but because they are biased to the benefit of certain kinds of life experiences. They don’t measure reading. They measure answering multiple choice questions that assume knowledge you (as a poor kid) probably don’t have. It’s not the same skill, although the Powers that Be like to pretend it is.


  6. Where that leaves us is most likely yet another piece of data with follow-up preliminary investigation – BOTH OF WHICH WILL LIKELY NOT RECEIVE THE ADDITIONAL INVESTIGATION THEY SHOULD.

    Clearly, these data suggest that maybe the standardized testing is not aligned with at least one vision of what being a good reader means. Or they suggest that the passage of time from first to fourth grade has included situations detrimental to testing well on standardized tests. Or …, or …, or … Bottom line: the contradiction (I agree) should have raised a red flag that led to an investigation of each “or,” giving those involved the understanding needed to make changes.

    One huge and fundamental problem with teaching currently is the dominance of standardized tests solely to point fingers at the “obviously guilty” parties. This article points out that even standardized test results can inform us if we just look beyond “THE NUMBER”. We must make the effort if we want improvement in effective learning for all. Linking a number to any prescriptive solution is doomed to failure.

    Consider the medical profession. If they performed as the politicians, foundations, experts, and venture billionaire philanthropists (with their favorite prescriptive and thus doomed solutions) would have the education profession perform, they’d have standardized tests leading to automatic treatment based on the scores. There would be no office visits to consider history and patient input; there would be no referral to specialists for further investigation if suggested; there would be no discussion among the medical team involved; AND, for sure there would be none of the “second opinions” sought at the insistence of the primary doctor. (ASIDE: On a personal note with political overtones – alert, alert … – could this be what Ryan and the GOP are suggesting / proposing?)

    We need broad local engagement (many parties) in dialogue dedicated to finding the appropriate solutions for local issues – engagement that indeed would look at data and for sure investigate the contradiction (or whatever you want to label it) for reading data at the Bronx school.


  7. This was an excellent piece. It sets out to puzzle your viewers and creates some skepticism. It shows what the test scores can miss.

    Perhaps next you will do a program that deconstructs and demystifies these standardized tests by asking the public to consider some questions that are implicit in test-based accountability: Is it possible for every child to pass the test (i.e., score on grade level)? How do test makers decide which test questions get onto the test and which don’t? How are cut scores determined? Why are there 4 performance levels (rather than 3 or 5, or 6 etc.)? How many correct answers are needed to score a level 2? A level 3? A level 4? What is the most consistent correlate with standardized test scores? The answers to these and other questions about these tests can help your viewers consider various perspectives on why PS 1’s 4th graders’ scores are disappointing. Knowing more about these tests may also help inform the public about the validity of these tests for judging schools, teachers, and students.


  8. I noticed the irony at the end of your report that the test results will come back after the school year is over. Of course that does undermine any potential use of test results as a diagnostic tool, since (a) time changes a lot of things when you’re 10 years old and (b) in the average city school there is about a 20% mobility rate between June and September, making the tracking of test scores with individual kids a hassle if not impossible. Another one of those ironies is that very few, if any, testing systems correlate AGE with scores, leaving kids who are held back with redundant scores.

    I once evaluated one system that routinely held back 25% of its 9th grade in order to maximize the “gain score” from grade 7 to grade 10, and drilled that 25% mercilessly. Produced remarkable scores, which the Gates Foundation routinely cited as the best in the state. Of course they were based on a fraudulent policy, ignored that age-related variable, and were “justified” by a former principal because 9th graders were “unprepared.” I did help unseat him, and his Guidance Director, Superintendent, and School Committee support. But that’s a while ago. And a much smaller system than PS 1 must endure.

    Great report. Look more and longer at great “failing schools.” There is much to see.


    • Looking at great FAILING schools? What a lovely idea! I think there is a lot to learn from the knowledge that teachers from those “failing” schools have and know how to use in order to move children to higher levels of learning. I have never understood why a first grade teacher who is given a student at a pre-k level and is able to get him/her to beginning or middle 1st grade level is never looked at as a successful year for that student. The only thing that goes out to the public is that the test score shows this child is still behind and therefore must not have a qualified teacher working with him.
      And I won’t even begin to talk about the ELL expectations that are now out there on test scores as well as our special education students. When will this nonsense end? I believe in accountability but one size does not fit all no matter what you want to believe.
      There is no reason why these kids should not be learning about cicadas or dragonflies. I personally would like to see all this money schools spend on testing spent on technology and good old field trips. Our children need knowledge and we are at a day and age when getting knowledge to our children is easier than it has ever been. Unfortunately we are not keeping our schools and the teachers up to par with technology due to the cost and the closed minded people who are still not ready to recognize the potential technology provides. Our children are twiddling their thumbs and nodding off while we sit in classrooms trying to convince them to pick A, B, C or D. Oh, but wait…….I forgot …..tests are slowly becoming computerized that should solve the technology side of it.


  9. Brilliant and insightful post, John. One of the most poignant points:

    “We looked over past tests, and, sure enough, the passages were about subjects that poor kids in the south Bronx may not be familiar with (cicadas or dragonflies were two of the subjects, for example). Answering the questions did require inferential leaps, just as we had been told.”

    Many tests are imprecise — not just because kids may have a bad day but because the items, like the ones to which you refer, do not make sense given context and culture.

    When one is analyzing mounds of data, some of this error washes out – but when looking at school and teacher level data – well, that is another story… like the one you have surfaced here.

    So when it comes to assessing teaching, isn’t it time to give teachers a chance to explain and use student test score data – and not just be judged by it? http://www.acarseries.org/papers/Barnett_Berry-New_Student_Assessments.pdf

    Thanks John.


  10. I appreciate all these thoughtful comments and hope that all of you were able to see last night’s piece on the NewsHour (or, if you didn’t, that you will go online and watch it now).

    I’m just a reporter and so cannot carry the water on this, but what we are doing to many children’s natural curiosity by constant testing borders on criminal, in my personal view. We must have accountability, but, as that fourth grade teacher said last night, we have to get beyond just a score on a test.


  11. John, have you ever looked at the portfolio assessment that Ann Cook and others in NY Public Schools developed, for which they received approval from NY State Dept of Ed? I agree that “we must have accountability” and wonder whether you have looked at what Ann and others did to develop and implement more applied assessments.


  12. John, thank you for your piece last night. I lived the story of PS1 and worked with a school that struggled with only having 30% to 40% of their students—depending on the tested grade—pass their state test.

    I was employed at the site to increase student achievement and I was successful. Or more specifically, the students and teachers were successful. In short, we discovered that children were struggling transferring what they were learning in the classroom to the state test. We made some small instructional changes and our scores increased quite significantly. A school, that had failed to make AYP for five consecutive years, had close to 65% of the students pass their state CRT last year.

    How did we do it? We asked questions that looked like the state test. For example, our teachers were asking children, “What is the main idea?” but the state was asking “What would be the best new title for the passage?” Although they are both main idea questions, ELLs and struggling readers did not know they were main idea questions. We introduced this question stem early in the year and kids were more successful on the state test.

    I realize we could all debate the merits of this effort but a school, that lost its principal because of the School Improvement Grant in March, was celebrated when test scores were released in the summer. All of this by simply changing how we were asking questions.


  13. Congratulations on improving the scores. What I wonder about, however, is Ms. Cartegna’s observation, that we are taking the joy out of reading. Any evidence that the kids at your school feel better about reading, want to read, et cetera, because they passed the test? I recall how my son taught his HS students in Brooklyn how to psych out the test, beat the system, et cetera. He made it a challenge, and it kind of worked, although he knew that he was also teaching them to be cynical. In the end, I think he decided that this was a zero-sum game.


    • It is a fair question and one I don’t have the answer to. That said, we did survey our students about how they felt about the examination and most students noted that they felt much better about this CRT compared to past experiences. Likewise, they noted that they felt less anxious. We attributed this to the fact that by the time they got to the test, they had been asked hundreds of questions framed like the state examination.


  14. @Aaron – It is funny you should mention the School Improvement Grants. I noticed the principal at PS 1 has been there five years. If his school were to get SIG money, he would be out, despite the evidence that John points out that the school is a success.

    We have to move away from relying so heavily on test scores in education policy. We all acknowledge that they paint imperfect pictures of schools and teachers, yet they are often used in high-stakes decisions as if they are 100% accurate.


  15. This is a very interesting piece, John. And many interesting responses. Thanks to all. A few ideas on why there is a contradiction here:

    1. Speed. Passing a state test involves answering a certain number of questions in a certain amount of time. Perhaps the “below grade level” kids are slower.
    2. Content knowledge. John, you referenced Robert Pondiscio. I think he, E.D. Hirsch and others make powerful arguments that accumulated knowledge matters a lot; these kids may have less of it.
    3. Amount that the kids have read. Related to the above point, I wonder how much these kids have read compared to more affluent peers. That will make a difference re content knowledge and speed.
    4. Home and community environment. The tests ask students to do things like make inferences and synthesize information. Students with college-educated parents may get much more practice and may be better at it.

    One more thought, perhaps for the teachers more than anyone. In my experience as a teacher and parent, the best way to prepare these kids for these tests would be to have them read A LOT. There’s pleasure in that, if the kids can get momentum on it. I’d trade another 50 books read by my kids for 2 weeks of test prep. (Truth be told, I still want 1 week of test prep because as several writers observe, there are tricks that are very helpful to know.)

    And I agree that some states/schools spend too much time testing their kids.


  16. I would need to know more about the tests and the schools to comment with much detail, but what I can offer is this. On a political level there is a paradox and on the school level there is contradiction. Politically, poverty has been the driving force behind funding improvement plans for low performing schools. When there is a need, government must find a way to help. For example, since No Child Left Behind began nearly 2 decades ago, funding, testing, and accountability practices have grown. Standards have increased and today it all continues to boom. This booming in an “age of information” as Obama has coined it has brought us to this great uncertainty about just about everything in public education. Herein exists the paradox. Since the first publication of Webster’s dictionary 190 years ago, the English language vocabulary has multiplied by ten at least. The mixed terms and interchanging of methodologies behind teaching Math skills for example has reached an all time high. 4th graders have to be very strong readers to do well on a state or federal test, even after extra hours are provided for them to prepare. Confusion over terms in a test study guide can be even more confusing on the actual test. At one time a typical study guide matched pretty well with what would be expected on a test. Today, this is not always the case and is an alarming cause for concern (confusion.)

    The point about contradiction is this. Why did the 4th graders perform so much more poorly than the 1st graders, contradicting prior years’ results? Maybe they rebelled or maybe they were simply confused. Children often tune into things long before administrators or teachers do. They muddle along incapable of really articulating their confusion or understanding of the importance of a state or federal test. They are little sponges just absorbing everything unwittingly. Their silence, however, does not negate their ability to sense the confusion and controversies over the high stakes testing, over the accountability issues, and over the school grade demands. They are children and they respond to it the best way they can, positively and negatively. They muddle, the parents huddle, and the government hammers.
    The pressing questions today regarding testing and accountability (teacher evaluations) should be handled very carefully. As Diane Ravitch points out, the top performing nations do not earn their titles from test scores. Perhaps they are just too largely populated to keep track of test scores or they are in fact more innovative in Math and Sciences. Whatever the case, we must hold on to what is core in being human and at least teach that as well as we can. Until American children as one show more common core civility, empathy, and pride, I worry about putting too much more emphasis on TESTING. (maybe the common core civics tests are on the horizon)


  17. Again, a lot of thoughtful comments. Please read Robert Pondiscio’s blog, cited in my comment above. As I said, producer Cat McGrath and I wanted to spark debate and discussion, and it has, but not enough. Please circulate the link to this blog and to the piece itself. We need people in power to think about what we are doing….


  18. It would be interesting to see how kids across all socio-economic strata would do if we gave state assessments that were written at grade-level, but contained content that was as obscure as possible — content similar to Robert Pondiscio’s example on the Core Knowledge blog “My Havanese is snoozing on the divan.” Who would understand that?

    I appreciate your piece because you showed an orderly urban public school where children are learning and teachers are teaching. If one were to merely look at the school’s assessment scores, it would be easy to assume that the school is a failure. But as someone who has taught most elementary grades, I know how challenging it has been for the teachers to get the kids where they need to be.

    I don’t doubt that if we had a curriculum that emphasized discrete content that every fourth-grader should know, the kids in PS 1 would do well. Those teachers and students would work hard to know and understand what they need to know and understand.

    But as our curriculum and state standards are currently designed, state assessments are a “crap shoot” for some urban kids. This is unfair, undemocratic and damaging to children’s psyches.


  19. I wonder about some of the parameters of the test – not just the vocabulary and unfamiliar content, but the setting, the physical requirements, the motivation. Do different tests, say computerized leveled tests, give the same results? Does lack of student feedback play a part? If the results of the tests were to be provided in the middle of the year, what difference might that make? What if shorter versions of the tests could be given, scored, and returned in a more timely manner? I think that, until we know some more answers to the effects of changing test parameters, we are just guessing.

