Measuring soft skills

(This post was co-authored with Arnold Packer.)

Reliability and Validity are the Alpha and Omega of testing. A test that is reliable can be counted on to give the same result each time it’s given, while a valid test measures what it is supposed to measure. Tests that meet these two criteria are the gold standard of assessment.

For example, making someone swim 100 yards to test whether he can swim would be a valid and reliable test. If you sink, you flunk, and that’s true each time the test is given, regardless of who is doing the testing.

However, when teachers are trying to assess ‘soft’ skills, the waters get murky. How can we measure the ability to work with others, process information from disparate sources, communicate persuasively, or work reliably?

Consider the concept of reliability. Is an employee who is late to work one day most weeks reliable? What grade would you assign her on a scale of 1-5? Suppose you found her explanations plausible (child care problems, for example) and cut her some slack, giving her a 5 because you’ve been there yourself. Another employer might give that same employee a 2 or 3 because, after all, late is late.

There’s no set scale for measuring ‘working with others,’ meaning that the rating may vary depending upon who’s doing the rating. And what to one teacher is ‘persuasive communication’ may fall flat with another. There’s just no easy way to measure those all-important ‘soft’ skills.
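That rater-to-rater variation can be made concrete. As a rough sketch (the rating data below is invented for illustration), Cohen’s kappa is a standard statistic for asking how much two raters agree on a 1-5 scale beyond what chance alone would produce:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters (1.0 = perfect, 0 = chance level)."""
    n = len(rater_a)
    # Observed agreement: fraction of students both raters graded identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: how often the raters would match by chance alone,
    # given how frequently each of them hands out each grade.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two teachers grade the same ten students on 'working with others' (1-5).
teacher_1 = [5, 4, 4, 3, 5, 2, 4, 3, 5, 4]
teacher_2 = [4, 4, 3, 3, 5, 2, 4, 2, 4, 4]
print(round(cohens_kappa(teacher_1, teacher_2), 2))  # 0.44: only moderate agreement
```

Two raters who agree on six students out of ten sounds reassuring, but after correcting for chance the agreement is only moderate, which is exactly the problem with unanchored 1-5 scales.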

And they are important. Put yourself in the position of a Human Resources officer, trying both to be fair and to have some confidence in an applicant’s likelihood of success on the job. You want to know as much as you can about a potential hire, yet all you have to go on are a resume, impressions from an interview, and maybe some recommendations.

Or consider higher education. College admission officers don’t want a freshman class made up entirely of students with perfect GPA’s. They know that students with ‘soft skills’ and academic proficiency contribute greatly to campus life.
And so they consult SAT (and maybe AP) scores, scores on high school exit exams, references, and high school GPAs, but how reliable and valid are these? High school grades and GPAs are clearly unreliable, as every student who has chosen courses and teachers to enhance their record knows. References are even less reliable, and the lack of predictive validity of SAT scores has led many selective colleges to abandon them.

Clearly, both employers and admissions officers could use more information—if only that information were reliable and valid.

While we recognize the complexity of measuring soft skills, we believe we are close to meeting this challenge. This summer, with help from a Kellogg Foundation grant, we are asking mentors at 28 community-based organizations to assess teenagers’ performance and provide them with a Verified Resume. We ask the mentors to assess high and middle school students on such traits as responsibility, work ethic, collaboration, communication, problem-solving, critical thinking, and creativity.

We ask the mentors to write a two-sentence description of the context in which each of the traits was demonstrated. Was the teenager responsible about picking up trash in the park or helping out on the surgical ward? Communicating to a friend about the homework assignment requires a different skill level than communicating about obesity to a large community audience. There is no reasonable rubric that will cover this amount of variation.

Finally, mentors also grade the students’ performance on a scale of one (“cannot do it”) to five (“does it well enough to teach others”).

Having a Verified Resume of performance will potentially improve the package of information available to college admissions and HR staff, if those busy folks take the time to look at the VR in the few minutes that they devote to considering each applicant. The challenge is to convince them that the VR will improve their decisions.

Five communities are involved in the current Kellogg-financed project: Baltimore, Boston, Grand Rapids, Montana and Salt Lake. Baltimore provides a good illustration of the process. All of the eight participating not-for-profit organizations there are involved in youth development. Two have students creating videos; another sends students to City Hall; students in a third organization engage in debates; those in a fourth help younger students with algebra.

Here are two grades and comments pertaining to a student in a media project.

Responsibility (rating: 4.50): G. created and took on personal projects above and beyond the requirements of the programs in which he participated. He often came in early to ensure that these tasks were completed professionally and on time.

Team Player (rating: 3.50): G. has consistently led his team members to complete projects on time. In the Festival Committee, he helped to organize his peers to accomplish the production of promotional videos and prepare for public speaking events.

But the process doesn’t stop when the mentors give their grades. We will then survey employers to see whether they agree with, for example, the ‘5’ a mentor gave their new employee. If not, revisions are in order, and perhaps some retraining of the mentor. That is, we envision the VR as a living document, one that is always subject to online revision.
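One way such an employer check might be operationalized, sketched here with invented names, thresholds, and data (none of this is the project’s actual procedure): flag a mentor for retraining when too many of their grades diverge from the employer’s by more than a point on the five-point scale.

```python
def needs_retraining(mentor_grades, employer_grades, tolerance=1, max_divergent_share=0.25):
    """Flag a mentor whose grades routinely diverge from employers' grades.

    A pair of grades is 'divergent' when it differs by more than `tolerance`
    points on the 1-5 scale; the mentor is flagged when the share of
    divergent pairs exceeds `max_divergent_share`.
    """
    pairs = list(zip(mentor_grades, employer_grades))
    divergent = sum(abs(m - e) > tolerance for m, e in pairs)
    return divergent / len(pairs) > max_divergent_share

# The mentor graded five former students; employers later graded the same traits.
mentor = [5, 4, 5, 3, 4]
employer = [3, 4, 2, 3, 4]
print(needs_retraining(mentor, employer))  # True: 2 of 5 pairs diverge by 2+ points
```

The tolerance and the divergence threshold are the interesting design choices: too strict and every mentor gets flagged, too loose and the VR stops being “verified” in any meaningful sense.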

We got into this because we believe performance traits like responsibility, tolerance for diversity, ability to communicate, and work ethic matter. Because they matter, we must also figure out how to measure them reliably.

It’s not going to be easy, but nothing of importance ever is.

Can the VR come close to the assessment community’s gold standard? The quest for the perfect is frequently the enemy of the good. John Maynard Keynes is credited with saying, “It is better to be roughly right than precisely wrong.” Teaching, measuring, and certifying soft performance skills are important, according to both employers and colleges. We cannot afford slavish adherence to shibboleths regarding statistical purity without asking if new measures can provide more real reliability and better predictive validity.


15 thoughts on “Measuring soft skills”

  1. I think the concept of VRs is marvelous and long overdue. More important, I believe they should be an essential component of how we routinely evaluate students within and coming out of our schools. Some of the most important things we teach, or need to, aren’t on the tests.



  2. While I like the idea of making these soft skills an explicit part of the curriculum, I bristle at the idea of trying too hard to quantify and measure them as if we could with any certainty. Even with the “hard” skills, researchers talk about the difference between observed scores and true scores, and a margin of error. The steps described here that are supposed to mitigate subjectivity are not all that impressive, and the idea that you could then track a student later, using the same descriptors in a different context with different scorers/raters and produce something quantifiable and meaningful… come on. We’re taking this all a bit too far.


  3. I agree with David. Yes, part of the curriculum; no to quantifying it, and especially no to making it high-stakes. “Because they matter, we must also figure out how to measure them reliably.” WHY? Because they (soft skills) matter, we must model the skills, teach the skills, mentor the skills, and develop relationships with our students so that they will gain our trust and look to us for guidance when making life choices. Let’s put more effort into nurturing the heart of teaching and learning, and less into quantifying and measuring. We can measure results and outcomes (drop-out rates, unemployment, incarceration, teenage pregnancy, etc.) to see if we are improving as a system.


  4. The National Council for Accreditation of Teacher Education requires the schools of education it accredits to choose, define, promote, and measure “dispositions” that are important for teachers. Perhaps some of the better efforts of teacher education programs could be consulted in this work as well.


  5. As a middle school Family and Consumer Science teacher (which you might recognize as Home Ec), I am surprised by the number of education stakeholders who are unaware that Career and Technical Educators have been assessing the “soft” Workplace Readiness Skills of our students, as well as specific performance-based skill competencies, for years. These quantitative external measures are used by students as workplace credentials, by schools for CTE program evaluation and improvement, and by business and industry to inform human resource decision making. Education policymakers and reporters might find it worthwhile to review this 2008 ACTE brief.

    http://www.acteonline.org/uploadedFiles/Publications_and_Online_Media/files/WorkReadinessCredentials.pdf


  6. I too agree that trying to quantify any score is never going to be perfect. Indeed, in a lab course I have developed and taught, we talk a great deal about uncertainty in everything, with one of my comments always being “The only thing certain is uncertainty.” BUT that doesn’t make trying to quantify the soft skills totally useless. It just means that they must be used with the understanding that there will always be some level of arbitrariness in them. The more ratings the better; the more information included with respect to the ratings the better. AND the more discussion of the totality of the ratings and comments, the better one can understand the appropriateness of the soft skill level for the individual.


  7. Very interesting. Should I blog back to answer some of the comments? I would say things like:

    In response to John Bennett: Yes, quantifying any score is never going to be perfect, and yes, the measures must be used with the understanding that there will always be some level of arbitrariness in them. The more ratings the better; the more information included with respect to the ratings the better. AND the more discussion of the totality of the ratings and comments, the better one can understand the appropriateness of the soft skill level for the individual.

    In response to Diana and David: If it is not measured, people will not pay attention. Also, the youngsters need something to show to employers and colleges and, for many reasons, the academic transcript is not enough.


  8. All metrics are, by definition, relative. “Soft” is no more soft than an 86 on a test, if – and when – it represents a real negotiation, discussion, and comparison of past, future and others in the class or group. The joy of the VR is that it scaffolds that discussion, and reduces the indecision of students, teachers, parents, and, ideally, employers in assessing and documenting skills young people show in realistic, project-based, individual and collaborative ventures.

    Framing such “soft” skills as more “subjective” ignores the levels of subjectivity involved in choosing and framing multiple choice test items, as well as their interpretation against “standards,” themselves subject to seemingly endless negotiations. When young people, teachers, and employers agree that projects are “a lot” better, or “much” worse, it all depends on what they intend to do with that agreement. In bad times, it can mean survival or departure; in good it could mean more money or less responsibility. Are those “times” more subjective than the dates of a war or the terms of a President?

    I once worked with a school system that routinely retained, or held back, the bottom 25% of their 9th grade. The claim was that “they weren’t prepared,” but the Principal later admitted that an extra 12 months of drill and practice for the lowest 25% guaranteed the highest “gain scores” in the state from 7th to 10th grade. So much for objective measures.


