Weak Academic Measures
We need numbers to evaluate our schools and our students. However, we need to understand what numbers can tell us and, more importantly, what they cannot. Even though many academic measures exist, the errors we make when reading them typically result from just a few incorrect notions. The many misleading conclusions we hear typically stem from those same few errors.
Below, we look at how these errors cause most of the commonly reported numbers used to describe schools and students to mislead most people. We also suggest alternative measures that would be less misleading, or that would provide more information.
DRAFT: Last Updated: January 2011
Individual Class Grades & Individual Test Grades
Unreasonable Expectations: We want grades to tell us accurately how students have performed and to sort them precisely by rank. But performance comprises many vague variables, so grading always depends on arbitrary decisions. No single number can reasonably account for all those variables, and each classroom makes those arbitrary decisions differently.
Oversimplification: There is no mathematically correct way to reduce performance to a single meaningful number. Cognition, success, and achievement are too complex for that. Each test and each grading system arbitrarily defines 100%; that definition is not founded in what can actually be achieved, so comparing success to an arbitrary 100% is intrinsically misleading. Most grading systems also fail to distinguish between high-level achievement and accelerated learning - distinctly different concepts.
Accuracy Problems: Grading systems involve many arbitrary choices, and many of those choices result from the demands of school administrators, politicians, and parents. As such, these choices reflect social pressures, not measures of academic success.
Alternate Measures: Separate out distinct aspects of performance: behavior, low-level skills, and high-level skills.
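To make the oversimplification concrete, here is a small sketch with invented scores. The students, components, and numbers are all hypothetical; the point is only that averaging distinct aspects of performance into one grade erases exactly the distinctions the alternate measures would preserve.

```python
# Hypothetical illustration: two very different students collapse to the
# same single grade. All names and scores are invented for this example.

# Component scores (0-100) on distinct aspects of performance.
students = {
    "Student A": {"behavior": 95, "low_level_skills": 90, "high_level_skills": 55},
    "Student B": {"behavior": 55, "low_level_skills": 90, "high_level_skills": 95},
}

for name, parts in students.items():
    grade = sum(parts.values()) / len(parts)  # reduce to one number
    print(name, grade)

# Both students reduce to a grade of 80.0, even though one excels at
# classroom behavior and the other at high-level work. Reporting the
# three components separately keeps that information visible.
```

Any weighting scheme has the same problem: the weights are one more arbitrary decision, and the final number still hides which component produced it.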
IQ & GPA
Class Average Test Scores and Pass Rates
We have looked at just a few examples of how we let numbers mislead us. Typically, the numbers are not wrong; we simply expect them to tell us more than they reasonably can. This usually results from accounting methods that oversimplify information into a single number while washing out the information we really want to know.
Above, we gave just a few common examples. You should learn to ask the same questions of every reported number. Have I expected too much information from this number? How does this number oversimplify the concepts? What information about range and distribution was washed out in the averaging process? If you regularly ask these questions, you will not be deceived by numbers.
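The "washed out in averaging" question can be illustrated with a short sketch. The two classes and their scores below are invented for the example; they show how identical class averages can describe completely different situations.

```python
# Hypothetical illustration: a class average hides range and distribution.
# All scores are invented for this example.
from statistics import mean, stdev

class_1 = [70, 70, 70, 70, 70, 70]     # everyone near the mean
class_2 = [100, 100, 100, 40, 40, 40]  # split between strong and failing

print(mean(class_1), mean(class_2))    # both averages are 70
print(stdev(class_1), stdev(class_2))  # spread: 0 vs. roughly 33

# With a passing score of 60, the two "identical" classes differ sharply:
passing_1 = sum(score >= 60 for score in class_1)  # 6 of 6 pass
passing_2 = sum(score >= 60 for score in class_2)  # 3 of 6 pass
print(passing_1, passing_2)
```

A report that included the spread, or the full distribution, would answer the questions above; a report of the average alone cannot.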
Testing Bias: All standardized tests are biased toward questions that can be answered quickly and against large, complex problems. But life's successes are built on large, complex problems, not quick answers. Tests are also biased toward individual work, even though most of life's successes derive from cooperation and communication. Most IQ tests, which include the biases already mentioned, are heavily biased toward the verbal, spatial, and logical realms and against social, physical, and musical problems.
Accelerated learning vs. high-level learning: A quick distinction is that high-level accomplishments typically take more than one day. NCTM suggests that high school students should regularly solve problems that take a week. Tests made of questions that students can answer in two minutes or less address only low-level skills.
Good guessers vs. poor test takers (test anxiety): Some students are very skilled at test taking; they can guess the right answers even when they have not really learned the material. Others are poor test takers; even when they learn the material better than their peers, they do not earn higher scores. Many factors contribute to poor test taking: anxiety, fatigue, not understanding the structure of the test, knowing the subject but having trouble with the language of the test, and so on.
Knowledge without understanding vs. understanding with minimal knowledge: We have all met people who are proud of their knowledge, or their grades, even though they do not seem to really understand the material. We have also met people who can tell stories that demonstrate deep understanding even though their subject knowledge is extremely limited. The methods that test knowledge are different from the methods that test understanding.
Arbitrary grading decisions
Elementary learning skills: When a student starts elementary school, he has to learn how to function cooperatively and respectfully in group settings where the goals and needs of the group differ from his own desires. As he progresses through elementary school, he has to learn how to manage his own learning: taking notes, guiding his own studying, learning how to use resources, and so on. If, by the end of sixth grade, he has not learned how to master his own learning, he is limited to dependent learning styles. Regardless of his test scores, he is not a strong learner. Yet none of these things are scored on standardized tests; few even could be. The tests do not measure the most important elements of learning.
Teacher pleasers vs. learners: Some students are very adept at the game of school. They are good at figuring out what the teacher will reward and what gets the grade. They do not necessarily learn more than others, but they do get good grades. Other students are skillful learners, or problem solvers. They can quickly grasp ideas and figure things out on their own, yet many of them are not adept at the game of school; they do not focus on pleasing the teacher or getting the grade. The first group gets good grades without much learning. The second group gets mediocre grades even when they learn more than others.
Author's note: In college I was a member of a student club that required potential initiates to go through a probationary period in which they had to prove themselves. During that time, we created a point system to evaluate the performance of the potential new members. Before the point system was implemented, we discussed what an initiate was doing, how he was interacting with current members, and what specific issues we might have to deal with. After the point system was implemented, we discussed who did or did not have enough points. We actually found ourselves having to reject an initiate with a high point score because he caused too many problems. I argued in club meetings that with the point system we knew less about our initiates, not more. But I was the only one willing to abandon the point system.