TIMSS - An Analysis of the International High School Physics Test
Sam Bowen
What is the TIMSS Study?
TIMSS is the Third International Mathematics and Science Study. It is an attempt to compare the science learning and competence of students from up to 40 countries around the world. Recent news reports have announced that U.S. students begin elementary school with enthusiasm and high levels of accomplishment, but that by junior high and high school they score lower on standardized tests than students in most other countries. More recently, results of a physics test given to students in their last year of high school have shown that U.S. students rank very close to the bottom of the 18 participating countries.
What was the design of the study and what did it seek to determine?
There were three major aspects of the TIMSS study of the educational systems of the participating countries:
- A study of the teaching materials, the official curriculum, and teacher training in each country.
- Several written science and math examinations administered to comparably sized samples of students in each country at the same age (9 or 13 years) or placement in school (last year of high school).
- An observational comparative study of junior high classrooms of Japan, Germany, and the US.
For each country there were three populations of students:
- Population 1: A sample of 9-year-olds, representing grades 3 and 4.
- Population 2: A sample of 13-year-old students in grades 7 and 8.
- Population 3: A sample of students in 12th or last high school year taking mathematics and science.
Not all of the original 40 countries were involved in all parts of the study. In particular, the physics test was taken by 18 countries and did not include any Asian country. Each population of students took a written examination in which half of the questions were common to all students and the other half came from one of three different question sets. The results for the first two populations (1 and 2) are now available as books published by Kluwer Academic Publishers. Visit the main site for newsletters and press releases about the study. The major report on the TIMSS study is also on the U.S. Department of Education website. It appears that much of the detailed data from the studies is available only in the books that are being published. The Boston College site seems to have the best data on the physics examination and has some of the physics test items and score distributions available for download.
What were the differences between the curricular materials and the educational systems of the different countries compared to the US?
The major difference mentioned in the reports is that the U.S. curricular materials are "a mile wide and an inch deep." This means that curricula in the U.S. present many topics, but no subject is covered in any depth and there is very little difference in emphasis between subjects. Because the study looked at students in both 7th and 8th grades, it was possible to see whether there were any gains in subject matter competency between these grades. In several countries, students showed gains in some subjects, mostly those that had been covered in some depth in their schools. In the U.S., by contrast, no subjects showed a gain in knowledge and skill.
One other insight comes from looking at the subject matter areas in which U.S. 13-year-olds' knowledge was compared with that of others. (The subject matter titles are from the Pop. 2 TIMSS study.) The U.S. students were near the top in only one subject: Life Cycles and Genetics. They were in the high middle range for Earth Processes, Earth in the Universe, and Structure of Matter. Our students were near the bottom in Physical Changes, Forces and Motion, and Properties and Classification of Matter.
What was the nature of the High School Physics examination?
The written examination on which the U.S. students were ranked so low was a combination of 66 problems (items) in three forms: (a) multiple choice, (b) essay, drawing, or graphing, and (c) algebraic or numerical manipulation. The non-multiple-choice items had a very detailed grading procedure covering all possible answer responses. The distribution of responses for all questions can be downloaded from the Boston College site. A set of 37 questions from the physics examination has also been made public and can be obtained from the Boston College web site (http://www.csteep.bc.edu/TIMSS). I would encourage the reader to examine the problems and the response distributions as well.
Let us begin with a characterization of the exam problems that have been released. I classified the 37 examples by type: Conceptual (21), Formula (4), Graphical or Drawing (8), and Numerical (4). Conceptual problems could be either multiple choice or essay questions about a point of physics that required no numerical evaluation but tested the students' ability to apply old results to new situations. Formula problems required a straightforward, one-step manipulation of a formula (a large number of equations were given at the beginning of the exam). Graphical problems required the reading or construction of a graph, and Numerical problems required a simple calculation. (All necessary physical constants were given at the beginning of the exam.) Each released problem was listed with the fraction of all students who solved it successfully. The formats of the released problems were: multiple choice (20), drawings (3), essay (10), and numerical evaluation (4).
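As a quick consistency check, the two tallies of the released items can be summed and converted to shares. This is only an illustrative sketch in Python; the variable and function names are assumptions introduced here, and the counts are the ones quoted in the paragraph above.

```python
# Tallies of the 37 released items, as quoted in the text above.
by_type = {"Conceptual": 21, "Formula": 4, "Graphical or Drawing": 8, "Numerical": 4}
by_format = {"multiple choice": 20, "drawings": 3, "essay": 10, "numerical evaluation": 4}


def shares(tally):
    """Return each category's share of the total as a percentage, to one decimal."""
    total = sum(tally.values())
    return {name: round(100 * count / total, 1) for name, count in tally.items()}


type_total = sum(by_type.values())
format_total = sum(by_format.values())
type_shares = shares(by_type)
format_shares = shares(by_format)

print(type_total, format_total)
print(type_shares)
print(format_shares)
```

Both tallies add up to the 37 released items, so the two classifications cover the same set of problems.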
The distribution of answers to all 66 questions in the physics exam can be downloaded from the Boston College site along with a one-line description of the questions. The lowest average percentage correct (for all students) was 9%, on an essay problem asking for the pattern of water flowing out of holes in the side of a vertical cylinder of fluid as a function of the height of the holes. The highest percentage on a released item (65%) was for a multiple choice question asking why steam has more volume than water. Another, not-yet-released question labeled "Circuit in box" had an average percent correct of 86% for all students.
The following table gives the average percentage of participants who correctly answered various types of TIMSS items.
| Type of Problem | US (TIMSS) | All Countries (TIMSS) |
|---|---|---|
| Overall Average | 24% | 35% |
| Average for E&M Problems (16) | 19% | 32% |
| Average for Mechanics (16) | 22% | 33% |
| Average for Modern Physics (14) | 24% | 34% |
| Average for Thermodynamics | 29% | 39% |
| Average for Waves and Light (11) | 34% | 44% |
The numbers in parentheses in the first column are the number of problems of each subject on the TIMSS exam.
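To make the comparison in the table easier to read, the gap between the U.S. and all-country averages can be computed for each subject. The following is a minimal sketch in Python, not part of the TIMSS analysis itself; the dictionary layout and the `gaps` helper are illustrative assumptions, and the percentages are copied from the table.

```python
# Average percent correct by subject as (US, all countries), copied from the table above.
averages = {
    "Overall": (24, 35),
    "E&M": (19, 32),
    "Mechanics": (22, 33),
    "Modern Physics": (24, 34),
    "Thermodynamics": (29, 39),
    "Waves and Light": (34, 44),
}


def gaps(table):
    """Return {subject: (point gap, US average divided by all-country average)}."""
    return {
        subject: (intl - us, round(us / intl, 2))
        for subject, (us, intl) in table.items()
    }


subject_gaps = gaps(averages)
for subject, (points, ratio) in subject_gaps.items():
    print(f"{subject:16s} gap = {points:2d} points, US/All = {ratio:.2f}")
```

With these figures the gap is between 10 and 13 percentage points in every subject, with E&M showing the largest gap.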
To learn more about the source of the deficiencies of the U.S. students, I examined the 33 lowest scoring items on the test. I tested whether any one of the five subject areas appeared more often in the bottom half than in the upper half. For both the U.S. and the all-country categories, each of the subjects appeared about half of the time in each group. So it would appear that the low scores of the U.S. students are not due to a knowledge deficit in any one of the five categories listed above.
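The half-split comparison described above can be sketched as follows. This is an assumed reconstruction of the procedure in Python, not the spreadsheet actually used; the function name is hypothetical, and the sample items are made-up placeholders with subjects "A" and "B", not TIMSS data.

```python
from collections import Counter


def half_split_by_subject(items):
    """Rank items by percent correct and count each subject in the two halves.

    items: list of (subject, percent_correct) pairs.
    Returns {subject: (count in bottom half, count in top half)}.
    """
    ranked = sorted(items, key=lambda item: item[1])
    cut = len(ranked) // 2
    bottom = Counter(subject for subject, _ in ranked[:cut])
    top = Counter(subject for subject, _ in ranked[cut:])
    return {s: (bottom[s], top[s]) for s in sorted(set(bottom) | set(top))}


# Made-up placeholder items for illustration only (NOT TIMSS data).
toy_items = [
    ("A", 5), ("B", 8), ("A", 12), ("B", 20),
    ("A", 30), ("B", 35), ("A", 41), ("B", 50),
]
split = half_split_by_subject(toy_items)
print(split)
```

A subject that is a particular weakness would show a noticeably larger count in the bottom half than in the top half; roughly equal counts correspond to the result reported above.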
What kinds of questions are primarily missed by the U.S. students?
Twenty-four of the 33 lowest scoring questions were open-ended, graphical or essay questions. Only 9 of these questions were multiple choice. When I further examined those multiple choice questions (which have been released), it was clear that almost all 9 required a number of steps and relationships, and often the use of a symbolic representation (equations), to generate a correct answer. By contrast the questions which were answered by the largest fractions of U.S. students were predominantly multiple choice and single step or single concept questions.
In the following table are items on which less than 10% of the U.S. students gave fully correct answers.
| Item | Subject | % Correct All | % Correct US | Item description |
|---|---|---|---|---|
| f16 | E&M | 11.8 | 0.1 | period of charged particle in B field |
| f15 | modern | 12.3 | 0.4 | photoelectric effect and different metals |
| f14 | E&M | 16.8 | 0.4 | magnitude of E field strength |
| g19 | E&M | 13.2 | 0.6 | Lenz's law and falling aluminum ring |
| h18 | modern | 13.7 | 0.8 | television as particle accelerator |
| f17b | mech | 8.7 | 1.44 | acceleration expt/measurement error |
| h16 | E&M | 21.9 | 1.5 | speed of el in crossed E and B fields |
| g18 | modern | 10 | 1.8 | alpha particles passing through gold |
| h14 | heat | 13 | 2 | effect of density on freezing of water |
| 16g | mech | 8.5 | 3.5 | effect of pressure on water leaking from a bottle |
| f12 | heat | 12.3 | 4.6 | Temperature of a system |
| g14 | modern | 25 | 4.6 | paths of alpha, beta and gamma in electric field |
| g11 | heat | 13.4 | 5.2 | effect of ice melting on water level in aquarium |
| h15 | modern | 23.3 | 6.6 | de Broglie wavelength of moving electron |
| 15g | mech | 15.7 | 6.7 | direction of acceleration on bouncing ball |
| h13 | mech | 35 | 6.9 | interpretation of a force vs distance graph |
| f17a | mech | 32.3 | 8.3 | acceleration expt/value of gravity |
| h19a | waves | 18.3 | 8.9 | speed of sound expt/outline |
| f02 | mech | 18.9 | 9 | force on connected springs |
All of these items except the last one were open-ended, multiple-concept, multiple-step problems. The last problem was a multiple-choice problem. The fourth column is the percent of U.S. students who were able to answer that question correctly. On these open-ended problems students could have received credit for a partially correct answer, but such answers are not counted in this comparison.
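The table can also be summarized by subject and by how far the U.S. figure falls below the all-country figure. The sketch below is an illustration in Python with the rows copied from the table above; the one-tenth threshold is an arbitrary choice made for this example, not a criterion used in the TIMSS reports.

```python
from collections import Counter

# (item, subject, % correct all countries, % correct US), copied from the table above.
rows = [
    ("f16", "E&M", 11.8, 0.1), ("f15", "modern", 12.3, 0.4),
    ("f14", "E&M", 16.8, 0.4), ("g19", "E&M", 13.2, 0.6),
    ("h18", "modern", 13.7, 0.8), ("f17b", "mech", 8.7, 1.44),
    ("h16", "E&M", 21.9, 1.5), ("g18", "modern", 10, 1.8),
    ("h14", "heat", 13, 2), ("16g", "mech", 8.5, 3.5),
    ("f12", "heat", 12.3, 4.6), ("g14", "modern", 25, 4.6),
    ("g11", "heat", 13.4, 5.2), ("h15", "modern", 23.3, 6.6),
    ("15g", "mech", 15.7, 6.7), ("h13", "mech", 35, 6.9),
    ("f17a", "mech", 32.3, 8.3), ("h19a", "waves", 18.3, 8.9),
    ("f02", "mech", 18.9, 9),
]

# How many of the listed items fall in each subject area.
subject_counts = Counter(subject for _, subject, _, _ in rows)

# Items where the US figure is below one tenth of the all-country figure
# (the one-tenth threshold is an assumption made for this illustration).
far_behind = [item for item, _, all_pct, us_pct in rows if us_pct < 0.1 * all_pct]

print(subject_counts)
print(far_behind)
```

On these numbers, six of the nineteen items (f16, f15, f14, g19, h18, and h16) have a U.S. percentage below one tenth of the all-country percentage, and four of those six are E&M items.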
What are the characteristics of these most poorly answered questions? They all attempt to measure fairly clear concepts that are fundamental to physics. They almost all require a symbolic representation of the problem and a manipulation of variables. None of these questions are memorization or single step problems.
If one examines the questions which were answered by the largest fractions of the U.S. students, it is easy to see that these are dominated by single fact, one-step items. It is worth noting that the U.S. students have a higher percentage of correct answers than the international average on only 5 items. An Excel file of all the items can be obtained from my home page.
In my opinion, the questions on this examination are of high quality and do probe whether students understand the principles of physics. Since all of the needed physical constants and a number of equations were presented with the examination, it appears that the examination tested understanding rather than requiring a significant level of memorization.
Are there other characteristics of the study that indicate the weakness of the US system?
In conversations with Dr. Senta Raizen of NCISE, who is one of the authors on the data analysis team for the TIMSS project, several important points came up that are not fully emphasized in the study reports. The major characteristic of the U.S. curricula is that they cover a very large number of topics and are primarily focused on vocabulary. Current U.S. students have been exposed to a very large number of topics, but have in-depth experience with few of them. The various measures of student interest seem to drop continually with grade level in the U.S. Many other countries exhibit an increase in interest in science around the eighth grade, where students go into some depth in various subjects. In the U.S. there is a more or less steady decrease in interest as the number of topics covered continues to increase.
A second surprising point was that the TIMSS researchers from the U.S. had very serious difficulty finding enough students taking physics and advanced mathematics in 12th grade to make the U.S. sample size comparable to that of other countries. Most other countries were able to find adequate numbers of large enough classes to make up the required sample sizes in physics and advanced mathematics because the fraction of their students taking these subjects was larger than in U.S. schools.
The comparison of eighth grade classrooms in Japan, Germany, and the U.S. also showed that the questions and concerns present in U.S. classrooms are much more fact related and involve only very simple conceptual processes. The classrooms in Japan and Germany seem to provide students with a greater understanding of processes and applications of mathematics than classrooms in this country do.
There is one revealing anomaly among the U.S. schools involved in the TIMSS examinations. It has been reported that the "First in The World Consortium" of schools did perform better than the world average. This group of quality schools has very strong preparation in junior high, requires all high school teachers to have a major or minor in the subjects they teach, and provides significant training for all teachers. Seventy percent of the students in these schools are in advanced mathematics programs. Large numbers of students also take physics. The web page for this consortium is http://www.ncrel.org/fitw, but at the time this newsletter went to press the confirming data were not on these web pages.
What can interested physicists do?
A first step would be to read some of the comparative books that are being produced from the TIMSS study. The book "Characterizing Pedagogical Flow" reports data about classrooms, teachers, materials, and curriculum for 9 and 13 year olds in the following countries: France, Japan, Norway, Spain, Switzerland, and the United States. The book "A Splintered Vision" is a thorough study of the unfocussed nature of the United States curriculum. This book may provide one of the best pictures of the weakness of the current system. The international study of science curriculum is presented in the book "Many Visions, Many Aims, Vol. 2." Another study of interest is the volume "Examining the Examinations", which compares the examinations that students heading for college must survive in seven different countries. Other volumes which provide detail about the high school population of the study will be available in the near future.
One possible response would be to create yet another curriculum, but it is not clear that the most immediate need is for a new curriculum. One problem is that science teaching has been taken over by a new profession (the science education specialist in curriculum and instruction), most of whose members have never carried out scientific research, at least in the physical sciences. A far more likely root of the problem may lie with the level of understanding of physics among the teachers of high school physics. Most of the standards for teachers of physics have their foundation in the initial non-calculus physics courses which students take in college. Typically teachers are supposed to take other courses beyond the first year, but many teachers end up teaching physics with only the single year of (non-calculus) physics.
The major hurdles for making a change are largely local and political. Each state legislature, state board of education and state department of education has determined, along with the help of many interest groups, the educational and skill requirements for physics and other science teachers. These requirements do not usually require much depth in physics. Changes in these requirements will be slow and time consuming. It will require a process of meeting and working with a number of governmental and local groups. The education colleges will not regard physicists as very serious unless they see that we are taking some effort to provide the course work and support for teachers in our subject. There are some state resources available (Eisenhower funds) for improving physics and science education, but it will take a considerable amount of time and effort to convince those who manage these that physicists are a good investment.
My Opinion of the TIMSS Message for the Physics Community
My opinion of the TIMSS message for the physics community is that we need to take responsibility for pre-college physics and science teachers. We need to give them better training in physics. I think the TIMSS results reflect the same effects as those measured by the Force Concept Inventory in introductory mechanics classes. We are not generally giving students an understanding of physics which supports generalization and manipulation of concepts in new contexts. The understanding of physics by the bulk of pre-college physics teachers in this country is primarily at the level assessed by the Force Concept Inventory at the end of their first-year introductory physics course. Many other countries require much more physics exposure and mastery, including undergraduate research. A large fraction of existing high school teachers will soon be retiring; the physics community has a great opportunity.