
Why should I use peer instruction in my class?

Image: "Lecture Hall," uniinnsbruck, Flickr (CC)

[Update (June 16): Lead author Zdeslav Hrepic pointed me to a follow-up book chapter [PDF] where he and the study co-authors describe using tablet-PCs to counter the problems uncovered in their study. Thanks, Z.]

I’m sure we’ve all heard it from skeptical instructors: Why should I use peer instruction in my class? In response, we often cite Hake’s 6000-student study or the new UBC study by my colleagues Louis, Ellen and Carl. These are still pretty abstract, though: If you use interactive, learner-centered instruction, you can expect your students to get a better grasp of the concepts.

“Sure, but why?” the instructors ask. “Why does it work?”

I just read a paper that can help answer that question. I ran across it while following a discussion about the Khan Academy videos and whether or not they are good tools for learning. This paper by Hrepic, Zollman and Rebello (2007) asks students in an introductory physics course, along with physics experts (with M.Sc. and Ph.D. degrees), to watch a 15-minute video of a renowned physics educator presenting a topic in physics.

The researchers did a series of pre- and post-tests and interviews with the students and experts to compare their understanding of the concepts covered (or not) in the video. There were some significant differences. A couple stick in my head: (1) Students recalled learning about concepts that were not presented in the video. (2) Only students who knew the correct answers on the pre-test were able to infer the concepts from the video (that is, the questions were not explicitly answered in the video). The students who did not know the concepts beforehand were unable to make the inferences. Like I said, there are significant differences between what the instructor thinks a lecture covers and what the students think is covered.

The paper nicely gives us some suggestions to counter this problem. Here they are, along with my thoughts about how to use peer instruction to do just that.

Making inferences: Experts make more inferences than students. And only students who already know the concepts can infer them from the lecture. Therefore, instructors need to be cautious about relying on students to fill in the blanks.

Some of the best peer instruction questions are the conceptual questions where the answer is not simple recall. No traxoline here, please. Questions that rely on students making inferences are excellent for promoting discussion because it’s likely students will interpret the question differently, make different assumptions and come to different conclusions. <soapbox> All the more reason that students need to first answer clicker questions on their own so they’re prepared to share their inferences. </soapbox>

Prior knowledge: Students’ prior knowledge influences what they perceive and can “distort” their recollection of what the lecturer says. Therefore, it’s essential that the instructor has some idea of what the students already know (particularly their misconceptions) before presenting new material.

A few, introductory clicker questions will reveal the students’ prior knowledge. Sure, maybe these are simple recall questions that won’t generate a lot of discussion. But the students’ responses will inform the agile instructor who can tailor the instruction.

Continuous feedback about students’ understanding: The trail the instructor blazes through the concepts and the path the students follow often diverge during a lecture. The instructor should be continuously gathering and reacting to feedback from the students about their understanding so the instructor can shepherd the students back on track.

Observant instructors can gather critical feedback from the discussions that occur during peer instruction or from the students’ answers on in-class worksheets like the Lecture-Tutorials popular in introductory “Astro 101” classes and other hybrids of the Washington Tutorials. Rather than waiting weeks until after the midterm or final exam to find out students totally missed Concept X, the instructor can discover it within minutes of introducing the topic. Minutes, not weeks! The agile instructor can immediately revisit the difficult concepts. Immediately, not weeks later or never!

I’m much more confident I can answer the skeptical instructor now. “Why should I use clickers in my classroom?” Because they give the students and you the ability to assess the current level of understanding of the concepts. Current, right now, before it’s too late and the house of cards you’re so carefully building comes crashing down.

An astronomy education retreat

Last year, Tim and Stephanie Slater phoned me up and invited me to be part of an astronomy education research group they were putting together. I was flattered to be part of the Conceptual Astronomy and Physics Education Research (CAPER) team! Especially when I learned who else I’d be working with. I mean, check out the bios of these remarkable astronomy educators. I’ve got to admit, I was a bit overwhelmed by their experience (and publication records).

We’d gotten together at a conference we all attended and we meet regularly via telecon, but this week was special. A group of us — Tim, Stephanie, Julia, Sharon, Kendra, Inge, Eric and I — got together in Colorado for an intensive, 3-day astronomy education research retreat.

Wow.

We talked about this. We argued about that. We thought about this and that. And it was all about teaching and learning astronomy. Not marking or Little League or home renovations or all those other things that eat up our time. Just astronomy education. What a treat!

By the end of the 3 days, we’d developed a research project, from concept tests and interview protocols to IRB letters and pre/post testing schedules. And what’s it all about?

Understanding certain concepts in introductory astronomy, like the causes of the seasons and the phases of the Moon, requires students to visualize the Earth, Moon and Sun from both Earth-centered and Sun-centered points of view. It seems likely, then, that students with better spatial reasoning abilities will be more successful. There are already standard tests of spatial reasoning. And there are a number of assessments of astronomy knowledge, augmented by the ones we created this week. Add some pre-/post-testing and a dash of correlation coefficient and see what comes out.
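
To make the planned analysis concrete, here is a tiny sketch of the kind of correlation we have in mind. Everything in it is hypothetical (made-up variable names and made-up numbers), purely to illustrate the calculation, not our actual data or instruments.

import numpy as np
from scipy.stats import pearsonr

# Hypothetical scores for a handful of students:
# spatial reasoning test score vs. gain on an astronomy concept assessment.
spatial = np.array([12, 18, 25, 9, 22, 15, 30, 11])
astro_gain = np.array([0.21, 0.35, 0.48, 0.15, 0.40, 0.30, 0.55, 0.18])

# Correlation between spatial reasoning and astronomy learning gain
r, p_value = pearsonr(spatial, astro_gain)
print(f"correlation r = {r:.2f}, p = {p_value:.3f}")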

One of the concepts we want to explore is the motion of the sky, so we made up an assessment using this diagram. (I’m using this example because *I* created this diagram with PowerPoint and a little help from Star Walk.)

Looking south at sunset. So many questions we can ask...

Like I said earlier, I was pretty overwhelmed by the calibre of the other people in the group. So it was very gratifying, good for my ego, to be able to contribute and realize that we all have strengths. Maybe that’s the humble Canadian coming through.  I’m excited about what we’ve done and what we’ll be doing. And proud I have knowledge and experience to share.

I can’t wait to see what we find. Stay tuned!

Going over the exam

How often have you heard your fellow instructors lament,

I don’t know why I bother with comments on the exams or even handing them back – students don’t go over their exams to see what they got right and wrong, they just look at the mark and move on.

If you often say or think this, you might want to ask yourself, What’s their motivation for going over the exam, besides “It will help me learn…”? But that’s the topic for another post.

In the introductory gen-ed astronomy class I’m working on, we gave a midterm exam last week. We dutifully marked it, which was simple because the midterm exam was multiple-choice, answered on Scantron cards. And calculated the average. And fixed the scoring on a couple of questions where the question stem was ambiguous (when you say “summer in the southern hemisphere,” do you mean June or do you mean when it gets hot?). And we moved on.

Hey, wait a minute! Isn’t that just what the students do — check the mark and move on?

Since I have the data (every student’s answer to every question, via the Scantron and already in Excel), I decided to “go over the exam” to try to learn from it.

(Psst: I just finished wringing some graphs out of Excel and I wanted to start writing this post before I got distracted by, er, life so I haven’t done the analysis yet. I can’t wait to see what I write below!)

Besides the average (23.1/35 questions or 66%) and standard deviation (5.3/35 or 15%), I created a histogram of the students’ choices for each question. Here is a selection of questions which, as you’ll see further below, span the good-to-bad scale.
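
If you prefer scripting to spreadsheets, here is a minimal sketch of the same tally in Python. The file name midterm_scantron.csv and the Q1–Q35 column names are hypothetical, just to illustrate the idea; our actual data lives in Excel.

import pandas as pd

# Hypothetical export: one row per student, columns Q1..Q35 holding A-E choices
responses = pd.read_csv("midterm_scantron.csv")

for question in [col for col in responses.columns if col.startswith("Q")]:
    # Tally how many students chose each of A-E for this question
    counts = (responses[question]
              .value_counts()
              .reindex(["A", "B", "C", "D", "E"], fill_value=0))
    print(question, counts.to_dict())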

Question 9: You photograph a region of the night sky in March, in September, and again the following March. The two March photographs look the same but the September photo shows 3 stars in different locations. Of these three stars, the one whose position shifts the most must be

A) farthest away
B) closest
C) receding from Earth most rapidly
D) approaching Earth most rapidly
E) the brightest one

Students' choices for Question 9. The correct answer is B.

Question 16: What is the shape of the shadow of the Earth, as seen projected onto the Moon, during a lunar eclipse?

A) always a full circle
B) part of a circle
C) a straight line
D) an ellipse
E) a lunar eclipse does not involve the shadow of the Earth

Students' choices for Question 16. The correct answer is B.

Question 25: On the vernal equinox, compare the number of daytime hours in 3 cities, one at the north pole, one at 45 degrees north latitude and one at the equator.

A) 0, 12, 24
B) 12, 18, 24
C) 12, 12, 12
D) 0, 12, 18
E) 18, 18, 18

Students' answers to Question 25. The correct answer is C.

How much can you learn from these histograms? Quite a bit. Question 9 is too easy and we should use our precious time to better evaluate the students’ knowledge. The “straight line” choice on Question 16 should be replaced with a better distractor – no one “fell for” that one.  I’m a bit alarmed that 5% of the students think that the Earth’s shadow has nothing to do with eclipses but then again, that’s only 1 in 20 (actually, 11 in 204 students – aren’t data great!)  We’re used to seeing these histograms because in class, we have frequent think-pair-share episodes using i>clickers and use the students’ vote to decide how to proceed. If these were first-vote distributions in a clicker question, we wouldn’t do Question 9 again but we’d definitely get them to pair and share for Question 16 and maybe even Question 25. As I’ve written elsewhere, a 70% “success rate” can mean only about 60% of the students chose the correct answer for the right reasons.

I decided to turn it up a notch by following some advice I got from Ed Prather at the Center for Astronomy Education. He and his colleagues analyze multiple-choice questions using the point-biserial correlation coefficient. I’ll admit it – I’m not a statistics guru, so I had to look that one up. Wikipedia helped a bit, so did this article and Bardar et al. (2006). Normally, a correlation coefficient tells you how two variables are related. A favourite around Vancouver is the correlation between property crime and distance to the nearest Skytrain station (with all the correlation-causation arguments that go with it). With point-biserial correlation, you can look for a relationship between students’ test scores and their success on a particular question (this is the “dichotomous variable” with only two values, 0 for wrong and 1 for right). It allows you to speculate on things like:

  • (for high correlation) “If they got this question, they probably did well on the entire exam.” In other words, that one question could be a litmus test for the entire test.
  • (for low correlation) “Anyone could have got this question right, regardless of whether they did well or poorly on the rest of the exam.” Maybe we should drop that question since it does nothing to discriminate or resolve the students’ level of understanding.

I cranked up my Excel worksheet to compute the coefficient, usually called ρ_pb or ρ_pbis:

ρ_pb = [(μ_+ − μ_x) / σ_x] × √(p / q)

where μ_+ is the average test score for all students who got this particular question correct, μ_x is the average test score for all students, σ_x is the standard deviation of all test scores, p is the fraction of students who got this question right and q = (1 − p) is the fraction who got it wrong. You compute this coefficient for every question on the test. The key step in my Excel worksheet, after giving each student a 0 or 1 for each question they answered, was the AVERAGEIF function: for each question I computed

=AVERAGEIF(B$3:B$206,"=1",$AL3:$AL206)

where, for example, Column B holds the 0 and 1 scores for Question 1 and Column AL holds the exam marks. This function takes the average of the exam scores only for those students (rows) who got a “1” on Question 1. At last, then, the point-biserial correlation coefficients for each of the 35 questions on the midterm, sorted from lowest to highest:

Point-biserial correlation coefficient for the 35 multiple-choice questions in our astronomy midterm, sorted from lowest to highest. (Red) limits of very weak to strong (according to the APEX dissertations article) and also the (green) "desirable" range of Bardar et al. are shown.

First of all, ooo shiny! I can’t stand the default graphics settings of Excel (and PowerPoint) but with some adjustments, you can produce a reasonable plot. Not that this one is perfect, but it’s not bad. Gotta work on the labels and a better way to represent the bands of “desirable”, “weak”, etc.
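
If you’d rather script the whole thing than coax it out of Excel, here is a minimal Python sketch of the same calculation. It is only an illustration under my own assumptions, not the worksheet described above: scores is assumed to be a NumPy array of 0s and 1s with one row per student and one column per question.

import numpy as np

def point_biserial(scores):
    # Point-biserial coefficient for each column (question) of a 0/1 score matrix
    totals = scores.sum(axis=1)            # each student's exam mark
    mu_x = totals.mean()                   # average test score
    sigma_x = totals.std()                 # standard deviation of test scores
    coeffs = []
    for col in range(scores.shape[1]):
        right = scores[:, col] == 1
        p = right.mean()                   # fraction who got this question right
        q = 1.0 - p                        # fraction who got it wrong
        mu_plus = totals[right].mean()     # average mark of those who got it right
        coeffs.append((mu_plus - mu_x) / sigma_x * np.sqrt(p / q))
    return np.array(coeffs)

# e.g. np.sort(point_biserial(scores)) gives the coefficients from lowest
# to highest, like the plot above.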

Back to going over the exam, how did the questions I included above fare? Question 9 has a weak, not desirable coefficient, just 0.21. That suggests anyone could get this question right, regardless of how they did on the rest of the exam. It does nothing to discriminate or distinguish high-performing students from low-performing students. Question 16, with ρ_pb = 0.37, is in the desirable range – just hard enough to begin to separate the high- and low-performing students. Question 25 is one of the best on the exam, I think.

In case you’re wondering, Question 6 (with the second highest ρ_pb) is a rather ugly calculation. It discriminated between high- and low-performing students but personally, I wouldn’t include it – it doesn’t match the more conceptual learning goals, IMHO.

I was pretty happy with this analysis (and my not-such-a-novice-anymore skills in Excel and statistics). I should have stopped there. But like a good scientist making sure every observation is consistent with the theory, I looked at Question 26, the one with the highest point-biserial correlation coefficient. I was shocked, alarmed even. The most discriminating question on the test was this?

Question 26: What is the phase of the Moon shown in this image?

A) waning crescent
B) waxing crescent
C) waning gibbous
D) waxing gibbous
E) third quarter

It’s waning gibbous, by the way, and 73% of the students knew it. That’s a lame, Bloom’s taxonomy Level 1, memorization question. Damn. To which my wise and mentoring colleague asked, “Well, what was the exam really testing, anyway?”

Alright, perhaps I didn’t get the result I wanted. But that’s not the point of science, or of this exercise. I definitely learned a lot by “going over the exam”: about validating questions, Excel, statistics and WordPress. And perhaps I made it easier for the next person; shoulders of giants and all that…
