*By Kevin Cheung, Associate Professor, School of Mathematics and Statistics*

When instructors change how they teach, they often want to know what impact the change has had. Even though the literature contains countless studies in higher education, conducting a study that meets the standards of statistical rigour is hardly a walk in the park.

As an example, say we want to assess the impact of the choice between two textbooks, Book A and Book B, for a course. How should we set up a study?

Three immediate options come to mind:

1. Teach the course in a single term in a way that is agnostic to the choice of textbook, and randomly assign each student one of the two textbooks as their main reference.
2. Teach two sections of the course in the same term, using Book A in one section and Book B in the other, with the same set of assessments.
3. Teach with Book A in one term and Book B in another term while keeping everything else as similar as possible.

Option 1 is known as A/B testing. It is attractive in theory but impractical in a regular course. For instance, are students willing to be randomly assigned? Can we ensure that students assigned Book A won’t consult Book B, or vice versa? (Note that A/B testing might work in a MOOC setting, as suggested by a patent application by Khan Academy, though the conclusions drawn could be weak.)
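To make the random-assignment step of Option 1 concrete, here is a minimal sketch in Python. The function name `assign_textbooks`, the student IDs, and the seed are all hypothetical and chosen only for illustration; the sketch says nothing about whether students would accept or follow the assignment.

```python
import random

def assign_textbooks(student_ids, seed=None):
    """Randomly split students into two near-equal groups (hypothetical helper)."""
    rng = random.Random(seed)       # a seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)           # random order removes any pattern in the roster
    half = len(shuffled) // 2
    return {
        "Book A": sorted(shuffled[:half]),
        "Book B": sorted(shuffled[half:]),
    }

# Illustrative roster of ten made-up student IDs.
groups = assign_textbooks(range(1, 11), seed=2024)
for book, ids in groups.items():
    print(book, ids)
```

Shuffling the whole roster and cutting it in half keeps the two groups the same size (or within one student of each other), which a coin flip per student would not guarantee.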

Option 2 gets around the difficulty of randomly assigning the two textbooks to a certain extent, though it still suffers from the problem that students will know about both textbooks. A potentially bigger problem is that the two sections might be quite dissimilar. It might happen that one section fits the schedules of students in a particular program while the other does not, possibly leading to confounding: any observed difference in outcomes could very well be due to the makeup of students in the different sections rather than the textbook choice.
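The confounding problem can be illustrated with a small simulation. This is only a sketch under invented assumptions: the grade model, the section sizes, and the preparation levels (70 versus 60) are made up, and the textbook effect is deliberately set to zero.

```python
import random
import statistics

def simulate_section(n_students, mean_preparation, textbook_effect, rng):
    """Hypothetical grades: prior preparation plus noise plus a textbook effect."""
    return [
        rng.gauss(mean_preparation, 10) + textbook_effect
        for _ in range(n_students)
    ]

rng = random.Random(0)
TRUE_TEXTBOOK_EFFECT = 0.0  # assumption: neither book helps more than the other

# Assumption: Section A happens to attract better-prepared students.
section_a = simulate_section(200, 70, TRUE_TEXTBOOK_EFFECT, rng)  # uses Book A
section_b = simulate_section(200, 60, TRUE_TEXTBOOK_EFFECT, rng)  # uses Book B

observed_gap = statistics.mean(section_a) - statistics.mean(section_b)
print(f"Observed gap between sections: {observed_gap:.1f} points")
print(f"True textbook effect: {TRUE_TEXTBOOK_EFFECT} points")
```

Even though the textbooks make no difference in this made-up model, the section averages differ noticeably, and a naive comparison would credit Book A with an effect that comes entirely from who enrolled in which section.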

Option 3 seems viable if the course is taught in the same term of different academic years with identical course schedules. However, creating assessments that are consistent across terms is nontrivial. Students taking the course in a later term can often obtain materials from prior terms through friends and “study assistance” platforms. Of course, they can also find out about the alternative textbook from these materials. In addition, ensuring that the teaching is consistent across terms is challenging.

Given the difficulties mentioned above, should we simply give up? Definitely not. We just need to be mindful of what factors could be at play and what the potential limitations are. In fact, identifying such factors and limitations can sometimes help us learn more about students’ learning environment.