New Study Highlights the Importance of Careful Multiple-Choice Question Construction to Ensure Fairness and Accurately Reflect Student Abilities
“Question flaws do not affect all students equally—they create a barrier for weaker students, unintentionally widening performance gaps.”
Medical, dental and master’s students in biomedical sciences frequently take standardized, multiple-choice tests to assess their foundational knowledge. Reasons for the widespread use of these tests include reliability, efficiency, low cost and, when the questions are well constructed, high accuracy. However, multiple-choice questions may present challenges for test-takers when they contain item writing flaws (features that make a question ambiguous or poorly constructed), which can compromise the fairness and validity of the assessment.
New research, led by scientists at Boston University Chobanian & Avedisian School of Medicine, suggests that a higher total number of flaws in questions could disadvantage lower-performing students, not because the content is more difficult, but because these students may be less able to navigate item writing flaws.
“Our study highlights the importance of careful multiple-choice question construction to ensure assessments are fair and accurately reflect student abilities,” explains corresponding author Marisol E. Lopez, PhD, assistant professor of pharmacology, physiology & biophysics at the school. “Based on our findings, we recommend the identification and elimination of item writing flaws to promote greater equity.”
To study whether poorly written questions affect student performance, the researchers created a physiology test and a demographic survey that were given to 31 BU dental and graduate medical science master’s students and 54 University of Central Florida medical students.
After analyzing the results, they found that test performance did not differ among demographic groups (sex, race/ethnicity, English language proficiency, birth country, and primary home language) and that question difficulty was not affected by the number of flaws present in the question. However, they did find that questions with more flaws had a greater impact on lower-performing students. Specifically, certain flaws, such as unclear stems (prompts), affected weaker students disproportionately. These findings suggest that high-performing students can better navigate question flaws.
According to the researchers, these findings highlight the need for improved faculty training and rigorous review of test questions before exams are administered. “Fixing unclear wording, simplifying stems, and eliminating common structure mistakes could help ensure that all students, no matter their background, are evaluated fairly,” adds Lopez.
These findings appear online in the journal Medical Science Educator.