Edinburgh Research Archive

Algorithms for assessing the quality and difficulty of multiple choice exam questions

dc.contributor.author
Luger, Sarah Kaitlin Kelly
en
dc.date.accessioned
2017-03-15T11:09:35Z
dc.date.available
2017-03-15T11:09:35Z
dc.date.issued
2016-06-27
dc.description.abstract
Multiple Choice Questions (MCQs) have long been the backbone of standardized testing in academia and industry. Correspondingly, there is a constant need for the authors of MCQs to write and refine new questions for new versions of standardized tests, as well as to support measuring performance in the emerging massive open online courses (MOOCs). Research that explores what makes a question difficult, or which questions distinguish higher-performing students from lower-performing students, can aid in the creation of the next generation of teaching and evaluation tools. In the automated MCQ answering component of this thesis, algorithms query for definitions of scientific terms, process the returned web results, and compare the returned definitions to the original definition in the MCQ. This automated method for answering questions is then augmented with a model, based on human performance data from crowdsourced question sets, for analysis of question difficulty as well as the discrimination power of the non-answer alternatives. The crowdsourced question sets come from PeerWise, an open source online college-level question authoring and answering environment. The goal of this research is to create an automated method to both answer and assess the difficulty of multiple choice inverse definition questions in the domain of introductory biology. The results of this work suggest that human-authored question banks provide useful data for building gold standard human performance models. The methodology for building these performance models has value in other domains that test the difficulty of questions and the quality of the exam takers.
en
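The abstract describes measuring question difficulty and the discrimination power of answer alternatives from crowdsourced human performance data. A minimal sketch of the classical test theory versions of these measures is shown below: difficulty as the proportion of examinees answering correctly, and discrimination as the point-biserial correlation between an item score and the rest-score. All names and data here are illustrative assumptions, not the thesis's actual implementation.

```python
# Classical test theory sketch: item difficulty and discrimination
# from a binary response matrix (rows = examinees, cols = items).
# Illustrative only; the data and function names are invented.

def pointbiserial(x, y):
    """Pearson correlation between a binary item score x and rest-score y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: correlation undefined, report 0
    return cov / (vx * vy) ** 0.5

def item_stats(responses):
    """Return a (difficulty, discrimination) pair per item.

    difficulty     = proportion of examinees answering correctly
    discrimination = point-biserial correlation between the item score
                     and the rest-score (total score minus this item)
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(k):
        col = [row[j] for row in responses]
        p = sum(col) / n  # higher p means an easier item
        rest = [t - c for t, c in zip(totals, col)]
        stats.append((p, pointbiserial(col, rest)))
    return stats

# Toy data: 4 examinees x 3 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
for j, (p, r) in enumerate(item_stats(responses)):
    print(f"item {j}: difficulty={p:.2f} discrimination={r:.2f}")
```

A high discrimination value indicates that stronger examinees (by rest-score) tend to answer the item correctly, which is the property the thesis's crowdsourced performance models aim to capture for incomplete answer data.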
dc.identifier.uri
http://hdl.handle.net/1842/20986
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. Two methods for measuring question difficulty and discrimination in incomplete crowdsourced data. In Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (HCOMP-13), Palm Springs, CA, USA, 2013.
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. An analysis of question quality and user performance in crowdsourced exams. In Proceedings of the 2013 Workshop on Data-Driven User Behavioral Modelling and Mining from Social Media (DUBMOD at CIKM 2013), San Francisco, CA, USA, October 28, 2013, pages 29-32.
en
dc.relation.hasversion
Sarah Luger. A graph theory approach for generating multiple choice exams. In Proceedings of the 2011 AAAI Fall Symposium on Question Generation, 2011.
en
dc.subject
Item Response Theory
en
dc.subject
IRT
en
dc.subject
Multiple Choice Questions
en
dc.subject
MCQs
en
dc.subject
question answering
en
dc.subject
question difficulty
en
dc.subject
question discrimination
en
dc.title
Algorithms for assessing the quality and difficulty of multiple choice exam questions
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Luger2016.pdf
Size:
3.55 MB
Format:
Adobe Portable Document Format
