Algorithms for assessing the quality and difficulty of multiple choice exam questions
dc.contributor.author
Luger, Sarah Kaitlin Kelly
en
dc.date.accessioned
2017-03-15T11:09:35Z
dc.date.available
2017-03-15T11:09:35Z
dc.date.issued
2016-06-27
dc.description.abstract
Multiple Choice Questions (MCQs) have long been the backbone of standardized
testing in academia and industry. Correspondingly, there is a constant need for the
authors of MCQs to write and refine new questions for new versions of standardized
tests, as well as to support measuring performance in the emerging massive open online
courses (MOOCs). Research that explores what makes a question difficult, or which
questions distinguish higher-performing students from lower-performing students, can
aid in the creation of the next generation of teaching and evaluation tools.
In the automated MCQ answering component of this thesis, algorithms query the web
for definitions of scientific terms, process the returned results, and compare the
retrieved definitions to the original definition in the MCQ. This automated method for
answering questions is then augmented with a model, built from human performance
data on crowdsourced question sets, for analysis of question difficulty as well as
the discrimination power of the non-answer alternatives. The crowdsourced question
sets come from PeerWise, an open source online college-level question authoring and
answering environment.
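As an illustration of the comparison step described above, the following minimal sketch scores each answer alternative by its similarity to definitions retrieved from the web and picks the best match. It is an assumed, simplified realisation rather than the exact algorithm of the thesis, and the function and variable names (tokenize, cosine, pick_answer) are hypothetical.

    import re
    from collections import Counter
    from math import sqrt

    def tokenize(text):
        # Lowercase and split on non-letters; a deliberately simple tokenizer.
        return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

    def cosine(a, b):
        # Cosine similarity between two bag-of-words Counters.
        common = set(a) & set(b)
        dot = sum(a[t] * b[t] for t in common)
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def pick_answer(retrieved_definitions, alternatives):
        # Pool the tokens of all web-retrieved definitions, then return the
        # MCQ alternative whose token vector is most similar to that pool.
        pooled = Counter(t for d in retrieved_definitions for t in tokenize(d))
        scores = {alt: cosine(Counter(tokenize(alt)), pooled) for alt in alternatives}
        return max(scores, key=scores.get), scores

In practice the retrieved definitions would come from web search results for the term being defined, and richer weighting (e.g. tf-idf) could replace the raw counts used here.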
The goal of this research is to create an automated method to both answer and
assess the difficulty of multiple choice inverse definition questions in the domain of
introductory biology. The results of this work suggest that human-authored question
banks provide useful data for building gold standard human performance models. The
methodology for building these performance models has value in other domains that
test the difficulty of questions and the quality of the exam takers.
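For the difficulty and discrimination analysis, a standard classical-test-theory sketch is given below. It computes, from incomplete crowdsourced response data, each question's difficulty (proportion correct among attempts) and an upper-lower discrimination index; this is one conventional approach, offered as an assumption about the kind of computation involved rather than the thesis's own two methods, and the names (responses, difficulty_and_discrimination) are hypothetical.

    def difficulty_and_discrimination(responses, top_fraction=0.27):
        # responses: dict mapping student id -> dict of question id -> 1/0 (correct/incorrect).
        # Questions a student never attempted are simply absent (incomplete data).
        means = {s: sum(qs.values()) / len(qs) for s, qs in responses.items() if qs}
        ranked = sorted(means, key=means.get)          # students ordered by mean score
        k = max(1, int(len(ranked) * top_fraction))
        low, high = set(ranked[:k]), set(ranked[-k:])  # lower and upper scoring groups

        questions = {q for qs in responses.values() for q in qs}
        stats = {}
        for q in questions:
            attempts = {s: qs[q] for s, qs in responses.items() if q in qs}
            p = sum(attempts.values()) / len(attempts)          # difficulty (p-value)
            p_high = [v for s, v in attempts.items() if s in high]
            p_low = [v for s, v in attempts.items() if s in low]
            disc = ((sum(p_high) / len(p_high)) - (sum(p_low) / len(p_low))
                    if p_high and p_low else None)              # upper-lower discrimination
            stats[q] = {"difficulty": p, "discrimination": disc}
        return stats

A high difficulty value means most students who attempted the question answered it correctly; a discrimination index near zero or negative flags questions that fail to separate stronger from weaker students.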
en
dc.identifier.uri
http://hdl.handle.net/1842/20986
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. Two methods for measuring question difficulty and discrimination in incomplete crowdsourced data. In Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (HCOMP-13), Palm Springs, CA, USA, 2013.
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. An analysis of question quality and user performance in crowdsourced exams. In Proceedings of the 2013 Workshop on Data-Driven User Behavioral Modelling and Mining from Social Media (DUBMOD at CIKM 2013), San Francisco, CA, USA, October 28, 2013, pages 29-32.
en
dc.relation.hasversion
Sarah Luger. A graph theory approach for generating multiple choice exams. In 2011 AAAI Fall Symposium on Question Generation, 2011.
en
dc.subject
Item Response Theory
en
dc.subject
IRT
en
dc.subject
Multiple Choice Questions
en
dc.subject
MCQs
en
dc.subject
question answering
en
dc.subject
question difficulty
en
dc.subject
question discrimination
en
dc.title
Algorithms for assessing the quality and difficulty of multiple choice exam questions
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
- Name: Luger2016.pdf
- Size: 3.55 MB
- Format: Adobe Portable Document Format