Edinburgh Research Archive

Algorithms for assessing the quality and difficulty of multiple choice exam questions

dc.contributor.author
Luger, Sarah Kaitlin Kelly
en
dc.date.accessioned
2017-03-15T11:09:35Z
dc.date.available
2017-03-15T11:09:35Z
dc.date.issued
2016-06-27
dc.description.abstract
Multiple Choice Questions (MCQs) have long been the backbone of standardized testing in academia and industry. Correspondingly, there is a constant need for the authors of MCQs to write and refine new questions for new versions of standardized tests, as well as to support measuring performance in the emerging massive open online courses (MOOCs). Research that explores what makes a question difficult, or which questions distinguish higher-performing students from lower-performing students, can aid in the creation of the next generation of teaching and evaluation tools. In the automated MCQ answering component of this thesis, algorithms query for definitions of scientific terms, process the returned web results, and compare the returned definitions to the original definition in the MCQ. This automated method for answering questions is then augmented with a model, based on human performance data from crowdsourced question sets, for analysis of question difficulty as well as the discrimination power of the non-answer alternatives. The crowdsourced question sets come from PeerWise, an open source online college-level question authoring and answering environment. The goal of this research is to create an automated method to both answer and assess the difficulty of multiple choice inverse definition questions in the domain of introductory biology. The results of this work suggest that human-authored question banks provide useful data for building gold standard human performance models. The methodology for building these performance models has value in other domains that test the difficulty of questions and the quality of the exam takers.
en
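The abstract describes measuring question difficulty and the discrimination power of answer alternatives from crowdsourced human performance data. A minimal sketch of the classical test theory versions of these measures is shown below: difficulty as the proportion of examinees answering correctly, and discrimination as the point-biserial correlation between an item score and the rest-score. All names and data here are illustrative assumptions, not the thesis's actual implementation.

```python
# Classical test theory sketch: item difficulty and discrimination
# from a binary response matrix (rows = examinees, cols = items).
# Illustrative only; the data and function names are invented.

def pointbiserial(x, y):
    """Pearson correlation between a binary item score x and rest-score y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: correlation undefined, report 0
    return cov / (vx * vy) ** 0.5

def item_stats(responses):
    """Return a (difficulty, discrimination) pair per item.

    difficulty     = proportion of examinees answering correctly
    discrimination = point-biserial correlation between the item score
                     and the rest-score (total score minus this item)
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(k):
        col = [row[j] for row in responses]
        p = sum(col) / n  # higher p means an easier item
        rest = [t - c for t, c in zip(totals, col)]
        stats.append((p, pointbiserial(col, rest)))
    return stats

# Toy data: 4 examinees x 3 items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
for j, (p, r) in enumerate(item_stats(responses)):
    print(f"item {j}: difficulty={p:.2f} discrimination={r:.2f}")
```

A high discrimination value indicates that stronger examinees (by rest-score) tend to answer the item correctly, which is the property the thesis's crowdsourced performance models aim to capture for incomplete answer data.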
dc.identifier.uri
http://hdl.handle.net/1842/20986
dc.language.iso
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. Two methods for measuring question difficulty and discrimination in incomplete crowdsourced data. In Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (HCOMP-13), Palm Springs, CA, USA, 2013.
en
dc.relation.hasversion
Sarah K. K. Luger and Jeff Bowles. An analysis of question quality and user performance in crowdsourced exams. In Proceedings of the 2013 Workshop on Data-Driven User Behavioral Modelling and Mining from Social Media (DUBMOD at CIKM 2013), San Francisco, CA, USA, October 28, 2013, pages 29-32.
en
dc.relation.hasversion
Sarah Luger. A graph theory approach for generating multiple choice exams. In Proceedings of the 2011 AAAI Fall Symposium on Question Generation, 2011.
en
dc.subject
Item Response Theory
en
dc.subject
IRT
en
dc.subject
Multiple Choice Questions
en
dc.subject
MCQs
en
dc.subject
question answering
en
dc.subject
question difficulty
en
dc.subject
question discrimination
en
dc.title
Algorithms for assessing the quality and difficulty of multiple choice exam questions
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Name:
Luger2016.pdf
Size:
3.55 MB
Format:
Adobe Portable Document Format
