7 Guidelines for Building Exam Items

Dec 5, 2022

You've been assigned to build an exam for your organization. You've spent the last two weeks worrying about how you'll handle it and putting off the necessary work, but now you're ready to start the test development process. Where do you even begin? Why does this exam need to exist? Which item types are the most appropriate for your test? Who is your target audience, and how do you determine that?

Determining Your Purpose for Testing: Why and Who

First things first. Before creating your test, you need to 1) determine why you are testing your candidates, and 2) figure out who exactly will be taking your exam. Defining the purpose of your exam is the first vital step of the development process. You do not want to test just to test; you want to scope out the “why” of your exam: why this exam is important to your organization, and what you are trying to achieve by having your test takers sit for it. You can narrow down your purpose for testing by asking yourself a few questions:

Does your organization want to measure what students learned by the end of a course?
Are you looking to assess whether a job applicant has the necessary knowledge to perform the role?
Are candidates trying to obtain certification within a certain field?

The Benefits of Identifying Your Exam's Purpose

Knowing the purpose of your exam will help you plan how best to set it up: which exam type to use, which item types will best measure the skills of your candidates, and so on. It will also help you identify your test audience. Whether they are students still in school, individuals looking to qualify for a position, or experts seeking certification in a certain product or field, it’s important to make sure your exam is actually testing at the appropriate level. Your exam will not be valid if your items are too easy or too hard, so keeping the minimally qualified candidate (MQC) in mind during every step of the exam development process will help you capture valid test results overall.

What Is the MQC?

MQC is the acronym for “minimally qualified candidate.” The MQC is a conceptualization of the assessment candidate who possesses the minimum knowledge, skills, experience, and competence to just meet the expectations of a credentialed individual. If the credential is entry level, the expectations of the MQC will be lower than if the credential is designated at an intermediate or expert level. Think of an ability continuum that goes from low ability to high ability. Somewhere along that continuum, a cut point will be set. Candidates who score below that cut point are not qualified and will fail the test. Candidates who score at or above that cut point are qualified and will pass. The minimally qualified candidate, though, should just barely make the cut. It’s important to focus on the word “qualified,” because even though this candidate will likely gain more expertise over time, they are already deemed to have the requisite knowledge and abilities to perform the job.

Factors to Consider when Constructing Your Test

You’ve determined the purpose of your exam and identified the audience. Now it’s time to decide on the exam type and the item types that will most appropriately measure the skills of your test takers. The type of exam you choose depends on what you are trying to test and the kind of tool you are using to deliver your exam (note that you should always make sure the software you use to develop and deliver your exam is thoroughly vetted). The type of items you choose depends on your measurement goals and what you are trying to assess. It is essential to take all of this into consideration before moving forward with development. Let’s take a look at some common exam types for you to consider.

Common Exam Types

Fixed-Form Exam

Fixed-form delivery is a method of testing where every test taker assigned to a given form receives the same items. An organization can have more than one fixed form in rotation, with the same items presented in a different order on each live form. Alternatively, forms can be built from a larger item bank, each published with its own fixed set of items and equated so that the forms are comparable in difficulty and content coverage.

Computer Adaptive Testing (CAT)

A CAT exam is a test that adapts to the candidate's ability in real time by selecting different questions from the bank in order to provide a more accurate measurement of their ability level on a common scale. Every time a test taker answers an item, the computer re-estimates the test taker’s ability based on all the previous answers and the difficulty of those items. The computer then selects as the next item one that the test taker should have roughly a 50% chance of answering correctly.
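The adaptive loop described above can be sketched in a few lines. This is a minimal illustration only, assuming a Rasch (one-parameter) model, under which a test taker has a 50% chance on an item whose difficulty equals their ability. The article does not name a specific model, and the function names, the tiny item bank, and the clamping range are all hypothetical.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability that ability `theta` answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def update_ability(theta, responses, steps=20):
    """Re-estimate ability from ALL previous (difficulty, correct) pairs.

    Uses Newton-Raphson on the Rasch log-likelihood. The estimate is clamped
    to [-4, 4] (an arbitrary assumption) so that all-correct or all-incorrect
    response patterns do not run off to infinity.
    """
    for _ in range(steps):
        probs = [p_correct(theta, b) for b, _ in responses]
        gradient = sum(int(correct) - p for (_, correct), p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break
        theta = max(-4.0, min(4.0, theta + gradient / information))
    return theta

def next_item(theta, bank, used):
    """Pick the unused item whose predicted success probability is closest to 50%."""
    candidates = [item_id for item_id in bank if item_id not in used]
    return min(candidates, key=lambda item_id: abs(p_correct(theta, bank[item_id]) - 0.5))

# Hypothetical bank: item id -> difficulty on the same scale as ability.
bank = {"easy": -1.5, "medium": 0.0, "hard": 1.5, "very_hard": 2.5}

theta = 0.0                                   # starting ability estimate
first = next_item(theta, bank, used=set())    # closest to 50% at theta = 0
theta = update_ability(theta, [(bank[first], True)])  # test taker answers correctly
second = next_item(theta, bank, used={first})         # a harder item follows
```

Operational CAT programs typically add content balancing, item exposure control, and a stopping rule, none of which are shown here.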

Linear on the Fly Testing (LOFT)

A LOFT exam is a test where the items are drawn from an item bank and assembled at delivery time so that each person sees a different set of items. The difficulty of the overall test is controlled to be equal for all examinees. Because LOFT exams require large item banks, some programs use automated item generation (AIG) to create them.
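One simple way to picture LOFT assembly is to draw random forms and keep the first one whose average difficulty lands close to a target. The sketch below assumes every item already has a calibrated difficulty value; the bank, the tolerance, and the function name are hypothetical, and real programs usually also balance content areas.

```python
import random
from statistics import mean

def assemble_loft_form(bank, length, target_difficulty, tolerance, rng, max_tries=10_000):
    """Draw random forms until one has a mean difficulty within `tolerance` of the target."""
    item_ids = sorted(bank)
    for _ in range(max_tries):
        form = rng.sample(item_ids, length)
        if abs(mean(bank[i] for i in form) - target_difficulty) <= tolerance:
            return form
    raise RuntimeError("No form met the target; widen the tolerance or enlarge the bank.")

# Hypothetical bank: 21 items with difficulties from -2.0 to 2.0.
bank = {f"item{n:02d}": round(-2.0 + 0.2 * n, 1) for n in range(21)}

# Two test takers get separately assembled forms of comparable difficulty.
form_a = assemble_loft_form(bank, 5, 0.0, 0.2, random.Random(1))
form_b = assemble_loft_form(bank, 5, 0.0, 0.2, random.Random(2))
```

Rejection sampling like this is easy to reason about but wasteful; it is shown only to make the idea of controlled overall difficulty concrete.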

The above three exam types can be used with any standard item type. Before moving on, however, there is another more innovative exam type to consider if your delivery method allows for it:

Performance-Based Testing

A performance-based assessment measures the test taker's ability to apply skills and knowledge gained through study, research, and experience, rather than simply recall them. For example, a test taker in a medical field may be asked to draw blood from a patient to show they can competently perform the task. Or a test taker wanting to become a chef may be asked to prepare a specific dish to ensure they can execute it properly.

Common Item Types

There are many different item types to choose from. While utilizing more item types on your exam won’t ensure you have more valid test results, it’s important to know what’s available in order to decide on the best item format for your program. Here are a few of the most common items to consider when constructing your test:

Multiple-Choice

A multiple-choice item is a question where a candidate is asked to select the correct response from a choice of four (or more) options.

Multiple Response

A multiple response item is an item where a candidate is asked to select more than one response from a pool of options (e.g., “choose two” or “choose three”).

Short Answer

Short answer items ask a test taker to synthesize, analyze, and evaluate information, and then to present it coherently in written form.

Matching

A matching item requires test takers to connect a definition/description/scenario to its associated correct keyword or response.

Build List

A build list item challenges a candidate’s ability to identify and order the steps/tasks needed to perform a process or procedure.

Discrete Option Multiple Choice™ (DOMC)

DOMC is known as the “multiple-choice item makeover.” Instead of showing all the answer options at once, DOMC presents the options one at a time in random order. For each option, the test taker chooses “yes” or “no.” As soon as the responses show that the question has been answered correctly or incorrectly, the next question is presented. DOMC has been used by award-winning testing programs to prevent cheating and test theft. You can learn more about the DOMC item type in this white paper.
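To make the one-option-at-a-time flow concrete, here is a rough sketch of how such an item might be administered. The stopping rules are an assumption, since the text above does not spell them out: a “yes” on the correct option ends the item as correct, while a “yes” on an incorrect option or a “no” on the correct option ends it as incorrect. The function name and the sample item are hypothetical and do not represent any vendor's implementation.

```python
import random

def administer_domc(options, key, answer_fn, rng):
    """Show options one at a time in random order and stop once the item is decided.

    `answer_fn(option)` returns True for "yes" and False for "no".
    Returns (is_correct, options_shown).
    """
    if key not in options:
        raise ValueError("the key must be one of the options")
    order = list(options)
    rng.shuffle(order)
    shown = []
    for option in order:
        shown.append(option)
        says_yes = answer_fn(option)
        if says_yes != (option == key):
            return False, shown   # endorsed a distractor or rejected the key
        if says_yes:
            return True, shown    # endorsed the key
    raise AssertionError("unreachable: the key always decides the item")

# Hypothetical item: "What is an ingredient found in chocolate chip cookies?"
options = ["flour", "cornmeal", "gelatin", "yeast"]
knows_answer = lambda option: option == "flour"
correct, shown = administer_domc(options, "flour", knows_answer, random.Random(0))
```

Note that a test taker who answers correctly often never sees the remaining options, which is part of why this format limits how much item content is exposed.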

SmartItem

A self-protecting item, otherwise known as a SmartItem, employs a proprietary technology resistant to cheating and theft. A SmartItem contains multiple variations, all of which work together to cover an entire learning objective completely. Each time the item is administered, the computer generates a random variation. SmartItem technology has numerous benefits, including curbing item development costs and mitigating the effects of testwiseness. You can learn more about the SmartItem in this infographic and this white paper.

What Are the General Guidelines for Constructing Test Items?

Regardless of the exam type and item types you choose, focusing on some best practice guidelines can set up your exam for success in the long run. There are many guidelines for creating tests, but this list sticks to the most important points. Little things can really make a difference when developing a valid and reliable exam, so be sure to follow along!

Institute Fairness

Although you want to ensure that your items are difficult enough that not everyone gets them correct, you never want to trick your test takers! Keeping your wording clear and making sure your questions are direct and not ambiguous is very important. For example, asking a question such as “What is the most important ingredient to include when baking chocolate chip cookies?” does not set your test taker up for success. One person may argue that sugar is the most important, while another test taker may say that the chocolate chips are the most necessary ingredient. A better way to ask this question would be “What is an ingredient found in chocolate chip cookies?” or “Place the following steps in the proper order when baking chocolate chip cookies.”

Stick to the Topic at Hand

When creating your items, ensuring that each item aligns with the objective being tested is very important. If the objective asks the test taker to identify genres of music from the 1990s, and your item is asking the test taker to identify different wind instruments, your item is not aligning with the objective.

Ensure Item Relevancy

Your items should be relevant to the task that you are trying to test. Coming up with ideas to write on can be difficult, but avoid asking your test takers to identify trivial facts about your objective just to find something to write about. If your objective asks the test taker to know the main female characters in the popular TV show Friends, asking the test taker what color Rachel’s skirt was in episode 3 is not an essential fact that anyone would need to recall to fully understand the objective.

Gauge Item Difficulty

As discussed above, remembering your audience when writing your test items can make or break your exam. To put it into perspective, if you are writing a math exam for a fourth-grade class, but you write all of your items on advanced trigonometry, you have clearly missed the appropriate difficulty level for your test takers.
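Once an item has been piloted, one common way to check its difficulty is the proportion of test takers who answered it correctly. The sketch below is illustrative only: the thresholds of 0.2 and 0.9 are hypothetical placeholders, not recommendations, and should be set for your own program and audience.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

def flag_item(responses, too_hard=0.2, too_easy=0.9):
    """Label an item using hypothetical difficulty thresholds."""
    p = item_difficulty(responses)
    if p < too_hard:
        return "too hard"
    if p > too_easy:
        return "too easy"
    return "ok"

# Hypothetical pilot data for three items, 20 test takers each.
balanced_item = [1] * 12 + [0] * 8      # 60% correct
trig_for_fourth_graders = [1] + [0] * 19  # 5% correct
giveaway_item = [1] * 19 + [0]          # 95% correct
```

A flag is a prompt for human review rather than a verdict; an easy item may still be worth keeping if it covers an essential objective.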

Inspect Your Options

When writing your options, keep these points in mind:

Always make sure your correct option is 100% correct, and your incorrect options are 100% incorrect. Partially correct or partially incorrect options will confuse your candidates and could keep a truly qualified candidate from answering the item correctly.
Make sure your distractors are plausible. If your correct response logically answers the question being asked, but your distractors are made up or even silly, it will be very easy for any test taker to figure out which option is correct. Thus, your exam will not properly discriminate between qualified and unqualified candidates.
Try to make your options parallel to one another. Ensuring that your options are all worded similarly and are approximately the same length will keep one from standing out from another, which helps reduce the effects of testwiseness.
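Some of these checks can be partly automated before human review. The sketch below covers only the mechanical parts: duplicate options, a key that appears exactly once, and options of roughly similar length. Whether an option is truly correct or a distractor is plausible still requires a subject matter expert. The function name and the 1.5 length ratio are hypothetical.

```python
def check_options(options, key, max_length_ratio=1.5):
    """Return a list of warnings for one multiple-choice item."""
    warnings = []
    normalized = [option.strip().lower() for option in options]

    if len(set(normalized)) != len(normalized):
        warnings.append("duplicate options")
    if normalized.count(key.strip().lower()) != 1:
        warnings.append("key must appear exactly once")

    lengths = [len(text) for text in normalized]
    if min(lengths) == 0:
        warnings.append("empty option")
    elif max(lengths) / min(lengths) > max_length_ratio:
        warnings.append("option lengths are not parallel")
    return warnings

# Hypothetical items for illustration.
parallel = check_options(["Flour", "Salt", "Eggs", "Oats"], key="Flour")
lopsided = check_options(
    ["Flour", "A long explanation that stands out from the rest", "Salt", "Eggs"],
    key="Flour",
)
```

Here `parallel` comes back empty, while `lopsided` carries a warning that one option stands out by length.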

Conclusion

Constructing test items, let alone entire examinations, is no easy undertaking. This article helps you identify your specific purpose for testing and introduces the most common exam and item types you can use to measure the skills of your test takers. We’ve gone over general best practices to consider when constructing items, and we’ve sprinkled helpful resources throughout to help you on your exam development journey.

Posted Dec 5, 2022 in Exam Science