In the world of high-stakes assessment, testing organizations, test takers, and other stakeholders count on assessments being supported by a rigorous process that ensures they are valid, reliable, and fair. When an assessment is perceived as unfair or unsupported, legal challenges may arise.
Most importantly, we make every effort to protect test takers from unfairness. We want every test taker to be treated equitably and without bias, with the same opportunities as everyone else. We also work to support organizations in the event of a legal challenge. Some groups are legally protected, so equity isn’t just the right thing to do; it’s the law. And it’s not enough to take steps towards validity and fairness. Those steps must be demonstrated in appropriate documentation in the event of a challenge.
Historically, there have been a variety of grounds for challenges to testing, often sparked by subgroup differences in test scores, complaints regarding access to testing and accommodations, and privacy concerns. For the purposes of this blog, we’re going to pay special attention to the issue of subgroup score differences and to ensuring fairness across demographic groups defined by age, gender, race, and ethnicity.
In our second blog on legal defensibility, we will discuss test fairness and defensibility considerations related to test administration, and multi-modal testing in particular. In this blog we will look earlier in the process, at test development, and at how to produce science-based, legally defensible content that is fair to all test takers.
A rigorous process
For high-stakes assessments, such as certification, licensing, qualifications, and admissions, it is important to take steps to ensure validity, legitimacy, and fairness, and you must be able to provide evidence that you have done so. This starts with a rigorous and objectively controlled method, free from bias, to define the characteristics that are important for an occupation or educational curriculum and that should be tested. The result is a thorough process that builds a valid measurement of the knowledge, skills, abilities, and other relevant characteristics (KSAOs) needed to perform a role or achieve a certain standard.
Test development fundamentals
Rigorous test development requires that industry best practices be followed, and we work with our clients at every stage to ensure all test items and forms will stand up to scrutiny if they are challenged legally on grounds of unfairness. Here are some of the fundamental steps you should take to ensure validity and fairness and to put yourself in the best possible position to defend against a legal challenge.
- Job analysis – work with a diverse mix of Subject Matter Experts (SMEs) from your industry or discipline to objectively define exactly what is important to perform the role or achieve the required standard. Ideally, SMEs will represent various demographic groups.
- Measurement – use the job analysis as a foundation for building test content specifications (the test blueprint). These should define the domains to be measured and the level of complexity required.
- Item development – work with a demographically diverse team of SMEs to generate test items that comport with the test blueprint.
- Psychometric evaluation – trial your items with a broad group of test takers. Use psychometric modelling and statistics to calibrate the difficulty and reliability of questions and construct a test form (or multiple equivalent test forms) in accordance with the test blueprint. When feasible, this may include a differential item functioning (DIF) analysis to identify and remove potentially biased items.
- Fairness analysis – conduct a rigorous cultural fairness review that scrutinizes test item content for any potentially biased language or stereotypes. This will help to ensure items are perceived as fair across diverse populations.
- Set a standard – determine a passing score, based on the level of competency required to meet legal / industry standards and ensure public safety.
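To make the DIF analysis mentioned under psychometric evaluation more concrete, here is a minimal sketch of one widely used approach, the Mantel-Haenszel procedure. It matches test takers on total score, compares the odds of answering a single item correctly between a reference group and a focal group, and reports the common odds ratio together with the ETS delta value (−2.35 times its natural log). The function name, the record layout, and the `"ref"` / `"focal"` labels are assumptions made for illustration, not a description of any particular tool or of how PSI runs its analyses.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Mantel-Haenszel DIF statistics for one item (illustrative sketch).

    records: iterable of (group, total_score, correct) tuples, where group is
    "ref" or "focal" (hypothetical labels), total_score is the matching
    variable, and correct is True/False for the item under study.
    Returns (alpha_mh, delta_mh), or None if the odds ratio is undefined.
    """
    # One 2x2 table per total-score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, total_score, correct in records:
        offset = 0 if group == "ref" else 2
        tables[total_score][offset + (0 if correct else 1)] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if numerator == 0 or denominator == 0:
        return None  # not enough variation to estimate the odds ratio

    alpha_mh = numerator / denominator      # common odds ratio across strata
    delta_mh = -2.35 * math.log(alpha_mh)   # ETS delta scale; 0 means no DIF
    return alpha_mh, delta_mh
```

An odds ratio near 1 (delta near 0) suggests the item behaves similarly for matched test takers in both groups. A negative delta means the item is harder for the focal group than for matched members of the reference group. A flagged item is a prompt for expert review, not proof of bias.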
It’s important to involve a demographically diverse range of individuals at every stage of the test development process, one that is representative of your test takers and the wider population when it comes to age, gender, race, and ethnicity. Diversity should be reflected in the SMEs who undertake your job analysis and standard setting, as well as in the experts who write and review your test items and the test takers involved in trialing and reviewing your tests.
By going through these steps, bringing in a diverse range of people, and openly sharing the rigorous process involved, you will also reassure test takers that the test they are taking is fair. This makes it far less likely they will challenge a result on the grounds of unfairness.
The validity of an assessment is demonstrated by evidence that it appropriately measures what it was designed to measure. Following the above test development steps will help assure the validity and fairness of the assessment. But of course, every industry and sector changes with time. New technology and innovations are introduced, and occupations and curricula change, so job analyses must be updated to determine whether the tests and passing standards need updating too. Your tests need to keep pace with this change. Regularly scrutinize your test content to make sure it stays relevant to your industry and to all your test takers.
As with test development, ongoing evaluation should also be viewed through the lens of fairness. Ask, too, whether there have been changes to legislation or standards that need to be reflected in your test content. Review, revise, or reject poorly performing questions and, if necessary, write new items. The performance of your test items and test forms should be analysed on a regular basis to ensure they are functioning as intended and remain legally defensible.
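As a rough illustration of what routine item monitoring can look like, the sketch below computes two classical statistics for each item from scored responses: difficulty (the proportion of test takers answering correctly) and discrimination (the correlation between the item score and the score on the rest of the test). The function names and the 0/1 response-matrix layout are assumptions for this example, and the statistics that count as "poorly performing" are a judgment each program sets for itself.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns None if either variable has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def item_statistics(responses):
    """Classical item analysis (illustrative sketch).

    responses: list of rows, one per test taker, each a list of 0/1 item scores.
    Returns one dict per item with its difficulty and discrimination.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Correlate with the rest score so the item is not correlated with itself
        rest = [total - x for total, x in zip(totals, item)]
        stats.append({
            "item": j,
            "difficulty": sum(item) / len(item),
            "discrimination": pearson(item, rest),
        })
    return stats
```

Items that almost everyone answers correctly or incorrectly, or whose discrimination is near zero or negative, are natural candidates for the review, revise, or reject decision described above.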
To be clear, a legal challenge is a low-frequency event, and no test developed by PSI, when used as recommended by PSI, has ever been successfully challenged in court. But a legal challenge can be costly, especially if it doesn’t go your way, and not just financially: the credibility of your program is at stake.
The team here at PSI brings expertise in the many competencies required to develop valid and fair assessments and to assure the defensibility of your testing programs and your organization. We bring together science, technology, and deep expertise to develop and deliver ironclad defensibility in the most demanding, high-stakes testing applications.