There are many hot topics in current conversations about the future of test development. But what do people in the testing industry really think about topics such as Artificial Intelligence (AI) and Diversity, Equity & Inclusion (DE&I) in testing? Are we fully on board the AI train, forging ahead at full steam? Or are some adopting a more cautious approach? Does every testing organization know the measures needed to support DE&I in testing? Or are some still unclear about best practice?
We conducted a survey of assessment industry professionals to find out where the testing industry stands on these important issues – and more. In our last blog, we shared the survey results on trends in test delivery. This blog focuses on the topic of test development. Our research offers insights into where we are now, as well as what we can do to foster innovation in test development while ensuring tests remain valid, reliable and fair.
AI in test item creation
Most respondents to our survey are supportive of the use of AI generative tools for test item creation. This support is particularly pronounced in organizations with testing volumes over 50,000 per year, where 86% support the use of AI for item creation, compared to 55% in organizations with testing volumes under 1,000.
It’s no surprise that higher-volume testing organizations, whose programs may need larger item banks to maintain test security, are interested in technologies for more efficient item generation. However, organizations of all sizes are still unsure about the quality of test items created using AI generative tools. While more than a quarter of respondents (26%) anticipate that AI will improve the quality of tests, more than half (52%) are not sure, and 14% believe it will decrease quality.
Pros and cons of AI in test development
Respondents see efficiency and speed of test item creation as the top benefit of AI-generated content, followed by the quantity of items produced. AI’s ability to generate content that may provide a good starting point for Subject Matter Experts (SMEs) was also recognized. On the negative side of the AI debate, the most common concerns are around quality or accuracy of content, security of content, and copyright / ownership of content.
It seems that while the testing industry is keen to explore new technologies for test development, it’s not a case of jumping in headfirst. There is understandable caution around the use of AI for test content development, and more needs to be done to provide assurances about quality and security where AI is involved. The oversight of human SMEs is still essential during test item creation to maintain high standards.
Test development technologies
Our survey also explored other testing technologies, and the picture was similarly varied across testing organizations of different sizes. Nearly a third (29%) of those with testing volumes over 50,000 have already successfully explored innovations such as automated item generation, compared to 12% of those with testing volumes under 1,000.
Overall, 14% of respondents from organizations of all sizes report a successful experience with innovations in testing technology and 17% are still evaluating the results. This shows that new technologies are changing test development processes and the way we work, but also that initial trials and adoption are still in the early stages.
High-volume testing organizations are leading the way and testing the waters, addressing their concerns prior to wholesale adoption. And this reflects our approach at PSI. We are keen to embrace new technologies for the benefit of our clients, while balancing innovation with the rigor required to maintain test integrity and security.
Innovative item types and test formats
Another evolving area we explored in our survey was new item types and test formats that testing organizations might be considering in the next 12 months. Simulations (26%) and situational judgement tests (23%) were the most popular options, with open-ended response types coming in third (20%).
The variety of item types available to us is rapidly growing beyond the more traditional multiple-choice questions. Different item types increase our ability to evaluate the knowledge and skills of our test takers, leading to more valid, reliable, and fair testing. However, as with the use of AI in item creation, it’s important that a high degree of rigor is still applied when creating new item types. For example, this means updating your SME training on authoring new item types, and conducting sufficient pre-testing and post-test analysis to ensure items are of high quality. Additionally, it’s important to note that the knowledge or skill to be assessed should determine the appropriate item types to use.
DE&I and test development
DE&I is important across the whole assessment lifecycle, and DE&I considerations are essential during test development to ensure all content is fair and unbiased. Our survey showed that regardless of organization size, there is a consistent investment in DE&I training for item writers and developers. This includes bias awareness training (35%) as well as diversity & inclusion training (33%).
Testing organizations are also mindful of changes in the test taker population that might affect the need for different item and test formats. This is reflected in the provision of more inclusive and diverse test formats, as well as adapting test formats for those with disabilities and impairments.
Simplifying and tailoring test formats for non-native language speakers and specific populations are also common measures taken to meet the needs of different test takers. Astute organizations are seeing that supporting DE&I is not only the right thing to do, but also allows them to reach a larger pool of test takers and grow their programs.
Read our guide: Diversity, equity & inclusion across the assessment lifecycle.
The future of test development
Our Future of Testing research shows a keen interest within the testing industry in exploring the use of AI and other technologies in the test development process. Equally, DE&I continues to be a focus in test development, with testing organizations using a variety of methods to ensure they are serving the needs of different populations.
The research also shows that there is some caution in the testing industry, around AI in particular. We are all working in a period of intense and rapid change, and while it’s important that we leverage the new technologies available to us, we need to continue working together to navigate the way forward: to ask the right questions, explore the potential, and proceed appropriately.