
Flexible test development: Supporting your AI journey at your pace

Frank Williams, Director of Psychometrics

December 16, 2025

Testing programs are under a lot of pressure. You’re expected to maintain exam integrity, keep content current, respond to security threats, and operate efficiently, often with limited budgets and limited subject matter expert (SME) availability. At the same time, AI is everywhere. Boards are asking about it. Conference agendas are full of it. And naturally, many programs are wondering: Should we be using AI in our test development process?

My view, and PSI’s view, is straightforward. When AI is the right fit, it can be a powerful tool in test development. But no testing organization should feel pushed into using it before they’re ready. Flexible test development means meeting your program where it is today and supporting you with the tools that make sense for your needs, culture, and resources.

There’s no such thing as a ‘standard credentialing exam’

When I think about flexibility, I start with the exam itself. Every program is different. Not just by industry, but by the purpose of the test, the characteristics of the candidate population, and the history of the program. Even within a single client, multiple programs may require variations in process or structure.

At PSI, we do have clear quality standards. We care about fairness, accuracy, currency, and solid psychometric principles. But those standards are intentionally broad. They guide the test development process without dictating the final shape of your exam.

I often say in client meetings, “We aren’t the SMEs.” My job isn’t to tell you what your exam must look like. My job is to help your SMEs apply strong item development practices so the exam we build together is defensible, valid, and aligned with your goals.

Even style guides, which often look similar across a client’s programs, don’t have to be identical. If a specific subgroup needs something different, that’s perfectly fine. Exam design should always reflect the needs of the program, not a rigid template.

This foundation of flexibility is important when we start talking about AI. It’s not a replacement for your current process; it’s an option that can support it.

Learn how PSI supported ACRP to complete four job analyses and update exam forms on schedule.

Where AI fits: Enhancing test development, not replacing expertise

Anyone who has worked through a major item-writing cycle knows how challenging it is. Good item writing requires deep content expertise, familiarity with the existing item bank, and an understanding of typical candidate errors. It’s time-consuming and it demands nuance.

AI can save you time. It is very good at the formulaic part of item writing. It can generate effective stems quickly and consistently, and for many programs, that alone is a game-changer. When you struggle with SME participation, as many programs do, AI-generated draft items can dramatically reduce the burden. Instead of asking SMEs to produce items from scratch, we can ask them to review and refine draft items that already meet baseline quality criteria.

But what AI can’t do is provide the nuance required in the test development process. It doesn’t replace human insight; far from it. That’s where your SMEs are still essential.

Discover how to maximize your SMEs’ time and impact in test development.

Same test development processes, different SME focus

We follow the same rigorous development process whether we use AI or not – item review and validation, then pretesting to ensure quality. The difference is that AI can take on the heavy lift at the beginning, leaving the humans to focus on the stages where expertise matters most.

One area where human judgment is still essential is distractor quality. AI can usually produce a solid key, but the plausibility of distractors is where context, nuance, and real-world experience play a major role. Understanding what typical candidates get wrong, and why, is something humans still do best.

AI gives us speed and scale. Humans ensure relevance and validity. Used together, you get the best of both worlds.

Is AI right for your assessment program right now?

AI isn’t a universal solution. There are practical and cultural factors that determine whether it’s a good fit. Here are some questions that can help guide your decision:

How many items do you need?
Some clients have extremely healthy item banks. They’ve recently completed bank clean-ups, their approved items are in good shape, and there’s no immediate pressure to generate more. In those cases, adopting AI may not offer meaningful value right now.

On the other hand, many programs do need significantly more items, especially given today’s security landscape. Security breaches are no longer an ‘if’; they’re a ‘when’. Programs need item banks with enough depth to rotate forms quickly and respond to security events without compromising exam quality. AI can make it far easier to build that capacity at pace.

Are SME resources tight?
If you consistently struggle to get enough SMEs for item-writing workshops, AI can help you rebalance the workload. It lets SMEs spend more time reviewing and less time generating content from scratch.

Read tips to recruit and retain a representative group of SMEs in test development.

Do you have the resources to adopt a new tool?
Like any new capability, AI comes with what I think of as ‘start-up costs’. These include project management, data integration, training, and change management. Some organizations operate with lean resources and cannot take on new infrastructure right away.

Is your organization culturally ready?
This factor doesn’t get talked about enough but may be the deciding one. You might be a textbook candidate for AI-driven item writing: high-volume needs, limited SME time, and strong psychometric infrastructure. But if your board or stakeholders aren’t comfortable yet, their hesitation matters. Cultural readiness involves more than policies; it’s about familiarity and trust.

One way to build that familiarity is to encourage low-risk, everyday use of AI tools for simple tasks like summarizing documents or drafting communications. If people get comfortable with AI in small ways, the leap into exam development feels far less daunting.

What are stakeholders concerned about, and can you reassure them?
A common concern is data security. Many boards worry that feeding information into an AI system means it could leak onto the wider internet. That fear doesn’t come from a lack of AI literacy; it comes from a real and reasonable awareness of risk.

The important point is that purpose-built AI tools can operate in secure, closed environments that keep your content entirely within controlled systems. Part of my role is helping organizations understand how that works and why it’s safe.

Learn about building secure AI test development workflows.

Flexible partnership: Adopting AI when you’re ready

I work with clients who are eager to embrace new tools, and others who need time. Some want to see a pilot first. Some want to bring SMEs along gradually. Some want to wait entirely, and that’s okay. We never prescribe one way forward. Instead, we explore:

• What you’re ready for today.
• What your stakeholders are comfortable with.
• Where your program needs to go in the long term.
• What pace of change is realistic for your culture.

When new clients join PSI, we never impose the ‘PSI way’ on them. We look at what they’ve been doing, what they like about their current process, and how much change they’re prepared to make.

The same approach applies to AI. Some clients say, “Take us full speed ahead.” Others prefer to move gradually. Some want to keep everything human-led for now. And all those choices are valid. To me, true partnership looks like ongoing planning conversations, listening closely to each other’s needs and constraints, and finding the middle ground that lets us move forward without unnecessary disruption.

As I often tell my team, partnership isn’t about perfection. It’s about mutual respect, shared goals, and responsiveness.

Talk to us about what’s right for your program

AI can transform parts of the test development process, but it should never be adopted under pressure or out of fear. You can still build a strong, resilient, and defensible exam without using AI. And if you do want to explore AI, you should be able to do it at a pace that makes sense for you.

If you’re considering your next steps – whether that’s exploring AI, strengthening your item bank, or simply refining your test development workflow – talk to our team about what’s right for your program. We’ll meet you where you are today and help you plan where you want to go next.
