Data forensics case studies from our team of experts 


Isabelle Gonthier, PhD, ICE-CCP


When people interact with test items and test forms in abnormal and fraudulent ways, this is often revealed in the test data. Data forensics enhances test security by investigating the data, giving us the ability to detect a wide variety of potentially suspicious patterns – from abnormally fast item response times to high item response similarities and unusually high test scores.  

These patterns in the data are then used to support investigations into suspected issues. And they can sometimes even reveal complex patterns of responses that aren’t observable by test proctors. On occasions where malpractice is confirmed, the outcomes of data forensics are invaluable in supporting disciplinary or further action against individuals or groups of test takers. 

This blog shares some stories from our team of data forensics experts where they used different statistical indices and analytics, often supported by web crawling, to detect anomalies which might indicate test misconduct. Read on to find out what our data forensics detectives uncovered and what happened next… 

Data forensics tool: High response similarity 

Discovery: As part of a client’s ongoing data forensics program, our routine scans revealed a growing number of test takers with high response similarity. This means that over time, more and more test takers were responding to specific test items in a suspiciously similar way.

Any change in testing data over time demands further investigation. But in this instance our team knew they needed to delve into the data, as a high response similarity can indicate proxy testing, where someone takes a test in place of the actual test taker. A high response similarity can also show us where multiple test takers have had access to the same stolen test content.
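To make the idea of response similarity concrete, here is a minimal sketch in Python. It assumes each test taker's answers are stored as a string of selected options, and it uses a raw match rate with an illustrative 0.9 threshold. The function names and data are hypothetical, and operational similarity indices typically also account for chance agreement and test taker ability, which this sketch does not.

```python
from itertools import combinations


def match_rate(a, b):
    """Fraction of items on which two test takers chose the same option."""
    if len(a) != len(b):
        raise ValueError("response strings must cover the same items")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def flag_similar_pairs(responses, threshold=0.9):
    """Return (id1, id2, rate) for every pair at or above the threshold."""
    flagged = []
    for (id1, r1), (id2, r2) in combinations(sorted(responses.items()), 2):
        rate = match_rate(r1, r2)
        if rate >= threshold:
            flagged.append((id1, id2, rate))
    return flagged


# Hypothetical data: one selected option (A-D) per item, ten items each.
responses = {
    "T1": "ABCDABCDAB",
    "T2": "ABCDABCDAC",  # differs from T1 on the last item only
    "T3": "DCBADCBADC",
}
flagged = flag_similar_pairs(responses)
print(flagged)
```

A flagged pair is a prompt for further investigation, not proof of misconduct.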

Outcome: A more detailed exploration of the data revealed that the response similarities were centered at a single test center outside the United States. On further investigation, the PSI global security team also found anomalous test scheduling patterns at that site: people from the United States were traveling to another continent to test there. As a result, the site was suspended pending further investigations.

Data forensics tools: High response similarity and response time irregularity 

Discovery: Another PSI data forensics client also experienced rapidly emerging patterns of high response similarity. In addition, their data showed some response time irregularities, where test takers took either a suspiciously short or, sometimes, a suspiciously long time to respond to individual items. When these patterns emerge in the data, they raise a red flag, as we know they can indicate stolen test content.

Using the changes in average item difficulties and item response time, we identified a list of items that were likely stolen and leaked. This list contained the items where an unexpected number of test takers were getting the item correct, as well as items where test takers responded in an unusually fast time.  
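The comparison described above can be sketched as follows. This is an illustrative Python example, not the analysis used in the case study: it assumes response records of the form (item, correct, seconds), compares a baseline window with a recent window, and flags items whose proportion correct rose sharply or whose median response time dropped sharply. The thresholds and names are hypothetical.

```python
from statistics import median


def item_stats(records):
    """Map each item to (proportion correct, median response seconds)."""
    by_item = {}
    for item_id, correct, seconds in records:
        by_item.setdefault(item_id, []).append((correct, seconds))
    return {
        item: (sum(c for c, _ in rows) / len(rows),
               median(s for _, s in rows))
        for item, rows in by_item.items()
    }


def flag_drifted_items(baseline, recent, p_rise=0.2, time_ratio=0.5):
    """Flag items that became much easier or much faster to answer."""
    base, now = item_stats(baseline), item_stats(recent)
    flagged = []
    for item in sorted(base.keys() & now.keys()):
        p0, t0 = base[item]
        p1, t1 = now[item]
        if p1 - p0 >= p_rise or t1 <= time_ratio * t0:
            flagged.append(item)
    return flagged


# Hypothetical records: (item_id, answered correctly, response seconds).
baseline = [("Q1", True, 60), ("Q1", False, 70), ("Q1", False, 80), ("Q1", True, 50),
            ("Q2", True, 40), ("Q2", False, 45), ("Q2", True, 50), ("Q2", False, 55)]
recent = [("Q1", True, 12), ("Q1", True, 15), ("Q1", True, 10), ("Q1", True, 20),
          ("Q2", True, 42), ("Q2", False, 50), ("Q2", False, 48), ("Q2", True, 44)]

drifted = flag_drifted_items(baseline, recent)
print(drifted)
```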

In data forensics, we often bring together information from different sources to support our investigations. This might be a tip-off that test content has been leaked, for example. In this case, the list of items detected by our routine scans also had a high degree of overlap with additional evidence that those items were leaked. 

Outcome: In this case, the difficulty decreased for many of the leaked items but increased for many others, because those items were leaked with a wrong answer. When estimating individual test takers’ approximate score gain from using stolen content, we found that many of them saw a net-zero effect, presumably because they had memorized a mix of correct and incorrect answers.
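One simple way to picture the net-zero effect is to compare what a test taker actually scored on the leaked items with what would be expected without the leak. The Python sketch below is an assumption-laden illustration: it takes each item's pre-leak proportion correct as the expected score and ignores individual ability. All names and values are hypothetical.

```python
def estimated_gain(responses, answer_key, baseline_p):
    """Observed minus expected number correct on the suspected-leaked items.

    baseline_p maps each leaked item to its pre-leak proportion correct,
    used here as a rough stand-in for the expected score on that item.
    """
    observed = sum(responses[item] == answer_key[item] for item in baseline_p)
    expected = sum(baseline_p.values())
    return observed - expected


# Hypothetical example: four leaked items, two of them leaked with a wrong answer.
answer_key = {"Q1": "A", "Q2": "C", "Q3": "B", "Q4": "D"}
leaked_answers = {"Q1": "A", "Q2": "C", "Q3": "D", "Q4": "A"}
baseline_p = {"Q1": 0.5, "Q2": 0.5, "Q3": 0.5, "Q4": 0.5}

# A test taker who memorized the leaked answers exactly.
gain = estimated_gain(leaked_answers, answer_key, baseline_p)
print(gain)
```

Here the two memorized correct answers are cancelled out by the two memorized wrong ones, so the estimated gain is zero.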

Data forensics tools: Leaked item analysis and item matching algorithm 

Discovery: In some cases, an investigation is triggered but our findings show there is no issue and no further action is required. In this instance, a PSI client received copies of handwritten notes containing test content. These notes were transcribed and analyzed by the PSI team and an industry Subject Matter Expert, working together to identify potentially compromised bank items.

The PSI item matching algorithm, which uses various matching techniques to find identical or similar items across different platforms, was used to identify additional items from the bank that had been compromised. Web crawling was then used to look for this potentially compromised test content online, but no items were found during the search.  
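The details of the PSI item matching algorithm are not described here, so the following Python sketch uses a stand-in technique: fuzzy string matching with the standard library's `difflib.SequenceMatcher`. It assumes transcribed notes and bank items are available as plain text. The item IDs, texts, and 0.8 threshold are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def match_items(transcribed, bank, threshold=0.8):
    """Return IDs of bank items whose text closely resembles the transcription."""
    target = normalize(transcribed)
    matches = []
    for item_id, item_text in sorted(bank.items()):
        ratio = SequenceMatcher(None, target, normalize(item_text)).ratio()
        if ratio >= threshold:
            matches.append(item_id)
    return matches


# Hypothetical item bank and a transcribed handwritten note.
bank = {
    "B1": "Which organ pumps blood through the body?",
    "B2": "What is the boiling point of water at sea level?",
}
note = "which organ pumps blood thru the body"

matched = match_items(note, bank)
print(matched)
```

Matches found this way are candidates for review by a Subject Matter Expert, since paraphrased items can fall below any fixed threshold.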

Outcome: While these investigations were underway, our client withheld test scores to ensure they hadn’t been compromised in any way. As our additional forensics analysis did not reveal evidence of test takers using compromised materials, the client released the withheld scores to their test takers.  

Data forensics tools: Item matching algorithm and differential performance analysis 

Discovery: A client was mailed a document appearing to contain some of their bank items. Again, the PSI item matching algorithm was used to confirm that these were indeed items from the bank that might have been compromised. 

At the same time, several test takers had been flagged by data forensics. These flagged test takers exhibited differential performance on some test items, meaning their performance on those items differed from the norm. Significantly, the items with differential performance were the same items included in the mailed document.
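As a rough illustration of differential performance, the Python sketch below compares one test taker's proportion correct on suspected-compromised items with their proportion correct on the remaining items. This within-person comparison is only one possible formulation and is an assumption on our part. The data and names are hypothetical.

```python
def differential_performance(scored, compromised):
    """Proportion correct on compromised items minus proportion correct on the rest.

    scored maps item IDs to 1 (correct) or 0 (incorrect).
    """
    exposed = [v for k, v in scored.items() if k in compromised]
    secure = [v for k, v in scored.items() if k not in compromised]
    if not exposed or not secure:
        raise ValueError("need at least one item in each subset")
    return sum(exposed) / len(exposed) - sum(secure) / len(secure)


# Hypothetical test taker: perfect on the mailed items, weak elsewhere.
compromised = {"Q1", "Q2", "Q3", "Q4"}
scored = {"Q1": 1, "Q2": 1, "Q3": 1, "Q4": 1,
          "Q5": 1, "Q6": 0, "Q7": 0, "Q8": 0}

gap = differential_performance(scored, compromised)
print(gap)
```

A large positive gap on exactly the items in a leaked document is the kind of pattern that warrants a closer look.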

Outcome: Our client retired the compromised items from their active item bank. These items were then used to create a new practice test, which the testing organization could sell to test takers, generating income from this valuable test content.

Data forensics tool: Scoring and completion time analysis 

Discovery: Data forensics identified a group of high scoring test takers who passed what is usually a long exam in less than thirty minutes. Forensic analysis also revealed a group of test takers who had answered most items on a form in less than three seconds. However, this group took an excessive amount of time on the last few items to avoid the appearance of an unusually short test time overall. 
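The pattern described above, with rapid answers on most items followed by long pauses at the end, can be summarized with two simple numbers per test taker. The Python sketch below is illustrative only: the three-second cutoff comes from the case above, while the number of "last few" items and the data are assumptions.

```python
def rapid_response_profile(item_seconds, fast_cutoff=3.0, tail_items=3):
    """Return (share of items answered under the cutoff,
    share of total test time spent on the last `tail_items` items)."""
    if not item_seconds:
        raise ValueError("no response times supplied")
    fast_share = sum(t < fast_cutoff for t in item_seconds) / len(item_seconds)
    tail_share = sum(item_seconds[-tail_items:]) / sum(item_seconds)
    return fast_share, tail_share


# Hypothetical ten-item form: seven near-instant answers, then long pauses.
times = [2, 2, 1, 2, 2, 1, 2, 600, 700, 488]

fast_share, tail_share = rapid_response_profile(times)
print(fast_share, round(tail_share, 3))
```

A high value on both measures suggests the overall test time is masking very fast item-level responses.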

Outcome: These data forensics findings initiated weekly test center analyses to monitor overall test and item level times. Our investigations are ongoing. 

Flexible data forensics 

Depending on your test taker volumes and the security needs of your testing program, we conduct data forensic analysis on a daily, monthly or quarterly basis. And as some of the above examples demonstrate, where needed we will increase the regularity and type of analysis to investigate specific concerns.  

These case studies show that by acquiring, collating, and analyzing vast amounts of data, we can spot patterns in the testing data that might indicate malpractice – with a speed and level of accuracy that was previously unavailable. Used in this way, data forensics is rapidly emerging as one of the most powerful tools we have to enhance test security.