We will be administering curriculum-based assessments via Qualtrics to measure learning gains in high school students, pre-instruction vs. post-instruction. In the past, we have tried a number of strategies to match pre- and post-data with a minimum of human-hours spent hunting down missing or incorrectly entered matches, but we have not found a satisfactory way to force validated matching. We typically have to match anywhere from a few hundred to about a thousand pre- and post-assessments, so as you can imagine, doing this manually is extremely time intensive. Here are the things we have tried so far:
- We have assigned ID numbers to students that they needed to remember for their post-test. Results = not great - many forgotten numbers, numbers entered incorrectly, students entering made-up numbers, etc.
- We have assigned ID numbers based on patterns (first two letters of the teacher’s last name, class period as a two-digit number, and school-based student ID number). Results = also not great - many issues with things as small as inconsistent capitalization, spacing, mismatches, etc.
- We have tried creating ID numbers that are only numeric and six digits long so that we can at least force that response type. They consist of a two-digit teacher ID assigned by the researchers and given to the teacher, a two-digit class period (e.g., 1st period = 01), and a two-digit student number assigned by the teacher. Results = still a lot of mismatched data.
- We tried a forced-validation version of #3 to keep students from making things up or entering information incorrectly. Students enter their two-digit teacher ID, then enter it again to validate against their first entry; we do the same for their two-digit class period and their two-digit student ID. Results = slightly better than attempts 1, 2, and 3, because entering each piece twice, one section at a time, forces students to put in something they can actually remember. However, some students are still making things up (like entering 01, 01, 01 across the board). Our rules require numbers only: teacher IDs must be two digits between 01 and 49 (we won’t assign a number higher than 49 for any single study), class periods must be between 00 and 09 (no school we’ve come across yet has had more than 9 class periods), and student IDs must be between 01 and 60 (we haven’t had a class period with more than 60 students in it yet). A sketch of these range rules as a single pattern follows this list.
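For reference, the range rules from attempt #4 collapse into a single pattern. Here is a minimal Python sketch; I believe the same regex could be dropped into a Qualtrics custom-validation "matches regex" rule if your plan supports it, and everything outside the pattern itself is illustrative:

```python
import re

# Ranges from our scheme: teacher 01-49, period 00-09, student 01-60.
ID_PATTERN = re.compile(r"(0[1-9]|[1-4][0-9])(0[0-9])(0[1-9]|[1-5][0-9]|60)")

def is_plausible_id(raw: str) -> bool:
    """Return True if a 6-digit ID fits the teacher/period/student ranges."""
    return ID_PATTERN.fullmatch(raw.strip()) is not None

# Examples:
# is_plausible_id("030204")  -> True  (teacher 03, period 02, student 04)
# is_plausible_id("010101")  -> True  (valid format, but possibly made up)
# is_plausible_id("990101")  -> False (teacher 99 out of range)
```

Note that format validation alone cannot catch a fabricated-but-valid ID like 010101, which is exactly the failure mode we keep running into.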
What we would love is some form of forced validation across the pre-test and post-test that keeps the data anonymous to us, the researchers, but requires a match with a previously entered pre-test value before a student can take the post-test. We cannot have access to names, email addresses, or other PII of high school students. In the standardized assessment world, test tickets are generated from PII and assigned to students so that their test scores can be matched to their ID information; in the research world, we cannot do that.
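For concreteness, here is one way this might work, assuming Qualtrics' Web Service survey-flow element (which, as I understand it, can call an external URL and store returned JSON fields as embedded data): a tiny lookup service that holds only the anonymous pre-test IDs, with a Branch element ending the post-test when the returned flag isn't 1. This is a sketch, not something we've built; the endpoint, the file name, and Flask itself are all placeholders:

```python
# Sketch of a lookup service holding only anonymous 6-digit IDs (no PII).
# Flow idea: a Qualtrics Web Service element sends the entered ID here,
# saves the returned "found" flag as embedded data, and a Branch element
# ends the post-test when found != 1.
from flask import Flask, request, jsonify

app = Flask(__name__)

# pretest_ids.csv is a hypothetical one-column export of pre-test IDs.
with open("pretest_ids.csv") as f:
    PRETEST_IDS = {line.strip() for line in f if line.strip()}

@app.route("/check")
def check():
    entered = request.args.get("id", "").strip()
    return jsonify({"found": int(entered in PRETEST_IDS)})

if __name__ == "__main__":
    app.run()
```

A similar result might be possible without hosting anything, by uploading the pre-test IDs as a Qualtrics contact list and gating the post-test with an Authenticator element keyed on the ID field, though I haven't verified that this works with an arbitrary non-PII field.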
I’m interested in ideas for how we can more easily and accurately match pre- and post-assessment data among hundreds of test takers using some sort of forced validation. Any ideas?
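For scale, the downstream reconciliation we currently grind through by hand is essentially the merge below (a pandas sketch; the column names are hypothetical placeholders for whatever the Qualtrics export uses). Scripting this step helps, but it still leaves piles of pre-only and post-only records to chase down manually, which is why we want the match forced at entry time:

```python
import pandas as pd

# dtype=str preserves leading zeros in the 6-digit IDs.
pre = pd.read_csv("pretest_export.csv", dtype={"student_id": str})
post = pd.read_csv("posttest_export.csv", dtype={"student_id": str})

# Exact-match join on the anonymous ID.
matched = pre.merge(post, on="student_id", suffixes=("_pre", "_post"))

# The residue that currently has to be resolved by hand.
unmatched_pre = pre[~pre["student_id"].isin(post["student_id"])]
unmatched_post = post[~post["student_id"].isin(pre["student_id"])]

print(f"{len(matched)} matched, {len(unmatched_pre)} pre-only, "
      f"{len(unmatched_post)} post-only")
```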