I recently had a project where I had to manually adjust over 100 variable names in a .csv file because of the way loop and merge data are stored. After this manipulation I read the file into Stata and reshaped it from wide to long so it could be used in Vocalize, but some sort of reshaping (or stacking) seems important for analyses in any program. It seems to me that a simple fix would be to let users pick how data are stored when using loop and merge (e.g., whether they want the leading number to appear at the end instead). This would give everyone a much more efficient experience. I'll illustrate the problem in greater detail below. Does anyone have workarounds or ideas? I also posted in the product idea section.
For the purposes of an example, let's say we have a multiselect item where students identify the schools they have attended from a list of four.
With the second item set up like this:
The way these data are stored is 1_[var], 2_[var], etc. at the front of the variable based on the position of the identifier in the loop. See below:
However, Stata, SPSS, and Python (and probably most, if not all, other data manipulation tools) will not read in variables with leading numbers. Moreover, for Stata at least, reshaping from wide to long works much better if the identifier appears at the end of the variable, like [var]1 for School 1 instead of 1_[var].
It's the same in the data file export:
Thus, as it stands, one has to manually adjust the variable names for each school-by-variable combination -- removing the "1_" and adding a "1" at the end and so on. So if you had, say, a survey of 50 items and 150 schools, that's 7,500 manual manipulations before one can even read these data into Stata.
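For anyone stuck doing this by hand, the renaming described above can be scripted. Here is a minimal sketch in Python, assuming the exported column names follow the pattern "<loop number>_<variable name>" (for example the hypothetical "1_Q2"). The function names and the `sep` parameter are made up for illustration and are not part of any Qualtrics tool. Only the first row of the file is rewritten; every other row is copied unchanged.

```python
import csv
import io
import re

# Assumed export pattern: "<loop number>_<variable name>", e.g. "1_Q2".
LOOP_PREFIX = re.compile(r"^(\d+)_(.+)$")


def move_loop_prefix(name, sep=""):
    """Move a leading loop number to the end of a column name."""
    match = LOOP_PREFIX.match(name)
    if match is None:
        return name  # not a loop variable; leave it unchanged
    loop_id, var = match.groups()
    return f"{var}{sep}{loop_id}"


def rename_header(in_file, out_file, sep=""):
    """Copy a CSV, rewriting only the first (header) row."""
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    header = next(reader)
    writer.writerow([move_loop_prefix(col, sep) for col in header])
    writer.writerows(reader)  # remaining rows pass through untouched


if __name__ == "__main__":
    # Tiny in-memory demo with hypothetical column names.
    demo = io.StringIO("ResponseId,1_Q2,2_Q2\nR1,a,b\n")
    out = io.StringIO()
    rename_header(demo, out, sep="_")
    print(out.getvalue())
```

With the default `sep=""`, "1_Q2" becomes "Q21", which matches the [var]1 style described above. If the variable names themselves end in digits, passing `sep="_"` (giving "Q2_1") keeps the loop number unambiguous.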
I called technical support and they asked around and no one had a good solution. I think this would save everyone a lot of time if it were possible:
Solution: allow users to choose how they want the loop identifier denoted in the survey data.
If you download the data in SPSS format it puts an "A" in front of the numbers, so it isn't an issue. I've never used Stata, but I believe it has the ability to import SPSS data.
For Python, if you are using an API call and returning the data in json format it isn't an issue...use json.load.
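To make the JSON route a bit more concrete, here is a rough sketch. It assumes the responses have already been parsed into a list of dictionaries, one per respondent, with loop variables keyed like "1_Q2". That layout, the "ResponseId" key, and the `wide_to_long` helper are all assumptions for illustration; the actual shape of the API response is not shown in this thread. The sketch stacks the loop variables into one row per respondent and loop iteration.

```python
import json
import re

# Assumed key pattern for loop variables: "<loop number>_<variable name>".
LOOP_KEY = re.compile(r"^(\d+)_(.+)$")


def wide_to_long(records, id_key="ResponseId"):
    """Stack loop variables into one row per respondent and loop iteration.

    Keys that do not match the loop pattern (other than id_key) are
    dropped in this sketch.
    """
    long_rows = []
    for record in records:
        by_loop = {}
        for key, value in record.items():
            match = LOOP_KEY.match(key)
            if match:
                loop_id, var = int(match.group(1)), match.group(2)
                by_loop.setdefault(loop_id, {})[var] = value
        for loop_id in sorted(by_loop):
            row = {id_key: record.get(id_key), "loop": loop_id}
            row.update(by_loop[loop_id])
            long_rows.append(row)
    return long_rows


# Hypothetical data; json.load would be used the same way on a file object.
sample = json.loads('[{"ResponseId": "R1", "1_Q2": "a", "2_Q2": "b"}]')
rows = wide_to_long(sample)
```

Because dictionary keys are plain strings, the leading number causes no trouble here, and the loop number ends up in its own "loop" field rather than inside the variable name.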
I hear you, but that's not quite the point. Should we all have to use SPSS? Also, as I mentioned, I want more flexibility than just a leading letter, I want to be able to make it appear after the variable.
It seems I would need Stata 16 (I have 14) to import SPSS files, and even then I believe I would need to make some adjustments in Stata, though at least the data would be a little easier to manipulate.
I appreciate you bringing up a possible workaround! I just wanted to note that it might not work for me in this case and maybe not for others either.
I adjusted my heading title to leave out SPSS. I had a colleague tell me that he was struggling with SPSS and this too.
I agree that the loop and merge feature of Qualtrics, while really useful, needs to offer more control over setting the variable names. For example, I was programming a similar survey using 'Day' as a loop (looping through the last 30 days of the calendar). It is a pain to need to set the variable prefix in the analysis software/syntax (I use SPSS mostly). It would be great if Qualtrics would allow the user to indicate a loop prefix, and the variables in the loop would be named iteratively (e.g., Day1 to Dayx).
Hi Timso! If you have not already, I’d recommend posting this in our Product Ideas category, as it is a feature not yet developed by our team.