Randomization without replacement - reading from csv file

Question

Hi,
I have a survey in which I want to use data from a csv file. Each row has several relevant values, let's say name, ID, address, description etc. The csv file contains about 300 rows.
I want to randomly select 30 of these 300 rows and then display 30 questions to the user, where the only difference between the questions is the values of the fields read from the csv. For example, I will have 30 similar questions: the first one will show the name, ID and address extracted from one chosen row, the second will show the name, ID and address extracted from another chosen row, and so on.
I will then have another set of questions, in which I would want to pick 30 rows from the original csv, that were not selected before.
So basically my questions are:
1) How to randomly select rows from the csv, and then randomly select rows that were not selected yet?
2) How to read values from the csv using javascript?
I'm not very familiar with javascript so any help would be very much appreciated!
I saw some posts by KeirJ and npetrov937 that helped to specify my needs, but I still don't know how to do it...

npetrov937 · Accepted Answer

Short answer? You can't read csv with Qualtrics.
Longer answer? More experienced people than me might be able to do it, but integrating it with Qualtrics has turned out to be a nightmare when I've tried it (more than once), so I just stay away from it.
Workaround? YES!
Instead of having a csv file, I simply have a dictionary that contains all the data I need.
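If the data already lives in a csv, a dictionary like that does not have to be typed by hand. Below is a rough sketch of a one-off conversion you could run outside Qualtrics (for example in a browser console) and then paste the result into the survey. The helper name `csvToDict` and the sample data are made up for illustration, and it assumes a header row, comma separators and no quoted fields that contain commas.

```javascript
// Hypothetical helper: turn CSV text into a dictionary keyed by one column.
// Assumes a header row, comma separators, and no quoted fields with commas.
function csvToDict(csvText, idColumn) {
  var lines = csvText.trim().split(/\r?\n/);
  var header = lines[0].split(",").map(function (h) { return h.trim(); });
  var idIndex = header.indexOf(idColumn);
  if (idIndex === -1) {
    throw new Error("csvToDict: column not found: " + idColumn);
  }
  var dict = {};
  for (var i = 1; i < lines.length; i++) {
    if (lines[i].trim() === "") continue; // skip blank lines
    var cells = lines[i].split(",").map(function (c) { return c.trim(); });
    dict[cells[idIndex]] = cells; // the whole row becomes the value array
  }
  return dict;
}

// Made-up sample data using the columns from the question
var sampleCsv = "name,ID,address,description\n" +
  "name1,ID1,address1,description1\n" +
  "name2,ID2,address2,description2";
var sampleDict = csvToDict(sampleCsv, "ID");

// Print the dictionary as text that can be pasted into the survey script
console.log(JSON.stringify(sampleDict));
```

If your csv has quoted fields or commas inside values, a proper csv parser would be a safer choice than this simple split.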
For your case, I have created a sample survey which is attached.
Here's the idea: at the very beginning you create a question that contains all of the data you need and does all the processing you need. In this case I have created a master_dict that contains all of your data, with unique IDs as keys and, as values, arrays that hold all of the data for that ID (currently 9 entries). Subsequently, I select 3 keys from that dictionary at random and set all of their data as Qualtrics embedded data variables (those are preset from the very start of the survey - see Survey Flow).
The code looks like this (the weird Qualtrics styling does not let me format it, so paste it into an editor for better viewing, or see Q1 in the attached survey):
---
//creating a master dictionary where each key is unique (in this case ID1, ID2, ID3, ...) and the value of each key
//is an array with all the necessary data
var master_dict = {
  "ID1": ["name1", "ID1", "address1", "description1"],
  "ID2": ["name2", "ID2", "address2", "description2"],
  "ID3": ["name3", "ID3", "address3", "description3"],
  "ID4": ["name4", "ID4", "address4", "description4"],
  "ID5": ["name5", "ID5", "address5", "description5"],
  "ID6": ["name6", "ID6", "address6", "description6"],
  "ID7": ["name7", "ID7", "address7", "description7"],
  "ID8": ["name8", "ID8", "address8", "description8"],
  "ID9": ["name9", "ID9", "address9", "description9"]
};

//function to extract N random elements from an array
function getRandom(arr, n) {
  var result = new Array(n),
    len = arr.length,
    taken = new Array(len);
  if (n > len)
    throw new RangeError("getRandom: more elements taken than available");
  while (n--) {
    var x = Math.floor(Math.random() * len);
    result[n] = arr[x in taken ? taken[x] : x];
    taken[x] = --len in taken ? taken[len] : len;
  }
  return result;
}

//setting the current participant's data - take N (in this case 3) random keys (or IDs) from the master dictionary
var currentParticipantData = getRandom(Object.keys(master_dict), 3);

//set all the embedded data by going through the currentParticipantData array and looking up the needed values
//from the master dictionary (remember, we index the ID from the master_dict and then the position of the element
//in the array, in this case 0, 1, 2, 3 at the end)
for (var i = 0; i < currentParticipantData.length; i++) {
  console.log("ID" + (i+1).toString() + "_name", master_dict[currentParticipantData[i]][0]);
  Qualtrics.SurveyEngine.setEmbeddedData("ID" + (i+1).toString() + "_name", master_dict[currentParticipantData[i]][0]);
  Qualtrics.SurveyEngine.setEmbeddedData("ID" + (i+1).toString() + "_ID", master_dict[currentParticipantData[i]][1]);
  Qualtrics.SurveyEngine.setEmbeddedData("ID" + (i+1).toString() + "_address", master_dict[currentParticipantData[i]][2]);
  Qualtrics.SurveyEngine.setEmbeddedData("ID" + (i+1).toString() + "_description", master_dict[currentParticipantData[i]][3]);
}

//uncomment the clickNextButton line below so that participants don't see this question
//this.clickNextButton()
---
Don't forget to set all your needed embedded data values at the start - in this case they will be ID1_name, ID1_ID etc - but the first part can be anything as long as you make the appropriate changes to the loop above. Then use your embedded data fields in your survey as you please.
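On the second part of the question (picking 30 more rows that were not selected before): one possible approach, which is not what the attached survey does, is to shuffle all the keys once and cut the shuffled list into consecutive slices. Slices of one shuffled list cannot overlap, so the second set is guaranteed to contain only unused rows. A minimal sketch, where `allKeys` stands in for `Object.keys(master_dict)` on a 300-row dictionary:

```javascript
// Fisher-Yates shuffle; returns a shuffled copy and leaves the input untouched
function shuffle(arr) {
  var copy = arr.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

// Stand-in for Object.keys(master_dict) with 300 rows (made-up IDs)
var allKeys = [];
for (var k = 1; k <= 300; k++) {
  allKeys.push("ID" + k);
}

var shuffled = shuffle(allKeys);
var firstSet = shuffled.slice(0, 30);   // rows for the first set of questions
var secondSet = shuffled.slice(30, 60); // 30 different rows for the second set

console.log(firstSet.length, secondSet.length);
```

Both sets would need to be drawn in the same script at the start of the survey and stored as embedded data under different prefixes (say `A1_name` and `B1_name`, names picked only for illustration), because a script on a later question does not know what an earlier one selected.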
Note: Yes, there are much more efficient ways to do this sort of process - imagine having to get 100 random rows each containing 50 pieces of data; 5000 embedded data values is not feasible. I have done it the more efficient way before (using JS to set the HTML of individual questions), but I have suggested the embedded data route here as it is much easier to understand and implement for JS novices.
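To give a rough idea of what a more scalable variant might look like: instead of one embedded data field per value, the selected rows could be packed into a single JSON string, and the text of each question built from that string. This is only a sketch under assumptions. The field name `selectedRows` and the helpers `packRows` and `questionHtml` are made up, the values are assumed to contain no HTML special characters, and how the string is read back and written into a question on later pages depends on your Qualtrics setup.

```javascript
// Hypothetical sketch: keep all selected rows in ONE string instead of
// one embedded data field per value.
function packRows(dict, keys) {
  var rows = keys.map(function (key) { return dict[key]; });
  return JSON.stringify(rows);
}

// Build the text for question number `index` (0-based) from the packed string.
// Assumes each row is [name, ID, address, description].
function questionHtml(packed, index) {
  var row = JSON.parse(packed)[index];
  return "Name: " + row[0] + "<br>ID: " + row[1] +
    "<br>Address: " + row[2] + "<br>Description: " + row[3];
}

// Made-up data in the same shape as master_dict
var demoDict = {
  "ID1": ["name1", "ID1", "address1", "description1"],
  "ID2": ["name2", "ID2", "address2", "description2"]
};
var packed = packRows(demoDict, ["ID2", "ID1"]);

// Only runs inside Qualtrics; "selectedRows" is an assumed field name
if (typeof Qualtrics !== "undefined") {
  Qualtrics.SurveyEngine.setEmbeddedData("selectedRows", packed);
}

console.log(questionHtml(packed, 0));
```

With this layout, going from 4 to 50 columns or from 3 to 100 rows changes the size of one string rather than the number of embedded data fields you have to declare in Survey Flow.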
Randomization_without_replacement_-_reading_from_csv_file.qsf

yinbar · Answer

Thank you so much npetrov937 !!! This is great!!!
I will implement this now, but it would be great if you could share some directions on how to scale this up, as I might need to use larger datasets in the future (in both dimensions).
Thanks again for your time and effort!
