A Guide to ‘Good’ Survey Results
In a previous BluePaper we addressed the question of how many of your students you need to survey to get a “good” survey result (viz., the question of sample size), and we ended our answer to that question with another rhetorical question: Will a “good” sample ensure a “good” survey result? Leaving aside for the moment what exactly is meant by a “good” survey result, it will be obvious to anyone that a “bad” sample will ensure a “bad” result; it may not be as obvious that a “good” sample will not ensure a “good” survey result. A good sample is the old “necessary but not sufficient” condition. Taking up the question of what exactly makes for a “good” survey result leads to a line of inquiry that, like the question of sample size, yields some surprises and provides some not-so-obvious answers.
What is meant by a “good survey result”?
Well, in the most general of terms, a good survey result is one that tells you what you need to know, that is, what you set out to learn in the first place. But peel this onion back a few layers and you will see that the threats to a “good” survey result, so defined, are numerous. Ironically, the most common threat to a “good” survey result is the one that is most controllable, viz., lack of clarity about what you seek to know. Poorly formulated information objectives account for as many poor survey results as do technical methodological defects. There are other threats to be sure, but all of these are manageable, and there are specific methodological practices that can help you to avoid them. In one way or another, most threats to a “good” survey result, other than those attributable to a lack of clarity of information objectives, boil down to threats to the validity or reliability of the results. Of these two, validity, for reasons that will be described shortly, is by far the more critical.
But first, allow a short digression here to avoid the risk of some later terminology-induced confusion. In question here is the common misuse of the term “validity.” The term is often incorrectly used in reference to samples, but it really applies only to survey results. If we clarify the use of the term first, our discussion of “good” survey results will be made easier. We are frequently asked “how large must the sample be to be valid?” This question cannot really be answered, because validity is not a property of a sample. Validity is a property of the responses recorded from the survey respondents, i.e., the survey results. If you ask a student how much time s/he spends studying, and s/he exaggerates by reporting double what s/he actually spends, then the response is not valid. That is, it does not reflect the facts as they are. If students are going to be untruthful about the time spent studying, there is nothing to gain by asking the question of more and more of them. In fact, asking more and more respondents will only make matters worse; it will lead to an invalid result in which you have high but unwarranted confidence.
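The point about unwarranted confidence can be illustrated with a small simulation. This is only a sketch under invented assumptions: the function name, the true mean of 10 study hours, the spread, and the rule that every student reports exactly double are hypothetical and are not taken from any real survey.

```python
import random
import statistics

def survey_reported_hours(n, true_mean=10.0, true_sd=3.0, exaggeration=2.0, seed=0):
    """Simulate n students who each report `exaggeration` times their true study hours.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    reported = [exaggeration * max(0.0, rng.gauss(true_mean, true_sd)) for _ in range(n)]
    mean = statistics.fmean(reported)
    # Half-width of an approximate 95% confidence interval for the mean.
    half_width = 1.96 * statistics.stdev(reported) / n ** 0.5
    return mean, half_width

TRUE_MEAN = 10.0
results = {n: survey_reported_hours(n, true_mean=TRUE_MEAN) for n in (25, 400, 10000)}
for n, (mean, half_width) in results.items():
    print(f"n={n:>6}: reported mean = {mean:.2f} +/- {half_width:.2f} (true mean = {TRUE_MEAN})")
```

As the sample grows, the interval around the reported mean narrows, but it narrows around the exaggerated figure, not around the true one. A larger sample buys precision, not validity.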
The concern of those who incorrectly ask about the validity of a sample originates with an intuitive, if vague, notion that you cannot know the attitudes or habits of a large group of people by asking only a few. This is a very reasonable concern and an understandable and generally correct intuition, but it has nothing to do with the construct of validity as it is applied in survey research. As applied in survey research, validity refers to a property of the responses, i.e., the survey results, and you cannot control the validity of your results by increasing your sample size. Two examples will make this obvious. First example: You want to know what percentage of your students are liars, so you survey them with the question “do you tell lies?” What can you know from a sample of one? What can you know from a sample of one million? In both cases the answer is little or nothing. Second example: Why does a chef stir the soup before tasting it? Because if she stirs it, the soup becomes homogeneous. Hence, it does not matter from where she takes her sample spoonful. She needs only one spoonful to know how the entire pot tastes, to know the fact she seeks, viz., “is the soup done?” If everyone in a population is the same, then a sample of one tells all. Your intuition is probably telling you that the more heterogeneous the population, the larger your sample size needs to be, and, if it is, our compliments to your intuition, because it is correct.
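The soup intuition can be put into numbers with the standard textbook formula for the sample size needed to estimate a proportion. This is a sketch, not a prescription: it assumes a simple random sample from a large population and a yes/no question, and the function name, the 5-point margin of error, and the planning values of p are our own illustrative choices.

```python
import math

def required_sample_size(p, margin=0.05, z=1.96):
    """Respondents needed to estimate a proportion p within +/- margin.

    Assumes a simple random sample from a large population; z=1.96
    corresponds to roughly 95% confidence. p*(1-p) measures how
    heterogeneous the population is on a yes/no characteristic.
    """
    n = (z ** 2) * p * (1 - p) / margin ** 2
    # Even a perfectly homogeneous population needs one spoonful.
    return max(1, math.ceil(n))

for p in (0.5, 0.7, 0.9, 1.0):
    print(f"share answering yes = {p:.1f}: need about {required_sample_size(p)} respondents")
```

The required sample is largest when the population is evenly split (p = 0.5) and shrinks to a single respondent when everyone is the same (p = 1.0), which is the stirred pot of soup.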
So, the question we should ask about the sample is “is it representative?” not “is it valid?” Representativeness is a property of the sample, whereas validity is a property of the survey responses, i.e., of the survey results. (Please see our BluePaper entitled “How many of My Students do I Need to Survey?” for a discussion of sample size.)
Now that we are not at risk of misusing or misunderstanding the term validity, let’s return the focus of our discussion to the question of what makes a “good” survey result. As we have already said, a good survey result is one that delivers what you need to know. To reach this objective you first need clarity about your objective, and then you need a survey process that can produce results that have both of the basic attributes, validity and reliability. The importance of clarifying, in advance, not only what you need to know, but also why you need to know it cannot be overstated. Ask yourself “what managerial objective or action am I trying to inform with the survey results, and what managerial action will I take if I discover A, and what action will I take if I discover B?” Another important point to make here is that the greater the clarity you achieve concerning your information objectives beforehand, the easier it is to achieve validity and reliability in the results, and the more likely it is that you will. We will make this connection clearer later, but first, let’s discuss what is meant by the attributes of validity and reliability.
Validity and reliability: what are they?
Validity and reliability are very simple, but very specific, concepts. Validity simply refers to the degree to which a response reflects (or measures) what we intend for it to reflect. For example, it will be understandably difficult for anyone in current times to believe that the circumference of a person’s skull can tell you anything about that person’s intelligence, but there was a time in the history of psychology in which this was considered one way to measure a person’s intelligence. Now, of course, we know that there is no relationship between the two, and that skull circumference is not a valid measure of intelligence. Similarly, in our earlier example of wanting to know what percentage of our students are liars, their answer to the question “do you tell lies?” will not reveal if they are or are not liars. A “no” from someone who is lying cannot be distinguished from an honest “no.” So, unless a survey question evokes an accurate and truthful response, it will lack validity. Invalid responses are useless. No statistical analysis or other methodological tool can restore validity to invalid responses. Validity has to be built into the survey from the outset.
On the other hand, reliability refers simply to the stability, or reproducibility, of a response. Skull circumference as a measure of intelligence is not valid, but it is extremely reliable. If you take ten, a hundred, or a thousand measures of someone’s skull circumference, you will get pretty much the same result on all measurements, i.e., the measure is highly reproducible. The measurements, however, will not tell you a single thing about intelligence. So it is a reliable but not a valid measure of intelligence (though it is a valid measure of how large the person’s head is, if that is what you sought). Survey responses can have some, but incomplete, reliability. For example, if you want to know how satisfied students are with the variety of foods served in the dining hall, they may be quite satisfied overall, but, depending on what was on the menu the day you ask them, they may report being a little more or a little less satisfied. Even attitudes that are more or less stable will have some small degree of day-to-day variability. The point here is that reliability is not an all-or-nothing attribute; rather, there are degrees of reliability.
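A short simulation makes the skull example concrete. It is a sketch built on invented numbers: the means, spreads, and measurement error below are hypothetical, intelligence is generated independently of skull size by assumption, and using a correlation coefficient to stand in for reliability and validity is a simplification.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)
n_people = 2000

# Hypothetical population: skull circumference (cm) and an intelligence
# score that, by construction, has nothing to do with skull size.
true_skull = [rng.gauss(56.0, 2.0) for _ in range(n_people)]
intelligence = [rng.gauss(100.0, 15.0) for _ in range(n_people)]

# Two separate tape-measure readings of each person, each with a small error.
first_reading = [c + rng.gauss(0.0, 0.2) for c in true_skull]
second_reading = [c + rng.gauss(0.0, 0.2) for c in true_skull]

reliability = pearson(first_reading, second_reading)  # agreement between readings
validity = pearson(first_reading, intelligence)       # agreement with the target

print(f"test-retest correlation (reliability): {reliability:.2f}")
print(f"correlation with intelligence (validity): {validity:.2f}")
```

The two readings agree almost perfectly, while the correlation with intelligence hovers near zero: a highly reliable measure that is not valid for the purpose at hand.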
We have used some extreme examples here to illuminate the concepts of validity and reliability, but, in most practical cases of institutions surveying their students, the issues of validity and reliability are not as clear cut. For example, suppose you want to measure how effective your process for matching roommates is, and, as a measure of your effectiveness, you decide to ask the student residents “do you like your roommate?” Will this question produce valid responses for answering your question about the effectiveness of your roommate matching process? It may tell you something about the matching effectiveness, but it assumes roommates who like each other are good matches. Maybe the roommates like each other because they are good drinking buddies, which leads both to never study; or maybe a student likes his or her roommate as a person, but simply does not prefer to room with them. Perhaps a student who does not like his or her roommate fears word may get back to the roommate, and so simply reports liking the roommate more than is the case. So, a response may have some, but not complete, validity. The point here is that validity, like reliability, is not an all-or-nothing attribute; rather, there are degrees of validity. Perhaps a different question would have more validity for the information objective, e.g., “If given the opportunity to choose your roommate, would you choose your current roommate?”
Now let’s say something about why validity is more critical than reliability. Think about it: if a response is valid, it is a true measure of what it was intended to measure. And, if it is a true measure of what it was intended to measure and you make repeated measurements of the characteristic, you will get the same result every time. Hence the measure is reliable. Therefore, if a measure is valid, it will also necessarily be reliable. The converse, however, is not true: a reliable response is not guaranteed to be valid. The moral of this story? Concentrate on validity, and reliability will take care of itself.
Clarity of information objectives.
This is a difficult issue. An exchange between Alice and the Cheshire Cat in Lewis Carroll’s Alice in Wonderland comes to mind:
Alice: Would you tell me, please, which way I ought to go from here?
The Cat: That depends a good deal on where you want to get to.
Alice: I don’t much care where.
The Cat: Then it doesn’t much matter which way you go.
Alice: …so long as I get somewhere.
The Cat: Oh, you’re sure to do that, if only you walk long enough.
This is not exactly a good strategy for designing surveys. As simple as it sounds, it is essential in surveying that you achieve great clarity about what you seek before you start designing a survey. This includes great clarity in your terminology. For example, suppose you are interested in knowing if your off-campus students feel integrated into campus life. Asking the question “do you feel integrated into campus life?” may get you some information, but it will not be clear exactly what information. First, different students will have different levels of need for being integrated, so they will have different thresholds for what qualifies as integrated. Second, students will have different interpretations of what the term “integrated” means. Maybe to some it is a matter of how you feel about the campus experience, while to others maybe it is a matter of participation in on-campus activities. You, the surveyor, must decide what information you seek and how you want “integrated” to be defined, and then ask the question in a way that makes your definition clear to the student. One way to help achieve clarity of the constructs you are asking about is to “operationalize” them. For example, maybe you would operationalize the construct of “integration into campus life” as the number of nonacademic on-campus activities in which the student participates. The more operationalized a concept is, the easier it is to achieve validity and reliability in its measurement. The risk, of course, is that one can, if not careful, operationalize the meaningfulness out of a response by over-operationalizing it. A careful balance must be achieved between clarity of a response to a question and its meaningfulness.
A technical word here about information is in order. Above, we have used the term information, but what exactly does it mean? In information theory, as well as in scientific surveying, the term “information” has a very specific meaning. Information refers to the reduction of uncertainty: a response that reduces a lot of uncertainty conveys, or “transmits,” a lot of information. A response that reduces little uncertainty transmits little information. So, in our earlier example where we asked a student if he or she is a liar, the response, whether it be yes or no, will reduce the uncertainty by no practically useful amount. We will be just as “in the dark” about whether they lie after their response as we were before it. In the example of asking a student if they feel integrated into campus life, an answer of yes or no reduces uncertainty somewhat, but maybe not as much as you need it reduced, e.g., maybe they answer yes because they have no need or desire to be integrated. In the example of asking a student if they are a member of one or more student organizations, an answer of yes or no reduces practically all uncertainty of the matter (assuming, of course, that they are not lying). But notice that, in each of these sample questions, there is an incrementally greater level of specificity about what we are asking: “do you feel integrated into campus life?” can mean different things to different people. But asking “are you a member of one or more student organizations?” will be interpreted more or less the same by everyone, especially if the question also clarifies what qualifies as a student organization. The qualification criteria could be easily handled by simply listing all student organizations and asking the student to indicate those of which they are a member. In designing a questionnaire, it is a useful exercise to ask yourself, for each question: What uncertainty will the response resolve? And to what degree will it be resolved?
If the answer to these two questions is either “it is not clear” or “not much,” the question needs revision. Achieving clarity with your information objectives will go a long way toward ensuring you compose well-structured questions, ones that will produce valid and reliable responses and, at the same time, reduce the amount of uncertainty. In the final analysis, that is why we conduct surveys.
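For readers who want to see “uncertainty resolved” as a number, the sketch below measures it in bits using Shannon entropy, the usual yardstick in information theory. Everything in it is an assumption made for illustration: the function names, the 50/50 split of the student body, and the response patterns for the three sample questions are invented, not measured.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy, in bits, of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uncertainty_resolved(joint):
    """Bits of uncertainty about the truth removed by hearing the response.

    `joint` maps (truth, response) pairs to probabilities that sum to 1.
    """
    truth, response = defaultdict(float), defaultdict(float)
    for (t, r), p in joint.items():
        truth[t] += p
        response[r] += p
    uncertainty_before = entropy(truth.values())
    # Uncertainty left after the response: H(truth, response) - H(response).
    uncertainty_after = entropy(joint.values()) - entropy(response.values())
    return uncertainty_before - uncertainty_after

# Hypothetical response patterns for the three sample questions.
scenarios = {
    # Liars and honest students alike answer "no".
    "lies": {("liar", "no"): 0.5, ("honest", "no"): 0.5},
    # Answers track the truth only loosely.
    "integrated": {("integrated", "yes"): 0.4, ("integrated", "no"): 0.1,
                   ("not integrated", "yes"): 0.1, ("not integrated", "no"): 0.4},
    # Answers track the truth exactly (assuming no one lies).
    "member": {("member", "yes"): 0.5, ("non-member", "no"): 0.5},
}

resolved = {name: uncertainty_resolved(joint) for name, joint in scenarios.items()}
for name, bits in resolved.items():
    print(f"{name:>10}: {bits:.2f} of 1.00 bits of uncertainty resolved")
```

Under these made-up patterns, the question about lying resolves nothing, the question about feeling integrated resolves part of the uncertainty, and the membership question resolves all of it, which mirrors the ordering described above.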
So, we have talked about response validity, response reliability, and clarity of information objective(s). We looked at how each of these properties drives whether we get a “good” survey result. We have also cited some examples of the threats to getting a “good” survey result.
To really appreciate the importance, as well as the difficulty, of achieving a “good” survey result, we suggest the following exercise: try to write a survey question (or set of questions) that will resolve practically all uncertainty concerning the following question: How many of your off-campus students would elect to live on campus if the housing fees were lower? Then, subject your formulation of the question to an evaluation of how you would control for the following threats to validity and reliability:
Common threats to response validity and some effective measures to overcome them:
1. Question not clear to respondent (e.g., “Do you want more campus activities?” Meaning what: more to choose from? more opportunity to participate in existing ones? more time made available for participation?). Remedy: Be clear and precise as to what information you seek; operationalize the construct.
2. Too many dimensions of a single construct included or implied in wording of question (e.g., “Are you satisfied with your residential life?”). Remedy: Decompose the large construct into smaller operationalized constructs.
3. Response scale too vague or not appropriate to question being asked (e.g., “How often do you take your meals off campus?” [ ] never, [ ] when I don’t like on-campus menu, [ ] when my family comes to visit, [ ] often, [ ] as often as I can). Remedy: Operationalize the construct being scaled, apply the proper measurement scale (nominal, ordinal, interval, ratio), and properly and clearly label the response options. Seek competent technical survey development expertise.
4. Student feels a risk to confidentiality and that his or her response might become known by others (e.g., “Do you get along well with your resident assistant?”). Remedy: Communicate that responses are confidential; use an external vendor with high security to conduct the survey.
5. Student has tendency to give popular or politically correct responses instead of expressing his or her honest feelings (e.g., “Would you be willing to room with a student of a different race than yourself?”). Remedy: Include other “concordance-checking” questions, e.g., “Would you support a race-blind policy of roommate assignments?” elsewhere in the survey, and check for consistency during data analysis.
6. Student believes no follow-up action will be taken on survey results, so does not invest in giving thoughtful answers (e.g., because the survey of students’ satisfaction with room assignment is being conducted in the face of years of continuing complaints from students, complaints that have, to date, resulted in no change to the assignment policy or process). Remedy: At the time of the survey, announce that survey results will be posted, for student viewing, on the department’s web site or published in the campus newspaper, and that the posting will include the response by student services management.
7. Student does not feel s/he is a stakeholder in the object of the question, therefore no thought is given to the response (e.g., “Do you think the compensation of student service directors is sufficient to attract talented managers?”). Remedy: Avoid such questions, since the issue is not likely to be important to students and their responses are not relevant to managers of the issue. Or, make the connection to their interests explicit in the question.
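The consistency check mentioned under threat 5 can be as simple as comparing each respondent’s answers to the paired questions. The sketch below assumes yes/no answers and uses invented field names and invented responses; an inconsistent pair is a prompt for closer review, not proof that either answer was dishonest.

```python
def flag_inconsistent(responses, question_a, question_b):
    """Return the IDs of respondents whose answers to two paired questions differ."""
    return [r["id"] for r in responses if r[question_a] != r[question_b]]

# Hypothetical responses; the field names are ours, chosen for illustration.
responses = [
    {"id": "s1", "would_room": "yes", "supports_race_blind_policy": "yes"},
    {"id": "s2", "would_room": "yes", "supports_race_blind_policy": "no"},
    {"id": "s3", "would_room": "no", "supports_race_blind_policy": "no"},
    {"id": "s4", "would_room": "no", "supports_race_blind_policy": "yes"},
]

flagged = flag_inconsistent(responses, "would_room", "supports_race_blind_policy")
inconsistency_rate = len(flagged) / len(responses)
print(f"inconsistent respondents: {flagged} ({inconsistency_rate:.0%} of sample)")
```

A high inconsistency rate across the sample suggests that one or both questions are drawing socially desirable rather than honest answers, and that the wording deserves another look.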
There are other threats, but these are the ones you are most likely to encounter in surveying students on student service issues. If you are able to avoid the threats to validity and reliability, you have properly clarified the information you seek and why you seek it, and you have found the proper balance between the meaningfulness of the questions asked and how they have been operationalized, then you will most probably obtain a “good” survey result. At least you will have “good” survey data. Raw survey data, however, are not information; rather, they are potential information that gets processed into actual information by means of proper analysis. The engine for performing that processing is statistical analysis.
In a later BluePaper we will take up the topic of statistical analysis of survey data. It is simpler than you think. Meanwhile, we invite you to take one or more of the surveys we have developed to see how we have achieved validity and reliability in our surveys. Data-driven decision making requires more than just data. It requires valid and reliable data that have been properly processed into managerially relevant information with the proper statistical analysis. More on these ideas in a future BluePaper.