Associate Professor of Marketing FGCU DR. L's MARKETING RESEARCH LECTURE NOTES Ch. 7 - 10
CHAPTER 7 I. Reasons for the Popularity of Surveys: A. The need to know WHY people do or don't do something. B. The need to know HOW people make decisions. C. The need to know WHO the person is in terms of demographics, lifestyles, attitudes, etc. II. Types of Errors in Survey Research - Random Sampling Error (Random Error) & Systematic Error (Bias). Remember this: CENSUS RESULTS less SAMPLE RESULTS = TOTAL [SAMPLING] ERROR = RANDOM SAMPLING ERROR plus SYSTEMATIC ERROR (BIAS). A. Random Sampling Error - Error that results from chance variation. It cannot be totally avoided, but it can be minimized by increasing the sample size. You can estimate the range of random sampling error at a particular level of confidence (if you use a "probability" sample). B. Systematic Error [BIAS] - Error that is built into the research results from the research design or in the execution of the sample process (e.g., in choosing or using the sample). The presence of systematic error causes the results of a sample to consistently vary in one direction - either higher or lower than the true value of the population parameter (variable) being estimated from the sample. Systematic Error is of two types - Sample Design Error & Measurement Error. 1. SAMPLE DESIGN ERROR is of three types: a) Frame Error - systematic error that results from using an inaccurate or incomplete list of population elements (called a SAMPLING FRAME). For example, using a telephone directory as a sample frame creates a bias in the responses because unlisted and no-phone people are, in fact, systematically different than people who have phones and list their numbers, and this systematically biases the sample data. b) Population Specification Error - systematic error that results from an incorrect definition of the universe (or population) from which the sample is chosen. 
If those people that were excluded incorrectly from the definition of the population of interest (and therefore excluded from possible selection to your sample of that population) are significantly different on the variables of interest, the sample data would be biased. c) Selection Error - Error that results from following incomplete or improper sampling procedures. If the people not interviewed are, for any reason, systematically different from those interviewed, the sample data would be biased. This is a serious problem with nonprobability samples. 2. MEASUREMENT ERROR - Error that results from a variation between the information being sought and that actually obtained by the measurement process. This is a much greater threat to survey accuracy than is random error. [Note: The opinion polls that you see quoted by the media and in marketing research reports note that the results have a "+ or - 5% degree of error." This does not refer to the chance of "total" error; it only refers to the "chance" of random (sampling) error, which is often just called CHANCE. Thus, the "+ or -" statement makes no assertions about errors that may be present due to biases and measurement errors. This means that it does not include the remainder of total error, which also includes the possibilities of both the set of sample design errors discussed above, and the set of measurement errors that we now begin to discuss.] Types of Measurement Errors (which tend to obtain the wrong information): a) Surrogate Information Error - Error that results from a discrepancy between the information needed to solve a problem, and that sought by the researcher. This error comes from improper research design, which usually stems from incorrectly defining the basic problem that needs solving.
[This is the error committed by Coca Cola in researching New Coke.] b) Interviewer Error (or interviewer bias) - Error that results from conscious or unconscious bias in the interviewer's interaction with the respondent. This bias is created when respondents give untrue or inaccurate answers due to the interviewer's dress, age, sex, facial expressions, body language, or tone of voice. It is also created when interviewers cheat by inventing answers to the survey questions. c) Measurement Instrument [Questionnaire] Bias - Error that results from the design of the questionnaire or measurement instrument. It can occur due to poor survey design that encourages errors in responses (such as the inclusion of leading questions and/or confusing instructions or survey layout). A researcher should pretest questionnaires to reduce this bias. d) Processing Error - Error that results from incorrect transfer of information from the survey document to the computer. A researcher must check data input several times to prevent this error from occurring even once. e) Non-Response Bias - Error that results from a systematic difference between those who do and those who do not respond to the measurement instrument. This error can result when some individuals in the sample either cannot be reached or refuse to answer the questions in the survey. The researcher must establish procedures to make several attempts on those who do not respond, and then statistically compare the responses of those who responded to the first request versus those who responded to the second or third attempts to get their responses to the survey. If the responses are not significantly different, one can conclude that non-response bias is absent from the survey. f) Response Bias - Error that results from the tendency of people to answer a question falsely, through deliberate misrepresentation, or through unconscious falsification.
For instance, some people deliberately give false answers to appear intelligent (such as guessing rather than responding "don't know"), or to avoid providing embarrassing information, or to conceal information that they feel is confidential. Also, some people give inaccurate responses while trying to be truthful. This can happen due to poor question format or content. Non-Response Bias and Response Bias often are referred to collectively as "Response Error." III. Types of Surveys: A. Door-to-door Interview - An interviewer interviews consumers in their homes (this method usually collects the most reliable and the most volume of data). Why? B. Mall-Intercept - Interview consumers in shopping malls or other high-traffic locations (usually done in public areas or interviewees may be taken to some nearby private area). Also used to screen people for inclusion in focus groups. (this method is relatively simple, yet effective & efficient) C. Executive Interview - Interviews conducted with business people about products or services. (Very expensive and time consuming.) D. From-Home Telephone Interview - Interviewers use their home phones to interview respondents, who are usually consumers & industrial users of products. E. Central Location Telephone Interview - Interviewers make calls from a company facility to reach and interview respondents. This allows the supervisor to unobtrusively monitor the interviewing while it is taking place, and can also facilitate the use of computer assisted interviewing capabilities [see next]. F. Computer Assisted Telephone Interview (CATI) - Central location telephone interviewing in which the interviewer enters answers directly into the computer. This allows the interviewer to input responses directly into the data set which avoids the cost and processing errors associated with manual coding and entering of data into the computer sometime after the interviews generated the raw input data on paper. G. 
Direct Computer Interview - Consumers are intercepted in a mall and interviewed by a computer that asks questions and accepts responses from the consumer's (participant's/subject's) own hand. H. Self-Administered Questionnaire - A questionnaire filled out by the respondent with no interviewer present. Used in mall-intercepts, classrooms and mail surveys. Mail Surveys - Questionnaires are mailed to a sample of consumers or industrial users, along with instructions, postage paid return envelopes, and cover letters. Respondents complete and return the questionnaires by mail. The most serious problem with mail surveys is that the response rates are often very low (e.g., often less than 10%). a) Ad Hoc (one shot) Mail Surveys - Questionnaires are sent to selected names and addresses with no prior (no pretest) or posttest contact. b) Mail Panel - Participants are precontacted and screened, then periodically sent questionnaires for completion to produce data for a series of studies. IV. Factors Determining The Choice of a Particular Survey Method: A. Sampling precision required - cost v. accuracy; mail v. telephone. B. Budget available. C. The need to expose the respondent to various stimuli. D. Quality of data required - multivariate statistical techniques require higher quality (or level of) data. E. Length of questionnaire - can be longer for in-person and mail, but must be short for telephone. F. Special tasks that the respondent must perform. G. Incidence rate - the % of the population thought to participate in the behavior of interest. For instance, if the incidence rate is low, a mall intercept may not be efficient (economical). H. Questionnaire complexity - more complex questionnaires must use in-person and mail surveys; less complex can use telephone. ------------------------------------------------------------------------ CHAPTER 8 I.
Experiment -- refers to a research project constructed such that the researcher (experimenter) changes one element (an explanatory or independent variable) to observe the effect of that change on another element (the dependent variable). An experiment measures the change in the dependent variable created by a specific, controlled change in another variable(s) which is called the independent variable(s). This is done by controlling or holding constant the other independent variables while manipulating the independent variable(s) of interest, and measuring the change created in the dependent variable. Thus, the researcher is an active participant in the research process instead of a passive collector of data as with the survey or observation methods of research. Causation - The demonstrable and predictable effect of one variable on another. (This is better described as "Concomitant Variation" when talking about causation studies.) Demonstrating Causation Requires 3 Things: 1. Correlation - refers to variables that vary together in a predictable manner. 2. Appropriate Time-Order of Occurrence (i.e., appropriate temporal sequence) - to be considered a "cause" of change in a dependent variable, a change in an independent variable must occur "before" an observed change in the dependent variable. 3. Elimination of Extraneous Causal Factors - totally controlling extraneous causal factors in an experiment suggests that the change recorded in the dependent variable is, in fact, due to the change created in the independent variable of interest (assuming that you also satisfy requirements 1 & 2). II. Experimental Settings - are of two types: 1. Laboratory Experiments - Tests done in a sterile environment in which the researcher can control almost all possible causal factors. However, while the laboratory allows the researcher to control the variables involved, the lab may not accurately represent the real marketplace. 
Thus, the research results may not hold up when transferred to (generalized to) the actual marketplace. Thus, lab results are said to have good internal validity, but often lack external validity. This suggests that lab results are more likely to be statistically correct than results from field experiments, but less likely to be generalizable to the population of interest which is always located outside of the laboratory. 2. Field Experiments - Tests conducted outside the laboratory in an actual market environment. A test market is a good example. This solves the problem of realism of the test environment, but factors other than the independent variable(s) of interest may influence the observed changes in the dependent variable of interest because the researcher cannot control all other independent variables that may affect the dependent variable. For instance, the researcher can neither control nor even precisely measure the effects of competitive actions, the weather, the economy, societal trends, the political climate, or other elements of the uncontrollable environment. Thus, field experiments often lack internal validity, while having better external validity. This suggests that the results have a better chance of being statistically wrong, but they are more likely generalizable to other similar market situations, if they are statistically correct. A. Validity - refers to research actually measuring what it attempts to measure. Validity suggests that the measurement device is substantially free from error, including both systematic (measurement and sampling biases) and random error. There are two basic types of validity of concern in experiments: 1. Internal Validity - the extent to which competing explanations for the observed experimental results can be avoided.
Good internal validity suggests that the experiment (or treatment), and not other causal factors, actually produced the differences observed in the dependent variable, so that the results "of this single test" can be trusted. 2. External Validity - The extent to which causal relationships measured in an experiment can be generalized to outside (other) persons, settings, and times. Good external validity suggests that the subjects and the setting of the experiment are similar to those of the population of interest so that the results can be projected to (generalized to) the population of interest. B. Threats to "Internal" Validity - Experiments conducted outside the laboratory are often low in internal validity for the following reasons: 1. History - variables or events, other than those manipulated by the researcher, that occur between the beginning and the end of the experiment can affect the value of the dependent variable measured at the end of the experiment. 2. Maturation - Changes in the subjects that are a function of time (such as aging, hunger, or fatigue) may produce different results over time. 3. Instrumentation - Changes in the calibration of measurement instruments, observers, or settings that take place over time may produce different scores on responses over time. 4. Selection Bias - Important systematic differences may exist between the experimental group and the population of interest. 5. Mortality - Changes in the representativeness of the sample group due to respondents dropping out of the experiment. 6. Testing (or Premeasurement) Effect - Changes in the representativeness of the sample group due to the laboratory setting itself. This can occur because people often respond differently when they know that their behavior is being monitored (called the Hawthorne Effect). A Testing Effect can also result from respondents learning how to respond in repeated testing in an "appropriate" manner, rather than in a "truthful" manner. C.
Threats to "External" Validity - laboratory experiments often lack external validity due to: 1. Surrogate Situations - the lab setting, treatment conditions, or test units differ from those that would be found in the real marketplace of interest. 2. Interaction Effects of "Sample Selection Bias" - suggests that we cannot tell whether the observed differences in our observations of different groups in the experiment are due to our treatments [manipulations of the independent variables] or differences in the compositions of the 2 or more groups involved in the tests. This happens when the separate groups that receive different treatments are systematically different from each other, just because we allowed some bias in our selection of those groups. In other words, if the groups in an experiment are not homogeneous on important characteristics, we cannot say confidently that the test results are due to our experimental treatments of the independent variables. Rather, the results may be due to some combination (or interaction) of the biases present in our selection of the various groups and the treatments we administer to the independent variables. 3. Reactive (or Interactive) Effects of "Testing" - refers to the fact that respondents tend to change their reactions due to the learning process they naturally proceed through because of being measured before exposure to the treatment (called the Preexposure Measurement or Pretest), and then measured again on the same questions after the treatment (the Postexposure Measurement or Posttest). In other words, asking questions before the treatment causes the subjects to increase their sensitivity to the specific points they were just tested on, because they believe that they will be tested on those same points after the treatment and they want to give proper answers. And, they are correct; they are tested on the same points.
In fact, the posttest usually consists of the exact same questions, although the order may be rearranged. Respondents then often try to answer the questions "correctly" rather than the way they really feel or think. III. Test Markets A. Test Market - Testing of a new product or some element of the marketing mix using experimental or quasi-experimental designs. Also often refers to the locality in which the testing is done. B. Test Market Usage and Objectives - Test markets are used to test new products, changes in existing products, or new marketing programs to estimate what would happen if they were introduced nationally. Test markets also allow the firm to gain experience in the physical distribution, shelf life, and storage problems of a new or reformulated product. IV. Experimentation: Basic Issues Experimental Design - A test in which the researcher has control over one or more independent variables and manipulates them. [Nonexperimental designs - involve NO manipulation. They are called EX POST FACTO (after the fact) RESEARCH, where an effect is observed and then some attempt is made to attribute that effect to some causal factor.] 1. Treatment - the independent variable that is manipulated in an experiment. Let's use price as an example of the variable to be treated (or manipulated). An example would be changing prices to see how much the subjects would increase or decrease their number of items purchased based upon, say, 3 levels of prices (called 3 treatment conditions). The variable (sometimes called the single treatment factor) is price, and the treatments on that variable are changing the price to three new levels to see how many items respondents would purchase at each price. 2. Subjects - are the people that participate in the experiment. Subjects are grouped for the experiment as follows: a) A TEST GROUP will receive the treatment. 
These individuals will be exposed to the treatments (three treatment conditions) of the independent variable of interest -- price; and b) The CONTROL GROUP will not receive the treatment. This is the group for whom the independent variable (price) will not be changed during the course of the experiment. 3. Dependent Variable - The element of an experiment that is theorized to be affected by changes in the independent variable(s) (such as price). 4. The Plan (or the Procedure) Deals With - controlling extraneous (or other) causal factors which are independent variables, other than the treatment variable, that can affect the dependent variable. The goal is to control the non-treated variables so that we can clearly see and measure the effect on the dependent variable of our treatments (treatment conditions) to the treatment variable. 5. Experimental Effect - is the effect of the manipulations (treatment conditions) of the treatment variable (the independent variable that is manipulated) on the dependent variable. It is also called the TREATMENT EFFECT - THE SPECIFIC IMPACT OF THE TREATMENT VARIABLE (X) ON THE DEPENDENT VARIABLE (Y). The goal is to determine the specific effect of each treatment condition on the dependent variable. 6. Randomization (R) - is the random assignment of subjects to treatment conditions so that we can ensure an even representation of subjects' characteristics in all groups (test and control groups). This only suggests that, given sufficient size (which takes advantage of the law of large numbers and the central limit theorem), each group will tend to be similar on important characteristics. Thus, the groups are not likely to contain systematic biases just because of their possessing an uneven or dissimilar makeup. 7. Physical Control (of extraneous causal factors) - refers to holding the value or level of extraneous variables constant throughout the experiment.
This allows us to say that the changes we made to the treatment variable caused the changes we observed in the dependent variable. MATCHING - is another method of physical control. Under this approach, respondents are matched in regard to important personal characteristics (e.g., age, income, life-style, etc.) before being assigned to different treatment conditions. This makes each group homogeneous with all other groups, but only with regard to the matched variables. Thus, there are no important differences between test and control groups on the matched variable(s). Randomization is somewhat different because it controls for all variables, not just matched ones. However, randomization ensures only that the groups will "tend" to be similar, not exactly the same as is attempted by matching. 8. Design Control (of extraneous causal factors) - The use of the appropriate experimental "design" to control extraneous causal factors. 9. Statistical Control (of extraneous causal factors) - refers to using ANCOVA (Analysis of Covariance) to adjust for the effects of confounded (confused) variables by statistically manipulating (adjusting) the value of the dependent variable for each treatment condition. V. The Use of Experiments A. Reasons for Not Using Experiments More Often: 1. High Cost 2. Security Issues 3. Implementation 4. Contamination VI. Types of Experimental Designs - refers to designing the experiment to control extraneous factors (i.e., Design Control Factors) A. Experimental Design - a test in which the researcher has control over one or more independent variables and manipulates them. B. Notations Used In Experimental Designs are: 1. X - TREATMENT -- The independent variable (the treatment variable) that is manipulated in an experiment. 2. O - OBSERVATION -- The observed values of each subject's score (e.g., response or test score) for each variable. 3. R - indicates the groups were randomly selected. C. Preexperimental Designs (see Table 8.1 on p.218) 1. 
One Shot Case Study -- research in which all test units (people or test markets) (O) are exposed to the treatment (X) for some period, and then the dependent variable is measured. There is no control group and only one test is administered after the change (treatment). There is no measurement before the treatment. 2. Static-Group Comparison -- research in which no pretest is used to establish a baseline (like #1 above), and nothing is done to ensure an even spread of characteristics between the experimental and control groups (called Non-Equivalent groups). So, this is the same as the One Shot Case Study, but with the addition of a control group who does not receive the treatment (X). This addition allows us to compare the two groups' results concerning changes in the variable of interest. 3. One Group, Pretest-Posttest Design -- research (usually done on changes in an established product or marketing strategy) in which the prevailing condition is considered the pretest measurement, and (like #1 above) the observations measured after the change (treatment) are the posttest, with no control group. D. True Experimental Designs - Research where the researcher has almost complete control over the administration of the treatment(s), and uses an experimental group and a control group, plus assignment to both groups is randomized (see Table 8.2 on p.222). 1. Before and After with Control Group -- research using both an experimental and a control group, assignment to which is randomized, but only the experimental group receives the treatment. Observations are measured both before and after the treatment(s). 2. After-Only with Control Group - A research design using randomized assignment to the experimental and control groups (like # 1 immediately above), that measures observations only after the treatment, rather than before and after. E. 
Quasi-Experiments - Studies where the researcher lacks "complete" control over the scheduling of treatment(s), or where s/he must assign respondents to the treatment(s) in a nonrandom manner. Two Types: Interrupted Time-Series & Multiple Time-Series Designs. ________________________________________________________________________ CHAPTER 9 I. The Concept of Measurement A. Measurement - the process of assigning numbers to objects in accordance with specific rules to represent quantities of attributes. Thus, measurement is a procedure used to assign numbers that reflect the amount of an attribute possessed by an event, person, or object. B. Rule - a guide, method or command that tells a researcher what to do. Such as, "assign #'s 1 through 5 to people according to how strongly they feel about an attribute." C. The Measurement Process (steps): 1. Identify the Construct (Concept) of Interest: A CONSTRUCT is the invented name of a property or concept. It is the name of the thing being studied. Such as: Marital Role, Social Class, etc. CONSTRUCTS -- are abstractions created for research, that simplify and integrate the complex phenomena found in the marketing environment. 2. Define the Concept (Construct): a) First Conceptually - by creating a CONSTITUTIVE DEFINITION of the concept using other concepts and constructs to establish precise boundaries for the concept under study. This is similar to a dictionary definition that simply states the central idea (or concept) under study, which distinguishes it from all other similar concepts. b) Then Operationally - by creating an OPERATIONAL DEFINITION of the concept based upon the observable characteristics that will actually be measured. This definition includes the process for assigning a value to the concept, where the process is stated as a set of measurement rules to be used in the investigation. 
This definition is a bridge between a theoretical (Constitutive) concept and real-world events or factors that can be observed, and thus measured. Therefore, the operational definition of the concept gives meaning to that concept by spelling out what the researcher must do to measure it. 3. Develop a Measurement Scale: A SCALE -- is a set of symbols or numbers so constructed that the symbols or numbers can be assigned by a rule to the individuals (or to their behaviors or attitudes) to whom the scale is applied. Four Major Levels of Measurement Scales: (see Table 9.1 on p.242) a) Nominal Scales - scales that partition data into mutually exclusive and collectively exhaustive categories that may be either equal or not equal. They produce non-metric data, such as male (1) and female (2), that simply IDENTIFIES objects, events, or groups. The Mode is the measure of central tendency. b) Ordinal Scales - are nominal scales that also order the data so that they determine which objects are greater or less than the other. So, they SHOW ORDER. They produce non-metric data that provide information about the relative RANKING of some characteristic possessed by an event, object, or group. For instance, ordinal scales show preferences for each item in a list of attributes about an object, and those preferences are ranked in order of importance to the respondent. However, the amount of difference between responses is unknown. We know only that one item is more important than another, not how much more important. The Mode and the Median are both used as measures of central tendency. c) Interval Scales - are ordinal scales with "equal intervals" between points to show relative AMOUNTS. In this way, they SHOW ORDER AND DIFFERENCES between responses. They may include an "arbitrary zero" point assigned by man - such as with temperature.
Since each point on the scale is equidistant from the ones above and below it, we can measure how much of a trait a consumer does/doesn't have, and then discuss the "Amount Of Difference" between consumers, since the difference between each point on the scale is equal. However, we cannot say that a score of 4 represents twice as much of the trait as a score of 2. [This limitation is mathematically due to the arbitrary, man-made zero we assigned to the scale.] d) Ratio Scales - (never in marketing) -- are interval scales with a meaningful (absolute) zero point, so that the "magnitudes" between any points on the scale can be compared arithmetically. Thus, ratio scales (such as age, weight, height, distance, area, counts, or time) show order, differences, and arithmetically comparable amounts, where a value of 4 is twice a value of 2. Nominal & Ordinal Scales Produce Nonmetric Data Which Must Be Analyzed With Nonparametric Statistical Procedures; However, Interval & Ratio Scales Produce Metric Data That Can Be Analyzed With the Parametric Procedures That You Studied The Most In Statistics, Such As Regression Analysis. II. Reliability & Validity Measurement A. Types of Errors: MEASUREMENT = COMPLETE ACCURACY + ERROR (M = A + E); ERROR = SYSTEMATIC Error + RANDOM Error (E = S + R). 1. Systematic Error - Error that results in a constant bias in the measurements. It results from faults in the measurement instrument or in the measurement process. 2. Random Error - error that affects measurement in a transient, inconsistent manner. It is always present; the only question is "how much." B. Sources of Measurement "Differences:" 1. True Difference (does not involve error) 2. Respondents' Stable Characteristics 3. Temporary Personal Factors 4. Situational Factors 5. Interviewer Bias 6. Questionnaire Wording Bias 7. Measurement Instrument Bias 8. Questionnaire Construction Bias C.
Reliability - measures which are consistent from one administration to the next. Reliability suggests that the measurement instrument is stable, and is considered an "indirect" measure of the validity of a measuring instrument. It is the degree to which the measures are free from random error, and therefore provide "consistent" data. 1. Test-Retest Reliability - the ability of the same instrument to produce almost the same results when used a second time under conditions as nearly the same as possible in each test. STABILITY - refers to a substantial lack of change in results (i.e., few differences in the scores found) from test to retest; when that holds, we say the measuring instrument is stable. Such measurement stability is called Reliability. 2. Equivalent Form Reliability - the ability to produce similar (consistent) results using two measuring instruments, that are as similar as possible, to measure the same object in the same time period. 3. Internal Consistency Reliability - ability to produce similar (consistent) results using different samples to measure a phenomenon during the same time period. a) Split-Half Analysis Technique - a method of assessing the reliability of a scale by dividing into two the total set of measurement items, and then correlating the respondents' results on the two halves. You want high correlations, which indicate high reliability of the measurement device (or measurement scale) because its elements are homogeneous. b) Cronbach-Alpha - computes the mean reliability coefficient estimate for all possible ways of splitting in half a set of individual responses (within the sample) to each item (measure) in a measurement scale (a group of measurements of a single construct). Lack of correlation of any item (a single measure) with the other items (measures) in the measurement scale suggests that the item should be omitted from the scale.
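As a rough illustration of the Cronbach-Alpha idea, the Python sketch below computes an alpha coefficient for a small, made-up set of responses. It is only a sketch under stated assumptions: it uses the common variance-based formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), rather than literally averaging every split-half estimate, and the function name cronbach_alpha and the data in scores are hypothetical, not taken from any textbook table.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Estimate Cronbach's alpha.

    responses: list of rows, one row per respondent; each row holds that
    respondent's scores on the k items of the scale.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*responses))
    item_variances = [variance(item) for item in items]
    # Total scale score for each respondent.
    totals = [sum(row) for row in responses]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 scale.
scores = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]

alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.3f}")
```

For these made-up responses the three items move together closely, so alpha comes out well above the 0.70 rule of thumb used in these notes. Replacing one column with unrelated answers would pull alpha down, which is the signal that the item may not belong in the scale.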
[Note that here, a Measurement Scale refers to a "set" of individual Scaled Measures that, as a group, are intended to measure a single concept (or construct) such as lifestyle or social class.] When one measure does not correlate with the others, the scale (set of measures) is strengthened (increased reliability) by eliminating the uncorrelated measure from the measurement scale. You want an alpha coefficient > 0.70 for the measurement scale to suggest that it has internal consistency (internal reliability) and that the items (individual measures) in the measurement scale should all remain there because they are homogeneous. [Note: A "measurement scale" is usually just called a "SCALE."] D. Validity - the degree to which what was supposed to be measured actually was measured. Validity addresses the extent to which the measurement instrument is free from "both" systematic and random error. 1. Face Validity - the degree to which the instrument "seems" to measure what it is supposed to. This is a judgement call by the researcher that is actually made before measurement, as the questions are designed. However, we usually want to create measures that go beyond our "feeling" that our measures of some concept are valid. Thus, there are three "direct" assessments of the validity of a measurement instrument -- Content Validity, Criterion-Related Validity, and Construct Validity. 2. Content Validity - the degree to which the measures in the measuring instrument represent the universe of the concept under study. It suggests the measurement scale provides adequate coverage of the topic under study. If you have content validity, the measurement instrument adequately covers the most important aspects of the construct (concept) that is being measured. 
This is also "primarily" a judgement matter, based upon specific definitions of items to be measured, literature search, focus groups, expert opinions, and pretests of the measurement scales with open-ended questions added to produce scale-expansion ideas. 3. Criterion-Related Validity (also called Pragmatic Validity) - the degree to which a measurement instrument can predict a variable that is designated a criterion (or dependent) variable. It refers to the usefulness of the measuring instrument as a predictor of some other characteristic or behavior of an individual, and it is shown by high correlations between criterion (dependent) and predictor (independent) variables. a) Predictive Validity - the degree to which the "future" level of a criterion (dependent) variable can be forecast by a current measurement scale (which is the set of predictor [independent] variables). b) Concurrent Validity - the degree to which a criterion (dependent) variable, measured at the "same point in time" as the predictor (independent) variable(s) of interest, can be predicted by the measurement instrument (scale). 4. Construct Validity - refers to understanding the factors that underlie the obtained measurement. It is concerned with the theory behind the mathematical prediction that we make with statistical procedures, and it is the most difficult type of validity to establish. We can observe the behavior that we believe relates to the construct of interest, but we cannot directly observe the construct because the construct is simply a label that we have given to the phenomenon of interest (such as an attitude or a lifestyle). For instance, when we want to measure people's attitude toward the service a firm provides, we create several questions (measures or items) that we think will measure that attitude. 
However, it is possible that the designed group of measures (the scale) actually measures some other attitude, such as attitude toward the total value received from the firm, which likely includes individuals' attitudes toward both the product purchased and the service provided by the retailer. In this instance, the group of measures does not show construct validity because it actually measures a different construct, such as attitude toward the total product offering, rather than attitude toward the "service" provided by the firm. To have construct validity suggests that the measure actually measures what it was designed to measure. That is, each item in the instrument must reflect the construct of interest and also show a high correlation (show internal consistency reliability) with the other items (measures) in the scale (measurement scale) that are intended to measure the same construct. a) Convergent Validity - the degree of association (correlation) among "different" measurement scales (i.e., among different sets of measures of a construct) that purport to measure the same concept (construct). Confirmation of the existence of convergent validity confirms the existence of the construct itself, and such confirmation is determined by high correlations between two or more independent (different) sets of measures (measurement scales) that researchers believe "should" measure the same construct. Think of this as analogous to getting a second, independent opinion, except that the opinions are statistical in nature rather than independent subjective opinions of two individuals. b) Discriminant Validity - refers to the lack of association (correlation) among the sets of measures (scales) of separate constructs that are supposed to be different. 
If two independent sets of measures (scales) that are believed to measure different constructs do "not" correlate highly, we say we have discriminant validity for the measures, and this finding is considered further evidence (in addition to convergent validity) of the existence of construct validity of our set of measures. ATTITUDE MEASUREMENT I. The Nature of Attitudes A. Attitude - An enduring organization (arrangement) of motivational, emotional, perceptual, and cognitive processes with respect to some aspect of our environment. An ATTITUDE is a learned predisposition to respond in a consistently favorable or unfavorable manner toward an object. Attitudes tend to be long lasting and consist of clusters (sets) of interrelated beliefs. Attitudes encompass or reflect our VALUE SYSTEMS, which represent our "standards" (or norms) of good and bad, right and wrong, etc. B. Some Research Results On Attitudes With Respect To Behavior (the Attitude-Behavior link is very complex): 1. The more favorable the attitudes of consumers toward a product, the higher is the incidence of product usage. 2. Conversely, the less favorable the attitude toward the product, the lower the incidence of usage of the product. 3. The more unfavorable people's attitudes are toward the product, the more likely they will stop using it. 4. The attitudes of people who have never tried a product tend to be distributed about the mean in the shape of a normal distribution. 5. When attitudes are based upon actually trying and experiencing a product, attitudes predict behavior quite well. Conversely, when attitudes are based upon advertising, attitude-behavior consistency is significantly reduced, and behavior is therefore difficult to predict. C. Attitudes have three components: 1. Cognitive (cognitions) - knowledge and beliefs about an object or behavior; 2. Affective (affections) - emotional reactions or feelings toward an object or behavior; and 3. 
Conative (conations) - behavioral intentions (& actual behavior) based on the person's attitudes about the object. D. Ways To Effectively Change Beliefs: 1. Changing their belief(s) about specific attributes of the product or brand from negative or neutral to positive. 2. Changing the relative importance of their beliefs toward an attribute of the product or brand; or 3. Adding new beliefs about attributes they may not have considered before. II. Attitude Scales A. Scale - a measurement tool. B. Scaling - procedures for determining quantitative measures of subjective and sometimes abstract concepts (constructs). Scaling refers to the assignment of numbers (or other symbols) to properties (attributes) of objects in order to impart (apply) some of the characteristics of numbers to the properties in question. In other words, we assign numbers to indicants (or indicators) of the properties (attributes) of objects in order to quantitatively measure those properties. This assignment allows us to apply statistical procedures to abstractions (concepts) that would otherwise remain qualitative and therefore unstudied through scientific methods. [Note: This is also the definition of MEASUREMENT in most advanced research texts.] 1. Unidimensional Scaling (used in this textbook) - procedures designed to measure only one attribute of a respondent or object. For instance, we may measure consumers' price sensitivity by using several measurement items (measures), but they will be combined into a single measure with a single name -- price sensitivity. (Price sensitivity is an example of a construct.) [All measurements we have discussed so far refer to unidimensional scaling.] 2. Multidimensional Scaling (MDS) (not covered in this textbook) - procedures designed to measure "several dimensions" of a concept or object. These dimensions are not combined into a single measure of, say, target customers. Thus, they are measured and described along more than one dimension or characteristic. 
Multidimensional scaling is usually used in the development of perceptual maps which position consumers' preferences for various products along the dimensions (attributes) measured. Thus, MDS is used often in product positioning studies. C. Graphic Rating Scales (generates interval level data) -- graphic continuums anchored by 2 extremes presented to respondents for evaluation of a concept (construct) or object. D. Itemized-Ratings Scales (generates interval level data) -- scales in which the respondent selects an answer from a limited number of categories. Graphic and Itemized scales are NONCOMPARATIVE because the respondent makes a judgement about an item, concept, or person without reference to another, separate object, concept, or person. E. Rank-Order Scales (generates ordinal level data) -- scales in which the respondent compares one item with another (or a group of items against each other) and ranks them in order of strength, importance, etc. So, it is COMPARATIVE because the respondent is asked to judge one item (object, concept, or person) against another. F. Paired Comparison Scales (generates ordinal level data) -- scales that ask the respondent to pick 1 of 2 objects in a set based on some stated criteria. G. Semantic Differential (a popular image measurement tool that generates interval level data) -- a method of examining the strengths and weaknesses of a product or company versus the competition. This is accomplished by having respondents rate the product or company between dichotomous pairs of words or phrases that could be used to describe it. Then, the MEAN OF THE RESPONSES for each pair of words or phrases is PLOTTED IN A PROFILE (or IMAGE). H. Likert Scale (generates interval level data for each statement (or attitude)) - A scale in which the respondent specifies a "level" of agreement or disagreement with statements (attitudes) that express a favorable or unfavorable attitude toward the concept under study. 
Total Attitude Score -- the overall attitude score of a respondent (positive or negative) based upon several measures of attitude(s). I. Purchase Intent Scales - scales used to measure a respondent's intention to buy or not buy a product. Consumers are simply asked to make a subjective judgment on their likelihood of buying a new product, and the marketing manager uses past experience to translate their responses on the scale into estimates of purchase probability. This is the most used scale in commercial market research for new products and services, product modifications, new services or service modifications by a retailer, and even by nonprofit organizations. Purchase Intent Scales employ multichotomous questions (which provide >2 possible answers) which are analogous to the multiple choice questions on exams that students are quite familiar with. Purchase Intent Scales have been found to be good predictors of consumer choice of FREQUENTLY PURCHASED AND DURABLE CONSUMER PRODUCTS.
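The translation of purchase intent responses into an estimate of purchase probability, described above, can be sketched roughly as follows in Python. The answer categories, the response counts, and especially the weights are hypothetical values chosen for illustration; in practice the weights would come from the manager's own past experience with similar products.

```python
# Hypothetical counts of answers on a 5-point purchase intent scale.
intent_counts = {
    "definitely will buy": 20,
    "probably will buy": 30,
    "might or might not buy": 25,
    "probably will not buy": 15,
    "definitely will not buy": 10,
}

# Hypothetical weights: the assumed share of respondents in each
# category who actually go on to buy (assumed, not from the notes).
purchase_weights = {
    "definitely will buy": 0.70,
    "probably will buy": 0.35,
    "might or might not buy": 0.10,
    "probably will not buy": 0.0,
    "definitely will not buy": 0.0,
}

def estimated_purchase_rate(counts, weights):
    """Weighted share of the sample expected to purchase."""
    total = sum(counts.values())
    expected_buyers = sum(counts[answer] * weights[answer] for answer in counts)
    return expected_buyers / total

estimate = estimated_purchase_rate(intent_counts, purchase_weights)
print(f"Estimated purchase rate: {estimate:.0%}")
```

Discounting the stated intentions in this way reflects the idea that not everyone who says they will buy actually does.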
------------------------------------------------------------------------ CHAPTER 10 DEFINITIONS: Questionnaire - a set of questions designed to generate the data necessary for accomplishing the objectives of the research project. Editing - going through the questionnaire to ensure that the Skip Patterns (which refer to THE SEQUENCE IN WHICH THE QUESTIONS ARE ASKED) were followed and the required questions were filled out. Coding - conversion of respondents' answers to numeric values. Open-Ended Questions - questions that ask the respondent to reply in his or her own words. Closed-Ended Questions - questions that ask the respondent to choose from a list of answers. Closed-Ended Questions are of three types: Dichotomous, Multiple Choice (Multichotomous), & Scaled-Response Questions. Dichotomous Questions - questions that ask the respondent to choose between 2 answers. Multiple Choice (Multichotomous) Questions - questions that ask a respondent to choose among a list of > 2 answers. Scaled-Response Questions - multiple choice questions where the choices are designed to capture the intensity of the respondent's answer.
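To make the Coding and Editing definitions concrete, here is a minimal Python sketch that converts closed-ended answers to numeric values and flags unanswered questions. The question names, answer lists, and code values are all hypothetical; an actual codebook would be built from the questionnaire itself.

```python
# Hypothetical codebook: one entry per closed-ended question.
CODEBOOK = {
    # Dichotomous question (2 possible answers)
    "owns_car": {"No": 0, "Yes": 1},
    # Multiple choice (multichotomous) question (> 2 possible answers)
    "store_type": {"Supermarket": 1, "Discount store": 2, "Online": 3},
    # Scaled-response question (choices capture intensity)
    "satisfaction": {
        "Very dissatisfied": 1,
        "Dissatisfied": 2,
        "Neutral": 3,
        "Satisfied": 4,
        "Very satisfied": 5,
    },
}

def code_questionnaire(answers):
    """Convert one respondent's answers to numeric codes.
    A missing or unrecognized answer is coded as None."""
    return {q: codes.get(answers.get(q)) for q, codes in CODEBOOK.items()}

def unanswered(coded):
    """Editing check: list the questions left without a valid code."""
    return [q for q, value in coded.items() if value is None]

raw = {"owns_car": "Yes", "store_type": "Online", "satisfaction": "Satisfied"}
coded = code_questionnaire(raw)
print(coded, unanswered(coded))
```

Open-ended answers are left out of this sketch because they must first be grouped into categories before numeric codes can be assigned.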