EDUC 2201 University of California Irvine Statistics Questions

Please use the following scenario to answer questions 47–50.

47. Brian is designing a research study to examine the effects of a new curriculum for graduate-level introductory statistics courses. He hopes that this new curriculum will improve graduate students’ statistical understanding when compared with traditional curriculums.

Brian comes to you for help with his sampling method; he is unsure whether to use probability or non-probability sampling. Explain the pros/cons of each type of sampling to Brian and why his choice is important in the context of his research project (make sure to identify his population of interest). In your answer, provide a specific sampling technique that Brian could use.

48. Brian also needs help with operationalizing his dependent variable. Describe what it means to operationalize a variable and identify the dependent variable in his study and offer one example of how Brian could operationalize it.

49. Brian was hoping to collect data using a testing instrument. Define reliability and validity with respect to measuring instruments AND explain why Brian should be concerned with these technical attributes when choosing the instrument.




50. Brian is not sure if high reliability can exist with low validity. He also is not sure if high validity can exist with low reliability. Explain your answer to both of these in the context of his study.

EDUC 2201/EFOP 2001 Introduction to Research Methodology
The nature of research and Ethics in Research (1)
Lecture Note by: Shangmou Xu
Week 1 – Lecture Note
Overview
Welcome to Week 1! I hope you got a chance to go over the syllabus and figure out the flow of
this course. If not, please spend ten minutes reading through the document and let me know if
you have any questions. This course begins with a two-week lecture sequence featuring the nature
of research and related ethics issues. There will be two Discussions and one Assignment associated
with this topic: Discussion 1 (due this week), and Discussion 2 and Assignment 2 (due next
week). Week 1’s lecture focuses on the nature of research.
The first part of the lecture for Week 1 provides a general orientation to the nature of research,
the value of research, and the relevance of research to you as a graduate student. Unlike other
ways of finding the answers to questions or gaining knowledge, research relies on the
scientific/disciplined inquiry approach. The second part of this module focuses on the
distinguishing characteristics of the scientific/disciplined inquiry approach and the strengths and
limitations of this approach. In the third part of the module, you’ll learn why it is useful to
classify research, and you’ll be introduced to two main “families” of research methods,
quantitative methods and qualitative methods. Finally, you’ll become acquainted with four types
of quantitative methods: descriptive, correlational, causal-comparative, and experimental.
Textbook Reading (optional if you haven’t got your book)
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 1, pp. 4–19
Edition 11: Chapter 1, pp. 4-18
Lecture slides
Please read through Week 1 Slides
Key Point

Note about the research process (Week 1 slides, pages 11-12)
o Let’s talk more about step 2 and step 3. The data collection procedure usually
contains two main sub-steps: selecting participants and selecting a method to collect
data. The first sub-step, selecting participants, is also referred to as sampling, which
we will discuss in Week 4. In the second sub-step, selecting a method to collect
data, we need to consider the data collection instrument and the way that we use
the instrument. An instrument could be a standardized test (like the GRE or SAT), a
researcher-developed survey (like any survey you’ve experienced so far), a set
of interview questions, or even a tape measure to measure students’ height.
Basically, anything that can “give” us data during research is an instrument. We
will address the “how to use the instrument” question later in this course.
o Data can take very different forms. It is not limited to numeric, or
quantitative, data. Data can also be qualitative, such as
interview transcripts (which is why interview questions are a data collection instrument).
The way that we deal with existing data is called data analysis. Data analysis can
be as simple as calculating the average of students’ test scores, or as
complicated as model building. Summarizing interview transcripts is
also a good example of data analysis.

Among the different ways of classifying research, please pay close attention to “classification
by method” (slides pages 21-25)
o There are no good or bad research methods. It’s always about “picking the
appropriate one”. Both quantitative and qualitative research can be extremely
useful when your research questions/goals match the research methods. From the slides
you might have already figured out the basic difference between quant and qual
methods and what these methods can do. Please don’t worry if you can’t
understand all these differences. Let’s start with a quick discussion about these
two methods (see Discussion 1). Remember, there are no right or wrong
answers in the discussion. Next week, we will talk more about quant and qual methods
with detailed examples.

A few words about “subtypes of quantitative research” (slides pages 26-34)
o I introduced four subtypes of quant research in the slides: 1) descriptive research, 2)
correlational research, 3) causal-comparative research, and 4) experimental
research. Although from 1) to 4) we consider more design factors and design in a
more rigorous way, all four subtypes can be really useful. Almost every
quantitative study starts with a descriptive analysis. I would say descriptive
analysis is a really powerful tool for communicating your research to people from
many different fields. So please keep in mind that, even though you will learn many
complicated research methods, you shouldn’t forget to describe your participants’
characteristics, trends, and context to your audience.
o About 4) experimental research: only in experimental research can researchers
actively manipulate independent variables. In the other three research designs,
researchers can only use pre-existing conditions to conduct research. In Week 9,
we will learn about different types of experimental research.
EDUC 2201/EFOP 2001 Introduction to Research Methodology
The nature of research and Ethics in Research (2)
Lecture Note by: Shangmou Xu
Week 2 – Lecture Note
Overview
Welcome to Week 2! I hope you got a chance to go through Week 1’s lecture and have a basic
understanding of the nature of research. This week, we will discuss ethics in research. Any
research involving humans as participants must comply with fundamental ethical principles that
are enforced by the federal government. In this lecture, you will learn about these principles in
conjunction with learning about three experiments that were carried out prior to the time when
the federal government began to regulate research ethics: the Tuskegee Syphilis Study,
Milgram’s study on obedience, and the Stanford Prison Study. This is an important lecture
because it goes over the different ethical issues and some problems in the past. In the
lecture slides, you will be able to view three videos about these studies, all of which have
questionable ethics. These videos can be disturbing, so you may want to
take a break after watching them and then come back for reflection. When reflecting on these videos,
I would like you to pay attention to these questions: 1) did the participants consent to the study, 2)
were the participants fully aware of the study, 3) were the researchers truthful to participants?
And most importantly, no research should do any harm to any participants.
There will be a Discussion assignment about ethics in research, along with Assignment 2, which
will cover both topics we’ve discussed so far.
Textbook Reading (optional if you haven’t got your book)
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 1, pp. 19-27
Edition 11: Chapter 1, pp. 18-27
Lecture slides
Please read through Week 2 Slides
Exercise
This exercise will give you a quick review and practice of research classification. This will be
helpful for Assignment 2 and the midterm, so please be sure to check the items and answers.
Read each of the abstracts below. First, decide whether the study described is qualitative or
quantitative. Second, if the study is quantitative, decide whether it can best be classified as
descriptive, correlational, causal-comparative, or experimental.
1. The objective of the study was to characterize the personality types of pre-service
teachers. The Myers-Briggs Type Indicator (a personality inventory) was administered to
288 students enrolled in teacher-preparation programs. Results of analysis of students’
scores indicated that the Sensing-Feeling-Judging (SFJ) profile described a large
proportion of the students.
Answer: quantitative, descriptive
Quantitative because of large sample size, and reference to analysis of students’ scores
Descriptive because purpose is to describe personality types
2. In this study, the authors examined the personality correlates of anorexic
symptomatology in a sample of 197 female undergraduates. Statistical analysis
demonstrated that among the personality traits considered in the study, only
obsessiveness and emotional dependence were related to anorexic symptoms.
Answer: quantitative, correlational
Quantitative because of large sample size and reference to statistical analysis
Correlational because purpose is to find personality traits related to anorexic symptoms
3. The study investigated the effect of two theory-based intervention methods (active
processing and metacognitive) on kindergarten students’ ability to generate questions.
Ninety-three kindergarten students were randomly assigned to one of three groups: the
active processing group, the metacognitive group, and a conventional control group.
Results found that the average number of questions generated was higher in both
intervention groups than in the control group.
Answer: quantitative, experimental
Quantitative because of large sample size and reference to “average number of questions
generated”
Experimental because independent variable is manipulated (researcher implements two
intervention methods and a control condition) and students are randomly assigned to treatments.
4. This study explored the role of emotion in parent-adolescent conversations on career-related
issues. The transcripts of two parent-adolescent conversations about career were
analyzed in detail. It was found that emotion served to energize action and lend context to
the adolescents’ process of building an understanding of career.
Answer: qualitative
Qualitative because of small sample size (two conversations) and reference to detailed analysis
of transcripts.
5. This study examined the effect of maternal substance abuse on the developmental level of
toddlers. Developmental level was quantified on the basis of characteristics of the
toddlers’ play activities. The researcher observed the play activities of two groups of
toddlers: toddlers whose mothers were substance abusers and toddlers whose mothers
were not substance abusers. There were 40 toddlers in each group. It was found that the
average developmental level of toddlers whose mothers were substance abusers was
significantly less advanced than the level of toddlers whose mothers were not substance
abusers.
Answer: quantitative, causal-comparative
Quantitative because of relatively large sample size and reference to quantifying the
developmental level of toddlers.
Causal-comparative because of nature of independent variable (researcher has no influence over
whether a mother is a substance abuser; pre-existing groups are compared).
6. The purpose of this study was to gain insight into elementary teachers’ perceptions about
the consequences of high-stakes standardized achievement testing. In-depth interviews
were conducted with seven teachers in an inner-city elementary school in southern
California that has a large population of Hispanic students. The themes that emerged
from the researcher’s analysis of transcripts of the interviews included teachers’
experience of stress due to the rapidly growing emphasis on standardized testing, and
their fears that students’ learning would suffer because of a narrowing of the curriculum.
Answer: qualitative
Qualitative because of small sample size, reference to in-depth interviews, and reference to
researcher’s analysis of transcripts in order to identify themes.
EDUC 2201/EFOP 2001 Introduction to Research Methodology
The Research Process and Literature Review
Lecture Note by: Shangmou Xu
Week 3 – Lecture Note
Overview
Welcome to Week 3! I hope you have already figured out what’s going on with this course. In
Week 1 and Week 2, we went over some basic ideas of social science research as well as ethics
issues. Please note that it’s always required (even if you are going to conduct a secondary data
analysis) to get the ethics issues reviewed before conducting any research. In Assignment 2, I
asked you to read two empirical studies and summarize the research problem, data collection
instruments, data analysis methods, and major conclusion. These four steps are part of a standard
social science research process. In Week 3, I will introduce the whole research process step by
step. After this lecture, you will be able to read an empirical research article step by research
step and understand the flow of a study. That’s why I will briefly introduce the literature review
right after this: by breaking empirical research apart based on research steps, you will be able to
sort multiple research articles by their similarities and organize them. In Discussion 3 and
Assignment 3, you will have chances to try out your ability to review literature.
Textbook Reading (optional if you haven’t got your book)
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 2, pp. 61–74
Chapter 3, pp. 79–81, 93–100 (skim pages 82–92)
Edition 11: Chapter 2, pp. 71-84
Edition 11: Chapter 3, pp. 89-91, 101-108 (skim pages 92-100)
Lecture slides
Please read through Week 3 Slides, part 1 and part 2
Exercise
This exercise will give you a quick review and practice of identifying independent and dependent
variables. This will be helpful for the midterm, so please be sure to check the items and answers.
Part 1. Name the independent and dependent variables in each of the studies described below.
1. A social psychology experiment was conducted to investigate the effect of size of group
on the length of time required for the group to reach agreement on a strategy for solving a
problem. There were three treatment conditions: groups of 5, 10, and 15 subjects
respectively.
Independent variable: size of group
Dependent variable: time required to reach agreement
2. A physical therapy experiment was conducted to investigate the effect of endurance-promoting
vs. strength-promoting exercises on stroke patients’ ability to perform
activities of daily living (such as dressing, brushing teeth, etc.). Stroke patients were
randomly assigned to one of two groups. One group was led by a physical therapist in
endurance-promoting exercises; the other group was led by a physical therapist in
strength-promoting exercises.
Independent variable: type of exercise (endurance-promoting or strength-promoting)
Dependent variable: ability to perform activities of daily living
3. An educational technology experiment was conducted to investigate the effect of design
of computer screen on the length of time required for subjects to locate a target image.
Subjects were randomly assigned to one of three screen design conditions: one image per
screen, four images per screen, and 16 images per screen.
Independent variable: design of computer screen
Dependent variable: time required to locate a target image
4. A counseling psychology experiment was conducted to investigate the effect of a career
development workshop on college students’ degree of indecision about choosing a career.
A pretest on career indecision was given to all subjects. Subjects then were randomly
assigned to one of two groups. One group attended a one-day career development
workshop led by career counselors. The other group did not attend the workshop.
Following the workshop, a posttest on career indecision was given to all subjects.
Independent variable: attendance at a career development workshop
Dependent variable: career indecision
5. An experiment in foreign language education was carried out to investigate the effect of
organizational devices on students’ comprehension of a text in the foreign language.
Students were randomly assigned to one of two conditions. In the first condition, students
read a text enhanced by organizational devices; in the second condition they read the
same text without the organizational devices. All students were then given a test to
measure their comprehension of the text.
Independent variable: use of organizational devices
Dependent variable: comprehension of text
Part 2. Please identify the hypothesis type for each of the following hypotheses (selecting from
non-directional research hypothesis, directional research hypothesis, and null hypothesis).
1. The level of background noise will have an effect on older adults’ performance on a word
recognition task.
Answer: non-directional research hypothesis
2. Male students will perform better than female students on an advanced calculus test.
Answer: directional research hypothesis
3. The average achievement of graduate students in a Web-based introduction to research course
will be NO different than the average achievement of students enrolled in a traditional introduction
to research course.
Answer: null hypothesis
4. Adult males who participate in warm-up exercises before lifting weights will experience less
muscle soreness than adult males who do not participate in warm-up exercises.
Answer: directional research hypothesis
5. Amount of previous computer experience will have an effect on nurses’ attitudes toward the
use of electronic patient charts.
Answer: non-directional research hypothesis
6. Attendance at a career information workshop will have NO effect on the career knowledge of
undergraduate students.
Answer: null hypothesis
EDUC 2201/EFOP 2001 Introduction to Research Methodology
Sampling
Lecture Note by: Shangmou Xu
Week 4 – Lecture Note
Overview
Welcome to Week 4! In Week 3, I introduced the basic research process, and you may now
understand that many, if not all, research projects start with clearly stated research question(s).
While this course does not require you to conduct any research or empirical analysis, I would like
you to understand that your methodology selection is based on the research goal/questions; that
is, any subsequent research procedures will “work for” the research goal. Therefore, there are no
“right or wrong” or “good or bad” methods. Instead, it’s all about matching methods with
research goals/questions. Please keep that in mind because we are about to learn some “real”
research methods and it’s super helpful to think about the “research goal” all the time.
Think about any research project: what’s the next step after deciding on the research questions?
My answer would be: deciding on the population and sample. For example, my research goal is to
understand academic motivation among US high school students and provide policy
suggestions. My research question is what factors motivate US high school students to
participate in academic activities. In this case, our population would be all US high school students.
And what about sample? Well, typically we are not able to obtain information from every
individual from the population and we have to “select” some individuals from the population and
generalize any results we get from these individuals to the whole population. These selected
individuals form my sample for this specific study and the selection process is called sampling.
Depending on the extent to which we would like to generalize the research results to the
whole population, the representativeness and the size of the sample might be different. Using the
academic motivation project as an example, our goal is to understand high school students
across the whole country, so we want our results to be super generalizable. Therefore, our sample
should have a high level of representativeness to ensure we can generalize the results. Of course,
we may expect that the sample size would be large. Another example would be a research project
about student computer lab usage among Pitt SoE students, where the goal is to improve the
experience of using the SoE computer lab. In this case, the population is all Pitt SoE students. We
still want some characteristics to be representative in our sample, like having both Master’s and
Doctoral students, or having students from every department, but the level of representativeness
would be lower than the national representativeness study.
If our sample is not able to represent the intended population and some members of the
intended population have a lower or higher sampling probability than others, our sample is
biased; this is called sampling bias. Sampling bias occurs when the characteristics of a sample
differ in a systematic way from the population. Every sample is biased to some degree because no
sample can 100% represent the population. However, in most cases, we are not chasing 100%
representativeness, so we can allow some sampling bias. In the exercise and assignments, we will
discuss the potential sources of sampling bias.
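To see what “differs in a systematic way” can look like, here is a small Python simulation. It is only a sketch under made-up assumptions: the population, the 20% share of honors students, and the motivation scores are all hypothetical and are not taken from the slides or the textbook.

```python
import random
import statistics

def simulate_sampling_bias(seed=0, population_size=10_000, sample_size=200):
    """Compare a simple random sample with a sample drawn from only one subgroup."""
    rng = random.Random(seed)

    # Hypothetical population: about 20% are honors students, who score higher on average.
    population = []
    for _ in range(population_size):
        honors = rng.random() < 0.2
        score = rng.gauss(75 if honors else 60, 10)
        population.append((honors, score))

    # Simple random sample: every student has the same chance of being selected.
    random_sample = rng.sample(population, sample_size)

    # Biased sample: only honors students can be selected, so everyone else
    # has a sampling probability of zero.
    honors_only = [member for member in population if member[0]]
    biased_sample = rng.sample(honors_only, sample_size)

    return {
        "population_mean": statistics.mean(score for _, score in population),
        "random_sample_mean": statistics.mean(score for _, score in random_sample),
        "biased_sample_mean": statistics.mean(score for _, score in biased_sample),
    }

for name, value in simulate_sampling_bias().items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, the random sample mean should land close to the population mean, while the honors-only sample mean should sit well above it no matter how many students we draw. That persistent gap, rather than random wobble, is the systematic difference.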
Because we may accept some sampling bias in our research, people
select different sampling methods based on the research goal. It’s important to note that when
thinking about probability sampling methods and non-probability methods, there are no better or
worse methods. Again, you want to match your research goal with your sampling method. Please
proceed to the textbook reading and slides to learn more details about sampling methods.
Textbook Reading
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 5, pp. 129–141
Edition 11: Chapter 5, pp. 137-149
Lecture slides
Please read through Week 4 Slides
Exercise
This exercise will give you a quick review and practice of identifying sampling methods and
bias. This will be helpful for the midterm, so please be sure to check the items and answers.
Part 1. Please identify the specific sampling method used in the following situations:
1. A study was done to determine the age, number of times per week, and the duration
(amount of time) of residents using a local park in city A. The first house in the
neighborhood around the park was selected randomly, and then the resident of every
eighth house in the neighborhood around the park was interviewed.
Answer: Systematic sampling
2. A woman in the airport is handing out questionnaires to travelers asking them to evaluate
the airport’s service. She does not ask travelers who are hurrying through the airport with
their hands full of luggage, but instead asks all travelers who are sitting near gates and
not taking naps while they wait.
Answer: Convenience sampling
3. A teacher wants to know if her students are doing homework, so she randomly selects
rows two and five and then calls on all students in row two and all students in row five to
present the solutions to homework problems to the class.
Answer: Cluster
4. The marketing manager for an electronics store wants information on which brands are the most
popular, and she thinks it varies by age of the customer. She randomly selects 100
customers from 5 different age groups and gives them questionnaires to complete.
Answer: Stratified
5. The librarian at a public library wants to determine what proportion of the library users
are children. The librarian has a tally sheet on which she marks whether books are
checked out by an adult or a child. She records this data for every fourth patron who
checks out books.
Answer: Systematic
6. A political party wants to know the reaction of voters to a debate between the candidates.
The day after the debate, the party’s polling staff calls 1,200 randomly selected phone
numbers. If a registered voter answers the phone or is available to come to the phone, that
registered voter is asked whom he or she intends to vote for and whether the debate
changed his or her opinion of the candidates.
Answer: Simple Random
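The four probability sampling methods named in the answers above can be sketched in a few lines of Python. This is a minimal illustration, not a research-grade tool: the function names, the population of 100 numbered members, and the classroom rows are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random starting point, then take every k-th member."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample randomly within each stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every member of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)
population = list(range(1, 101))  # hypothetical members numbered 1 to 100
rows = {"row 1": ["Ann", "Ben"], "row 2": ["Cy", "Dee"], "row 3": ["Eli", "Flo"]}

print(simple_random_sample(population, 5, rng))
print(systematic_sample(population, 8, rng))                    # every eighth member
print(stratified_sample(population, lambda m: m % 5, 2, rng))   # 2 from each of 5 strata
print(cluster_sample(rows, 2, rng))                             # all students in 2 rows
```

Notice the difference between the last two: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.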
Part 2. Please read the following sampling design and discuss: (1) whether the sample is
representative of the population and (2) the potential sources of bias of the study.
Several online textbook retailers advertise that they have lower prices than on-campus bookstores.
However, an important factor is whether the Internet retailers actually have the textbooks that students need in stock.
Students need to be able to get textbooks promptly at the beginning of the college term. If the book is not available,
then a student would not be able to get the textbook at all, or might get a delayed delivery if the book is back
ordered. A college newspaper reporter is investigating textbook availability at online retailers. He decides to
investigate one textbook for each of the following seven subjects: calculus, biology, chemistry, physics, statistics,
geology, and general engineering. He consults textbook industry sales data and selects the most popular nationally
used textbook in each of these subjects. He visits websites for a random sample of major online textbook sellers and
looks up each of these seven textbooks to see if they are available in stock for quick delivery through these retailers.
Based on his investigation, he writes an article in which he draws conclusions about the overall availability of all
college textbooks through online textbook retailers.
Is this sample representative of all college textbooks? What are the possible sources of bias in this
study?
Sample answer: The sample is not representative of the population of all college textbooks. Two
reasons why it is not representative are that he only sampled seven subjects and he only
investigated one textbook in each subject. There are several possible sources of bias in the study.
The seven subjects that he investigated are all in mathematics and the sciences; there are many
subjects in the humanities, social sciences, and other subject areas, (for example: literature, art,
history, psychology, sociology, business) that he did not investigate at all. It may be that
different subject areas exhibit different patterns of textbook availability, but his sample would
not detect such results.
EDUC 2201/EFOP 2001 Introduction to Research Methodology
Descriptive Statistics
Lecture Note by: Shangmou Xu
Week 5 – Lecture Note
Overview
Welcome to Week 5! In Week 4, I introduced sampling methods and sampling bias.
Following the research procedure, the next step would be data collection. But before discussing
how to select a data collection (measurement) instrument, I want to spend a whole week introducing
some basic statistical methods: descriptive statistics. The reason behind this decision is that you
might need this skill at any step of your research, even if it’s a qualitative research project.
For example, after deciding on your sample (participants), you may need to conduct a basic
descriptive analysis of the characteristics of your participants. So, let’s get into this. If you are
not confident about your mathematical skills (like me), don’t worry because we won’t focus on
mathematical calculation. Instead, I would like you to focus on the interpretation and application
of certain statistical methods (i.e., asking yourself why we use this method and what we can do
with this method).
In this lecture, we will introduce three ways to summarize data: distribution, central
tendency, and variability. To summarize the data in terms of distribution, we can use a
frequency distribution. For example, for a test, I identify, for each score, the frequency,
or the number of individuals who obtained that score. Frequency distributions and graphs help
portray what the distribution looks like. Another way is to look at the shape of the distribution.
We can have a normal distribution with a symmetrical bell shape. We can also have a positively
skewed distribution or a negatively skewed distribution. Please see the slides for more detail. For
central tendency, we have three measures: the mode, the median,
and the mean. The mode is the most frequently occurring score. The median is the point on the
score scale at which 50 percent of the scores fall above and 50 percent fall at or below. The last
measure of central tendency is the mean, which is the arithmetic average.
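If you would like to see these summaries computed, here is a minimal Python sketch using the standard library. The ten test scores are made up for illustration, and `summarize` is just a hypothetical helper name.

```python
from collections import Counter
import statistics

def summarize(scores):
    """Return the frequency distribution and three measures of central tendency."""
    frequency = Counter(scores)  # maps each score to how many students obtained it
    return {
        "frequency": dict(sorted(frequency.items())),
        "mode": statistics.mode(scores),      # most frequently occurring score
        "median": statistics.median(scores),  # middle of the sorted scores
        "mean": statistics.mean(scores),      # arithmetic average
    }

# Hypothetical test scores for ten students.
scores = [70, 75, 75, 80, 80, 80, 85, 85, 90, 100]
summary = summarize(scores)

# A text "graph" of the frequency distribution: one # per student.
for score, count in summary["frequency"].items():
    print(f"{score:>3} {'#' * count}")
print("mode:", summary["mode"], "median:", summary["median"], "mean:", summary["mean"])
```

With these scores the mode and median are both 80, while the mean is 82. The single score of 100 pulls the mean above the median, which is a small-scale picture of a positively skewed distribution.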
Measures of variability provide you with an indication of the extent to which the scores
vary from one another. Another way of saying this is the degree of dispersion in the scores, or how
spread out or how close together the scores are. If the scores are close together rather than spread out, then they’re
going to lack variability. If the scores are spread out, then you’re going to have more variability.
Standard deviation is the most common measure of variability. The standard deviation
represents, roughly, the average distance between the scores and the mean. Basically, you’re looking
at the deviation of each and every score from the mean. The more the scores deviate from the
mean, the higher the standard deviation; the closer the scores are to the mean, the lower the
standard deviation. The standard deviation is influenced by every score.
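For those who want to see where the number comes from, here is a short Python sketch. The two sets of scores are hypothetical, and `population_sd` is a helper written for this illustration. Strictly speaking, the standard deviation is the square root of the average squared deviation from the mean, which is why “average distance” is a rough description rather than an exact one.

```python
import math
import statistics

def population_sd(scores):
    """Square root of the average squared deviation from the mean."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]  # every score contributes
    return math.sqrt(sum(squared_deviations) / len(scores))

# Two hypothetical classes with the same mean (80) but different spread.
close_together = [78, 79, 80, 81, 82]
spread_out = [60, 70, 80, 90, 100]

print(round(population_sd(close_together), 2))  # small: scores sit near the mean
print(round(population_sd(spread_out), 2))      # large: scores sit far from the mean

# The standard library gives the same population value.
print(math.isclose(population_sd(spread_out), statistics.pstdev(spread_out)))
```

When the scores are a sample rather than the whole population, the usual formula divides by n - 1 instead of n; in Python that version is `statistics.stdev`.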
We can do a lot of things with a normal curve. For example, this week we will
introduce relative scores. A relative score is a common way to locate a score from an
individual participant (for example, the test score of one student in a class) in a normal
distribution. The ultimate goal is to compare the individual’s score (or any variable that can be
quantified) to the population’s overall performance. For example, I took an online programming
course and I got 85 on the final exam. Am I doing well? Well, yes, because I’m satisfied with
my whole learning process and I’m just comparing to myself. I didn’t know programming before
and now I can do some stuff. Am I doing well in relation to other students in the same class? I
don’t know, unless I have some way to know the distribution of final exam scores of all the
students in the class. The basic way to do the comparison is to know the percentile rank. For
example, if I ranked at the 70th percentile for the programming class (with 85 points on the final exam), I
did better than 70% of the students and only 30% of the students did better than me. We can also
do this comparison in many different ways. For example, if I know the mean score and standard
deviation of the final exam, I can estimate my relative position by calculating how many
standard deviations I’m located above/below the mean (also called the z-score). If the mean is 80
and the standard deviation is 5, then 85 is one standard deviation above the mean: (85-80)/5=1
(I’m doing better than many students). If the mean is 90 and the sd is 5, I’m one standard
deviation below the mean: (85-90)/5=-1 (I’m doing worse than many students). How many students exactly?
Well, you need to check the normal table for reference (see the big chart in slides or book). In a
normal distribution, approximately 68% of the people are within plus 1 and minus 1 standard
deviation of the mean. In that chart, you can find a lot of different ways to locate my score.
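If you don't have the normal table at hand, the same lookup can be sketched with Python's standard library. This is only an illustration of the programming-course example above; it assumes the final exam scores are normally distributed, and the function name z_score is my own.

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

z_if_mean_80 = z_score(85, 80, 5)   # one sd above the mean
z_if_mean_90 = z_score(85, 90, 5)   # one sd below the mean

standard_normal = NormalDist()  # mean 0, sd 1

# Proportion of students expected to score below me in each scenario.
below_if_mean_80 = standard_normal.cdf(z_if_mean_80)  # about 0.84
below_if_mean_90 = standard_normal.cdf(z_if_mean_90)  # about 0.16

# Proportion of scores within one sd of the mean (the "68%" rule of thumb).
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)  # about 0.68

print(z_if_mean_80, round(below_if_mean_80, 2))
print(z_if_mean_90, round(below_if_mean_90, 2))
print(round(within_one_sd, 2))
```

So with a mean of 80, a score of 85 sits at roughly the 84th percentile; with a mean of 90, the same score sits at roughly the 16th percentile.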
Please proceed to textbook reading and slides. I also prepared a lot of exercises for you to
practice. Please be sure to check every item in this lecture note.
Textbook Reading (Required)
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 12, pp. 319-332
Edition 11: Chapter 17, pp. 482-496
Lecture slides
Please read through Week 5 Slides
Exercise
This exercise will give you a quick review and practice of applying descriptive statistics to
different research areas. This will be helpful for the midterm, so please be sure to check the items
and answers.
Part 1. The tables below show frequencies and statistics for a multiple-choice exam with a
maximum score of 43 points. Refer to these tables to answer Questions 1-3.
1. How would you explain the fact that the three measures of central tendency are not equal to
each other? (Hint: look at the frequency table.)
Answer: The few low scores would affect the mean more than they would the median or the
mode.
2. Did the class on the average do well on the exam?
Answer: Yes
3. Would the range alone provide an adequate description of the variability in students’ scores?
Answer: No. Because the lowest score is 29, the range might suggest that scores were more
variable than they really were. The range only takes the highest and lowest score into account.
Part 2. Would you expect each of the distributions described below to be positively skewed,
negatively skewed or approximately normal? Explain your reasoning.
1. The distribution of household income in the United States
Answer: Positively skewed, because most households have low to moderate incomes and only a few
are extremely wealthy.
2. The heights of female undergraduate students at the University of Pittsburgh
Answer: Approximately normal. Women close to average height are more common than either
extremely short or extremely tall women.
3. The distribution of undergraduate GPA’s of students enrolled in master’s degree programs in
the School of Education at Pitt.
Answer: Negatively skewed. Most students in graduate school have fairly high undergraduate
GPA’s and only a few have low GPA’s.
4. The actual weight of 1,000 bags of potato chips labeled as 2.5 ounce bags that are produced by
a manufacturer.
Answer: Approximately normal. Most bags would be close to the target weight; only a few
would be far from the target weight in either direction.
Part 3. Please answer the following questions about normal distribution
1. Suppose that scores on the verbal section of the SAT (Scholastic Aptitude Test) are normally
distributed with a mean of 500 and a standard deviation of 100. Would you expect a large
proportion of students to have scores less than 300? Why or why not?
Answer: We would expect only a small proportion to have scores less than 300. A score of 300 is
two standard deviations below the mean. In a normal distribution only about 5% of the scores are more
than 2 standard deviations from the mean in either direction, so only about 2.5% fall below 300.
2. Arrange the following statements in order of least to most precise description of a
student’s relative position on a standardized achievement test.
_____the student’s Z score is positive
_____the student’s percentile rank is 91
_____the student placed in stanine 8
Answer: Positive Z score (any score above the mean has a positive Z score); stanine 8; percentile
rank of 91.
3. Which of the following students has the highest relative position on a standardized
achievement test?
a. a student whose Z-score is -2.0
b. a student whose percentile rank is 30
c. a student placed in stanine 2
Answer: A student whose percentile rank is 30 has the highest relative position (this percentile
rank would fall in stanine 4). A Z-score of -2 would fall in stanine 1.
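The Part 3 answers can be checked with a short sketch. It assumes the conventional stanine cut points, which are half a standard deviation wide and centered on the mean (z = -1.75, -1.25, ..., 1.75); the function names are my own, and your slides remain the reference for the exact chart.

```python
from bisect import bisect_right
from statistics import NormalDist

# Question 1: proportion of SAT verbal scores below 300 (mean 500, sd 100).
sat_verbal = NormalDist(mu=500, sigma=100)
prop_below_300 = sat_verbal.cdf(300)  # about 0.023

# Conventional z-score boundaries between stanines 1|2, 2|3, ..., 8|9 (assumed).
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine_from_z(z):
    """Count how many cut points lie at or below z, then shift to the 1-9 scale."""
    return bisect_right(STANINE_CUTS, z) + 1

def z_from_percentile_rank(percentile_rank):
    """z-score below which the given percentage of a normal distribution falls."""
    return NormalDist().inv_cdf(percentile_rank / 100)

stanine_for_pr_91 = stanine_from_z(z_from_percentile_rank(91))  # Question 2
stanine_for_pr_30 = stanine_from_z(z_from_percentile_rank(30))  # Question 3, option b
stanine_for_z_minus_2 = stanine_from_z(-2.0)                    # Question 3, option a

print(round(prop_below_300, 3))
print(stanine_for_pr_91, stanine_for_pr_30, stanine_for_z_minus_2)
```

Under these assumptions a percentile rank of 91 lands in stanine 8, a percentile rank of 30 lands in stanine 4, and a Z-score of -2 lands in stanine 1, which matches the answers above.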
Part 4. Measures of the Location of the Data
1. Two students were applying to the same graduate school. Which student had the better college
GPA when compared to other students at his/her college? Explain how you determined your
answer and indicate which student had the better relative GPA.

Student   Student GPA   College Mean GPA   College sd
A         2.7           3.2                0.8
B         2.5           3.0                1.0
Answer:
For student A, z-score = (2.7-3.2)/0.8 = -0.625, meaning this student’s GPA is .625 standard
deviations below the mean.
For student B, z-score = (2.5-3.0)/1.0 = -0.50, meaning this student’s GPA is .5 standard
deviations below the mean.
Student B’s GPA is relatively better: although it is lower than Student A’s GPA in absolute terms,
Student B’s college mean was lower and the sd was higher, so Student B’s GPA lies closer to the
college mean in standard deviation units (-0.50 versus -0.625).
2. A music school has budgeted to purchase three musical instruments. They plan to purchase a
piano costing $3,000, a guitar costing $550, and a drum set costing $600. The mean cost for a
piano is $4,000 with a standard deviation of $2,500. The mean cost for a guitar is $500 with a
standard deviation of $200. The mean cost for drums is $700 with a standard deviation of $100.
Which cost is the lowest, when compared to other instruments of the same type? Which cost is
the highest when compared to other instruments of the same type? Justify your answer.
Answer: For pianos, the cost of the piano is 0.4 standard deviations BELOW the mean. For
guitars, the cost of the guitar is 0.25 standard deviations ABOVE the mean. For drums, the cost
of the drum set is 1.0 standard deviations BELOW the mean. Of the three, the drums cost the
lowest in comparison to the cost of other instruments of the same type. The guitar costs the most
in comparison to the cost of other instruments of the same type.
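Both Part 4 questions use the same calculation, so here is a minimal sketch that redoes them in code. The numbers come straight from the two questions; the function and variable names are my own.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) its group mean."""
    return (value - mean) / sd

# Question 1: each student's GPA relative to his/her own college.
gpa_z = {
    "A": z_score(2.7, 3.2, 0.8),
    "B": z_score(2.5, 3.0, 1.0),
}
better_relative_gpa = max(gpa_z, key=gpa_z.get)

# Question 2: each purchase price relative to instruments of the same type.
cost_z = {
    "piano": z_score(3000, 4000, 2500),
    "guitar": z_score(550, 500, 200),
    "drums": z_score(600, 700, 100),
}
relatively_cheapest = min(cost_z, key=cost_z.get)
relatively_priciest = max(cost_z, key=cost_z.get)

print({name: round(z, 3) for name, z in gpa_z.items()}, better_relative_gpa)
print({name: round(z, 3) for name, z in cost_z.items()})
print(relatively_cheapest, relatively_priciest)
```

The point of the z-score is that it puts values from different groups on the same scale, so a larger z always means a higher position relative to one's own group.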
Part 5. Measures of central tendency and variability
The descriptive statistics and histograms displayed on the attached sheet are based on fictitious
data on the midterm score of World History in Class A and Class B.
Class A:
Mean=92.5
Standard deviation=9.2
Median=96
Class B:
Mean=84.38
Standard deviation=9.2
Median=85
Please answer questions 1-4 based on the information provided.
1. Comparing the two classes, which class did better overall in the World History midterm and
how did you determine this?
Answer: Overall, Class A did better because Class A had the higher mean score. This is also apparent
from the visual display of the data.
2. In which class is the difference between the mean and the median smaller? Relate your
answer to the shape of the distribution.
Answer: The difference between the mean and the median is smaller in Class B. The shape of
the distribution is approximately symmetric as compared to Class A.
3. What is the most common score interval in each of the two classes? Does the mean score fall
into the most common interval in any of the classes? What does the most common score interval
in Class A tell you about the overall performance in Class A?
Answer: For Class A, the most common score interval is 95-100. For Class B, the most common
score intervals are 75-80 and 90-95. The mean does not fall into the most common interval in either
of the classes.
From the most common score interval, we know that most students’ World History midterm
grades in Class A fall between 95 and 100. However, we can’t draw the same conclusion from the most
common score intervals of Class B.
4. In which class do the scores vary the greatest amount?
Answer: Comparing the standard deviations of the two classes (both are 9.2), it appears that the
midterm scores vary to approximately the same extent. However, by looking at the histograms it
appears that there is more variability in Class B. Obtaining the mean and sd for a very skewed
distribution can lead to inaccurate interpretations.
EDUC 2201/EFOP 2001 Introduction to Research Methodology
Selecting Measurement Instruments
Lecture Note by: Shangmou Xu
Week 6 – Lecture Note
Overview
Welcome to Week 6! This week, we will go back to our research procedure and discuss
the step after sampling. After we decide our participants and briefly describe the characteristics
of participants, we may start to select a measurement instrument to collect the data we want.
Recall that an instrument could be a direct tool to collect data like height, weight, blood pressure,
etc; it could be a test like SAT, GRE; it could be a survey like a self-esteem survey, study
motivation survey; it could also be an interview outline. Depending on the property of the data
we collect, there are two types of data we can collect, observable data (direct data like height or
weight) and constructs (non-observable data). Let me give you some examples of constructs.
[Example 1] Let’s say we want to know the math skill of an 8th grade class, then “math skill”
becomes a non-observable variable (i.e., a construct). It’s non-observable because we can’t use
our eyes or a handy instrument to directly measure math skills (like we usually do for height or
weight). Instead, we want to use a math test (let’s say we have 20 MC questions) to indirectly
measure students’ math skill and assume that the test we use MAY accurately capture students’
math skills. It’s an indirect measure because we will never know the actual math skills and,
depending on the math test we use, students may show very different “observed” math skills.
That being said, “math skill” is not a single construct. The construct in this example is “math
skill measured by a certain set of 20 MC questions”, and you may have a different math skill construct
using a different math test (e.g., a test with 10 open-ended questions). [Example 2] Let’s say we
want to know the math self-confidence of an 8th grade class, and self-confidence becomes our
construct. You and I may have different concepts of self-confidence, but based on my own
understanding of related research, I created my own survey to measure self-confidence. This
survey may contain several open-ended questions about the math learning experience and some
Likert-scale items about self-reported confidence. In this case, math self-confidence measured
by my survey is the construct and I assume that this construct can actually capture 8th grade
students’ math self-confidence.
Week 6, Slides part 1 is all about the characteristics of instruments, and construct might
be the most difficult part to understand. I hope these two examples can help you understand the
most basic idea. Please read through Part 1 and get a basic idea of different types of
measurement. If you can’t memorize that knowledge from these slides, don’t worry because you
can always go back to these slides and in fact, you don’t even need to memorize these. Let’s
move on to Part 2 and Part 3, validity and reliability. I will give you a quick review of the basic
idea of validity evidence and reliability evidence of a test and you can go ahead to read through
the detail of validity and reliability evidence.
From the two examples I just gave you, you may have already noticed that the instruments we
created (math skill test and self-confidence survey) are not always “accurate” because the
constructs depend on our understanding of math skills and we won’t know the true math skills.
However, “accurate” is a vague description of an instrument and we need to unpack the box of
“accurate”. We, therefore, introduce two important concepts, validity and reliability, to evaluate
the “accuracy” of an instrument. Please note, validity and reliability are not properties of the
measurement instrument itself, so there is no “yes or no” question about validity or reliability; that is,
we don’t say a test is valid or not valid. Instead, we usually say that the researchers provide
appropriate, adequate validity evidence. In general, you can provide as much evidence as you
can, but what types and what amount of evidence will you need to provide? Well, that really depends on
the goal of your own research. For now, you can just keep in mind that providing validity and
reliability evidence is an ongoing effort when developing/adopting an instrument and this
evidence will support your use of the instrument.
Let’s forget about the measurement instrument for a while and think about the basic
concepts of validity and reliability. I want to show you this nice picture (Credit to Prof William
M.K. Trochim). Assume that I want to hit the center of the target and here are the evaluations of
my skills of hitting the target. In the first case, my skill of hitting the target is reliable (I hit
the same spot consistently) but not valid (I can’t hit the center). In the second case, my skill is not reliable,
meaning each of these hits is kind of random. But I would say this skill is valid because, if you think
about a test result like that and the mean score is around the center, it’s kind of valid, just not
reliable. In the third case, my skill is not reliable (hitting randomly). Also, because I only hit the
upper side of the target, my skill is not valid. In the end, in the fourth case, I can consistently hit
the center of the target, so my skill is both reliable and valid.
Back to measurement, Validity refers to the degree to which a test measures what it is
supposed to measure and, consequently, permits appropriate interpretation of scores. There are
four types of validity evidence, content validity, criterion-related validity, construct validity, and
consequential validity. You can find the detailed explanation on slides part 2 and textbook from
page 161. I just want to provide some thoughts about criterion-related validity specifically.
Criterion-related validity is determined by relating performance on a test to performance on a
second test or another measure. We call the second measure the criterion (plural: criteria). Within the
category of Criterion-related validity, there are two forms, concurrent validity and predictive
validity. Concurrent validity is often used when we want to develop a new instrument to replace
or add to an existing instrument. For example, in addition to the midterm exam, I want you to
write a paper as well (just an example) to evaluate your performance during the first half of the
semester. I want to make sure that these two instruments are highly related, meaning that they are
measuring similar things and yielding similar results. Predictive validity is used when
we want to use one test result to predict some future result. SAT score and college GPA
would be a good example. In Assignment 6 question 4, I provide three sets of criteria for you to
evaluate. Note, these three criteria are independent. Think about the pros and cons of these items
as criteria. In what ways can they provide a good reference, and in what ways may these criteria
be not as good as we expected?
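Criterion-related validity evidence is usually reported as a correlation between the test and the criterion. Here is a minimal sketch for the midterm-and-paper example. The scores are invented for illustration, and I'm assuming the Pearson correlation coefficient as the measure of relationship; the function name pearson_r is my own.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical scores for the same six students on both instruments.
midterm_scores = [62, 70, 75, 81, 88, 93]
paper_scores = [65, 68, 80, 79, 90, 95]

validity_coefficient = pearson_r(midterm_scores, paper_scores)
print(round(validity_coefficient, 2))
```

A coefficient close to 1 would support the claim that the paper and the midterm rank students in a similar way; a coefficient near 0 would suggest the two instruments are measuring different things.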
Reliability is the degree to which a test consistently measures whatever it is measuring.
Take a pretest-posttest research setting as an example. In such research setting, we want to see
the effect of the intervention, so we conduct a pretest at the beginning and provide the
intervention. After the intervention, we conduct the posttest. The effect of the intervention can be
estimated by subtracting the pretest score from the posttest score. To ensure the difference really
represents the effect of the intervention, we want our test to reliably measure student
achievement. If the test is less reliable, we won’t know whether the change in the test result is
due to the inconsistency of the test or to the effect of the intervention.
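The subtraction itself is simple. Here is a small sketch with made-up pretest and posttest scores for five students; the data and variable names are hypothetical.

```python
from statistics import mean

# Hypothetical scores for the same five students before and after the intervention.
pretest = [55, 60, 62, 70, 75]
posttest = [63, 66, 71, 74, 86]

# Gain score for each student: posttest minus pretest.
gains = [post - pre for pre, post in zip(pretest, posttest)]
mean_gain = mean(gains)

print(gains)
print(mean_gain)
```

The mean gain here is 7.6 points. Whether we can read that as the effect of the intervention depends on reliability: if the test gave noticeably different scores on repeated administrations even without any intervention, part of this gain could simply be measurement inconsistency.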
I think that’s all for a brief overview. Please read the assigned reading and slides.
Textbook Reading (Required)
Gay, L. R., Mills, G. E., & Airasian, P. W. (2011). Educational research: Competencies for
analysis and applications (10th ed.). Upper Saddle River, NJ: Pearson.
Chapter 6, pp. 149-193
Edition 11: Chapter 6, pp. 157-201
Lecture slides
Please read through Week 6 Slides (Part 1-Part 4)
Exercise
This exercise will give you a quick review and practice of evaluating validity evidence and
reliability. This will be helpful for the midterm, so please be sure to check the items and answers.
Part 1. Types of validity evidence. Identify which main type of validity evidence – content or
criterion-related — is reported in each of the following excerpts. What is the basis for your
answer?
1. “Validity of interpretations about student information skills is supported by [test] developers’
review of curriculum documents and the current literature on information skills. Also supporting
validity is the incorporation of teachers in the review of items and to provide feedback …”
Answer: Content, because validity is assessed by developers’ review of curriculum documents,
and the literature on information skills. There is no reference to another method of measurement
or to statistical information.
2. “… validity is demonstrated by measuring the relation between the GRADE and several group
and individually administered reading tests. Correlation coefficients range from .69 to .90,
suggesting that the GRADE is measuring similar skills to those of other tests of reading ability.”
Answer: Criterion, because validity is assessed through the correlations between the GRADE
and other reading tests.
3. “… validity evidence of the GRADE is also demonstrated in two studies with dyslexic and
learning disabled (reading) students. In both of these studies the dyslexic and learning disabled
students scored significantly lower on the GRADE than students in a matched control group.”
Note: Two studies were carried out. In the first study dyslexic students were compared to a
control group; in the second study learning disabled students were compared to a control group.
Answer: Criterion, because validity is assessed by the relationship between the GRADE and the
criterion of group membership. Scores of dyslexic and learning disabled students, respectively,
were compared to the scores of a control group.
4. “A clinical validity study showed significant differences between the mean ADHD-SRS
ratings given by both teachers and parents for children diagnosed with ADHD when compared
with undiagnosed children.”
Note: The study was carried out in two parts. In the first part of the study teachers’ average
ratings of children diagnosed with ADHD were compared to their average ratings of
undiagnosed children. In the second part parents’ average ratings of diagnosed and undiagnosed
children were compared.
Answer: Like item 3 above, this is an example of criterion related validity, where the criterion is
group membership. Parents’ and teachers’ ratings, respectively, of children diagnosed with
ADHD were compared to parents’ and teachers’ ratings of undiagnosed children.
5. “In Study 1, 144 firefighter applicants took the Firefighter Selection Test, completed fire
college [training], and became firefighter probationers. [Correlations were obtained with a
variety of measures including] “job knowledge tests and supervisors’ ratings.”
Answer: Criterion related. Scores on the Firefighter Selection Test were compared with tests of
job knowledge and supervisors’ ratings after they became firefighter probationers.
Part 2. Reliability. Identify the approach to reliability (test-retest, equivalent forms, internal
consistency, or interrater) that is referred to in each of the excerpts below. What is the basis for
your answer?
1. __________ reliabilities between a trainer and 14 observers were reported as a mean of 93%
agreement with a range of 83% to 97% agreement.
Answer: Inter-rater. Percent agreement between a trainer and observers is reported.
2. The reliability coefficients for the five scales range from .79 to .98 for the Kuder-Richardson
20 and from .81 to .98 for the split-half.
Answer: Internal consistency. Both KR 20 and split-half are examples of the internal consistency
approach.
3. __________ reliabilities were computed by correlating scores on the two new forms, Form C
and Form D. These values ranged from .73 to .90 with a median of .83.
Answer: Equivalent/alternate forms. Correlations were computed between scores on two new
forms of a test.
4. The percentage of participants reporting the identical four preferences [on the Myers-Briggs
type indicator] after a 4-week interval ranged from 55% to 80%, with an average of 65%.
Answer: Test-retest. The same test is administered to a sample twice with a time delay of 4
weeks between administrations.
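Items 1 and 4 in Part 2 both report reliability as a percentage of agreement. A minimal sketch of that calculation, using invented ratings from a trainer and one observer (the labels "on" and "off" and the function name are hypothetical):

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of cases on which two sets of ratings are identical."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both raters must rate the same cases")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100 * matches / len(ratings_a)

# Hypothetical ratings of ten observed behaviors (on-task vs. off-task).
trainer = ["on", "off", "on", "on", "off", "on", "on", "off", "on", "on"]
observer = ["on", "off", "on", "off", "off", "on", "on", "off", "on", "on"]

agreement = percent_agreement(trainer, observer)
print(agreement)
```

The two raters disagree on one of the ten cases, so the agreement is 90%. The same function would work for the test-retest case in item 4 by passing each participant's result from the first and second administrations.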
Part 3. Evaluation of validity and reliability evidence.
Please answer the following questions regarding the review of the Early Reading Diagnostic
Assessment (ERDA) that is attached. (The ERDA was one of the tests used in the study on
vocabulary for which you read the literature review). I suggest that you read the entire review
before beginning to answer the questions. Line numbers have been added to the review for your
convenience.
1. In lines 39-40 the reviewer states “The standardization sample is remarkably close to the U.S.
census data from 2000 on the stratified variables.” Do you think the reviewer considers this to be
a strength or a weakness of the ERDA? Explain your answer.
Answer: It is a strength that the sample is representative of the U.S. population.
2. How was the content validity of the ERDA assessed? (See lines 65-70)
Answer: Curriculum experts in reading examined the content of the items in relation to the
standards specified by the U.S. Department of Education to ensure that all curriculum areas were
balanced. The experts also identified and then removed items that seemed to have either ethnic or
gender bias.
3. How was the criterion validity of the ERDA assessed? (See lines 80-91)
Answer: The performance on the ERDA of a sample of children diagnosed with learning
disabilities in reading was compared to the performance of a control group. The sample of
children with learning disabilities was made up of 47 second and third grade students who were
found to be eligible for special education services.
4. Do you agree with the reviewer’s comment (see lines 93-94) that “Additional data correlating
the ERDA with established diagnostic reading assessment tests such as the Woodcock Reading
Mastery Test would have been helpful information,” or do you think that the procedure used by
the test developers was adequate? Explain your position.
Answer: Additional evidence for criterion-related validity of the ERDA for children in
kindergarten and first grade would be helpful.
5. According to the reviewer, which types of reliability (test-retest (stability), equivalent forms,
internal consistency, inter-rater) were evaluated for the ERDA? (See lines 42-60)
Answer: Internal consistency (split-half) reliability and test-retest reliability were evaluated.
Inter-rater reliability was evaluated for subtests requiring subjective judgment.
6. Were reliability coefficients for all subtests greater than or equal to .8? Explain. (See lines 42-60)
Answer: Reliability coefficients for some subtests were in the .60’s and .70’s. The subtests with
relatively low reliability all contained 12 or fewer items.
EFOP 2001/EDUC 2201: Introduction to Research Methodology
Week 6 – Additional Exercises
1. Which of these research scenarios would be best addressed by a qualitative research
study?
a. A psychiatrist wants to understand why different treatments for depression are
viewed as less socially acceptable.
b. A medical researcher wants to determine potential side effects of a new diet pill
on people with high blood pressure.
c. A school psychologist wants to investigate the relationship between anxiety and
math test scores in first grade children.
d. A city council member wants to evaluate the effect of after school programs on
students’ academic performance.
Rationale: Response A is correct because the researcher is interested in understanding
reasons why people feel a certain way. This would be hard to address using quantitative
methods.
2. What is the primary difference between causal-comparative and experimental
research?
a. Causal-comparative research is concerned with cause-effect relationships.
b. Causal-comparative research makes use of pre-existing groups.
c. Experimental research involves observation of the dependent variable.
d. Experimental research involves testing a hypothesis.
Rationale: Response B is correct. Causal-comparative research relies on pre-existing
groups, while experimental research requires that the researcher randomly assign
participants into the control and experimental groups. Note that the other response
options are true statements, but they do not describe the difference between these two
categories of research.
3. A drug company is testing the effect of a new medication on reducing cholesterol levels
in women.
In this research study, what is the dependent variable?
a. a drug company
b. the medication
c. cholesterol levels
d. women
Rationale: Response C is correct. Cholesterol levels depend on the medication.
Use the information in the paragraph below to answer questions 4 through 6.
A high school principal wants to investigate whether there is a relationship between
achievement and attendance at school. To study this, he analyzes data from attendance
records and students’ scores on an achievement test. His hypothesis is that as
achievement scores increase, the number of days missed will tend to decrease.
4. Which type of research could best be used to test this hypothesis?
a. Descriptive
b. Correlational
c. Causal-Comparative
d. Experimental
Rationale: This is a correlational study because the goal is to establish whether there is a
relationship (correlation) between two variables. This research would not be considered
causal-comparative or experimental because it does not examine the cause-effect
relationship between a dependent and an independent variable.
5. What scale of measurement would we use to measure achievement?
a. Nominal
b. Ordinal
c. Interval
d. Ratio
Rationale: Achievement test scores are measured on an interval scale. It is not ordinal
because the intervals between scores are equal with respect to achievement (for
example, the difference between an 85% and a 90% is the same as the difference
between a 70% and a 75%). It is not ratio because there is no absolute zero. If a student
gets a zero on an achievement test, it does not mean that student has achieved nothing.
6. If the principal found that his hypothesis was correct, what would be the most valid
conclusion?
a. Students with higher achievement test scores are more motivated, so they miss
fewer days of school.
b. Coming to school regularly means that students learn more and therefore
achieve more.
c. If a student has perfect attendance, it is probably because that student is a very
high achiever.
d. The better a student’s attendance, the more likely that student has higher
achievement test scores.
Rationale: Response D is correct because it is the only statement that does not imply a
causal relationship between the two variables. Correlational research does not
investigate the cause of the relationship between two variables, just whether a
relationship is present.
7. A researcher is investigating the effects of various treatment methods for
obsessive-compulsive disorder. Participants are recruited through various local psychotherapists.
The participants are required to give the researcher their names, but the researcher
assures them that he will not reveal their identities under any circumstance.
Which of the following would best describe the nature of the subjects’ participation in
this study?
a. Anonymous
b. Confidential
c. Both anonymous and confidential
d. Neither anonymous nor confidential
Rationale: The research is confidential, not anonymous. In order for it to be anonymous,
the researcher would not know any of the names of the participants.
8. Which of the following situations involves an ethical violation?
a. When recruiting subjects for a research study on the effect of viewing violent
images on concentration, a psychologist asks potential participants if they would
be willing to watch a video that involves violence and gore.
b. A researcher wants to conduct an experiment on the effects of smoking on sleep
patterns using men only. No women are recruited to participate in the study.
c. A researcher is conducting a study that involves having subjects participate in
vigorous cardiovascular exercise. Because he does not want to be liable for
anyone being injured, he only recruits participants that are in good health.
d. Participants in a study researching the effect of a new drug are told that once
they agree to participate in the research study, they must participate for the full
six-week study because the drug is very expensive.
Rationale: Response D is correct because the participants are not free to end their
participation in the study at any time. The situation in Response A is not unethical
because the researcher is informing the participants of the nature of the study, including
foreseeable risks. Responses B and C do not represent ethical violations because there is
nothing unethical about a researcher targeting a specific subgroup as the focus of their
research.
9. A researcher wants to conduct experimental research to investigate the effect of playing
classical music on subjects’ ability to complete a stressful task. What would be the best
first step to begin the research?
a. Randomly select participants that represent the population on key
characteristics.
b. Submit a research proposal to the IRB.
c. Conduct a literature review to see whether similar research has been done in the
past.
d. Conduct an experiment in which the experimental group completes a stressful
task while classical music is playing.
Rationale: The literature review should be conducted first. The literature review would
help the researcher decide whether the research study should be conducted at all and/or
inform the direction of the research study. It does not make sense to recruit participants,
submit a proposal to the IRB, or design the experiment until this has been done.
10. Which of the following is an example of a null hypothesis?
a. Children with reading disabilities will have the same level of test anxiety as
children without reading disabilities.
b. A chemical compound will cause plants to grow 30% larger than plants that are
not exposed to the compound.
c. There is a relationship between alcohol consumption and academic performance
in high school students.
d. People who live less than twenty miles away from the ocean are more likely to
get skin cancer than the average American.
Rationale: A null hypothesis is a statement describing no relationship or no difference.
Responses B, C, and D talk about the presence of a relationship between two variables.
11. Which of the following is an example of a directional research hypothesis?
a. Children with autism approach complex reasoning tasks differently than other
children.
b. There is a correlation between sudden changes in temperature and instances of
heart attacks in the elderly.
c. Acid rain causes changes in plant growth.
d. Aromatherapy reduces stress levels.
Rationale: Response D is the only option that talks about the direction of the relationship
between the variables: aromatherapy causes stress levels to go down. The other
hypotheses do not describe the direction of the relationship.
12. To obtain a sample of undergraduate students at the University of Pittsburgh, a
researcher obtains a list from the registrar of all registered undergraduates. She then
uses a random number table to select 400 students to receive a survey. What sampling
technique did the researcher employ?
a. Simple random sampling
b. Stratified random sampling
c. Cluster sampling
d. Systematic sampling
Rationale: This is an example of simple random sampling. Everyone in the population
has an equal chance of being selected.
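The procedure in question 12 can be sketched in a few lines of Python. This is only an illustration: the registrar list below is made up (the size of 20,000 and the ID format are assumptions), and a seeded pseudo-random generator stands in for the random number table.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(population, n)


# Hypothetical sampling frame: one entry per registered undergraduate.
registrar_list = [f"student_{i:05d}" for i in range(1, 20001)]

survey_sample = simple_random_sample(registrar_list, 400, seed=1)
print(len(survey_sample))
```

The key feature is that the draw is made from a complete list of the population, so no student is more likely to be chosen than any other.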
13. To obtain a sample of high school students in Pennsylvania, a researcher obtains a list of
all of the high schools in the state. He then randomly selects eight of the high schools
using a random number table. He administers an assessment to all of the students in
those eight schools. What sampling technique did the researcher employ?
a. Simple random sampling
b. Stratified random sampling
c. Cluster sampling
d. Systematic sampling
Rationale: This is an example of cluster sampling, where the high schools are clusters.
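For contrast with simple random sampling, here is a minimal sketch of the cluster sampling in question 13. The school names and enrollment figures are invented for illustration; the point is that the random draw is made over schools (clusters), and then every student in a chosen school is included.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members


# Hypothetical frame: 100 high schools with 50 students each.
schools = {
    f"school_{i}": [f"school_{i}_student_{j}" for j in range(50)]
    for i in range(100)
}

chosen_schools, assessed_students = cluster_sample(schools, 8, seed=1)
print(len(chosen_schools), len(assessed_students))
```

Individual students are never drawn at random here, which is what distinguishes this design from simple random sampling.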
14. A sociologist wants to study elderly Americans’ attitudes about voting. He decides to
administer a survey to the residents of the local nursing home at which he volunteers.
From a list of all of the residents, he randomly selects 70 of the 350 residents to
complete a 20-item survey about voting.
Which statement below best describes the flaw in this research design?
a. Because he only sampled 20% of the residents at the facility, the results may not
be generalizable to the population. He should select a larger proportion of
residents at the nursing home.
b. Because he only selected residents at one nursing home, he is using a form of
convenience sampling. He should consider a sampling plan that involves a wider
range of elderly people.*
c. Because he wants to measure attitudes, it is inappropriate to conduct a
quantitative research study. He should conduct a qualitative study.
d. Because he is administering his survey at a nursing home, his study is in violation
of the Health Insurance Portability and Accountability Act.
Rationale: Response B is correct because the researcher would like to generalize his
results to the entire population of elderly Americans. It is unlikely that residents in one
nursing home would represent all elderly people in America (for example, think about
how people in a nursing home might be different from elderly people who live
independently). Response A is somewhat correct in the sense that the results of this
study will not be generalizable to the population, but sampling more residents at this
nursing home will not effectively correct that.
15. Three students took an exam. Relative to other classmates, Kara’s raw score was
equivalent to a z-score of 2. Susan scored at the 50th percentile. Jason scored in the
second stanine.
Based on the above information, which statement below is correct?
a. Susan’s score was the highest.
b. Kara’s score was the highest.
c. Jason’s score was the highest.
d. Jason’s and Kara’s scores were the same.
Rationale: Response B is correct. A z-score of 2 means that Kara’s score was two standard
deviations above the mean. Since Susan scored at the 50th percentile, her score was at the
median, which equals the mean in a normal distribution. With a score in the second
stanine, Jason scored well below the mean.
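One way to check the comparison in question 15 is to put all three scores on the z-score scale. The sketch below assumes a normal distribution and the usual stanine convention (mean 5, standard deviation 2, so each stanine is half a standard deviation wide); the helper name stanine_z_bounds is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def stanine_z_bounds(k):
    """Return the (low, high) z-score range covered by stanine k (1 to 9)."""
    if not 1 <= k <= 9:
        raise ValueError("stanine must be between 1 and 9")
    center = (k - 5) * 0.5
    low = float("-inf") if k == 1 else center - 0.25
    high = float("inf") if k == 9 else center + 0.25
    return low, high


kara_z = 2.0                              # given directly as a z-score
susan_z = std_normal.inv_cdf(0.50)        # 50th percentile
jason_low, jason_high = stanine_z_bounds(2)

print("Kara z:", kara_z)
print("Susan z:", susan_z)
print("Jason z range:", jason_low, "to", jason_high)
```

On this common scale Jason falls below Susan, and Susan falls below Kara.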
16. A professor wants to know, in general, how well students performed on the first
homework assignment in her class. She asks her teaching assistant to summarize the
scores for the assignment. The scores for the eight students were as follows:
24%
84%
88%
90%
92%
92%
96%
98%
The teaching assistant responded only by telling the professor that the mean score for
the assignment was 83%. What was the major flaw in the teaching assistant’s response?
a. The mean of the eight scores is not 83%.
b. A different measure of central tendency would have described the dataset
better.
c. The teaching assistant didn’t convert the raw scores to z-scores.
d. It is improper to use measures of central tendency when scores are not normally
distributed.
Rationale: Means are affected by extreme scores. The one student who received a 24%
brought the mean down significantly. If the assistant was going to describe the dataset
just in terms of a measure of central tendency, using the median (91%) would give the
professor a better sense of how well most students performed. The other responses are
not correct: the mean is 83%, it is not necessary to convert the raw scores to z-scores,
and you can use measures of central tendency to describe datasets that are not normally
distributed.
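The figures in this rationale can be verified with Python's standard library. The scores are the eight listed in question 16; the last line, which drops the 24%, is an extra illustration and not part of the original answer.

```python
from statistics import mean, median

scores = [24, 84, 88, 90, 92, 92, 96, 98]  # percentages from question 16

mean_score = mean(scores)
median_score = median(scores)
mean_without_lowest = mean(scores[1:])  # same data with the 24% removed

print("mean:", mean_score)
print("median:", median_score)
print("mean without the 24%:", round(mean_without_lowest, 1))
```

A single extreme score moves the mean by several points, while the median barely depends on it.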
17. An industrial psychologist has developed a new inventory designed to measure job
satisfaction. He asks you to help him devise a plan to provide criterion-related validity
evidence for the inventory. Which of the following would be a good step to include in
your plan?
a. Administer the Job Satisfaction Inventory and another instrument that measures
job satisfaction to a group of people; compute a correlation between the scores
of the two instruments.
b. Administer the Job Satisfaction Inventory to the same group of people twice;
compute a correlation between the scores on the first and second
administration.
c. Share the Job Satisfaction Inventory with other industrial psychologists who have
expertise in this area; ask them to rate the inventory on depth and breadth of
coverage of the domain of job satisfaction.
d. Assemble a focus group to talk about the potential unintended consequences of
administering the Job Satisfaction Inventory.
Rationale: Criterion-related validity refers to the relationship between scores on a test
and some other method of measuring the same trait, which is what Response A describes.
Response B relates to test-retest reliability, Response C relates to content validity, and
Response D relates to consequential validity.
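The correlation in Response A could be computed as follows. This is a sketch with invented scores for six respondents; the variable names and data are assumptions, and a real validity study would use a much larger sample.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same six people on both instruments.
new_inventory = [12, 18, 25, 31, 36, 40]
established_measure = [15, 20, 22, 33, 35, 42]

validity_coefficient = pearson_r(new_inventory, established_measure)
print(round(validity_coefficient, 2))
```

The same function applied to two administrations of one instrument would give the test-retest coefficient described in Response B; what changes is the second set of scores, not the arithmetic.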
18. A test developer wants to assess the test-retest reliability of scores on an eighth grade
math achievement test. She administers the test to a group of 60 students on Monday
and then again to the same group of students three days later. What mistake did the
test developer make in her investigation?
a. She should not have administered the test twice to the same group of students.
b. She should have administered the test to more than 60 students.
c. She should have divided the students into a control group and an experimental
group.
d. She should have allowed more time between the two test administration dates.
Rationale: Three days is not a good window of time between administration dates.
Consider how you might perform differently on a test if you had already taken it three
days prior. You would probably remember some of your previous answers. The ideal
window of time for assessing test-retest reliability is two to six weeks.
19. A high school administrator wants to administer a standardized test to see how each
particular student has improved his or her writing ability from freshman to senior year.
There are 800 students in the high school, and it is the only high school in the school
district.
The administrator asks you whether a norm-referenced, criterion-referenced, or self-referenced interpretation of the data would be most appropriate. What information
from the above paragraph would best help you make this decision?
a. The administrator wants to administer a standardized test.
b. The administrator wants to track progress over time.
c. There are 800 students in the school.
d. It is the only high school in the district.
Rationale: The fact that the administrator is interested in students’ progress over time
means that a self-referenced interpretation is most appropriate. The other response
options are not as relevant to making this decision.
20. Why is it important that a researcher operationally defines the constructs that serve as
variables in a research study?
a. So constructs can be directly observable.
b. So the researcher can avoid measuring the same construct twice.
c. So the researcher defines the constructs the same way that other researchers
do.
d. So it is clear what the constructs mean and how they will be measured.
Rationale: Since constructs are abstractions that are not directly observable, they need
to be operationally defined so that it is clear what the researcher is measuring. The
same construct can be measured in different ways and can have more than one
operational definition. Operationally defining a construct doesn’t mean that it becomes
directly observable.
21. A company has received 25 applications for an open position. The applicants are asked
to complete a 30-item assessment comprised of essay questions about how they might
handle various work-related situations. Due to the large number of applicants, the
assessments were scored by two different people in HR.
Which type of reliability would be of most relevance when interpreting the applicants’
scores?
a. Alternate forms reliability
b. Internal consistency reliability
c. Rater reliability
d. Test-retest reliability
Rationale: Since more than one rater is scoring the essay exams, they need to be doing
so reliably in order for scores from applicants rated by Rater 1 to be meaningfully
compared to scores from applicants rated by Rater 2.
