# Psyc 3130 - Final Review

version from 2016-05-02 22:49

## Section

Lecture 10 - Independent Measures ANOVA

Independent Measures ANOVA
Independent Measures ANOVA is typically used to analyze the relationship between two variables under the following conditions:
The dependent variable is quantitative in nature and is measured on a level that at least approximates interval characteristics
The independent variable is between-subjects in nature
The independent variable has three or more levels

3 relevant questions
Is there a relationship between the variables?
If so, what is the strength of the relationship?
If so, what is the nature of the relationship?

Example research question
An investigator examines the relationship between individuals’ religion and what they consider to be the ideal family size.
We have seven participants from each of three religious groups

Question 1: Null & Alternative Hypotheses
Ho: µC = µP = µJ (the means for all religious groups are equal)

H1: The three means are not all equal. This can be true in any of the following ways:
µC = µP ≠ µJ
µC = µJ ≠ µP
µP = µJ ≠ µC
µC ≠ µP ≠ µJ

Mean XC = 3.00
Mean XP = 2.00
Mean XJ = 1.00

We can see that the means differ, but do they differ by more than we would expect from sampling error alone? That is, is the difference statistically significant?

Variability
Between-Group Variability –the difference between the mean scores for the groups
Two factors account for this
Sampling error
The effect of the Independent Variable on the Dependent Variable
Within-Group Variability – the difference within the scores of the three groups
What accounts for the difference?
Disturbance Variables (e.g., the degree to which someone practices their religion)

The variance ratio
Variance Ratio – the ratio of between-groups to within-groups variability
Variance Ratio = Between-Group Variability / Within-Group Variability
= (Effect of the IV + Sampling Error) / Sampling Error

Sum of squares
SStotal = SSbetween + SSwithin
SStotal = ΣX² – (ΣX)² / N
SSwithin = ΣX² – Σ(T² / n)
SSbetween = Σ(T² / n) – (ΣX)² / N
(where T = the total for each treatment group, n = the number of scores per group, and N = the total number of scores)

Mean Squares
MSbetween = SSbetween / dfbetween
dfbetween = k – 1 (k = number of groups)
MSwithin = SSwithin / dfwithin
dfwithin = N – k
dftotal = dfbetween + dfwithin
dftotal = (k – 1) + (N – k) = N – 1

F ratio
The variance ratio is formally known as the F Ratio
F = MSbetween / MSwithin
The critical value is found using dfbetween (numerator) and dfwithin (denominator)
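The full computation – sums of squares, degrees of freedom, mean squares, and F – can be sketched in Python. The raw scores below are hypothetical (the review gives only the group means of 3.00, 2.00, and 1.00), but they are chosen to match those means:

```python
# One-way (independent measures) ANOVA by hand, using the computational
# formulas above. The raw scores are hypothetical; only the group means
# (3.00, 2.00, 1.00) come from the lecture example.
groups = {
    "Catholic":   [4, 3, 2, 3, 4, 2, 3],   # mean 3.00
    "Protestant": [3, 2, 1, 2, 3, 1, 2],   # mean 2.00
    "Jewish":     [2, 1, 0, 1, 2, 0, 1],   # mean 1.00
}

all_scores = [x for g in groups.values() for x in g]
N = len(all_scores)     # total number of scores (21)
k = len(groups)         # number of groups (3)

sum_x  = sum(all_scores)
sum_x2 = sum(x ** 2 for x in all_scores)

# SStotal = ΣX² − (ΣX)²/N
ss_total = sum_x2 - sum_x ** 2 / N

# SSbetween = Σ(T²/n) − (ΣX)²/N, where T is each group total, n its size
ss_between = sum(sum(g) ** 2 / len(g) for g in groups.values()) - sum_x ** 2 / N
ss_within = ss_total - ss_between

df_between, df_within = k - 1, N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
F = ms_between / ms_within   # compare against the critical value for (2, 18) df
```

With these scores, SStotal = 26, SSbetween = 14, SSwithin = 12, and F = 7 / 0.667 = 10.5.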

Question 2: Strength of the Relationship
eta² = SSbetween / SStotal
eta² = (dfbetween)F / [(dfbetween)F + dfwithin]
Also, when there are only two groups (k = 2), F = t²
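The two eta² formulas are algebraically equivalent, which a quick check confirms (the SS and df values below are hypothetical):

```python
# Verify that the two eta-squared formulas agree. The SS and df values
# are hypothetical, not from the lecture data.
ss_between, ss_within = 14.0, 12.0
df_between, df_within = 2, 18

ss_total = ss_between + ss_within
F = (ss_between / df_between) / (ss_within / df_within)

eta2_from_ss = ss_between / ss_total                       # SSbetween / SStotal
eta2_from_f  = (df_between * F) / (df_between * F + df_within)

assert abs(eta2_from_ss - eta2_from_f) < 1e-12
```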

Question 3: The Nature of the Relationship
µC = µP ≠ µJ
µC = µJ ≠ µP
µP = µJ ≠ µC
µC ≠ µP ≠ µJ
In order to determine the nature of the relationship, we must conduct additional post hoc analyses:
Scheffe test
Newman-Keuls test
Tukey HSD Test

Lecture 11 - Repeated Measures ANOVA

## Cumulative Portion

Samples & Populations
Population – the entire group of individuals a researcher wishes to study
Sample – a subset of the population that is intended to represent, or stand in for the population
Representative sample – accurately reflects the individuals, behaviors, and scores found in the population.
Random sample – each member of the population has an equal chance of being included

Relationship between samples and population
Population --> select sample --> sample --> generalize sample results to population

Sampling error
Samples are used to represent, or generalize to, a population; however, they are unlikely to give a perfectly accurate assessment. This discrepancy results in sampling error.
Sampling Error – the naturally occurring discrepancy, or error, that exists between a sample statistic and the corresponding population parameter
Ways to reduce sampling error: Increase sample size; Keep variability in each population relatively small
M - µ = the amount of sampling error in a set of sample data
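A tiny numerical sketch of M − µ, using a hypothetical population and one hypothetical sample drawn from it:

```python
# Sampling error M − µ: the sample mean rarely matches the population
# mean exactly. Population and sample values are hypothetical.
population = [2, 4, 4, 5, 5, 6, 6, 7, 8, 13]
sample = [4, 5, 6, 8]                    # one possible sample of n = 4

mu = sum(population) / len(population)   # population mean µ = 6.0
M = sum(sample) / len(sample)            # sample mean M = 5.75

sampling_error = M - mu                  # −0.25 for this particular sample
```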

Statistics vs. Parameters
Statistics- numerical values that are based on data from the sample
Symbols for statistics are letters from the English alphabet
Parameters – numerical values that are based on the data from an entire population
Symbols for parameters are letters from the Greek alphabet

Variables
Independent Variable (X) – the variable that is changed or manipulated by the researcher, or the variable we believe causes a change in the other variable.
Usually consists of two (or more) treatment conditions to which the subjects may be exposed
May also be referred to as levels or treatments
Dependent Variable (Y)– the variable being observed or measured to assess the effect of the treatment
Criterion Variable – the variable that is being predicted (Y) in Regression analyses
Predictor Variable – the variable from which predictions are made (X) in Regression analyses

Scales of Measurements
Four types of measurement:
Nominal – numbers or labels that categorize observations but make no quantitative distinctions between them
Ordinal – categories that are organized in an ordered sequence (size or magnitude)
Interval – ordered categories that have equal differences between numbers – but with an arbitrary zero point
Ratio – an interval scale with an absolute zero point

Measurement Hierarchy (from top to bottom of pyramid)
Ratio --> Interval --> Ordinal --> Nominal

Types of Research Designs
Between-Subjects Design – values of the independent variable are split up between participants.
Within-Subjects Design – participants receive all levels of the independent variables.

Skewness
+ skewed (right skewed): Mode, Median, Mean (from left to right)
– skewed (left skewed): Mean, Median, Mode (from left to right)

Deviation Scores
How do we know that the Mean is the mathematical center or balance point of a distribution of scores?
We know that the Mean is just as far from the scores above as it is from the scores below it.
The distance separating the score from the Mean is called the score’s deviation, or the amount a score deviates from the Mean.
The formula for computing a deviation score is: X – M
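The "balance point" claim can be checked directly: deviation scores always sum to zero (scores below are hypothetical):

```python
# The Mean as balance point: deviation scores (X − M) always sum to 0.
# These scores are hypothetical.
scores = [2, 4, 5, 7, 7]
M = sum(scores) / len(scores)            # 5.0
deviations = [x - M for x in scores]     # [-3.0, -1.0, 0.0, 2.0, 2.0]

assert abs(sum(deviations)) < 1e-12      # deviations above and below cancel out
```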

Measures of Central Tendency
A measure of central tendency is a statistical measure to determine a single score that defines the center of the distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.
3 Types of Measures of Central Tendencies
Mode
Median
Mean
How do these relate to the normal distribution?

Variance
The Variance – equals the mean squared deviation, or the average squared distance from the Mean
Variance = the mean squared deviation = sum of squared deviations / number of scores
Σ(X – µ)² = the sum of the squared deviations, also known as SS or the Sum of Squares

Standard Deviation
The Standard Deviation is the positive square root of the variance and provides a measure of the standard or average distance from the mean.
When it equals zero, there is no variability in the scores
The higher the value of the standard deviation, the more variability there is, everything else being equal

Computing Standard Deviation
σ = √σ²
σ = √[Σ(X – µ)² / N]
So if we have a Variance of 4 from the previous example, what would the Standard Deviation be?
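The definitional formulas are easy to compute by hand; the hypothetical scores below were chosen to reproduce the lecture's numbers (mean 5, variance 4, SD 2):

```python
import math

# Variance and standard deviation from the definitional formulas.
# Hypothetical scores chosen to give µ = 5, σ² = 4, σ = 2 as in the lecture.
scores = [1, 5, 5, 5, 7, 7]
N = len(scores)
mu = sum(scores) / N                         # µ = 5.0

ss = sum((x - mu) ** 2 for x in scores)      # Σ(X − µ)² = 24.0
variance = ss / N                            # σ² = 4.0
sd = math.sqrt(variance)                     # σ = √4 = 2.0
```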

Standard Deviation
What does the standard deviation mean?
So in the example, the Standard Deviation was 2
This can be interpreted in 3 ways:
The scores differ from the Mean by an average of 2
The Standard Deviation allows us to determine how consistent the scores are with one another, and thus how accurately the Mean describes the distribution
The Standard Deviation indicates how much scores deviate above and below the Mean.
For example, a score of 7 deviates +1 Standard Deviation from the Mean of 5, while a score of 1 deviates –2 Standard Deviations from the Mean.

Z-scores
A z-Score is the distance a raw score is from the mean, measured in units of standard deviation
z-Scores allow us to determine a raw score's location in a distribution, its relative and simple frequency, and its percentile
In other words, the standard score equals the original score minus the mean of the distribution, divided by the standard deviation.
z = (X – M) / SX

Important Characteristics of z-scores
A z-Score of 0 corresponds to a raw score equal to the Mean
A positive z-score indicates the raw score is greater than the Mean; a negative z-score indicates the raw score is less than the Mean
The larger the absolute value of a z-score, the less frequently that raw score occurs
The standard deviation and variance of a set of standard scores always equal 1.00 (and the mean always equals 0)
Even though the measures of central tendency and variability will change when you convert to standard scores, the shape of a distribution does not change.
68% of the scores in a normal distribution will be between ±1 z-score
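These characteristics can be verified by standardizing a set of scores; the hypothetical data below has mean 5 and SD 2, so a score of 7 lands at z = +1 and a score of 1 at z = –2:

```python
import math

# Standardizing hypothetical scores (µ = 5, σ = 2) and checking the
# characteristics listed above: z of the mean is 0, the standardized
# set has mean 0 and standard deviation 1.
scores = [1, 5, 5, 5, 7, 7]
N = len(scores)
mu = sum(scores) / N
sd = math.sqrt(sum((x - mu) ** 2 for x in scores) / N)

z = [(x - mu) / sd for x in scores]          # z = (X − µ) / σ

z_mean = sum(z) / N                                      # 0.0
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / N)  # 1.0
```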

Power
The goal of behavioral research is to reject the Null Hypothesis when it is false – or to find support for our alternative hypothesis
In the IQ pill example, if we conclude from our experiment that the pill works, and it does actually work, we’ve uncovered a relationship in nature.
Power of a statistical test is the probability that the test will correctly reject a false Null Hypothesis. That is, power is the probability that the test will detect a treatment effect if one really exists.
This is also the probability of NOT making a Type II error – therefore power equals 1 - β
Things to consider during the design of your study:
sample size, or the number of units (e.g., people) accessible to the study
effect size, provides a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample being used
alpha level (α, or significance level) – the probability of rejecting the Null Hypothesis when it is actually true (a Type I error)
One-tailed vs. two-tailed tests – a one-tailed test has more power because the entire critical region is concentrated in one direction
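The one-tailed power advantage can be shown numerically for a one-sample z-test. The effect size and sample size below are hypothetical; the critical values 1.6449 and 1.96 are the standard α = .05 cutoffs:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Power of a one-sample z-test for a hypothetical effect size d = 0.5
# (in SD units) with n = 25 and alpha = .05.
d, n = 0.5, 25
shift = d * math.sqrt(n)   # distance of the true mean from H0, in standard errors

power_one_tailed = 1.0 - phi(1.6449 - shift)                     # ≈ .80
power_two_tailed = (1.0 - phi(1.96 - shift)) + phi(-1.96 - shift)  # ≈ .71

# For an effect in the predicted direction, the one-tailed test wins.
assert power_one_tailed > power_two_tailed
```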

Null Hypothesis
A Null Hypothesis states that in the general population there is no change, no difference, or no relationship.
Symbol for the Null Hypothesis = Ho
In the context of an experiment, Ho predicts that the IV (treatment) has no effect on the DV (scores) for the population
So, why do we need a null hypothesis?
In behavioral research, we can never actually prove that something is true, but we can prove that it's false.
Provides a starting point for any statistical test.

Alternative Hypothesis
Of course, in hypothesis testing you need to compare something to the null hypothesis.
The Alternative Hypothesis states that there is a change, a difference, or a relationship for the general population.
Alternative Hypothesis = H1
In the context of an experiment, H1 predicts that the IV (treatment) does have an effect on the DV (score) in the population

Pearson Correlation Coefficient
The Pearson Correlation Coefficient – measures the degree and the direction of the linear relationship between two variables.
Correlations range from -1 to +1 and are represented by the symbol r
Correlations of -1 or +1 are considered perfect correlations
The strength of the correlation coefficient is determined by how far r is from 0.
In a positive correlation the two variables tend to change in the same direction
As the value of X increases, the value of Y also tends to increase
As the value of X decreases, the value of Y also tends to decrease
In a negative correlation the two variables tend to go in the opposite directions
As the value of X increases, the value of Y tends to decrease and vice versa – this is an inverse relationship
Correlation DOES NOT imply causation!
There can be four possible explanations for correlating variables:
X is the cause of Y
Y is the cause of X
The correlation between X and Y is coincidental
A third variable is the cause of the correlation between X and Y
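Computing r from the definitional formula r = SP / √(SSx · SSy), on hypothetical paired scores:

```python
import math

# Pearson r by hand: r = SP / sqrt(SSx * SSy). The paired scores
# are hypothetical.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

mx = sum(X) / len(X)                                  # 3.0
my = sum(Y) / len(Y)                                  # 4.0

sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of products = 6.0
ss_x = sum((x - mx) ** 2 for x in X)                  # 10.0
ss_y = sum((y - my) ** 2 for y in Y)                  # 6.0

r = sp / math.sqrt(ss_x * ss_y)                       # ≈ 0.77, positive correlation
```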

Regression
Regression, the statistical technique for finding the best-fitting straight line for a set of data
Ŷ = a + bX – known as the regression equation
Ŷ = the predicted value of Y
b = slope of the regression line
a = Y-intercept (the predicted value of Y when X = 0)
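The least-squares slope and intercept can be computed as b = SP / SSx and a = My – b·Mx (the paired scores below are hypothetical):

```python
# Least-squares regression line: b = SP / SSx, a = My − b * Mx.
# The paired scores are hypothetical.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

mx = sum(X) / len(X)
my = sum(Y) / len(Y)

sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of products
ss_x = sum((x - mx) ** 2 for x in X)                  # SS for X

b = sp / ss_x            # slope = 0.6
a = my - b * mx          # intercept = 2.2

predicted = a + b * 4    # predicted Y when X = 4 -> 4.6
```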

Nonlinear Relationships
Two variables may be related, but not linearly; in this case the relationship is curvilinear
Ex. The relationship between anxiety and test performance (for low to moderate levels of anxiety, performance increases. However, as anxiety goes from moderate to high, test performance decreases)

3 relevant questions
Is there a relationship between the variables?
If so, what is the strength of the relationship?
If so, what is the nature of the relationship?
How are they determined for the Correlation Coefficient, Independent Measures t Test, Repeated Measures t Test, Independent Measures ANOVA and the Repeated Measures ANOVA?