# STAT1 Basics (version from 2017-05-28 15:29)

## Section

Statistics: the art and science of collecting, analyzing, presenting, and interpreting data.
Data: the facts and figures collected, analyzed, and summarized for presentation and interpretation.
Data Set: all the data collected in a particular study.
Elements: the entities on which data are collected.
Variable: a characteristic of interest for the elements.
Observation: the set of measurements obtained for a particular element.
Nominal Scale: the scale of measurement for a variable when the data are labels or names used to identify an attribute of an element. The data can be numeric or non-numeric.
Ordinal Scale: the scale of measurement for a variable when the data exhibit the properties of nominal data and the order or rank of the data is meaningful.
Interval Scale: the scale of measurement for a variable when the data have the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure. Always numeric.
Ratio Scale: the scale of measurement for a variable when the data have all the properties of interval data and the ratio of two values is meaningful. Always numeric.
Categorical Data: labels or names used to identify an attribute. Measured on either a nominal or an ordinal scale.
Quantitative Data: numeric values that indicate how much or how many of something. Measured on either an interval or a ratio scale.
Cross-Sectional Data: data collected at the same, or approximately the same, point in time.
Time Series Data: data collected over several time periods.
Descriptive Statistics: tabular, graphical, and numerical summaries of data.
Statistical Inference: the process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of a population.
Population: the set of all elements of interest in a particular study.
Sample: a subset of the population.
Census: a survey conducted on the entire population to collect data.
Sample Survey: a survey conducted on a sample to collect data.
Primary Data: data collected by the investigator conducting the research; original information from field research.
Secondary Data: data collected by another person or source and reused for the purposes of the research.
Simple Random Sampling: the basic method of sampling, in which elements are selected from the population at random.
Systematic Random Sampling: a method in which we randomly select one of the first k elements and then select every kth element thereafter.
Stratified Sampling: a method in which the population is first divided into strata and a simple random sample is then taken from each stratum.
Cluster Sampling: a method in which the population is first divided into clusters and then a simple random sample of the clusters is taken.
Multi-Stage Sampling: a complex form of cluster sampling.
Convenience Sampling (Accidental): members are chosen based on relative ease of access, such as friends, classmates, or family.
Snowball Sampling: the first respondent refers a friend, who refers another, and so on.
Judgmental Sampling: the researcher chooses the sample judged appropriate for the study.
Deviant Case: cases that differ from the dominant pattern.
Case Study: a study limited to one group.
Ad Hoc Quotas: a quota is established.
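
The probability sampling methods defined above (simple random, systematic, stratified, cluster) can be sketched in a few lines of Python. This is only an illustration using the standard `random` module; the function names, the `seed` parameter, and the example population are assumptions made for the sketch, not part of the original notes.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n distinct elements at random from the population."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, seed=0):
    """Randomly pick one of the first k elements, then every kth element after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0 .. k-1
    return population[start::k]

def stratified_sample(strata, n_per_stratum, seed=0):
    """Take a simple random sample of n_per_stratum elements from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly select whole clusters and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

# Hypothetical population of 20 numbered elements
population = list(range(1, 21))
print(simple_random_sample(population, 5))
print(systematic_sample(population, 4))
```

Note the difference between the last two functions: stratified sampling draws some elements from every group, while cluster sampling draws every element from some groups.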

## SUMMARIZING QUALITATIVE DATA

Frequency Distribution: a tabular summary of data showing the frequency (or number) of items in each of several nonoverlapping classes.
Relative Frequency: the fraction or proportion of the total number of data items belonging to the class.
Percent Frequency: the relative frequency multiplied by 100.
Bar Graph: a graphical device for depicting qualitative data that have been summarized in a frequency, relative frequency, or percent frequency distribution.
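
As a rough sketch of how the three distributions relate, the following Python snippet builds all of them from one list of labels. The function name `frequency_table` and the survey responses are made up for illustration.

```python
from collections import Counter

def frequency_table(items):
    """Return frequency, relative frequency, and percent frequency per class."""
    counts = Counter(items)
    total = len(items)
    return {
        label: {
            "frequency": freq,
            "relative": freq / total,       # fraction of all items
            "percent": 100 * freq / total,  # relative frequency x 100
        }
        for label, freq in counts.items()
    }

# Hypothetical survey responses (qualitative data)
responses = ["Coke", "Pepsi", "Coke", "Sprite", "Coke",
             "Pepsi", "Coke", "Sprite", "Pepsi", "Coke"]

for label, row in frequency_table(responses).items():
    # A crude text bar graph: one '#' per item in the class
    print(f"{label:<7}{'#' * row['frequency']:<6}{row['percent']:.0f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.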

## SUMMARIZING QUANTITATIVE DATA

Histogram: a bar graph with no natural separation between the rectangles of adjacent classes.
Cumulative Frequency Distribution: shows the number of items with values less than or equal to the upper limit of each class.
Cumulative Relative Frequency Distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
Cumulative Percent Frequency Distribution: shows the percentage of items with values less than or equal to the upper limit of each class.
Ogive: a graph of a cumulative distribution.
Exploratory Data Analysis: consists of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
Stem-and-Leaf Display: shows both the rank order and the shape of the distribution of the data.
Crosstabulation: a tabular method for summarizing the data for two variables simultaneously.
Scatter Diagram: a graphical presentation of the relationship between two quantitative variables.
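
The cumulative distributions and the stem-and-leaf display can be illustrated with a short Python sketch. The data values, the class upper limits, and both function names are assumptions chosen for the example; the stem-and-leaf helper assumes non-negative two-digit integers with the tens digit as the stem.

```python
def cumulative_distribution(values, upper_limits):
    """For each class upper limit, count the items less than or equal to it."""
    n = len(values)
    rows = []
    for limit in upper_limits:
        cum = sum(1 for v in values if v <= limit)
        # (upper limit, cumulative frequency, cumulative relative, cumulative percent)
        rows.append((limit, cum, cum / n, 100 * cum / n))
    return rows

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); the units digit is the leaf."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    return stems

# Hypothetical quantitative data (20 observations)
data = [12, 14, 19, 18, 15, 15, 18, 17, 20, 27,
        22, 23, 22, 21, 33, 28, 14, 18, 16, 13]

for limit, cum, rel, pct in cumulative_distribution(data, [14, 19, 24, 29, 34]):
    print(f"<= {limit}: {cum:>2}  {rel:.2f}  {pct:.0f}%")

for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Plotting the cumulative column against the upper limits would give the ogive; the last cumulative frequency always equals the number of observations.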

## MEASURES OF LOCATION

Mean: the average of all the data values in a data set.
Median: the value in the middle when the data items are arranged in ascending order.
Mode: the value that occurs with the greatest frequency.
Percentile: provides information about how the data are spread over the interval from the smallest value to the largest value.
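
A small Python sketch of the four measures of location follows. Mean, median, and mode come from the standard `statistics` module. The `percentile` function is a hypothetical helper that follows one common textbook convention (compute the position i = (p/100)n, round up if it is not an integer, otherwise average the values in positions i and i+1); other conventions exist and can give slightly different answers. The data values are made up.

```python
import math
import statistics

def percentile(values, p):
    """pth percentile (0 < p < 100) using a round-up position rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    i = p * len(data) / 100  # position in the sorted data, 1-based
    if i != int(i):
        return data[math.ceil(i) - 1]   # round up to the next position
    i = int(i)
    return (data[i - 1] + data[i]) / 2  # average positions i and i+1

# Hypothetical data: 12 monthly salaries
salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]

print("mean  :", statistics.mean(salaries))
print("median:", statistics.median(salaries))
print("mode  :", statistics.mode(salaries))
print("85th percentile:", percentile(salaries, 85))
```

Under this convention the 50th percentile coincides with the median, which is a useful sanity check.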

## MEASURES OF VARIABILITY

Range: the difference between the largest and smallest data values in a data set; the simplest measure of variability.
Interquartile Range: the difference between the third quartile and the first quartile; the range for the middle 50% of the data.
Variance: a measure of variability that utilizes all the data; the average of the squared differences between each data value and the mean.
Standard Deviation: the positive square root of the variance.
Coefficient of Variation: indicates how large the standard deviation is in relation to the mean.
Chebyshev’s Theorem: at least $(1 - 1/k^2)$ of the items in any data set will be within $k$ standard deviations of the mean, where $k$ is any value greater than 1.
Empirical Rule: for data with a bell-shaped distribution, approximately 68% of the data values will be within one standard deviation of the mean.
Outlier: an unusually small or unusually large value in a data set.
Five-Number Summary: smallest value, first quartile, median, third quartile, and largest value.
Box Plot: a graphical summary in which a box is drawn with its ends located at the first and third quartiles.
Covariance: a measure of the linear association between two variables.
Weighted Mean: a mean computed by giving each data value a weight that reflects its importance.
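
Several of the measures in this section reduce to short formulas, sketched below in Python. The function names and data are illustrative assumptions. One choice worth flagging: the variance and covariance here use the sample versions (dividing by n - 1); dividing by n instead gives the population versions.

```python
import math

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

def sample_std(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    mean = sum(values) / len(values)
    return 100 * sample_std(values) / mean

def chebyshev_lower_bound(k):
    """Minimum fraction of items within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def weighted_mean(values, weights):
    """Each value counts in proportion to its weight."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical data: sizes of five classes
sizes = [46, 54, 42, 46, 32]
print(data_range(sizes), sample_variance(sizes), sample_std(sizes))
print(round(coefficient_of_variation(sizes), 1))
print(chebyshev_lower_bound(2))  # fraction guaranteed within 2 standard deviations
```

Chebyshev's bound holds for any data set, so it is usually looser than the empirical rule, which applies only to roughly bell-shaped data.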

## PROBABILITY

Probability: a numerical measure of the likelihood that an event will occur.
Experiment: a process that generates well-defined outcomes.
Sample Space: the set of all experimental outcomes.
Sample Point or Experimental Outcome: an element of the sample space.
Event: a collection of sample points; a subset of the sample space.
Tree Diagram: a graphical representation that helps in visualizing a multiple-step experiment.
Counting Rule for Combinations: counts the number of experimental outcomes when the experiment involves selecting n objects from a (usually larger) set of N objects.
Classical Method: assigning probabilities based on the assumption of equally likely outcomes.
Relative Frequency Method: assigning probabilities based on experimentation or historical data.
Subjective Method: assigning probabilities based on judgment.
Mutually Exclusive Events: events that have no sample points in common.
Conditional Probability: the probability of an event given that another event has occurred.
Multiplication Law: provides a way to compute the probability of the intersection of two events.
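
To tie the probability terms together, here is a minimal Python sketch built on a two-step experiment (tossing a coin twice) and the classical method of equally likely outcomes. The events `A`, `B`, and `C` and the helper `probability` are hypothetical names chosen for the example; `math.comb` (Python 3.8+) stands in for the counting rule for combinations.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Sample space of a two-step experiment: every sequence of two coin tosses
sample_space = list(product("HT", repeat=2))

def probability(event):
    """Classical method: favorable sample points over total sample points."""
    return Fraction(len(event), len(sample_space))

A = {o for o in sample_space if o[0] == "H"}        # first toss is heads
B = {o for o in sample_space if o.count("H") == 2}  # both tosses are heads
C = {o for o in sample_space if o.count("H") == 0}  # both tosses are tails

# Conditional probability: P(B | A) = P(A and B) / P(A)
p_b_given_a = probability(A & B) / probability(A)

# Multiplication law: P(A and B) = P(A) * P(B | A)
assert probability(A & B) == probability(A) * p_b_given_a

# Mutually exclusive events share no sample points
assert A & C == set()

# Counting rule for combinations: ways to select n = 2 objects from N = 5
print(comb(5, 2))
print(probability(A), probability(B), p_b_given_a)
```

Using `Fraction` keeps the probabilities exact, so the multiplication law can be checked with `==` rather than a floating-point tolerance.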