# QM 2 ( Reading 7 )

lightenning's version from 2017-02-21 23:24

## Section

Question | Answer |
---|---|

Statistics | Data, and the methods used to analyze data. |

Descriptive Statistics | How large volumes of data are converted into useful, readily understood information by summarizing their important characteristics |

Inferential statistics | Methods used to make forecasts, estimates, or draw conclusions about a larger set of data based on a smaller, representative set. |

Name the Measurement Scales | Nominal, Ordinal, Interval, and Ratio |

Nominal scale | Categorizes the data but does not rank it. |

Ordinal Scale | Sorts data into categories that are ranked (e.g. worst performing to best performing), so a stock's group tells you how it did relative to the other groups, but not by how much. |

Interval Scale | Ranks the data and also keeps the differences between scale values equal, so values can be added and subtracted (e.g. temperature). Zero does not mean absence of the quantity, so multiplication and ratios are not meaningful (60 degrees is not 6 times as hot as 10 degrees). |

Ratio Scale | Strongest scale. It has all the characteristics of the interval scale plus a true origin point (zero), so zero means absence of the quantity and ratios are meaningful. |

Frequency distribution | Tabular illustration of data categorized into a relatively small number of intervals or classes, which together include all the data and are mutually exclusive. Data on any of the measurement scales can be used in a frequency distribution. |

Modal interval | The interval with the highest frequency. |

Relative Frequency | Proportion or fraction of total observations that lies in that particular interval. It is calculated by dividing the absolute frequency of the interval by the total number of observations. |

Cumulative absolute frequency | Number of observations that are less than the upper bound of the interval (the sum of the absolute frequencies of this interval and all the intervals below it, not just this interval). |

Cumulative relative frequency | Number of observations lower than the upper bound of the interval divided by the total number of observations, OR the sum of the relative frequencies of this interval and all the intervals below it. |
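
The frequency cards above can be tied together in a short sketch. The returns and the interval bounds below are made-up illustration values, not from the reading, and each interval is assumed to include its lower bound and exclude its upper bound.

```python
from itertools import accumulate

# Hypothetical annual returns in percent (illustration only).
returns = [-12.0, -3.5, 1.2, 4.8, 5.1, 7.9, 9.4, 12.3, 15.0, 18.6]

# Hypothetical intervals [lower, upper): mutually exclusive and covering all data.
bounds = [(-20, -10), (-10, 0), (0, 10), (10, 20)]

n = len(returns)

# Absolute frequency: number of observations falling in each interval.
absolute = [sum(lo <= r < hi for r in returns) for lo, hi in bounds]

# Relative frequency: absolute frequency divided by total observations.
relative = [f / n for f in absolute]

# Cumulative frequencies: running totals up to each interval's upper bound.
cum_absolute = list(accumulate(absolute))
cum_relative = [c / n for c in cum_absolute]

# Modal interval: the interval with the highest absolute frequency.
modal_interval = bounds[absolute.index(max(absolute))]

for (lo, hi), f, rf, cf, crf in zip(bounds, absolute, relative, cum_absolute, cum_relative):
    print(f"[{lo}, {hi}): abs={f} rel={rf:.2f} cum_abs={cf} cum_rel={crf:.2f}")
```

The last cumulative absolute frequency always equals the number of observations, and the last cumulative relative frequency always equals 1.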

Frequency Histogram | used to graphically represent the data contained in frequency distribution |

Frequency Polygon | Graphically illustrates the data in a frequency distribution. ( the midpoint of the interval represents the interval on X axis while y is the frequency ) |

Arithmetic mean properties | All observations are used in the calculation, all interval and ratio data sets have an arithmetic mean, the sum of deviations from the mean is zero, and a data set has only one arithmetic mean. |

Median property | Although helpful with skewed data sets, it is based solely on position in the sorted data set and does not reflect the magnitudes of the other observations. |

Calculating Median | Sort the data first. Odd n: the observation at position (n+1)/2. Even n: the average of the observations at positions n/2 and (n+2)/2. |

Mode types | Unimodal: one mode. Bimodal: two modes. A data set has no mode at all when every value occurs just once. |
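
A minimal sketch of the median and mode rules above. The function names `median` and `modes` are illustrative, and the input is assumed to be a non-empty list of numbers.

```python
from collections import Counter

def median(values):
    """Median by position: (n+1)/2 for odd n, average of n/2 and (n+2)/2 for even n."""
    data = sorted(values)
    n = len(data)
    if n % 2 == 1:
        # Positions on the card are 1-based, Python indexes are 0-based.
        return data[(n + 1) // 2 - 1]
    return (data[n // 2 - 1] + data[(n + 2) // 2 - 1]) / 2

def modes(values):
    """Return the most frequent value(s), or an empty list if every value occurs once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no mode
    return sorted(v for v, c in counts.items() if c == top)

print(median([3, 1, 2]))        # odd n
print(median([4, 1, 3, 2]))     # even n
print(modes([1, 2, 2, 3, 3]))   # bimodal
print(modes([1, 2, 3]))         # no mode
```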

Weighted mean | Assigns a different weight to each observation: $\bar{X}_w = \sum_{i=1}^{n} w_i X_i$, where the weights $w_i$ sum to 1. |

Geometric mean | Frequently used to average rates of change over time, or to calculate the growth rate of a variable over a period. |

Geometric mean Formula | $G = \sqrt[n]{X_1 X_2 \cdots X_n}$ for non-negative values. For returns (numbers between 0 and 1, or negative): $R_G = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$. *** Remember that you do not take absolute values: a negative return is added to 1 with its sign, just like the rest (a return of -10% becomes 0.90). |

Relationship between Geometric and Arithmetic means | G is always less than or equal to Arithmetic mean. G equals Arithmetic only when all entries are identical, and the difference between G and A increases as the dispersion in observed values increases. |

Harmonic mean | Special kind of weighted mean where the weight of an observation is inversely proportional to its magnitude and is mostly used to determine the average cost of shares purchased over time. |

Harmonic mean formula | $\bar{X}_H = \dfrac{n}{\sum_{i=1}^{n} (1/X_i)}$ |

Relationship between Harmonic mean and Geometric mean | Unless all the observations are equal, H is always less than G, which is always less than A (for positive values). |
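
The ordering of the three means can be checked with Python's standard `statistics` module (Python 3.8 or later for `geometric_mean`). The prices and returns below are made-up illustration values.

```python
import statistics

# Hypothetical prices paid per share over three purchases (illustration only).
prices = [8.0, 9.0, 10.0]

arithmetic = statistics.mean(prices)
geometric = statistics.geometric_mean(prices)
harmonic = statistics.harmonic_mean(prices)

# With unequal positive values: harmonic < geometric < arithmetic.
print(harmonic, geometric, arithmetic)

# Geometric mean return: add 1 to each return, signs included, then subtract 1.
period_returns = [0.10, -0.05, 0.20]  # hypothetical returns
growth = 1.0
for r in period_returns:
    growth *= 1 + r
geometric_return = growth ** (1 / len(period_returns)) - 1

print(geometric_return, statistics.mean(period_returns))
```

The geometric mean return comes out below the arithmetic mean of the same returns, and the gap widens as the returns become more dispersed.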

Quantile | A value at, or below which a stated proportion of the observations in a data set lie. ( Quartile, Quintile, Decile, Percentile ) |

Quantile Formula | $L_y = (n+1)\,\dfrac{y}{100}$, where y is the percentage point at which we are dividing the distribution and $L_y$ is the location of the percentile $P_y$ in the data sorted in ascending order. If, say, you get 2.25 for $L_y$, take the 2nd value from the left and add 0.25 of the difference between the 2nd and 3rd values. |
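
A minimal sketch of the location formula with linear interpolation. The function name `percentile` is illustrative, and clamping to the smallest or largest value when the location falls outside the data is an assumption the card does not cover.

```python
def percentile(values, y):
    """Value of the y-th percentile using the location L_y = (n + 1) * y / 100."""
    data = sorted(values)            # ascending order
    n = len(data)
    location = (n + 1) * y / 100     # 1-based position, may be fractional
    whole = int(location)            # e.g. 2 for a location of 2.25
    fraction = location - whole      # e.g. 0.25 for a location of 2.25
    if whole < 1:
        return data[0]               # assumption: clamp below the first value
    if whole >= n:
        return data[-1]              # assumption: clamp above the last value
    lower, upper = data[whole - 1], data[whole]
    return lower + fraction * (upper - lower)

sample = [10, 20, 30, 40, 50, 60, 70, 80]  # hypothetical data, n = 8
print(percentile(sample, 25))  # location 2.25: 20 + 0.25 * (30 - 20)
print(percentile(sample, 50))  # location 4.5: halfway between 40 and 50
```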

Dispersion | Variability or spread of random variable around its central tendency ( risk around mean which is the expected return ) |

Range | Maximum value - Minimum value |

Mean Absolute deviation ( MAD) | The average of absolute values of deviations of observations in a data set from its mean. |

MAD formula | $\text{MAD} = \dfrac{\sum_{i=1}^{n} \lvert X_i - \bar{X} \rvert}{n}$ |

Population Variance formula | $\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$ |

Sample Variance formula | $s^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$ |
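
The range, MAD, and variance cards above can be checked numerically. The data set is a made-up illustration, and the results are compared against the standard `statistics` module.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
n = len(data)
mean = sum(data) / n

# Range: maximum value minus minimum value.
value_range = max(data) - min(data)

# Mean absolute deviation: average absolute deviation from the mean.
mad = sum(abs(x - mean) for x in data) / n

# Sum of squared deviations, shared by both variance formulas.
squared = sum((x - mean) ** 2 for x in data)

population_variance = squared / n        # divide by N
sample_variance = squared / (n - 1)      # divide by n - 1

print(value_range, mad, population_variance, sample_variance)
print(statistics.pvariance(data), statistics.variance(data))
```

Dividing by n - 1 instead of n makes the sample variance slightly larger than the population variance computed from the same numbers.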

Semivariance | Average of the squared deviations of the observations that fall below the mean |

Semideviation | Positive square root of semivariance |

Chebyshev's inequality | Gives the minimum proportion of observations in a data set that lie within k standard deviations of the mean, regardless of the shape of the distribution. |

Chebyshev's formula | Proportion of observations within k standard deviations of the mean is at least $1 - 1/k^2$ for all k > 1, where k is the desired distance from the mean measured in standard deviations. |
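
A short sketch of the bound next to the actual proportion in a small data set. The function names and the data are illustrative, and the population standard deviation is assumed.

```python
import statistics

def chebyshev_min_proportion(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

def proportion_within(values, k):
    """Actual proportion of observations within k population standard deviations."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    inside = sum(abs(x - mean) <= k * std for x in values)
    return inside / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

for k in (1.5, 2, 3):
    print(k, chebyshev_min_proportion(k), proportion_within(data, k))
```

The actual proportion is never below the bound, since the inequality only states a minimum.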

When is it not beneficial to use standard deviation to compare different populations | When the data sets being compared have significantly different means, and when the data sets have different units of measurement. |

Coefficient of Variation | Ratio of std of the data set to its mean ( risk per unit of return ) |

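Coefficient of Variation Formula | $CV = \dfrac{s}{\bar{X}}$ |

Sharpe Ratio | Ratio of the excess return over the risk-free rate to the standard deviation of returns |

Sharpe Ratio Formula | $\text{Sharpe ratio} = \dfrac{\bar{r}_p - r_f}{s_p}$ |

Issues with Sharpe ratio | 1. It does not quite work with negative Sharpe ratios: when the ratio is negative, increasing the risk moves it closer to zero (makes it higher), so a higher ratio does not mean better risk-adjusted performance. 2. Standard deviation is an appropriate risk measure for symmetrical return distributions such as the normal curve, not for the asymmetrical distributions that many investments have. |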
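
A minimal sketch of the two ratios, including the negative Sharpe ratio issue. All returns, rates, and standard deviations below are made-up illustration values, and the function names are illustrative.

```python
import statistics

def coefficient_of_variation(returns):
    """Sample standard deviation divided by the mean: risk per unit of return."""
    return statistics.stdev(returns) / statistics.mean(returns)

def sharpe_ratio(mean_return, risk_free_rate, std):
    """Excess return over the risk-free rate per unit of standard deviation."""
    return (mean_return - risk_free_rate) / std

cv = coefficient_of_variation([0.04, 0.06, 0.08])  # hypothetical returns
print(cv)

# Positive excess return: more risk lowers the ratio, as expected.
print(sharpe_ratio(0.08, 0.03, 0.10), sharpe_ratio(0.08, 0.03, 0.20))

# Negative excess return: more risk raises the ratio toward zero.
low_risk = sharpe_ratio(0.01, 0.03, 0.10)
high_risk = sharpe_ratio(0.01, 0.03, 0.20)
print(low_risk, high_risk)
```

In the negative case the riskier portfolio shows the higher ratio even though it earns the same return, which is why negative Sharpe ratios are hard to compare.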

Positively Skewed | Long tail on the right side of the mean, with Mean > Median > Mode. Outliers on the right side pull the mean up and increase the skew. |

Negatively skewed | Long tail on the left side of the mean, with Mean < Median < Mode. |

Sample Skewness Formula | $S_k = \dfrac{n}{(n-1)(n-2)} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3}$ |

Sample skewness for large samples | $S_k \approx \dfrac{1}{n} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3}$ |

Properties of Sample skewness | A positively skewed sample has positive Sk, a negatively skewed sample has negative Sk, symmetrical distributions such as the normal have Sk of 0, and an absolute value of Sk greater than 0.5 is considered significantly skewed. |
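
A minimal sketch of the sample skewness formula above, assuming at least three observations that are not all equal. The function name and the data sets are illustrative.

```python
import statistics

def sample_skewness(values):
    """Sample skewness: n / ((n-1)(n-2)) * sum of cubed deviations / s^3."""
    n = len(values)
    mean = statistics.mean(values)
    s_cubed = statistics.variance(values) ** 1.5   # s^3 from the sample variance
    cubed = sum((x - mean) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed / s_cubed

print(sample_skewness([1, 2, 3, 4, 5]))        # symmetrical data
print(sample_skewness([1, 1, 1, 2, 10]))       # long right tail
print(sample_skewness([-10, -2, -1, -1, -1]))  # long left tail
```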

Kurtosis | Measures the extent to which a distribution is more or less peaked than a normal distribution. A normal distribution has a kurtosis of 3. It is usually reported as Excess Kurtosis (Ke) which is Kurtosis - 3 ( normal kurtosis ) |

Types of kurtosis | Leptokurtic: more peaked with fatter tails than a normal distribution, Ke > 0. Platykurtic: less peaked with thinner tails than a normal distribution, Ke < 0. Mesokurtic: identical to a normal distribution, Ke = 0. |

Sample Kurtosis Formula | $K_e = \left[ \dfrac{n(n+1)}{(n-1)(n-2)(n-3)} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{s^4} \right] - \dfrac{3(n-1)^2}{(n-2)(n-3)}$ |

Sample Kurtosis Formula for large n | $K_e \approx \left[ \dfrac{1}{n} \cdot \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{s^4} \right] - 3$ |
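
A minimal sketch of both excess kurtosis formulas above, assuming at least four observations that are not all equal. The function names and the data are illustrative.

```python
import statistics

def sample_excess_kurtosis(values):
    """Sample excess kurtosis with the small-sample adjustment terms."""
    n = len(values)
    mean = statistics.mean(values)
    s_fourth = statistics.variance(values) ** 2    # s^4 from the sample variance
    fourth = sum((x - mean) ** 4 for x in values)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth / s_fourth
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second

def large_sample_excess_kurtosis(values):
    """Large-n approximation: (1/n) * sum of fourth-power deviations / s^4, minus 3."""
    n = len(values)
    mean = statistics.mean(values)
    s_fourth = statistics.variance(values) ** 2
    fourth = sum((x - mean) ** 4 for x in values)
    return fourth / (n * s_fourth) - 3

flat = [1, 2, 3, 4, 5]  # evenly spread data, flatter than a normal curve
print(sample_excess_kurtosis(flat))
print(large_sample_excess_kurtosis(flat))
```

Evenly spread data like this gives a negative Ke (platykurtic). The two formulas only agree closely when n is large.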
