The data consist of the salaries (in £'000s) of 140 male and 140 female full-time workers. A comparison of salaries between the two groups is given below.
Table 1 gives the summary statistics for the salaries of women and Table 2 gives those for men. The measures of central tendency are greater for men than for women.
Women earn an average of 28.101, with a standard deviation of 15.25; 50% of women earn 23.3 or less. Men earn an average of 33.898, with a standard deviation of 21.83; 50% of men earn 24.65 or less (Rohatgi & Saleh, 2015).
The salary distributions for both men and women are right-skewed, although the distribution for men has a longer right tail. Both skewness measures are greater than +1, so both distributions are highly skewed (Rohatgi & Saleh, 2015).
The excess kurtosis reported by Excel is greater than 1 for both men and women, indicating that both distributions have sharper, higher peaks and fatter, longer tails than the normal distribution (Salkind, 2015). The distribution for men, however, is more sharply peaked than that for women.
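The skewness and excess kurtosis figures quoted above come from Excel. As a rough sketch of what those measures compute, the functions below follow the sample formulas documented for Excel's SKEW and KURT functions. The function names and the salary values are hypothetical and are not the study data.

```python
from statistics import mean, stdev

def excel_skew(values):
    """Sample skewness using the formula documented for Excel's SKEW."""
    n = len(values)
    m, s = mean(values), stdev(values)  # stdev uses the n - 1 denominator
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def excel_kurt(values):
    """Sample excess kurtosis using the formula documented for Excel's KURT."""
    n = len(values)  # needs at least four observations
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

# Hypothetical salaries in £'000s (illustration only, not the study data)
sample = [12.0, 15.5, 18.0, 21.0, 23.5, 26.0, 31.0, 40.0, 62.0, 95.0]

print("skewness:", round(excel_skew(sample), 3))
print("excess kurtosis:", round(excel_kurt(sample), 3))
```

A positive skewness value points to a longer right tail, and a positive excess kurtosis to heavier tails than the normal distribution.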
Inter-quartile range (IQR) = Q3 – Q1
Read from the ogive, the median is 23.9, Q1 is 16.9 and Q3 is 34.4. The IQR is thus 34.4 – 16.9 = 17.5.
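Reading a quartile off an ogive amounts to linear interpolation within the class that contains the target cumulative frequency. The sketch below illustrates that idea; the function name, class boundaries and cumulative frequencies are hypothetical, since the grouping used for the ogive is not given in the text.

```python
def ogive_quantile(boundaries, cum_freq, p):
    """Interpolate the p-th quantile (0 < p <= 1) from grouped data.

    boundaries: class boundaries b0 < b1 < ... < bk
    cum_freq:   cumulative frequency at each boundary, starting at 0
    """
    target = p * cum_freq[-1]
    for i in range(1, len(boundaries)):
        if cum_freq[i] >= target:
            lo, hi = boundaries[i - 1], boundaries[i]
            f_lo, f_hi = cum_freq[i - 1], cum_freq[i]
            # Straight-line segment of the ogive between the two boundaries
            return lo + (target - f_lo) / (f_hi - f_lo) * (hi - lo)
    raise ValueError("p must lie in (0, 1]")

# Hypothetical grouping (illustration only, not the study data)
boundaries = [0, 10, 20, 30, 40, 50]
cum_freq = [0, 10, 40, 70, 90, 100]

q1 = ogive_quantile(boundaries, cum_freq, 0.25)
median = ogive_quantile(boundaries, cum_freq, 0.50)
q3 = ogive_quantile(boundaries, cum_freq, 0.75)
print("Q1:", q1, "median:", round(median, 2), "Q3:", q3, "IQR:", q3 - q1)
```

Because every observation in a class is treated as evenly spread across it, values obtained this way generally differ from those computed on the raw data.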
The corresponding Excel-computed values are:
The graphically determined statistics differ from the Excel-computed ones: the median and IQR are higher when read from the ogive. This is because the graphically determined values are based on grouped data, whereas Excel works from the individual observations.
Of all the women, 32.1429% earn less than 19.025, the first quartile of the salary distribution for men, while 15.714% earn more than 43.8, the third quartile of the salary distribution for men.
This suggests that a woman is more than twice as likely to earn less than the men's lower quartile as she is to earn more than the men's upper quartile, reinforcing the notion that women are paid less than men.
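The comparison above counts how many women fall below or above a threshold taken from the men's distribution. A minimal sketch of that calculation follows; the helper names and the list of women's salaries are hypothetical, while the two thresholds are the men's quartiles quoted in the text. The reported percentages are consistent with 45 and 22 of the 140 women respectively.

```python
def share_below(values, threshold):
    """Fraction of values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)

def share_above(values, threshold):
    """Fraction of values strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)

# Hypothetical women's salaries in £'000s (illustration only, not the study data)
women = [14.2, 17.8, 18.9, 21.0, 23.3, 26.5, 30.1, 38.4, 44.0, 52.7]
men_q1, men_q3 = 19.025, 43.8  # men's quartiles as quoted in the text

below = share_below(women, men_q1)
above = share_above(women, men_q3)
print("below men's Q1:", below, "above men's Q3:", above)

# Counts out of 140 that reproduce the reported percentages
pct_below = round(45 / 140 * 100, 4)
pct_above = round(22 / 140 * 100, 3)
print(pct_below, pct_above, "ratio:", round(45 / 22, 3))
```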
The 95% confidence interval for the mean salary of each group is given by:
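x̄ ± c × (s / √n), where x̄ is the sample mean, s the sample standard deviation, n the sample size and c the critical value for 95% confidence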
The computed intervals are (26.842, 29.361) for women and (32.096, 35.701) for men, in thousands of pounds.
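For readers who want to see how such an interval is built from summary statistics, here is a minimal sketch. It assumes a normal critical value and the usual standard error s/√n; the function name is hypothetical. Because the reported intervals were produced in Excel from the raw data, possibly with a different critical value or standard error, the output of this sketch need not reproduce them.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(sample_mean, sample_sd, n, level=0.95):
    """Confidence interval for a mean, assuming a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sample_sd / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Summary statistics quoted in the text (salaries in £'000s)
women_ci = mean_ci(28.101, 15.25, 140)
men_ci = mean_ci(33.898, 21.83, 140)
print("women:", tuple(round(v, 3) for v in women_ci))
print("men:  ", tuple(round(v, 3) for v in men_ci))
```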
The 95% confidence interval for the difference between the mean salaries of men and women is given by:
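(x̄ − ȳ) ± c × √(s_x²/n + s_y²/m), where x̄ and ȳ are the sample means for men and women, s_x and s_y the corresponding sample standard deviations and c the critical value for 95% confidence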
Here n and m are both equal to 140. The computed interval is (5.611, 5.984), in thousands of pounds.
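An interval for the difference of two means can be sketched in the same way. The version below assumes independent samples, a normal critical value and the unpooled standard error √(s_x²/n + s_y²/m); with n = m this equals the pooled form. The function name is hypothetical, and the result need not coincide with the Excel-based interval reported above.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(mean_x, sd_x, n, mean_y, sd_y, m, level=0.95):
    """Confidence interval for mean_x - mean_y from two independent samples."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    std_err = sqrt(sd_x ** 2 / n + sd_y ** 2 / m)
    diff = mean_x - mean_y
    return diff - z * std_err, diff + z * std_err

# Men minus women, using the summary statistics quoted in the text (£'000s)
low, high = diff_ci(33.898, 21.83, 140, 28.101, 15.25, 140)
print("difference CI:", (round(low, 3), round(high, 3)))
```

If the resulting interval contains zero, the data do not rule out equal mean salaries at the chosen confidence level.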
The mean is the average of all values in the dataset and minimizes the mean squared error. It is the most popular and generally preferred measure of central tendency. It is, however, heavily influenced by outliers, which can distort its value. When outliers are present, the median or the mode is the preferred measure. The median is the middle value of the ordered dataset, that is, the value that divides the data into two halves of equal frequency. The median is preferred over the mean when the distribution is skewed. The mode is the value with the highest frequency in the dataset. It is the only viable measure of central tendency of the three when the data are nominal (Rohatgi & Saleh, 2015).
The given datasets are right-skewed, which makes the median the most appropriate measure of central tendency.
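The effect of an outlier on the mean and the median can be seen in a small example. The values below are hypothetical and chosen only to illustrate the point; they are not taken from the salary data.

```python
from statistics import mean, median

# Hypothetical salaries in £'000s (illustration only)
typical = [20, 22, 23, 25, 27]
with_outlier = typical + [150]  # one very high earner added

print("mean:  ", mean(typical), "->", mean(with_outlier))      # 23.4 -> 44.5
print("median:", median(typical), "->", median(with_outlier))  # 23 -> 24.0
```

A single extreme salary nearly doubles the mean but moves the median by only one unit, which is why the median is the safer summary for right-skewed salaries.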
The range is the difference between the highest and lowest values of the dataset. It is the crudest and simplest measure of dispersion. Its main disadvantage is that it considers only the two extreme data points: if either of them is an outlier, the range becomes large even when the data are not highly dispersed at all.
The standard deviation is the most widely used measure of dispersion. It is the square root of the mean of the squared differences of all data points from the mean, and therefore takes every data point into account. It measures how tightly the data points are clustered around the mean (Rohatgi & Saleh, 2015).
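The two measures of dispersion can be written out in a few lines. In this sketch the function names and data are hypothetical; the `sample` flag switches between dividing by n (the definition given above) and by n − 1 (the sample version).

```python
from math import sqrt

def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

def std_dev(values, sample=True):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    m = sum(values) / n
    squared = sum((x - m) ** 2 for x in values)
    return sqrt(squared / (n - 1 if sample else n))

# Hypothetical values (illustration only, not the study data)
data = [2, 4, 4, 4, 5, 5, 7, 9]
print("range:", value_range(data))
print("population sd:", std_dev(data, sample=False))
print("sample sd:", round(std_dev(data), 4))

# One outlier inflates the range far more than the bulk of the data suggests
tight = [20, 21, 22, 23, 24]
print(value_range(tight), value_range(tight + [150]))
```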
References
Rohatgi, V. K., & Saleh, A. M. E. (2015). An introduction to probability and statistics. John Wiley & Sons.
Salkind, N. J. (2015). Excel statistics: A quick guide. SAGE Publications.