Chapter 2: Business Analytics: Data Analysis & Decision Making, 5th Edition, Albright

Exam (worked solutions)


1. A sample of a population taken at one particular point in time is categorized as:
   a. categorical  b. discrete  c. cross-sectional  d. time-series
   Answer: c

2. Excel stores dates as
   a. numbers  b. variables  c. records  d. text
   Answer: a

3. Researchers may gain insight into the characteristics of a population by examining a
   a. mathematical model describing the population  b. sample of the population  c. description of the population  d. replica
   Answer: b

4. In order for the characteristics of a sample to be generalized to the entire population, it should be:
   a. symbolic of the population  b. atypical of the population  c. representative of the population  d. illustrative of the population
   Answer: c

5. Coding males as 1 and females as 0 in a data set illustrates the use of
   a. nominal variables  b. dummy variables  c. numerical variables  d. ordinal variables
   Answer: b

6. Gender and State are examples of which type of data?
   a. Discrete data  b. Continuous data  c. Categorical data  d. Ordinal data
   Answer: c

7. The daily closing values of the Dow Jones Industrial Average are examples of
   a. cross-sectional data  b. discrete data  c. time-series data  d. continuous data
   Answer: c

Copyright Cengage Learning. Powered by Cognero.

8. Data that arise from counts are called
   a. continuous data  b. nominal data  c. counted data  d. discrete data
   Answer: d

9. A variable is classified as ordinal if
   a. there is a natural ordering of categories  b. there is no natural ordering of categories  c. the data arise from continuous measurements  d. we track the variable through a period of time
   Answer: a

10. Categorizing age variables as "young," "middle-aged," and "elderly" is an example of
   a. counting  b. ordering  c. value adding  d. binning  e. categorizing
   Answer: d

11. A histogram that is positively skewed is also called
   a. skewed to the right  b. skewed to the left  c. balanced  d. symmetric
   Answer: a

12. What measure of distribution relates to extreme events, such as a stock market crash?
   a. asymmetric  b. kurtosis  c. negatively skewed  d. skewness
   Answer: b

13.
What is the most common type of chart for showing the distribution of a numerical variable?
   a. time series graph  b. histogram  c. bin  d. box plot
   Answer: b

14. As a measure of variability, what is defined as the maximum value minus the minimum value?
   a. variance  b. standard deviation  c. mean  d. range  e. median
   Answer: d

15. The median can also be described as the
   a. middle observation when the data values are arranged in ascending order  b. population mean  c. second percentile  d. the average of all values
   Answer: a

16. The difference between the first and third quartile is called the
   a. interquartile range  b. interdependent range  c. unimodal range  d. bimodal range  e. mid range
   Answer: a

17. If a value represents the 95th percentile, this means that
   a. 95% of all values are below this value  b. 95% of all values are above this value  c. 95% of the time you will observe this value  d. there is a 5% chance that this value is incorrect  e. there is a 95% chance that this value is correct
   Answer: a

18. In a generic box plot, the x inside the box indicates the location of the
   a. mean  b. median  c. minimum value  d. maximum value
   Answer: a

19. In a generic box plot, the vertical line inside the box indicates the location of the
   a. mean  b. median  c. mode  d. minimum value  e. maximum value
   Answer: b

20. Which of the following are the three most common measures of central tendency?
   a. Mean, median, and mode  b. Mean, variance, and standard deviation  c. Mean, median, and variance  d. Mean, median, and standard deviation  e. First quartile, second quartile, and third quartile
   Answer: a

21. The length of the box in the box plot portrays the
   a. mean  b. median  c. range  d. interquartile range  e. third quartile
   Answer: d

22. With symmetric, "bell-shaped" distributions, approximately what percent of the observations are within two standard deviations of the mean?
   a. 50%  b. 68%  c. 95%  d. 99.7%  e.
   100%
   Answer: c

23. The mode is best described as the
   a. middle observation  b. same as the average  c. 50th percentile  d. most frequently occurring value  e. third quartile
   Answer: d

24. The interquartile range (IQR) represents what percent of the observations?
   a. lower 25%  b. middle 50%  c. upper 75%  d. upper 90%  e. 100%
   Answer: b

25. Which of the following statements is true for the following data values: 7, 5, 6, 4, 7, 8, and 12?
   a. The mean, median and mode are all equal  b. Only the mean and median are equal  c. Only the mean and mode are equal  d. Only the median and mode are equal
   Answer: a

26. How is the median defined if the number of observations is even?
   a. the average of the two middle observations  b. the difference between the two middle observations  c. the most frequent observation  d. the difference between the highest and smallest observation
   Answer: a

27. The average score for a class of 30 students was 75. The 20 male students in the class averaged 70. The 10 female students in the class averaged
   a. 75  b. 85  c. 60  d. 70  e. 80
   Answer: b

28. If the mean is 75 and two observations have values of 65 and 85, what is the squared deviation of each?
   a. 100  b. 20  c. 400  d. 10
   Answer: a

29. Expressed in percentiles, the interquartile range is the difference between the
   a. 10th and 60th percentiles  b. 15th and 65th percentiles  c. 20th and 70th percentiles  d. 25th and 75th percentiles  e. 35th and 85th percentiles
   Answer: d

30. A sample of 20 observations has a standard deviation of 4. The sum of the squared deviations from the sample mean is
   a. 400  b. 320  c. 304  d. 288  e. 180
   Answer: c

31. Where will you find "time" on a time series graph?
   a. horizontal axis  b. first column  c. vertical axis  d. last column
   Answer: a

32. Age, height, and weight are examples of numerical data.
   a. True  b. False

33.
Data can be categorized as cross-sectional or time series.
   a. True  b. False

34. All nominal data may be treated as ordinal data.
   a. True  b. False

35. Categorical variables can be classified as either discrete or continuous.
   a. True  b. False

36. A population includes all elements or objects of interest in a study, whereas a sample is a subset of the population used to gain insights into the characteristics of the population.
   a. True  b. False

37. The number of car insurance policy holders is an example of a discrete numerical variable.
   a. True  b. False

38. A variable (or field or attribute) is a characteristic of members of a population, whereas an observation (or case or record) is a list of all variable values for a single member of a population.
   a. True  b. False

39. Phone numbers, Social Security numbers, and zip codes are examples of numerical variables.
   a. True  b. False

40. Cross-sectional data are data on a population at a distinct point in time, whereas time series data are data collected over time.
   a. True  b. False

41. A data set is typically a rectangular array of data, with observations in columns and variables in rows.
   a. True  b. False

42. Both ordinal and nominal variables are categorical.
   a. True  b. False

43. The median of a data set with 30 values would be the average of the 15th and the 16th values when the data values are arranged in ascending order.
   a. True  b. False

44. The count of categories is the only meaningful way to summarize categorical data.
   a. True  b. False

45. Using dummy variables is an efficient way of determining counts of categorical variables.
   a. True  b. False

46. As a graphical tool, the histogram is ideal for showing whether the distribution of a numerical variable is symmetric or skewed.
   a. True  b. False

47. A distribution with a high kurtosis has almost all of its observations within three standard deviations of the mean.
   a. True  b. False

48.
A frequency table indicates how many observations fall within each category, and a histogram is its graphical analog.
   a. True  b. False

49. In the term "frequency table," frequency refers to the counts of observations in specified categories.
   a. True  b. False

50. A distribution of a numerical variable with no skewness is said to be symmetric.
   a. True  b. False

51. Suppose that a sample of 10 observations has a standard deviation of 3, then the sum of the squared deviations from the sample mean is 30.
   a. True  b. False

52. A histogram is based on binning the variable, which means putting the variable into discrete categories.
   a. True  b. False

53. The mean is a measure of central tendency.
   a. True  b. False

54. Unlike histograms, box plots depict only one aspect of a variable.
   a. True  b. False

55. In an extremely right-skewed distribution, the mean is much smaller than the median.
   a. True  b. False

56. Mean absolute deviation (MAD) is the average of the squared deviations.
   a. True  b. False

57. The median is one of the most frequently used measures of variability.
   a. True  b. False

58. Assume that the histogram of a data set is symmetric and bell shaped, with a mean of 75 and standard deviation of 10. Then, approximately 95% of the data values were between 55 and 95.
   a. True  b. False

59. Abby has been keeping track of what she spends to rent movies. The last seven weeks' expenditures, in dollars, were 6, 4, 8, 9, 6, 12, and 4. The mean amount Abby spends on renting movies is $7.
   a. True  b. False

60. The value of the mean times the number of observations equals the sum of all of the data values.
   a. True  b. False

61. The difference between the largest and smallest values in a data set is called the range.
   a. True  b. False

62. There are four quartiles that divide the values in a data set into four equal parts.
   a. True  b. False

63.
Suppose that a sample of 8 observations has a standard deviation of 2.50, then the sum of the squared deviations from the sample mean is 17.50.
   a. True  b. False

64. The core purpose of time series graphs is to detect historical patterns in the data.
   a. True  b. False

65. Time series graphs chart the values of one or more time series, using time on the vertical axis.
   a. True  b. False

66. Because they represent such extreme values, outliers should be eliminated from statistical analyses.
   a. True  b. False

A manager for Marko Manufacturing, Inc. has recently been hearing some complaints that women are being paid less than men for the same type of work in one of their manufacturing plants. The box plots shown below represent the annual salaries for all salaried workers in that facility (40 men and 34 women).

67. Would you conclude that there is a difference between the salaries of women and men in this plant? Justify your answer.
   Answer: Yes. The men seem to have higher salaries than the women do in many cases. We can see from the box plots that the mean and median values for the men are both higher than for the women. You can also see from the box plots that the middle 50% of salaries for men is above the median for women. This means that if you were in the 25th percentile for men, you would be above the 50th percentile for women. You can also see that the mean and median salaries for the men are about $10,000 above those for the women.

68. How large must a person's salary be to qualify as an outlier on the high side? How many outliers are there in these data?
   Answer: A person's salary should be somewhere above $70,000. There is one male salary that would be considered an outlier (at approximately $80,000).

69. What can you say about the shape of the distributions given the accompanying box plots?
   Answer: They both appear to be slightly skewed to the right (both have a mean > median).
   The total variation seems to be close for both distributions (with one outlier for the male salaries), but there seems to be more variation in the middle 50% for the women than for the men. The men's salaries seem to be clustered more closely around the mean than the women's.

A statistics professor has just given a final examination in his statistical inference course. He is particularly interested in learning how his class of 40 students performed on this exam. The scores are shown below.

   77  81  74  77  79  73  80  85  86  73
   83  84  81  73  75  91  76  77  95  76
   90  85  92  84  81  64  75  90  78  78
   82  78  86  86  82  70  76  78  72  93

70. What are the mean and median scores on this exam?
   Answer: Mean = 80.40, Median = 79.50

71. Explain why the mean and median are different.
   Answer: There are a few higher exam scores that tend to pull the mean away from the middle of the distribution. While there is a slight amount of positive skewness in the distribution (skewness = 0.182), the mean and the median are essentially equivalent in this case.

The data shown below contains family incomes (in thousands of dollars) for a set of 50 families sampled in 2000 and 2010. Assume that these families are good representatives of the entire United States.

   2000  2010    2000  2010    2000  2010
     58    54      33    29      73    69
      6     2      14    10      26    22
     59    55      48    44      64    70
     71    57      20    16      59    55
     30    26      24    20      11     7
     38    34      82    78      70    66
     36    32      95    97      31    27
     33    29      12     8      92    88
     72    68      93    89     115   111
    100    96     100   102      62    58
      1     0      51    47      23    19
     27    23      22    18      34    30
     22    47      50    75      36    61
    141   166     124   149     125   150
     72    97     113   138     121   146
    165   190     118   143      88   113
     79   104      96   121

72. Find the mean, median, standard deviation, first and third quartiles, and the 95th percentile for family incomes in both years.
   Answer:

                        Income 2000   Income 2010
   Mean                      62.820        67.120
   Median                    59.000        57.500
   Standard deviation        39.786        48.087
   First quartile            30.250        27.500
   Third quartile            92.750        97.000
   95th percentile          124.550        149.55

73. A political figure running for re-election claimed that the country was better off in 2010 than in 2000, because the average income increased. Do you agree?
   Answer: It is true that the mean increased slightly, but the median decreased and the standard deviation increased. The 95th percentile shows that the mean increase might be because the rich got richer.

74. Generate a box plot to summarize the data. What does the box plot indicate?
   Answer: The box plot shows that there is not much difference between the two populations.

In an effort to provide more consistent customer service, the manager of a local fast-food restaurant would like to know the dispersion of customer service times in relation to their average value for the facility's drive-up window. The table below provides summary measures for the customer service times (in minutes) for a sample of 50 customers collected over the past week.

   Count                 50.000
   Mean                   0.873
   Median                 0.885
   Standard deviation     0.432
   Minimum                0.077
   Maximum                1.608
   Variance               0.187
   Skewness              -0.003

75. Interpret the variance and standard deviation of this sample.
   Answer: The variance = 0.187 (minutes squared) and this represents the average of the squared deviations from the mean. The standard deviation = 0.432 (minutes) and is the square root of the variance. Both the variance and standard deviation measure the variation around the mean of the data. However, it is easier to interpret the standard deviation because it is expressed in the same units (minutes) as the values of the random variable (customer service time).

76. Are the empirical rules applicable in this case? If so, apply them and interpret your results.
   If not, explain why the empirical rules are not applicable here.
   Answer: Considering that this distribution is only very slightly skewed to the left, it is acceptable to apply the empirical rules as follows:
   Approximately 68% of the customer service times will fall between 0.873 ± 0.432, that is between 0.441 and 1.305 minutes.
   Approximately 95% of the customer service times will fall between 0.873 ± 2(0.432), that is between 0.009 and 1.737 minutes.
   Approximately 99.7% of the customer service times will fall between 0.873 ± 3(0.432), that is between 0 and 2.169 minutes (the lower end is set to zero because service times cannot assume negative values).

77. Explain why the mean is slightly lower than the median in this case.
   Answer: The data is slightly skewed to the left. This causes the mean to be slightly lower than the median. It is important to understand that service times are bounded on the lower end by zero (it is impossible for the service time to be negative). However, there is no boundary on the maximum service time. Therefore, the smaller service times cause the mean to be somewhat lower than the median.

Below you will find summary measures on starting salaries for classroom teachers across the United States. You will also find a list of selected states and their average starting teacher salary. All values are in thousands of dollars.

Starting salaries for classroom teachers across the United States

                         Salary
   Count                 51.000
   Mean                  35.890
   Median                35.000
   Standard deviation     6.226
   Minimum               26.300
   Maximum               50.300
   Variance              38.763
   First quartile        31.550
   Third quartile        40.050

Selected states and their average starting teacher salary

   State            Salary
   Alabama            31.3
   Colorado           35.4
   Connecticut        50.3
   Delaware           40.5
   Nebraska           31.5
   Nevada             36.2
   New Hampshire      35.8
   New Jersey         47.9
   New Mexico         29.6
   South Carolina     31.6
   South Dakota       26.3
   Tennessee          33.1
   Texas              32.0
   Utah               30.6
   Vermont            36.3
   Virginia           35.0
   Wyoming            31.6

78.
Which of the states listed paid their teachers average salaries that exceed at least 75% of all average salaries?
   Answer: Connecticut at 50.3; Delaware at 40.5; and New Jersey at 47.9 (all those > 40.05).

79. Which of the states listed paid their teachers average salaries that are below 75% of all average salaries?
   Answer: Alabama at 31.3; Nebraska at 31.5; New Mexico at 29.6; South Dakota at 26.3; and Utah at 30.6 (all those < 31.55).

80. What salary amount represents the second quartile?
   Answer: $35,000 (median)

81. How would you describe the salary of Virginia's teachers compared to those across the entire United States? Justify your answer.
   Answer: Virginia's teacher salary = $35,000, which is also the median. Virginia is at the 50th percentile, meaning that 50% of the teachers' salaries across the U.S. are below the Virginia teacher salary and 50% of the salaries are above.

Suppose that an analysis of a set of test scores reveals that: [missing]

82. What do these statistics tell you about the shape of the distribution?
   Answer: The fact that [missing] is greater than [missing] indicates that the distribution is skewed to the left.

83. What can you say about the relative position of each of the observations 34, 84, and 104?
   Answer: Since 34 is less than [missing], the observation 34 is among the lowest 25% of the values. The value 84 is a bit smaller than the middle value, which is [missing]. Since [missing], the value 104 is larger than about 75% of the values.

84. Calculate the interquartile range. What does this tell you about the data?
   Answer: IQR = 105 - 45 = 60. This means that the middle 50% of the test scores are between 45 and 105.

The following data represent the number of children in a sample of 10 families from Chicago: 4, 2, 1, 1, 5, 3, 0, 1, 0, and 2.

85. Compute the mean number of children.
   Answer: Mean = 1.90

86. Compute the median number of children.
   Answer: Median = 1.5

87.
Is the distribution of the number of children symmetrical or skewed? Why?
   Answer: The distribution is positively skewed because the mean is larger than the median.

88. The data below represents monthly sales for two years of beanbag animals at a local retail store (Month 1 represents January and Month 12 represents December). Given the time series plot below, do you see any obvious patterns in the data? Explain.
   Answer: This is a representation of seasonal data. There seems to be a small increase in months 3, 4, and 5 and a large increase at the end of the year. The sales of this item seem to peak in December and have a significant dropoff in January.

89. An operations management professor is interested in how her students performed on her midterm exam. The histogram shown below represents the distribution of exam scores (where the maximum score is 100) for 50 students. Based on this histogram, how would you characterize the students' performance on this exam?
   Answer: Exam scores are fairly normally distributed. The majority of scores (76%) are between 70 and 90 points, while 12% of scores are above 90 and 12% of scores are 70 or below.

90. The proportion of Americans under the age of 18 who are living below the poverty line for each of the years 1959 through 2000 is used to generate the following time series plot. How successful have Americans been recently in their efforts to win "the war against poverty" for the nation's children?
   Answer: Americans have been relatively unsuccessful in winning the war on poverty in the 1990s. This is especially true when you compare recent poverty rates with those of the years from 1969 through 1979. However, at least the curve is trending downward in the more recent years.
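The answers to Questions 25, 59, 70, 85, and 86 can be reproduced from the data printed in those questions. The following is a minimal sketch using Python's standard `statistics` module; the variable names and the `summarize` helper are illustrative and are not part of the test bank.

```python
from statistics import mean, median, mode

# Data copied from the questions in this chapter.
q25_values = [7, 5, 6, 4, 7, 8, 12]            # Question 25
q59_spending = [6, 4, 8, 9, 6, 12, 4]          # Question 59 (dollars per week)
q70_scores = [                                  # Questions 70 and 71 (40 exam scores)
    77, 81, 74, 77, 79, 73, 80, 85, 86, 73,
    83, 84, 81, 73, 75, 91, 76, 77, 95, 76,
    90, 85, 92, 84, 81, 64, 75, 90, 78, 78,
    82, 78, 86, 86, 82, 70, 76, 78, 72, 93,
]
q85_children = [4, 2, 1, 1, 5, 3, 0, 1, 0, 2]  # Questions 85 to 87


def summarize(values):
    """Return (mean, median) of a list of numbers (hypothetical helper)."""
    return mean(values), median(values)


# Question 25: mean, median, and mode are all 7.
print(summarize(q25_values), mode(q25_values))

# Question 59: the mean weekly spending is 7 dollars.
print(mean(q59_spending))

# Question 70: mean 80.4 and median 79.5.
print(summarize(q70_scores))

# Questions 85 and 86: mean 1.9 and median 1.5.
children_mean, children_median = summarize(q85_children)
print(children_mean, children_median)

# Question 87: a mean above the median suggests positive (right) skew.
print("positively skewed" if children_mean > children_median else "not positively skewed")
```

With an even number of observations, `median` averages the two middle values, which matches the definition in Question 26.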
A financial analyst collected useful information for 30 employees at Gamma Technologies, Inc. These data include each selected employee's gender, age, number of years of relevant work experience prior to employment at Gamma, number of years of employment at Gamma, number of years of post-secondary education, and annual salary.

91. Indicate the type of data for each of the six variables included in this set.
   Answer:
   Gender: categorical, nominal
   Age: numerical, continuous
   Prior experience: numerical, discrete
   Gamma experience: numerical, discrete
   Education: numerical, discrete
   Annual salary: numerical, continuous

92. Based on the histogram shown below, how would you describe the age distribution for these data?
   Answer: The age distribution is skewed slightly to the right. The largest grouping is in the 30-40 range. This means that most workers are above the age of 30 years and only one worker is 20 years old or younger.

93. Based on the histogram shown below, how would you describe the salary distribution for these data?
   Answer: The salary distribution is skewed to the right. There appear to be several workers who are being paid substantially more than the others. If you eliminate those above $80,000, the salaries are fairly normally distributed around $35,000.

The histogram below represents scores achieved by 250 job applicants on a personality profile.

94. What percentage of the job applicants scored between 30 and 40?
   Answer: 10%

95. What percentage of the job applicants scored below 60?
   Answer: 90%

96. How many job applicants scored above 50?
   Answer: 50

97. How many job applicants scored between 10 and 30?
   Answer: 100

98. Seventy percent of the job applicants scored above what value?
   Answer: 20

99. Half of the job applicants scored below what value?
   Answer: 30

100. A question of great interest to economists is how the distribution of family income has changed in the United States during the last 20 years.
   The summary measures and histograms shown below are generated for a sample of 500 family incomes, using the 1985 and 2005 income for each family in the sample.

   Summary Measures:

   Based on these results, discuss as completely as possible how the distribution of family income in the United States changed from 1985 to 2005.
   Answer: These summary measures say quite a lot. The mean has increased for 2005 when compared with 1985, although the median has decreased. There is also more variation. In fact, the 5th percentile has decreased slightly for 2005 when compared with 1985, whereas the 95th percentile is much larger, indicating that the rich people are getting richer. This behavior is also evident in the two histograms, which use the same categories for ease of comparison.
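Several questions in this chapter rest on short formulas rather than raw data: the sum of squared deviations implied by a sample standard deviation (Questions 30, 51, and 63), the mean of a subgroup recovered from an overall mean (Question 27), and the empirical rules (Questions 58 and 76). The sketch below works through them in Python. It assumes the sample standard deviation uses the n - 1 divisor, which is consistent with the keyed answer of 304 in Question 30; the function names are hypothetical and chosen only for illustration.

```python
def sum_squared_deviations(n, sample_sd):
    """Sum of squared deviations implied by a sample standard deviation.

    Assumes s^2 = SS / (n - 1), so SS = (n - 1) * s^2.
    """
    return (n - 1) * sample_sd ** 2


def remaining_group_mean(total_n, overall_mean, known_n, known_mean):
    """Mean of the remaining group, given the overall mean and one group's mean."""
    remaining_total = total_n * overall_mean - known_n * known_mean
    return remaining_total / (total_n - known_n)


def empirical_interval(center, sd, k, lower_bound=None):
    """Interval center +/- k*sd, rounded to 3 decimals.

    If lower_bound is given, the lower end is not allowed to fall below it.
    """
    low = round(center - k * sd, 3)
    high = round(center + k * sd, 3)
    if lower_bound is not None:
        low = max(low, lower_bound)
    return low, high


# Question 30: n = 20, s = 4 gives 19 * 16 = 304.
print(sum_squared_deviations(20, 4))

# Questions 51 and 63: the same formula gives 81 and 43.75.
print(sum_squared_deviations(10, 3))
print(sum_squared_deviations(8, 2.5))

# Question 27: (30 * 75 - 20 * 70) / 10 = 85.
print(remaining_group_mean(30, 75, 20, 70))

# Question 58: mean 75, standard deviation 10, two standard deviations.
print(empirical_interval(75, 10, 2))

# Question 76: service times cannot be negative, so the lower end is floored at 0.
for k in (1, 2, 3):
    print(k, empirical_interval(0.873, 0.432, k, lower_bound=0))
```

Rounding to three decimals mirrors the precision of the summary table in Question 76 and avoids floating-point noise in the printed bounds.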
