Unit 2 Quiz - Exploring Two-Variable Data

Quiz 1

  1. What is the primary purpose of a scatterplot?
  2. Which of the following indicates a positive correlation?
  3. What does the correlation coefficient measure?
  4. If the correlation coefficient is -0.85, what can be inferred?
  5. What is the range of the correlation coefficient?
  6. Which of the following is NOT a characteristic of the least squares regression line?
  7. What does a residual plot help to determine?
  8. In a residual plot, what pattern indicates a good fit for a linear model?
  9. What does a high R-squared value indicate?
  10. Which of the following is true about outliers in a scatterplot?
  11. What is the purpose of transforming data in regression analysis?
  12. Which transformation is commonly used to linearize exponential data?
  13. What does the slope of the regression line represent?
  14. If the slope of the regression line is zero, what does this indicate?
  15. What is the intercept of the regression line?
  16. Which of the following is true about extrapolation?
  17. What is the effect of a lurking variable?
  18. Which of the following is an example of a categorical variable?
  19. What is the purpose of a contingency table?
  20. What does a chi-square test of independence assess?
  21. Which of the following is true about Simpson's paradox?
  22. What is the purpose of a scatterplot matrix?
  23. Which of the following is NOT a measure of association for categorical data?
  24. What does a mosaic plot display?

A) The relationship between two categorical variables

B) The distribution of a single variable

C) The relationship between two quantitative variables

D) The mean of a dataset

  25. Which of the following is true about the least squares regression line?

A) It minimizes the sum of absolute residuals

B) It minimizes the sum of squared residuals

C) It always passes through the origin

D) It is used to predict x from y
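
The regression questions in this quiz (slope, intercept, residuals, R-squared) can be checked by hand on a small dataset. The following is a minimal sketch, not part of the original quiz: the data values and the helper name `sum_squared_residuals` are made up for illustration, and only the Python standard library is assumed.

```python
from statistics import mean

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar


def sum_squared_residuals(b0, b1):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Residual = actual y minus predicted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Correlation coefficient and coefficient of determination.
r = sxy / (sxx * syy) ** 0.5
r_squared = 1 - sum_squared_residuals(intercept, slope) / syy

print(f"y_hat = {intercept:.2f} + {slope:.2f} x")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print("residuals:", [round(e, 2) for e in residuals])
```

Nudging the slope or intercept away from the fitted values makes `sum_squared_residuals` larger, which is the sense in which the line is "least squares". The fitted line passes through the point of means (x_bar, y_bar), not necessarily through the origin.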

 

Quiz 2

 

1. What type of graph is typically used to display the relationship between two quantitative variables?

- a) Histogram

- b) Boxplot

- c) Scatterplot

- d) Bar graph

 

2. In a scatterplot, the variable plotted along the horizontal axis is usually the:

- a) Dependent variable

- b) Explanatory variable

- c) Response variable

- d) Random variable

 

3. Which of the following describes a positive association between two variables?

- a) As one variable increases, the other decreases.

- b) As one variable increases, the other increases.

- c) There is no consistent pattern between the two variables.

- d) As one variable decreases, the other stays the same.

 

4. If the correlation coefficient \( r \) is close to 1, it indicates:

- a) A weak positive linear relationship

- b) A strong positive linear relationship

- c) A strong negative linear relationship

- d) No linear relationship

 

5. What is the range of possible values for the correlation coefficient \( r \)?

- a) 0 to 1

- b) -1 to 0

- c) -1 to 1

- d) -∞ to ∞

 

6. Which of the following is NOT a characteristic of the correlation coefficient \( r \)?

- a) It is sensitive to outliers.

- b) It measures the strength and direction of a linear relationship.

- c) It requires both variables to be quantitative.

- d) It changes if the units of measurement are changed.

 

7. The least-squares regression line minimizes the:

- a) Sum of absolute residuals

- b) Sum of squared residuals

- c) Sum of residuals

- d) Sum of the values of \( y \)

 

8. If a residual plot shows a random scatter of points, what does this suggest about the model?

- a) The linear model is appropriate.

- b) The linear model is not appropriate.

- c) The residuals are not independent.

- d) The data is not linear.

 

9. Which of the following statements is true about influential points in regression analysis?

- a) Influential points always have large residuals.

- b) Influential points are always outliers in the x-direction.

- c) Removing an influential point significantly changes the slope of the regression line.

- d) Influential points always lie close to the regression line.

 

10. What does the coefficient of determination \( r^2 \) represent?

- a) The slope of the regression line.

- b) The percentage of variation in the response variable explained by the explanatory variable.

- c) The intercept of the regression line.

- d) The correlation between the residuals and the explanatory variable.

 

11. When interpreting the slope of a regression line, it tells us:

- a) The predicted change in the response variable for a one-unit increase in the explanatory variable.

- b) The strength of the relationship between two variables.

- c) The percentage of variation explained by the model.

- d) The predicted value of the response variable when the explanatory variable is zero.

 

12. Which of the following is a potential problem when using regression analysis?

- a) Non-linear relationships

- b) Outliers

- c) High-leverage points

- d) All of the above

 

13. Extrapolation refers to:

- a) Predicting values within the range of the data.

- b) Predicting values outside the range of the data.

- c) The process of calculating residuals.

- d) The slope of the regression line.

 

14. Which of the following is true about correlation and causation?

- a) A high correlation implies causation.

- b) A low correlation rules out causation.

- c) Correlation implies causation if \( r^2 \) is high.

- d) Correlation does not imply causation.

 

15. If two variables have a correlation close to zero, it means:

- a) There is a strong linear relationship.

- b) There is little or no linear relationship.

- c) There is a strong non-linear relationship.

- d) The variables are not related in any way.
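
Two properties of \( r \) that come up in this quiz, its behavior under a change of units and its sensitivity to outliers, can be explored numerically. The following is a rough sketch under stated assumptions: the heights and weights are invented, and `correlation` is a hypothetical helper written here, not a function from the quiz material.

```python
from statistics import mean


def correlation(x, y):
    """Pearson correlation coefficient r for two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up heights (inches) and weights (pounds), for illustration only.
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 135, 150, 160, 175]
r_original = correlation(height_in, weight_lb)

# Same data expressed in centimeters and kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.4536 for w in weight_lb]
r_converted = correlation(height_cm, weight_kg)

# Same data plus one unusual point (short but very heavy).
r_with_outlier = correlation(height_in + [61], weight_lb + [250])

print(f"r, original units:  {r_original:.4f}")
print(f"r, converted units: {r_converted:.4f}")
print(f"r, with outlier:    {r_with_outlier:.4f}")
```

The first two values agree up to floating-point rounding, while the single added point pulls the third value far from the others.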

 

Quiz 3

 

1. What does a scatterplot with points closely clustered around a straight line indicate?

- a) No relationship between variables

- b) A weak linear relationship

- c) A strong linear relationship

- d) A perfect linear relationship

 

2. If the correlation between two variables is \( r = -0.85 \), what can be said about their relationship?

- a) Strong positive linear relationship

- b) Weak positive linear relationship

- c) Strong negative linear relationship

- d) No linear relationship

 

3. Which of the following is a reason to use a logarithmic transformation on data?

- a) To create a linear relationship between variables

- b) To increase the correlation coefficient

- c) To eliminate outliers

- d) To decrease the spread of the data

 

4. An outlier in a scatterplot can:

- a) Have a small effect on the correlation coefficient

- b) Have no effect on the regression line

- c) Have a large effect on the correlation coefficient

- d) Improve the fit of the regression line

 

5. The slope of a least-squares regression line is interpreted as:

- a) The amount the explanatory variable increases when the response variable increases by one unit

- b) The predicted change in the explanatory variable for a one-unit change in the response variable

- c) The predicted change in the response variable for a one-unit change in the explanatory variable

- d) The percentage change in the response variable for a one-unit change in the explanatory variable

 

6. Which of the following would suggest that a linear model is not appropriate?

- a) A high correlation coefficient

- b) A random scatter of points in the residual plot

- c) A clear pattern in the residual plot

- d) A slope near zero in the regression line

 

7. What does it mean if the residual for a particular data point is negative?

- a) The actual value is greater than the predicted value.

- b) The actual value is less than the predicted value.

- c) The data point is an outlier.

- d) The model underestimates the actual value.

 

8. A data point with high leverage:

- a) Lies far from the mean of the explanatory variable

- b) Lies far from the mean of the response variable

- c) Has a large residual

- d) Always increases the slope of the regression line

 

9. What is the effect on the correlation coefficient \( r \) of adding the same constant to every value of one variable?

- a) It increases \( r \).

- b) It decreases \( r \).

- c) \( r \) remains unchanged.

- d) It makes \( r \) zero.

 

10. Which statement is true about the y-intercept of a least-squares regression line?

- a) It is always meaningful.

- b) It has no interpretation.

- c) It represents the predicted value of the response variable when the explanatory variable is zero.

- d) It is the average of all y-values.

 

11. Which of the following is NOT true about influential points?

- a) They always have large residuals.

- b) They can drastically change the slope of the regression line.

- c) They are usually outliers in the x-direction.

- d) Removing them can significantly alter the correlation coefficient.

 

12. The correlation coefficient \( r \) is not resistant to:

- a) Non-linearity

- b) Outliers

- c) Changes in the units of measurement

- d) Shifts in the center of the data

 

13. Which plot is useful for identifying outliers in a linear regression model?

- a) Scatterplot

- b) Residual plot

- c) Histogram

- d) Boxplot

 

14. If the correlation between two variables is close to zero, it means:

- a) There is no relationship between the variables.

- b) There is little or no linear relationship between the variables.

- c) The variables are independent.

- d) The variables are dependent.

 

15. Which of the following can indicate a non-linear relationship between two variables?

- a) A scatterplot that shows a curved pattern

- b) A correlation coefficient close to 1

- c) A perfectly horizontal line in a scatterplot

- d) A random scatter in the residual plot

 

16. A high \( r^2 \) value in a regression model indicates:

- a) The model explains a large proportion of the variation in the response variable.

- b) The model has a high correlation coefficient.

- c) The model is a perfect fit.

- d) The slope of the regression line is positive.

 

17. Which of the following scenarios would most likely result in extrapolation?

- a) Predicting values far outside the range of the explanatory variable.

- b) Predicting values within the range of the data.

- c) Using the residual plot to make predictions.

- d) Calculating the y-intercept of the regression line.

 

18. The term "bivariate data" refers to:

- a) Data involving two quantitative variables

- b) Data involving two categorical variables

- c) Data involving one quantitative and one categorical variable

- d) Data involving more than two variables

 

19. In a linear regression, if the residuals show a pattern, this suggests:

- a) The linear model is appropriate.

- b) The linear model is not appropriate.

- c) The correlation coefficient is near zero.

- d) There are no influential points.

 

20. Which of the following is true about the effect of an outlier on a regression analysis?

- a) An outlier always decreases the correlation coefficient.

- b) An outlier can significantly change the slope and y-intercept of the regression line.

- c) An outlier has no effect if it lies on the regression line.

- d) An outlier always increases the correlation coefficient.
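
The logarithmic transformation asked about in this quiz can be tried on data that grow exponentially. The following is a small sketch, assuming invented data generated from y = 2 * 3**x; `fit_line` is a hypothetical helper written for this illustration.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept, r) of the least-squares line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / (sxx * syy) ** 0.5


# Made-up exponential data: y = 2 * 3**x, for illustration only.
x = [0, 1, 2, 3, 4, 5]
y = [2 * 3 ** xi for xi in x]

# Straight-line fit to the raw data: the curved pattern keeps r below 1.
_, _, r_raw = fit_line(x, y)

# Straight-line fit to (x, ln y): the transformed data are linear.
log_y = [math.log(yi) for yi in y]
slope_log, intercept_log, r_log = fit_line(x, log_y)

print(f"r for (x, y):    {r_raw:.3f}")
print(f"r for (x, ln y): {r_log:.3f}")
print(f"growth factor recovered from slope: {math.exp(slope_log):.3f}")
print(f"initial value recovered from intercept: {math.exp(intercept_log):.3f}")
```

Because ln y = ln 2 + x ln 3 for these data, the slope and intercept of the transformed fit recover the growth factor and the starting value when exponentiated.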