A) The relationship between two categorical variables
B) The distribution of a single variable
C) The relationship between two quantitative variables
D) The mean of a dataset
A) It minimizes the sum of absolute residuals
B) It minimizes the sum of squared residuals
C) It always passes through the origin
D) It is used to predict x from y
1. What type of graph is typically used to display the relationship between two quantitative variables?
- a) Histogram
- b) Boxplot
- c) Scatterplot
- d) Bar graph
2. In a scatterplot, the variable plotted along the horizontal axis is usually the:
- a) Dependent variable
- b) Explanatory variable
- c) Response variable
- d) Random variable
3. Which of the following describes a positive association between two variables?
- a) As one variable increases, the other decreases.
- b) As one variable increases, the other increases.
- c) There is no consistent pattern between the two variables.
- d) As one variable increases, the other stays the same.
4. If the correlation coefficient \( r \) is close to 1, it indicates:
- a) A weak positive linear relationship
- b) A strong positive linear relationship
- c) A strong negative linear relationship
- d) No linear relationship
5. What is the range of possible values for the correlation coefficient \( r \)?
- a) 0 to 1
- b) -1 to 0
- c) -1 to 1
- d) -∞ to ∞
6. Which of the following is NOT a characteristic of the correlation coefficient \( r \)?
- a) It is sensitive to outliers.
- b) It measures the strength and direction of a linear relationship.
- c) It requires both variables to be quantitative.
- d) It changes if the units of measurement are changed.
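The unit-invariance property in this question can be checked numerically. The following is a minimal sketch, assuming made-up height and weight data; the helper name `pearson_r` and the data values are hypothetical and only serve the illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [110, 120, 140, 155, 160, 180]

# Change the units: inches -> centimetres, pounds -> kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_original = pearson_r(heights_in, weights_lb)
r_converted = pearson_r(heights_cm, weights_kg)

# Both values agree up to floating-point rounding.
print(r_original, r_converted)
```

Because \( r \) is computed from standardized deviations, multiplying either variable by a positive constant (or adding a constant to it) leaves the value unchanged.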
7. The least-squares regression line minimizes the:
- a) Sum of absolute residuals
- b) Sum of squared residuals
- c) Sum of residuals
- d) Sum of the values of \( y \)
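The least-squares criterion in this question can be illustrated with a small numerical check. This is a sketch under stated assumptions: the data are invented, and the helper names `least_squares` and `sse` are hypothetical. It fits the line with the usual closed-form formulas and then confirms that nudging the slope or intercept never lowers the sum of squared residuals.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(xs, ys, slope, intercept):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Invented data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares(xs, ys)
best = sse(xs, ys, slope, intercept)

# Try nearby lines; none of them should beat the least-squares fit.
perturbed = [
    sse(xs, ys, slope + d_slope, intercept + d_intercept)
    for d_slope in (-0.1, 0.0, 0.1)
    for d_intercept in (-0.1, 0.0, 0.1)
    if (d_slope, d_intercept) != (0.0, 0.0)
]
print(all(best <= p for p in perturbed))

# The plain sum of residuals is (numerically) zero for the fitted line,
# which is why it is not a useful quantity to minimize.
residual_sum = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))
print(abs(residual_sum) < 1e-9)
```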
8. If a residual plot shows a random scatter of points, what does this suggest about the model?
- a) The linear model is appropriate.
- b) The linear model is not appropriate.
- c) The residuals are not independent.
- d) The data is not linear.
9. Which of the following statements is true about influential points in regression analysis?
- a) Influential points always have large residuals.
- b) Influential points are always outliers in the x-direction.
- c) Removing an influential point significantly changes the slope of the regression line.
- d) Influential points always lie close to the regression line.
10. What does the coefficient of determination \( r^2 \) represent?
- a) The slope of the regression line.
- b) The proportion of variation in the response variable explained by its linear relationship with the explanatory variable.
- c) The intercept of the regression line.
- d) The correlation between the residuals and the explanatory variable.
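The interpretation of \( r^2 \) as explained variation can be verified directly. The sketch below uses invented data and hypothetical variable names; it computes \( r^2 \) as one minus the ratio of residual variation to total variation and compares it with the square of the correlation coefficient.

```python
import math

# Invented data for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.8, 4.1, 5.9, 8.3, 9.7, 12.2]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Least-squares line and its predictions.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in xs]

# Variation left unexplained by the line vs. total variation in y.
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))
sst = syy
r_squared = 1 - sse / sst

# For simple linear regression this matches the squared correlation.
r = sxy / math.sqrt(sxx * syy)
print(r_squared, r ** 2)
```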
11. When interpreting the slope of a regression line, it tells us:
- a) The predicted change in the response variable for a one-unit increase in the explanatory variable.
- b) The strength of the relationship between two variables.
- c) The percentage of variation explained by the model.
- d) The predicted value of the response variable when the explanatory variable is zero.
12. Which of the following is a potential problem when using regression analysis?
- a) Non-linear relationships
- b) Outliers
- c) High-leverage points
- d) All of the above
13. Extrapolation refers to:
- a) Predicting values within the range of the data.
- b) Predicting values outside the range of the data.
- c) The process of calculating residuals.
- d) The slope of the regression line.
14. Which of the following is true about correlation and causation?
- a) A high correlation implies causation.
- b) A low correlation rules out causation.
- c) Correlation implies causation if \( r^2 \) is high.
- d) Correlation does not imply causation.
15. If two variables have a correlation close to zero, it means:
- a) There is a strong linear relationship.
- b) There is little or no linear relationship.
- c) There is a strong non-linear relationship.
- d) The variables are not related in any way.
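A correlation near zero rules out only a linear pattern, not every kind of relationship. As a minimal sketch, assuming a hypothetical dataset in which \( y = x^2 \) exactly and the x-values are symmetric about zero, the correlation comes out as zero even though y is completely determined by x. The helper name `pearson_r` is an assumption of this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect quadratic relationship, symmetric in x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)
```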
1. What does a scatterplot with points closely clustered around a straight line indicate?
- a) No relationship between variables
- b) A weak linear relationship
- c) A strong linear relationship
- d) A perfect linear relationship
2. If the correlation between two variables is \( r = -0.85 \), what can be said about their relationship?
- a) Strong positive linear relationship
- b) Weak positive linear relationship
- c) Strong negative linear relationship
- d) No linear relationship
3. Which of the following is a reason to use a logarithmic transformation on data?
- a) To create a linear relationship between variables
- b) To increase the correlation coefficient
- c) To eliminate outliers
- d) To decrease the spread of the data
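The linearizing effect of a logarithmic transformation can be seen on exponential data. The following sketch assumes a made-up series that doubles at each step; the helper `pearson_r` and the data are hypothetical. Taking the logarithm of the response turns the curved pattern into a straight line, so the correlation with x moves to essentially 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical exponential growth: y = 3 * 2^x.
xs = list(range(8))
ys = [3 * 2 ** x for x in xs]

# log(y) = log(3) + x * log(2), which is linear in x.
log_ys = [math.log(y) for y in ys]

r_raw = pearson_r(xs, ys)      # curved pattern, noticeably below 1
r_log = pearson_r(xs, log_ys)  # straight-line pattern, essentially 1
print(r_raw, r_log)
```

The higher correlation is a side effect of the straightened pattern, not the purpose of the transformation.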
4. An outlier in a scatterplot can:
- a) Have a small effect on the correlation coefficient
- b) Have no effect on the regression line
- c) Have a large effect on the correlation coefficient
- d) Improve the fit of the regression line
5. The slope of a least-squares regression line is interpreted as:
- a) The amount the explanatory variable increases when the response variable increases by one unit
- b) The predicted change in the explanatory variable for a one-unit change in the response variable
- c) The predicted change in the response variable for a one-unit change in the explanatory variable
- d) The percentage change in the response variable for a one-unit change in the explanatory variable
6. Which of the following would suggest that a linear model is not appropriate?
- a) A high correlation coefficient
- b) A random scatter of points in the residual plot
- c) A clear pattern in the residual plot
- d) A slope near zero in the regression line
7. What does it mean if the residual for a particular data point is negative?
- a) The actual value is greater than the predicted value.
- b) The actual value is less than the predicted value.
- c) The data point is an outlier.
- d) The model underestimates the actual value.
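The sign convention for residuals (actual minus predicted) can be made concrete with a short sketch. The regression line used here, with intercept 2.0 and slope 0.5, and the observed points are hypothetical values chosen for the illustration.

```python
def predict(x, intercept=2.0, slope=0.5):
    """Predicted response from a hypothetical fitted line."""
    return intercept + slope * x

def residual(x, y_actual):
    """Residual = actual value minus predicted value."""
    return y_actual - predict(x)

# At x = 4 the line predicts 2.0 + 0.5 * 4 = 4.0.
below_line = residual(4, 3.5)  # actual is less than predicted
above_line = residual(4, 5.0)  # actual is greater than predicted

print(below_line, above_line)
```

A point below the line gives a negative residual (the model overestimates); a point above the line gives a positive residual (the model underestimates).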
8. A data point with high leverage:
- a) Lies far from the mean of the explanatory variable
- b) Lies far from the mean of the response variable
- c) Has a large residual
- d) Always increases the slope of the regression line
9. What is the effect on the correlation coefficient \( r \) of adding the same constant to every value of one of the variables?
- a) It increases \( r \).
- b) It decreases \( r \).
- c) \( r \) remains unchanged.
- d) It makes \( r \) zero.
10. Which statement is true about the y-intercept of a least-squares regression line?
- a) It is always meaningful.
- b) It has no interpretation.
- c) It represents the predicted value of the response variable when the explanatory variable is zero.
- d) It is the average of all y-values.
11. Which of the following is NOT true about influential points?
- a) They have large residuals.
- b) They can drastically change the slope of the regression line.
- c) They are usually outliers in the x-direction.
- d) Removing them can significantly alter the correlation coefficient.
12. The correlation coefficient \( r \) is not resistant to:
- a) Non-linearity
- b) Outliers
- c) Changes in the units of measurement
- d) Shifts in the center of the data
13. Which plot is useful for identifying outliers in a linear regression model?
- a) Scatterplot
- b) Residual plot
- c) Histogram
- d) Boxplot
14. If the correlation between two variables is close to zero, it means:
- a) There is no relationship between the variables.
- b) There is little or no linear relationship between the variables.
- c) The variables are independent.
- d) The variables are dependent.
15. Which of the following can indicate a non-linear relationship between two variables?
- a) A scatterplot that shows a curved pattern
- b) A correlation coefficient close to 1
- c) A perfectly horizontal line in a scatterplot
- d) A random scatter in the residual plot
16. A high \( r^2 \) value in a regression model indicates:
- a) The model explains a large proportion of the variation in the response variable.
- b) The model has a high correlation coefficient.
- c) The model is a perfect fit.
- d) The slope of the regression line is positive.
17. Which of the following scenarios would most likely result in extrapolation?
- a) Predicting values far outside the range of the explanatory variable.
- b) Predicting values within the range of the data.
- c) Using the residual plot to make predictions.
- d) Calculating the y-intercept of the regression line.
18. Scatterplots and the correlation coefficient are used to analyze bivariate data involving:
- a) Data involving two quantitative variables
- b) Data involving two categorical variables
- c) Data involving one quantitative and one categorical variable
- d) Data involving more than two variables
19. In a linear regression, if the residuals show a pattern, this suggests:
- a) The linear model is appropriate.
- b) The linear model is not appropriate.
- c) The correlation coefficient is near zero.
- d) There are no influential points.
20. Which of the following is true about the effect of an outlier on a regression analysis?
- a) An outlier always decreases the correlation coefficient.
- b) An outlier can significantly change the slope and y-intercept of the regression line.
- c) An outlier has no effect if it lies on the regression line.
- d) An outlier always increases the correlation coefficient.
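The effect of a single unusual point on the fitted line can be demonstrated by fitting the line with and without it. This is a minimal sketch with invented data: five points lie exactly on \( y = 2x \), and one hypothetical point far out in the x-direction (high leverage) sits well below that line. The helper name `least_squares` is an assumption of this example.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data: five points exactly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# One hypothetical point far from the other x-values and off the pattern.
xs_with = xs + [20]
ys_with = ys + [5]

slope_without, intercept_without = least_squares(xs, ys)
slope_with, intercept_with = least_squares(xs_with, ys_with)

# The single added point pulls the slope from 2 down to nearly 0.
print(slope_without, slope_with)
print(intercept_without, intercept_with)
```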