Reading Questions.
Prob 15.01. Often we are interested in whether two groups are different. For example, we might ask whether girls have a different mean foot length than boys do. We can answer this question by constructing a suitable model.
Interpret this report, keeping in mind that the foot length is reported in centimeters. (The reported value <2e-16 means p < 2 × 10^{-16}.)
A. Girls’ feet are, on average, 25 centimeters long.
B. Girls’ feet are 0.4079 cm shorter than boys’.
C. Girls’ feet are 0.7839 cm shorter than boys’.
D. Girls’ feet are 1.922 cm shorter than boys’.
A. Boys’ feet are, on average, longer than girls’ feet.
B. Girls’ feet are, on average, shorter than boys’ feet.
C. All boys’ feet are longer than all girls’ feet.
D. No girl’s foot is shorter than all boys’ feet.
E. There is no difference, on average, between boys’ foot lengths and girls’ foot lengths.
A. Boys’ and girls’ feet are, on average, the same length.
B. The length of kids’ feet is, on average, zero.
C. The length of boys’ feet is, on average, zero.
D. The length of girls’ feet is, on average, zero.
E. Girls’ and boys’ feet don’t intercept.
Here is the report from a related, but slightly different model:
Note that the p-values for both coefficients are practically zero, p < 2 × 10^{-16}.
What is the Null Hypothesis tested by the p-value on sexG?
A. Girls’ feet have a different length, on average, than boys’.
B. Girls’ feet are no different in length, on average, than boys’.
C. Girls’ foot lengths are, on average, zero.
D. Girls’ foot lengths are, on average, greater than zero.
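The link between a group coefficient such as sexG and a difference in group means can be sketched numerically. The example below is only an illustration in Python (the text itself works in R): the foot lengths are made-up values, not the kidsfeet data, and the names `boys`, `girls`, `coef_sexG`, and `intercept` are assumptions introduced here. It fits a least-squares line to a 0/1 indicator for sex and shows that the intercept matches the boys' mean while the indicator's coefficient matches the girls' mean minus the boys' mean.

```python
from statistics import fmean

# Hypothetical foot lengths in cm (illustrative values, not the kidsfeet data).
boys = [25.1, 24.8, 25.6, 24.9]
girls = [24.2, 24.5, 24.0, 24.7]

# Indicator coding: 0 for boys (the reference group), 1 for girls.
x = [0] * len(boys) + [1] * len(girls)
y = boys + girls

# Ordinary least squares for y = intercept + coef_sexG * x.
x_bar, y_bar = fmean(x), fmean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
coef_sexG = s_xy / s_xx
intercept = y_bar - coef_sexG * x_bar

print(f"intercept = {intercept:.3f} (mean of boys = {fmean(boys):.3f})")
print(f"sexG = {coef_sexG:.3f} (girls minus boys = {fmean(girls) - fmean(boys):.3f})")
```

Under this coding, a Null Hypothesis about the sexG coefficient is a statement about the difference between the two group means, not about either mean on its own.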
Prob 15.02. Here is an ANOVA table (with the “intercept” term included) from a fictional study of scores assigned to various flavors, textures, densities, and chunkiness of ice cream. Some of the values in the table have been left out. Figure out from the rest of the table what they should be.
Prob 15.04. Consider the following analysis of the kids’ feet data looking for a relationship between foot width and whether the child is left- or right-handed. The variable domhand gives the handedness, either L or R. We’ll construct the model in two different ways. There are 39 cases altogether.
Prob 15.05. A statistics student wrote:
I’m interested in the publishing business, particularly magazines, and thought I would try a statistical analysis of some of the features of magazines. I looked at several different magazines, and recorded several variables, some of which I could measure from a single copy and some of which I deduced from my knowledge of the publishing business.
Most people find it hard to believe, but most mass-market magazines are very deliberately written and composed graphically to be attractive to the target audience. The distinctive “styles” of magazines are no accident.
I was interested to see if there is a relation between the average sentence length and any of the other variables. I made one linear model and had a look at the ANOVA table, as shown below.
Answer each question based on the information given above.
A. sentenceLength ~ age + sex + color
B. sentenceLength ~ sex * age + color
C. sentenceLength ~ sex + age + color
D. color ~ sentenceLength + sex + age
A. 8
B. 9
C. 10
D. No way to tell from the information given.
A. categorical
B. quantitative
C. Could be either.
D. Can’t know for sure from the data given.
A. 0.93553 from pf(7.6407, 3, 3)
B. 0.06446 from 1-pf(7.6407, 3, 3)
C. 0.98633 from pf(23.689, 3, 3)
D. 0.01367 from 1-pf(23.689, 3, 3)
E. 0.99902 from pnorm(23.689, 0, 7.6507)
F. 0.00098 from 1-pnorm(23.689, 0, 7.6507)
A. An average sentence has zero words.
B. There is no relationship between the number of color pages and the sex of the intended audience.
C. The number of color pages is not related to the sentence length.
D. There is no relation between the average number of words per sentence in an article and the age group that the magazine is intended for, after taking sex into account.
E. None of the above, because there is a different null hypothesis corresponding to each model term in the ANOVA report.
A. To see if the different sexes have a different distribution of age groups.
B. To see if there is a difference in average sentence length between magazines for females and males.
C. To see if magazines for different age groups are targeted to different sexes.
D. To see if the difference in average sentence length between magazines for females and males changes from one age group to another.
A. The term was included as the last term in the ANOVA report and didn’t have a significant sum of squares.
B. I discovered that sex and age were redundant.
C. The p-values disappeared from the report.
D. None of the above.
Prob 15.10. P-values concern the “statistical significance” of evidence for a relationship. This can be a different thing from the real-world importance of the observed relationship. It’s possible for a weak connection to be strongly statistically significant (if there is a lot of data for it) and for a strong relationship to lack statistical significance (if there is not much data).
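The contrast between strength and significance can be made concrete with a small calculation. The sketch below is written in Python and uses the standard t statistic for testing a zero correlation, t = r·sqrt((n − 2)/(1 − r²)). The two scenarios (r = 0.05 with 10,000 cases, r = 0.60 with 8 cases) are hypothetical numbers chosen for illustration and have nothing to do with the race data; the function name `t_from_r` is likewise an assumption.

```python
import math

def t_from_r(r, n):
    """t statistic for testing zero correlation, given sample correlation r and n cases."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical numbers chosen for illustration only.
t_weak = t_from_r(0.05, 10_000)   # weak relationship, lots of data
t_strong = t_from_r(0.60, 8)      # strong relationship, little data

print(f"weak relationship,   n = 10000: t = {t_weak:.2f}")
print(f"strong relationship, n = 8:     t = {t_strong:.2f}")
```

The weak relationship gives a t of about 5, far beyond the usual two-sided 5% threshold of roughly 1.96 for large samples. The strong relationship gives a t of about 1.8, short of the roughly 2.45 needed with 6 degrees of freedom.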
Consider the data on the times it took runners to complete the Cherry Blossom ten-mile race in March 2005:
Consider the net variable, which gives the time it took the runners to get from the start line to the finish line.
Answer each of the following questions, giving both a quantitative argument and also an everyday English explanation. Assessing statistical significance is a technical matter, but to interpret the substance of a relationship, you will have to put it in a real-world context.
Prob 15.11. You are conducting an experiment on a treatment for balding. You measure the hair follicle density before treatment and again after treatment. The data table has the following variables (with a few examples shown):
Subject.ID | follicle.density | when | sex |
A59 | 7.3 | before | M |
A59 | 7.9 | after | M |
A60 | 61.2 | before | F |
A60 | 61.4 | after | F |
Does this table suggest that the treatment makes a difference? Why or why not?
Why is the F-value on when different in this model than in the previous one?
Prob 15.12. During a conversation about college admissions, a group of high-school students starts to wonder how reliable the SAT score is, that is, how much an individual student’s score could be expected to vary just due to random factors such as the day on which the test was taken, the student’s mood, the specific questions on the test, etc. This variation within an individual is quite different from the variation from person to person.
The high-school students decide to study the issue. A simple way to do this is to have one student take the SAT test several times and examine the variability in the student’s scores. But, it would be better to do this for many different students. To this end, the students propose to pool together their scores from the times they took the SAT, producing a data frame that looks like this:
Student | Score | Sex | Order |
PersonA | 2110 | F | 1 |
PersonB | 1950 | M | 1 |
PersonC | 2080 | F | 1 |
PersonA | 2090 | F | 2 |
PersonA | 2150 | F | 3 |
... and so on
The order variable indicates how many times the student has taken the test. 1 means that it is the student’s first time, 2 the second time, and so on.
One student suggests that they simply take the standard deviation of the score variable to measure the variability in the SAT score. What’s wrong with this for the purpose the students have in mind?
A. There’s nothing wrong with it.
B. Standard deviations don’t measure random variability.
C. It would confound variability between students with variability within an individual student.
Another student suggests looking at the sum of squared residuals from the model score ~ student. What’s wrong with this?
A. There’s nothing wrong with it.
B. It’s the coefficients on student that are important.
C. Better to look at the mean square residual.
The students’ statistics teacher points out that the model score ~ student will exactly capture the score of any student who takes the SAT only once; the residuals for those students will be exactly zero. Explain why this isn’t a problem, given the purpose for which the model is being constructed.
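To see how the residuals from score ~ student behave, here is a rough Python sketch using only the five rows shown in the table above (the "... and so on" rows are not available, so the numbers are purely illustrative). The variable names `rows`, `fitted`, `ssr`, `df_resid`, and `msr` are assumptions made for this example. The model's fitted value for each case is that student's own mean score.

```python
from collections import defaultdict

# Rows shown in the table above; the "... and so on" rows are not available.
rows = [
    ("PersonA", 2110), ("PersonB", 1950), ("PersonC", 2080),
    ("PersonA", 2090), ("PersonA", 2150),
]

# The model score ~ student fits each student's own mean score.
by_student = defaultdict(list)
for student, score in rows:
    by_student[student].append(score)
fitted = {s: sum(v) / len(v) for s, v in by_student.items()}

# Residuals are deviations from each student's own mean.
ssr = sum((score - fitted[student]) ** 2 for student, score in rows)
df_resid = len(rows) - len(by_student)  # cases minus one coefficient per student
msr = ssr / df_resid

print(f"sum of squared residuals = {ssr:.1f}")
print(f"residual degrees of freedom = {df_resid}")
print(f"mean square residual = {msr:.1f}, root = {msr ** 0.5:.1f}")
```

In this sketch PersonB and PersonC each add one case and one coefficient, so they contribute nothing to the sum of squared residuals and nothing to the residual degrees of freedom. The mean square residual is therefore determined entirely by students who took the test more than once.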
Still another student suggests the model score ~ student + order in order to adjust for the possibility that scores change with experience, and not just at random. The group likes this idea and starts to elaborate on it. They make two main suggestions:
Why not include sex as an additional covariate, as in Elaboration 1, to take into account the possibility that males and females might have systematically different scores?
A. It’s a good idea.
B. Bad idea since probably a person’s sex has nothing to do with his or her score.
C. Useless, since sex is redundant with student.
Regarding Elaboration 2, which of the following statements is correct?
Prob 15.21. In conducting a hypothesis test, we need to specify two things: a Null Hypothesis and a test statistic.
The numerical output of a hypothesis test is a p-value.
In modeling, a sensible Null Hypothesis is that one or more explanatory variables are unrelated to the response variable. We can simulate a situation in which this Null applies by shuffling the variables. For example, here are two trials of a simulation of the Null in a model of the kidsfeet data:
The test statistic summarizes the situation. There are several possibilities, but here we will use R^{2} from the model since this gives an indication of the quality of the model.
By computing many such trials, we construct the sampling distribution under the Null — that is, the sampling distribution of the test statistic in the world in which the Null holds true. We can automate this process using do:
Finally, to compute the p-value, we need to compute the test statistic on the model fitted to the actual data, not on the simulation.
The p-value is the probability of seeing a value of the test statistic from the Null Hypothesis simulation that is more extreme than our actual value. The meaning of “more extreme” depends on what the test statistic is. In this example, since a better fitting model will always have a larger R^{2}, we check the probability of getting a larger R^{2} from our simulation than from the actual data.
Our p-value is about 9%.
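The shuffling procedure described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R used in the text: a plain loop stands in for the repeated trials, the data are hypothetical values (not the kidsfeet measurements), and the helper `r_squared` is a name introduced here for the R^{2} of a one-variable linear model. The comparison counts simulated values at least as large as the observed one, which is one common convention for "more extreme".

```python
import random

def r_squared(x, y):
    """R^2 of the simple linear model y ~ x (the squared correlation)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy ** 2 / (s_xx * s_yy)

# Hypothetical data for illustration (not the kidsfeet measurements).
x = [8.4, 8.8, 9.7, 9.8, 8.9, 9.7, 9.6, 8.8, 9.3, 8.8]
y = [24.4, 25.4, 24.5, 25.2, 25.1, 25.7, 25.4, 23.8, 24.4, 24.0]

random.seed(1)  # fixed seed so the run is repeatable
observed = r_squared(x, y)

# Each trial shuffles x, breaking any link to y, and refits the model.
trials = 1000
null_r2 = []
for _ in range(trials):
    shuffled = random.sample(x, len(x))
    null_r2.append(r_squared(shuffled, y))

# p-value: fraction of Null trials with R^2 at least as large as observed.
p_value = sum(r2 >= observed for r2 in null_r2) / trials
print(f"observed R^2 = {observed:.3f}, p-value = {p_value:.3f}")
```

With only 1000 trials the p-value is itself an estimate and will vary somewhat from one seed to another.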
Here are various computer modeling statements that implement possible Null Hypotheses. Connect each computer statement to the corresponding Null.
Prob 15.22. I’m interested in studying the length of gestation as a function of the ages of the mother and the father. In the gestation data set (gestation.csv), the variable age records the mother’s age in years, and dage gives the father’s age in years. The variable gestation is the length of the gestation in days. I hypothesize that the older the mother and father, the shorter the gestational period. So, I fit a model to those 599 cases where all the relevant data were recorded: