By R:
> mean(dataset$money)
> summary(dataset)
 year        money             shopping
 F:10   Min.   :  0.00   NO      :9
 S:11   1st Qu.: 50.00   SOMEWHAT:3
        Median :100.00   YES     :9
        Mean   : 93.81
        3rd Qu.:140.00
        Max.   :200.00
a) The command should be: table(dataset$shopping)
This is because shopping is not a quantitative variable and it is impossible to calculate the mean of a categorical variable.
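The point in (a) can be checked directly in R. The sketch below is only an illustration: it rebuilds the shopping column from the counts shown in the summary output (9 NO, 3 SOMEWHAT, 9 YES), so the order of the responses is an assumption and the other columns are left out.

```r
# Shopping responses rebuilt from the counts in the summary output
# (the ordering of individual rows is assumed, not taken from the real data).
shopping <- factor(rep(c("NO", "SOMEWHAT", "YES"), times = c(9, 3, 9)))
dataset <- data.frame(shopping = shopping)

# table() tallies how many students gave each response.
counts <- table(dataset$shopping)
counts

# mean() is not defined for a factor: R returns NA and issues a warning
# (the warning is suppressed here so the sketch runs quietly).
m <- suppressWarnings(mean(dataset$shopping))
m
```

The tally is the meaningful summary here; the NA from mean() is R's way of saying the calculation does not apply to a categorical variable.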
b) The command should be: summary(dataset)
This is because R is case-sensitive, so DATASET and dataset are treated as two different names.
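A small sketch of the case-sensitivity issue in (b), assuming a made-up data frame (the money values below are placeholders, not the real data):

```r
# Placeholder data frame; the values are invented for illustration only.
dataset <- data.frame(money = c(0, 50, 100))

# The lower-case name exists in the workspace.
exists("dataset")

# The upper-case name was never defined, so using it raises an error.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(summary(DATASET), error = function(e) conditionMessage(e))
result
```

The captured message names DATASET as the missing object, which is why the command has to be typed with the exact lower-case name.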
c) The command should be: barplot(table(dataset$shopping),
main="Responses to the question, Do you like to shop?",
xlab="Shopping preference", ylab="Number of students")
This is because the variable is categorical, so we can only tally its values, which is what the table function does. In addition, the variable must be written exactly as dataset$shopping, because it is stored as a column of the data table named dataset.
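As a runnable sketch of the command in (c), the example below again rebuilds the shopping column from the counts in the summary output (an assumption, since the real data file is not shown). The pdf(NULL) line is only there so the sketch can run as a script without writing a plot file; it does nothing inside an interactive RStudio session.

```r
# Shopping responses rebuilt from the summary counts (row order is assumed).
shopping <- factor(rep(c("NO", "SOMEWHAT", "YES"), times = c(9, 3, 9)))
dataset <- data.frame(shopping = shopping)

# When run non-interactively, draw to a null device instead of creating a file.
if (!interactive()) pdf(NULL)

# Tally the categorical variable first, then plot the tallies as bars.
counts <- table(dataset$shopping)
bar_positions <- barplot(counts,
                         main = "Responses to the question, Do you like to shop?",
                         xlab = "Shopping preference",
                         ylab = "Number of students")
```

Passing the raw factor to barplot() would fail; it is the table of counts that supplies the bar heights.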
To draw a random sample in RStudio, we can use the sample function. In this situation, the council president wants 30 randomly chosen signatures to determine the proportion of the students of that university that signed the petition. Using the dollar sign ($) notation, we type the name of the data set we want to use followed by the name of the variable. Save the data containing the 500 signatures, with the names as one of the variables, as 'SIGNATURE' in a .csv file. Next, call the sample function in RStudio on the name variable and give the result a name. For example:
samplednames <- sample(SIGNATURE$name, 30)

This gives the names of the 30 signatures chosen as the representative sample. To find the proportion of registered students among the 30 sampled individuals, we can use this formula:

Sample proportion = number of sampled individuals who are registered students / 30

The population proportion can then be estimated by this sample proportion.

Problem 3

The first bias in this study is non-response bias. This happens when the cases chosen to participate in a study do not respond. In this study, 3000 athletes were chosen for the survey, but only 25% responded. That is only 750 participants out of 3000, which is a low response rate, so this bias can influence the results of the study.

There is also volunteer response bias, because the 3000 athletes decided for themselves whether or not to take part. When the copies of the survey were emailed, not every athlete would have wanted to answer. They could simply discard the emailed survey or decide not to take part for reasons of their own. This leads to volunteer response bias.

Next, there could also be undercoverage bias, because only 182 of the 351 Division 1 institutions were selected, unless the study chose them using one of the sampling designs (simple random sample, systematic sample, stratified random sample, or cluster sample).

Problem 4

a. Without a control group, we could not make a comparison or determine whether the response variable depends on the explanatory variable. The control group serves as the baseline for comparison, which is necessary to find out whether the treatment actually has any effect.

b. An experimental condition that we investigate is a treatment. In this experiment, the treatment is the families' dust control, because two groups were compared: intervention and control. The intervention group received cleaning supplies and a demonstration of how to clean well, while the control group received only a pamphlet on how to avoid lead exposure, without cleaning supplies or a demonstration.

c. A placebo is not needed in this experiment. A placebo is commonly given to the control group so that end results can be compared. For example, when testing a new drug, we could give a placebo to the control group to determine whether the drug works. In this experiment, we measure the level of lead in the child's blood. Even without a placebo we can measure and compare this result, because the lead level in the body is not something that the control group's feelings or thoughts can change physically, which is how a placebo effect takes place.
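Returning to the petition sample described earlier, the sampling and sample-proportion steps can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame is generated here as a stand-in for the real file (which might be loaded with something like read.csv), and the registered column is hypothetical, since the source only mentions a name variable.

```r
# Hypothetical stand-in for the 500 petition signatures.
set.seed(1)  # fixed seed only so the sketch is reproducible
SIGNATURE <- data.frame(
  name = paste0("signer_", 1:500),                           # placeholder names
  registered = sample(c(TRUE, FALSE), 500, replace = TRUE)   # made-up status
)

# Draw 30 distinct names; sample() draws without replacement by default.
samplednames <- sample(SIGNATURE$name, 30)

# Look up the (hypothetical) registration status of each sampled name.
sampled_status <- SIGNATURE$registered[match(samplednames, SIGNATURE$name)]

# Sample proportion = number of registered students in the sample / 30.
sample_proportion <- sum(sampled_status) / 30
sample_proportion
```

In practice the registration status would come from checking each sampled name against university records, not from a column in the file.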