Statistical Methods in Epidemiology
Statistical Methods in Epidemiology (unit no. 401176)
ASSIGNMENT 1
Spring Semester, 2018
The total is 100 marks, which will be converted to 25. Each question carries 20 marks. Please read the marking rubric towards the end of the document. No late submissions are allowed without a valid reason (read the Learning Guide for instructions). An assignment cover sheet is also attached.
Please answer all questions.
Q1. Categorize the following variables as qualitative-nominal, qualitative-ordinal, quantitative-discrete or quantitative-continuous (no explanation is needed for your answers).
(i) Hospital discharge diagnosis
(ii) Exact serum cholesterol measurements
(iii) Exact age
(iv) Age groups coded as 1 = <30, 2 = 30-39, 3 = 40-49, 4 = 50+
(v) Causes of death
(vi) Sites of a randomized trial
(vii) Education levels coded as 1 = high school not completed, 2 = high school completed, 3 = some post-high school education
(viii) Exact systolic blood pressure levels
(ix) Being treated for hypertension, coded as 1 = no, 2 = yes
(x) Pack years of cigarette smoking
Each item above is worth 2 marks.
Q2. The following stem-and-leaf plot was obtained from the values of BMI (body mass index) for a random sample of 88 persons.
Frequency Stem Leaf
1 19 7
2 20 69
7 21 4788999
7 22 3666799
9 23 112355799
17 24 01222222345555679
18 25 002223344444577789
9 26 002577799
5 27 02689
5 28 01289
7 29 0001668
1 30 2
Stem width: 1.0
Each leaf: 1 case
[Hint on reading the data from the stem-and-leaf plot: the stem is the integer part of the BMI and each leaf digit is the first decimal place of one observation. Frequency 1 with stem 19 and leaf 7 means a single value, 19.7; frequency 2 with stem 20 and leaves 69 means two values, 20.6 and 20.9; the remaining rows are read the same way.]
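The reading rule in the hint can be sketched in a few lines of Python. This is only an illustrative cross-check, not part of the required SAS workflow; the names `ROWS` and `expand_stem_and_leaf` are made up for this sketch, and the rows are copied from the plot above.

```python
# Rows copied from the stem-and-leaf plot: (stem, string of leaf digits).
ROWS = [
    (19, "7"),
    (20, "69"),
    (21, "4788999"),
    (22, "3666799"),
    (23, "112355799"),
    (24, "01222222345555679"),
    (25, "002223344444577789"),
    (26, "002577799"),
    (27, "02689"),
    (28, "01289"),
    (29, "0001668"),
    (30, "2"),
]


def expand_stem_and_leaf(rows):
    """Turn (stem, leaves) rows into individual values.

    Assumes stem width 1.0 and one case per leaf, as stated under the plot,
    so stem 19 with leaf 7 becomes 19.7.
    """
    values = []
    for stem, leaves in rows:
        for leaf in leaves:
            values.append((stem * 10 + int(leaf)) / 10)
    return values


bmi = expand_stem_and_leaf(ROWS)
print(len(bmi), "values; first three:", bmi[:3])
```

The number of expanded values should equal the sum of the Frequency column, which is a quick way to confirm that no leaf was misread.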
Each part below is worth 4 marks.
(i) What are the smallest and largest BMI values among these 88 persons?
(ii) What percentage of BMI values exceed 25.0? [Hint: use the SAS output for the computed binary variable bmigt25.]
(iii) Obtain the 1st quartile, median and 3rd quartile for BMI based on this sample, and sketch a stem and leaf plot and box and whisker plot for BMI.
(iv) Interpret the histogram for BMI.
(v) Interpret the bar charts for mean BMI classified by males and females.
SAS code for answering all parts of this question is given below:
data a;
input bmi sex; /* second column of each data line is sex: 0=female, 1=male */
cards;
19.7 0
20.6 1
20.9 0
21.4 1
21.7 0
21.8 1
21.8 0
21.9 1
21.9 0
21.9 1
22.3 0
22.6 1
22.6 0
22.6 1
22.7 0
22.9 1
22.9 0
23.1 1
23.1 0
23.2 1
23.3 0
23.5 1
23.5 0
23.7 1
23.9 0
23.9 1
24.0 0
24.1 1
24.2 0
24.2 1
24.2 0
24.2 1
24.2 0
24.2 1
24.3 0
24.4 1
24.5 0
24.5 1
24.5 0
24.5 1
24.6 0
24.7 1
24.9 0
25.0 1
25.0 0
25.2 1
25.2 0
25.2 1
25.3 0
25.3 1
25.4 0
25.4 1
25.4 0
25.4 1
25.4 0
25.5 1
25.7 0
25.7 1
25.7 0
25.8 1
25.9 0
26.0 1
26.0 0
26.2 1
26.5 0
26.7 1
26.7 0
26.7 1
26.9 0
26.9 1
27.0 0
27.2 1
27.6 0
27.8 1
27.9 0
28.0 1
28.1 0
28.2 1
28.8 0
28.9 1
29.0 0
29.0 1
29.0 0
29.1 1
29.6 0
29.6 1
29.8 0
30.2 1
;
data a;
set a;
bmigt25=(BMI >25);
run;
proc freq;
tables bmigt25;
run;
proc sort data=a;
by bmi;
ods listing;
ods graphics off;
proc univariate data=a plot;
var bmi;
title "quartiles and mean bmi, side-by-side stem and leaf plot and boxplot for BMI";
run;
proc univariate data=a plot;
var bmi;
histogram;
title "histogram for a continuous variable bmi";
run;
proc gchart data=a;
vbar sex/group=sex sumvar=bmi type=mean discrete;
title "Vertical bar chart for mean BMI by sex, 0=female, 1=male";
run;
Q3. A variable can be a confounder, an effect modifier, both, or neither. Statistical tests exist for detecting effect modification, but there is no statistical test for detecting an operational confounder. For example, if a test comparing the unadjusted and adjusted odds ratios shows no significant difference, but one is considerably larger than the other, one would still adjust for the confounder. Conversely, if the test shows a significant difference, but neither odds ratio is considerably larger than the other, one would not need to adjust for the confounder.
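As a rough illustration of the comparison described above, the following Python sketch computes the relative change between a crude and an adjusted odds ratio. The function names, the 10% cut-off and the example values are assumptions made for this sketch only. They are not part of the assignment, and what counts as "considerably larger" remains a judgment call.

```python
def percent_change(crude_or, adjusted_or):
    """Relative difference of the crude OR from the adjusted OR, in percent."""
    return 100.0 * (crude_or - adjusted_or) / adjusted_or


def flags_confounding(crude_or, adjusted_or, threshold_pct=10.0):
    """Flag a possibly meaningful shift between crude and adjusted estimates.

    The 10% default is a rule-of-thumb assumption, not a statistical test.
    """
    return abs(percent_change(crude_or, adjusted_or)) >= threshold_pct


# Hypothetical values, chosen only to show both outcomes.
print(flags_confounding(3.0, 2.0))   # large shift between the two estimates
print(flags_confounding(2.02, 2.0))  # negligible shift
```

This check says nothing about effect modification, which is judged by comparing the stratum-specific odds ratios with each other rather than the crude with the adjusted estimate.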
Consider a study assessing the association between smoking and lung cancer. In each of the four scenarios below, is sex a confounder or an effect modifier (quantitative or qualitative)? (10 marks)
Scenario   OR (Men)   OR (Women)   Crude OR   Adjusted OR
1          2.51       2.15         2.32       2.35
2          1.06       0.95         2.02       1.01
3          4.40       3.41         4.02       2.63
4          2.15       0.65         1.42       1.29
The following table presents unadjusted and age-adjusted coronary event rates, and risks of death subsequent to a coronary event, for men in north Glasgow, 1991. The exposure of interest is social deprivation. Is age a confounder in the relationship between social deprivation and coronary event rate, and between social deprivation and coronary death? (10 marks)
Table: Coronary event rates and risk of death by deprivation group; north Glasgow men in 1991
                         Coronary event rate (per thousand)   Risk of coronary death
Deprivation group        Unadjusted    Age adjusted           Unadjusted    Age adjusted
I (most advantaged)      2.95          3.28                   0.57          0.59
II                       4.32          4.20                   0.50          0.50
III                      6.15          5.30                   0.51          0.52
IV (least advantaged)    5.90          5.75                   0.56          0.56
Total                    4.83          4.88                   0.53          0.54
Q4. The data below are modified from Jick, H. et al (Coffee and Myocardial Infarction, New England Journal of Medicine, Vol.289, No.2, pp.63-67, 1973). These authors used a case-control study to investigate the relationship between coffee consumption and myocardial infarction (MI). Cases were patients hospitalized on the basis of acute chest pain with an admission diagnosis of possible or definite MI. Controls were patients with various other diagnoses. To control for confounding, a multivariate risk score was derived for each patient, taking into account the patient's age, sex, history of MI, smoking status, admission to hospital, season admitted to hospital, history of antianginal drugs, history of digitalis use, presence of diabetes, and religion. The score was computed in such a way that patients with a high score were more at risk of an MI than patients with a low score. The distribution of all such computed scores was divided into quintiles, with patients in the first quintile representing the 20% of subjects with the lowest scores, and patients in the fifth quintile representing the 20% of subjects with the highest scores. The table below shows the distribution of cases and controls among subjects drinking 0 cups of coffee a day and subjects drinking 6+ cups a day, separately within each quintile. The columns are, in order: risk score quintile, cups of coffee per day, MI (pres = present, abs = absent), and the cell frequency in the 2x2 table. (20 marks)
Quintile 1 6 or more pres 12
Quintile 1 6 or more abs 670
Quintile 1 0 pres 17
Quintile 1 0 abs 1315
Quintile 2 6 or more pres 5
Quintile 2 6 or more abs 261
Quintile 2 0 pres 12
Quintile 2 0 abs 395
Quintile 3 6 or more pres 4
Quintile 3 6 or more abs 174
Quintile 3 0 pres 13
Quintile 3 0 abs 370
Quintile 4 6 or more pres 2
Quintile 4 6 or more abs 80
Quintile 4 0 pres 14
Quintile 4 0 abs 160
Quintile 5 6 or more pres 1
Quintile 5 6 or more abs 38
Quintile 5 0 pres 14
Quintile 5 0 abs 117
Using the aggregate (grouped) data set given above, answer the following:
(i) Interpret the Breslow-Day test for homogeneity of the relative odds across the risk quintiles. Are the relative odds homogeneous across the risk quintiles? (6 marks)
(ii) If the relative odds are the same across the risk quintiles, interpret the Mantel-Haenszel test of whether the common relative odds differ significantly from 1. Is there a significant association between coffee drinking and myocardial infarction after adjusting for risk quintile? (7 marks)
(iii) Interpret the Mantel-Haenszel and Woolf (logit) estimators of the common relative odds and the corresponding 95% confidence intervals. (7 marks)
SAS code for all parts of the question:
OPTIONS LINESIZE=80 PAGESIZE=60;
data htbact;
/* column input: each value must sit in the columns named here; freq needs 4 columns for 1315 */
input score $ 1-10 coffee $ 12-20 mi $ 22-25 freq 27-30;
cards;
Quintile 1 6 or more pres 12
Quintile 1 6 or more abs  670
Quintile 1 0         pres 17
Quintile 1 0         abs  1315
Quintile 2 6 or more pres 5
Quintile 2 6 or more abs  261
Quintile 2 0         pres 12
Quintile 2 0         abs  395
Quintile 3 6 or more pres 4
Quintile 3 6 or more abs  174
Quintile 3 0         pres 13
Quintile 3 0         abs  370
Quintile 4 6 or more pres 2
Quintile 4 6 or more abs  80
Quintile 4 0         pres 14
Quintile 4 0         abs  160
Quintile 5 6 or more pres 1
Quintile 5 6 or more abs  38
Quintile 5 0         pres 14
Quintile 5 0         abs  117
;
run;
proc freq order=data;
weight freq;
tables score*coffee*mi/cmh;
run;
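For readers who want to cross-check the Mantel-Haenszel common relative odds reported by PROC FREQ, the estimator can be sketched by hand in Python. This is a minimal sketch under stated assumptions: the names `STRATA` and `mantel_haenszel_or` are invented for illustration, each stratum is written as (exposed cases, exposed controls, unexposed cases, unexposed controls) with "6 or more" cups treated as exposed, and the counts are copied from the data above.

```python
# One tuple per risk quintile: (a, b, c, d) where
#   a = MI present, 6 or more cups    b = MI absent, 6 or more cups
#   c = MI present, 0 cups            d = MI absent, 0 cups
STRATA = [
    (12, 670, 17, 1315),
    (5, 261, 12, 395),
    (4, 174, 13, 370),
    (2, 80, 14, 160),
    (1, 38, 14, 117),
]


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum total
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


or_mh = mantel_haenszel_or(STRATA)
print(f"Mantel-Haenszel common odds ratio: {or_mh:.3f}")
```

With a single stratum the estimator reduces to the ordinary cross-product ratio ad/bc, which is a convenient sanity check. The sketch does not compute the confidence interval or the Breslow-Day test; those should be taken from the SAS output.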
Q5. One hundred men with lung cancer and 100 men without lung cancer were asked if they had ever smoked; their answers are tabulated in the following table: (20 marks)
                    Lung cancer
Previous smoking    Present    Absent    Total
Yes                 38         22        60
No                  62         78        140
Total               100        100       200
(a) What type of epidemiological study design is this? (2 marks)
(b) Interpret an approximate 95% confidence interval (by the Wald method) for the risk ratio of the association between previous smoking and lung cancer. (6 marks)
(c) Interpret the Wald test of whether the risk ratio of the association between previous smoking and lung cancer is equal to 1. (6 marks)
(d) Interpret an approximate 95% confidence interval (by the Wald method) for the relative odds of the association between previous smoking and lung cancer. (6 marks)
SAS code for all parts of Q5 except (a):
data lcancer;
/* column input: each value must sit in the columns named here */
input smoking $ 1-10 lcancer $ 12-15 count 18-19;
cards;
Smoker     pres  38
Smoker     abs   22
Non smoker pres  62
Non smoker abs   78
;
run;
/* order=data keeps Smoker as row 1 and pres as column 1 */
proc freq data=lcancer order=data;
tables smoking*lcancer /relrisk(CL=(Wald) method=Wald equal) OR(CL=(Wald));
weight count;
title 'Association between previous smoking and lung cancer';
run;
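The Wald intervals requested in parts (b) and (d) can be cross-checked with a short Python sketch. It uses the standard large-sample formulas on the log scale with a fixed normal quantile of 1.96; the function names are invented for illustration, and the output may differ from SAS in the last decimal place because SAS uses a more precise quantile.

```python
import math

Z_95 = 1.96  # approximate 97.5th percentile of the standard normal


def wald_ci_odds_ratio(a, b, c, d, z=Z_95):
    """Odds ratio ad/bc with a Wald CI built on log(OR).

    a, b = exposed with / without the outcome;
    c, d = unexposed with / without the outcome.
    """
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(estimate)
    return estimate, math.exp(log_or - z * se), math.exp(log_or + z * se)


def wald_ci_risk_ratio(a, b, c, d, z=Z_95):
    """Risk ratio [a/(a+b)] / [c/(c+d)] with a Wald CI built on log(RR)."""
    estimate = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    log_rr = math.log(estimate)
    return estimate, math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Counts from the Q5 table: smokers (38 present, 22 absent),
# non-smokers (62 present, 78 absent).
or_est, or_lo, or_hi = wald_ci_odds_ratio(38, 22, 62, 78)
rr_est, rr_lo, rr_hi = wald_ci_risk_ratio(38, 22, 62, 78)
print(f"OR = {or_est:.2f}, 95% CI ({or_lo:.2f}, {or_hi:.2f})")
print(f"RR = {rr_est:.2f}, 95% CI ({rr_lo:.2f}, {rr_hi:.2f})")
```

Both intervals are symmetric on the log scale rather than on the original scale, which is why the upper limit lies further from the point estimate than the lower limit.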