.. title: Statistics MAT167
.. slug: mat167
.. date: 2016-01-17 08:39:35 UTC-07:00
.. tags:
.. category:
.. link:
.. description:
.. type: text
.. figure:: courseImage.png
:alt: image
An introduction to statistics. Includes sampling, data display, measures
of central tendency, variability, and position; random variables,
probability, probability distributions; sampling distributions,
assessing normality, confidence intervals, hypothesis testing, ANOVA,
and regression. Use of the statistics software R is taught throughout
the course.
.. rubric:: Announcements
:name: announcements
Final grades have been posted. The solutions to the final exam are now
on the website. Have a good summer and thanks for all your hard work
this semester. --Anthony

Course Info
===========
- Spring 2009: section 22684, 3 credit hours.
- 9:10 am - 10:25 am Tuesday & Thursday, Santa Rita A102, Jan 20
through May 19 2009, West Campus, Pima Community College.
- `Syllabus `__
Instructor Info
---------------
- Instructor: Anthony Tanbakuchi
- Office: Radiology Research Labs, U of A, (520) 626-4500 (`Map To
Office `__)
- Easiest to contact me via email: mat167@tanbakuchi.com
Exam Dates
----------
- Feb 24: MIDTERM I
- April 16: MIDTERM II
- May 19: Final Exam Ch 1-12 (2 hours)
Resources
=========
Quick Reference Sheet:
----------------------
- `Equation Sheet & R Commands `__
R Statistics Software
---------------------
Using R on Campus:
**R is on a few computers in the Academic Commons Computer Lab**
(2nd floor Santa Catalina building). Just go into the computer lab
and ask Jody or Dennis (one of the lab managers) where the computers
are. Printers are also available in this lab.
To install R:
Visit the `R Resources page <../../Resources/R_Statistics/>`__.
Basic R usage and examples:
`R Basics page <../../Resources/R_Statistics/RBasics.html>`__
R Data Sets:
- `Triola Book Data Page `__
- `Class Survey Data Page `__
- See quick reference sheet or R intro lecture for info on how to
use the data sets.
Technical Notes On R:
When you start a new problem, it's best to delete all the variables
so that you don't accidentally use old data. Just type
``rm(list=ls())``. Note that you will then need to reload the book
data if you need it.
When you close R, you do not need to save the workspace if it asks.
Saving the workspace just saves the variables you have defined.
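
As a minimal sketch of what clearing the workspace does (the variable names ``x`` and ``y`` are made up for illustration):

::

    x <- c(1, 2, 3)      # hypothetical leftover data from an earlier problem
    y <- mean(x)
    ls()                 # lists the variables currently defined
    rm(list = ls())      # removes every variable in the workspace
    ls()                 # character(0): nothing is left

Any book tables that were loaded are removed as well, which is why they need to be reloaded afterwards.
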
Solutions / Exams
-----------------
- Spring 2009
- R HW `Questions `__,
`Solutions `__
- Summation HW `Questions `__,
`Solutions `__
- Test 1 `Exam `__,
`Solutions `__
- Test 2 `Exam `__,
`Solutions `__
- Final `Exam `__,
`Solutions `__
- Fall 2008
- Test 1 `Exam `__,
`Solutions `__
- Test 2 `Exam `__,
`Solutions `__
- Final `Exam `__,
`Solutions `__
- Summer 2008
- Midterm `Exam `__,
`Solutions `__
- Final `Exam `__,
`Solutions `__
- Spring 2008
- Test 1 `Exam `__,
`Solutions `__
- Test 2 `Exam `__,
`Solutions `__
- Final `Exam `__,
`Solutions `__
- Fall 2007
- Test 1 `Exam `__,
`Solutions `__
- Test 2 `Exam `__,
`Solutions `__
- Final `Exam `__,
`Solutions `__
Lectures and Homework
=====================
**All homework is due at the beginning of class on Tuesday.** Thus,
homework assigned on Tuesday and Thursday is due at the beginning of
class on the following Tuesday.
1. Tue, Jan 20
**FOUNDATIONS**
**Introductory Material.** (Sections 1.1-1.4)
- In Class Survey: `Sexual Partners Survey (encrypted
connection) `__
(Do not submit this until instructed to do so.)
- Lecture:
`Handout `__
- Special Homework (**complete within 24 hours**):
1. CRITICAL A: `Student Information (encrypted
connection) `__
2. CRITICAL B: `Student Survey (encrypted
connection) `__
- Homework (due next Tuesday):
1. CRITICAL C: Return syllabus student contract signed (last
page).
2. Sec 1.2: odds 1-25, 26
3. Sec 1.3: every other odd 1-17, odds 21-27
4. Sec 1.4: odds 1-29
5. If you plan on using your own computer for homework, try to
install R on it using these `installation
instructions <../../Resources/R_Statistics/>`__. If you have
problems getting it to install, email me. If you don't have a
home computer, you can use R in the Academic Commons Computer
Lab on campus.
2. Thur, Jan 22
**Introduction to R.**
- Lecture: `Handout `__
- Homework:
1. `R Worksheet `__
2. `R New York Times
Article `__
Read the New York Times article on R. (`PDF of
article `__ if link does not
work.)
3. Tue, Jan 27
**DESCRIPTIVE STATISTICS**
**Summarizing & graphing data.** (Sections 2.1-2.4)
- Lecture:\ `Handout `__
- Homework:
- **From this point forward if the book has the TI symbol next
to a problem, use R to do it.**
- **As always, make sure to include your plots made with R in
the HW.**
- If you get stuck using R take a look at `these R
examples. `__
1. Sec 2.2: 1-17 odds (do 17 by hand)
2. Sec 2.3: 1, 3.
3. Additional Problem for 2.3: Use R to make two histograms: one
of the male heights and one of the female heights in the Appendix
B Data Set 1 (``Mhealth`` and ``Fhealth`` tables in R).
Include both of the histograms in your HW. Then write a
paragraph discussing the differences between the male and
female heights that you can see from the histograms (i.e.,
center, variation, shape, outliers, min, max). Does either of
the histograms have a distribution that is approximately
normal?
HINTS: If you are having trouble getting the book data, see
the R intro lecture or look at the back of the quick reference
sheet. See the top part of this page to download the data
sets.
4. Sec 2.4: 1-4, 9 (use R), 13 (just sketch by hand), 17 (use R),
19 (use R)
Hint for 19: to make a plot with both lines and points rather
than a scatter plot, use the optional argument ``type="b"`` for
the plot function, e.g. ``plot(t, y, type="b")``. The time
vector ``t`` goes from 1990 to 2000. A quick way to make ``t``
is to use this shortcut: ``t=1990:2000``.
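
A minimal sketch of the ``type="b"`` hint, using made-up ``y`` values rather than the book's data (only the ``t`` shortcut comes from the hint itself):

::

    t <- 1990:2000                   # the years 1990 through 2000 (11 values)
    # Made-up measurements, one per year, for illustration only
    y <- c(12, 14, 13, 17, 19, 18, 22, 25, 24, 27, 30)

    # type = "b" draws both the points and the lines joining them
    plot(t, y, type = "b", xlab = "Year", ylab = "Value",
         main = "Lines and points")

The two vectors must have the same length, so check ``length(t)`` and ``length(y)`` if R complains.
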
4. Thur, Jan 29
**Summation Notation.**
- Lecture:
`Handout `__
- Homework: (If you need more explanation and practice with
summation notation: `see this
page `__ )
1. `Summation HW `__
5. Tue, Feb 3
**Measures of center.** (Sections 3.1-3.2)
- Lecture: `Handout `__
- Homework:
1. Sec 3.2: 1-9 odds, 13, 15, 21, 23, 25, 29 (Use R if the
problem has TI by it from now on.)
Hint for 21: to get the first set of differences use:
::
x=WEATHER$HIGH-WEATHER$PREDICTE
You can find the second set of differences in the same way
once you figure out the correct column name. (I admit the
author’s column names are not that good.)
Hint for 23: to get the pennies from before 1983:
::
x=Coins$WEIGHT[Coins$TYPE=="Pre-1983 Pennies"]
To find the post-1983 pennies, use the same method but take a
look at the ``Coins`` table to see what they are called and
then modify the above statement.
Hint on 29 b: ``mean(x, trim=0.10)``
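
The hints above rely on two ideas: picking rows out of a table with a logical test, and the trimmed mean. Here is a rough sketch on a small made-up table (``coins``, ``"old"`` and ``"new"`` are stand-ins, not the book's names):

::

    # Hypothetical stand-in for a book table: a type label and a weight per coin
    coins <- data.frame(
      TYPE   = c("old", "old", "old", "new", "new", "new"),
      WEIGHT = c(3.10, 3.12, 3.08, 2.50, 2.49, 2.51)
    )

    # Keep only the weights whose TYPE matches the label
    old <- coins$WEIGHT[coins$TYPE == "old"]
    mean(old)                    # mean weight of the "old" coins

    # A 10% trimmed mean drops the lowest 10% and highest 10% of the values
    x <- c(3, 4, 5, 6, 7, 8, 9, 10, 2, 100)
    mean(x)                      # 15.4, pulled up by the outlier 100
    mean(x, trim = 0.10)         # 6.5, the mean of 3 through 10

With ten values, a 10% trim removes exactly one value from each end, which is why the outlier no longer affects the result.
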
6. Thur, Feb 5
**Measures of variation.** (Section 3.3)
- Lecture:
`Handout `__
- Homework:
1. Sec 3.3: 1-9 odds, 15, 21
7. Tue, Feb 10
**Relative standing and exploratory data analysis.** (Sections
3.4-3.5)
- Lecture:
`Handout `__
- Homework
1. Sec 3.4: 1, 5, 7, 9, 11, 13-27 odds
2. Sec 3.5: 1, 3, 5, 9
3. Additional problem: Use the following code to make two
boxplots for comparing gender against bear weight and length.
Then use the boxplots to discuss and compare the distribution
of lengths and weights of bears in terms of their gender.
(Make sure the book data is loaded into R first)
::
boxplot(Bears$LENGTH ~ Bears$SEX, main="Comparison of bear length")
boxplot(Bears$WEIGHT ~ Bears$SEX, main="Comparison of bear weight")
8. Thur, Feb 12
**Descriptive Statistics: Case Study.**
- Lecture:
`Handout `__
**PROBABILITY**
**Probability I: Addition rule.** (Sections 4.1-4.3)
- Lecture: `Handout `__
- Homework:
1. Sec 4.2: 1-25 odds, 29
2. Sec 4.3: 1-23 odds
9. Tue, Feb 17
**Probability II: Multiplication rule.** (Sections 4.4-4.5)
- Lecture: `Handout `__
- Homework:
1. Sec 4.4: 1-21 odds
2. Sec 4.5: 1-25 odds
10. Thur, Feb 19
**Random variables** (Sections 5.1-5.2)
- Lecture: `Handout `__
(Print out the next lecture on counting; we may cover part of it
if we have time.)
- Homework:
1. Sec 5.2: 1-19 odds
11. Tue, Feb 24
**MIDTERM I (Chapters 1-4)**
12. Thur, Feb 26
**Rodeo Holiday** (No Classes)
13. Tue, Mar 3
**Counting & Binomial distribution.** (Sections 4.7, 5.3-5.4)
- Lecture: `Handout A `__,
`Handout B `__
- Homework:
1. Sec 4.7: 1, 5, 7, 9, 13
2. Sec 5.3: 1, 3, every other odd 5-33, 35 (If the book says to
use a table in the appendix, use ``dbinom`` in R instead.)
3. Sec 5.4: 1, 3, every other odd 5-17, 19
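
As a small sketch of using ``dbinom`` in place of the appendix table (the numbers of trials and successes here are made up, not taken from a book problem):

::

    # P(exactly 3 successes in 5 trials), success probability 0.5 per trial
    dbinom(3, size = 5, prob = 0.5)          # 0.3125

    # P(3 or fewer successes): add up the individual probabilities ...
    sum(dbinom(0:3, size = 5, prob = 0.5))   # 0.8125
    # ... or use the cumulative function pbinom
    pbinom(3, size = 5, prob = 0.5)          # 0.8125

For "at least" questions, subtract the cumulative probability of the values below the cutoff from 1.
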
14. Thur, Mar 5
**Intro to the normal distribution.** (Sections 6.1-6.2)
- Lecture: `Handout `__
- Homework
1. Sec 6.2: 1-4, 5-39 odds (most of these are easy if you use
the ``pnorm`` and ``qnorm`` R functions). **You must make sketches
to show the area.**
NOTE: From this point onward, if the book says to use a lookup
table in Appendix A, use R instead. (You won't be given tables
on the tests.)
HINT: If you use the technique I used in class, you don’t need
to find z scores OR use the table in the back of the book. If
the question refers to data that has a standard normal
distribution, then it has a normal distribution with a mean of 0
and a standard deviation of 1.
For example, 6.2 #10 asks for the probability that a
thermometer has a reading less than -2.50 if the readings have
a standard normal distribution. Thus we want to find
P(x < -2.50) where x has a standard normal distribution. In R
you would type:
::
> pnorm(-2.50, mean=0, sd=1)
[1] 0.006209665
So the probability is only 0.00621!
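
The same approach covers the other kinds of area questions. The values below are illustrative and are not tied to particular book problems:

::

    # Area to the right of -2.50 under the standard normal curve
    1 - pnorm(-2.50, mean = 0, sd = 1)     # about 0.9938

    # Area between -1.96 and 1.96
    pnorm(1.96) - pnorm(-1.96)             # about 0.95

    # qnorm goes the other way: the value with 97.5% of the area to its left
    qnorm(0.975, mean = 0, sd = 1)         # about 1.96

``pnorm`` turns a value into the area to its left, and ``qnorm`` turns an area to the left back into a value.
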
15. Tue, Mar 10
**Normal distribution cont.** (Section 6.3)
- Lecture: Continuation of previous lecture.
- Homework
1. Sec 6.3: 1, 2, 4, 5-23 odds *Make sketches to show the area.*
16. Thur, Mar 12
**INFERENTIAL STATISTICS**
**Sampling distributions, estimators, and the central limit theorem
(CLT).** (Sections 6.4-6.5)
- Lecture:
`Handout `__
- Homework:
1. Sec 6.4: 1-7 odds, 11
2. Sec 6.5: 1-17 odds *Make sketches*
17. Tue, Mar 17 & Thur, Mar 19
**Spring Break** (No class)
18. Tue, Mar 24
**Normal as approx. to the binomial and assessing normality.**
(Sections 6.6-6.7)
- Lecture: `Handout
A `__,
`Handout B `__
- Homework:
1. Sec 6.6: 1-23 odds (Use R not the appendix tables!) *Make
sketches*
2. Sec 6.7: 1, 3, 9 & 13, 11 & 15
19. Thur, Mar 26
**Estimating a population proportion** (Sections 7.1-7.2)
- Lecture: `Handout `__
- Homework: (Yes, there are many problems for this HW, but these
problems require practice.)
1. Sec. 7.2: 1-35 odds
20. Tue, Mar 31
**Estimating a population mean.** (Sections 7.3-7.4)
- Lecture: `Handout `__
- Homework: (Yes, there are many problems for this HW, but these
problems require practice.)
1. Sec. 7.3: 1-23 odds, 27, 29, 33
2. Sec. 7.4: 1-13 odds, 19, 21, 23
21. Thur, April 2
**HYPOTHESIS TESTING**
**Intro to hypothesis testing** (Sections 8.1-8.2)
- Lecture: `Handout `__
- Homework:
1. Sec. 8.2: 1-43 odds (skip 17-23). You don't need to find
critical values. However, if the book asks you to find the
test statistic, find that using the equation.
Hint for 29-36: If you have the test statistic and it's a
z-score, then use the cumulative probability distribution for
the standard normal, ``pnorm``, to find the tail area. The
section on **p-value** in the notes also discusses what to
do.
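
A minimal sketch of turning a z test statistic into a p-value with ``pnorm``. The value of ``z`` is made up, and which line applies depends on the alternative hypothesis in the problem:

::

    z <- 1.5                          # hypothetical test statistic (a z-score)

    pnorm(z, lower.tail = FALSE)      # right-tailed test: area above z
    pnorm(z)                          # left-tailed test: area below z
    2 * pnorm(-abs(z))                # two-tailed test: both tails, about 0.134

Using ``-abs(z)`` in the two-tailed case gives the same answer whether the statistic is positive or negative.
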
22. Tue, April 7
**Testing a claim about a proportion** (Section 8.3)
- Lecture: Continuation of last lecture handout.
- Homework:
1. Sec. 8.3: 1-3 odds, 5(c,d,e), 9, 15, 19, 23
Note 1: You **do not** need to find the test statistic or
critical values. We are using the p-values.
Note 2: R uses the continuity correction for more accurate
p-values. Your p-values and test statistics will differ from
the book's answers by a few percent. The following are a few
of the p-values you will get with R to help you verify your
work: Q5: p-value = 0.9114, Q9: p-value < 2.2e-16, Q15:
p-value = 0.5395.
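
To see how much the continuity correction matters, here is a sketch that assumes the p-values come from ``prop.test`` (the counts are made up and do not correspond to a book problem):

::

    x <- 60; n <- 100                 # made-up counts: 60 successes in 100 trials

    # Default: the continuity correction is applied (correct = TRUE)
    with_cc    <- prop.test(x, n, p = 0.5)
    # Correction switched off, as in a hand calculation with the z statistic
    without_cc <- prop.test(x, n, p = 0.5, correct = FALSE)

    with_cc$p.value       # slightly larger
    without_cc$p.value    # equals 2 * pnorm(-2), since z = 2 for these counts

The uncorrected value is shown only for comparison with the book's answers.
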
23. Thur, April 9
**Testing a claim about a mean** (Sections 8.4-8.5)
- Lecture: `Handout `__
- Homework:
1. Sec. 8.4: 1-7 odds, 13, 15
2. Sec. 8.5: 3-13 odds, 21, 25, 27, 31
24. Tue, April 14
**Understanding tests and estimates**
- Lecture: `Handout `__
- Homework: Study for the exam. I won't accept any email questions
after 5 pm the night before the exam. Don't start studying the
night before the test.
25. Thur, April 16
**MIDTERM II (Chapters 5-8 and 4.7)**
26. Tue, April 21
**Inferences about two proportions** (Sections 9.1-9.2)
- Lecture: `Handout `__
- Homework:
1. Sec. 9.2: 1-7 odds, 15, 17, 19, 21, 25
Note that R uses the continuity correction so the p-values
will differ by a few percent from the book's.
27. Thur, April 23
**Inferences about two means & matched pairs** (Sections 9.3-9.4)
- Lecture: `Handout `__
- Homework:
1. Sec. 9.3: 1-7 odds, 23, 25, 27, 28
Hint for 27: Use the ``Coins`` table. To get the quarters from
before 1964:
::
pre=Coins$WEIGHT[Coins$TYPE=="Pre-1964 Quarters"]
To find the post-1964 quarters, use the same method but take a
look at the ``Coins`` table to see what they are called and
then modify the above statement. The command
``summary(Coins)`` is helpful to find the categories.
Hint for 28: Use the ``Cola`` table. Just figure out which two
columns you need.
2. Sec. 9.4: 1, 3, 5 (manually find the test statistic & p-value
only), 13, 15, 17 (b-c), 19
28. Tue, April 28
**MODELING AND TESTING RELATIONSHIPS**
**Correlation** (Sections 10.1-10.2)
- Lecture: `Handout `__
- Homework: **Include scatter plots for each set of data for which
you find r.**
1. Sec. 10.2: 1-11 odds, 21, 23, 27, 29, 31, 33, 35
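
A rough sketch of the scatter plot and correlation steps on made-up paired data (``x`` and ``y`` are placeholders for the columns in a book problem):

::

    # Made-up paired data for illustration
    x <- c(1, 2, 3, 4, 5, 6, 7, 8)
    y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

    plot(x, y, main = "Scatter plot of y against x")   # always look at the data first

    r <- cor(x, y)        # linear correlation coefficient
    r
    cor.test(x, y)        # tests whether the correlation is zero; reports a p-value

Look at the scatter plot before trusting ``r``, since a curved pattern or an outlier can make the number misleading.
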
29. Thur, April 30
**Regression** (Section 10.3)
- Lecture: `Handout `__
- Homework: **Make sure to determine if r is significant first (via
a hypothesis test). SHOW WORK. Include scatter plots with the
regression line and residual plots.**
1. Sec. 10.3: 1-11 odds, 21, 23, 27, 29, 31, 33
Hint for 5 and 7: you will need to determine if the linear
correlation coefficient is significant. Since r and n are
already given, just use the test statistic equation to manually
find the p-value.
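
When only r and n are given, the manual calculation might look like the sketch below. It assumes the usual t test statistic for a correlation coefficient with n - 2 degrees of freedom, so check it against the equation sheet, and note that the values of ``r`` and ``n`` are made up:

::

    r <- 0.5       # hypothetical correlation coefficient
    n <- 27        # hypothetical number of data pairs

    # Test statistic for H0: no linear correlation (degrees of freedom = n - 2)
    t_stat <- r * sqrt((n - 2) / (1 - r^2))

    # Two-tailed p-value from the t distribution
    p_value <- 2 * pt(-abs(t_stat), df = n - 2)
    t_stat
    p_value

If the raw data are available, ``cor.test(x, y)`` reports the same p-value directly.
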
**Variation and prediction intervals, multiple regression** (Sections
10.4-10.5)
- Lecture: Continuation of regression lecture.
- Homework: No HW. These sections are optional course material.
However, I highly recommend you read them.
30. Tue, May 5
**Contingency tables** (Section 11.3)
- Lecture: `Handout `__
- Homework: Note that R uses Yates' continuity correction, so
your P-values may differ slightly from the book's.
1. Sec. 11.3: 1-5, 7, 11, 13, 17, 21
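
As a sketch of where the correction shows up, assuming the test is run with ``chisq.test`` (the table of counts is made up):

::

    # Made-up 2x2 table of counts: rows are groups, columns are outcomes
    tab <- matrix(c(30, 20,
                    15, 35), nrow = 2, byrow = TRUE)

    chisq.test(tab)                    # default: Yates' correction applied
    chisq.test(tab, correct = FALSE)   # no correction, as in a hand calculation

``chisq.test`` applies the correction only to 2 by 2 tables, so answers for larger tables should not be affected by it.
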
31. Thur, May 7
**ANOVA I** (Section 12.1)
- Lecture: `Handout `__
- Homework:
1. Sec. 12.2: 1-4, 5 (skip d), 9
**ANOVA II** (Section 12.2)
- Lecture: Continuation of last lecture handout.
- Homework: Don't type in the data for 11-14 manually; download
`Chapter 12 Data File `__ and load it
into R (just like the book data). It has data for each problem.
The table name is listed next to each problem. Also, don't forget
to **include the boxplots**.
1. Sec. 12.2: 11 ``car.crash`` (p-val=0.421), 12 ``car.crash``
(p-val=0.296), 13 ``stress`` (p-val=0.091), 14 ``skulls``
(p-val=0.0305), 16 (p-val=0.0369)
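
A minimal sketch of a one-way ANOVA with boxplots, on made-up data. The layout of the Chapter 12 tables is not shown here, so the column names ``value`` and ``group`` are assumptions, and the lecture handout may use a different function than ``aov``:

::

    # Made-up data: one measurement column and one group label column
    d <- data.frame(
      value = c(5.1, 4.9, 5.6, 5.3, 6.2, 6.0, 6.5, 6.1, 5.0, 5.2, 4.8, 5.1),
      group = factor(rep(c("A", "B", "C"), each = 4))
    )

    # Side-by-side boxplots, one per group
    boxplot(value ~ group, data = d, main = "Comparison of groups")

    fit <- aov(value ~ group, data = d)   # one-way ANOVA
    summary(fit)                          # the Pr(>F) column is the p-value

Compare the boxplots with the p-value: groups that barely overlap should go along with a small p-value.
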
32. Tue, May 12
**Review / Questions**
- Lecture: `Handout `__
- Homework: Study for the final exam.
33. Thur, May 14
**Review / Questions**
34. Tue, May 19
**FINAL EXAM Chapters 1-12** (2 hours - early class start time)
**8:10 am to 10:10 am**
If you would like your final exam back, turn in a self-addressed
stamped envelope (with 2 first-class stamps affixed) along with your
final exam. Once the exams are graded, I will mail back those for
which I have envelopes. If you don't provide an envelope, your exam
will be shredded for your privacy.