This course provides a survey of regression techniques for outcomes common in public health data, including continuous, binary, count and survival data. Emphasis is on developing a conceptual understanding of how these techniques are applied to solve problems, rather than on the numerical details. Computers will be used extensively to analyze datasets.
This course is designed for graduate and advanced undergraduate students who will be analyzing data with scientific colleagues and who want to develop a practical hands-on toolkit and gain experience in distilling complex statistical information into formats understandable to colleagues. This course will feature R programming elements. To make the most of R, students will be expected to use \href{https://www.rstudio.com/}{RStudio}. There are videos on the course’s R page that explain how to get set up and started in RStudio.
After successful completion of this course you will understand and be able to develop and interpret regression models to describe how an outcome is related to one or more predictor variables. In particular these include the following capabilities:
Students in this course will be expected to do the following:
Attend all lectures and actively participate in discussion.
Read all assigned material prior to coming to class and actively participate in class discussions.
Complete and turn in all assignments on time. Solutions to homework must be clearly written with appropriate tables and figures included.
Demonstrate an understanding of the material on examinations.
Respect each other, each other's questions and each other's discussion.
Course topics will be drawn (subject to change) from:
Applied Regression Analysis and Generalized Linear Models by John Fox Jr. We follow this book for content and pacing.
Regression Methods in Biostatistics by Eric Vittinghoff, David V. Glidden, Stephen C. Shiboski, Charles E. McCulloch. We follow this book for content and pacing.
An Introduction to Statistical Learning: with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani.
We will use this as a reference for some methods as well as a resource for data and R code.
This is freely available as an eBook. If you prefer a paperback version, you may buy it at cost from Springer (see links from library site) or purchase a hardback version through Amazon.
For additional information, check out the videos for the ISL book.
Other resources for reference books, statistical computing using R, etc. are provided on the Resource tab.
Students will be evaluated based on:
Grade Category | Percentage |
---|---|
Participation | 10% |
Homework | 20% |
Exam 1 (03/13/2019) | 20% |
Exam 2 (05/08/2019) | 20% |
Reproducible Research Project | 30% |
Task | Hours Spent on Task |
---|---|
Class Time | 40 |
Homework/Class Preparation | 90 |
Exams/Prep | 25 |
Reproducible Research Project | 45 |
This course will move very fast, and attending and participating are crucial to success in the course. Many classes will have polls or quizzes that will be graded on participation, not on having the most correct or best answer. Unexcused absences will result in a loss of percentage points.
Weekly assignments will be given out to students. Assignments will require data handling, data cleaning and interpretation of the results. It is expected that all assignments are completed on time. No late assignments will be accepted.
Students will also be graded on the conciseness and quality of their work. Turning in many pages of just computer code and output will lower the grade.
Two in-class exams will be given (Exam 1 and Exam 2). On each exam, students will be expected to interpret and analyze regression models. Students will also be expected to understand conceptual ideas.
Students will spend the semester working on a Reproducible Research Project. This project will require:
The project will consist of individual as well as group content. For the individual content, you will complete the 4 requirements. For the group component, you will work in small groups to evaluate each other's work. This will require:
You will be graded on both individual and group aspects. It is important to learn not only how to ask a public health question and answer it with a study or data, but also how to review others' work and arguments.
Given that this course includes students at multiple levels, from Undergraduate to PhD, it is important to discuss the differences in expectations and how students will be graded.
Grade Category | Comments |
---|---|
Participation | Graded the same as all students. Must be in class and prepared to work in groups. |
Homework | Students will be expected to complete a portion of the material, with the exception of some more difficult problems, which may be attempted but do not have to be completed. |
Exam 1 & 2 | Students will be expected to complete a portion of the exam. |
Reproducible Research Project | Students will be expected to complete a reproducible research project. Data as well as questions explored will be at a level appropriate to the student's background and other statistical courses taken. This will be a semester-long project, so it will require a great deal of work. |
Grade Category | Comments |
---|---|
Participation | Graded the same as all students. Must be in class and prepared to work in groups. |
Homework | Students will be expected to complete the entire assignment. |
Exam 1 & 2 | Students will be expected to complete the entire exam. |
Reproducible Research Project | Students will be expected to complete a reproducible research project. Data as well as questions explored will be at a level appropriate to the student's background and other statistical courses taken. This will be a semester-long project, so it will require a great deal of work. |
We will use R as a programming language for data analysis and use existing packages written in R to support the course. You should have access to a laptop or desktop capable of running R and RStudio. We will also provide all students with access to a dedicated server running RStudio Pro, which offers a unified environment. See the Resources page for books and other resources for learning R.
Any non-personal questions related to the material covered in class, problem sets, labs, projects, etc. should be posted on Slack. Before posting a new question, please check whether your question has already been answered. The TAs and I will answer questions on the forum daily, and all students are expected to answer questions as well. Please use informative titles for your posts.
Note that it is more efficient to answer most statistical questions “in person”, so make use of Office Hours.
Brown University is committed to full inclusion of all students. Students who, by nature of a documented disability, require academic accommodations should contact the professor during office hours. Students may also speak with Student and Employee Accessibility Services at 401-863-9588 to discuss the process for requesting accommodations.
This course is designed to support an inclusive learning environment where diverse perspectives are recognized, respected and seen as a source of strength. It is our intent to provide materials and activities that are respectful of various levels of diversity: mathematical background, previous computing skills, gender, sexuality, disability, age, socioeconomic status, ethnicity, race, and culture.
Brown University welcomes students from around the world, and the unique perspectives international students bring enrich the campus community. To empower students whose first language is not English, an array of ELL support is available on campus including language and culture workshops and individual appointments. For more information about English Language Learning at Brown, contact the ELL Specialists at ellwriting@brown.edu.