Tentative STT 3850 Course Schedule

General Notes:

Please bring a notebook and pencil to every class.
The principal documents for this course are ModernDive: An Introduction to Statistical and Data Sciences via R (MD), Data Science with R (DSWR), and Mathematical Statistics with Resampling and R (MSWR), available inside the ASULEARN course page.
Problem Set (PS) assignments are generally due on Thursdays by 5:00 pm.
The links for the problem sets and the sampling distribution assignment in this document may only work for the instructor! To accept each of these assignments, go to ASULEARN and click on the appropriate link in the GitHub Classroom Invitation Links for Assignments block.
QUARTO cheat sheet

Grading Rubric for Assignments

Field	Excellent (3)	Competent (2)	Needs Work (1)
Reproducible	All graphs, code, and answers are created from text files. Answers are never hard-coded but instead are inserted using inline R code. An automatically generated references section with properly formatted citations (when appropriate) and `sessionInfo()` are provided at the end of the document.	All graphs, code, and answers are created from text files. Answers are hard-coded. No `sessionInfo()` is provided at the end of the document. References are present but not cited properly or not automatically generated.	Document uses copy and paste with graphs or code. Answers are hard-coded, and references, when appropriate, are hard-coded.
Graphics	Graphs for categorical data (barplot, mosaic plot, etc.) have appropriately labeled axes and titles. Graphs for quantitative data (histograms, density plots, violin plots, etc.) have appropriately labeled axes and titles. Multivariate graphs use appropriate legends and labels. Computer variable names are replaced with descriptive variable names.	Appropriate graphs for the type of data are used. Not all axes have appropriate labels or computer variable names are used in the graphs.	Inappropriate graphs are used for the type of data. Axes are not labeled and computer variable names appear in the graphs.
Coding	Code (primarily R) produces correct answers. Non-standard or complex functions are commented. Code is formatted using a consistent standard.	Code produces correct answers. Commenting is not used with non-standard and complex functions. No consistent code formatting is used.	Code does not produce correct answers. Code has no comments and is not formatted.
Clarity	Few errors of grammar and usage; any minor errors do not interfere with meaning. Language style and word choice are highly effective and enhance meaning. Style and word choice are appropriate for the assignment. Complete sentences are used to report all answers.	Some errors of grammar and usage; errors do not interfere with meaning. Language style and word choice are, for the most part, effective and appropriate for the assignment. Incomplete sentences and inconsistent punctuation are used to answer questions.	Major errors of grammar and usage make meaning unclear. Language style and word choice are ineffective and/or inappropriate. Only numeric values are reported for answers to questions.
Completeness	All questions are answered correctly. Answers to questions demonstrate clear statistical understanding by comparing theoretical answers to simulated answers. When hypotheses are tested, classical methods are compared and contrasted to randomization methods. When confidence intervals are constructed, classical approaches are compared and contrasted with bootstrap procedures. The scope of inferential conclusions made is appropriate for the sampling method.	A question or two is incorrect or unanswered. Theoretical and simulated answers are computed but no discussion is present comparing and contrasting the results. When hypotheses are tested, results for classical and randomization methods are presented but are not compared and contrasted. When confidence intervals are constructed, classical and bootstrap approaches are computed but the results are not compared and contrasted. The scope of inferential conclusions made is appropriate for the sampling method.	More than two questions are incorrect or unanswered. Theoretical and simulated answers are not computed correctly. No comparison between classical and randomization approaches is present when testing hypotheses. When confidence intervals are constructed, there is no comparison between classical and bootstrap confidence intervals.
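
The "Reproducible" row above can be illustrated with a minimal sketch in R. It is only an illustration: the built-in `mtcars` data set and the names `mean_mpg` and `answer` are assumptions chosen for the example, not part of any assignment. The idea is that a reported value is computed from the data and inserted into the sentence, never typed by hand. In a Quarto document the same value would typically be inserted with an inline expression such as `r round(mean_mpg, 2)`.

```r
# Compute the quantity once, directly from the data
# (mtcars ships with base R; it stands in for assignment data here).
mean_mpg <- mean(mtcars$mpg)

# Hard-coded (penalized by the rubric): typing the number into the sentence by hand.
# Reproducible: build the sentence from the computed value instead.
answer <- sprintf("The mean fuel economy is %.2f miles per gallon.", mean_mpg)
print(answer)

# Record the R version and attached packages at the end of the document.
sessionInfo()
```

If the data change, the sentence updates on the next render without any manual editing.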

Tiered Feedback Explanation

Level one. Problem Sets are graded using the rubric on the course pacing guide. The same rubric is used for all of the PS assignments, and you are graded on five categories, with 3, 2, 1, or 0 points awarded per category. Everyone who accepts a Problem Set will receive Level 1 feedback in their repository Issues.

Level two. If you cannot determine from the rubric feedback what you could do better on future assignments, you can request annotated (Level 2) feedback. To do so, reply to me in the Issues (@alanarnholt) before noon on the Monday after you receive Level 1 feedback (which should arrive on Fridays).

I will provide Level 2 feedback using Issues in your repository to give additional details based on the rubric. Anyone may ask for Level 2 feedback. When you get your Level 2 feedback (by Tuesday morning), you are expected to act on it to improve your code, mark the issues as “resolved”, and message me in the Issues (@alanarnholt) before noon on Wednesday.

Level three. If you are still unclear about how to improve your work after receiving your Level 2 feedback, you may request to meet with me during student help/office hours on Wednesday. There you will receive in-depth feedback and guidance on how to be more successful on the next assignment and how to resolve the Level 2 feedback/Git issues before noon on Thursday.

Asking for Level 2 feedback is an agreement between you and me: you will revise and resubmit your document by noon on Thursday, and I will look at your revisions and may revise your original rubric grade. If you ask for Level 2 feedback and do not revise your document by noon on Thursday, I may revise your original grade. After the Thursday following the Thursday when your PS is due, I will not review any further updates or corrections you push to your repository.

Week 1: (Jan 16 – 18)

Before the first class meeting, read Chapter 1 (Getting Started with Data in R) of MD—pgs 1-20
Before the first class meeting, read Chapter 1 Why Git? Why GitHub? of Happy Git With R.
Become familiar with the Appstate RStudio/POSIT workbench server. You will use your Appstate user name and password to log in to the server. You must be registered in the class to access the server.

We will walk through everything outlined below in class. If you want to complete the setup before class, that is fine.

Sign up for a free account on GitHub. When you register for a free individual GitHub account, request a student discount to obtain a few private repositories as well as unlimited public repositories. Please use something similar to FirstNameLastName as your username when you register with GitHub. For example, my username on GitHub is alanarnholt. If you have a popular name such as John Smith, you may need to provide some other distinguishing characteristic in your username.
Introduce yourself to Git by following the directions in HappyGitWithR.
Cache your credentials and set up a personal access token (PAT) by following the directions in HappyGitWithR.
TL;DR the chapters in Happy Git With R — follow this document to Set up Git and GitHub
VIDEO of the setup process
Complete PS-01 due by 5:00 pm Jan 18
VIDEO of how to accept and clone the assignment
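
The Git introduction and PAT steps above can be sketched from the R console as follows. This is a hedged sketch, not the official course procedure: it assumes the `usethis` and `gitcreds` packages are installed, and the name and email shown are hypothetical placeholders. The linked setup documents remain the authoritative directions.

```r
# Run interactively in RStudio; the calls are skipped in non-interactive sessions.
if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  # Introduce yourself to Git: writes user.name and user.email to your
  # user-level (global) Git configuration. Replace the placeholder values.
  usethis::use_git_config(
    user.name  = "Jane Doe",             # hypothetical name
    user.email = "jane.doe@example.com"  # hypothetical email; use your GitHub email
  )

  # Opens GitHub in a browser so you can generate a personal access token (PAT).
  usethis::create_github_token()
}

if (interactive() && requireNamespace("gitcreds", quietly = TRUE)) {
  # Prompts for the PAT and stores it in the Git credential store.
  gitcreds::gitcreds_set()
}
```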

Optional

Introduction to R slides
Watch Paul the Octopus clip (61 seconds).
You may want to install Git, R, RStudio, Zotero, and optionally LaTeX on your personal computer. If you do, you will want to follow Jenny Bryan’s excellent advice for installing R and RStudio and installing Git. Jenny’s advice is also in chapters 6 and 7 of Happy Git and GitHub for the useR. Note: Git, R, RStudio, and LaTeX are installed on the Appstate RStudio server.
Watch the following videos as appropriate:
Install R on Mac (2 min)
Install R for Windows (3 min)
Install R and RStudio on Windows (5 min)
Work through chapter 1 (Git and GitHub) of DSWR. Make sure RStudio is set up to communicate with Git by following the directions in HappyGitWithR for introducing yourself to Git.
Work through chapter 2 (Introduction to R) of DSWR
Reading in Data

Week 2: (Jan 23 – 25)

Before class read chapter 2 (Data Visualization) of MD — pgs 21-62
Complete the Data Visualization chapter of Introduction to the Tidyverse — DataCamp — Due NLT 5:00 pm Jan 23
Complete the Types of Visualizations chapter of Introduction to the Tidyverse — DataCamp — Due NLT 5:00 pm Jan 24
Complete PS-02 due by 5:00 pm Jan 25
Lecture Slides

Optional

Read Getting used to R, RStudio, and R Markdown
Work through chapter 5 (Using ggplot2) of DSWR
Complete Data Visualization with ggplot2 (Part 1) (DataCamp)

Week 3: (Jan 30 – Feb 1)

Before class read chapter 3 (Data Wrangling) of MD — pgs 65-96
Lecture Slides
Complete the Data Wrangling chapter of Introduction to the Tidyverse — DataCamp — Due NLT 5:00 pm Jan 30
Complete the Grouping and Summarizing chapter of Introduction to the Tidyverse — DataCamp — Due NLT 5:00 pm Jan 31
Complete PS-03 by 5:00 pm Feb 1
In-class work on dplyr-CH1-handout

Optional

Posit Cheat Sheets
Work through chapter 3 (Starting with Data) of DSWR
Work through chapter 4 (Data Manipulation) of DSWR
In-class work on dplyr-CH2-handout
In-class work on dplyr-CH3-handout
In-class work on dplyr-CH4-handout

Week 4: (Feb 6 – 8)

Before class read chapter 5 (Basic Regression) of MD — pgs 119-160
Lecture Slides
In class go over this document
Class notes for one quantitative and one qualitative predictor
Complete the Introduction to Modeling chapter of Modeling with Data in the Tidyverse — DataCamp — Due NLT 5:00 pm Feb 6
Complete the Modeling with Basic Regression chapter of Modeling with Data in the Tidyverse — DataCamp — Due NLT 5:00 pm Feb 7
Complete PS-04 due by 5:00 pm Feb 8

Optional

Read chapter 4 (Data Importing and “Tidy” Data) of MD — pgs 99-117
Read the Git and GitHub chapter from Hadley Wickham’s book R Packages
Brian Caffo’s take on R IDEs

Week 5: (Feb 13 – 15)

Before class read chapter 6 (Multiple Regression) of MD — pgs 161-191
Lecture Slides
Regression with a single categorical variable handout.
Complete the Modeling with Multiple Regression chapter of Modeling with Data in the Tidyverse — DataCamp — Due NLT 5:00 pm Feb 13
Complete the Model Assessment and Selection chapter of Modeling with Data in the Tidyverse — DataCamp — Due NLT 5:00 pm Feb 14
Complete PS-05 by 5:00 pm Feb 15

Optional

Complete Correlation and Regression in R (DataCamp)

Week 6: (Feb 20 – 22)

Lecture Slides
Before class read/review chapter 6 (Multiple Regression) of MD — pgs 161-191
Go over in class Misc Regression
Complete PS-06 by 5:00 pm Feb 22

Optional

Answer the questions at the end of Misc Regression for extra credit
Work on Is this Discrimination?
Some ideas for how to answer Is this Discrimination?

Week 7: (Feb 27 – 29)

Lecture Slides
Probability
Complete (The binomial distribution) in Foundations of Probability in R — DataCamp — Due NLT 5:00 pm Feb 27
Complete (Laws of probability) in Foundations of Probability in R — DataCamp — Due NLT 5:00 pm Feb 28
Complete (Bayesian statistics) in Foundations of Probability in R — DataCamp — Due NLT 5:00 pm Feb 29

Optional

In Class Problems
Foundations of Probability with some Extras
Complete the Improving the Report chapter of Reporting with R Markdown — DataCamp
Complete the Customizing the Report chapter of Reporting with R Markdown — DataCamp

Week 8: (Mar 5 – 7)

Complete (Related distributions) in Foundations of Probability in R — DataCamp — Due NLT 5:00 pm Mar 5
Mid-Term Exam/Opportunity To Excel — Due no later than 2:00 pm Mar 7

Optional

Study

Spring Break: Mar 11 – 15

Week 9: (Mar 19 – 21)

Before class read chapter 7 (Sampling) of MD — pgs 195-232
Complete the Sampling Distributions Lab by 5:00 pm Mar 19 (not graded; we will go over most questions in class). Partial Solution
Start PS-07 due by 5:00 pm Mar 21

Optional

Sampling Distributions
Read Chapter 4 of MSWR — Sampling Distributions; Problems 2, 5, 12-16

Week 10: (Mar 26 – 28)

Before class read chapter 8 (Bootstrapping and Confidence Intervals) of MD — pgs 233-305
Read Chapter 5 of MSWR
Chapter 5 notes
Complete the Bootstrapping for Estimating a Parameter chapter in Inference for Numerical Data in R — DataCamp — Due NLT 5:00 pm Mar 26
Complete the Introducing the t-distribution chapter in Inference for Numerical Data in R — DataCamp — Due NLT 5:00 pm Mar 27
Complete the Inference for Difference in Two Parameters chapter in Inference for Numerical Data in R — DataCamp — Due NLT 5:00 pm Mar 28
Bootstrap Example
Lecture Slides

Optional

Week 11: (Apr 2 – 4)

Lecture Slides for weeks 11, 12, and 13
Before class review chapter 8 (Bootstrapping and Confidence Intervals) of MD — pgs 233-305
Read Chapter 7 of MSWR
Chapter 7 notes
Bootstrap \(t\)
Complete PS-08 by 5:00 pm Apr 4

Optional

Week 12: (Apr 9 – 11)

Lecture Slides for weeks 11, 12, and 13
Before class read Chapter 9 (Hypothesis Testing) of MD — pgs 307-360
Read about Permutation Testing
Complete the Introduction to ideas of inference chapter of Foundations of Inference — DataCamp — Due NLT 5:00 pm Apr 9
Complete the Completing a randomization test: gender discrimination chapter of Foundations of Inference — DataCamp — Due NLT 5:00 pm Apr 10
Complete the Hypothesis testing errors: opportunity cost chapter of Foundations of Inference — DataCamp — Due NLT 5:00 pm Apr 11

Optional

Week 13: (Apr 16 – 18)

Lecture Slides for weeks 11, 12, and 13
Before class review Chapter 9 (Hypothesis Testing) of MD — pgs 307-360
Permutation Examples
Complete the Inference for a Single Parameter chapter in Inference for Categorical Data in R — DataCamp — Due NLT 5:00 pm Apr 16
Complete the Proportions: Testing and Power chapter in Inference for Categorical Data in R — DataCamp — Due NLT 5:00 pm Apr 17
Complete PS-09 by 5:00 pm Apr 18

Optional

Complete the problems in the R Markdown file and publish your solution to RPubs.
Misc R Markdown Examples

Week 14: (Apr 23 – 25)

Reading material for weeks 14 and 15
Watch Goodness-Of-Fit video on ASULEARN
Goodness-Of-Fit
In class Examples
Complete the Comparing Many Parameters: Independence chapter in Inference for Categorical Data in R — DataCamp — Due NLT 5:00 pm Apr 23
Lecture Slides

Optional

Complete the problems in the R Markdown file and publish your solution to RPubs

Week 15: (Apr 30 – May 1)

Reading material for weeks 14 and 15
Watch Chi-Square Test of Independence video on ASULEARN
Watch Chi-Square Test of Homogeneity video on ASULEARN
Goodness-Of-Fit
In class Examples
Complete the Comparing Many Parameters: Goodness of Fit chapter in Inference for Categorical Data in R — DataCamp — Due NLT 5:00 pm Apr 30
Slides for weeks 14 and 15
Course Review

Final Exam — Section -101 (9:30 am Class): May 9: 8:00 am - 10:30 am

Final Exam — Section -102 (11:00 am Class): May 7: 11:00 am - 1:30 pm

Last Updated on: Feb 19, 2024 at 01:22:17 PM

Tentative STT 3850 Course Schedule - Spring 2024