General Notes:

  1. Please bring a notebook and pencil to every class.
  2. The principal documents for this course are ModernDive: An Introduction to Statistical and Data Sciences via R (MD), Data Science with R (DSWR), and Mathematical Statistics with Resampling and R (MSWR), available inside the ASULEARN course page.
  3. Problem Set (PS) assignments are generally due on Thursdays by 5:00 pm.
  4. The links for the problem sets and the sampling distribution assignment in this document may only work for the instructor! To accept each of these assignments, go to ASULEARN and click on the appropriate link in the GitHub Classroom Invitation Links for Assignments block.
  5. QUARTO cheat sheet

Grading Rubric for Assignments

Each field is scored Excellent (3), Competent (2), or Needs Work (1).

Reproducible

  • Excellent (3): All graphs, code, and answers are created from text files. Answers are never hard-coded but instead are inserted using inline R code. An automatically generated references section with properly formatted citations when appropriate and sessionInfo() are provided at the end of the document.
  • Competent (2): All graphs, code, and answers are created from text files. Answers are hard coded. No sessionInfo() is provided at the end of the document. References are present but not cited properly or not automatically generated.
  • Needs Work (1): Document uses copy and paste with graphs or code. Answers are hard coded, and references, when appropriate, are hard coded.

Graphics

  • Excellent (3): Graphs for categorical data (barplot, mosaic plot, etc.) have appropriately labeled axes and titles. Graphs for quantitative data (histograms, density plots, violin plots, etc.) have appropriately labeled axes and titles. Multivariate graphs use appropriate legends and labels. Computer variable names are replaced with descriptive variable names.
  • Competent (2): Appropriate graphs for the type of data are used. Not all axes have appropriate labels or computer variable names are used in the graphs.
  • Needs Work (1): Inappropriate graphs are used for the type of data. Axes are not labeled and computer variable names appear in the graphs.

Coding

  • Excellent (3): Code (primarily R) produces correct answers. Non-standard or complex functions are commented. Code is formatted using a consistent standard.
  • Competent (2): Code produces correct answers. Commenting is not used with non-standard and complex functions. No consistent code formatting is used.
  • Needs Work (1): Code does not produce correct answers. Code has no comments and is not formatted.

Clarity

  • Excellent (3): Few errors of grammar and usage; any minor errors do not interfere with meaning. Language style and word choice are highly effective and enhance meaning. Style and word choice are appropriate for the assignment. Complete sentences are used to report all answers.
  • Competent (2): Some errors of grammar and usage; errors do not interfere with meaning. Language style and word choice are, for the most part, effective and appropriate for the assignment. Incomplete sentences and inconsistent punctuation are used to answer questions.
  • Needs Work (1): Major errors of grammar and usage make meaning unclear. Language style and word choice are ineffective and/or inappropriate. Only numeric values are reported for answers to questions.

Completeness

  • Excellent (3): All questions are answered correctly. Answers to questions demonstrate clear statistical understanding by comparing theoretical answers to simulated answers. When hypotheses are tested, classical methods are compared and contrasted to randomization methods. When confidence intervals are constructed, classical approaches are compared and contrasted with bootstrap procedures. The scope of inferential conclusions made is appropriate for the sampling method.
  • Competent (2): A question or two is incorrect or unanswered. Theoretical and simulated answers are computed but no discussion is present comparing and contrasting the results. When hypotheses are tested, results for classical and randomization methods are presented but are not compared and contrasted. When confidence intervals are constructed, classical and bootstrap approaches are computed but the results are not compared and contrasted. The scope of inferential conclusions made is appropriate for the sampling method.
  • Needs Work (1): More than two questions are incorrect or unanswered. Theoretical and simulated answers are not computed correctly. No comparison between classical and randomization approaches is present when testing hypotheses. When confidence intervals are constructed, there is no comparison between classical and bootstrap confidence intervals.
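
As an illustration of what the Reproducible and Completeness fields ask for, the sketch below computes a theoretical and a simulated answer and builds the reported sentence from the computed values. The problem itself (the probability of at least 8 successes in 10 trials with success probability 0.6), the seed, and all object names are assumptions made up for this example and do not come from any course assignment.

```r
# Hypothetical example problem (not from a course assignment):
# find P(X >= 8) for X ~ Binomial(n = 10, p = 0.6).
set.seed(123)    # fixed seed so the simulated answer can be reproduced
n <- 10          # number of trials (assumed)
p <- 0.6         # probability of success (assumed)
sims <- 10000    # number of simulated values (assumed)

# Theoretical answer: P(X >= 8) is the same as P(X > 7)
theoretical <- pbinom(7, size = n, prob = p, lower.tail = FALSE)

# Simulated answer: proportion of simulated values that are at least 8
x <- rbinom(sims, size = n, prob = p)
simulated <- mean(x >= 8)

# Report the answer from the computed objects rather than typed-in numbers
answer <- sprintf(
  "The theoretical probability is %.4f and the simulated probability is %.4f.",
  theoretical, simulated
)
print(answer)

# Record the R version and loaded packages at the end of the document
sessionInfo()
```

In a Quarto document, the same objects could be reported in the text with inline R code such as `r round(theoretical, 4)`, followed by a sentence or two comparing the theoretical and simulated values.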

Tiered Feedback Explanation

Level one. Problem Sets are graded using the rubric on the course pacing guide. The same rubric is used for all of the PS assignments, and you are graded on five categories, with 3, 2, 1, or 0 points awarded per category. Everyone who accepts a Problem Set will receive Level 1 feedback in their repository Issues.

Level two. If you cannot determine what you could do better on future assignments based on the rubric feedback, you can request annotated (Level 2) feedback. To request it, respond to me in the Issues (@alanarnholt) before noon on the Monday after you receive Level 1 feedback, which should arrive on Fridays.

I will provide Level 2 feedback using Issues in your repository to give additional details based on the rubric. Anyone may ask for Level 2 feedback. When you get your Level 2 feedback (by Tuesday morning), you are expected to act on it to improve your code, mark the issues as “resolved”, and message me in the Issues (@alanarnholt) before noon on Wednesday.

Level three. After you have received your Level 2 feedback, if you are still unclear about how to improve your work, you may request to meet with me during student help/office hours on Wednesday. There you can receive in-depth feedback and guidance on how to be more successful on the next assignment and how to resolve the Level 2 feedback/Git issues before noon on Thursday.

Asking for Level 2 feedback is an agreement between you and me: you will revise and resubmit your document by noon on Thursday, and I will look at your revisions and may revise your original rubric grade. If you ask for Level 2 feedback and do not revise your document by noon on Thursday, I may revise your original grade. After the Thursday following the Thursday when your PS is due, I will not review any further updates or corrections you push to your repository.


Week 1: (Aug 20 – 22)

We will walk through everything outlined below in class. If you want to complete the setup before class, that is fine.

Optional


Week 2: (Aug 27 – 29)

Optional


Week 3: (Sep 3 – 5)


Week 4: (Sep 10 – 12)

Optional

  • Read chapter 4 (Data Importing and “Tidy” Data) of MD — pgs 99-117

  • Read the Git and GitHub chapter from Hadley Wickham’s book R Packages

  • Brian Caffo’s take on R IDEs


Week 5: (Sep 17 – 19)

Optional


Week 6: (Sep 24 – 26)

Optional


Week 7: (Oct 1 – 3 – University Closed Due to Helene)

* Lecture Slides

* Probability

* Complete (The binomial distribution) in Foundations of Probability in R (DataCamp). Due NLT 5:00 pm Sep 30

* Complete (Laws of probability) in Foundations of Probability in R (DataCamp). Due NLT 5:00 pm Oct 1

* Complete (Bayesian statistics) in Foundations of Probability in R (DataCamp). Due NLT 5:00 pm Oct 2

Optional


Week 8: (Oct 8 – 10 – University Closed Due to Helene)


Fall Break: Oct 14 – 15


Week 9: (Oct 16 – 17)


Week 10: (Oct 22 – 24)

Optional


Week 11: (Oct 29 – 31)



Week 12: (Nov 5 – 7)


Week 13: (Nov 12 – 14)

Optional


Week 14: (Nov 19 – 21)

Optional


Thanksgiving Break (Nov 27 – 29)


Week 15: (Nov 26 & Dec 3)


Final Exam — Section -101 (9:30 am Class): Dec 10: 8:00 am - 10:30 am

Final Exam — Section -102 (11:00 am Class): Dec 5: 11:00 am - 1:30 pm


Last Updated on: Oct 03, 2024 at 07:41:36 AM