24 Training Resources and Plans

This section is under revision as we consider alternate resources.

If you are planning on spending significant time improving your data science and modeling skills, you will want to create a training plan.

24.1 Training plan components

  • A description of your goals for the training plan and the EHA projects and activities to which you will apply your skills
  • A list of the courses/tutorials you plan to complete
  • The total time the courses will take to complete
  • The time frame over which you expect to complete them
  • The name of a peer learner. You should have a peer learner at EcoHealth who will be a partner over the course of your training. This may be someone working on a similar training plan or someone with knowledge of the material already. They should play some or all of these roles:
    • Accountability: Your peer learner should know about your training plan and its time frame, and check in on how you are doing.
    • Co-learning: You and your peer learner may want to schedule times to watch course videos and complete exercises together.
    • Review: Especially for materials without automated feedback, your peer learner should be able to look at your work and provide feedback.
    • Motivation: Your peer learner should make your training fun and tell you that you rock.

When your supervisor signs off on your training plan, contact Megan Walsh, who will provide any subscriptions you need for the period you need them.

24.2 Training Plans

These are some suggestions for assembling resources and courses into training plans. They are a small fraction of the many learning and teaching resources available. Consult your peers, supervisor, and the #data-sci-discuss Slack channel to find courses or resources on the topics you require. If you use a new resource or course, please add it to this page so others can learn from your impressions of it!

24.2.1 Introductory Programming materials

  • Hands-On Programming with R: An introduction to R for non-programmers with a focus on project-based learning.

  • Introduction to R: This introduction is designed to get you familiar with R quickly. It covers the basics and explains how to work with common data types in R.

  • Eloquent JavaScript: A project-based book that will take you from the basics to creating websites. For R users, the chapter on data structures is especially helpful for understanding JSON and jsonlite.

24.2.2 Better Managing Data

24.2.3 Version Control and Git

24.2.4 Reproducible reporting

24.2.5 Improving Your Statistical Fundamentals

24.2.6 Improving your data visualization

  • Fundamentals of Data Visualization by Claus Wilke is an excellent guide to making high-quality figures, focusing more on design than on the mechanics of programming. R code is available for all of its examples. If you feel you have a solid grasp of ggplot2 but want to improve the quality of your figures, we recommend reading this e-book and using the accompanying code in its GitHub repository to reproduce the figures.

R Programming

  • Advanced R, Functions chapter: “In this chapter, you’ll learn how to turn informal, working knowledge [of functions] into more rigorous, theoretical understanding.”

24.2.7 Map-making and geospatial analysis in R

  • Geocomputation with R is a comprehensive guide for understanding geographic data, mapping, and conducting spatial analysis in R. The most relevant chapters for your purposes are likely 1-8 and 10-11. A chapter might take you 1-3 hours to work through, depending on how in depth you want to get and the number of exercises that you complete.

  • Data Carpentry has a course on using R for spatial data. Like other *Carpentry lessons, it's designed as a workshop lesson plan but can be self-taught. It presumes very little R knowledge and covers basics such as setting up a project in RStudio. This is a good starting point for people with little R experience, as it gets them making maps right away. If you just want to get a quick feel for R spatial data types, jump into Chapter 3.

  • Making Maps with R is a quick-start guide to mapping with ggplot2. It also introduces the ggmap, maps, and mapdata packages for providing basemaps on which to overlay your spatial data. It is good for getting a map together quickly, but if you will be making maps regularly, we suggest the resources above, which give you a better foundation in geographic data.

  • Leaflet for R is a manual on using the R leaflet package to harness Leaflet, an open-source JavaScript library for creating interactive maps. Leaflet maps are particularly useful for exploring and visualizing spatial data, and are easily embedded into R Markdown documents. You should take a course on R Markdown, or already be familiar with it, before working through this manual.

24.2.8 Bioinformatics

  • Conceptual and practical introduction to some of the main topics in Bioinformatics.

24.3 Metagenomics