3 Project Management

3.1 Teams and Work Cycles

At EHA, projects are typically months-to-years-long workstreams centered on a main analytical or research problem. A program, grant, or contract may have multiple projects, and a project may have multiple outputs such as reports, scientific publications, or apps. A project typically has a small team (2-5 people) with a project lead and possibly a project administrative point of contact (APOC).

We organize projects into work cycles of 4-8 weeks. For each cycle, a team should define day-to-week scale tasks, assign tasks to members, determine the percentage of time team members will put towards the project in that cycle, given other workloads, and plan travel, reporting, collaboration, or other deadlines.

Teams report out their progress at the end (and start) of each work cycle at our weekly M&A meetings. Report-outs should include:

  • Progress on tasks assigned and completed in previous cycle
  • Substantive report-out of results and products
  • Draft plan for tasks and goals for the coming cycle
  • Team assignments for the next cycle and each team member's level of involvement over the cycle: high (>50%), medium (25%-50%), or low (<25%).
  • Any additional deadlines or reporting anticipated in that time frame, including plans for other internal presentations or feedback sessions.

During report-outs, the M&A group will provide feedback for the upcoming cycle and set a date for the next one.

Teams track work cycle progress through various mechanisms based on team preferences. One option is GitHub Milestones (Example). Others use Google Sheets, Airtable, or other systems. Teams may choose what they prefer as long as their system:

  • Shows current tasks, deadlines, and assignments
  • Tracks past tasks, deadlines, and assignments
  • Includes top-level summaries for a reporting period
  • Is available in “real time” online rather than stored on individual machines and e-mailed
  • Can be made accessible to other staff via a URL but kept private within EHA

Note: Historically, teams used Asana.

3.2 Setup and Materials Organization

An M&A project lasting more than one work cycle should typically have a Slack channel for communication, a GitHub repo for data and analysis code, and a Dropbox or Google Drive folder for documents or materials not appropriate for git-based version control. In addition, it may have a Paperpile folder for references. In general, one URL (often the GitHub README) should be the starting point from which one can reach all project materials.

3.2.1 Code organization

In general, one should aim to set up the analysis portion of a project in a self-contained way, with clear separation between raw data, processed data, exploratory analyses, and final products. In organizing a project folder, ask:

  • If I copied this whole folder onto someone else’s computer, could they pick up the project?

  • Are the folder organization and file naming clear?

There are some exceptions for large or rapidly changing data sets. In these cases, data can be organized as a separate folder or project, and large data sets can be stored in an Amazon Web Services S3 bucket.

In many cases it is actually best for data to be organized as a separate resource from analysis. This allows multiple analysis projects to rely on the same upstream data project, avoiding multiple versions of data. Data may also be better stored in a project database, to be pulled for analyses, than in a git repository.

See EHA guidance on setting up data resources for a project here
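
As an illustration of the separation between raw data, processed data, exploratory analyses, and final products described in this section, the following R sketch creates one possible folder skeleton. The folder names are hypothetical examples, not an EHA standard; adapt them to the project.

```r
# Hypothetical folder layout; the names are illustrative, not an EHA convention.
project_dirs <- c(
  "data-raw",     # raw data, never edited by hand
  "data",         # processed data produced from data-raw by code
  "R",            # functions used by the analysis
  "exploratory",  # scratch and exploratory analyses
  "outputs"       # final products: figures, reports, tables
)

# Create each folder relative to the project root (the working directory).
for (d in project_dirs) {
  dir.create(d, showWarnings = FALSE, recursive = TRUE)
}

# A top-level README gives collaborators a single starting point.
if (!file.exists("README.md")) {
  writeLines("# Project title", "README.md")
}
```

Keeping every path relative to the project root is what makes the "copy the whole folder to someone else's computer" test pass.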

3.2.1.1 RStudio Projects

In general, we also prefer that R analyses be set up as RStudio projects.

3.2.1.2 targets

We recommend that analysis projects be set up using the targets framework to define steps in the code. targets is an R package for defining a project workflow. It tracks your functions and objects so that everything stays up to date and steps whose inputs have not changed are not rerun. Here are some resources for getting started with targets
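
To make the idea of "steps" concrete, here is a minimal sketch of a targets pipeline. It assumes a hypothetical raw file data-raw/cases.csv with columns named region and cases; the file, the column names, the helper function, and the target names are all illustrative assumptions. The pipeline is defined in a file named _targets.R at the project root:

```r
# _targets.R: pipeline definition read by targets::tar_make()
library(targets)

# Hypothetical helper; in a real project, functions usually live in R/.
summarize_cases <- function(case_data) {
  # Total cases per region (assumes columns `region` and `cases`).
  aggregate(cases ~ region, data = case_data, FUN = sum)
}

# The script must end with a list of targets.
list(
  # Track the raw file itself so that edits to it invalidate later steps.
  tar_target(raw_file, "data-raw/cases.csv", format = "file"),
  # Read the raw data.
  tar_target(case_data, read.csv(raw_file)),
  # Summarize the data using the helper above.
  tar_target(cases_by_region, summarize_cases(case_data))
)
```

Running targets::tar_make() in the R console builds the pipeline and skips any target that is already up to date. targets::tar_read(cases_by_region) then returns the stored result.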

3.2.1.3 Package management with renv

We strongly recommend that projects use the renv package to manage versions of R packages, so that the project does not break when packages update or when it is run on machines that have different package versions installed.
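
A typical renv workflow uses three calls, sketched below. They are run interactively in the R console at different points in a project's life, not as a single script.

```r
# One-time setup in a new project: creates a project-specific package
# library and a lockfile (renv.lock) recording package versions.
renv::init()

# After installing or updating packages, record the versions now in use.
renv::snapshot()

# On another machine, or after pulling changes, install the versions
# recorded in renv.lock.
renv::restore()
```

Committing renv.lock to the project's git repository lets collaborators reproduce the same package versions with renv::restore().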