3 Project Management
3.1 Teams and Work Cycles
At EHA projects are typically months-to-years-long workstreams centered around a main analytical or research problem. A program, grant or contract may have multiple projects, and a project may have multiple outputs such as reports, scientific publications, or apps. A project typically has a small (2-5 person team) with a project lead and possibly a project administrative point of contact (APOC).
We organize projects into work cycles of 4-8 weeks. For each cycle, a team should define day-to-week scale tasks, assign tasks to members, determine the percentage of time team members will put towards the project in that cycle, given other workloads, and plan travel, reporting, collaboration, or other deadlines.
Teams report out their progress at the end (and start) of each work cycle at our weekly M&A meetings. Report-outs should include
- Progress on tasks assigned and completed in previous cycle
- Substantive report-out of results and products
- Draft plan for tasks and goals for the coming cycle
- Team assignments for the next cycle and level of involvement (high (>50%), medium (25%-50%, low (<25%)) of team members over the cycle.
- Any additional deadlines or reporting anticipated in that time frame, including plans for other internal presentations or feedback sessions.
During report-outs, the M&A group will provide feedback for the upcoming cycle and set a date for the next one.
Teams track work cycle progress through various mechanisms based on team preferences. One option is GitHub Milestones (Example). Others use Google Spreadsheets, Air tables, or other systems. Teams may choose what they prefer as long as their system
- Shows current tasks, deadlines, and assignments Tracks past tasks, deadlines, and assignments
- Includes top-level summaries for a reporting period
- Is available in “real time” online rather than stored on individual machines and e-mailed
- Can be made accessible to other staff via a URL but kept private within EHA
note: historically teams had used Asana
3.2 Setup and materials Organization
An M&A project lasting more than one work cycle should typically have a Slack channel for communication, a GitHub repo for data and analysis code, or Dropbox or Google Drive folder for documents or materials not appropriate for git-based version control. In addition, it may have a Paperpile folder for references. In general, one URL (often the GitHub README) should be the starting point from which one can reach all project materials.
3.2.1 Code organization
In general, one should aim to set up the analysis portion of a project in a self-contained way, with clear separation between raw data, processed data, exploratory analyses, and final products. In organizing a project folder, ask
If I copied this whole folder onto someone else’s computer, could they pick up the project?
Are the folder organization and file naming clear?
There are some exceptions for large data sets or rapidly changing data sets. In these cases, data can be organized as a separate folder or project, and large data sets can be stored in an Amazon Web Services S3 bucket.
In many cases it is actually best for data to be organized as a separate resource from analysis. This allows multiple analysis projects to rely on the same upstream data project, avoiding multiple versions of data. Data may also not be best stored in a git repository but in a project database to be pulled for analyses.
See EHA guidance on setting up data resources for a project here
3.2.1.1 RStudio Projects
In general we also prefer that R analyses be set up as RStudio projects.
3.2.1.2 targets
We recommend that analysis projects be set up using the
targets
framework to
define steps in the code. targets
is a package for defining R project workflow
and tracks your functions and objects to ensure everything is up-to-date. Here
are some resources for getting started with targets
Internal presentations on using
targets
at EHA: (Videos (password protected on internal AirTable),The targets manual, including a walkthrough
A high-level overview paper outlining the concepts and benefits to workflow tools like
targets
.A good paper discussing how
targets
helps with research workflowsA good intro talk(~90 mins with questions) by Miles McBain on using targets and getting a good workflow
A great introductory video series (5 30-minute lessons as a YouTubePlaylist)
3.2.1.3 Package management with renv
We strongly recommend that projects use the renv
package to manage versions of
R packages so that the project does not break when packages update or when run
on machines that have different package versions installed.
- The
renv
getting started guide tells you most of what you need to know. - Here is a short presentation on using
renv
at EHA: (Video (via password at internal AirTable), Slides, Example Repository on GitHub)