File management
University of Edinburgh
A Data Management Plan (DMP) covers data types and volume, capture, storage, integrity, confidentiality, retention and destruction, sharing and deposit.
A research compendium accompanies, enhances, or is a scientific publication providing data, code, and documentation for reproducing a scientific workflow.
A research compendium is a collection of all the digital parts of a research project, including data, code, and texts (protocols, reports, questionnaires, metadata). The collection is organised so that reproducing all results is straightforward.
Create one folder and make that the folder for your dissertation project.
In that folder, create folders for data/ and for scripts/ (and plots/, dissertation/, etc.).
In data/, have a raw/ and a derived/ folder:
Raw data (data whose loss would be very unfortunate, for example experiment data or manually annotated data) should be saved in data/raw/.
Derived data (data that is derived with scripts) should be saved in data/derived/.
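The folder layout described above can also be created with a short script. The following is only a minimal sketch in Python: the function name create_project and the root folder name in the usage note are hypothetical, and the list of subfolders simply mirrors the names mentioned above.

```python
from pathlib import Path

# Subfolders taken from the layout described above; adjust to your project.
SUBFOLDERS = ["data/raw", "data/derived", "scripts", "plots", "dissertation"]


def create_project(root):
    """Create the project folder and its subfolders; return the folder paths."""
    root = Path(root)
    created = []
    for sub in SUBFOLDERS:
        folder = root / sub
        # parents=True also creates root and data/; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_project("my-dissertation") would create the folders inside the current working directory. Existing folders and files are left untouched.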
Make sure you have a backup system in place.
Save copies of the entire folder on an external hard drive.
Save copies of the entire folder in an online storage service (iCloud Drive, OneDrive, Dropbox, Google Drive, …).
Use a version control system like git.
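As an illustration of the first option, a dated copy of the entire folder can be produced with Python's standard library. This is a sketch under assumptions, not a complete backup system: the function name backup_project and its parameters are hypothetical, and the backup folder is assumed to sit outside the project folder (for example on the external drive).

```python
import shutil
from datetime import date
from pathlib import Path


def backup_project(project_dir, backup_dir, today=None):
    """Zip the whole project folder into backup_dir with a date in the file name."""
    project_dir = Path(project_dir).resolve()
    backup_dir = Path(backup_dir).resolve()
    backup_dir.mkdir(parents=True, exist_ok=True)

    # ISO date (YYYY-MM-DD) so that archives sort chronologically by name.
    stamp = (today or date.today()).isoformat()
    base_name = backup_dir / f"{project_dir.name}-{stamp}"

    # make_archive appends ".zip" and returns the path of the archive it wrote.
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=project_dir.parent,
        base_dir=project_dir.name,
    )
    return Path(archive)
```

Running the function twice on the same day overwrites that day's archive, so at most one archive per day is kept.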
Be prepared to change how files and folders are organised after you start.
Projects evolve over time and sometimes you need to clean things up.
Use a good system to mark versions in your files. Two simple systems:
Dates: dissertation-2022-11-21, dissertation-2023-03-01.
Version numbers: dissertation-v1.0, dissertation-v1.1, dissertation-v2.0.
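Both naming systems are easy to generate with a script, which helps keep them consistent. The sketch below is illustrative only: the helper names dated_name and bump_version are hypothetical, and the version helper assumes names that end in -vMAJOR.MINOR, as in the examples above.

```python
from datetime import date


def dated_name(stem, day):
    """Return the stem followed by an ISO date, e.g. dissertation-2022-11-21."""
    return f"{stem}-{day.isoformat()}"


def bump_version(name, major=False):
    """Return the next version of a name that ends in -vMAJOR.MINOR."""
    stem, _, version = name.rpartition("-v")
    major_n, minor_n = (int(part) for part in version.split("."))
    if major:
        # A major bump resets the minor number to 0.
        return f"{stem}-v{major_n + 1}.0"
    return f"{stem}-v{major_n}.{minor_n + 1}"


print(dated_name("dissertation", date(2022, 11, 21)))  # dissertation-2022-11-21
print(bump_version("dissertation-v1.0"))               # dissertation-v1.1
print(bump_version("dissertation-v1.1", major=True))   # dissertation-v2.0
```

Dates written as year-month-day have the advantage that sorting files by name also sorts them by time.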
A license gives someone official permission to reuse something while protecting the intellectual property of the original creator.
Use open licenses to ensure the data/code can be used by other researchers.
The Creative Commons licenses are now common in research.
Discuss in small groups.
How have you organised your files so far?
Something you would like to change?
Something you would like to keep?