
Data organization

Organizing your data well saves you time, improves the reproducibility of your work, and makes your data easier to share.

How you organize your data depends on the specific characteristics of your research project. A well-organized project has a clear folder structure, carefully chosen file formats, and consistent file naming conventions.

Folder Structure 

Structuring your data files in folders makes it easier to locate and organize files and to keep track of different versions of files. A proper folder structure is especially important when collaborating with others. The folder structure also has a large impact on how your files can be processed and analyzed. Once your structure has been filled with data, changing it is laborious and time-consuming. Here are some tips:

  • Do not use your computer desktop as a storage place 
  • Make a folder hierarchy and use descriptive folder names 
  • Avoid folders that become too general; create more subfolders instead
  • Create a structure and follow it
  • Make the structure systematic, logical, and clear before you start (!)
  • Keep it quick and easy to navigate
  • Keep it simple enough to be used all the time
  • Make it scalable
  • Keep active and finished parts of your project separate
  • Periodically take the time to tidy up

You can find an example of a folder structure that is systematic, simple, and scalable on the CodeRefinery pages. You can find another example (that you can also download) here.
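If you want to start every project from the same skeleton, the folder hierarchy can be created with a few lines of Python. This is only a minimal sketch: the folder names in `PROJECT_LAYOUT` and the function `create_project` are hypothetical examples, not a recommended standard, and should be adapted to your own project.

```python
from pathlib import Path

# Hypothetical layout for illustration; adapt the names to your own project.
PROJECT_LAYOUT = [
    "data/raw",
    "data/processed",
    "docs",
    "scripts",
    "results/figures",
    "archive",  # finished parts, kept separate from active work
]


def create_project(root, layout=PROJECT_LAYOUT):
    """Create the folder hierarchy under `root` and return the folder paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_project("my_project")` builds the folders in the current working directory. Running it again does no harm, because existing folders are left untouched.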

File Formats  

Early in your research, you need to decide which formats to use for your data files. Consider this decision carefully, because an important part of your project’s metadata and documentation may be embedded in the data file itself. For example, when you take a picture with your mobile phone’s camera, the camera embeds metadata such as the date and location of the picture in the image file that it creates. This information can aid data analysis, documentation, and reusability.

To ensure that your data can be accessed well into the future, it is often a good idea to store your data (or copies of your data) in sustainable file formats. For example, plain text files (.txt) are more sustainable than Microsoft Word files (.docx) because their format is open, non-proprietary, and widely used.
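To get an overview of which files in a project might need a copy in a more sustainable format, you could list every file whose extension is not on a list you have chosen. The sketch below assumes such a list: `PREFERRED_EXTENSIONS` and the function `files_to_review` are hypothetical, and treating `.csv` as a preferred format is an assumption made for this example.

```python
from pathlib import Path

# Assumed, project-specific set of formats you have decided to treat as sustainable.
PREFERRED_EXTENSIONS = {".txt", ".csv"}


def files_to_review(folder, preferred=PREFERRED_EXTENSIONS):
    """Return files under `folder` whose extension is not in `preferred`."""
    flagged = []
    for path in sorted(Path(folder).rglob("*")):
        # Compare in lowercase so that "DATA.CSV" and "data.csv" are treated alike.
        if path.is_file() and path.suffix.lower() not in preferred:
            flagged.append(path)
    return flagged
```

The function only reports files; it does not convert or delete anything, so deciding what to do with each file stays with you.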

Here you will find information on sustainable file formats: 

File Naming Conventions 

A file naming convention is a set of rules that govern how files are named in your project. Using a file naming convention can help you save time when trying to find a specific file. It will also make your data easier to reuse and reproduce.  

Information that you might consider including in a file name: 

  • Dates or times that are relevant for the contents of the file 
  • Name of the project or experiment 
  • A version number for the file 
  • Short information on the file’s content 
  • Name or initials of a researcher 
  • Unique identifiers such as experiment number or number in a series 
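As an illustration, components like these can be assembled into a file name by a small helper. This is a minimal sketch, not a prescribed convention: the function `build_filename`, the order of the components, the underscore separator, the year-first date, and the two-digit version number are all choices made for this example.

```python
from datetime import date


def build_filename(project, description, day, version, extension, initials=None):
    """Join file name components with underscores, general information first."""
    parts = [project, day.strftime("%Y%m%d"), description]
    if initials:
        parts.append(initials)
    # Version number goes last, zero-padded to two digits (v01, v02, ...).
    parts.append(f"v{version:02d}")
    # Lowercase everything and replace spaces inside a component with hyphens.
    stem = "_".join(part.lower().replace(" ", "-") for part in parts)
    return f"{stem}.{extension.lower().lstrip('.')}"


name = build_filename("fjord survey", "water temperature", date(2023, 4, 12), 3, "csv", initials="AB")
# name == "fjord-survey_20230412_water-temperature_ab_v03.csv"
```

Generating names from one function keeps them consistent across a project, which is the main benefit of a naming convention.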

When naming your files consider the following best practices: 

  • Short and descriptive names 
  • General information first, then add details to the name 
  • Separate words with underscores or hyphens
  • Write dates year first (YYYYMMDD), following the ISO 8601 standard
  • Numbers (e.g. version or experiment numbers) should have the same number of digits, so that files sort in the correct order
    • use 01 and not 1 if numbers will reach 10 or more
    • use 0001 and not 1 if numbers will reach 1000 or more
  • Version number at the end 
  • Avoid using special characters  
    • #, %, &, \ , / , ‘ , “ , !, $ , > , < , { , } , * , ?, = 
  • DO NOT use spaces in file names
  • DO NOT start or end your filename with a space, period, hyphen, or underscore
  • Some operating systems (e.g. Linux) are case sensitive, while others (e.g. Windows, and macOS by default) are not; always use lowercase to avoid conflicts
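Several of the practices above can be checked automatically. The following is a minimal sketch of such a check; the function `check_filename` and its messages are hypothetical, it covers only the rules on spaces, special characters, leading and trailing characters, and lowercase, and it applies the start and end rule to the part of the name before the extension.

```python
import os

# Special characters listed above, in both straight and curly quote forms.
FORBIDDEN = set("#%&\\/'\"!$><{}*?=‘’“”")
EDGE_CHARACTERS = " .-_"


def check_filename(name):
    """Return a list of convention violations found in `name` (empty if none)."""
    problems = []
    stem, _extension = os.path.splitext(name)
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace")
    found = sorted(set(name) & FORBIDDEN)
    if found:
        problems.append("contains special characters: " + " ".join(found))
    if stem and (stem[0] in EDGE_CHARACTERS or stem[-1] in EDGE_CHARACTERS):
        problems.append("starts or ends with a space, period, hyphen, or underscore")
    if name != name.lower():
        problems.append("contains uppercase letters")
    return problems
```

For example, `check_filename("Final Report!.docx")` reports the whitespace, the exclamation mark, and the uppercase letters, while a name such as `fjord-survey_20230412_v03.csv` yields an empty list.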

Need advice?

Contact us at: research-data@uio.no

Published June 16, 2022 9:24 AM - Last modified Apr. 12, 2023 11:05 AM