Boring you think?
Until you face the daunting task of dealing with a project without documentation or properly commented code, and then it's full-blown panic 😨…
So hear me out: I've done my fair share of computational projects in biology and medicine, enough to give you some basics of good project management. Let's go!
- Properly Structure the Project – A good structure makes all the difference down the line, and if you find a consistent one across your projects, it becomes easy to know where each file is and in which folder. Here's my typical one:
- /Data – in which I often create two subfolders: /Data/rawdata and /Data/prepareddata. All the data of the project goes in this folder. If the project demands data preparation, it's OK to create the aforementioned subfolders, which lets you keep a copy of the raw data and fine-tune the data preparation pipeline over and over again.
- /Scripts or /Functions – all functions or reusable scripts should go in here.
- /Results or /Outputs – to save the results of the code.
- /Docs – to save documentation on procedures and the project.
- /logs – when running tools, such as those for multiple sequence alignment, logs are important for debugging errors. Save them in their own folder.
- Outside of all the folders should be your main or master script, which ties the whole pipeline together, and your README file with all the basic instructions to understand and run the project.
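
If you'd rather not create these folders by hand every time, here is a minimal Python sketch that builds the layout above. The function name `create_project` and the file names `README.md` and `main.py` are my own placeholders, not a standard, so rename them to match your habits.

```python
from pathlib import Path

# Folder names follow the layout described above; adjust to taste.
PROJECT_DIRS = [
    "Data/rawdata",
    "Data/prepareddata",
    "Scripts",
    "Results",
    "Docs",
    "logs",
]


def create_project(root):
    """Create the folder skeleton plus an empty README and main script."""
    root = Path(root)
    for sub in PROJECT_DIRS:
        # parents=True also creates the project root and /Data if missing.
        (root / sub).mkdir(parents=True, exist_ok=True)
    # README.md and main.py are hypothetical names for the top-level files.
    for name in ("README.md", "main.py"):
        (root / name).touch(exist_ok=True)
    return root
```

Calling `create_project("my_project")` is safe to repeat: existing folders and files are left untouched.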

- Use Version Control Tools – Keeping track of your code and analysis steps is crucial in any bioinformatics/computational biology project, where pipelines evolve and mistakes happen. Version control systems like Git help you record changes, backtrack when something breaks, and collaborate with others without overwriting each other’s work. Start by initializing a Git repository in your project folder, commit changes regularly with clear messages, and consider hosting your repository on platforms like GitHub, GitLab, or Bitbucket for easy sharing and backup. Even for solo projects, version control is an invaluable safety net. And it helps you build a profile of projects and pipelines you’ve built along the way!
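
As a rough sketch of that routine, here is one way to drive Git from a Python script. It assumes Git is installed and that your user name and email are already configured; `snapshot_commands` and `git_snapshot` are hypothetical helper names of mine, not part of Git.

```python
import shutil
import subprocess


def snapshot_commands(message):
    """Return the Git commands for an 'initialise, stage, commit' snapshot."""
    return [
        ["git", "init"],                   # safe to rerun in an existing repository
        ["git", "add", "--all"],           # stage new, changed, and deleted files
        ["git", "commit", "-m", message],  # record the snapshot with a clear message
    ]


def git_snapshot(project_dir, message):
    """Run the snapshot commands inside project_dir."""
    if shutil.which("git") is None:
        raise RuntimeError("git is not installed or not on PATH")
    for cmd in snapshot_commands(message):
        # check=True raises CalledProcessError if a command fails,
        # for example when there is nothing new to commit.
        subprocess.run(cmd, cwd=project_dir, check=True)
```

For example, `git_snapshot("my_project", "Add read trimming step")` would record the current state of the project with that message. Pushing to GitHub, GitLab, or Bitbucket is left out on purpose, since it depends on how your remote is set up.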
- Keep a Clear README File: A well-written README is the roadmap to your project: it tells others (and your future self) exactly what your project is about and how to navigate it. At a minimum, your README should explain the purpose of the project, describe the folder structure, list all software and dependencies (including version numbers), and provide simple instructions on how to run your analysis step-by-step. This single file can save countless hours of confusion, especially when you revisit the project months later or share it with collaborators. Treat it as your project's user manual.
- Make a Concise Header for each file – A concise and simple header for each file is crucial to keep it easy to understand. Below is a simple example you can use.
#***********************************************
# Project : <Project Name>
#
# Script name : <name of the file>
#
# Author: <name of the author>
#
# Date created: YYYYMMDD
#
# Summary: <short summary of what's on the file>
#
#
# Revision History: <add updates here>
#
# Date        Author      Num   Summary
# YYYYMMDD    <author>    1     Created
#
#**********************************************
- Use Clear and Consistent Naming Conventions: Good file names make it easy to understand your data at a glance and prevent mix-ups as your project grows. Use descriptive but concise names: for example, sample1_trimmed.fastq is far clearer than file1.fastq. Including version numbers or dates can help track different stages of your analysis. Avoid spaces in file names; instead, use underscores or hyphens to keep everything script-friendly and easy to read. A consistent naming system saves time and confusion for both you and anyone else working with your data.
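
One way to keep names consistent is to let a small helper build them for you. The sketch below is only an illustration: `make_filename` and its sample/step/date pattern are my own assumptions, so swap in whatever convention your lab uses.

```python
import re
from datetime import date


def make_filename(sample, step, ext, day=None):
    """Build names like 'sample1_trimmed_20240131.fastq' with no spaces."""
    day = day or date.today()
    parts = [sample, step, day.strftime("%Y%m%d")]
    # Replace runs of whitespace with underscores so names stay script-friendly.
    cleaned = [re.sub(r"\s+", "_", part.strip()) for part in parts]
    return "_".join(cleaned) + "." + ext.lstrip(".")


print(make_filename("sample1", "trimmed", "fastq", date(2024, 1, 31)))
# prints: sample1_trimmed_20240131.fastq
```

Putting the date last in YYYYMMDD form means files for the same sample and step sort chronologically in a directory listing.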
- Archive and Share Data Responsibly: Large bioinformatics datasets can quickly eat up storage and become difficult to manage. Compress intermediate or final results using tools like tar or gzip to save space and make transfers more efficient. When sharing data with collaborators or moving files between servers, create checksum files (such as with md5sum) to ensure that files remain intact and uncorrupted during transfer. For long-term storage and to make your work accessible to the scientific community, consider depositing data in trusted repositories like NCBI SRA, ENA, or Zenodo. Proper archiving and sharing help maintain data integrity and support open, reproducible science.
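
If you prefer to stay inside a script, the same compress-then-checksum routine can be sketched with Python's standard library instead of the tar and md5sum command-line tools. The helper names `archive_folder`, `md5_of`, and `write_checksum` are hypothetical, and the checksum file is written in the `<digest>  <file name>` layout that md5sum produces.

```python
import hashlib
import tarfile
from pathlib import Path


def archive_folder(folder, archive_path):
    """Pack a folder into a gzip-compressed tar archive."""
    folder = Path(folder)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name, not the full path, in the archive.
        tar.add(folder, arcname=folder.name)
    return Path(archive_path)


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(path):
    """Write '<digest>  <file name>' next to the file, as md5sum would."""
    path = Path(path)
    checksum_file = path.with_name(path.name + ".md5")
    checksum_file.write_text(f"{md5_of(path)}  {path.name}\n")
    return checksum_file
```

A typical use would be `write_checksum(archive_folder("Results", "Results.tar.gz"))`. After the transfer, recompute the digest on the receiving side and compare it with the one in the `.md5` file. Keep the archive outside the folder you are packing, otherwise it would try to include itself.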
- Document Each Step of Your Workflow: Don’t rely on memory — write things down as you go! For every step in your pipeline, keep track of what you did, why you did it, and which tool and parameters you used. You can document this in a simple Markdown file, a lab notebook, or directly as comments in your scripts. It’s also a good habit to save the terminal output or log files for each run; this helps with debugging when something goes wrong and gives you a clear audit trail for your results.
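
To make this habit automatic, you can wrap each tool call in a small function that records what was run, why, and what came out. This is just a sketch under my own assumptions: `run_and_log` is a hypothetical helper, and the log layout is one of many that would work.

```python
import shlex
import subprocess
from datetime import datetime
from pathlib import Path


def run_and_log(cmd, log_dir="logs", note=""):
    """Run a command and save the command line, a note, and its output."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    started = datetime.now()
    # capture_output collects stdout and stderr instead of printing them.
    result = subprocess.run(cmd, capture_output=True, text=True)
    # One log per run, named after the start time and the tool.
    log_path = log_dir / f"{started:%Y%m%d_%H%M%S}_{Path(cmd[0]).name}.log"
    log_path.write_text(
        f"command: {shlex.join(cmd)}\n"
        f"started: {started.isoformat(timespec='seconds')}\n"
        f"note: {note}\n"
        f"exit code: {result.returncode}\n"
        f"--- stdout ---\n{result.stdout}\n"
        f"--- stderr ---\n{result.stderr}"
    )
    return result.returncode, log_path
```

Pass the command as a list, for example `run_and_log(["gzip", "--version"], note="check gzip version")`, and use the `note` argument for the "why" that the command line alone won't tell you.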
- Write a Short Final Report: Once your project is finished (or ready to share), take the time to write a short summary. It doesn't need to be a full manuscript, just a clear, concise overview of what you did, what you found, and where to find all the relevant scripts, data, and results. This little report can be a lifesaver when you revisit the project months later, or when you want to turn your pipeline into a figure or table for publication.
✅ Wrapping Up
I know it might feel tedious at first, but good project organization and clear documentation are truly your best friends in bioinformatics and computational biology. They save you from countless headaches, prevent lost data and forgotten parameters, and make it so much easier to share your work with collaborators or your future self.
So next time you start a new project, invest a bit of time up front to structure it well, use version control, write helpful READMEs, keep your code tidy, and back up your data properly. Your future self will thank you — trust me!
💡 What about you?
I’d love to hear your favorite tricks or horror stories about messy projects. Drop a comment below and let’s help each other stay organized!