Learner Profiles

Ana: The Grad Student with a Growing Codebase

Ana is a second-year PhD student in computational chemistry. She writes Python scripts daily to process simulation data and generate plots. Her analysis/ folder has grown to dozens of files, and she frequently copies functions between projects. She has heard of pip install but has never created her own package. She wants to organize her code so that her labmates can use it without needing to clone her entire project directory.

Ana will benefit most from the episodes on modules, pyproject.toml with uv, and quality assurance. She needs to learn how to turn her collection of scripts into an installable package and add basic tests so her labmates can trust the code.

Raj: The Postdoc Who Needs Reproducibility

Raj is a postdoc in materials science. He has published two papers where reviewers asked for the analysis code, and each time he spent days getting scripts to run on a clean machine. His code depends on numpy, scipy, and a Fortran library he inherited from a former group member. He is comfortable with Python but has never written a pyproject.toml or used a build system.

Raj will benefit from the full lesson arc, especially the episodes on reproducible scripts, release engineering, and compiled extensions. He needs to learn how to declare dependencies, pin versions, and build binary wheels that include his Fortran code.

Sofia: The RSE Supporting a Research Group

Sofia is a Research Software Engineer embedded in a physics department. She maintains three internal Python packages and helps researchers publish their code. She is familiar with setuptools and pip but wants to modernize the group’s workflow. She has heard about uv and meson-python but has not tried them yet.

Sofia will use this lesson as a reference for migrating existing projects. The episodes on pyproject.toml with uv, quality assurance, release engineering, and compiled extensions are directly relevant to her daily work.

Liam: The Master’s Student Starting from Scratch

Liam is a first-year Master’s student in bioinformatics. He has taken one Python course and can write basic scripts with pandas. His advisor has asked him to “make his analysis reproducible,” and he is not sure where to start. He has never used the terminal beyond running python script.py.

Liam should start from the very beginning of the lesson. The episodes on writing reproducible Python and modules will give him the conceptual foundation he needs before moving on to packaging and tooling.

Elena: The PI Who Wants Better Group Practices

Elena is a tenure-track professor who leads a computational materials science group of eight students and postdocs. She has noticed that onboarding new members takes weeks because each project has its own ad-hoc setup, and published results are difficult to reproduce a year later. She can write Python but delegates most coding to her team.

Elena will not attend the full workshop but needs the instructor notes and setup materials to evaluate whether to adopt these practices group-wide. The episodes on quality assurance, release engineering, and CI/CD are most relevant to her goal of establishing reproducible lab standards.

Marco: The Domain Scientist with Legacy Fortran

Marco is a senior PhD student in theoretical chemistry. His group maintains a Fortran codebase (~15k lines) for electronic structure calculations that dates back to the early 2000s. He has written Python wrappers using ctypes and subprocess calls, but the build process is brittle and only works on the group’s cluster. He wants to make the code installable via pip so collaborators can use it.

Marco will benefit most from the episodes on compiled extensions and release engineering. The Meson build system and cibuildwheel pipeline will directly address his distribution challenge. The documentation and CI/CD episodes are also relevant for maintaining the long-term health of his project.