class: center, middle, inverse, title-slide .title[ # Using git ] .subtitle[ ## ⚔
Tales of Peril, Pain and Protection ] .author[ ### Julia Piaskowski ] .date[ ### 2022/11/08 ]

---
class: center, middle, inverse

## https://jpiaskowski.gitlab.io/talks/git-asa-cssa-sssa-2022/

---

# What Do I Use git & GitHub For?

.pull-left[

### 1. Share and access data

### 2. Collaborate

### 3. Build things

]

.pull-right[

[Image]

.right[[GitHub Octocat](https://octodex.github.com/)]

]

---

# Share & Access Data/Software/Demos

[Image]

.right[[*Tidy Tuesday datasets*](https://github.com/rfordatascience/tidytuesday)]

---

# Share & Access Data/Software/Demos

<img src="git_talk/hugging_face.png" width="85%" />

.right[[*Hugging Face Deep Learning Tools*](https://github.com/huggingface)]

---

# Fork (Copy) a Repository

<img src="git_talk/fork_example.png" width="95%" />

.right[[*Course Template*](https://github.com/jpiaskowski/sta210-s22-template)]

---

# Collaborate: File Issues

- Find out if someone else has experienced a problem similar to yours
- Notify developers of a software bug
- Request a new feature

[Image: github examples issues]

.right[[*Issue tab for r-lib/actions*](https://github.com/r-lib/actions/issues)]

---

# Collaborate With Others

<br>
<br>
<br>

> Collaboration is the most compelling reason to manage a project with Git and GitHub. My definition of collaboration includes hands-on participation by multiple people, including your past and future self, as well as an asymmetric model, in which some people are active makers and others only read or review.

.right[[J. Bryan, *PeerJ Preprints* (2017)](https://peerj.com/preprints/3159/)]

---

# git vs. GitHub

[Image]

---

# Version Control

Version control captures changes across files, recording the differences between versions and maintaining a linear history that can be rewound if needed.

<img src="git_talk/flower_growing.jpg" width="85%" />

---

# Why Might You Need This?

To clarify your data curation and analysis process for *research reproducibility*

* Files change over time
* Results may be shared
* Many people are contributing to a project

> ...reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis.
>
> *Reproducibility and Replicability in Science* (2018), National Academies of Sciences

.right[[*Reference*](https://www.nationalacademies.org/our-work/reproducibility-and-replicability-in-science)]

---

# Why Might You Need This?

You may already be implementing ad hoc version control.

[Image: ad hoc version control example]

---

# What It Could Look Like

[Image: schematic comparing version control]

---

# Version Control Advantages

* Sharing for asynchronous work - no more "data_final_KC_EP_JLP_BP.csv" or wondering if you have the most recent version of files
* Abundantly clear which version of a file is the central or main one
* Very advanced version of "track changes"
* Commits create safety points in case of disaster - you can recover a previous version
* Branching creates space for experimentation

---

# Generate a Web Presence

[Image: agstats.io screenshot]

.right[[agstats.io](https://agstats.io) | [GitHub](https://github.com/IdahoAgStats/stat-programs-website)]

---

# Post a Tutorial as an Online Book

[Image: example gitbook screenshot]

.right[[Happy Git with R](https://happygitwithr.com) | [GitHub](https://github.com/jennybc/happy-git-with-r)]

---

# Deploy Course Website

[Image: example course]

.right[[Statistical Rethinking](https://bayesf22-notebook.classes.andrewheiss.com/rethinking/) | [GitHub](https://github.com/andrewheiss/bayesf22-notebook)]

---

# Documentation for Software

<img src="git_talk/pkgdown_example.png" width="95%" />

.right[[*rusda R package*](https://docs.ropensci.org/rusda/) | [GitHub](https://github.com/ropensci/rusda/)]

---

# Automate Tasks

* Automate an action: `R CMD check` or `blogdown::build_site()`
* Run a GitHub Action to do CI/CD (continuous integration/continuous deployment)
* GitHub Actions can be challenging:

[Image: examples of failed workflows]

---

# What Git and GitHub Are *Not* Suitable For

* Making publication data sets available - this is not an appropriate long-term repository (just ask the National Academy of Sciences!)
* Tracking changes in binary files (.doc, .docx, .xls, .xlsx, .pdf)

There are special tools for working with large files (Git LFS, for example) - make sure you use them if your files are large!

---

# Things Can Go Wrong With Version Control

<img src="git_talk/rejected_badges.png" width="85%" />

.right[[*Snarky GitHub Badges*](https://github.com/Flet/rejected-github-profile-achievements)]

---

# Lesson 1: Take Learning git Seriously

.pull-left[

[Image]

]

.pull-right[

Decent resources for self-study:

* [Happy Git with R](https://happygitwithr.com/)
* [Software Carpentry git workshop](https://swcarpentry.github.io/git-novice/)
* [The book of git](https://git-scm.com/book/en/v2)

<br>
<br>
<br>

.right[[*xkcd cartoon*](https://xkcd.com/1597/)]

]

???

Those 2-hour workshops are a good way to get started!
But more training is needed. You will find yourself in a sticky situation that only you can resolve. Do you want to implement a mystery solution you found on Stack Overflow and risk data loss?

---

# Lesson 2: Use a git Client (a graphical user interface)

[Image]

---

# Some Graphical User Interfaces

.pull-left[

* GitHub Desktop
* GitKraken
* SourceTree
* ...and [tons more](https://git-scm.com/downloads/guis)

The goal is to become accustomed to using git regularly - use the tools that help you get there.

]

.pull-right[

[Image]

]

---

# Lesson 3: Be Patient When an Error Occurs

.pull-left[

When the inevitable error happens:

* a merge error
* can't pull or push!
* a `git revert` gone horribly wrong

Proceed **cautiously** and diagnose the problem

]

.pull-right[

[Image]

]

???

All of these can be untangled if you act wisely. It is also very possible to make things irreversibly worse!

---

# Lesson 4: Don't Expose Secrets

.pull-left[

* Learn about and use the `.gitignore` file
* Consider private repositories when appropriate
* Check the guidelines on legal compliance that apply to your data (e.g. [HIPAA](https://github.com/truevault/hipaa-compliance-developers-guide))

]

.pull-right[

[Image]

]

---

# git is a Humbling Experience

.center[ [Image] ]

*(I still think git is worth the trade-offs)*

.center[ [Image] ]

.right[[*original tweet*](https://twitter.com/HenryHoffman/status/694184106440200192)]
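
---

# Appendix: A `.gitignore` Sketch

Lesson 4 recommends learning the `.gitignore` file. The following is a minimal sketch rather than a complete recipe: it builds a throwaway repository in a temporary directory and checks that an ignored file never reaches the staging area. The file names `.env` and `analysis.R`, the `*.pem` pattern, and the token value are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real project is touched.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Hypothetical files: one holding a (fake) secret, one holding analysis code.
printf 'API_TOKEN=not-a-real-token\n' > .env
printf 'x <- 1\n' > analysis.R

# List patterns for files that git should never track.
printf '.env\n*.pem\n' > .gitignore

# Stage everything; git skips paths that match a .gitignore pattern.
git add .

# Show which rule ignores .env, then list what was actually staged.
git check-ignore -v .env
git ls-files
```

Note that `.gitignore` only applies to files git is not already tracking. A secret that was committed earlier stays in the history until it is deliberately removed, so the ignore rule is best added before the first commit.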