class: center, middle, inverse, title-slide

.title[
# Using git
]
.subtitle[
## ⚔
Tales of Peril, Pain and Protection
]
.author[
### Julia Piaskowski
]
.date[
### 2022/11/08
]

---
class: center, middle, inverse

## https://jpiaskowski.gitlab.io/talks/git-asa-cssa-sssa-2022/

---

# What Do I Use git & GitHub For?

.pull-left[
### 1. Share and access data
### 2. Collaborate
### 3. Build things
]

.pull-right[
![](git_talk/octocat_original.png)
.right[[GitHub Octocat](https://octodex.github.com/)]
]

---

# Share & Access Data/Software/Demos

![](git_talk/tidy_tuesday.png)

.right[[*Tidy Tuesday datasets*](https://github.com/rfordatascience/tidytuesday)]

---

# Share & Access Data/Software/Demos

<img src="git_talk/hugging_face.png" width="85%" />

.right[[*Hugging Face Deep Learning Tools*](https://github.com/huggingface)]

---

# Fork (Copy) a Repository

<img src="git_talk/fork_example.png" width="95%" />

.right[[*Course Template*](https://github.com/jpiaskowski/sta210-s22-template)]

---

# Collaborate: File Issues

- Find out if someone else has experienced a problem similar to yours
- Notify developers of a software bug
- Request a new feature

!["github examples issues"](git_talk/r-lib_issues.png)

.right[[*Issue tab for r-lib/actions*](https://github.com/r-lib/actions/issues)]

---

# Collaborate With Others

<br>
<br>
<br>

> Collaboration is the most compelling reason to manage a project with Git and GitHub. My definition of collaboration includes hands-on participation by multiple people, including your past and future self, as well as an asymmetric model, in which some people are active makers and others only read or review.

.right[[J. Bryan, *PeerJ Preprints* (2017)](https://peerj.com/preprints/3159/)]

---

# git vs. GitHub

![](git_talk/git_github_schematic.jpg)

---

# Version Control

Version control captures changes across files, describing the difference between each version and maintaining a linear history that can be rewound if needed.

<img src="git_talk/flower_growing.jpg" width="85%" />

---

# Why Might You Need This?
To clarify your data curation and analysis process for *research reproducibility*

* Files change over time
* Results may be shared
* Many people are contributing to a project

> ...reproducibility is obtaining consistent results using the same input data; computational steps, methods, and code; and conditions of analysis.
>
> - Reproducibility and Replicability in Science (2018) National Academies of Sciences

.right[[*Reference*](https://www.nationalacademies.org/our-work/reproducibility-and-replicability-in-science)]

---

# Why Might You Need This?

You may already be implementing ad hoc version control.

!["ad hoc version control example"](git_talk/file_overload.png)

---

# What It Could Look Like

!["schematic comparing version control"](git_talk/version_control_contrast.png)

---

# Version Control Advantages

* Sharing for asynchronous work - no more "data_final_KC_EP_JLP_BP.csv" or wondering if you have the most recent version of files
* Abundantly clear which version of each file is the central or main one
* A very advanced version of "track changes"
* Commits create safety points in case of disaster - you can recover a previous version
* Branching creates space for experimentation

---

# Generate a Web Presence

!["agstats.io screenshot"](git_talk/agstat_website.png)

.right[[agstats.io](agstats.io) | [GitHub](https://github.com/IdahoAgStats/stat-programs-website)]

---

# Post a Tutorial as an Online Book

!["example gitbook screenshot"](git_talk/example_gitbook.png)

.right[[Happy Git with R](https://happygitwithr.com) | [GitHub](https://github.com/jennybc/happy-git-with-r)]

---

# Deploy a Course Website

!["example course"](git_talk/example_course.png)

.right[[Statistical Rethinking](https://bayesf22-notebook.classes.andrewheiss.com/rethinking/) | [GitHub](https://github.com/andrewheiss/bayesf22-notebook)]

---

# Documentation for Software

<img src="git_talk/pkgdown_example.png" width="95%" />

.right[[*rusda R package*](https://docs.ropensci.org/rusda/) |
[GitHub](https://github.com/ropensci/rusda/)]

---

# Automate Tasks

* Automate actions such as `R CMD check` or `blogdown::build_site()`
* Run a GitHub Action to do CI/CD (continuous integration/continuous deployment)
* GitHub Actions can be challenging:

!["examples of failed workflows"](git_talk/workflow_failures.png)

---

# What Git and GitHub Are *Not* Suitable For

* Making publication data sets available - this is not an appropriate long-term repository (just ask the National Academy of Sciences!)
* Tracking changes in binary files (.doc, .docx, .xls, .xlsx, .pdf)

There are special tools for working with large files - make sure you use them if your files are large!

---

# Things Can Go Wrong With Version Control

<img src="git_talk/rejected_badges.png" width="85%" />

.right[[*Snarky GitHub Badges*](https://github.com/Flet/rejected-github-profile-achievements)]

---

# Lesson 1: Take Learning git Seriously

.pull-left[![](git_talk/xkcd_git.png)]

.pull-right[
Decent resources for self-study:

* [Happy Git with R](https://happygitwithr.com/)
* [Software Carpentry git workshop](https://swcarpentry.github.io/git-novice/)
* [The book of git (*Pro Git*)](https://git-scm.com/book/en/v2)

<br>
<br>
<br>

.right[[*xkcd cartoon*](https://xkcd.com/1597/)]
]

???

Those 2-hour workshops are a good way to get started, but more training is needed. You will find yourself in a sticky situation that only you can resolve. Do you want to implement a mystery solution you found on Stack Overflow and risk data loss?

---

# Lesson 2: Use a git Client (A Graphical User Interface)

![](git_talk/git_kraken.png)

---

# Some Graphical User Interfaces

.pull-left[
* GitHub Desktop
* GitKraken
* SourceTree
* ...and [tons more](https://git-scm.com/downloads/guis)

The goal is to become accustomed to using git regularly - use the tools that help you get there.
]

.pull-right[
![](git_talk/rstudio_git.png)
]

---

# Lesson 3: Be Patient When an Error Occurs

.pull-left[
When the inevitable error happens:

* a merge error
* can't pull or push!
* a `git revert` gone horribly wrong

Proceed **cautiously** and diagnose the problem.
]

.pull-right[
![](git_talk/waldocat.png)
]

???

Everything can be untangled if you act wisely. It is also very possible to make things irreversibly worse!

---

# Lesson 4: Don't Expose Secrets

.pull-left[
* Learn about and use the `.gitignore` file
* Consider private repositories when appropriate
* There are guidelines for legal compliance (e.g. [HIPAA](https://github.com/truevault/hipaa-compliance-developers-guide))
]

.pull-right[
![](git_talk/saint-nicktocat.jpg)
]

---

# git is a Humbling Experience

.center[![](git_talk/git_disaster.png)]

*(I still think git is worth the trade-offs)*

.center[![](git_talk/guitar_hero.png)]

.right[[*original tweet*](https://twitter.com/HenryHoffman/status/694184106440200192)]