How I use github in 500 words

My feelings about github are somewhere at the nexus of aggravated and grateful.  Git is not easy because it is composed of a series of incantations based out of somewhat identifiable English.  Close enough that you think they have meaning but far enough that you want to bludgeon yourself in the hopes of forgetting semantics.  And yet, version control and backups are important to save your butt.

I shall confess (without apology) that I do not use command line git(hub). Can I? Sure. Do I want to try to remember those commands while I’m trying to keep my brain on my code? No. So the github app is my sweet spot: the power of version control without the anguish.

Here’s my use case for git: I write code, I want to track it, github works well for that, and I usually work alone.  All I need to do is create, commit, sync, and get back to my freaking job. I don’t do pull requests or contribute to shared projects. GET OFF MY LAWN.

Sorry, let’s get back to how to do commits and restore deleted files.

Step 1: Make a github account. Already have one? Go apply for the student developer pack if you haven’t already.

Step 2: Download the Github Desktop application: https://desktop.github.com/

Step 3: Go into the application, find preferences. Remember password. Sign in.

Step 4: Find the plus button thing. Click. Add will let you point the app at an existing folder and create a repo from that, Create makes a new one, and Clone is an incantation for the sociable. Click Create, give it a name, and click Create repository.


Step 5: That repository name is now a folder. Go find it. Or make a new one somewhere you can remember.

Step 6: Add some crap to that folder.


Step 7: Go back to the app and pour your eyeballs on your newly tracked file. The stuff on the right shows you the contents of the file. Green shows additions, red deletions. The little icon next to the repository name on the sidebar should be a computer monitor looking doodad, meaning that it is a local repo.

Step 8: Add a commit message (short & sweet) and description (maybe longer, with punctuation). Click Commit to master. To send it off to github, click Publish on the upper right. You can stack up a bunch of commits before publishing, if you want. The first time through it’ll ask you some stuff. Just click Publish again.
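For the curious (or the stranded without the app), here is a rough sketch of what the create-and-commit steps above look like as command line incantations. The folder name, file name, commit message, and identity are placeholders made up for illustration, and the publish lines are commented out because they need a real remote URL.

```shell
#!/bin/sh
set -e

# Hypothetical scratch location so nothing real gets touched.
repo="$(mktemp -d)/my-project"

# Create the repository folder.
git init --quiet "$repo"
cd "$repo"

# Add some crap to that folder.
echo "first draft" > notes.txt

# Stage the new file and commit it with a short message.
# The -c flags set a placeholder identity for this one command only.
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Add notes file"

# Publishing needs a remote, so it stays commented out here.
# The URL is a placeholder, and your branch may be "main" rather than "master":
# git remote add origin https://github.com/<you>/my-project.git
# git push -u origin master

# Show the one commit that now exists.
git --no-pager log --oneline
```

That is five things to remember instead of two buttons, which is more or less the whole argument for the app.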


Step 9: Go change some crap in your file and check back to the app. Make a less crappy commit message and click Commit to master. This time, click Sync when you’re ready to send it to github.

Step 10: Delete that file and check the app. GONE.

Click the Repository menu item and select Discard Changes to Selected Files.

Ya, everything goes away from the list of changes because THE FILE IS BACK.

Step 11: Go back to your research.

490 words.
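Postscript, not counted in the 490: the delete-and-restore trick above can also be sketched on the command line. This is a minimal example under assumptions, not what the app runs under the hood. The scratch folder, file name, and identity are invented placeholders.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository with one committed file.
repo="$(mktemp -d)/my-project"
git init --quiet "$repo"
cd "$repo"
echo "precious results" > notes.txt
git add notes.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Add notes file"

# Delete the file. GONE.
rm notes.txt
git status --short        # lists notes.txt as deleted

# Discard the change: bring the file back from the last commit.
# Newer versions of git also offer `git restore notes.txt`.
git checkout -- notes.txt

# THE FILE IS BACK.
cat notes.txt
```

This only rescues files that were committed at least once. Anything never committed is still gone for good.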


Review of: R for Everyone

Like so many people out there, I have been hacking and spitting my way through R.  I’ve made some awesome stuff, made the stats work, made some graphs, and written R Markdown notebooks that take 30 minutes to render (no, not because of for loops).  I feel comfortable saying that I am capable in R, but I’m still in the “incantation” phase of language understanding: I don’t really know why I’m doing [thing] but I know that [thing] will work because Stack Overflow told me so.


I remember this phase in Python, but after attending a week-long PyCamp and hanging out with the extraordinary people of Py-CU, I feel completely capable in Python.  I don’t know everything, but I understand every piece of syntax that I use and I’m comfortable diving into new topics.

The challenge of R is that so many of the materials and documentation are written for statisticians.  R is a statistical language, so this isn’t a bad thing, but it is a piece of context that seems to be lost on many of the R experts.  Please stop telling me “everything is a vector” because my soul dies a little more each time someone earnestly tells me that, as if it is helpful to the general public.  No.  It isn’t.

I don’t care that everything is a vector and no, I don’t want to explore the philosophical implications of that. I need to run some statistics and make a few charts.  I understand data types, variable names, and data processing.  I’ve got my data and I know my research question.  I just need to smash that into a script and I need to know how to do it in R.  In short, I needed an R book written for a developer from another language, or at least something good for the angry cynical crowd.

Cue a timely recommendation for R for Everyone (2014) by Jared P. Lander.  At this point in my gum-and-spit-based R career I’m pretty desperate for help.  The R Cookbook helped a little, but lacking much of the foundational R know-how means that even clear explanations of advanced concepts are still opaque.  I loaded up an ebook version from my library, skimmed the chapter on the apply() family and ordered it from Amazon with my fingers crossed.

The book strikes a great balance between knowing the language incredibly well and not giving us the hard sell on why R is the savior of our data souls. The examples are short, simple, and don’t try to clean up the messy output you’re used to in the interpreter.

Chapters 4 & 5 are the missing pages of my R life.  These cover the absolute basics of working within R, including data types and containers. These chapters need to be standard reading for everyone who complains about R.  The writing acknowledges R’s many syntax oddities rather than apologizing for them.

[Image: Why I can’t use library books on R]

Some chapters are perhaps overly detailed and would suit someone newer to programming (chapters 8-10 cover functions, control statements, and looping), while others attempt to cover such broad topics that they are more of a look book (chapter 7 on ggplot2). I was particularly happy with the pace until I hit chapter 11, where the plyr section went a little nuts.  Some syntax and packages are not explained, and a peek at some of the incorrect page numbers in the index makes me suspect that some editing and reorganizing happened without picking up the pieces.  But that doesn’t take away the ultimate value of this book.

The book seems to have three basic sections:  basic R programming, statistical tools, and advanced R programming topics.  The range of topics covered is ridiculously broad, and I think the book does a decent job of balancing the pace and level of detail.  Some chapters can be more of a vocabulary lesson than actual instruction, but this is a hallmark of a book where the chapters are meant to stand alone from the whole.  Those chapters tend to be the topic areas where further instruction would turn the book’s content into maths instruction rather than R instruction.  So I understand.

This book is not for basic statistics instruction, for teaching core programming fundamentals, or to serve as a singleton resource on R.

This book is a valuable supplement for a statistics course in R, an intermediate R user wanting to sample some advanced techniques, or a self-taught R user to fill in some blank spots.

Overall I would classify this book as exceptional for reference and supplement, but not as a textbook or something with problems for students to work through.

Nitpicks:

  • The narrative doesn’t clarify which packages are standard library versus external, and it often pulls in a package without noting which functions come from it.  Much of this has to do with the profoundly annoying namespace issues that R has, and being overly explicit about where functions come from is often necessary for R instruction.
  • The author names many people who work to teach and create R packages, which provides a nice peek into the development community, but sometimes it feels like unnecessary name dropping. Again, though, this is a nitpick.
  • Coverage of the [] subsetting method doesn’t seem to appear in the book. It is used, but never thoroughly explained.  I would have traded some of the longer sections on basic programming concepts for more discussion about subsetting data.
  • Additional tables with summary information would be extremely valuable, particularly in the chapters where specific tasks are covered. Examples:  cheatsheets on selecting columns for data frames, the apply family, and aggregation.