Advice for helping coding beginners in user groups: smile, don’t brain dump, and remember to shut up

I was recently asked to be a guest speaker talking to a Makerspaces class about teaching and supporting programming in the makerspace context.  After writing up my notes and presenting, I realized that I have learned a lot in my three years with the Champaign-Urbana Python User Group (Py-CU).  This blog post contains my general framework for making recommendations to and supporting beginners who come to our Python meetup.  Each space and group has a unique vibe, so I don’t expect this to be a universal framework.

Py-CU offers a weekly hack/project work time, open to all members of the community.  Because it is designed to be a simple work time for people, we get a diverse crowd of regular, semi-regular, and one-off attendees of all skill levels.  These include absolute beginners, students and faculty from UIUC, professional developers, and everything in between.

Before I start digging into the recommendations, I want to break down some of the important characteristics often found in our attendees.  Again, this doesn’t describe everyone, but most of our attendees would identify with at least one of these factors.

Py-CU is a subgroup of Makerspace Urbana, which (as makerspaces are wont to do) is located in an awkward space, in an awkward building, in an awkward part of town.  Thus, the people who successfully do the research and make the trek down are there for a reason.  People don’t usually show up at a user group because all their needs are met and they are completely satisfied with their use of that product.  This is, in fact, why user groups generally exist.  There’s a hidden word in “user group”, which should really unpack to “user support group.”  Never forget this.

So, people show up at Py-CU because something has gone wrong in their use or learning experience with Python.   Sometimes this means they:

  • can’t find the right resource or information
  • can’t determine which resource is right for them
  • can’t get that resource to work or otherwise make sense of it
  • have questions they can’t get answered or don’t understand the answers they’ve received (insert the Stack Overflow stink eye)
  • are doing this outside of their job and/or coursework and want a dedicated time and place to learn
  • just want to be around other people while they get stuff done

Each of these is a need that user groups and makerspaces can and should satisfy at some level.  Some groups are focused more on talks and networking, but our group is focused more on getting work done and providing a community experience around this work.  This keeps the pressure off the organizers to track down speakers and deal with event planning, which means that our organizational efforts are dedicated to the attendee experience and providing support.  We organizers can also get our own work done during these hours, which makes our weekly time commitment much more sustainable.

Now that we’ve set the stage, we can talk about supporting newcomers who need help.  I’m going to particularly focus on someone who shows up to a group, declares themselves to be new to Python (or whichever language/platform/etc. you are there for), and requests assistance.

I am a librarian and social scientist by training, so I approach these interactions as if it were a patron coming up to a reference desk.  I need to offer support, empathy, and guidance.  Most of all:  I am the voice of the community they are visiting.  This is even more so with programming:  to some beginners I am not just the voice of the Python community, I am the voice of the entire programming community.

There is no one true learning resource out there, as there is no one true programming style or purpose in life.  This means you need to track down the right resource for that person’s need (https://en.wikipedia.org/wiki/Five_laws_of_library_science).  Here are the core questions I need them to answer at some level:

  1. What is your educational background?
    • Don’t name any specific domain. Leave this open ended such that the answer cannot be “no.”
  2. What programming have you done before? And it’s ok if not!
    • Be sure to keep your tone supportive for those who may need to say that they’ve never programmed before. Asking this in a non-binary way will avoid some, but not all, of people’s misclassification of their experience level.  More discussion on this later.
  3. What do you want to do with programming? or Why do you want to learn that topic?
    • Again, tone. Keep it positive and welcoming.

These questions are the framework for a conversation with this person, so asking clarifying questions and providing encouragement along the way is also important.  Before we start going into which resources to recommend, etc., here are some things to keep in mind as you’re having this conversation:

  • Do not presume any level of math education, interest, or comfort.
  • Do not presume complete inexperience for someone who says that they are a beginner. Some people will discount courses they took in high school or even college as being valueless, when that experience will totally change the resource I send them to.  I see this mostly occurring with women, and some even describe themselves as beginners when they’ve taken several programming classes.
  • Understand that there’s a difference between being a beginner to Python and a beginner to programming. Try to tease out which they are when they just say “beginner.”
  • Be explicit about welcoming beginners of all kinds. If you think you’ve added that phrasing in enough places, you can find more.  You will need to constantly repeat yourself, encourage, and be as noisy as possible about welcoming beginners.  Even with that, be prepared to field direct communication asking if it’s okay for them to come. No matter how many times you have it on your website, there are levels of imposter syndrome that make the personal permission the only one that sticks.

When it comes to recommending resources, much of my approach is influenced by my perspective that programming, at its core, is an information problem.  So I’m super obsessed with finding people the right learning material.  Learners of programming generally need four kinds of material: instructional narrative to explain concepts, practice to tinker with those concepts, reference to look up concepts/tools, and a safe place/person to ask questions.

As my final piece of advice, here are some general guidelines to consider when giving people advice or making recommendations:

  • “Just read the docs” is never the right answer for a programming beginner.
  • Recommend no more than 3 – 5 resources, and hopefully between them they will cover all the requirements listed above.  Avoid brain dumping every book you’ve ever seen about Python on them.  Pick out a few for them to investigate.  Remember that they’ve sought out experts for their expert advice, not a human Google search for “python”.
  • Remind them that it is okay to do some window shopping of materials, and that it is okay to reject a book after a few chapters if they aren’t feeling it.
  • Keep your IDE and command line wars at home. Get them set up with a development environment that is right for them.
  • Finally, once you’ve given them links and answered their questions, remember to stop talking and give them time to read and explore the resources you recommended. Leave them alone in this process (do not make small talk), and invite them to let you know when they’d like more help.  Don’t sit and stare at them practicing or provide unrequested commentary.

This post isn’t meant to fully unpack the selection of resources with an information request, but hopefully can serve as a place to start formalizing your own thoughts on welcoming beginners.  Remember: your community/group will have different needs, so you’ll need to change some of this to fit their needs.

How I use github in 500 words

My feelings about github are somewhere at the nexus of aggravated and grateful.  Git is not easy because it is composed of a series of incantations based out of somewhat identifiable English.  Close enough that you think they have meaning but far enough that you want to bludgeon yourself in the hopes of forgetting semantics.  And yet, version control and backups are important to save your butt.

I shall confess (without apology), that I do not use command line git(hub). Can I? Sure. Do I want to try and remember those commands while I’m trying to keep my brain on my code? No. So the github app is where I find the sweet spot between extracting the power of version control without the anguish.

Here’s my use case for git: I write code, I want to track it, github works well for that, and I usually work alone.  All I need to do is create, commit, sync, and get back to my freaking job. I don’t do pull requests or contribute to shared projects. GET OFF MY LAWN.

Sorry, let’s get back to how to do commits and restore deleted files.

Step 1: Make a github account. Already have one? Go apply for the student developer pack if you haven’t already.

Step 2: Download the Github Desktop application: https://desktop.github.com/

Step 3: Go into the application, find preferences. Remember password. Sign in.

Step 4: Find the plus button thing. Click. Add will let you point the app at an existing folder and create a repo from that, Create makes a new one, and Clone is an incantation for the sociable. Click Create, give it a name, and click Create repository.

Step 5: That repository name is now a folder. Go find it. Or make a new one somewhere you can remember.

Step 6: Add some crap to that folder.

Step 7: Go back to the app and pour your eyeballs on your newly tracked file. The stuff on the right shows you the contents of the file. Green shows additions, red deletions. The little icon next to the repository name on the sidebar should be a computer monitor looking doodad, meaning that it is a local repo.

Step 8: Add a commit message (short & sweet) and description (maybe longer, with punctuation). Click Commit to Master. To send off to github, click Publish on the upper right. You can stack up a bunch of commits before publishing, if you want.  The first time through it’ll ask you some stuff. Just click the publish again.

Step 9:  Go change some crap in your file and check back to the app. Make a less crappy commit message and click Commit to master. Then, this time, click Sync when you’re ready to send it to github.

Step 10: Delete that file and check the app. GONE.

Click the Repository menu item and select Discard Changes to Selected Files.

Ya, the pending changes go away from the app because THE FILE IS BACK.

Step 11: Go back to your research.

490 words.

Review of: R for Everyone

Like so many people out there, I have been hacking and spitting my way through R.  I’ve made some awesome stuff, made the stats work, made some graphs, and written R Markdown notebooks that take 30 minutes to render (no, not because of for loops).  I feel comfortable saying that I am capable in R, but I’m still in the “incantation” phase of language understanding: I don’t really know why I’m doing [thing] but I know that [thing] will work because Stack Overflow told me so.

I remember this phase in Python, but after attending a week-long PyCamp and hanging out with the extraordinary people of Py-CU, I feel completely capable in Python.  I don’t know everything, but I understand every piece of syntax that I use and I’m comfortable diving into new topics.

The challenge of R is that so many of the materials and documentation are written for statisticians.  R is a statistical language, so this isn’t a bad thing, but it is a piece of context that seems to be lost on many of the R experts.  Please stop telling me “everything is a vector” because my soul dies a little more each time someone earnestly tells me that, as if it is helpful to the general public.  No.  It isn’t.

I don’t care that everything is a vector and no, I don’t want to explore the philosophical implications of that. I need to run some statistics and make a few charts.  I understand data types, variable names, and data processing.  I’ve got my data and I know my research question.  I just need to smash that into a script and I need to know how to do it in R.  In short, I needed an R book written for a developer from another language, or at least something good for the angry cynical crowd.

Cue a timely recommendation for R for Everyone (2014) by Jared P. Lander.  At this point in my gum-and-spit-based R career I’m pretty desperate for help.  The R Cookbook helped a little, but lacking much of the foundational R know-how means that even clear explanations of advanced concepts are still opaque.  I loaded up an ebook version from my library, skimmed the chapter on the apply() family and ordered it from Amazon with my fingers crossed.

The book strikes a great balance between knowing the language incredibly well and not giving us the hard sell on why R is the savior of our data souls.  The examples are short, simple, and don’t try to clean up the messy output you’re used to in the interpreter.

Chapters 4 & 5 are the missing pages of my R life.  These cover the absolute basics of working within R, including data types and containers. These chapters need to be standard reading for everyone who complains about R.  The writing perspective highlights the variety of syntax oddities with acknowledgment of them rather than apology.

[Image: Why I can’t use library books on R]

Some chapters are perhaps overly detailed and would suit someone newer to programming (chapters 8-10 cover functions, control statements, and looping), while others attempt to cover such broad topics that they are more of a look book (chapter 7 on ggplot2). I was particularly happy with the pace until I hit chapter 11, where the plyr section went a little nuts.  Some syntax and packages are not explained, and a peek at some of the incorrect index page numbers makes me suspect that some editing and reorganizing happened without picking up the pieces.  But that doesn’t take away the ultimate value of this book.

The book seems to have three basic sections:  basic R programming, statistical tools, and advanced R programming topics.  The covered range of topics is ridiculously broad, and I think the book does a decent job of balancing the pace and level of detail.  Some chapters can be more of a vocabulary lesson than instruction, but this is a hallmark of a book where the chapters are meant to stand alone from the whole.  Those chapters tend to be the topic areas where further instruction would turn the book’s content into maths instruction rather than R instruction.  So I understand.

This book is not for basic statistics instruction, for teaching core programming fundamentals, or to serve as a singleton resource on R.

This book is a valuable supplement for a statistics course in R, an intermediate R user wanting to sample some advanced techniques, or a self-taught R user to fill in some blank spots.

Overall I would classify this book as exceptional for reference and supplement, but not as a textbook or something with problems for students to work through.

Nitpicks:

  • The narrative doesn’t clarify which packages are standard library versus external and often pulls in packages but doesn’t note which functions are coming in from that package.  Much of this has to do with R’s profoundly annoying namespace issues, which often make it necessary for R instruction to be overly explicit about where functions are coming from.
  • The author names many people who work to teach and create R packages, which provides a nice peek into the development community, but sometimes they feel like unnecessary name dropping. Again, though, this is a nitpick.
  • Coverage of the [] subsetting method doesn’t seem to appear in the book. It is used, but never thoroughly spoken of.  I would have traded some of the longer sections on basic programming concepts for more discussion about subsetting data.
  • Additional tables with summary information would be extremely valuable, particularly in the chapters where specific tasks are covered. Examples:  cheatsheets on selecting columns for data frames, the apply family, and aggregation.

What I’m working on right now

By request of Julia Evans (tweet) to the world, I am writing about what I’m working on right now.

As the tweets were posted I was in the middle of facilitating Py-CU’s weekly open hours. The discussion threads were:

  1. Attempting to help someone get Python 2.7 going within Anaconda3 so he could use computer vision packages in Jupyter Notebooks.  Windows. Pain. Unresolved.
  2. Chatting with a programming newbie about resources and common problems humanities students have when first learning how to code.
  3. Awesome nerd out over text and code editors.
  4. Listening to a BB-8 build group yell at a robot to test voice commands.
  5. Listening to a 3D printer choke to death on a Darth Vader print.
  6. Me ranting about the crazy train that chapter 11 of R for Everyone went on.

I was working on:

  1. Writing a tool to auto-generate a bunch of CSVs with fake data.
  2. In order to have test data to build an auto-documentation tool.
    1. And attempting to figure out how to slam this JSON file into a sqlite3 database.
    2. I just wrote data['files'][data['files'].keys()[0]].keys().
    3. Rethinking life choices.
  3. Musing over my book review notes for R for Everyone.
  4. Resisting the urge to get R for Everyone out of my bag because I need to finish this class project.
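A minimal sketch of that JSON-into-sqlite3 chore, for anyone attempting it on Python 3, where dict.keys() returns a view that can’t be indexed with [0]. The JSON layout, the table name file_columns, and the column names are all made up for illustration; the real file isn’t shown in this post.

```python
import json
import sqlite3

# Hypothetical stand-in for the JSON file; the real structure is an assumption.
raw = """
{"files": {"a.csv": {"id": "int", "name": "str"},
           "b.csv": {"id": "int", "score": "float"}}}
"""
data = json.loads(raw)

# Python 3 version of data['files'][data['files'].keys()[0]].keys():
# dict views are not indexable, so take the first key with next(iter(...)).
first_file = next(iter(data["files"]))
first_columns = list(data["files"][first_file])

# Flatten the nested dict into one row per (file, column, type).
rows = [
    (file_name, column, column_type)
    for file_name, columns in data["files"].items()
    for column, column_type in columns.items()
]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_columns (file_name TEXT, column_name TEXT, column_type TEXT)"
)
conn.executemany("INSERT INTO file_columns VALUES (?, ?, ?)", rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM file_columns").fetchone()[0]
```

Flattening first and inserting with executemany keeps the nested lookups in one place instead of chaining .keys() calls.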

Things near the top of my stack:

  1. Finishing an XPath tutorial.
  2. Finishing R for Everyone.
  3. Planning how I would code up some data for an analytics project for work.
  4. Thinking about my summer learning stack once I GRADUATE in May and am FREE.

Why it barely matters where you start

There is no one true anything in life. Expanded out to the programming world, there is no one true IDE, book, language, package, etc. Anyone trying to sell you on that is a liar. A more refined statement might be: any hybrid tool can rarely ever be as good as a specific tool.

Many newcomers interested in data analysis ask the following completely reasonable questions:

  • Should I learn Python or R?
  • Which IDE should I use?
  • Is there a book or workshop that I need?

An appropriate answer is to scratch your head, hedge a bit, and then try to list off some stuff you think is recent, doesn’t involve too much of a headache to attain or install, and hopefully won’t terrify this person back to Excel. Imagine trying to explain childbirth to a young girl going through puberty and looking forward to adulthood. Stating the reality of “At least you probably won’t die” isn’t likely to make you feel great as a mentor and certainly won’t make her excited for future parenthood.

Many communities use the word stack to describe a pile of stuff that likely has some form of internal hierarchy or workflow. We can see this in software, networking, libraries, math, architecture, and many others. English has plenty of idioms implying that tool selection is a fluid process almost as important as the use of the tool itself. These include “right tool for the right job,” “bring out the big guns,” and “don’t bring a knife to a gunfight.”

Each of these implies that there are multiple tools and the core joke is that the selection process should be determined by the job to be done. Let’s look at chess for a moment. The Queen piece may be more powerful than the Knight, but the Knight can move in ways the Queen can’t. This means there may be problems for which the most powerful piece is utterly useless. The language that our community uses is describing something that we understand implicitly once we have enough experience, yet we often let students have this perspective of reverence for a single specific tool.

Given this truth, and given how open source obsessed this research domain is, it isn’t shocking that there is an overabundance of tools in certain areas. At the time of writing, PyPI is approaching 80,000 packages. So how to choose? Which are worth investing your mental energy into getting good at? In the end, does it matter which package you use? These are very serious questions, and sadly, the more I study these things the more I come to the conclusion of “Who knows, but not a big deal one way or another.”

So my best advice for the newcomer is to just pick something. Start somewhere. Anywhere. Really, it barely matters. Because in the end you’ll likely need to know a little bit about everything on the list in front of you. Even if that tool or platform turns out to be a bust for the problem you were working on, that experience adds to your knowledge about what is available in the analysis stack and how to approach problems. You’ll run into it again if you stay in the analytics world.

Now, it doesn’t have to be a complete free-for-all. Some informed selection is always beneficial. Just keep in mind that you are selecting which thing to learn first, not the only thing you will ever learn. Also, be open to accepting that you’ve gone down the wrong path.

You, the learner, have power over how you learn. I want to keep stressing that. You have the absolute power to accept or reject suggestions and strategies. Strict adherence to ‘expert’ recommendations doesn’t reflect your unique needs and puts too much value on those recommendations. I know some of the tools I use aren’t the best, but I use them because they have worked for me up until now. I’ve also run into problems with some of the gold standard tools out there and I end up dealing with pearl clutching coders when I mention that I don’t use them.

I will provide some recommendations below, but you should plan on trying a few things out. See what sticks. While I may recommend something, that is not to the exclusion of something else. Recommending Python does not imply that R will be useless.

When you know a little bit about what you’ll need to do

Search around online for similar projects. See what they’ve done. Pay attention to any chat about specific packages designed for these tasks. Go with whichever platform has the tools designed for your task. For example, I was asked about visualizations for Likert scale data from a survey. R happens to have a nifty 3rd party package for Likert charts. However, if the online survey tool you’re using has mangled your data, you may need to break out some Python to whack it into something compatible with math.
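To make the “whack it into something compatible with math” step concrete, here is a minimal sketch in Python. The column names, the five-point labels, and the sample export are all hypothetical; a real survey tool will mangle things in its own special way.

```python
import csv
import io

# Assumed five-point scale; adjust the labels to match your own survey.
LIKERT_SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def likert_to_number(label):
    """Return 1-5 for a known label, or None for blanks and anything unexpected."""
    return LIKERT_SCALE.get(label.strip().lower())


# Made-up export with inconsistent capitalization, stray spaces, and a blank answer.
raw_export = io.StringIO(
    "respondent,q1,q2\n"
    "1,Strongly Agree, agree \n"
    "2,Neutral,\n"
)

cleaned = [
    {
        "respondent": row["respondent"],
        "q1": likert_to_number(row["q1"]),
        "q2": likert_to_number(row["q2"]),
    }
    for row in csv.DictReader(raw_export)
]
```

Returning None for anything unrecognized, rather than guessing, leaves the missing answers visible for whichever stats tool picks the data up next.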

When you just need to start somewhere

Unless you think you may need to do a lot of stats, start with Python. It’ll be straightforward, and you can apply the basic concepts to other tasks, like making games, front-end web development, etc. You’re likely to move on from Python, and that’s fine. It’s a great starter language.

In conclusion

Don’t let decision fatigue prevent you from getting started on your path.  Nearly every programming area requires a stack of things to know, so all experience is good experience to have.  Investigate a little or ask colleagues, but at the end of the day just flip a coin and go with something.

“Programming as an information-centric activity” talk at the Python Education Summit

After developing/teaching several types of programming workshops and spending a lot of time listening to my peers at GSLIS talk about learning how to code, it is fair to say that I have a lot of opinions on the state of teaching programming for those outside of a STEM past and going into a non-STEM future. Additionally, being part of a graduate program in library and information science has biased me to see a lot of activities in our daily lives as information problems.

Much of my experience with programming books has revealed a concept-formula-drill presentation model, but this doesn’t encapsulate the real work of problem solving with code. Yes, students absolutely need to drill and practice the core concepts, but the activity of programming goes far beyond just that need. Many experienced coders criticize new students begging for help for not immediately searching for their problem on Google and solving it on their own. This is such a common response that the programming instruction community should listen and take note. Why are there so many Stack Overflow posts closed as duplicates? Sure, there are certainly searchers who are too lazy to actually read through other posts, but I believe that this group is in the minority. I’d pin the problem on searchers being unable to either a) correctly form a useful search query, or b) recognize an appropriate solution as useful for their problem.

Indeed, many instructors will encourage students to search online for their answers, but even a Google layperson understands that there is skill required to construct a useful search string. There are quirks and tricks to solving code problems via search engines. In unpacking this problem, we can see that students need to be able to identify the actual problem in their code, find the relevant section of code on a line, understand the words to describe the problem, and recognize how to apply a potential fix to their own code. There are a lot of essential skills here but do any introductory textbooks talk about this? (Seriously, let me know if you find one).
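As one small illustration of those skills, a Python traceback already contains most of what a useful search needs: the line that failed and the words that describe the failure. The sketch below pulls those pieces out. The helper name describe_error is hypothetical (it is not from any library or from my talk), and the search string is just one plausible way to phrase the query.

```python
import traceback


def describe_error(exc):
    """Collect the parts of an exception that are worth typing into a search engine."""
    # The last frame in the traceback is where the error was actually raised.
    last_frame = traceback.extract_tb(exc.__traceback__)[-1]
    return {
        "line_number": last_frame.lineno,  # where to look in your own code
        "code": last_frame.line,           # source text of that line, when available
        "search_query": f"python {type(exc).__name__} {exc}",
    }


try:
    total = "3" + 4  # a classic beginner mistake: adding a string and a number
except TypeError as exc:
    summary = describe_error(exc)

print(summary["search_query"])
```

The exception name plus its message, minus anything specific to your own file names or variables, tends to be the part that other people have already asked about.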

I collected many of my thoughts about this into a talk I presented at the Python Education Summit titled “Programming as an information-centric activity.” The core argument: the normal activity of programming involves a lot of information skills and these skills should be incorporated very explicitly into the classroom or other instruction environment. Instructors should not only use documentation and reference materials within lectures and demos but they should also take the time to talk about the common reference materials within programming communities. For example, answer the question: What is a programming cookbook and when should a student reference one? This is a reference document somewhat unique to the programming community but often discovered by accident by novices.

Slides are up on FigShare: http://dx.doi.org/10.6084/m9.figshare.1372436
A pre-recorded screencast version is up on YouTube: https://www.youtube.com/watch?v=7irxT_Q-0e8

The great Python Mashup lesson plan

I’m often asked by new students where they should go to learn Python.  That isn’t always an easy answer, because I haven’t found the one perfect resource yet.  However, there are some really strong ones out there.  My goal was to construct a lesson plan that was a mashup of my favorite resources into a coherent plan of readings and homework.  Readings are important to have a foundation to build on and reference back to, but so is having a solid queue of content to whack on until you understand the how and why of things.

I’ve mashed up content from Python for Informatics (http://www.pythonlearn.com), Codecademy, and Python Batting Practice together into one course book.

I believe that these materials are some of the best out there, and I reject the notion that students need to learn from a single source.  Each has benefits, and I feel like these sources are very complementary.  Recall learning how to spell or learning another (human) language.  We always had workbooks or some material that required us to act on the content we had just studied.  Learning how to write code is a skill-based activity that requires a ton of practice to refine your understanding of the concepts and syntax.  Additionally, learning from multiple sources allows the student to experience how things are referred to by different people from more perspectives.

That being said, a streamlined and supportive course in programming designed to minimize frustration and difficulties does not mean that either of those will disappear.  An important part of the learning process is the fight to learn.  The harder we have to work for something the better we remember it, but that doesn’t mean that learning how to program needs to be the worst thing ever.  I have aimed to keep a good balance inside the “productively difficult” zone.

So, I am happy to announce a new page at the top: the Guided Self-Study Lesson Plan.  I will be leading another introductory workshop soon with this structure as the basis, and I plan to document and publish that as well.