Tips for live teaching tech online, deeply informed by The Carpentries

As the higher education world begins to adapt to an online format and industry to virtual meetings, many of us are adapting to a new set of social etiquettes around teaching in platforms like Zoom.

I won’t be getting into the important topics about digital equity involved in a flip to fully online education or working from home.  Just to highlight:

  • don’t presume people have {thing}, have great {thing}, have {thing} to themselves, or even have a home where {thing} could be.  For each thing, fill in:  internet, computer, monitor, attention, etc.

I shared the following list with my department today.  This is not exhaustive, but is a general toolkit that I try to utilize in any online meeting or course.

A large part of my service work is with The Carpentries, which is completely distributed across the globe. We use Zoom, a lot. Like, A LOT.

Some examples of the online zoom work that we do:

  • Run 2 day workshops synchronously in zoom.  This includes breakout rooms, screen sharing, and a lot of discussion oriented activities.  Participants will usually be grouped by timezones but from multiple countries, and sometimes we’ll have a remote site with multiple people in combination with individual remote attendees.  The instructors will be remote/geographically distributed as well.  We use a combination of etherpad or google doc to collaboratively share the results of the think/pair/share activities.
  • Run live teaching demos as part of the certification process, where up to 5 people join a meeting and do 5 minute screenshare teaching presentations.
  • All staff and community discussions take place in zoom. This includes the Executive Council (this is my third year on it), which meets entirely in zoom. We use a shared google doc for the agenda, with someone in another section taking minutes.
  • Our tips on zoom are here, including other advice for leading online training in zoom:

I’ve done all these things, and have now also been teaching in Zoom this semester for UIUC.  I’ve found that scale does make a difference, but many things hold true.

Many of the iSchool’s graduate students are remote/online students, so I always design my classes as online first. I might pilot a class in person, but I always presume things will end up online or in a hybrid classroom.

My general online tips for meetings and teaching are below.  Again, this is general, and I will deploy as needed depending on context.  Not every class and topic fits well with this.  But having a large catalog of online skills makes you quickly adaptable for when things go poorly.  These tips are geared toward synchronous sessions, but I’ll add some recordings advice at the end. There are other guides out there on recording videos for asynchronous use.

My suggested toolkit for online meetings and synchronous classes:

  • Be clear about your expectations for chat and state it up front at the beginning of every class.
    • Zoom chats aren’t threaded and can get quite cluttered.  I usually discourage side conversations unrelated to the content being presented.
    • Remind students to remain muted when they are not speaking/presenting.
    • As a host, you have permission to mute someone. You may need to do this when someone calls in on their phone, as the phone controls are odd.
  • Be invitational if you want video.
    • Clearly state if you would like people to use video as able, and that it’s ok to turn it off/leave it off during a break or if they are eating.
    • Encourage, but don’t pressure, is key.
    • Be explicit with permission to turn it off as needed.
  • Practice using gallery view/speaker view, and remind students that these are options.
  • You can narrate what you are doing if you are invisibly fighting with things behind the scenes so students don’t think their connection has died.
    • Try to laugh off problems.  It happens to everyone!
    • You can use this to cultivate some solidarity with the students. They are also likely learning a new platform.
  • Be clear about how you want people to communicate. When calling on a student, state “[name], go ahead and take the mic” or “[name], put your question in the chat”.
    • Invite students to take the mic, as they will likely default to using chat.
    • Get used to indicating which communication channel you want them to use when you are calling on them or when you are asking for feedback.
  • Remember that there is a strong delay with chat.  It may take 10-30 seconds for questions or answers to start appearing in the chat.
    • Students not only need time to think, but additional time to type & formulate a question, plus lag between when you speak and when they get it.  This is especially true when teaching in a hybrid environment.  In person students have a time/bandwidth/speaking advantage, and rarely will an online student beat that.
  • There are many ways to run an interactive class without having full discussion.  Don’t just read slides to them.
    • Invite students to stop and think about things.
    • Ask them to state an example and put it in chat, and then read some of them out. Try to balance who you are highlighting.
    • Ask them to make predictions and place them in the chat. This is a great/easy way to do formative assessment.
    • These give everyone a chance to be involved even if they are timid about the mic or you have some extremely chatty students.
  • Turn taking must be an explicit and policed etiquette.
    • The raising hands feature in zoom is very limited.  The Carpentries get around this by using /hand in the chat. Anyone wanting to speak should place /hand in the chat. This is a visual indicator that’s clearer than the long participants list, and it also naturally captures the queue as the chats roll in.
    • An alternative form of this is “/hand [topic]”.  So you can see what students want to talk about and configure discussion as needed.  This can semi thread the hand stack.
    • Scale makes this hard.  “Can I get about 3 people wanting to talk who haven’t yet?” is a good way to naturally limit the queue.
    • Actively remind students to use /hand or ask a student to turn off their mic if they are not taking turns.
  • Assign roles in meetings.
    • Having a dedicated facilitator, who is someone not presenting and not taking notes, can be great to handle heavy discussion situations. So someone is dedicated to watching the /hand queue and keeping track of who is next.
    • Rotate these duties, and have subs for when people in those positions need to present or speak. E.g., you can’t take notes while you are presenting one of your items.  The person should direct the sub to take over while they present.
    • Add, adjust, or change the roles as it makes sense for your meeting.
  • Think about rotating these roles or asking students to volunteer to take these roles during class.
    • Practice with facilitation, gatekeeping, and timekeeping is important.
    • Rotate by discussion point or class as desired.
  • Activities are different online, so be creative.
    • Having students do something independently and then post the results in an LMS forum is a nice low effort way to share results without eating time by fighting with breakout rooms.
    • Add time or an assignment that they need to respond to each other in the forum.  Sort of an asynchronous pair/share.
  • Have your notes printed out or on another screen when you are sharing your screen.
    • So you don’t have to move away from your screenshare.
  • Present on your smaller laptop or monitor, have your notes/email on your bigger monitor.
    • This reduces the size of your video and bandwidth.
  • If you are going to livecode, do it properly.
  • Gauging the temperature of a room can be difficult.  You can use textual indicators.
    • For example, we will use +1 for agreement, -1 for disagreement, and +0 for neutral feelings. Sometimes this becomes just ++, --, 0 depending on how CS the audience is. But it does work well, and it invites the moderator to engage with any -1 or +0s.
  • Tools like Poll Everywhere have a variety of feedback types and are worth playing around with.
  • Share your slides!
  • Take some time to play with any annotation tools offered by the platform you are in, or get something like ScreenBrush.
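
If you ever want to tally the chat conventions above after a session, here is a minimal sketch in Python. It assumes you already have the chat as a list of (name, message) pairs; that format, the sample names, and the function names hand_queue and temperature are all made up for illustration and are not anything Zoom or The Carpentries provide.

```python
from collections import Counter


def hand_queue(messages):
    """Return (name, topic) pairs in the order people typed /hand."""
    queue = []
    for name, text in messages:
        parts = text.strip().split(maxsplit=1)
        # Only the exact token "/hand" counts; anything after it is the topic.
        if parts and parts[0].lower() == "/hand":
            topic = parts[1] if len(parts) > 1 else ""
            queue.append((name, topic))
    return queue


def temperature(messages):
    """Count +1, -1, and +0 votes, keeping only each person's latest vote."""
    latest = {}
    for name, text in messages:
        vote = text.strip()
        if vote in {"+1", "-1", "+0"}:
            latest[name] = vote
    return Counter(latest.values())


# Hypothetical chat excerpt as (name, message) pairs.
chat = [
    ("Ana", "/hand"),
    ("Ben", "+1"),
    ("Chen", "/hand breakout rooms"),
    ("Ben", "-1"),
    ("Dee", "+0"),
    ("Ana", "+1"),
]

print(hand_queue(chat))
print(temperature(chat))
```

Keeping only the latest vote per person is a design choice for this sketch: people change their minds mid-discussion, and the moderator usually cares about where the room ended up.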

Recordings are not my expertise, but here are some other aspects to think about:

  • make them in short content chunks
  • add timestamp notes about content transitions when your videos run longer
    • students will try and go back for reference
  • prompt students to pause the video and experiment
  • repeat yourself a lot
  • prioritize getting captions/transcripts for your recordings.  Google slides has some options, as does YouTube, and Kaltura.  Depends on your campus.  Again, not my expertise area, but there are some options. Work with your disability resource office if you aren’t aware of the tools available for your campus. They usually have the good stuff.

Sticky activity 1: Describe your data in three attributes

This is hopefully the first part in a series of posts where I share some of my favorite group activities for workshops or group facilitation.

The scene:

Imagine being in the middle of a conversation with a team trying to grapple with complex data, and the folk at the table are explaining their project.  You’ve heard from each and are having a problem putting all the pieces together.  You would have thought that they were all working on different projects given their answers, but you know that isn’t the case.

The problems:

  1. You (as a facilitator) aren’t seeing the whole picture and are having a hard time organizing your thoughts around how to help. You need more information but need to ask for it in an orderly way.
  2. The team is having a hard time discussing data flow because they are describing with details of their perspectives and responsibilities, often times talking past each other.

Here is an activity that I like to deploy in these situations. It helps me get a better sense of the team, and it helps members of the group better understand each other.

The setup:

  • Sticky notes of any size or color, about 3-5 for each participant
  • Pens/pencils
  • Large board to write on where the group can see or a large paper easel
  • Encourage everyone to put away their laptops and turn their phone screens to face the table while completing the exercise

Group size:

Ideally this would be a small group of 2-10 people.  This can be adapted to larger groups if you have multiple facilitators or have smaller groups self-facilitate a mid-point review and report out.

The script:

Hand out a stack of sticky notes to each participant.

“Everyone lay out 3 sticky notes in front of you.” Pause and let them do this.

“Think of your data and try to describe how you see it all in three attributes or categories.  You can add 1 note if you absolutely must.  You’ll need to think really high level here. Write down one name or category on each note.”

You can provide some relevant examples if you have skeptical faces in the crowd.  Such as, “Size, status, processed, unprocessed, business, operations, etc.”  Try to not be super specific, because you don’t want to prime them into a single thought area.

You may also want to provide some reassurance.  “There is no one right answer here, so do your best. We’re using this to better understand your perspective, so whatever comes to mind first is likely what we want.  If you want to make a correction, these are just sticky notes.  Toss out the old one and make a new one as many times as you need.”

Give them several minutes of silence.  Check email on your phone or review your notes if you don’t want to stand there awkwardly.  After 2-3 minutes, wander around the room and glance over answers.  Gently steer anyone who needs it back on track. Do so softly and positively.

Prompt them for 1 more minute of work time.  Use this time to get your white board or easel set up.

“Let’s start at one end and go in order.  Read off your categories in order, and give me time to write them down.”  Ask them to reiterate their job/position on the team if introductions were a while ago or not done yet.

As they read off the names, write them down in a vertical list on the board.  Make a + or tally mark next to any repeats.

Go around the room in order, don’t take volunteers or ask for hands.

The group may tell you that there are no more unique values to read off, and that’s fine!  Go ahead and add the tally marks when you hit repeats, but it’s ok to be done with it if you are running short on time or you feel like things have been covered.

This is a good time to allow a quick bio/coffee/email break for the participants.  Take care of yourself first, and then return to the board.

The discussion:

This is where your knowledge of the team needs to come in.  Use your best guesses to attempt to cluster synonyms or synthesize what you’ve heard from the group.  You may have heard things related to workflows, processing categories, storage locations, data types, etc.  Pay close attention to the job roles the clusters seem to be coming from.  This may have started a good conversation in the team that you should assist in keeping on track, but otherwise don’t interrupt.

Should you need to get them talking, ask the participants to…

  • weigh in on their interpretation of the values and clusters.
  • report if they learned anything new from hearing their team mates.
  • confirm or alter the clusters that you made.
  • decide on specific language or labels for things that may have come up as synonyms.
  • create category groups for specific topical areas, if relevant.

This is usually enough to give you a better idea of what’s happening and allow you to start a nice transition to other activities.  This may yield an important piece of documentation for the team, an important first step as team building, or even a nice ice breaker for the start of a full day of activities.

Advice for helping coding beginners in user groups: smile, don’t brain dump, and remember to shut up

I was recently asked to be a guest speaker talking to a Makerspaces class about teaching and supporting programming in the makerspace context.  After writing up my notes and presenting, I realized that I have learned a lot in my three years with the Champaign-Urbana Python User Group (Py-CU).  This blog post contains my general framework for making recommendations to and supporting beginners who come to our Python meetup.  Each space and group has a unique vibe, so I don’t expect this to be a universal framework.

Py-CU offers a weekly hack/project work time, open to all members of the community and all skill levels.  Designed to be a simple work time for people, we get a diverse crowd of regular, semi-regular, and one-off attendees of all skill levels.  These include absolute beginners, students and faculty from UIUC, professional developers, and everything in between.

Before I start digging into the recommendations, I want to break down some of the important characteristics often found in our attendees.  Again, this doesn’t describe everyone, but most of our attendees would identify with at least one of these factors.

Py-CU is a subgroup of Makerspace Urbana, which (as makerspaces are wont to do) is located in an awkward space, in an awkward building, in an awkward part of town.  Thus, the people who successfully do the research and make the trek down are there for a reason.  People don’t usually show up at a user group because all their needs are met and they are completely satisfied with their use of that product.  This is, in fact, why user groups generally exist.  There’s a hidden word in “user group”, which should really unpack to “user support group.”  Never forget this.

So, people show up at Py-CU because something has gone wrong in their use or learning experience with Python.   Sometimes this means they:

  • can’t find the right resource or information
  • can’t determine which resource is right for them
  • can’t get that resource to work or otherwise make sense of it
  • have questions they can’t get answered or don’t understand the answers they’ve received (insert the Stack Overflow stink eye)
  • are doing this outside of their job and/or coursework and want a dedicated time and place to learn
  • just want to be around other people while they get stuff done

Each of these is a need that user groups and makerspaces can and should satisfy at some level.  Some groups are focused more on talks and networking, but our group is focused more on getting work done and providing a community experience around this work.  This keeps the pressure off the organizers to track down speakers and deal with event planning, which means that our organizational efforts are dedicated to the attendee experience and providing support.  We organizers can also get our own work done during these hours, which makes our weekly time commitment much more sustainable.

Now that we’ve set the stage, we can talk about supporting newcomers who need help.  I’m going to particularly focus on someone who shows up to a group, declares themselves to be new to Python (or whichever language/platform/etc. you are there for), and requests assistance.

I am a librarian and social scientist by training, so I approach these interactions as if it were a patron coming up to a reference desk.  I need to offer support, empathy, and guidance.  Most of all:  I am the voice of the community they are visiting.  This is even more so with programming:  to some beginners I am not just the voice of the Python community, I am the voice of the entire programming community.

There is no one true learning resource out there, as there is no one true programming style or purpose in life.  This means you need to track down the right resource for that person’s need.  Here are the core questions I need them to answer at some level:

  1. What is your educational background?
    • Don’t name any specific domain. Leave this open ended such that the answer cannot be “no.”
  2. What programming have you done before? And it’s ok if not!
    • Be sure to keep your tone supportive for those who may need to say that they’ve never programmed before. Asking this in a non-binary way will avoid some, but not all, of people’s misclassification of their experience level.  More discussion on this later.
  3. What do you want to do with programming? or Why do you want to learn that topic?
    • Again, tone. Keep it positive and welcoming.

These questions are the framework for a conversation with this person, so asking clarifying questions and providing encouragement along the way is also important.  Before we start going into which resources to recommend, etc., here are some things to keep in mind as you’re having this conversation:

  • Do not presume any level of math education, interest, or comfort.
  • Do not presume complete inexperience for someone who says that they are a beginner. Some people will discount courses they took in high school or even college as being valueless, when that experience will totally change the resource I send them to.  I see this mostly occurring with women, and some even describe themselves as beginners when they’ve taken several programming classes.
  • Understand that there’s a difference between being a beginner to Python and a beginner to programming. Try to tease out which they are when they just say “beginner.”
  • Be explicit about welcoming beginners of all kinds. If you think you’ve added that phrasing in enough places, you can find more.  You will need to constantly repeat yourself, encourage, and be as noisy as possible about welcoming beginners.  Even with that, be prepared to field direct communication asking if it’s okay for them to come. No matter how many times you have it on your website, there are levels of imposter syndrome that make the personal permission the only one that sticks.

When it comes to recommending resources, much of my approach is shaped by my view that programming, at its core, is an information problem.  So I’m super obsessed with finding people the right learning material.  Learners of programming generally need materials for instructional narrative to explain concepts, practice to tinker with those concepts, reference to look up concepts/tools, and a safe place/person to ask questions.

In my final piece of advice, here are some general guidelines to consider when giving people advice or making recommendations:

  • “Just read the docs” is never the right answer for a programming beginner.
  • Recommend no more than 3 – 5 resources, and hopefully between them they will hit all the requirements listed above.  Avoid brain dumping every book you’ve ever seen about Python on them.  Pick out a few for them to investigate.  Remember that they’ve sought out experts for their expert advice, not a human google search for “python”.
  • Remind them that it is okay to do some window shopping of materials, and that it is okay to reject a book after a few chapters if they aren’t feeling it.
  • Keep your IDE and command line wars at home. Get them set up with a development environment that is right for them.
  • Finally, once you’ve given them links and answered their questions, remember to stop talking and give them time to read and explore the resources you recommended. Leave them alone in this process (do not make small talk), and invite them to let you know when they’d like more help.  Don’t sit and stare at them practicing or provide unrequested commentary.

This post isn’t meant to fully unpack the selection of resources with an information request, but hopefully can serve as a place to start formalizing your own thoughts on welcoming beginners.  Remember: your community/group will have different needs, so you’ll need to change some of this to fit their needs.

How I use github in 500 words

My feelings about github are somewhere at the nexus of aggravated and grateful.  Git is not easy because it is composed of a series of incantations based out of somewhat identifiable English.  Close enough that you think they have meaning but far enough that you want to bludgeon yourself in the hopes of forgetting semantics.  And yet, version control and backups are important to save your butt.

I shall confess (without apology), that I do not use command line git(hub). Can I? Sure. Do I want to try and remember those commands while I’m trying to keep my brain on my code? No. So the github app is where I find the sweet spot between extracting the power of version control without the anguish.

Here’s my use case for git: I write code, I want to track it, github works well for that, and I usually work alone.  All I need to do is create, commit, sync, and get back to my freaking job. I don’t do pull requests or contribute to shared projects. GET OFF MY LAWN.

Sorry, let’s get back to how to do commits and restore deleted files.

Step 1: Make a github account. Already have one? Go apply for the student developer pack if you haven’t already.

Step 2: Download the Github Desktop application:

Step 3: Go into the application, find preferences. Remember password. Sign in.

Step 4: Find the plus button thing. Click. Add will let you point the app at an existing folder and create a repo from that, Create makes a new one, and Clone is an incantation for the sociable. Click Create, give it a name, and click Create repository.

Step 5: That repository name is now a folder. Go find it. Or make a new one somewhere you can remember.

Step 6: Add some crap to that folder.

Step 7: Go back to the app and pour your eyeballs on your newly tracked file. The stuff on the right shows you the contents of the file. Green shows additions, red deletions. The little icon next to the repository name on the sidebar should be a computer monitor looking doodad, meaning that it is a local repo.

Step 8: Add a commit message (short & sweet) and description (maybe longer, with punctuation). Click Commit to Master. To send off to github, click Publish on the upper right. You can stack up a bunch of commits before publishing, if you want.  The first time through it’ll ask you some stuff. Just click the publish again.

Step 9:  Go change some crap in your file and check back in the app. Make a less crappy commit message and click Commit to master. Then this time Sync when ready to send it to github.

Step 10: Delete that file and check the app. GONE.

Click the Repository menu item and select Discard Changes to Selected Files.

Ya, everything goes away from the list of changes because THE FILE IS BACK.

Step 11: Go back to your research.

490 words.

Review of: R for Everyone

Like so many people out there, I have been hacking and spitting my way through R.  I’ve made some awesome stuff, made the stats work, made some graphs, and written R Markdown notebooks that take 30 minutes to render (no, not because of for loops).  I feel comfortable saying that I am capable in R, but I’m still in the “incantation” phase of language understanding: I don’t really know why I’m doing [thing] but I know that [thing] will work because Stack Overflow told me so.

I remember this phase in Python, but after attending a week long PyCamp and hanging out with the extraordinary people of Py-CU, I feel completely capable in Python.  I don’t know everything, but I understand every piece of syntax that I use and I’m comfortable diving into new topics.

The challenge of R is that so many of the materials and documentation are written for statisticians.  R is a statistical language, so this isn’t a bad thing, but is a piece of context that seems to be lost for many of the R experts.  Please stop telling me “everything is a vector” because my soul dies a little more each time someone earnestly tells me that, as if it is helpful to the general public.  No.  It isn’t.

I don’t care that everything is a vector and no, I don’t want to explore the philosophical implications of that. I need to run some statistics and make a few charts.  I understand data types, variable names, and data processing.  I’ve got my data and I know my research question.  I just need to smash that into a script and I need to know how to do it in R.  In short, I needed an R book written for a developer from another language, or at least something good for the angry cynical crowd.

Cue a timely recommendation for R for Everyone (2014) by Jared P. Lander.  At this point in my gum-and-spit-based R career I’m pretty desperate for help.  The R Cookbook helped a little, but lacking much of the foundational R know-how means that even clear explanations of advanced concepts are still opaque.  I loaded up an ebook version from my library, skimmed the chapter on the apply() family and ordered it from Amazon with my fingers crossed.

The book strikes a great balance: the author knows the language incredibly well but doesn’t give us the hard sell on why R is the savior of our data souls.  The examples are short, simple, and don’t try to clean up the messy output you’re used to in the interpreter.

Chapters 4 & 5 are the missing pages of my R life.  These cover the absolute basics of working within R, including data types and containers. These chapters need to be standard reading for everyone who complains about R.  The writing perspective highlights the variety of syntax oddities with acknowledgment of them rather than apology.

[Screenshot: Why I can’t use library books on R]

Some chapters are perhaps overly detailed and would suit someone newer to programming (chapters 8-10 cover functions, control statements, and looping), while others attempt to cover such broad topics that they are more of a look book (chapter 7 on ggplot2). I was particularly happy with the pace until I hit chapter 11, where the plyr section went a little nuts.  Some syntax and packages are not explained, and a peek at some of the incorrect index page numbers makes me suspect that some editing and reorganizing happened without picking up the pieces.  But that doesn’t take away the ultimate value of this book.

The book seems to have three basic sections:  basic R programming, statistical tools, and advanced R programming topics.  The covered range of topics is ridiculously broad, and I think does a decent job of balancing the pace and level of detail.  Some chapters can be a bit on the side of just a vocabulary lesson rather than instructive, but this is a hallmark of a book where the chapters are meant to stand alone from the whole.  Those chapters tend to be the topic areas where further instruction would put the book’s content into maths instruction rather than R instruction.  So I understand.

This book is not for basic statistics instruction, for teaching core programming fundamentals, or to serve as a singleton resource on R.

This book is a valuable supplement for a statistics course in R, an intermediate R user wanting to sample some advanced techniques, or a self-taught R user to fill in some blank spots.

Overall I would classify this book as exceptional for reference and supplement, but not as a textbook or something with problems for students to work through.


  • The narrative doesn’t clarify which packages are standard library versus external and often pulls in packages but doesn’t note which functions are coming in from that package.  Much of this has to do with the profoundly annoying issues that R has with namespaces, and being overly explicit about where functions are coming from is often necessary for R instruction.
  • The author names many people who work to teach and create R packages, which provides a nice peek into the development community, but sometimes they feel like unnecessary name dropping. Again, though, this is a nitpick.
  • Coverage of the [] subsetting method doesn’t seem to appear in the book. It is used, but never thoroughly spoken of.  I would have traded some of the longer sections on basic programming concepts for more discussion about subsetting data.
  • Additional tables with summary information would be extremely valuable.  Particularly in the chapters where specific tasks are covered. Examples:  cheatsheets on selecting columns for data frames, the apply family, and aggregation.

What I’m working on right now

By request of Julia Evans (tweet) to the world, I am writing about what I’m working on right now.

As the tweets were posted I was in the middle of facilitating Py-CU’s weekly open hours. The discussion threads were:

  1. Attempting to help someone get Python 2.7 going within Anaconda3 so he could use computer vision packages in Jupyter Notebooks.  Windows. Pain. Unresolved.
  2. Chatting with a programming newbie about resources and common problems humanities students have when first learning how to code.
  3. Awesome nerd out over text and code editors.
  4. Listening to a BB-8 build group yell at a robot to test voice commands.
  5. Listening to a 3D printer choke to death on a Darth Vader print.
  6. Me ranting about the crazy train that chapter 11 of R for Everyone went on.

I was working on:

  1. Writing a tool to auto-generate a bunch of CSVs with fake data.
  2. In order to have test data to build an auto-documentation tool.
    1. And attempting to figure out how to slam this JSON file into a sqlite3 database.
    2. I just wrote data['files'][data['files'].keys()[0]].keys().
    3. Rethinking life choices.
  3. Musing over my book review notes for R for Everyone.
  4. Resisting the urge to get R for Everyone out of my bag because I need to finish this class project.
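
For anyone else trying to slam JSON into sqlite3, here is a minimal sketch of one way it could go. The JSON shape is only a guess based on the expression in the list above (a 'files' key mapping each file name to a dict of metadata); the sample values, the table name, and the column names are all hypothetical. It also shows a Python 3 way to grab the first key, since .keys()[0] only works in Python 2.

```python
import json
import sqlite3

# Hypothetical document: 'files' maps each file name to a dict of metadata.
raw = '{"files": {"a.csv": {"rows": 10, "columns": 3}, "b.csv": {"rows": 5, "columns": 2}}}'
data = json.loads(raw)

# In Python 3, dict.keys() returns a view that cannot be indexed,
# so take the first key with next(iter(...)) instead of .keys()[0].
first_name = next(iter(data["files"]))
fields = list(data["files"][first_name].keys())

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (name TEXT PRIMARY KEY, n_rows INTEGER, n_columns INTEGER)"
)
# Parameter placeholders (?) let sqlite3 handle quoting of the values.
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [(name, meta["rows"], meta["columns"]) for name, meta in data["files"].items()],
)
conn.commit()

rows = conn.execute(
    "SELECT name, n_rows, n_columns FROM files ORDER BY name"
).fetchall()
print(fields)
print(rows)
```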

Things near the top of my stack:

  1. Finishing an XPath tutorial.
  2. Finishing R for Everyone.
  3. Planning how I would code up some data for an analytics project for work.
  4. Thinking about my summer learning stack once I GRADUATE in May and am FREE.

Why it barely matters where you start

There is no one true anything in life. Expanded out to the programming world, there is no one true IDE, book, language, package, etc. Anyone trying to sell you on that is a liar. A more refined statement might be: a hybrid tool can rarely be as good as a specialized tool at that tool's one job.


Many newcomers interested in data analysis ask the following completely reasonable questions:

  • Should I learn Python or R?
  • Which IDE should I use?
  • Is there a book or workshop that I need?

An appropriate answer is to scratch your head, hedge a bit, and then try to list off some stuff you think is recent, doesn’t involve too much of a headache to attain or install, and hopefully won’t terrify this person back to Excel. Imagine trying to explain childbirth to a young girl going through puberty and looking forward to adulthood. Stating the reality of “At least you probably won’t die” isn’t likely to make you feel great as a mentor and certainly won’t make her excited for future parenthood.

Many communities use the word stack to describe a pile of stuff that likely has some form of internal hierarchy or workflow. We can see this in software, networking, libraries, math, architecture, and many others. English has plenty of idioms implying that tool selection is a fluid process almost as important as the use of the tool itself. These include “the right tool for the right job,” “bring out the big guns,” and “don’t bring a knife to a gunfight.”

Each of these implies that there are multiple tools, and the core idea is that the selection process should be determined by the job to be done. Let’s look at chess for a moment. The Queen piece may be more powerful than the Knight, but the Knight can move in ways the Queen can’t. This means there may be problems for which the most powerful piece is utterly useless. The language that our community uses describes something that we understand implicitly once we have enough experience, yet we often let students hold on to a reverence for a single specific tool.

Given this truth, and given how obsessed with open source this research domain is, it isn’t shocking that there is an overabundance of tools in certain areas. At the time of writing, PyPI is approaching 80,000 packages. So how to choose? Which are worth investing your mental energy into getting good at? In the end, does it matter which package you use? These are very serious questions, and sadly, the more I study these things the more I come to the conclusion of “Who knows, but not a big deal one way or another.”


So my best advice for the newcomer is to just pick something. Start somewhere. Anywhere. Really, it barely matters. Because in the end you’ll likely need to know a little bit about everything on the list in front of you. Even if that tool or platform turns out to be a bust for the problem you were working on, that experience adds to your knowledge about what is available in the analysis stack and how to approach problems. You’ll run into it again if you stay in the analytics world.

Now, it doesn’t have to be a complete free-for-all. Some informed selection is always beneficial. Just keep in mind that you are selecting which thing to learn first, not the only thing you will ever learn. Also, be open to accepting that you’ve gone down the wrong path.

You, the learner, have power over how you learn. I want to keep stressing that. You have the absolute power to accept or reject suggestions and strategies. Strict adherence to ‘expert’ recommendations doesn’t reflect your unique needs and puts too much value on those recommendations. I know some of the tools I use aren’t the best, but I use them because they have worked for me up until now. I’ve also run into problems with some of the gold-standard tools out there and I end up dealing with pearl-clutching coders when I mention that I don’t use them.

I will provide some recommendations below, but you should plan on trying a few things out. See what sticks. While I may recommend something, that is not to the exclusion of something else. Recommending Python does not imply that R will be useless.

When you know a little bit about what you’ll need to do

Search around online for similar projects. See what they’ve done. Pay attention to any chat about specific packages designed for these tasks. Go with whichever platform has the tools designed for your task. For example, I was asked about visualizations for Likert scale data from a survey. R happens to have a nifty third-party package for Likert charts. However, if the online survey tool you’re using has mangled your data, you may need to break out some Python to whack it into something compatible with math.
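
To make “whack it into something compatible with math” a little more concrete, here is a rough sketch of that cleanup step. Everything in it is an assumption for illustration: the five labels, the 1 to 5 scoring, and the messy sample answers are invented, and a real survey export will have its own quirks.

```python
# Hypothetical 5-point scale; swap in whatever labels your survey used.
LIKERT_SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def to_score(answer):
    """Return the 1-5 score for a Likert label, or None if unrecognized."""
    # Normalize case and collapse stray whitespace before the lookup.
    cleaned = " ".join(answer.strip().lower().split())
    return LIKERT_SCALE.get(cleaned)


# Invented answers with the kind of inconsistent casing and spacing
# that an export might contain.
responses = ["Agree", " strongly  agree", "NEUTRAL", "n/a", "Disagree "]

scores = [to_score(r) for r in responses]
valid = [s for s in scores if s is not None]  # drop unrecognized answers
mean_score = sum(valid) / len(valid)

print(scores, mean_score)
```

Unrecognized answers come back as None instead of raising an error, so you can count and inspect them before deciding whether to drop them.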

When you just need to start somewhere

Unless you think you may need to do a lot of stats, start with Python. It’ll be straightforward, and you can apply the basic concepts for other tasks, like making games, front-end web development, etc. You’re likely to move on from Python, and that’s fine. It’s a great starter language.

In conclusion

Don’t let decision fatigue prevent you from getting started on your path.  Nearly every programming area requires a stack of things to know, so all experience is good experience to have.  Investigate a little or ask colleagues, but at the end of the day just flip a coin and go with something.

“Programming as an information-centric activity” talk at the Python Education Summit

After developing/teaching several types of programming workshops and spending a lot of time listening to my peers at GSLIS talk about learning how to code, it is fair to say that I have a lot of opinions on the state of teaching programming for those coming from a non-STEM past and heading into a non-STEM future. Additionally, being part of a graduate program in library and information science has biased me to see a lot of activities in our daily lives as information problems.

Much of my experience with programming books has revealed a concept-formula-drill presentation model, but this doesn’t encapsulate the real work of problem solving with code. Yes, students absolutely need to drill and practice the core concepts, but the activity of programming goes far beyond just that need. Many experienced coders criticize new students begging for help for not immediately searching for their problem on Google and solving it on their own. This is such a common response that the programming instruction community should listen and take note. Why are there so many Stack Overflow posts closed as duplicates? Sure, there are certainly searchers who are too lazy to actually read through other posts, but I believe that this group is in the minority. I’d pin the problem on searchers being unable to either a) correctly form a useful search query, or b) recognize an appropriate solution as useful for their problem.

Indeed, many instructors will encourage students to search online for their answers, but even a Google layperson understands that there is skill required to construct a useful search string. There are quirks and tricks to solving code problems via search engines. In unpacking this problem, we can see that students need to be able to identify the actual problem in their code, locate the relevant line or section of code, understand the words to describe the problem, and recognize how to apply a potential fix to their own code. There are a lot of essential skills here, but do any introductory textbooks talk about this? (Seriously, let me know if you find one).
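
As one small illustration of those skills, here is a sketch of turning an error into a search string. The helper name search_query_for and the buggy line are invented for this example, and this is only one possible habit to teach, not a standard recipe: the exception type and message usually carry the vocabulary a search engine needs, and the traceback says where to look.

```python
import traceback


def search_query_for(exc, language="python"):
    """Build a search string from an exception's type and message."""
    return f"{language} {type(exc).__name__}: {exc}"


try:
    total = "3" + 4  # the bug: adding a string and an integer
except TypeError as exc:
    # The last traceback frame points at the line that actually failed.
    line_number = traceback.extract_tb(exc.__traceback__)[-1].lineno
    query = search_query_for(exc)
    print(f"Problem on line {line_number}")
    print(f"Search for: {query}")
```

The generated string is only a starting point. Project-specific details such as variable names or file paths usually need trimming before the search returns anything useful.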

I collected many of my thoughts about this into a talk I presented at the Python Education Summit titled “Programming as an information-centric activity.” The core argument: the normal activity of programming involves a lot of information skills and these skills should be incorporated very explicitly into the classroom or other instruction environment. Instructors should not only use documentation and reference materials within lectures and demos but they should also take the time to talk about the common reference materials within programming communities. For example, answer the question: What is a programming cookbook and when should a student reference one? This is a reference document somewhat unique to the programming community but often discovered by accident by novices.

Slides are up on FigShare:
A pre-recorded screencast version is up on YouTube:

The great Python Mashup lesson plan

I’m often asked by new students where they should go to learn Python.  That isn’t always an easy answer, because I haven’t found the one perfect resource yet.  However, there are some really strong ones out there.  My goal was to construct a lesson plan that was a mashup of my favorite resources into a coherent plan of readings and homework.  Readings are important to have a foundation to build on and reference back to, but so is having a solid queue of content to whack on until you understand the how and why of things.

I’ve mashed up content from Python for Informatics,, Codecademy, and Python Batting Practice together into one course book.

I believe that these materials are some of the best out there, and I reject the notion that students need to learn from a single source.  Each has benefits, and I feel like these sources are very complementary.  Recall learning how to spell or learning another (human) language.  We always had workbooks or some material that required us to act on the content we had just studied.  Learning how to write code is a skill-based activity that requires a ton of practice to refine your understanding of the concepts and syntax.  Additionally, learning from multiple sources lets the student see how different people refer to the same things, from more perspectives.

That being said, a streamlined and supportive course in programming designed to minimize frustration and difficulties does not mean that either of those will disappear.  An important part of the learning process is the fight to learn.  The harder we have to work for something the better we remember it, but that doesn’t mean that learning how to program needs to be the worst thing ever.  I have aimed to keep a good balance inside the “productively difficult” zone.

So, I am happy to announce a new page at the top: the Guided Self-Study Lesson Plan.  I will be leading another introductory workshop soon with this structure as the basis, and I plan to document and publish that as well.

Interview with Trinket

I had the great pleasure of meeting Elliott Hauser and the awesome development team of Trinket at PyCon 2014.  Trinket taps into the web-based power of Python by creating a framework to host interactive Python sessions as part of lessons or embedded in a blog.

Trinket’s blog has been featuring a series of interviews with programming educators, showcasing a wide assortment of approaches and tools.  I was recently interviewed by them about the Python group that I co-organize and the various outreach efforts we’ve taken up.

Trinket offers a platform to document your workshop or instructional notes in such a way that your students can take the link home for reference or sharing.  I’ll be posting about a workshop I recently documented on their platform shortly.