Research Tools

Helpful hints for R beginners

By Nathalie Sommer

I am not good at math. I don’t have a natural knack for it. So when I was first introduced to R in a quantitative biology course, I was completely overwhelmed. I didn’t get it. The learning curve was too steep and I gave up.

Fast-forward a few months, and I had data to analyze – I hit the reset button on R and tried again. After plenty of struggling and a whole lot of help, I was successful.

Looking back on my first months of learning, I realized I was so caught up in the formidable interface that I lost sight of why I wanted to learn R in the first place. I’m still learning R and will always be learning. It is not something you can ever completely master. But you have to start somewhere.

From one R learner to another, here are my top five tips for getting started and sticking with it.

  1. Use data that makes you excited.

R is a powerful programming language for data analysis. Before you even download the software, find your motivation.

Someone once described R to me as the process of opening a highly anticipated present. You have data, and within your data is an answer to your fundamental question. Science is driven by questions, and learning R is no exception. You will be far more motivated to teach yourself when you are excited about finding an answer. If you don’t have data to work with, talk to someone whose research interests align with yours and explain that you’re trying to learn R. They can guide you to relevant online databases.

  2. Master the basic building blocks.

No one learns a language overnight, but the R package “swirl” is a great way to master the basics. Swirl will walk you through simple syntax and operations step by step. It even gives you encouragement as you move through the exercises! Once you are comfortable importing data, manipulating objects, and running base functions, swirl has more advanced courses on regression analyses and exploratory data analysis. (Note: most ecologists work with R through the RStudio interface. RStudio is very user friendly and easier to use than the base R console. You’ll need to download R to use swirl, and RStudio, a separate download that runs on top of R, is the recommended way to run it.)
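Here is a minimal sketch of the kind of basics swirl drills: making objects, building a data frame, and running base functions. The measurements, object names, and the file name in the comment are made up for illustration, and the swirl setup lines are commented out because they need an internet connection and an interactive console.

```r
# One-time setup, run these yourself in the console:
# install.packages("swirl")
# library(swirl)
# swirl()   # starts the interactive lessons

# Importing your own data usually looks something like this
# ("my_data.csv" is a hypothetical file name):
# my_data <- read.csv("my_data.csv")

# Assign values to objects with `<-` (made-up measurements)
wing_length_mm <- c(10.2, 11.5, 9.8, 12.1)
site <- c("north", "north", "south", "south")

# Combine the two vectors into a data frame, R's basic table
birds <- data.frame(site = site, wing_length_mm = wing_length_mm)

# Run a few base functions on one column
mean_wing <- mean(birds$wing_length_mm)
print(mean_wing)
print(summary(birds$wing_length_mm))

# Manipulate the object: keep only the rows from the north site
north_birds <- birds[birds$site == "north", ]
print(north_birds)
```

If each line makes sense to you, you are already past the hardest part.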

Recommended: If your university offers it, take a course in linear algebra. R stores much of its data in vectors and matrices, and linear algebra is matrix math. It doesn’t sound fun, I know, but after I took a linear algebra course, I could understand the mathematical basis for errors in my code and more quickly diagnose problems.
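As a small illustration of how matrix math shows up in error messages, the sketch below multiplies two matrices whose dimensions line up, then tries a pair whose dimensions do not. The matrices are arbitrary examples, and `tryCatch()` is used only so the script keeps running after the error.

```r
# Two small matrices, filled column by column
A <- matrix(1:6, nrow = 2)   # 2 rows x 3 columns
B <- matrix(1:6, nrow = 3)   # 3 rows x 2 columns

print(dim(A))
print(dim(B))

# Matrix multiplication works when the inner dimensions match:
# (2 x 3) times (3 x 2) gives a 2 x 2 result
AB <- A %*% B
print(AB)

# (2 x 3) times (2 x 3) does not line up, so R raises an error.
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(
  A %*% A,
  error = function(e) conditionMessage(e)
)
print(result)   # in an English locale, the message mentions non-conformable arguments
```

Once you know that the inner dimensions have to match, that message stops being cryptic and starts pointing you at the fix.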

  3. Play with existing code and packages.

You can ask a colleague to share some of their code, or find open-access code repositories on GitHub. Run their code, manipulate it, and see what happens. Be deliberate and try to figure out exactly what the code is doing, line by line. Keep notes in your code using “#”. (R ignores everything after “#” on a line, so it is a helpful way to annotate your scripts.) If you can understand the important elements of existing code, you can confidently begin writing for yourself.
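To show what that kind of annotating looks like, here is a short sketch that uses `iris`, a data set that ships with R, so there is nothing to download. The comments are the sort of notes I would write while picking apart someone else's script.

```r
# Load a data set that comes with R
data(iris)

# Peek at the first few rows to see what the columns are
print(head(iris))

# tapply() splits the first argument by the groups in the second,
# then applies the function in the third to each group.
# Here: mean petal length for each species.
petal_means <- tapply(iris$Petal.Length, iris$Species, mean)
print(petal_means)

# Experiment: swap `mean` for `median` or `sd` and rerun the line above.
# Note to self: the result is a named vector, one value per species.
```

Changing one thing at a time and rerunning is the fastest way I know to find out what a line really does.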

It is important to know that R is open source, meaning anyone can write packages. Packages contain functions, which are tools you can download for your R toolbox (e.g., swirl), and they come in every variety imaginable. The vegan package is popular among community ecologists and includes functions for all kinds of exciting analyses, like diversity indices and ordination. Some packages are more for fun, such as beepr, which can play the Mario theme song to alert you when your simulation is finished. This brings me to number four:

  4. Google is your friend.

But R is the Wild West. Since R is open source, some packages may not be the most reliable for your analysis. R will not detect fundamental errors in how you run your statistics, so make sure you have the statistical background for a particular analysis before you draw conclusions from the results. Google is a great place to find out which packages may be better suited to your goals and data.
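One way to build some trust in a package function is to check it against a hand calculation on a tiny example. The sketch below does that for the Shannon diversity index, H = -sum(p * log(p)), using made-up species counts. It assumes the vegan package mentioned earlier; if vegan is not installed, the comparison is skipped and only the hand calculation runs.

```r
# Made-up counts of individuals for five species at one site
counts <- c(sp_a = 12, sp_b = 7, sp_c = 3, sp_d = 1, sp_e = 1)

# Shannon diversity by hand: proportions first, then -sum(p * log(p))
p <- counts / sum(counts)
shannon_by_hand <- -sum(p * log(p))
print(shannon_by_hand)

# Compare against vegan, but only if it is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  shannon_vegan <- as.numeric(vegan::diversity(counts, index = "shannon"))
  # all.equal() allows for tiny floating-point differences
  print(all.equal(shannon_vegan, shannon_by_hand))
} else {
  message("vegan is not installed; run install.packages(\"vegan\") to compare.")
}
```

If the two numbers disagree, that is a cue to read the function's help page before trusting either one.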

Google can also be helpful when you run into problems with your code. Chances are high that someone else has had the same issue before. When you first run into an error, though, try to solve it on your own before opening your browser. You’ll learn faster that way.

  5. Keep practicing.

As with any language, you will forget if you take long breaks. Keep revisiting your scripts and writing bits of code, even if it is only 20 minutes each day. Challenge yourself to learn for-loops and functions. Teach yourself the syntax differences between R scripts and R Markdown documents. Learn at your own pace, and don’t be afraid to ask for help.
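If for-loops and functions are next on your list, here is a minimal sketch of both working together. The function name and the temperatures are made up for illustration.

```r
# A function bundles a calculation so you can reuse it by name
celsius_to_fahrenheit <- function(temp_c) {
  temp_c * 9 / 5 + 32
}

# Made-up temperatures in degrees Celsius
temps_c <- c(0, 20, 37)

# Create an empty numeric vector to hold the results
temps_f <- numeric(length(temps_c))

# A for-loop repeats the same step for each position in the vector
for (i in seq_along(temps_c)) {
  temps_f[i] <- celsius_to_fahrenheit(temps_c[i])
}
print(temps_f)

# Many R functions are vectorized, so the loop is optional here:
# calling the function on the whole vector gives the same values
print(celsius_to_fahrenheit(temps_c))
```

Writing the loop first and then discovering the one-line version is a very normal way to learn R.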

Knowing how to analyze your data is an extremely important skill. Some days you will feel like a coding wizard, and on other days, R will make you feel like a fool. Even real coding wizards have foolish days.

You got this.

Author biography: Nathalie is a master’s student at the Yale School of Forestry & Environmental Studies, where she studies the evolutionary ecology of animal behavior and food web dynamics. Though she’s been working in R for more than four years, she still has to Google things. Connect with her on Twitter @NathalieRSommer and share your #RStats tips for beginners.

Image caption/credit: Blue circle with white question mark. Image by Nathalie Sommer.

