1.2 Why R?
So, why R? One big attraction, especially for penny-pinching journalists and students, is that it’s free and open source, unlike some powerful but pricey commercial platforms.
There are several popular open-source platforms for wrangling and analyzing data, and each has its ardent cheerleaders. If you’ve heard passionate arguments between iPhone and Android users, or between Mac and Windows enthusiasts, you’ll have a pretty good idea of what, say, R vs. Python arguments can sound like.
I don’t want to disrespect Python, though – it’s another great language. I happen to prefer R for much of my data work because it was designed to analyze data. And that means many of the things you want to do with data – structuring, summarizing, visualizing – are well thought out. There’s a built-in data structure called a data frame that’s spreadsheet-like in its organization, making it easy to apply calculations across columns or rows. And unlike most computer languages, R starts counting at 1 instead of 0, which means if you want row 273, you ask for 273 and not 272. (If you’ve never programmed before, you won’t realize how unusual this is. If you have experience with one or more other languages, though, you may have to break yourself of some habits.)
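To make the data frame and count-from-1 ideas a little more concrete, here is a minimal sketch in base R. The data frame name `budget`, its column names, and all the numbers are made up purely for illustration; they don't come from any real data set.

```r
# Hypothetical example data: names and values are invented for illustration.
budget <- data.frame(
  department = c("Police", "Fire", "Parks", "Library"),
  spending_2023 = c(120, 80, 45, 30),
  spending_2024 = c(132, 84, 45, 27)
)

# Apply a calculation down an entire column at once, spreadsheet-style.
budget$pct_change <- round(
  (budget$spending_2024 - budget$spending_2023) / budget$spending_2023 * 100,
  1
)

# R counts from 1, so asking for row 3 returns the third row (Parks).
third_row <- budget[3, ]

# Summarize several numeric columns in a single call.
column_means <- colMeans(budget[, c("spending_2023", "spending_2024")])

print(budget)
print(third_row)
print(column_means)
```

Notice that `budget[3, ]` asks for the third row by using the number 3. In a language that starts counting at 0, the same row would typically be index 2.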
It’s fairly easy to install basic R and get started, whether on Windows or a Mac, which is something that can’t necessarily be said about all programming languages.
R’s capabilities are rapidly evolving, making it particularly interesting as a platform. The R ecosystem of today is far more robust than it was when I started learning R in 2012. For example, you can now create interactive Web maps and tables with just a couple of lines of code. It seems that every month, there are new, more elegant ways to wrangle, analyze, and visualize data.
Visualization is one of the most compelling features of R. When I did data exploration in Excel, I tended not to generate graphics until pretty late in my data work – usually only when I was ready to think about what chart to publish with a story. With R, though, it’s easy to build dataviz into a standard workflow.
Finally, the large and growing community of R users is one of its best features. There are thousands of R “packages” – code written to enhance the core language or solve a specific problem – available for free download, making it likely that someone has already thought through how to solve a problem you might have with your data. And people in the community are usually eager to help if you run into problems.