7 Understanding and Using R Packages
Packages are one of the most important concepts in R programming. It’s almost hard to conceive R programming without them.
An R package stores various functions and data sets for other users to access. It allows R to move beyond its roots in statistical programming and achieve more complex goals.
For example, you might be writing a research paper. You want to clearly show the results of your regression analysis in this report, along with various tables and charts. You can use a combination of rmarkdown
, ggplot2
, xtable
, and various other packages to accomplish this goal.
That way you don’t have to copy and paste your work to a word document as you analyze the results. You merely write it and program it in R and then export it when you’re done. This saves you a lot of time in the long run and makes your code far more re-producible.
7.1 Why Does R Use Packages?
Packages allow R to operate as an open source language. Programmers, statisticians, and data scientists can develop new functions and commands and then share them with other users elsewhere - for free! This is common for open source programming languages.
If you want, you can actually develop you’re own package. If you find existing resources don’t perform or operate the way you’d like, you can develop your own functions and save them in a package for others to use.
7.2 How to Access R Packages
Before I show you how to use an R package, you need to understand there’s a difference between installing and loading a package. Installing means pulling it from CRAN and saving it on your computer. Loading a package means using it in your current R session.
Why would R do this?
Mostly for efficiency. It would take more memory if your R session ran every package installed on your computer. It improves your computer’s performance to load packages only as you need them.
Also, it’s not uncommon for R packages from different developers to have functions with the same name, but different purposes and inputs. Forcing you to load only the packages with the functions you need solves this issue.
7.3 How to Install and Load a Package - The Easy Way
There are a couple of different ways to install and load packages. It depends on whether you need to save and re-use your code later or if you’re running a quick analysis.
The easiest way to manage R packages is through RStudio’s user interface. This is better for quick analysis that you don’t need to save.
The RStudio packages tab on the bottom right pane neatly organizes and details your current packages:
You can use this tab to install and download a package.
To install a package, select the install button:
After that, type in the name of your package. In the example below, I type in “dplyr” to install the dplyr
package.
You will now see this package show up in the packages tab in the bottom right pane of RStudio:
This doesn’t make the dplyr
package available for us to use though. We still have to load it.
This is where RStudio makes things easy. All you have to do is click the little check box next to dplyr to load it.
And now your package is loaded!
7.4 How to Install and Load R Packages - The Old Fashioned Way
Sometimes it’s better to hand-type code and that also applies to package management. That’s what I call the “old fashioned way.” You may choose this approach because you find it faster than scrolling through the user-interface. Or you might be writing a script that will be used again later on.
There are two key functions you need to remember to install and load a package this way:
install.packages()
installs the package from CRAN onto your computer. library()
will load it into your current R session.
Oddly enough, there’s a difference in notation between the two. The install.packages()
function requires you to put quotations “” around the package name. library()
does not.
To see what I mean, look at the example down below:
install.packages("dplyr")
library(dplyr)
Notice how the quotations marks are used in the first function? This is required for install.packages()
. The library()
function does not require it, but you can use quotation marks and it’ll still execute.
7.5 Make It Easy for the Next Person to Use the Required Packages
If you share your R script with a colleague, they may not have all packages needed on their local computer to execute it.
To take care of this, you can include both the install.packages()
and library()
functions at the top of the script that you send them.
If you’re the only person who will open that script in the future, you only need to include the library()
function. Since you’ve already installed the package before, there’s no need to do it again.
7.6 How to Find New Packages to Install
One of the not-so-best kept secrets about programmers is that we rely on google for the majority of what we do. Every time something doesn’t work, we just google the answer!
Chances are high that you’ll do the same as you program in R. Whenever you come across a website that documents a function, they’ll either specify the package at the start of the article or post it in the top right or left corner of the screen.
For example, I recently google’d “survival analysis in R.” Unlike regression analysis, R doesn’t have handy base functions to perform survival analysis. I found a couple of websites with information on how to do this in R. All of them required a package called survival, which they displayed in the top left or right of the website.
7.7 How to Find Documentation on Packages
Most packages you install will have documentation with it. Sadly, much of this documentation is not easy to read, but it’s still a great resource and I rely on it heavily.
To access this documentation, you can click on the hyperlinked package name in the Packages tab.
For example, try clicking on dplyr
’s link on your own RStudio screen to see what I mean:
This will take you to the Help tab and you can see documentation on all commands, functions, and data sets for a given package.
You can select any of these hyperlinks to view instructions for how to use a specific function from this package.
Down below, I select the mutate hyperlink:
And this will take you to the documentation page…
As you may have noticed, the documentation also lists the function’s package in the top left hand of the corner. This is useful as you look up functions later.
You can also use certain commands to pull up this documentation. The following will bring up a package’s documentation in the Help tab.
?dplyr()
And you can look up individual functions from the package.
?mutate()
7.8 Things to Remember
- Packages are what allows you to adapt R programming to meet your needs
- You can install and load packages using the Packages tab in RStudio or the
install.packages()
andlibrary()
commands - Whenever you research new functions on the internet, you will often see the package required in the top left or right hand corner
- You can research packages using the Help tab in RStudio