<- 81 object_1
3 R objects, functions and packages
3.1 Objects
One of the main advantages to using R over other software packages such as SPSS is that more than one dataset can be accessed at the same time. A collection of data stored in any format within the R session is known as an object. Objects can include single numbers, single variables, entire datasets, lists of datasets, or even tables and graphs.
Object names should only contain lower case letters, numbers and _
(instead of a space to separate words). They should be meaningful and concise.
Objects are defined in R using the <-
symbol or =
. For example,
Creates an object in the environment named object_1
, which takes the value 81
. This will appear in the environment window of the console (window C from the interface shown in the introduction).
Although both work, use <-
for assignment, not =
.
To retrieve an object, type its name into the script or console and run it. This object can then be included in functions or operations in place of the value assigned to it:
object_1
[1] 81
sqrt(object_1)
[1] 9
R has some mathematical objects stored by default such as pi that can be used in calculations.
pi
[1] 3.141593
3.2 Functions
Functions are built-in commands that allow R users to run analyses. All functions require the definition of arguments within round brackets ()
. Each function requires different information and has different arguments that can be used to customise the analysis. A detailed list of these arguments and a description of the function can be found in the function’s associated help file.
3.2.1 Help files
Each function that exists within R has an associated help file. RStudio does not require an internet connection to access these help files if the function is available in the current session of R.
To retrieve help files, enter ?
followed by the function name into the console window, e.g ?mean
. The help file will appear in window D of the interface shown in the introduction.
Help files contain the following information:
- Description: what the function is used for
- Usage: how the function is used
- Arguments: required and optional arguments entered into round brackets necessary for the function to work
- Details: relevant details about the function in question
- References
- See also: links to other relevant functions
- Examples: example code with applications of the function
3.2.2 Error and warning messages
Where a function or object has not been correctly specified, or their is some mistake in the syntax that has been sent to the console, R will return an error message. These messages are generally informative and include the location of the error.
The most common errors include misspelling functions or objects:
sqrt(ojbect_1)
Error in eval(expr, envir, enclos): object 'ojbect_1' not found
Sqrt(object_1)
Error in Sqrt(object_1): could not find function "Sqrt"
Or where an object has not yet been specified:
plot(x, y)
Error in eval(expr, envir, enclos): object 'x' not found
When R returns an error message, this means that the operation has been completely halted. R may also return warning messages which look similar to errors but does not necessarily mean the operation has been stopped.
Warnings are included to indicate that R suspects something in the operation may be wrong and should be checked. There are occasions where warnings can be ignored but this is only after the operation has been checked.
3.2.3 Cleaning the environment
To remove objects from the RStudio environment, we can use the rm
function. This can be combined with the ls()
function, which lists all objects in the environment, to remove all objects currently loaded:
rm(list = ls())
There are no undo and redo buttons for R syntax. The rm
function will permanently delete objects from the environment. The only way to reverse this is to re-run the code that created the objects originally from the script file.
3.3 Packages
R packages are a collection of functions and datasets developed by R users that expand existing R capabilities or add completely new ones. Packages allow users to apply the most up-to-date methods shortly after they are developed, unlike other statistical software packages that require an entirely new version.
3.3.1 Installing packages from CRAN
The quickest way to install a package in R is by using the install.packages
function. This sends RStudio to the online repository of tested and verified R packages (known as CRAN) and downloads the package files onto the machine you are currently working from in temporary files. Ensure that the package you wish to install is spelled correctly and surrounded by ''
.
The install.packages
function requires an internet connection, and can take a long time if the package has a lot of dependent packages that also need downloading.
This process should only be carried out the first time a package is used on a machine, or when a substantial update has taken place, to download the latest version of the package.
3.3.2 Loading packages to an R session
Every time a new session of RStudio is opened, packages must be reloaded. To load a package into R (and gain access to the associated functions and data), use the library
function.
Loading a package does not require an internet connection, but will only work if the package has already been installed and saved onto the computer you are working from. If you are unsure, use the function installed.packages
to return a list of all packages that are loaded onto the machine you are working from.
Begin any script file that requires packages by loading them into the current session. This ensures that there will be no error messages from functions that are not available in the current session.
3.3.3 The pacman package
The pacman
package is a set of package management functions which is designed to make tasks such as installing and loading packages simpler, and speeds up these processes. There are lots of useful functions included in this package, but the one that we will be using in this course is p_load
.
p_load
acts as a wrapper for the library
function. It first checks the computer to see whether the package(s) listed is installed. If they are, p_load
loads the package(s) into the current RStudio session. If not, it attempts to install the package(s) from the CRAN repository.
If you have never used the pacman
package before, run the following code to ensure that it is installed on your machine:
install.packages('pacman')