R (programming language)
Programming language for statistics From Wikipedia, the free encyclopedia
R is a programming language for statistical computing and data visualization. It has been adopted in the fields of data mining, bioinformatics and data analysis.[9]
The core R language is augmented by a large number of extension packages, containing reusable code, documentation, and sample data.
R software is open-source and free software. R is a GNU Project and licensed under the GNU General Public License.[3][10] It is written primarily in C, Fortran, and R itself. Precompiled executables are provided for various operating systems.
As an interpreted language, R has a native command line interface. Moreover, multiple third-party graphical user interfaces are available, such as RStudio—an integrated development environment—and Jupyter—a notebook interface.
History


R was started by professors Ross Ihaka and Robert Gentleman as a programming language to teach introductory statistics at the University of Auckland.[11] The language was inspired by the S programming language, with most S programs able to run unaltered in R.[6] The language was also inspired by Scheme's lexical scoping, allowing for local variables.[1]
The name of the language, R, reflects both its status as a successor to the S language and the shared first letter of the authors' names, Ross and Robert.[12] In August 1993, Ihaka and Gentleman posted a binary of R on StatLib, a data archive website.[13] At the same time, they announced the posting on the s-news mailing list.[14] On 5 December 1997, R became a GNU project when version 0.60 was released.[15] On 29 February 2000, version 1.0 was released.[16]
Packages

R packages are collections of functions, documentation, and data that expand R.[17] For example, packages add report features such as RMarkdown, Quarto,[18] knitr and Sweave. Packages also add the capability to implement various statistical techniques such as linear, generalized linear and nonlinear modeling, classical statistical tests, spatial analysis, time-series analysis, and clustering. Easy package installation and use have contributed to the language's adoption in data science.[19]
Base packages are immediately available when starting R and provide the necessary syntax and commands for programming, computing, graphics production, basic arithmetic, and statistical functionality.[20]
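As a rough illustration of what "immediately available" means, the sketch below lists the packages installed with base priority and calls functions from two of them without any library() call. The exact list depends on the R installation, and the variable names are chosen here only for illustration.

```r
# Packages installed with priority "base" (they ship with R itself)
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Functions from the stats and utils packages work without library()
m <- median(c(3, 1, 2))   # median() comes from stats
h <- head(letters, 3)     # head() comes from utils
```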
The Comprehensive R Archive Network (CRAN) was founded in 1997 by Kurt Hornik and Friedrich Leisch to host R's source code, executable files, documentation, and user-created packages.[21] Its name and scope mimic the Comprehensive TeX Archive Network and the Comprehensive Perl Archive Network.[21] CRAN originally had three mirrors and 12 contributed packages.[22] As of 16 October 2024, it has 99 mirrors[23] and 21,513 contributed packages.[24] Packages are also available on repositories R-Forge, Omegahat, and GitHub.[25][26][27]
The Task Views on the CRAN web site list packages in fields such as causal inference, finance, genetics, high-performance computing, machine learning, medical imaging, meta-analysis, social sciences, and spatial statistics.
The Bioconductor project provides packages for genomic data analysis, complementary DNA, microarray, and high-throughput sequencing methods.
The tidyverse package bundles several subsidiary packages that provide a common interface for tasks related to accessing and processing "tidy data",[28] data contained in a two-dimensional table with a single row for each observation and a single column for each variable.[29]
Installing a package occurs only once. For example, to install the tidyverse package:[29]
> install.packages("tidyverse")
To load the functions, data, and documentation of a package, one executes the library() function. To load tidyverse:[a]
> # Package name can be enclosed in quotes
> library("tidyverse")
> # But also the package name can be called without quotes
> library(tidyverse)
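Because installation is needed only once, scripts often check whether a package is present before installing it. The following is a minimal sketch of that pattern, not something prescribed by the sources cited above; ensure_package is a hypothetical helper name, and "stats" is used only because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is not already available
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" is already installed, so install.packages() is never reached
ok <- ensure_package("stats")

# A single function can also be called without attaching its package,
# using the pkg::name notation
s <- stats::sd(c(2, 4, 4, 4, 5, 5, 7, 9))
```

The pkg::name form avoids the namespace conflicts that attaching many packages with library() can cause.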
Interfaces
R comes installed with a command line console. Available for installation are various integrated development environments (IDE). IDEs for R include R.app[30] (OSX/macOS only), Rattle GUI, R Commander, RKWard, RStudio, and Tinn-R.[31]
General purpose IDEs that support R include Eclipse via the StatET plugin and Visual Studio via R Tools for Visual Studio.
Editors that support R include Emacs, Vim via the Nvim-R plugin, Kate, LyX via Sweave, WinEdt, and Jupyter.
Scripting languages that support R include Python, Perl, Ruby, F#, and Julia.
General purpose programming languages that support R include Java via the Rserve socket server, and .NET C#.
Statistical frameworks which use R in the background include Jamovi and JASP.[citation needed]
Community
The R Core Team was founded in 1997 to maintain the R source code. The R Foundation for Statistical Computing was founded in April 2003 to provide financial support. The R Consortium is a Linux Foundation project to develop R infrastructure.
The R Journal is an open access, academic journal which features short to medium-length articles on the use and development of R. It includes articles on packages, programming tips, CRAN news, and foundation news.
The R community hosts many conferences and in-person meetups, which are tracked in a community-maintained GitHub list. These groups include:
- UseR!: an annual international R user conference
- Directions in Statistical Computing (DSC)
- R-Ladies: an organization to promote gender diversity in the R community
- SatRdays: R-focused conferences held on Saturdays
- R Conference
- posit::conf (formerly known as rstudio::conf)
Implementations
The main R implementation is written primarily in C, Fortran, and R itself. Other implementations include:
- pretty quick R (pqR), by Radford M. Neal, attempts to improve memory management.
- Renjin is an implementation of R for the Java Virtual Machine.
- CXXR and Riposte[32] are implementations of R written in C++.
- Oracle's FastR is an implementation of R, built on GraalVM.
- TIBCO Software, creator of S-PLUS, wrote TERR — an R implementation to integrate with Spotfire.[33]
Microsoft R Open (MRO) was an R implementation. As of 30 June 2021, Microsoft started to phase out MRO in favor of the CRAN distribution.[34]
Commercial support
Although R is an open-source project, some companies provide commercial support:
- Oracle provides commercial support for the Big Data Appliance, which integrates R into its other products.
- IBM provides commercial support for in-Hadoop execution of R.
Examples
Hello, World!
> print("Hello, World!")
[1] "Hello, World!"
Alternatively:
> cat("Hello, World!")
Hello, World!
Basic syntax
The following examples illustrate the basic syntax of the language and use of the command-line interface. (An expanded list of standard language features can be found in the R manual, "An Introduction to R".[35])
In R, the generally preferred assignment operator is an arrow made from two characters, <-, although = can be used in some cases.[36]
> x <- 1:6 # Create a numeric vector in the current environment
> y <- x^2 # Create vector based on the values in x.
> print(y) # Print the vector’s contents.
[1] 1 4 9 16 25 36
> z <- x + y # Create a new vector that is the sum of x and y
> z # Typing an object's name alone prints its contents to the console.
[1] 2 6 12 20 30 42
> z_matrix <- matrix(z, nrow = 3) # Create a new matrix that turns the vector z into a 3x2 matrix object
> z_matrix
     [,1] [,2]
[1,]    2   20
[2,]    6   30
[3,]   12   42
> 2 * t(z_matrix) - 2 # Transpose the matrix, multiply every element by 2, subtract 2 from each element in the matrix, and return the results to the terminal.
     [,1] [,2] [,3]
[1,]    2   10   22
[2,]   38   58   82
> new_df <- data.frame(t(z_matrix), row.names = c("A", "B")) # Create a new data.frame object that contains the data from a transposed z_matrix, with row names 'A' and 'B'
> names(new_df) <- c("X", "Y", "Z") # Set the column names of new_df as X, Y, and Z.
> print(new_df) # Print the current results.
   X  Y  Z
A  2  6 12
B 20 30 42
> new_df$Z # Output the Z column
[1] 12 42
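> all(new_df$Z == new_df['Z']) && all(new_df[3] == new_df$Z) # The data.frame column Z can be accessed using $Z, ['Z'], or [3] syntax and the values are the same. all() reduces each element-wise comparison to a single value, which && requires since R 4.3.0.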
[1] TRUE
> attributes(new_df) # Print attributes information about the new_df object
$names
[1] "X" "Y" "Z"
$row.names
[1] "A" "B"
$class
[1] "data.frame"
> attributes(new_df)$row.names <- c("one", "two") # Access and then change the row.names attribute; can also be done using rownames()
> new_df
     X  Y  Z
one  2  6 12
two 20 30 42
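The remark above that = can replace <- only "in some cases" can be illustrated with a small sketch. At top level the two operators behave alike, but inside a function call = names an argument while <- still performs an assignment. The function sum_sq and the variable names are hypothetical and exist only for this example.

```r
# At top level, both operators assign
a <- 10
b = 20

# Hypothetical function with one parameter called `vals`
sum_sq <- function(vals) sum(vals^2)

# Inside a call, `=` matches the argument by name; no variable is created
r1 <- sum_sq(vals = 1:3)
created_by_equals <- exists("vals")   # FALSE in a fresh session

# Inside a call, `<-` assigns in the calling environment and passes the value
r2 <- sum_sq(w <- 1:3)
created_by_arrow <- exists("w")
```

Both calls return the same value; only the second leaves a new variable, w, behind.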
Structure of a function
One of R's strengths is the ease of creating new functions.[37] Objects in the function body remain local to the function, and any data type may be returned. In R, almost all functions and all user-defined functions are closures.[38]
Create a function:
# The input parameters are x and y.
# The function returns a linear combination of x and y.
f <- function(x, y) {
  z <- 3 * x + 4 * y
  # An explicit return() statement is optional, could be replaced with simply `z`.
  return(z)
}
# Alternatively, the last statement executed is implicitly returned.
f <- function(x, y) 3 * x + 4 * y
Usage output:
> f(1, 2)
[1] 11
> f(c(1, 2, 3), c(5, 3, 4))
[1] 23 18 25
> f(1:3, 4)
[1] 19 22 25
It is possible to define functions to be used as infix operators with the special syntax `%name%`, where "name" is the function variable name:
> `%sumx2y2%` <- function(e1, e2) {e1 ^ 2 + e2 ^ 2}
> 1:3 %sumx2y2% -(1:3)
[1] 2 8 18
Since version 4.1.0, functions can be written in a short notation, which is useful for passing anonymous functions to higher-order functions:[39]
> sapply(1:5, \(i) i^2) # here \(i) is the same as function(i)
[1] 1 4 9 16 25
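The statement that user-defined functions are closures means that a function keeps a reference to the environment in which it was created. The sketch below, with the hypothetical name make_counter, shows one consequence: each returned function carries its own private count variable. It also checks the "almost all" qualifier, since a few functions such as sum are primitives rather than closures.

```r
# Hypothetical factory: returns a function that remembers how often it ran
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # modify `count` in the enclosing environment
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()   # first call on counter_a
counter_a()   # second call on counter_a
counter_b()   # counter_b has its own, independent count

typeof(make_counter)   # "closure"
typeof(sum)            # "builtin", one of the primitive exceptions
```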
Native pipe operator
In R version 4.1.0, a native pipe operator, |>, was introduced.[40] This operator allows users to chain functions together one after another, instead of a nested function call.
> nrow(subset(mtcars, cyl == 4)) # Nested without the pipe character
[1] 11
> mtcars |> subset(cyl == 4) |> nrow() # Using the pipe character
[1] 11
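The native pipe is a purely syntactic transformation: the parser rewrites a piped expression into the equivalent nested call before anything is evaluated. The following sketch, which requires R 4.1.0 or later, compares the two unevaluated expressions; the names piped and nested are chosen only for this example.

```r
# quote() captures each expression without evaluating it
piped  <- quote(mtcars |> subset(cyl == 4) |> nrow())
nested <- quote(nrow(subset(mtcars, cyl == 4)))

# The parser has already turned the piped form into the nested call
print(piped)
same_call <- identical(piped, nested)
```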
Another alternative to nested functions, in contrast to using the pipe character, is using intermediate objects:
> mtcars_subset_rows <- subset(mtcars, cyl == 4)
> num_mtcars_subset <- nrow(mtcars_subset_rows)
> print(num_mtcars_subset)
[1] 11
While the pipe operator can produce code that is easier to read, it has been advised to pipe together at most 10 to 15 lines and chunk code into sub-tasks which are saved into objects with meaningful names.[41] Here is an example with fewer than 10 lines that some readers may still struggle to grasp without intermediate named steps:
(\(x, n = 42, key = c(letters, LETTERS, " ", ":", ")"))
  strsplit(x, "")[[1]] |>
    (Vectorize(\(chr) which(chr == key) - 1))() |>
    (`+`)(n) |>
    (`%%`)(length(key)) |>
    (\(i) key[i + 1])() |>
    paste(collapse = "")
)("duvFkvFksnvEyLkHAErnqnoyr")
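For comparison, the same computation can be sketched with intermediate named steps, following the advice above. The function name decode_shift and the variable names are hypothetical, and match() is used here in place of the vectorized which() call; the two are equivalent because every entry in key is unique.

```r
# Hypothetical rewrite of the pipeline above using named intermediate steps
decode_shift <- function(x, n = 42,
                         key = c(letters, LETTERS, " ", ":", ")")) {
  chars   <- strsplit(x, "")[[1]]        # split the string into characters
  idx     <- match(chars, key) - 1       # zero-based position of each one in key
  shifted <- (idx + n) %% length(key)    # shift by n, wrapping around the key
  paste(key[shifted + 1], collapse = "") # map back to characters and rejoin
}

message_text <- decode_shift("duvFkvFksnvEyLkHAErnqnoyr")
print(message_text)
```

Each step can now be inspected on its own, which makes the shift cipher being applied easier to recognize.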
Object-oriented programming
The R language has native support for object-oriented programming. There are two native frameworks, the so-called S3 and S4 systems. The former, being more informal, supports single dispatch on the first argument, and objects are assigned to a class by just setting a "class" attribute in each object. The latter is a Common Lisp Object System (CLOS)-like system of formal classes (also derived from S) and generic methods that supports multiple dispatch and multiple inheritance.[42]
In the example, summary is a generic function that dispatches to different methods depending on whether its argument is a character vector or a "factor":
> data <- c("a", "b", "c", "a", NA)
> summary(data)
   Length     Class      Mode
        5 character character
> summary(as.factor(data))
   a    b    c NA's
   2    1    1    1
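The two systems described above can be sketched side by side. All class, generic, and object names below (area, circle, rectangle, Metres, Feet, add_lengths) are hypothetical and serve only to show single dispatch in S3 and dispatch on two arguments in S4.

```r
# S3: the class is only an attribute; dispatch uses the first argument
area <- function(shape, ...) UseMethod("area")
area.circle    <- function(shape, ...) pi * shape$r^2
area.rectangle <- function(shape, ...) shape$w * shape$h

circle_obj <- structure(list(r = 1), class = "circle")
rect_obj   <- structure(list(w = 2, h = 3), class = "rectangle")
area(circle_obj)
area(rect_obj)

# S4: formal class definitions; methods are chosen from both argument classes
setClass("Metres", representation(value = "numeric"))
setClass("Feet",   representation(value = "numeric"))
setGeneric("add_lengths", function(a, b) standardGeneric("add_lengths"))
setMethod("add_lengths", signature("Metres", "Metres"),
          function(a, b) new("Metres", value = a@value + b@value))
setMethod("add_lengths", signature("Metres", "Feet"),
          function(a, b) new("Metres", value = a@value + 0.3048 * b@value))

total <- add_lengths(new("Metres", value = 1), new("Feet", value = 10))
total@value
```

In the S4 call, the method is selected by the combination of classes of a and b, which is what multiple dispatch refers to.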
Modeling and plotting

The R language has built-in support for data modeling and graphics. The following example shows how R can generate and plot a linear model with residuals.
# Create x and y values
x <- 1:6
y <- x^2
# Linear regression model y = A + B * x
model <- lm(y ~ x)
# Display an in-depth summary of the model
summary(model)
# Create a 2 by 2 layout for figures
par(mfrow = c(2, 2))
# Output diagnostic plots of the model
plot(model)
Output:
Residuals:
      1       2       3       4       5       6
 3.3333 -0.6667 -2.6667 -2.6667 -0.6667  3.3333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -9.3333     2.8441  -3.282 0.030453 *
x             7.0000     0.7303   9.585 0.000662 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared:  0.9583, Adjusted R-squared:  0.9478
F-statistic: 91.88 on 1 and 4 DF,  p-value: 0.000662
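The coefficient estimates in the output can be checked by hand. For a simple linear regression, the least-squares slope is cov(x, y) / var(x) and the intercept is mean(y) - slope * mean(x). The sketch below recomputes both and compares them with lm(); the variable names slope, intercept, and pred_7 are chosen only for this example.

```r
x <- 1:6
y <- x^2
model <- lm(y ~ x)

# Closed-form least-squares estimates for one predictor
slope     <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)

c(intercept, slope)   # computed by hand
coef(model)           # computed by lm()

# Use the fitted model to predict y at a new x value
pred_7 <- predict(model, newdata = data.frame(x = 7))
```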
Mandelbrot set

This Mandelbrot set example highlights the use of complex numbers. It models the first 20 iterations of the equation z = z^2 + c, where c represents different complex constants.
Install the package that provides the write.gif() function beforehand:
install.packages("caTools")
R Source code:
library(caTools)
jet.colors <-
  colorRampPalette(
    c("green", "pink", "#007FFF", "cyan", "#7FFF7F",
      "white", "#FF7F00", "red", "#7F0000"))
dx <- 1500 # define width
dy <- 1400 # define height
C <-
  complex(
    real = rep(seq(-2.2, 1.0, length.out = dx), each = dy),
    imag = rep(seq(-1.2, 1.2, length.out = dy), times = dx)
  )
# reshape as matrix of complex numbers
C <- matrix(C, dy, dx)
# initialize output 3D array
X <- array(0, c(dy, dx, 20))
Z <- 0
# loop with 20 iterations
for (k in 1:20) {
  # the central difference equation
  Z <- Z^2 + C
  # capture the results
  X[, , k] <- exp(-abs(Z))
}
write.gif(
  X,
  "Mandelbrot.gif",
  col = jet.colors,
  delay = 100)
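The same iteration can be tried on single points without any extra package. The sketch below uses the standard escape criterion (a point is outside the set once |z| exceeds 2), which is an addition for illustration and not part of the script above; the function name escapes and its parameters are hypothetical.

```r
# Hypothetical helper: does z = z^2 + c0 leave the disc |z| <= 2
# within n_iter iterations, starting from z = 0?
escapes <- function(c0, n_iter = 20) {
  z <- 0 + 0i
  for (k in seq_len(n_iter)) {
    z <- z^2 + c0
    if (Mod(z) > 2) {
      return(TRUE)
    }
  }
  FALSE
}

escapes(0 + 0i)    # z stays at 0, so the point does not escape
escapes(1 + 0i)    # z runs 1, 2, 5, ... and escapes
escapes(-1 + 0i)   # z alternates between -1 and 0 and stays bounded
escapes(0 + 1i)    # z cycles between -1+i and -i and stays bounded
```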
Version names

All R version releases from 2.14.0 onward have codenames that make reference to Peanuts comics and films.[43][44][45]
In 2018, core R developer Peter Dalgaard presented a history of R releases since 1997.[46] Some notable early releases before the named releases include:
- Version 1.0.0, released on 29 February 2000, a leap day
- Version 2.0.0, released on 4 October 2004, "which at least had a nice ring to it"[46]
The idea of naming R version releases was inspired by the Debian and Ubuntu version naming system. Dalgaard also noted that another reason for the use of Peanuts references for R codenames is that "everyone in statistics is a P-nut".[46]
See also
Notes
- This displays to standard error a listing of all the packages that tidyverse depends upon. It may also display warnings showing namespace conflicts, which may typically be ignored.
References
Further reading
External links