Tidyverse
Collection of R packages
From Wikipedia, the free encyclopedia
The tidyverse is a collection of open source packages for the R programming language introduced by Hadley Wickham[3] and his team that "share an underlying design philosophy, grammar, and data structures" of tidy data.[4] Characteristic features of tidyverse packages include extensive use of non-standard evaluation and encouraging piping.[5][6][7]
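The two characteristic features mentioned above, piping and non-standard evaluation, can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses the `mtcars` dataset that ships with R; the variable name `result` is chosen here purely for illustration.

```r
library(dplyr)

# Piping: each step passes its result to the next via %>%.
# Non-standard evaluation: cyl, gear and mpg are written as bare
# column names rather than as strings or mtcars$cyl.
result <- mtcars %>%
  filter(cyl == 4) %>%            # keep only four-cylinder cars
  group_by(gear) %>%              # one group per number of gears
  summarise(
    mean_mpg = mean(mpg),         # average fuel economy per group
    n = n()                       # number of cars per group
  )

print(result)
```

In base R the same filter would typically be written with explicit references such as `mtcars[mtcars$cyl == 4, ]`, which shows what the bare column names are standing in for.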
As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the top 10 most downloaded R packages.[8] The tidyverse is the subject of multiple books and papers.[9][10][11][12] In 2019, a description of the ecosystem was published in the Journal of Open Source Software.[13]
Its syntax has been referred to as "supremely readable",[14] and some[15] have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it lets students begin doing data processing tasks quickly.[16][15] Some practitioners have also pointed out that data processing tasks are intuitively easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package.[17] There is also an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC),[18] releases varied real-world datasets each week so that community members can participate, share, and practice working with data.[19] Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in base R equivalents and that are too dissimilar to some programming languages.[20][21]
More generally, the tidyverse principles encourage a universe of streamlined packages, which in principle helps alleviate dependency issues and maintain compatibility with current and future features.[22] An example of an approach built on these principles is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.[23]
Packages
The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:[24]
- ggplot2 – for data visualization
- dplyr – for wrangling and transforming data
- tidyr – helps transform data into tidy data, where each variable is a column, each observation is a row, and each value is a cell
- readr – helps read in common delimited text files containing data
- purrr – a functional programming toolkit
- tibble – a modern implementation of the built-in data frame data structure
- stringr – helps to manipulate string data types
- forcats – helps to manipulate categorical data types (factors)
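As a rough illustration of the tidy data layout described in the tidyr entry above, the following sketch reshapes a small table so that each variable is a column and each observation is a row. It assumes the tibble and tidyr packages are installed; the table `wide` and its values are hypothetical and invented only for this example.

```r
library(tibble)
library(tidyr)

# Hypothetical "wide" data: the year variable is spread across column names.
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# pivot_longer() gathers the year columns into two columns,
# giving one row per country-year observation.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "value"
)

print(long)
```

After reshaping, `long` has the columns `country`, `year` and `value`, which is the form that other core packages such as dplyr and ggplot2 are designed to work with.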
Additional packages assist the core collection.[25] Other packages based on the tidy data principles are regularly developed, such as tidytext[26] for text analysis, tidymodels[27] for machine learning, or tidyquant[28] for financial operations.
References