Tidyverse
Collection of R packages
The tidyverse is a collection of open-source packages for the R programming language, introduced by Hadley Wickham[1] and his team, that "share an underlying design philosophy, grammar, and data structures" of tidy data.[2] Characteristic features of tidyverse packages include extensive use of non-standard evaluation and the encouragement of piping.[3][4][5]
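The two characteristic features named above can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses the `mtcars` dataset that ships with R; the dataset, the columns, and the variable name `result` are illustrative choices, not taken from the cited sources.

```r
library(dplyr)

# Piping: each step's output is passed as the first argument of the next step.
# Non-standard evaluation: cyl, gear and mpg are written as bare (unquoted)
# names and are looked up as columns of the data frame, not as R variables.
result <- mtcars %>%
  filter(cyl == 4) %>%                              # keep four-cylinder cars
  group_by(gear) %>%                                # one group per gear count
  summarise(mean_mpg = mean(mpg), n = n()) %>%      # per-group summary
  arrange(desc(mean_mpg))                           # highest mean first

print(result)
```

Read top to bottom, the chain mirrors the order in which the operations are applied, which is the style the piping idiom is meant to encourage.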
| Repository | github |
|---|---|
| Written in | R |
| Type | Package collection |
| License | MIT |
| Website | www |
As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the 10 most downloaded R packages.[6] The tidyverse is the subject of multiple books and papers.[7][8][9][10] In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software.[11]
Its syntax has been described as "supremely readable",[12] and some[13] have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it allows students to begin data processing tasks quickly.[14][13] Some practitioners have also pointed out that data processing tasks are more intuitive to chain together with the tidyverse than with pandas, the equivalent data processing package for Python.[15] There is an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC),[16] releases a varied real-world dataset each week so that community members can participate, share, practice, and learn to work with data more easily.[17] Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in base R equivalents and that are too dissimilar to some programming languages.[18][19]
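To make the comparison with base R concrete, the following sketch performs the same summary in both styles. It assumes dplyr is installed and again uses the built-in `mtcars` dataset; the task and the names `tidy_way` and `base_way` are hypothetical, chosen only for illustration, and the example takes no position on which style is easier to learn.

```r
library(dplyr)

# tidyverse style: a left-to-right chain of verbs
tidy_way <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

# base R style: subset with logical indexing, then aggregate with a formula
base_way <- aggregate(mpg ~ cyl, data = mtcars[mtcars$hp > 100, ], FUN = mean)

# Compare the group means produced by the two approaches
print(all.equal(tidy_way$mean_mpg, base_way$mpg))
```

Both versions compute the mean of `mpg` per value of `cyl` for cars with more than 100 horsepower; they differ in how the steps are written, not in the result.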
More generally, the tidyverse principles encourage a universe of streamlined packages that, in principle, helps alleviate dependency issues and maintain compatibility with current and future features.[20] One example of this approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.[21]
Packages
The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:[22]
- ggplot2 – for data visualization
- dplyr – for wrangling and transforming data
- tidyr – helps transform data into tidy data, where each variable is a column, each observation is a row, and each value is a cell
- readr – helps read data from common delimited text files
- purrr – a functional programming toolkit
- tibble – a modern implementation of the built-in data frame data structure
- stringr – helps manipulate string data
- forcats – helps manipulate categorical data (factors)
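As a rough illustration of how several of the packages listed above fit together, the sketch below reshapes a small table into tidy form. It assumes tibble, tidyr, dplyr, purrr, stringr and forcats are installed; the data values and the names `wide`, `tidy` and `col_types` are made up for the example. ggplot2 and readr are left out to keep it short.

```r
library(tibble)
library(tidyr)
library(dplyr)
library(purrr)
library(stringr)
library(forcats)

# tibble: a small "wide" table (hypothetical data) with one column per year
wide <- tibble(
  country = c("fr", "de"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# tidyr: reshape so that each variable is a column and each observation a row
tidy <- wide %>%
  pivot_longer(cols = -country, names_to = "year", values_to = "cases") %>%
  mutate(
    year    = as.integer(year),          # column names arrive as strings
    country = str_to_upper(country),     # stringr: normalise the codes
    country = fct_inorder(country)       # forcats: levels in order of appearance
  )

# purrr: apply a function to every column, returning a named character vector
col_types <- map_chr(tidy, ~ class(.x)[1])

print(tidy)
print(col_types)
```

After the reshape, `year` and `cases` are ordinary columns, so each row describes one country in one year, which is the layout the tidy data definition above calls for.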
Additional packages assist the core collection.[23] Other packages based on the tidy data principles are regularly developed, such as tidytext[24] for text analysis, tidymodels[25] for machine learning, and tidyquant[26] for financial operations.
References