The issue is of human time rather than silicon chip time. Human time can be wasted by taking longer to write the code, and (often much more importantly) by taking more time to understand subsequently what it does.
– The R Inferno (Patrick Burns, 2011)
The goal of mutagen is to provide extensions to dplyr’s `mutate()`.
mutagen provides simple-to-use functions as alternatives to complex R idioms for variable generation. Some mutagen functions are specific to problems encountered in R (e.g., working with list-columns in a data frame), while others handle more generic data science operations and are inspired by the excellent set of `egen` (‘extensions to generate’) and `egenmore` functions in Stata.
Installation
You can install the development version of mutagen from GitHub with:
```r
# install.packages("devtools")
devtools::install_github("gvelasq/mutagen")
```
Usage
mutagen functions begin with the prefix `gen_*` and are designed to be used inside dplyr’s `mutate()`. A mnemonic for this is that to use mutagen, first mutate then generate.
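As a rough illustration of the "first mutate then generate" pattern, the sketch below defines a stand-in helper and calls it inside `mutate()`. The helper `row_mean_sketch()` and the `scores` data are hypothetical and are not part of mutagen; the real `gen_*` functions may take different arguments. The example assumes dplyr 1.1.0 or later (for `pick()`) and R 4.1 or later (for the `\(...)` lambda syntax).

```r
library(dplyr)
library(purrr)

# Hypothetical helper (not a mutagen function): row-wise mean over the
# columns of a data frame, ignoring missing values.
row_mean_sketch <- function(data) {
  pmap_dbl(data, \(...) mean(c(...), na.rm = TRUE))
}

# Made-up example data
scores <- tibble(
  id = 1:3,
  q1 = c(1, NA, 3),
  q2 = c(2, 4, NA)
)

# pick() passes the selected columns to the helper as a data frame,
# so the helper returns one value per row for mutate() to store.
result <- mutate(scores, q_mean = row_mean_sketch(pick(q1, q2)))
result$q_mean
```

Wrapping the idiom in a function keeps the `mutate()` call readable, which is the trade-off the opening quote describes.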
| mutagen function | R idiom | Stata function |
|---|---|---|
| `gen_na_listcol()` | `modify_tree(leaf = \(x) replace(x, is.null(x), NA))` | N/A |
| `gen_percent()` | `mutate(data, pct = col / sum(col) * 100, .by = group_cols)` | `egen pc` |
| `gen_rowany()` | `pmap_int(data, \(cols) any(list(cols) %in% values))` | `egen anymatch`, `egenmore rany` |
| `gen_rowcount()` | `pmap_int(data, \(cols) sum(list(cols) %in% values))` | `egen anycount` |
| `gen_rowfirst()` | `pmap_vec(data, \(cols) first(c(cols), na_rm = TRUE))` | `egen rowfirst` |
| `gen_rowlast()` | `pmap_vec(data, \(cols) last(c(cols), na_rm = TRUE))` | `egen rowlast` |
| `gen_rownth()` | `pmap_vec(data, \(cols) nth(c(cols), n, na_rm = TRUE))` | N/A |
| `gen_rowmax()` | `inject(pmax(!!!data, na.rm = TRUE))` | `egen rowmax` |
| `gen_rowmean()` | `pmap_dbl(data, \(cols) mean(c(cols), na.rm = TRUE))` | `egen rowmean` |
| `gen_rowmedian()` | `pmap_dbl(data, \(cols) median(c(cols), na.rm = TRUE))` | `egen rowmedian` |
| `gen_rowmin()` | `inject(pmin(!!!data, na.rm = TRUE))` | `egen rowmin` |
| `gen_rowmiss()` | `pmap_int(data, \(cols) sum(!complete.cases(c(cols))))` | `egen rowmiss` |
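
The idioms in the table are written schematically. The sketch below runs a few of them on a small made-up data frame to show what each computes. It uses only dplyr, purrr, and rlang, not mutagen itself; the `survey` data, column names, and match values are assumptions for illustration, and `\(...)` stands in for the table's schematic `cols` argument. It assumes dplyr 1.1.0 or later and purrr 1.0.0 or later.

```r
library(dplyr)
library(purrr)

# Made-up example data
survey <- tibble(
  region = c("east", "east", "west"),
  sales  = c(10, 30, 20),
  q1     = c(1, NA, 3),
  q2     = c(NA, NA, 9),
  q3     = c(3, 2, NA)
)

# Percent of the group total (compare gen_percent() / egen pc)
pct <- mutate(survey, pct = sales / sum(sales) * 100, .by = region)$pct

# Columns to summarise row by row
items <- select(survey, q1, q2, q3)

# 1 if any column in the row matches one of the values (compare gen_rowany())
row_any <- pmap_int(items, \(...) as.integer(any(c(...) %in% c(3, 9))))

# Number of columns in the row that match (compare gen_rowcount())
row_count <- pmap_int(items, \(...) sum(c(...) %in% c(3, 9)))

# First non-missing value in the row (compare gen_rowfirst())
row_first <- pmap_vec(items, \(...) first(c(...), na_rm = TRUE))

# Row-wise maximum, ignoring missing values (compare gen_rowmax())
row_max <- rlang::inject(pmax(!!!items, na.rm = TRUE))

# Row-wise mean, ignoring missing values (compare gen_rowmean())
row_mean <- pmap_dbl(items, \(...) mean(c(...), na.rm = TRUE))

# Number of missing values in the row (compare gen_rowmiss())
row_miss <- pmap_int(items, \(...) sum(is.na(c(...))))
```

Each idiom needs a different mapping function and a different way of handling missing values.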
Contributing
Please note that the mutagen project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Contributors
All contributions to this project are gratefully acknowledged using the `allcontributors` package following the all-contributors specification. Contributions of any kind are welcome!
Issues: delabj, thomascwells