Get a list of the data sets in a particular package

R

R Problem Overview


I would like to list, in the console, all the data sets in a particular R package. I know that calling data() with no arguments lists the data sets in every attached package (those on the search path), but that is not what I want: I need the list for one specific package. The following attempt does not work.

data()
data('arules')
# Warning message:
# In data("arules") : data set ‘arules’ not found

I would also like to get the dimensions (dim) of every data set in that package.

R Solutions


Solution 1 - R

There's some good info on this in the details section of help(data). Here are the basics, using the plyr package as an example. For starters, let's see what's available from data().

names(data())
#[1] "title"   "header"  "results" "footer" 

Inspecting those elements shows what they hold; results, for instance, is a character matrix with the columns Package, LibPath, Item, and Title. Next, we can pass the package argument to data() and subset the result to get the names of the data sets in the package.

d <- data(package = "plyr")
## names of data sets in the package
d$results[, "Item"]
# [1] "baseball" "ozone"   
## assign it to use later
nm <- d$results[, "Item"]
## call the promised data
data(list = nm, package = "plyr")
## get the dimensions of each data set
lapply(mget(nm), dim)
# $baseball
# [1] 21699    22
#
# $ozone
# [1] 24 24 72
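
The steps above load the data sets into the global environment. As a rough sketch (not part of the original answer), they can be wrapped in a helper that loads into a throwaway environment instead. The name dataset_dims is hypothetical, and the example uses the built-in datasets package so that it runs without plyr installed. It relies on the convention described in help(data) that an item printed as "beaver1 (beavers)" is an object retrieved by data(beavers).

```r
## Hypothetical helper: dim() of every data set in one package
dataset_dims <- function(pkg) {
  items <- data(package = pkg)$results[, "Item"]
  ## an item like "beaver1 (beavers)" means object beaver1 is loaded by data(beavers)
  objs  <- sub(" \\(.*\\)$", "", items)          # object names
  files <- sub("^.*\\((.*)\\)$", "\\1", items)   # data file names; unchanged without parentheses
  env <- new.env()                               # keeps the global environment clean
  data(list = unique(files), package = pkg, envir = env)
  lapply(mget(objs, envir = env), dim)
}

dims <- dataset_dims("datasets")
dims[["mtcars"]]
# [1] 32 11
```

Objects that are plain vectors, such as state.abb, have no dim attribute, so their entry in the result is NULL.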

Edit/Update: To find the names of the data sets in all installed packages, you can use the following. .packages(TRUE) returns the names of all packages available in the library paths given by lib.loc (by default, those in .libPaths()). Because the data sets that used to live in base and stats have been moved to the datasets package, data() warns when given either name, so we drop both with setdiff().

## names of all packages sans base and stats
pkgs <- setdiff(.packages(TRUE), c("base", "stats"))
## get the names of all the data sets
dsets <- data(package = pkgs)$results[, "Item"]
## look at the first few in our result
head(dsets)
# [1] "AirPassengers"          "BJsales"                "BJsales.lead (BJsales)"
# [4] "BOD"                    "CO2"                    "ChickWeight"   
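
The vector of item names alone does not say which package each data set comes from. A minimal sketch that keeps the package name alongside each item, assuming the results matrix carries the columns Package, Item, and Title (check colnames(data()$results) on your installation); res and all_sets are illustrative names:

```r
## all installed packages except base and stats
pkgs <- setdiff(.packages(all.available = TRUE), c("base", "stats"))
res  <- data(package = pkgs)$results
## one row per data set, with the owning package kept alongside
all_sets <- as.data.frame(res[, c("Package", "Item", "Title"), drop = FALSE],
                          stringsAsFactors = FALSE)
## data sets that belong to one package of interest
head(all_sets$Item[all_sets$Package == "datasets"])
## number of data sets per package, largest first
head(sort(table(all_sets$Package), decreasing = TRUE))
```

The output depends on which packages are installed, so none is shown here.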

Solution 2 - R

The vcdExtra package has a function datasets for just this purpose. It returns a data frame containing the name, class, dimensions, and title of each data set found in a package.

> vcdExtra::datasets("plyr")
      Item      class      dim                                                        Title
1 baseball data.frame 21699x22 Yearly batting records for all major league baseball players
2    ozone      array 24x24x72             Monthly ozone measurements over Central America.
>

It works with several package names also:

> vcdExtra::datasets(c("plyr", "dplyr"))
  Package     Item      class      dim
1    plyr baseball data.frame 21699x22
2    plyr    ozone      array 24x24x72
3   dplyr     nasa   tbl_cube  41472x4
                                                         Title
1 Yearly batting records for all major league baseball players
2             Monthly ozone measurements over Central America.
3                                    NASA spatio-temporal data
>

Solution 3 - R

If you are using RStudio and have attached the package with library(), you can switch the Environment pane from "Global Environment" to that package using the pane's drop-down menu. The pane then lists the objects the package makes available, including its lazy-loaded data sets.
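
A console counterpart to browsing the Environment pane, sketched here with the built-in datasets package: ls() on an attached package lists its visible objects, and dropping the functions leaves the data. This only finds data sets that the package lazy-loads, which is an assumption about the package in question; objs and is_fun are illustrative names.

```r
library(datasets)                  # attached by default; shown for clarity
objs <- ls("package:datasets")     # everything the attached package exposes
## flag the functions so that only data objects remain
is_fun <- vapply(objs,
                 function(x) is.function(get(x, pos = "package:datasets")),
                 logical(1))
head(objs[!is_fun])                # non-function objects, i.e. the data sets
```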

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type    Original Author  Original Content on Stack Overflow
Question        S Das            View Question on Stack Overflow
Solution 1 - R  Rich Scriven     View Answer on Stack Overflow
Solution 2 - R  user101089       View Answer on Stack Overflow
Solution 3 - R  cloudscomputes   View Answer on Stack Overflow