Dataverse has some terminology that is worth quickly reviewing before showing how to work with Dataverse in R. Dataverse is an application that can be installed in many places. As a result, dataverse can work with any installation, but you need to specify which installation you want to work with. This can be set by default with an environment variable, DATAVERSE_SERVER:
library("dataverse")
Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")
You can search for and retrieve data without a Dataverse account for a specific Dataverse installation. For example, to search for data files or datasets that mention “ecological inference”, we can just do:
dataverse_search("ecological inference")[c("name", "type", "description")]
The search vignette describes this functionality in more detail. To retrieve a data file, we need to investigate the dataset being returned and look at what files it contains using a variety of functions, the last of which - get_file() - can retrieve the files as raw vectors:
get_dataset()
dataset_files()
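A minimal sketch of that sequence, using the Monogan replication archive cited in the retrieval vignette below; the object names are illustrative and the file name is a placeholder (use the file list returned by dataset_files() to find real ones):

```R
doi <- "doi:10.7910/DVN/ARKOTI"            # the Monogan replication archive
ds  <- get_dataset(doi)                    # dataset metadata, including its file list
dataset_files(doi)                         # just the list of files in the dataset
raw <- get_file("example-file.tab", dataset = doi)  # placeholder file name; returns a raw vector
```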
This vignette shows how to download data from Dataverse using the dataverse package. We’ll focus on a Dataverse repository that contains supplemental files for Political Analysis Using R, which is stored at Harvard University’s IQSS Dataverse Network:

Monogan, Jamie, 2015, “Political Analysis Using R: Example Code and Data, Plus Data for Practice Problems”, doi:10.7910/DVN/ARKOTI, Harvard Dataverse, V1, UNF:6:+itU9hcUJ8I9E0Kqv8HWHg==
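To follow along, one of the replication files in that dataset can be pulled down and written to disk roughly as in this sketch (the file name is a placeholder; list the dataset's files first to find the real ones):

```R
code_raw <- get_file("analysis.R", dataset = "doi:10.7910/DVN/ARKOTI")  # placeholder file name; raw vector
writeBin(code_raw, "analysis.R")   # write the raw bytes to a local file
```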
To reproduce the analysis, we can simply run the code file either as a system() call or directly in R using source() (note this particular file begins with an rm() call, so you may want to run it in a new environment):
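A sketch of what that might look like, assuming the code file was saved locally as analysis.R (a placeholder name, as above):

```R
e <- new.env()
source("analysis.R", local = e)   # run the script in its own environment, so its rm() call is contained
# or run it outside the current R session entirely:
# system("Rscript analysis.R")
```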
Any well-produced set of analysis reproduction files, like this one, should run without error once the data and code are in hand. Troubleshooting analysis files is beyond the scope of this vignette, but common sources of errors are
The main data archiving (or “deposit”) workflow for Dataverse is built on SWORD v2.0. This means that to create a new dataset listing, you will have to first initialize a dataset entry with some metadata, add one or more files to the dataset, and then publish it. This looks something like the following:
# retrieve your service document
d <- service_document()
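Continuing from the service document, the rest of the SWORD deposit might look something like this sketch; the dataverse name, metadata fields, and file are placeholders:

```R
# placeholder metadata for the draft dataset entry
metadata <- list(
  title       = "My Study",
  creator     = "Doe, Jane",
  description = "An example deposit for illustration only."
)
ds <- initiate_sword_dataset("mydataverse", body = metadata)  # create the draft listing

# add a file to the draft dataset
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, file = tmp)
add_file(ds, file = tmp)

# make the dataset public
publish_sword_dataset(ds)
```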
inst/CITATION

Thomas J. Leeper (). dataverse: R Client for Dataverse 4. R package version 0.3.0.

@Manual{,
  title = {dataverse: R Client for Dataverse 4},
  author = {Thomas J. Leeper},
  note = {R package version 0.3.0},
}
The dataverse package provides access to Dataverse 4 APIs, enabling data search, retrieval, and deposit, thus allowing R users to integrate public data sharing into the reproducible research workflow. dataverse is the next-generation iteration of the dvn package, which works with Dataverse 3 (“Dataverse Network”) applications. dataverse includes numerous improvements for data search, retrieval, and deposit, including use of the (currently in development) sword package for data deposit and the UNF package for data fingerprinting.
nlsw_tsv <- get_dataframe_by_name(
  filename = "nlsw88.tab",
  dataset  = "doi:10.70122/FK2/PPIAXE",
  server   = "demo.dataverse.org"
)
## Downloading ingested version of data with readr::read_tsv. To download the original version and remove this message, set original = TRUE.
##
## ── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────
## cols(
##   idcode = col_double(),

nlsw_original <- get_dataframe_by_name(
  filename = "nlsw88.tab",
  dataset  = "doi:10.70122/FK2/PPIAXE",
  .f       = haven::read_dta,
  original = TRUE,
  server   = "demo.dataverse.org"
)
Note that even though the file extension is “.tab”, we use haven::read_dta. Of course, when the dataset is not ingested (such as an Rds file), users always need to specify an .f argument for the specific file.
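For example, reading a non-ingested .rds file from the same demo dataset might look like the following sketch (the file name is illustrative):

```R
if (requireNamespace("readr", quietly = TRUE)) {
  df_rds <- get_dataframe_by_name(
    filename = "nlsw88_rds-export.rds",   # illustrative file name
    dataset  = "doi:10.70122/FK2/PPIAXE",
    server   = "demo.dataverse.org",
    original = TRUE,
    .f       = readr::read_rds
  )
}
```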
Note the difference between nlsw_tsv and nlsw_original. nlsw_original preserves data attributes such as value labels, whereas nlsw_tsv drops them (or leaves them only in the file-level metadata).
attr(nlsw_original$race, "labels") # original dta has value labels
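For comparison, the same attribute on the ingested version read with readr should be empty, since the tab-delimited archival file does not carry Stata value labels:

```R
attr(nlsw_tsv$race, "labels")     # NULL for the ingested .tab version
```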
Data Archiving
Dataverse provides two - basically unrelated - workflows for managing (adding, documenting, and publishing) datasets. The first is built on SWORD v2.0. This means that to create a new dataset listing, you will have to first initialize a dataset entry with some metadata, add one or more files to the dataset, and then publish it. This looks something like the following:
# retrieve your service document
d <- service_document()
Other Installations
Users interested in downloading metadata from archives other than Dataverse may be interested in Kurt Hornik’s OAIHarvester and Scott Chamberlain’s oai, which offer metadata download from any web repository that is compliant with the Open Archives Initiative standards. Additionally, rdryad uses OAIHarvester to interface with Dryad. The rfigshare package works in a similar spirit to dataverse with https://figshare.com/.
An object of class “dataverse_dataset”.
create_dataset creates a Dataverse dataset. In Dataverse, a “dataset” is the lowest-level structure in which to organize files. For example, a Dataverse dataset might contain the files used to reproduce a published article, including data, analysis code, and related materials. Datasets can be organized into “Dataverse” objects, which can be further nested within other Dataverses. For someone creating an archive, this would be the first step to producing said archive (after creating a Dataverse, if one does not already exist). Once files and metadata have been added, the dataset can be published (i.e., made public) using publish_dataset.
update_dataset updates a Dataverse dataset that has already been created using create_dataset. This creates a draft version of the dataset or modifies the current draft if one is already in-progress. It does not assign a new version number to the dataset nor does it make it publicly visible (which can be done with publish_dataset).
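A sketch of how these native-API functions might fit together; the target dataverse, API key, and metadata list are placeholders, and the exact metadata body follows the Dataverse native API:

```R
Sys.setenv("DATAVERSE_SERVER" = "demo.dataverse.org",
           "DATAVERSE_KEY"    = "examplekey12345")  # placeholder API token

# placeholder metadata; the native API expects dataset metadata in its own format
meta <- list(
  title       = "My Study",
  creator     = "Doe, Jane",
  description = "An example dataset for illustration only."
)

ds <- create_dataset("mydataverse", body = meta)  # creates a draft dataset
ds <- update_dataset(ds, body = meta)             # modifies the draft
publish_dataset(ds)                               # makes the dataset public
```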
Other Changes

- docs/pkgdown.yml and the rebuilt pkgdown pages (docs/articles/*, docs/reference/*, docs/news/index.html): regenerated site files, with last_built bumped from 2021-01-17T17:13Z to 2021-01-18T17:11Z.
- for-developers/developer-tasks.R: the commented devtools::build_win(version = "R-devel") reminder is replaced with devtools::check_win_devel().
- man/create_dataset.Rd: "publised" is corrected to "published" in the details of create_dataset.
- man/files.Rd: retitled from "Download File" to "Download dataverse file as a raw binary". The description now states up front that the get_file_* functions return a raw binary file, which cannot be readily analyzed in R, and points to the get_dataset_* functions (see get_dataset) for using the objects as dataframes; the details explain that get_file is a general wrapper accepting dataverse objects, file IDs, or a filename plus dataset, that all of the functions download files via get_file_by_id, that get_file_by_name is a shorthand taking a file name and dataset, and that get_file_by_doi obtains a file by its file DOI, bypassing the dataset argument.
- man/get_dataframe.Rd: retitled from "Get file from dataverse and convert it into a dataframe or tibble" to "Download dataverse file as a dataframe". The .f documentation now reads readr::read_rds (fixing misplaced code markup), the original argument is described as applying to the datafile rather than the dataset, and the description now says when to use each function: get_dataframe_by_name with a datafile name and dataset DOI, get_dataframe_by_doi with the DOI of the datafile itself, and get_dataframe_by_id with the numeric ID of the datafile (see the sketch after this list). The examples are restructured: objects are renamed (df_tab, df_stata_original, df_stata_ingested), an ingested-Stata example is added, and the slower Stata examples are wrapped in \dontrun{} because the whole example sometimes takes longer than 10 seconds when run on CRAN.
- tests/testthat/tests-get_dataframe-original-basketball.R: a standarize_string() helper is added that transliterates latin1 text to ASCII (mapping the mis-encoded roster entry to "Kukoc,SF"), the by-name roster comparison is performed in 100-character chunks, and the by-doi and by-id tests apply the same normalization before their final equality checks.
- vignettes/A-introduction.Rmd and A-introduction.Rmd2: the search vignette link is corrected from search.html to B-search.html.
- vignettes/C-retrieval.Rmd and C-retrieval.Rmd2: the author link for Jamie Monogan is dropped, keeping only the link to Political Analysis Using R; the rendered retrieval and dataframe examples are rebuilt accordingly.
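As referenced in the get_dataframe.Rd item above, a small sketch of those three access patterns, reusing identifiers that appear in the examples earlier in this document (the numeric file ID is a placeholder):

```R
# by datafile name plus dataset DOI
df_tab <- get_dataframe_by_name(
  filename = "roster-bulls-1996.tab",
  dataset  = "doi:10.70122/FK2/HXJVJU",
  server   = "demo.dataverse.org"
)

# by the datafile's own DOI
df_tab <- get_dataframe_by_doi(
  filedoi = "10.70122/FK2/HXJVJU/SA3Z2V",
  server  = "demo.dataverse.org"
)

# by numeric file ID (the ID shown is a placeholder)
# df_tab <- get_dataframe_by_id(1939003, server = "demo.dataverse.org")
```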