data file documentation through fetchData() #2

Open
rpruim opened this issue May 26, 2015 · 9 comments

rpruim commented May 26, 2015

From @dtkaplan on October 19, 2012 18:47

I propose adding a documentation=TRUE argument to fetchData() that causes a corresponding documentation file to be displayed. Such files can give descriptions and other details of the data. The author would upload a documentation file in an appropriate format.

Copied from original issue: ProjectMOSAIC/mosaic#169

rpruim commented May 26, 2015

Another option would be to take advantage of comment characters to make the files self documenting. That way there is only one file to deal with.

In any case, I think fetching documentation should be a separate function (fetchDoc) instead of a flag.
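A minimal sketch of the self-documenting idea, assuming documentation lines start with "#" at the top of a CSV file. fetchDoc() is only a proposed name in this thread, so the helper below is hypothetical, and the file contents are invented for illustration.

```r
# Hypothetical helper: returns the documentation lines of a self-documenting file.
# Assumption: documentation lines begin with the comment character "#".
fetchDoc <- function(path, comment = "#") {
  lines <- readLines(path)
  is_doc <- startsWith(lines, comment)
  # Strip the comment marker and one optional space after it
  sub(paste0("^", comment, " ?"), "", lines[is_doc])
}

# Build a small self-documenting CSV in a temporary file (invented example data)
path <- tempfile(fileext = ".csv")
writeLines(c(
  "# Who: first name of the child",
  "# Age: age in years",
  "Who,Age",
  "Bill,3",
  "Charley,4",
  "Debby,5"
), path)

doc  <- fetchDoc(path)                       # documentation lines only
kids <- read.csv(path, comment.char = "#")   # data only; comment lines are skipped
```

With this layout a single file serves both purposes: read.csv() ignores the comment lines, and the documentation helper ignores everything else.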

And do we know if anyone else besides Danny is using this feature?

rpruim commented May 26, 2015

From @nicholasjhorton on October 20, 2012 13:3

By feature, do you mean using fetchData()? I am, but that's because it's necessitated by the instructions in the second edition of the textbook and the AcroScore problems.

Overall, it works well for the built-in datasets.

Nick

Nicholas Horton
Department of Mathematics and Statistics, Smith College
Clark Science Center, Northampton, MA 01063-0001
http://www.math.smith.edu/~nhorton

rpruim commented May 26, 2015

From @nicholasjhorton on March 18, 2013 19:46

We agreed to deprecate fetchData() in the next release (potentially over a 2 year period): this just gets added to the NEWS for now, but might eventually involve adding a warning when it is used.
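For reference, a rough sketch of what "adding a warning when it is used" could look like with base R's .Deprecated(). The function body is a placeholder rather than the mosaic implementation, and the wording of the message (including the suggested alternatives) is an assumption.

```r
# Stand-in for the real function; the body is a placeholder, not mosaic's code.
fetchData <- function(name) {
  # .Deprecated() signals a warning of class "deprecatedWarning"
  .Deprecated(msg = paste(
    "fetchData() is deprecated;",
    "consider data from a package or reading the file directly."
  ))
  name  # placeholder return value
}

# The call still works but warns; capture the warning to inspect it
w <- tryCatch(fetchData("KidsFeet"), warning = function(w) w)
conditionMessage(w)
```

Existing scripts keep running during the deprecation period, and users see the message each time they call the function.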

rpruim commented May 26, 2015

From @nicholasjhorton on March 19, 2013 10:43

Danny wrote: After falling asleep as soon as I got back to the room, I woke up at 10:30 with a simple idea to make fetchData() useful to instructors who want a simple way to post their own data.

• Instructors who want to use the system create their own file server, e.g. using Dropbox's public directory, and put their files on that server.
• They make up a short name for use by fetchData(), e.g. "NJH".
• They email that short name to me, together with a link to a file on their server. I then check it for uniqueness, create a directory by that name on the mosaic server, and put the address of their server in a simple text file called "redirectName.txt".
• Instructors can then put whatever files they like on their own server. A file with an address of [server name]/file.csv would be referred to as fetchData("NJH/file.csv").
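
The lookup step in the scheme above might be sketched as follows. This is not the code in fetchRemote.R: resolveRemote() is a hypothetical name, the redirect table stands in for the contents of the per-instructor "redirectName.txt" files, and the URL is an invented placeholder.

```r
# Assumption: each short name maps to the base address of an instructor's server,
# as if it had been read from that instructor's redirectName.txt file.
redirects <- c(NJH = "https://example.org/njh-data")  # placeholder URL

# Hypothetical helper: turn "SHORTNAME/path/to/file.csv" into a full address.
resolveRemote <- function(spec, redirects) {
  parts <- strsplit(spec, "/", fixed = TRUE)[[1]]
  short <- parts[1]
  if (!short %in% names(redirects)) {
    stop("unknown short name: ", short)
  }
  rest <- paste(parts[-1], collapse = "/")
  paste(redirects[[short]], rest, sep = "/")
}

resolveRemote("NJH/Course1/mydata2.csv", redirects)
# "https://example.org/njh-data/Course1/mydata2.csv"
```

The resulting address could then be handed to any reader that accepts a URL, such as read.csv().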

I've prototyped this using the mosaic-web.org server, creating a NJH account which I've redirected to one of my non-fetchData() dropbox directories. (I don't have access to your server, but if you send me a link to a CSV file on your server, I'll update NJH to go to your server.)

Here's the prototype: remoteFetch() in the attached file fetchRemote.R. Ignore the name; it would be folded into fetchData()

Once you've sourced fetchRemote.R, you can try these commands with the files being served from a vanilla Dropbox directory on my account:

    remoteFetch("NJH/Course1/mydata2.csv")
      weather     when
    1    snow    night
    2     sun      day
    3    rain tomorrow
    remoteFetch("NJH/mydata1.csv")
          Who Age
    1    Bill   3
    2 Charley   4
    3   Debby   5

Regards,
Danny
<fetchRemote.R>

rpruim commented May 26, 2015

How close are we to making fetchData() of general usefulness to others?

rpruim commented May 26, 2015

Another stranded issue regarding fetchData(). This issue is less important to me than fixing what I consider to be bugs (things like side effects in the environment) and limited general usability.

rpruim commented May 26, 2015

From @nicholasjhorton on February 26, 2014 16:40

I’m increasingly frustrated with fetchData(), in that it makes it hard to reference datasets, since students will often interchange things like:

ds = fetchData("KidsFeet")

or

kids = fetchData("KidsFeet")

then wonder why their later commands don’t work (using the other name).

Can we move to referencing dataframes from packages?

Just my $0.02,

Nick

Nicholas Horton
Professor of Statistics
Department of Mathematics and Statistics, Amherst College
Box 2239, 31 Quadrangle Dr
Amherst, MA 01002-5000
https://www.amherst.edu/people/facstaff/nhorton

rpruim commented May 26, 2015

I don't use fetchData(). I use data in packages or read.file() with a reasonable URL. Lately, I've also been using read.xls() to read data directly from Excel files. But note that these latter options suffer from the same naming issues that Nick is concerned about.

I think use of fetchData() should be removed from our public documents except to document how fetchData() works, and then only if it is useable by others for quick delivery of "late breaking" data. For other things, we should be more stable and have well documented data in packages.

If this is only useful as a way for Danny and a few others to distribute their data, then we should put it in a separate package, and document it as a tool for that more limited purpose.

I'd like to get this resolved by the end of March. We've been dancing around this for too long.

rpruim commented May 26, 2015

Seems we are still dancing around fetchData(), but since we are relying on it less and less, and I never use it, I'm going to move this into "dormant ideas" and open a new issue about whether fetchData() has a role (and should be corrected to serve that role) or whether we should just remove fetchData() altogether.
