EddieD (9)
#1
John / Nina,

I saw several solutions for downloading from your https://github.com site, both on Manning (Appendix A) and on Stack Overflow. None of them got me past the SSL certificate problem I get.

If you can just put some of the CSV and TSV files in ZIP format, we can just download and use them. I can download your .rData files very easily, as they come with a download format. Most of the other Manning authors put their files in ZIP format. So do you, for your codeExamples. Why can't you do it for the following files:

orange_small_train_churn.labels.txt
orange_small_train_appetency.labels.txt
orange_small_train_upselling.labels.txt

The first of these comes down in HTML format when I just use:

churn <- read.table(
  'https://github.com/WinVector/zmPDSwR/tree/master/KDD2009/orange_small_train_churn.labels.txt',
  sep=' ',
  header=FALSE
)

Am I missing something?

Can you please ZIP the text files, so that PC users will have access?

Thanks,
EddieD
john.mount (79)
#2
Re: Downloading from https://github.com/WinVector/zmPDSwR/tree/master/
R can't deal with https (due to certificate issues) without using a package like curl. In the appendix we suggest using RCurl (to avoid having to know where your machine's curl is installed). Also, GitHub may be changing URLs around (so deep links into GitHub content may be problematic). In particular, you need the GitHub URL to have "raw" in it (one of the tab options in GitHub); otherwise you get HTML, not raw data (which is one of the issues blocking you).
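
One quick way to tell whether you got HTML or raw data is to look at the first line of what came back before handing it to read.table(). The following is only a sketch: looks_like_html is a made-up helper name (it is not part of R or RCurl), and the check is a rough heuristic rather than a real HTML detector.

```r
# Hypothetical helper (not from the book): guess whether downloaded text
# is an HTML page instead of raw data by inspecting its first non-blank line.
looks_like_html <- function(text) {
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop blank lines
  if (length(lines) == 0) {
    return(FALSE)                        # nothing downloaded, so not HTML
  }
  # HTML pages normally open with a doctype or an <html> tag
  grepl("^<(!doctype|html)", lines[1], ignore.case = TRUE)
}

# Usage (not run here, needs RCurl and network access):
# txt <- getURL(someUrl)
# if (looks_like_html(txt)) stop("Got a web page; use the raw URL instead")
```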

You can already download the entire collection of data sets as a zip from GitHub: https://github.com/WinVector/zmPDSwR/archive/master.zip .
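
If you would rather do the zip download from inside R, something like the sketch below may work. It assumes your R build's download.file() can handle https (which may not hold on older installations), that the archive unpacks into a top-level folder named zmPDSwR-master, and that the labels files hold one label per line. fetch_archive and read_labels are illustrative names, not functions from the book.

```r
# Hypothetical helper: download the repository zip once and unpack it locally.
fetch_archive <- function(zip_url, dest_dir = tempdir()) {
  zip_path <- file.path(dest_dir, "master.zip")
  # mode = "wb" keeps the zip intact on Windows
  download.file(zip_url, destfile = zip_path, mode = "wb")
  unzip(zip_path, exdir = dest_dir)
  dest_dir
}

# Hypothetical helper: read a one-column labels file into a plain vector.
# Assumes one label per line and no header row.
read_labels <- function(path) {
  read.table(path, header = FALSE)[[1]]
}

# Usage (not run here, needs network access):
# root <- fetch_archive('https://github.com/WinVector/zmPDSwR/archive/master.zip')
# churn <- read_labels(file.path(root, 'zmPDSwR-master', 'KDD2009',
#                                'orange_small_train_churn.labels.txt'))
```

After the one download everything is read from local files, so the https and certificate issues only have to be solved once.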


Also, have you tried the following (adapted from https://github.com/WinVector/zmPDSwR/blob/master/CodeExamples/x0A_Working_with_R_and_other_tools/00239_example_A.9_of_section_A.2.3.R , may depend on a lot of system details):


install.packages('RCurl')  # only needed once per machine
library(RCurl)
urlBase <-
  'https://raw.github.com/WinVector/zmPDSwR/master/'
# fetch a file over https and wrap the text in a connection read.table() can use
mkCon <- function(nm) {
  textConnection(getURL(paste(urlBase, nm, sep='')))
}
cars <- read.table(mkCon('UCICar/car.data.csv'),
                   sep=',', header=TRUE, comment.char='')
EddieD (9)
#3
John,

Thank you for your AMAZINGLY SWIFT REPLY!!!

Yes, I did try the code adapted from https://github.com/WinVector/zmPDSwR/blob/master/CodeExamples/x0A_Working_with_R_and_other_tools/00239_example_A.9_of_section_A.2.3.R , but still could not get past the SSL error.

I want to thank you for pointing me to the download for the ENTIRE collection of data sets. I got the Zip file downloaded already. THAT IS EXACTLY WHAT I WAS LOOKING FOR!!

Your book is awesome. I get frustrated when I am on a roll reading and doing the listings and I hit these "download" issues.

Thanks again,
EddieD
john.mount (79)
#4
You are welcome.

A note to all readers: downloading (and downloading raw data) from GitHub is much more complicated than I had hoped. For example, to use the load() command you must deal with https (at least requiring RCurl and possibly introducing certificate management issues), the fact that GitHub rewrites URLs, and binary connections. I have gotten it all to work in the following example:

1) Start with GitHub content URL (such as https://github.com/WinVector/Examples/blob/master/AmazonBookData/amazonBookData.Rdata ).
2) Right click on the "Raw" tab and copy the reported raw download URL (in this case: https://github.com/WinVector/Examples/raw/master/AmazonBookData/amazonBookData.Rdata ).
3) Remove the "/raw/" from the URL and switch the server to raw.github.com (in this case yielding: https://raw.github.com/WinVector/Examples/master/AmazonBookData/amazonBookData.Rdata ).
4) Run something like the following:

<code>
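library('RCurl')
# raw.github.com form of the URL, built as in step 3
url <- paste('https://raw.github.com/WinVector/',
             'Examples/master/',
             'AmazonBookData/amazonBookData.Rdata', sep='')
# fetch the bytes over https and load() them through a binary connection
load(rawConnection(getBinaryURL(url)))
ls()  # list what was loaded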
</code>
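
The URL rewrite in steps 1 to 3 can also be written as a small helper, so you do not have to edit the URL by hand each time. This is only a sketch of the string manipulation: github_raw_url is a made-up name, and it assumes GitHub keeps the URL layout described above (which, as noted, may change).

```r
# Hypothetical helper: turn a GitHub content URL (".../blob/...") or the
# reported "Raw" link (".../raw/...") into the raw.github.com form of step 3.
github_raw_url <- function(url) {
  # drop the first "/blob/" or "/raw/" path element
  stripped <- sub('/(blob|raw)/', '/', url)
  # switch the server from github.com to raw.github.com
  sub('^https://github\\.com/', 'https://raw.github.com/', stripped)
}

# Usage (the download itself is not run here, needs RCurl and network access):
# url <- github_raw_url(paste('https://github.com/WinVector/Examples/blob/',
#                             'master/AmazonBookData/amazonBookData.Rdata', sep=''))
# load(rawConnection(getBinaryURL(url)))
```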