by Joseph Rickert
Quite a few times over the past few years I have highlighted presentations posted by R user groups on their websites, and recommended these sites as a source of interesting material, but I had never thought to see what the user groups were doing on GitHub. As you might expect, many people who present at R user group meetings make their code available on GitHub. However, as best I can tell, only a few R user groups maintain GitHub sites under the user group name.
The Indy UseR Group is one that seems to be making very good use of its GitHub site. Here is the link to a very nice tutorial from Shankar Vaidyaraman on using the rvest package to do some web scraping with R. The following code, which scrapes the first page of Springer's Use R! series to produce a short list of books, comes from Shankar's simple example.
# load libraries
library(rvest)
library(dplyr)
library(stringr)

# link to Use R! titles at Springer site
useRlink = "http://www.springer.com/?SGWID=0-102-24-0-0&series=Use+R&sortOrder=relevance&searchType=ADVANCED_CDA&searchScope=editions&queryText=Use+R"

# Read the page
userPg = useRlink %>% read_html()

## Get info of books displayed on the page
booktitles = userPg %>% html_nodes(".productGraphic img") %>% html_attr("alt")
bookyr = userPg %>% html_nodes(xpath = "//span[contains(@class,'renditionDescription')]") %>% html_text()
bookauth = userPg %>% html_nodes("span[class = 'displayBlock']") %>% html_text()
bookprice = userPg %>% html_nodes(xpath = "//div[@class = 'bookListPriceContainer']//span[2]") %>% html_text()

pgdf = data.frame(title = booktitles, pubyr = bookyr, auth = bookauth, price = bookprice)
pgdf
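If you would like to try the rvest selector pattern without hitting the live Springer page (whose markup may have changed since the tutorial was written), here is a minimal offline sketch. The HTML snippet below is invented for illustration; it simply mimics the class names used in Shankar's example so that the same `html_nodes()`, `html_attr()`, and `html_text()` calls can be exercised locally.

```r
# Offline sketch: parse a small inline HTML snippet instead of the
# live page. read_html() accepts a string of HTML directly.
library(rvest)

snippet <- '
<div>
  <div class="productGraphic"><img alt="ggplot2" /></div>
  <span class="renditionDescription">2016</span>
  <span class="displayBlock">Hadley Wickham</span>
</div>'

pg <- read_html(snippet)

# Same selector idioms as the Springer example
title <- pg %>% html_nodes(".productGraphic img") %>% html_attr("alt")
year  <- pg %>% html_nodes("span.renditionDescription") %>% html_text()
auth  <- pg %>% html_nodes("span.displayBlock") %>% html_text()

data.frame(title = title, pubyr = year, auth = auth)
```

Swapping the snippet for a URL is all it takes to move from this toy example to a real scrape.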
This plot, which shows a list of books ranked by number of downloads, comes from Shankar's extended recommender example.
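The ranking step behind such a plot can be sketched in a few lines of dplyr. The data frame below is entirely hypothetical (the titles and download counts are made up, not taken from Shankar's example); it only illustrates the arrange-and-rank idiom.

```r
# Hypothetical sketch: rank book titles by download count with dplyr.
# The numbers here are invented for illustration.
library(dplyr)

downloads <- data.frame(
  title     = c("ggplot2", "Applied Spatial Data Analysis with R",
                "R for SAS and SPSS Users"),
  downloads = c(105000, 45000, 32000)
)

ranked <- downloads %>%
  arrange(desc(downloads)) %>%   # most-downloaded first
  mutate(rank = row_number())    # 1, 2, 3, ...

ranked
```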
The Ann Arbor R User Group meetup site has done an exceptional job of creating an aesthetically pleasing and informative web property on their GitHub site.
I am particularly impressed …read more
Source: http://revolutionanalytics.com