Fonts in Space
By David Smith I love typography, and I love sci-fi, and I’m probably not alone in that combination of interests. So I’m somewhat amazed it’s taken me this long to discover Typeset in the Future, a...
Introductions to R and predictive analytics
By David Smith If you’re new to the concept of predictive models, or just want to review the background on how data scientists learn from past data to predict the future, you may be interested in my...
Easy data validation with the validate package
By mark The validate package is our attempt to make checking data against domain knowledge as easy as possible. Here is an example. library(magrittr) library(validate) iris %>% check_that(...
Trump Losing and Feeling the Bern in Utah
By Julia Silge Well, it’s been an interesting election season so far, right? Everybody holding up OK? Utah held its caucuses this past Tuesday on March 22 and I thought I would do a bit of plotting to...
Choice Modeling with Features Defined by Consumers and Not Researchers
By Joel Cadwell Choice modeling begins with a researcher “deciding on what attributes or levels fully describe the good or service.” This is consistent with the early neural networks in which features...
The Simpsons as a Chart
By The Clerk Inspired by this clever image, I thought I would whip it up in R. Results: Below is the R code: 1: # Prepare ----------------------------------------------------------------- 2:...
Nuclear Animations in R
By hrbrmstr library(dplyr) library(tidyr) library(sp) library(maptools) library(maps) library(grid) library(scales) library(ggplot2) # devtools::install_github("hadley/ggplot2") library(ggthemes)...
Additive modelling global temperature time series: revisited
By Gavin L. Simpson Quite some time ago, back in 2011, I wrote a post that used an additive model to fit a smooth trend to the then-current Hadley Centre/CRU global temperature time series data set....
Even Businessweek Is Talking about P-Values
By matloff The March 28 issue of Bloomberg Businessweek has a rather good summary of the problems of p-values, even recommending the use of confidence intervals and — wonder of wonders — “[looking] at...
Animated Flow in the non-tidal Delaware River
By AdventuresInData The Delaware River experienced some high flow in late February 2016, providing an opportunity for an interesting animated graph of river response. This plot was developed using data...
Soap-film smoothers & lake bathymetries
By Gavin L. Simpson A number of years ago, whilst I was still working at ENSIS, the consultancy arm of the ECRC at UCL, I worked on a project for the (then) Countryside Council for Wales (CCW; now part...
Does weather cause accidents – part 1
By Adventures in Data Scotland and other parts of the UK have some nicely curated open data on road traffic accidents. For individual cases, where and when they happened, how severe they were, the...
Rcpp 0.12.4: And another one
By Thinking inside the box The fourth update in the 0.12.* series of Rcpp has now arrived on the CRAN network for GNU R, and has just been pushed to Debian as well. This follows four days of idleness...
Tips for First Year Comprehensive Exams
By strictlystat During our program, like most others, you have to take written comprehensive exams (“comps”) at the end of your first year of coursework. For many students it’s a time of stress, which...
About those weird things in R…
By David Smith There’s no denying that for a language as popular as R, it has more than its fair share of quirks. If you’ve ever wondered why, for example, R has a non-standard assignment operator, or...
RStudio at the Open Data Science Conference
By Roger Oberg If you’re a data wrangler or data scientist, ODSC East in Boston from May 20-22 is a wonderful opportunity to get up-to-date on the latest open source tools and trends. R and RStudio...
ggnetwork: Network geometries for ggplot2
By François – f@briatte.org This note is a shameless plug demo of the ggnetwork package, which provides several geoms to plot network objects with ggplot2, and which just got published on CRAN. See...
“Efficient Data Manipulation with R” Course | April 11-12 Milan
By Quantide srl A new R course, focused on Data Manipulation, is organized by the R training and consulting company Quantide. Next live class is on April 11-12 in Legnano (Milan). If you want to know...
Improving Adaboosting with decision stumps in R
By nivangio Adaboosting is proven to be one of the most effective class prediction algorithms. It mainly consists of an ensemble of simpler models (known as “weak learners”) that, although not very...