Sessions

Towards a grammar of interactive graphics

2 years ago
I announced ggvis in 2014, but there has been little progress on it since. In this talk, I’ll tell you a little bit about what I’ve been working on instead (data ingest, purrr, multiple models, …) and tell you my plans for the future of ggvis. The goal is for 2016 to be the year […]
Statistical Thinking in a Data Science Course

2 years ago
The intuition and experience needed for sound statistics practice can be hard to learn, and a course that combines computing, statistics, and working with data offers an excellent learning environment in this regard. Moreover, an integrated approach to data science creates opportunities to reinforce statistical thinking skills throughout the full data analysis cycle, from data […]
R in machine learning competitions

2 years ago
Kaggle is a community of almost 450,000 data scientists who have built nearly 2 million machine learning models to participate in our competitions. Data scientists come to Kaggle to learn, collaborate and develop the state of the art in machine learning. This talk will cover some of the lessons from winning techniques, with a particular emphasis […]
RCloud – Collaborative Environment for Visualization and Big Data Analytics

2 years ago
Analyzing Big Data in real life poses challenges with respect to performance, methodology and reusability. R is well known for its succinct syntax for analytic tasks as well as a plethora of tools for data analysis and visualization, but it is not always associated with scalability. In this talk we will present a scalable environment that […]
Changing lives with Data Science at Microsoft

2 years ago
Whether it’s called data science, machine learning, or analytics, the combination of new data sources and statistical modeling has produced some truly revolutionary applications. Many of these applications incorporate open source technologies (including R) and research from academic institutions. In this talk, I’ll share a few ways that Microsoft is improving the lives of people […]
Using Jupyter notebooks with R in the classroom

2 years ago
When teaching statistics to non-programmers, the challenges of programming in R often exceed the challenge presented by new statistics concepts. This presentation will discuss a recent paper comparing methods for teaching programming (Jacobs, Gorman, Rees, and Craig, 2016), including the use of Jupyter notebooks. Jupyter notebooks are run in a server-client Notebook Application that allows […]
Reusable R for automation, small area estimation and legacy systems

2 years ago
Running a complex model once is easy: just pull up your statistical program of choice, plug in the data and the model, and off you go. The problem comes when you then find yourself trying to scale to running that model with different data hundreds or thousands of times. In order to scale and save analysts […]
Connecting R to the OpenML project for Open Machine Learning

2 years ago
OpenML is an online machine learning platform where researchers can automatically log and share data, code, and experiments, and organize them online to work and collaborate more effectively. We present an R package to interface with the OpenML platform and illustrate its usage both as a stand-alone package and in combination with the mlr machine learning […]
Teaching R to 200 people in a week

2 years ago
Across disciplines, scholars are waking up to the potential benefits of computational competence. This has created a surge in demand for computational education that has gone largely unmet. Software Carpentry and similar efforts have worked to fill this gap with short, intensive introductions to computational tools, including R. Such an approach has numerous advantages; however, […]
Literate Programming

2 years ago
The speaker will discuss what he considers to be the most important outcome of his work developing TeX in the 1980s, namely the accidental discovery of a new approach to programming — which caused a radical change in his own coding style. Ever since then, he has aimed to write programs for human beings (not […]
broom: Converting statistical models to tidy data frames

2 years ago
The concept of "tidy data" offers a powerful and intuitive framework for structuring data to ease manipulation, modeling and visualization, and has guided the development of R tools such as ggplot2, dplyr, and tidyr. However, most functions for statistical modeling, both built-in and in third-party packages, produce output that is not tidy, and that is […]