Description
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle: transform your datasets into a form convenient for analysis; Program: learn powerful R tools for solving data problems with greater clarity and ease; Explore: examine your data, generate hypotheses, and quickly test them; Model: provide a low-dimensional summary that captures true "signals" in your dataset; Communicate: learn R Markdown for integrating prose, code, and results…
User reviews
Usually, I skip the exercise sections of most computer books because, well, they offer challenges that are underwhelming. Recall is all that is required to answer them. Usually, I can figure them out in the confines of my mind so that I don't have to waste my time looking up the answers or writing example code to check whether I'm right or where I err.
Not so for Hadley Wickham. Many of his questions awakened my curiosity and had me applying my new knowledge in RStudio immediately. In fact, the only way I could satisfy my burning curiosity was to write code in order to test my hypotheses.
Rare is the computer book that is a page turner. This book qualifies as just that if one has the aptitude in statistics to embrace the challenges. R is an ideal language to handle these challenges in statistics, and Wickham and Grolemund fill the role of ideal apostles/evangelists to share this free fruit.
The fun part about R is that it is free, creative, and well-supplied with packages to solve interesting statistical problems. This book carries that message squarely to my lap (and then to my brain) in an engaging manner.
Part I: Explore gives an overview of using R+ggplot2+some tidyverse to do exploratory data analysis. It is one of
Part II: Wrangle is a more thorough look at the tidyverse. I recommend supplementing this by reading Wickham's original paper on tidy data.
Part III: Program was a little tedious because I already have decades of programming experience, though the coverage of purrr is interesting.
Part IV: Model covers building linear and non-linear models. I don't have a statistics background but even so found this easy to follow and very clear.
Part V: Communicate is a smorgasbord of R Markdown and options building on top of it. I thought this section had a bit of a conflicting message to end on, because after 400-some pages of doing work in RStudio with .R script files, the authors all of a sudden seem to say to forget all that and do everything as R Markdown. Which is fine, but if that's their recommendation, I think introducing it earlier would have been better.
There are some copy-editing issues; luckily, Wickham has an updated online edition with corrections. Some of the exercises weren't entirely clear as to intent, but that could entirely be due to my lack of a stats background. (Plenty of people have posted solutions online if you get stuck.)