R for Data Science: Import, Tidy, Transform, Visualize, and Model Data

by Hadley Wickham

Other authors: Garrett Grolemund (Author)
Ebook, 2017

Status

Available

Call number

006.312

Collection

Publication

O'Reilly Media (2017), Edition: 1, 520 pages

Description

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You'll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you've learned along the way. You'll learn how to: Wrangle: transform your datasets into a form convenient for analysis; Program: learn powerful R tools for solving data problems with greater clarity and ease; Explore: examine your data, generate hypotheses, and quickly test them; Model: provide a low-dimensional summary that captures true "signals" in your dataset; Communicate: learn R Markdown for integrating prose, code, and results…

User reviews

LibraryThing member scottjpearson
If the above quote is the mission of this book, consider the task accomplished. Where most books in computer science fall down in trying to be cute while communicating an educational message, this book addresses the task of education about R squarely, and it does so in a manner that engages the mind with interesting problems.

Usually, I skip the exercises sections of most computer books because, well, they offer challenges that are underwhelming. Recall is all that is required to answer them. Usually, I can figure them out in the confines of my mind so that I don't have to waste my time looking up the answers or coding example code to check whether I'm right or where I err.

Not so for Hadley Wickham. Many of his questions awakened my curiosity and had me applying my new knowledge in RStudio immediately. In fact, the only way I could satisfy my burning curiosity was to write code in order to test my hypotheses.

Rare is the computer book that is a page turner. This book qualifies as just that if one has the aptitude in statistics to embrace the challenges. R is an ideal language to handle these challenges in statistics, and Wickham and Grolemund fill the role of ideal apostles/evangelists to share this free fruit.

The fun part about R is that it is free, creative, and well-supplied with packages to solve interesting statistical problems. This book carries that message squarely to my lap (and then to my brain) in an engaging manner.
LibraryThing member encephalical
This is one of the best O'Reilly books I've read. For context, I'm a graphics programmer who fell into sci vis, e.g., visualizing fluid simulations, and is now pivoting into info vis.

Part I: Explore gives an overview of using R+ggplot2+some tidyverse to do exploratory data analysis. It is one of the best intro overview dives I've come across for any type of programming. Most dives of this sort have at least one or two gaps in material or unclear motivation or try to do too much. This was perfectly crafted to lead someone into the tidyverse.

Part II: Wrangle is a more thorough look at the tidyverse. I recommend supplementing this by reading Wickham's original paper on tidy data.

Part III: Program was a little tedious because I already have decades of programming experience, though the coverage of purrr is interesting.

Part IV: Model covers building linear and non-linear models. I don't have a statistics background but even so found this easy to follow and very clear.

Part V: Communicate is a smorgasbord of R Markdown and options building on top of it. I thought this section had a bit of a conflicting message to end on, because after 400 some pages of doing work in RStudio with .R script files, the authors all of a sudden seem to say to forget all that and do everything as R Markdown. Which is fine, but if that's their recommendation I think introducing that earlier would have been better.

There are some copy editing issues; luckily, Wickham has an updated online edition with corrections. Some of the exercises weren't entirely clear as to intent, but that could entirely be due to my lack of a stats background. (Plenty of people have posted solutions online if you get stuck.)
LibraryThing member sashame
if ur gonna use R, its probably the best resource out there, aside from wickham's advanced R guide; but the language is so antiquated and outdated, with so many issues in its fundamental data structures, that its frustrating to pretend it can b an elegant front end for research development; it seems like wickham's energy would b better directed towards developing an R2 (couldn't u just ship a wrapper to make R1 packages compatible?) rather than trying to patch R as it is w more and more packages to try to smooth things over
LibraryThing member markm2315
Like a week-long workshop with the authors, this book presents data analysis in terms of the R packages in the tidyverse. I don't think you can read it and fail to learn a lot. It has an especially nice organized approach to data import and non-tidy data. I think I would recommend it to almost anyone who does some data analysis. My only caveat would be that although you could start learning R with this book, it might be a difficult and non-traditional path for some complete beginners.

Language

Original language

English

Original publication date

2017

Physical description

520 p.; 9.02 inches

Pages

520

ISBN

1491910399 / 9781491910399