Beginning data science in R 4 : data analysis, visualization, and modelling for the data scientist

by Thomas Mailund

Paper Book, 2022

Status

Available

Call number

519.502855133

Library's review

Contains "About the Author", "About the Technical Reviewer", "Acknowledgments", "Introduction", "Chapter 1: Introduction to R Programming", " Basic Interaction with R", " Using R As a Calculator", " Simple Expressions", " Assignments", " Indexing Vectors", " Vectorized Expressions", " Comments",
" Functions", " Getting Documentation for Functions", " Writing Your Own Functions", " Summarizing and Vector Functions", " A Quick Look at Control Flow", " Factors", " Data Frames", " Using R Packages", " Dealing with Missing Values", " Data Pipelines", " Writing Pipelines of Function Calls", " Writing Functions That Work with Pipelines", " The Magical "." Argument", " Other Pipeline Operations", " Coding and Naming Conventions", " Exercises", " Mean of Positive Values", " Root Mean Square Error", "Chapter 2: Reproducible Analysis", " Literate Programming and Integration of Workflow and Documentation", " Creating an R Markdown/knitr Document in RStudio", " The YAML Language", " The Markdown Language", " Formatting Text", " Cross-Referencing", " Bibliographies", " Controlling the Output (Templates/Stylesheets)", " Running R Code in Markdown Documents", " Using chunks when analyzing data (without compiling documents)", " Caching Results", " Displaying Data", " Exercises", " Create an R Markdown Document", " Different Output", " Caching", "Chapter 3: Data Manipulation", " Data Already in R", " Quickly Reviewing Data", " Reading Data", " Examples of Reading and Formatting Data Sets", " Breast Cancer Data set", " Boston Housing Data Set", " The readr Package", " Manipulating Data with dplyr", " Some Useful dplyr Functions", " Breast Cancer Data Manipulation", " Tidying Data with tidyr", " Exercises", " Importing Data", " Using dplyr", " Using tidyr", "Chapter 4: Visualizing Data", " Basic Graphics", " The Grammar of Graphics and the ggplot2 Package", " Using qplot()", " Using Geometries", " Facets", " Scaling", " Themes and Other Graphics Transformations", " Figures with Multiple Plots", " Exercises", "Chapter 5: Working with Large Data Sets", " Subsample Your Data Before You Analyze the Full Data Set", " Running Out of Memory During an Analysis", " Too Large to Plot", " Too Slow to Analyze", " Too Large to Load", " Exercises", " Subsampling", " Hex and 2D Density 
Plots", "Chapter 6: Supervised Learning", " Machine Learning", " Supervised Learning", " Regression vs. Classification", " Inference vs. Prediction", " Specifying Models", " Linear Regression", " Logistic Regression (Classification, Really)", " Model Matrices and Formula", " Validating Models", " Evaluating Regression Models", " Evaluating Classification Models", " Confusion Matrix", " Accuracy", " Sensitivity and Specificity", " Other Measures", " More Than Two Classes", " Sampling Approaches", " Random Permutations of Your Data", " Cross-Validation", " Selecting Random Training and Testing Data", " Examples of Supervised Learning Packages", " Decision Trees", " Random Forests", " Neural Networks", " Support Vector Machines", " Naive Bayes", " Exercises", " Fitting Polynomials", " Evaluating Different Classification Measures", " Breast Cancer Classification", " Leave-One-Out Cross-Validation (Slightly More Difficult)", " Decision Trees", " Random Forests", " Neural Networks", " Support Vector Machines", " Compare Classification Algorithms", "Chapter 7: Unsupervised Learning", " Dimensionality Reduction", " Principal Component Analysis", " Multidimensional Scaling", " Clustering", " k-means Clustering", " Hierarchical Clustering", " Association Rules", " Exercises", " Dealing with Missing Data in the HouseVotes84 Data", " k-means", "Chapter 8: Project 1: Hitting the Bottle", " Importing Data", " Exploring the Data", " Distribution of Quality Scores", " Is This Wine Red or White?", " Fitting Models", " Exercises", " Exploring Other Formulas", " Exploring Different Models", " Analyzing Your Own Data Set", "Chapter 9: Deeper into R Programming", " Expressions", " Arithmetic Expressions", " Boolean Expressions", " Basic Data Types", " Numeric", " Integer", " Complex", " Logical", " Character", " Data Structures", " Vectors", " Matrix", " Lists", " Indexing", " Named Values", " Factors", " Formulas", " Control Structures", " Selection Statements", " Loops", " 
Functions", " Named Arguments", " Default Parameters", " Return Values", " Lazy Evaluation", " Scoping", " Function Names Are Different from Variable Names", " Recursive Functions", " Exercises", " Fibonacci Numbers", " Outer Product", " Linear Time Merge", " Binary Search", " More Sorting", " Selecting the k Smallest Element", "Chapter 10: Working with Vectors and Lists", " Working with Vectors and Vectorizing Functions", " ifelse", " Vectorizing Functions", " The apply Family", " apply", " Nothing Good, It Would Seem", " lapply", " sapply and vapply", " Advanced Functions", " Special Names", " Infix Operators", " Replacement Functions", " How Mutable Is Data Anyway?", " Exercises", " between", " rmq", "Chapter 11: Functional Programming", " Anonymous Functions", " Higher-Order Functions", " Functions Taking Functions As Arguments", " Functions Returning Functions (and Closures)", " Filter, Map, and Reduce", " Functional Programming with purrr", " Functions As Both Input and Output", " Ellipsis Parameters...", " Exercises", " apply_if", " power", " Row and Column Sums", " Factorial Again", " Function Composition", " Implement This Operator", "Chapter 12: Object-Oriented Programming", " Immutable Objects and Polymorphic Functions", " Data Structures", " Example: Bayesian Linear Model Fitting", " Classes", " Polymorphic Functions", " Defining Your Own Polymorphic Functions", " Class Hierarchies", " Specialization As Interface", " Specialization in Implementations", " Exercises", " Shapes", " Polynomials", "Chapter 13: Building an R Package", " Creating an R Package", " Package Names", " The Structure of an R Package", " .Rbuildignore", " Description", " Title", " Version", " Description", " Author and Maintainer", " License", " Type, Date, LazyData", " URL and BugReports", " Dependencies", " Using an Imported Package", " Using a Suggested Package", " NAMESPACE", " R/ and man/", " Checking the Package", " Roxygen", " Documenting Functions", " Import and Export", " 
Package Scope vs. Global Scope", " Internal Functions", " File Load Order", " Adding Data to Your Package", " NULL", " Building an R Package", " Exercises", "Chapter 14: Testing and Package Checking", " Unit Testing", " Automating Testing", " Using testthat", " Writing Good Tests", " Using Random Numbers in Tests", " Testing Random Results", " Checking a Package for Consistency", " Exercise", "Chapter 15: Version Control", " Version Control and Repositories", " Using Git in RStudio", " Installing Git", " Making Changes to Files, Staging Files, and Committing Changes", " Adding Git to an Existing Project", " Bare Repositories and Cloning Repositories", " Pushing Local Changes and Fetching and Pulling Remote Changes", " Handling Conflicts", " Working with Branches", " Typical Workflows Involve Lots of Branches", " Pushing Branches to the Global Repository", " GitHub", " Moving an Existing Repository to GitHub", " Installing Packages from GitHub", " Collaborating on GitHub", " Pull Requests", " Forking Repositories Instead of Cloning", " Exercises", "Chapter 16: Profiling and Optimizing", " Profiling", " A Graph-Flow Algorithm", " Speeding Up Your Code", " Parallel Execution", " Switching to C++", " Exercises", "Chapter 17: Project 2: Bayesian Linear Regression", " Bayesian Linear Regression", " Exercises: Priors and Posteriors", " Predicting Target Variables for New Predictor Values", " Formulas and Their Model Matrix", " Working with Model Matrices in R", " Exercises", " Model Matrices Without Response Variables", " Exercises", " Interface to a blm Class", " Constructor", " Updating Distributions: An Example Interface", " Designing Your blm Class", " Model Methods", " Building an R Package for blm", " Deciding on the Package Interface", " Organization of Source Files", " Document Your Package Interface Well", " Adding README and NEWS Files to Your Package", " Testing", " GitHub", "Conclusions", " Data Science", " Machine Learning", " Data Analysis", " R 
Programming", " The End", "Index".

Covers a little about version control and R Markdown, and a lot of useful material on R packages.

Publication

New York, NY : Apress Media, LLC, [2022]

Description

Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. Updated for the R 4.0 release, this book teaches you techniques for both data manipulation and visualization and shows you the best way for developing new software packages for R. Beginning Data Science in R 4, Second Edition details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this. Modern data analysis requires computational skills and usually a minimum of programming. After reading and using this book, you'll have what you need to get started with R programming with data science applications. Source code will be available to support your next projects as well.

Language

Original language

English

Physical description

528 p.; 25.4 cm

ISBN

9781484281550

Local notes

Cover: freepik
Cover design: eStudioCalamar
The cover shows the title and author on a black background
Scanned cover - N650U - 150 dpi

Pages

528

Library's rating

Rating

(1 rating; 3)

DDC/MDS

519.502855133