R replace() Function: Guide with Examples

Updated on December 21, 2022

R Programming

By Prajwal CN and Bradley Kouchi

Introduction

In data analysis, you may need to address missing, negative, or inaccurate values in a dataset. One way to handle them is to replace the values with 0, NA, or the mean.

In this article, you will explore how to use the replace() and is.na() functions in R.

Prerequisites

To complete this tutorial, you will need:

R installed locally or on a server.

Replacing the Values in a Vector with `replace()`

This section will show how to replace a value in a vector.

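The replace() function takes three arguments, which R's documentation names x, list, and values: the vector to modify, an index vector identifying the elements to change, and the replacement values: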

replace(target, index, replacement)

First, create a vector:

df <- c('apple', 'orange', 'grape', 'banana')
df

This will create a vector with apple, orange, grape, and banana:

Output
[1] "apple"  "orange" "grape"  "banana"

Now, replace the second item in the vector:

dy <- replace(df, 2, 'blueberry')
dy

This will replace orange with blueberry:

Output
[1] "apple"     "blueberry" "grape"     "banana"

Next, replace the fourth item in the vector:

dx <- replace(dy, 4, 'cranberry')
dx

This will replace banana with cranberry:

Output
[1] "apple"     "blueberry" "grape"     "cranberry"
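
The two replacements above can also be combined into a single call, because the index argument accepts a vector of positions. The following is a minimal sketch; the variable names fruits and updated are illustrative and are not part of the original example.

```r
# Illustrative names: `fruits` and `updated` are assumptions for this sketch.
fruits <- c('apple', 'orange', 'grape', 'banana')

# Replace positions 2 and 4 in one call; values are matched to positions in order.
updated <- replace(fruits, c(2, 4), c('blueberry', 'cranberry'))
updated

# replace() returns a modified copy, so the original vector is unchanged.
fruits
```

Because replace() does not modify its input, the result has to be assigned to a variable (as with dy and dx above) for the change to be kept.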

Replacing `NA` Values with `0` in R

Consider a scenario where you have a data frame containing measurements:

air_quality

    Ozone  Solar.R  Wind  Temp  Month  Day
1      41      190   7.4    67      5    1
2      36      118   8.0    72      5    2
3      12      149  12.6    74      5    3
4      18      313  11.5    62      5    4
5      NA       NA  14.3    56      5    5
6      28       NA  14.9    66      5    6
7      23      299   8.6    65      5    7
8      19       99  13.8    59      5    8
9       8       19  20.1    61      5    9
10     NA      194   8.6    69      5   10
11      7       NA   6.9    74      5   11
12     16      256   9.7    69      5   12

Here is the data in CSV format:

air_quality.csv

Ozone,Solar.R,Wind,Temp,Month,Day
41,190,7.4,67,5,1
36,118,8.0,72,5,2
12,149,12.6,74,5,3
18,313,11.5,62,5,4
NA,NA,14.3,56,5,5
28,NA,14.9,66,5,6
23,299,8.6,65,5,7
19,99,13.8,59,5,8
8,19,20.1,61,5,9
NA,194,8.6,69,5,10
7,NA,6.9,74,5,11
16,256,9.7,69,5,12

The file uses the string NA (“Not Available”) where data is missing. By default, read.csv() reads these entries as missing values.

You can replace the NA values with 0.

First, define the data frame:

df <- read.csv('air_quality.csv')

Use is.na() to check if a value is NA. Then, replace the NA values with 0:

df[is.na(df)] <- 0
df

The data frame is now:

Output
    Ozone  Solar.R  Wind  Temp  Month  Day
1      41      190   7.4    67      5    1
2      36      118   8.0    72      5    2
3      12      149  12.6    74      5    3
4      18      313  11.5    62      5    4
5       0        0  14.3    56      5    5
6      28        0  14.9    66      5    6
7      23      299   8.6    65      5    7
8      19       99  13.8    59      5    8
9       8       19  20.1    61      5    9
10      0      194   8.6    69      5   10
11      7        0   6.9    74      5   11
12     16      256   9.7    69      5   12

All occurrences of NA in the data frame have been replaced.
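
The same replacement can be limited to a single column by combining replace() with is.na(). The sketch below builds a small data frame inline so that it runs without the CSV file; the name aq and its three rows are assumptions made for illustration.

```r
# Hypothetical three-row subset of the air quality data, built inline.
aq <- data.frame(
  Ozone = c(41, NA, 28),
  Solar.R = c(190, NA, NA)
)

# Count the missing values in each column before changing anything.
colSums(is.na(aq))

# is.na() returns a logical vector that replace() uses as the index.
aq$Ozone <- replace(aq$Ozone, is.na(aq$Ozone), 0)
aq
```

Only the Ozone column is changed here; the NA values in Solar.R are left in place.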

Replacing `NA` Values with the Mean of the Values in R

Deleting rows that contain NA also discards the valid measurements in those rows. A common alternative is to replace each NA value with the mean of the remaining values in the same column, which keeps every row in the data set. This can understate the variability of the data, so whether it improves an analysis depends on the data.

The mean() function calculates the mean value.

Consider the following input data set with NA values:

air_quality

    Ozone  Solar.R  Wind  Temp  Month  Day
1      41      190   7.4    67      5    1
2      36      118   8.0    72      5    2
3      12      149  12.6    74      5    3
4      18      313  11.5    62      5    4
5      NA       NA  14.3    56      5    5
6      28       NA  14.9    66      5    6
7      23      299   8.6    65      5    7
8      19       99  13.8    59      5    8
9       8       19  20.1    61      5    9
10     NA      194   8.6    69      5   10
11      7       NA   6.9    74      5   11
12     16      256   9.7    69      5   12

Read the CSV file:

df <- read.csv('air_quality.csv')

Use is.na() and mean() to replace NA:

df$Ozone[is.na(df$Ozone)] <- mean(df$Ozone, na.rm = TRUE)

First, this code finds all the occurrences of NA in the Ozone column. Next, it calculates the mean of all the values in the Ozone column, using the na.rm argument to exclude the NA values. Then each instance of NA is replaced with the calculated mean.

Then round() the values to whole numbers:

df$Ozone <- round(df$Ozone, digits = 0)

The data frame is now:

Output
    Ozone  Solar.R  Wind  Temp  Month  Day
1      41      190   7.4    67      5    1
2      36      118   8.0    72      5    2
3      12      149  12.6    74      5    3
4      18      313  11.5    62      5    4
5      21       NA  14.3    56      5    5
6      28       NA  14.9    66      5    6
7      23      299   8.6    65      5    7
8      19       99  13.8    59      5    8
9       8       19  20.1    61      5    9
10     21      194   8.6    69      5   10
11      7       NA   6.9    74      5   11
12     16      256   9.7    69      5   12

The NA values in the Ozone column are now replaced by the mean of the ten non-missing values in that column (20.8), rounded to 21. The NA values in the Solar.R column are unchanged.
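
To apply the same idea to every column instead of only one, the replacement can be wrapped in a small helper and applied column by column. This is a sketch under assumptions: the helper name impute_mean and the inline data frame aq are hypothetical, and all columns are assumed to be numeric.

```r
# Hypothetical subset of the air quality data, built inline.
aq <- data.frame(
  Ozone = c(41, 36, NA, 18),
  Solar.R = c(190, NA, 149, NA),
  Wind = c(7.4, 8.0, 12.6, 11.5)
)

# Hypothetical helper: replace the NA values in one column with that column's mean.
impute_mean <- function(column) {
  replace(column, is.na(column), mean(column, na.rm = TRUE))
}

# Apply the helper to every column; assigning to aq[] keeps the data frame structure.
aq[] <- lapply(aq, impute_mean)
aq
```

Columns without NA values, such as Wind in this sketch, pass through unchanged.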

Replacing the Negative Values with `0` or `NA` in R

In the data analysis process, you will sometimes want to replace the negative values in a data frame with 0 or NA. For quantities that cannot be negative, a negative value usually indicates an entry or measurement error, and leaving it in place can skew the results.

Consider the following input data set with negative values:

negative_values.csv

    count  entry1  entry2  entry3
 1      1     345    -234     345
 2      2      65     654     867
 3      3      23     345    3456
 4      4      87     867       9
 5      5    2345      34     867
 6      6     876      98      76
 7      7      35    -456     123
 8      8      87      98     345
 9      9    -765      67     765
10     10    4567     -87     234

Here is the data in CSV format:

count,entry1,entry2,entry3
1,345,-234,345
2,65,654,867
3,23,345,3456
4,87,867,9
5,2345,34,867
6,876,98,76
7,35,-456,123
8,87,98,345
9,-765,67,765
10,4567,-87,234

Read the CSV file:

df <- read.csv('negative_values.csv')

Replacing the Negative Values with `0`

Use replace() to change the negative values in the entry2 column to 0:

data_zero <- df
data_zero$entry2 <- replace(df$entry2, df$entry2 < 0, 0) 
data_zero

The data frame is now:

Output
   count entry1 entry2 entry3
1      1    345      0    345
2      2     65    654    867
3      3     23    345   3456
4      4     87    867      9
5      5   2345     34    867
6      6    876     98     76
7      7     35      0    123
8      8     87     98    345
9      9   -765     67    765
10    10   4567      0    234

The negative values in the entry2 column have been replaced with 0.
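
As an alternative to replace(), base R's pmax() can clip a numeric vector at 0. This is a different technique from the one used above, shown here only as a sketch; the vectors entry2 and clipped are illustrative copies of the column values.

```r
# Hypothetical values mirroring the entry2 column.
entry2 <- c(-234, 654, 345, 867, 34, 98, -456, 98, 67, -87)

# pmax() returns the element-wise maximum of each value and 0,
# so negative values become 0 and the rest are unchanged.
clipped <- pmax(entry2, 0)
clipped

# Compare with the replace() approach used above.
identical(clipped, replace(entry2, entry2 < 0, 0))
```

pmax() only covers replacement with a fixed lower bound; replace() is still needed when the replacement is NA or depends on a different condition.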

Replacing the Negative Values with `NA`

Use replace() to change the negative values in the entry2 column to NA:

data_na <- df
data_na$entry2 <- replace(df$entry2, df$entry2 < 0, NA)
data_na

The data frame is now:

Output
   count entry1 entry2 entry3
1      1    345     NA    345
2      2     65    654    867
3      3     23    345   3456
4      4     87    867      9
5      5   2345     34    867
6      6    876     98     76
7      7     35     NA    123
8      8     87     98    345
9      9   -765     67    765
10    10   4567     NA    234

The negative values in the entry2 column have been replaced with NA.
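
The examples above change only the entry2 column, so the negative value in entry1 (-765) remains. To handle every column at once, replace() can be given the whole data frame with a logical matrix as the index. The sketch below uses a hypothetical inline subset of the data; the names entries and all_na are assumptions for illustration.

```r
# Hypothetical subset of negative_values.csv, built inline.
entries <- data.frame(
  count = c(1, 2, 9),
  entry1 = c(345, 65, -765),
  entry2 = c(-234, 654, 67)
)

# entries < 0 is a logical matrix; replace() uses it to index the data frame.
all_na <- replace(entries, entries < 0, NA)
all_na
```

Passing 0 instead of NA as the third argument would set every negative value in the data frame to 0.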

Conclusion

Replacing values in a vector or data frame is a common step when preparing data for analysis in R. Using replace() and is.na(), you can substitute 0, NA, or the mean for missing or negative values to clean up a dataset before analyzing it.

Continue your learning with How To Use sub() and gsub() in R.


About the author(s)

Prajwal CN

Author

Bradley Kouchi

Editor

Category:

Tutorial

Tags:

R Programming

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
