How to Normalize data in R [3 easy methods]

Published on August 3, 2022

R Programming

By Safa Mulani

Hello, readers! In this article, we will look at three easy ways to normalize data in R.

So, let us begin!! :)

What is Normalization?

Feature scaling is an essential step before modeling when solving prediction problems in data science. Many machine learning algorithms, especially those based on distances or gradient descent, work better when the variables share a small, common scale.

This is where normalization comes into the picture. Normalization techniques reduce the scale of the variables so that no single variable dominates the others simply because it is measured in larger units.

In the following sections, we will look at some techniques for normalizing data values.
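
To make the motivation concrete, here is a small sketch on made-up values (the `people` data frame and its `age` and `income` columns are hypothetical and not part of the original examples). It compares the Euclidean distances between rows before and after rescaling the columns.

```r
# Hypothetical two-feature data set: the columns live on very different scales.
people <- data.frame(
  age    = c(25, 60, 27),
  income = c(50000, 51000, 90000)
)

# Distances between rows on the raw data: income differences dominate.
raw_dist <- as.matrix(dist(people))

# Distances after standardizing each column (mean 0, standard deviation 1).
scaled_dist <- as.matrix(dist(scale(people)))

print(round(raw_dist, 2))
print(round(scaled_dist, 2))
```

On the raw values, the 35-year age gap between the first two rows barely registers next to the income column. After scaling, both columns contribute comparably to the distance.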

1. Normalize data in R - Log Transformation

In real-world scenarios, we often come across datasets that are unevenly distributed. That is, they are skewed or their values span very different orders of magnitude.

In such cases, the easiest way to bring the values onto a comparable scale is to take the logarithm of each value.

In the example below, we scale the large values in the vector ‘data’ using the log() function from base R, which computes the natural logarithm by default. The vector is converted to a data frame first, so the result is printed as a single column.

Example:

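rm(list = ls())

# Sample values with a wide spread
data <- c(1200,34567,3456,12,3456,985,1211)
summary(data)

# Take the natural log of every value and print the result
log_scale <- log(as.data.frame(data))
print(log_scale)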

Output:

         data
1	7.090077
2	10.450655
3	8.147867
4	2.484907
5	8.147867
6	6.892642
7	7.099202
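
One caveat is worth illustrating: log() is only defined for positive numbers, so a zero in the data becomes -Inf. A common workaround, sketched below on a made-up vector (`with_zero` is hypothetical), is base R's log1p(), which computes log(1 + x).

```r
# Hypothetical vector containing a zero, which plain log() turns into -Inf.
with_zero <- c(0, 12, 985, 34567)

plain_log   <- log(with_zero)    # the first element becomes -Inf
shifted_log <- log1p(with_zero)  # log(1 + x): the zero maps to 0

print(plain_log)
print(shifted_log)

# expm1() inverts log1p(), recovering the original values.
recovered <- expm1(shifted_log)
print(recovered)
```

Whether the shift by one is acceptable depends on the data, so treat this as one possible option rather than a general rule.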

2. Normalize Data with Min-Max Scaling in R

Another common way of normalizing values is the Min-Max Scaling method.

With Min-Max Scaling, we rescale the data values so that the smallest value becomes 0 and the largest becomes 1. This also shrinks the standard deviation of the data. Note that the method does not suppress outliers: because the minimum and maximum define the range, a single extreme value squeezes the remaining values into a narrow band, as the output below shows.
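
The underlying calculation is (x - min) / (max - min). As a minimal sketch in base R, assuming the same seven values used throughout this article (the helper name `min_max` is made up for this illustration and does not come from any package):

```r
# Same values used in the examples of this article.
data <- c(1200, 34567, 3456, 12, 3456, 985, 1211)

# Hypothetical helper: rescale a numeric vector to the range [0, 1].
min_max <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

norm_manual <- min_max(data)
print(round(norm_manual, 8))
print(range(norm_manual))  # the smallest value maps to 0, the largest to 1
```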

In the example below, we use the ‘caret’ library to pre-process and scale the data. The preProcess() function learns the parameters needed to scale the values to a range of 0 to 1 when given method = c("range") as an argument. The predict() method then applies that transformation to the entire data frame.

Example:

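rm(list = ls())

data <- c(1200,34567,3456,12,3456,985,1211)
summary(data)
library(caret)

# Learn the minimum and maximum of each column
process <- preProcess(as.data.frame(data), method=c("range"))

# Apply the min-max transformation and print the result
norm_scale <- predict(process, as.data.frame(data))
print(norm_scale)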

Output:

           data
1	0.03437997
2	1.00000000
3	0.09966720
4	0.00000000
5	0.09966720
6	0.02815801
7	0.03469831
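
In practice, the minimum and maximum are usually learned from one data set and then reused on new observations, which is the role the preProcess() and predict() pair plays above. The sketch below imitates that two-step pattern in base R. Note that `fit_range` and `apply_range` are hypothetical helpers written for this illustration, not caret functions, and `new_values` is made-up data.

```r
train <- c(1200, 34567, 3456, 12, 3456, 985, 1211)

# Step 1 (hypothetical helper): remember the training minimum and maximum.
fit_range <- function(x) {
  list(min = min(x), max = max(x))
}

# Step 2 (hypothetical helper): rescale any vector with the stored values.
apply_range <- function(fit, x) {
  (x - fit$min) / (fit$max - fit$min)
}

fit <- fit_range(train)

# Made-up new observations; 40000 lies above the training maximum.
new_values <- c(12, 5000, 40000)
scaled_new <- apply_range(fit, new_values)
print(scaled_new)
```

Values that fall outside the training range end up outside 0 to 1, which is worth keeping in mind when scaling new data.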

3. Normalize Data with Standard Scaling in R

In Standard Scaling, also known as standardization, we scale the data values so that every variable has a mean of zero and a unit variance.

The scale() function lets us standardize the data values: it centers each column by subtracting its mean and scales it by dividing by its standard deviation.

Example:

rm(list = ls())

data <- c(1200,34567,3456,12,3456,985,1211)
summary(data)

# Standardize the values, then print them and their summary
scale_data <- as.data.frame(scale(data))
print(scale_data)
summary(scale_data)

Output:

As seen below, the mean of the data before scaling is 6412, whereas after scaling the mean is zero.

 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     12    1092    1211    6412    3456   34567	

            V1
1	-0.4175944
2	2.2556070
3	-0.2368546
4	-0.5127711
5	-0.2368546
6	-0.4348191
7	-0.4167131

           V1         
 Min.   :-0.5128  
 1st Qu.:-0.4262  
 Median :-0.4167  
 Mean   : 0.0000  
 3rd Qu.:-0.2369  
 Max.   : 2.2556
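
The standardization that scale() performs can be reproduced by hand as (x - mean) / sd. The following sketch, assuming the same vector as above, checks that the two results agree and shows where scale() stores the center and scale values it used.

```r
data <- c(1200, 34567, 3456, 12, 3456, 985, 1211)

# Manual z-scores: subtract the mean, divide by the sample standard deviation.
z_manual <- (data - mean(data)) / sd(data)

# scale() returns a one-column matrix that carries the values it used as attributes.
z_scale <- scale(data)

print(all.equal(z_manual, as.vector(z_scale)))
print(attr(z_scale, "scaled:center"))  # the mean that was subtracted
print(attr(z_scale, "scaled:scale"))   # the standard deviation used as divisor

# After standardization the mean is numerically zero and the sd is one.
print(round(mean(z_manual), 10))
print(sd(z_manual))
```

Keeping the stored center and scale around is useful if the same transformation has to be applied to other data later.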

Conclusion

This brings us to the end of this topic. Feel free to comment below if you have any questions. For more posts related to R programming, stay tuned!

Till then, Happy Learning!! :)

References

scale() in R - Documentation



This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
