How to merge datasets in R

In this recipe, we will learn how to merge datasets in R, both horizontally and vertically, using base R functions.

In this tutorial, you will learn:
• How to merge datasets vertically, i.e. by adding rows
• How to merge datasets horizontally, i.e. by adding columns

We will first create a few sample data frames to merge.

Code:

#defining column values for dataframe 1
company <- c("Ford","BMW")
models <- c(10,23)

#creating dataframe 1
df1 <- data.frame(company,models)

#defining column values for dataframe 2
company <- c("Lamborghini","Mercedes")
models <- c(15,28)

#creating dataframe 2
df2 <- data.frame(company,models)

#defining column values for dataframe 3
company <- c("Ford","BMW")
sales <- c(2345,3921)

#creating dataframe 3
df3 <- data.frame(company,sales)

print(df1)
print(df2)
print(df3)


Output:
  company models
1    Ford     10
2     BMW     23
      company models
1 Lamborghini     15
2    Mercedes     28
  company sales
1    Ford  2345
2     BMW  3921

How to merge datasets vertically?

When you have several datasets with the same set of columns, you can merge one into another vertically. This adds new rows to your dataset while preserving its columns. Having all the records in one place lets you see the whole picture without switching between several files.
You can merge datasets vertically with “rbind()”. For data frames, rbind() matches columns by name, so both data frames need the same column names. Note that if the same observation appears in both datasets, it will be duplicated in the result. Let us merge data frames 1 and 2 using rbind():

Code:
#merging dataframes 1 and 2 vertically
rbind(df1,df2)

Output:
      company models
1        Ford     10
2         BMW     23
3 Lamborghini     15
4    Mercedes     28
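
The note about duplicate observations can be illustrated with a minimal sketch. The data frames a and b below are hypothetical (they are not part of the recipe's sample data) and deliberately share one row; unique() is one base R way to drop exact duplicate rows after stacking, not the only one.

```r
# Hypothetical data frames that share one observation (Ford, 10)
a <- data.frame(company = c("Ford", "BMW"), models = c(10, 23))
b <- data.frame(company = c("Ford", "Audi"), models = c(10, 12))

# Stacking keeps every row, so the Ford row appears twice (4 rows)
stacked <- rbind(a, b)

# unique() drops rows that are exact duplicates of an earlier row (3 rows)
deduped <- unique(stacked)

print(stacked)
print(deduped)
```

Note that unique() only removes rows that match in every column; rows that share a company but differ in models would both be kept.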

How to merge datasets horizontally?

When you have more than one dataset describing the same set of observations, you can merge them horizontally. This adds extra columns to your dataset while preserving the rows. cbind() pairs rows by position, so always check that the observations in the tables to be merged are in the same order.
You can merge datasets horizontally with “cbind()”. Let us merge data frames 1 and 3 using cbind():

Code:
#merging dataframes 1 and 3 horizontally
cbind(df1,df3)

Output:
  company models company sales
1    Ford     10    Ford  2345
2     BMW     23     BMW  3921
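
The row-order check mentioned earlier could look something like the sketch below. The data frames left and right are hypothetical and intentionally list the companies in a different order; identical() and match() are base R functions, and aligning on the company column is an assumption about which column identifies an observation.

```r
# Hypothetical data: same companies, but in a different row order
left  <- data.frame(company = c("Ford", "BMW"), models = c(10, 23))
right <- data.frame(company = c("BMW", "Ford"), sales = c(3921, 2345))

# cbind() pairs rows by position, so first check that the key columns agree
same_order <- identical(left$company, right$company)
print(same_order)

# Reorder the rows of right so they follow the order of left
right_aligned <- right[match(left$company, right$company), ]

# Bind only the new column, now that the rows line up
combined <- cbind(left, sales = right_aligned$sales)
print(combined)
```

Binding only the sales column also avoids carrying a second copy of company into the result.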

In the cbind() output above, the common column “company” appears twice, which is usually not what we want. To avoid duplicate columns, we can use the “merge()” function instead of cbind(). It joins two data frames on one or more common key variables. By default, the key is every column name the two data frames share, and the result is sorted by that key, which is why BMW comes before Ford in the output below.
Let us now merge data frames 1 and 3 using merge():

Code:
#merging dataframes 1 and 3
merge(df1,df3)

Output: 
  company models sales
1     BMW     23  3921
2    Ford     10  2345
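
When the two data frames do not contain exactly the same observations, merge() keeps only the matching rows unless told otherwise. The following is a minimal sketch with hypothetical data frames x and y (not part of the recipe's sample data), using the by, all and all.x arguments of base R's merge().

```r
# Hypothetical data: Audi appears only in x, Tesla only in y
x <- data.frame(company = c("Ford", "BMW", "Audi"), models = c(10, 23, 12))
y <- data.frame(company = c("Ford", "BMW", "Tesla"), sales = c(2345, 3921, 1500))

# Default behaviour: keep only companies present in both (2 rows)
inner <- merge(x, y, by = "company")

# all = TRUE keeps every company from either side, filling gaps with NA (4 rows)
outer <- merge(x, y, by = "company", all = TRUE)

# all.x = TRUE keeps every row of x, matched or not (3 rows)
keep_x <- merge(x, y, by = "company", all.x = TRUE)

print(inner)
print(outer)
print(keep_x)
```

Naming the key explicitly with by is optional here, but it guards against an accidental join on other columns that happen to share a name.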
