What are dataframes and how to create them?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

What are dataframes and how to create them?

What are dataframes and how to create them?

This recipe explains what are dataframes and how to create them

0

Recipe Objective

Dataframes are data-objects in R which are combination of vectors of same length. It is represented as a two-dimensional array or a table where columns represent variables of the dataset while rows are the observations in it. Unlike matrices, dataframes contains different datatypes.

Often dataframes are created by loading a dataset from existing storage like an excel file, csv file etc. But we can also create a dataframe using a list of vectors in R. This recipe demonstrates how to create a dataframe using vectors.

Step 1: Creating a list of vectors

We are going to take an example of student dataset which has variables like marks, name, ID. To create this dataframe, we will first create three vectors named "marks", "ID" and "name".

Note: the length of each vector has to be same

ID = c(1,2,3,4) name = c('Tom', "Harry", "David", "Daniel") marks = c(50,60,35,95)

Step 2: Creating a Dataframe

We use data.frame() function to create a dataframe using a list of vectors.

Syntax: data.frame(df, stringAsFactors)

where:

  1. df = is matrix or collection of vectors that needs to be joined;
  2. stringAsFactors = if TRUE, it converts string to vector by default;
student = data.frame(ID,name, marks) student
1	Tom	50
2	Harry	60
3	David	35
4	Daniel	95

Relevant Projects

Topic modelling using Kmeans clustering to group customer reviews
In this Kmeans clustering machine learning project, you will perform topic modelling in order to group customer reviews based on recurring patterns.

Data Science Project - Instacart Market Basket Analysis
Data Science Project - Build a recommendation engine which will predict the products to be purchased by an Instacart consumer again.

NLP and Deep Learning For Fake News Classification in Python
In this project you will use Python to implement various machine learning methods( RNN, LSTM, GRU) for fake news classification.

Build a Music Recommendation Algorithm using KKBox's Dataset
Music Recommendation Project using Machine Learning - Use the KKBox dataset to predict the chances of a user listening to a song again after their very first noticeable listening event.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Forecast Inventory demand using historical sales data in R
In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.

Build a Collaborative Filtering Recommender System in Python
Use the Amazon Reviews/Ratings dataset of 2 Million records to build a recommender system using memory-based collaborative filtering in Python.

Predict Employee Computer Access Needs in Python
Data Science Project in Python- Given his or her job role, predict employee access needs using amazon employee database.

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Expedia Hotel Recommendations Data Science Project
In this data science project, you will contextualize customer data and predict the likelihood a customer will stay at 100 different hotel groups.