What are dataframes and how to create them?
MACHINE LEARNING RECIPES DATA CLEANING PYTHON DATA MUNGING PANDAS CHEATSHEET     ALL TAGS

What are dataframes and how to create them?

What are dataframes and how to create them?

This recipe explains what are dataframes and how to create them

0

Recipe Objective

Dataframes are data-objects in R which are combination of vectors of same length. It is represented as a two-dimensional array or a table where columns represent variables of the dataset while rows are the observations in it. Unlike matrices, dataframes contains different datatypes.

Often dataframes are created by loading a dataset from existing storage like an excel file, csv file etc. But we can also create a dataframe using a list of vectors in R. This recipe demonstrates how to create a dataframe using vectors.

Step 1: Creating a list of vectors

We are going to take an example of student dataset which has variables like marks, name, ID. To create this dataframe, we will first create three vectors named "marks", "ID" and "name".

Note: the length of each vector has to be same

ID = c(1,2,3,4) name = c('Tom', "Harry", "David", "Daniel") marks = c(50,60,35,95)

Step 2: Creating a Dataframe

We use data.frame() function to create a dataframe using a list of vectors.

Syntax: data.frame(df, stringAsFactors)

where:

  1. df = is matrix or collection of vectors that needs to be joined;
  2. stringAsFactors = if TRUE, it converts string to vector by default;
student = data.frame(ID,name, marks) student
1	Tom	50
2	Harry	60
3	David	35
4	Daniel	95

Relevant Projects

PySpark Tutorial - Learn to use Apache Spark with Python
PySpark Project-Get a handle on using Python with Spark through this hands-on data processing spark python tutorial.

Machine Learning project for Retail Price Optimization
In this machine learning pricing project, we implement a retail price optimization algorithm using regression trees. This is one of the first steps to building a dynamic pricing model.

Identifying Product Bundles from Sales Data Using R Language
In this data science project in R, we are going to talk about subjective segmentation which is a clustering technique to find out product bundles in sales data.

Data Science Project-TalkingData AdTracking Fraud Detection
Machine Learning Project in R-Detect fraudulent click traffic for mobile app ads using R data science programming language.

Learn to prepare data for your next machine learning project
Text data requires special preparation before you can start using it for any machine learning project.In this ML project, you will learn about applying Machine Learning models to create classifiers and learn how to make sense of textual data.

German Credit Dataset Analysis to Classify Loan Applications
In this data science project, you will work with German credit dataset using classification techniques like Decision Tree, Neural Networks etc to classify loan applications using R.

Sequence Classification with LSTM RNN in Python with Keras
In this project, we are going to work on Sequence to Sequence Prediction using IMDB Movie Review Dataset​ using Keras in Python.

Ecommerce product reviews - Pairwise ranking and sentiment analysis
This project analyzes a dataset containing ecommerce product reviews. The goal is to use machine learning models to perform sentiment analysis on product reviews and rank them based on relevance. Reviews play a key role in product recommendation systems.

Customer Market Basket Analysis using Apriori and Fpgrowth algorithms
In this data science project, you will learn how to perform market basket analysis with the application of Apriori and FP growth algorithms based on the concept of association rule learning.

Perform Time series modelling using Facebook Prophet
In this project, we are going to talk about Time Series Forecasting to predict the electricity requirement for a particular house using Prophet.