What does the sample() function do in R?

This recipe explains what the sample() function does in R.

Recipe Objective

In R, we use the sample() function whenever we want to draw a random sample of a specified size from a dataset. Sampling can be done with or without replacement, and sample() works on numeric as well as character vectors.
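As a minimal sketch of the two modes described above, the following draws from a numeric vector without replacement and from a character vector with replacement. The vectors `numbers` and `fruits` are hypothetical and not part of the original recipe.

```r
# Hypothetical example vectors (not from the original recipe)
numbers <- 1:10
fruits  <- c("apple", "banana", "cherry", "mango")

# Numeric vector, without replacement: every drawn value is distinct
num_sample <- sample(numbers, size = 5, replace = FALSE)

# Character vector, with replacement: the same value may be drawn more than once
chr_sample <- sample(fruits, size = 6, replace = TRUE)

print(num_sample)
print(chr_sample)
```

No seed is set here, so the printed values will change from run to run.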

Whenever you generate a random sample, you are using an algorithm that needs a seed to initialise it. The numbers it produces are actually pseudorandom: they can be predicted if we know the seed and the generator.

Setting a seed means initialising the pseudorandom number generator. We set a seed when we need the same sequence of numbers every time we generate random numbers. If we don't set a seed, the generated pseudorandom numbers are different on each execution.
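The effect of a seed can be checked directly. The sketch below resets the generator to the same seed before two identical calls and compares the results; the seed value 42 and the range 1:100 are arbitrary choices made for illustration.

```r
# Same seed, same draws
set.seed(42)                      # 42 is an arbitrary seed chosen for illustration
first_run <- sample(1:100, 5)

set.seed(42)                      # re-initialise the generator with the same seed
second_run <- sample(1:100, 5)

identical(first_run, second_run)  # TRUE: both runs started from the same state

# Without resetting the seed, the generator continues from its current state,
# so a further call generally returns different values
third_run <- sample(1:100, 5)
print(third_run)
```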

Most simulation methods in statistics use random numbers to mimic draws from a distribution, such as the uniform or normal distribution, over a certain interval.
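To illustrate what mimicking a distribution looks like in practice, here is a small sketch using base R's runif() and rnorm(). The sample size of 1000, the seed, and the distribution parameters are assumptions chosen for the example, not values from the recipe.

```r
set.seed(1)  # arbitrary seed so the simulation can be repeated

# 1000 draws from a uniform distribution on the interval [0, 1]
u <- runif(1000, min = 0, max = 1)

# 1000 draws from a normal distribution with mean 0 and standard deviation 1
z <- rnorm(1000, mean = 0, sd = 1)

# Sample statistics should land near the theoretical values:
# 0.5 for the uniform mean, 0 and 1 for the normal mean and sd
print(mean(u))
print(mean(z))
print(sd(z))
```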

In this recipe, you will learn how to use the sample() function after setting a seed.

Example:

Generate a sample of 10 random integers between 1 and 30, without replacement (i.e. every value will be unique), after setting a seed.

Syntax: sample(x, size = , replace = )

where:

  1. x = (equivalent to population) Dataset or a vector of more than 1 element from which the sample is chosen (a single positive number n is treated as 1:n)
  2. size = Size of the sample
  3. replace = TRUE to sample with replacement, FALSE (the default) to sample without replacement
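The interaction between size and replace can be sketched as follows. The vector `population` is hypothetical; the point is that size may exceed the length of x only when replace = TRUE.

```r
population <- 1:5  # hypothetical population of five values

# size equal to the population size, without replacement: a random permutation
perm <- sample(population, size = 5, replace = FALSE)

# size larger than the population is allowed only with replace = TRUE
big <- sample(population, size = 8, replace = TRUE)

# Asking for more values than exist without replacement raises an error;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(
  sample(population, size = 8, replace = FALSE),
  error = function(e) conditionMessage(e)
)

print(perm)
print(big)
print(err)
```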

We use the set.seed() function to set a seed. Any integer can be passed to the function as the seed.

# setting a seed
set.seed(20)

# generating a sample of 10 random numbers between 1 and 30 without replacement
# (i.e. every value will be unique)
sample(1:30, 10, replace = FALSE)
6 11 24 2 25 27 13 9 3 28

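Note: The random numbers generated remain the same across multiple executions, as long as the same seed is set before each call. The exact values can differ between R versions, because R 3.6.0 changed the default sampling algorithm.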
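The same pattern extends from vectors to a dataset: sample the row indices, then subset. This is only a sketch, and the data frame `df` with its `id` and `score` columns is invented for illustration.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(
  id    = 1:8,
  score = c(72, 85, 90, 66, 78, 88, 95, 70)
)

set.seed(20)  # same seed value as in the recipe above

# Draw 3 distinct row indices, then use them to subset the data frame
row_idx   <- sample(nrow(df), size = 3, replace = FALSE)
df_sample <- df[row_idx, ]

print(df_sample)
```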
