Predict Employee Computer Access Needs in Python

Data Science Project in Python: given an employee's job role, predict their access needs using the Amazon employee database.


What will you learn

Understanding the problem statement
Importing the necessary libraries and understanding their use
Importing the dataset and performing basic EDA
Checking for null values and filling them with appropriate values
Visualization using Barplot
Performing univariate analysis and data transformation
Dictionary encoding and decoding using functions
Grouping data for combined analysis by creating functions
Creating functions for label encoding and one hot encoding
Creating a function for preprocessing the test dataset
Creating a function for K-fold cross validation
Making the "main" function that performs every processing and gives the final predictions in CSV format
Performing approximate greedy feature selection
Applying Logistic Regression
Hyper-parameter tuning the model for the best result
Evaluation using AUC score
Calculating the final predicted probabilities and saving them in CSV format
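
Several of the steps above (one hot encoding, K-fold cross validation, greedy feature selection, Logistic Regression and AUC scoring) can be sketched together. This is a minimal illustration and not the project's own notebook code: the data is synthetic, the column names (RESOURCE, MGR_ID, ROLE_TITLE, ACTION) are assumed to follow the public Kaggle version of the Amazon employee access data, NOISE is a made-up column, and the helper names make_toy_data, cv_auc and greedy_forward_selection are hypothetical. The selection shown is plain greedy forward selection, which may differ from the approximate variant used in the videos.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import OneHotEncoder


def make_toy_data(n_rows=600, seed=0):
    """Synthetic stand-in for the real data: integer-coded categorical columns."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "RESOURCE": rng.integers(0, 20, n_rows),
        "MGR_ID": rng.integers(0, 15, n_rows),
        "ROLE_TITLE": rng.integers(0, 10, n_rows),
        "NOISE": rng.integers(0, 10, n_rows),  # made-up, irrelevant column
    })
    # In this toy setup approval depends on ROLE_TITLE and RESOURCE only.
    logit = 1.5 * (df["ROLE_TITLE"] < 5) + 1.0 * (df["RESOURCE"] % 2 == 0) - 1.0
    prob = 1.0 / (1.0 + np.exp(-logit))
    df["ACTION"] = (rng.random(n_rows) < prob).astype(int)
    return df


def cv_auc(df, features, target="ACTION", n_splits=5, C=1.0, seed=0):
    """Mean AUC over stratified K folds, one-hot encoding the chosen features."""
    X = df[features].to_numpy()
    y = df[target].to_numpy()
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, valid_idx in folds.split(X, y):
        # Fit the encoder on the training fold only; unseen IDs become all zeros.
        encoder = OneHotEncoder(handle_unknown="ignore")
        X_train = encoder.fit_transform(X[train_idx])
        X_valid = encoder.transform(X[valid_idx])
        model = LogisticRegression(C=C, max_iter=1000)
        model.fit(X_train, y[train_idx])
        valid_prob = model.predict_proba(X_valid)[:, 1]
        scores.append(roc_auc_score(y[valid_idx], valid_prob))
    return float(np.mean(scores))


def greedy_forward_selection(df, candidates, target="ACTION"):
    """Add one feature at a time, keeping it only if the CV AUC improves."""
    selected, best_score = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(cv_auc(df, selected + [f], target), f) for f in remaining]
        score, feature = max(scored)
        if score <= best_score:
            break  # no remaining feature improves the score
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


toy = make_toy_data()
candidates = ["RESOURCE", "MGR_ID", "ROLE_TITLE", "NOISE"]
selected, best_score = greedy_forward_selection(toy, candidates)
print("selected features:", selected)
print("cross-validated AUC:", round(best_score, 3))
```

The C argument of cv_auc is the regularization strength of LogisticRegression, so the same function could be called over a few values of C as a simple form of hyper-parameter tuning.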

Project Description

When an employee at any company starts work, they first need to obtain the computer access necessary to fulfill their role. This access may allow an employee to read/manipulate resources through various applications or web portals. It is assumed that employees fulfilling the functions of a given role will access the same or similar resources. It is often the case that employees figure out the access they need as they encounter roadblocks during their daily work (e.g. not able to log into a reporting portal). A knowledgeable supervisor then takes time to manually grant the needed access in order to overcome access obstacles. As employees move throughout a company, this access discovery/recovery cycle wastes a nontrivial amount of time and money.


There is a considerable amount of data regarding an employee’s role within an organization and the resources to which they have access. Given the data related to current employees and their provisioned access, models can be built that automatically determine access privileges as employees enter and leave roles within a company. In this data science project, we will build an auto-access model that minimizes the human involvement required to grant or revoke employee access.
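
As a rough sketch of what such an auto-access model might look like, the example below fits a classifier on past access decisions and scores new requests. Everything in it is an assumption made for illustration: the tiny hand-written tables, the column names (ROLE_TITLE, RESOURCE, ACTION, id, pred_probability) and the choice of one hot encoding followed by Logistic Regression.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical history of access decisions: ACTION is 1 if access was granted.
history = pd.DataFrame({
    "ROLE_TITLE": ["analyst", "analyst", "engineer", "engineer",
                   "analyst", "engineer", "manager", "manager"],
    "RESOURCE": ["reporting_portal", "build_server", "build_server",
                 "reporting_portal", "reporting_portal", "build_server",
                 "reporting_portal", "build_server"],
    "ACTION": [1, 0, 1, 0, 1, 1, 1, 0],
})

# Hypothetical new requests; "intern" never appears in the history.
requests = pd.DataFrame({
    "id": [1, 2, 3],
    "ROLE_TITLE": ["analyst", "engineer", "intern"],
    "RESOURCE": ["reporting_portal", "reporting_portal", "build_server"],
})

features = ["ROLE_TITLE", "RESOURCE"]
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),  # unseen categories encode as zeros
    LogisticRegression(max_iter=1000),
)
model.fit(history[features], history["ACTION"])

# Probability that each request would be approved.
submission = pd.DataFrame({
    "id": requests["id"],
    "pred_probability": model.predict_proba(requests[features])[:, 1],
})

# With no path, to_csv returns the CSV text; pass a file path to write a file.
csv_text = submission.to_csv(index=False)
print(csv_text)
```

A threshold on the predicted probability could then separate requests that are granted automatically from those routed to a supervisor; where to set that threshold is a policy choice the description does not specify.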

Curriculum For This Mini Project

Understanding the data set
Univariate Data Analysis
Univariate Data Analysis - Troubleshooting
Example Univariate Data Analysis
Model Building
Data Transformation - Feature Engineering
Utility Functions
Count Variables
Feature Creation - 2 Way Count
2 Way Count - Role Family Variable
Feature Creation - 3 Way Count
Defining Rollup Variable To Combine Results
Computing Role Type Id Creation
Computing Resource Type Id Creation
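
The count and N-way count features named in the curriculum can be illustrated with a short pandas sketch. It assumes a count feature means "how often this value, or this combination of values, occurs in the data"; the helper add_count_feature, the CNT_ naming scheme and the toy table are hypothetical, and the column names are assumed to follow the public Kaggle version of the dataset.

```python
import pandas as pd


def add_count_feature(df, columns, name=None):
    """Return a copy of df with a column counting each combination of `columns`."""
    name = name or "CNT_" + "_".join(columns)
    out = df.copy()
    # transform("size") broadcasts each group's row count back to its rows.
    out[name] = out.groupby(columns)[columns[0]].transform("size")
    return out


toy = pd.DataFrame({
    "MGR_ID": [10, 10, 10, 20, 20],
    "ROLE_FAMILY": [1, 1, 2, 1, 1],
    "RESOURCE": [7, 8, 7, 7, 7],
})

toy = add_count_feature(toy, ["MGR_ID"])                             # 1-way count
toy = add_count_feature(toy, ["MGR_ID", "ROLE_FAMILY"])              # 2-way count
toy = add_count_feature(toy, ["MGR_ID", "ROLE_FAMILY", "RESOURCE"])  # 3-way count
print(toy)
```

Counts like these turn high-cardinality ID columns into numeric signals, for example how many requests a manager appears in, without adding one column per ID value.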