Data Science Project: Movie Review Sentiment Analysis using R

Learn to classify the sentiment of sentences from the Rotten Tomatoes dataset. You will be asked to label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

Customer Love

Hiren Ahir

Microsoft Azure SQL Server Developer, BI Developer

I'm a Graduate student and came into the job market and found a university degree wasn't sufficient to get a good paying job. I aimed at hottest technology in the market Big Data but the word BigData...

Ray Han

Tech Leader | Stanford / Yale University

I think that they are fantastic. I attended Yale and Stanford and have worked at Honeywell, Oracle, and Arthur Andersen (Accenture) in the US. I have taken Big Data and Hadoop, NoSQL, Spark, Hadoop...

What will you learn

Understanding the problem statement
Importing the dataset and unzipping a zipped file
Loading all the necessary libraries for NLP
What is a Bag of Words model
Tokenization, N-grams and Splitting
Difference between Stemming and Lemmatization
What is part-of-speech tagging
VNegterms, Negterms, and VPOSterms
Installing packages for Naive Bayes and SVM
What a sparse matrix is and how it is applied
Random Sampling
Converting from Word to Vector for prediction
Applying Naive Bayes for training model and making predictions
Applying SVM for training model and making predictions
Making predictions for test dataset
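
To make the tokenization, N-gram, and bag-of-words topics above concrete, here is a minimal sketch in base R. It is not the course's own code: the toy phrases and the helper names `tokenize` and `ngrams` are assumptions made for illustration, and a real project would more likely rely on an NLP package than on hand-written helpers.

```r
# Toy phrases standing in for the Rotten Tomatoes data (illustrative only).
phrases <- c("A good movie", "Not a good movie", "A dull , dull plot")

# Tokenization: lowercase the text, then keep runs of letters and apostrophes.
# (Hypothetical helper, not from the course material.)
tokenize <- function(text) {
  lowered <- tolower(text)
  regmatches(lowered, gregexpr("[a-z']+", lowered))[[1]]
}

# N-grams: join every run of n consecutive tokens with a space.
# (Hypothetical helper, not from the course material.)
ngrams <- function(tokens, n = 2) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Bag of words: one row per phrase, one column per vocabulary term,
# each cell holding how often the term occurs. Word order is discarded.
tokens_list <- lapply(phrases, tokenize)
vocab <- sort(unique(unlist(tokens_list)))
bow <- t(vapply(tokens_list,
                function(tok) as.integer(table(factor(tok, levels = vocab))),
                integer(length(vocab))))
colnames(bow) <- vocab

print(ngrams(tokenize(phrases[2]), 2))
print(bow)

# Share of zero cells: on a real corpus this is close to 1, which is
# why document-term matrices are usually stored in a sparse format.
print(mean(bow == 0))
```

Each row of `bow` is the word vector for one phrase, which is the kind of input a classifier such as Naive Bayes or an SVM would be trained on.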

Project Description

With the increasing usage of social media such as Twitter and review websites like Yelp and Rotten Tomatoes, it has become important to glean insights from the huge amounts of subjective, opinionated data. The Rotten Tomatoes movie review dataset is a corpus of movie reviews used for sentiment analysis, originally collected by Pang and Lee. In their work on sentiment treebanks, Socher et al. used Amazon's Mechanical Turk to create fine-grained labels for all parsed phrases in the corpus. You will get a chance to benchmark your sentiment-analysis ideas on the Rotten Tomatoes dataset. You are asked to label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this data science project challenging.
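
As a small illustration of two points in the description, the base R sketch below encodes the five sentiment values as an ordered factor and shows one common, simple way to keep negation visible to a bag-of-words model. The helper `mark_negation`, its list of negation words, and the two-token window are all assumptions chosen for the example; the project itself may handle negation differently.

```r
# The five sentiment values from the description, ordered from worst to best.
sentiment_levels <- c("negative", "somewhat negative", "neutral",
                      "somewhat positive", "positive")
labels <- factor(c("neutral", "positive"),
                 levels = sentiment_levels, ordered = TRUE)

# Hypothetical helper: prefix the tokens that follow a negation word with
# "NOT_" so that "good" and "not ... good" become different features.
# The negator list and the window size are illustrative assumptions.
mark_negation <- function(tokens,
                          negators = c("not", "no", "never"),
                          window = 2) {
  out <- tokens
  remaining <- 0
  for (i in seq_along(tokens)) {
    if (tokens[i] %in% negators) {
      remaining <- window            # start (or restart) the negation scope
    } else if (remaining > 0) {
      out[i] <- paste0("NOT_", tokens[i])
      remaining <- remaining - 1
    }
  }
  out
}

print(labels[1] < labels[2])   # ordered factors support comparisons
print(mark_negation(c("not", "a", "good", "movie")))
```

A fixed window is a crude stand-in for real negation scope, and it does nothing for sarcasm or ambiguity, which is part of why the description calls the task challenging.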

Similar Projects

Machine Learning Project in R: Predict customer churn in the telecom sector and find the key drivers that lead to churn. Learn how a logistic regression model in R can be used to identify customer churn in a telecom dataset.

In this machine learning project, you will uncover the predictive value in an uncertain world by using various artificial intelligence, machine learning, advanced regression and feature transformation techniques.

Given a partial trajectory of a taxi, you will be asked to predict its final destination using the taxi trajectory dataset.

Curriculum For This Mini Project

30-Jan-2016
05h 45m