Data Science Project: Movie Review Sentiment Analysis using R

Learn to classify the sentiment of sentences from the Rotten Tomatoes dataset. You will be asked to label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive.

Videos

Each project comes with 2-5 hours of micro-videos explaining the solution.

Code & Dataset

Get access to 50+ solved projects with IPython notebooks and datasets.

Project Experience

Add project experience to your LinkedIn/GitHub profiles.

What will you learn

Understanding the problem statement
Importing the dataset and unzipping a zipped file
Loading all the necessary libraries for NLP
What a bag-of-words model is
Tokenization, n-grams, and splitting
Difference between stemming and lemmatization
What part-of-speech tagging is
VNegterms, Negterms, and VPOSterms
Installing packages for Naive Bayes and SVM
What a sparse matrix is and how it is applied
Random sampling
Converting words to vectors for prediction
Applying Naive Bayes to train a model and make predictions
Applying SVM to train a model and make predictions
Making predictions for the test dataset
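
Several of the topics above (tokenization, n-grams, the bag-of-words model, and sparse matrices) fit together in a few lines of code. The project itself is taught in R; the sketch below uses plain Python only as a language-neutral illustration, and it is not the project's own solution. The function names (tokenize, ngrams, build_vocabulary, to_sparse_vector) and the two sample phrases are hypothetical, made up for this example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if word:
            tokens.append(word)
    return tokens


def ngrams(tokens, n):
    """Return the n-grams of a token list, in order, as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(documents, n_max=2):
    """Assign a column index to every distinct 1-gram up to n_max-gram."""
    vocab = {}
    for doc in documents:
        tokens = tokenize(doc)
        for n in range(1, n_max + 1):
            for term in ngrams(tokens, n):
                vocab.setdefault(term, len(vocab))
    return vocab


def to_sparse_vector(text, vocab, n_max=2):
    """Bag-of-words counts as {column_index: count}; zeros are not stored."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for term in ngrams(tokens, n):
            if term in vocab:
                counts[vocab[term]] += 1
    return dict(counts)


# Invented sample phrases, used only to exercise the functions.
phrases = ["A good movie.", "Not a good movie!"]
vocab = build_vocabulary(phrases)
vectors = [to_sparse_vector(p, vocab) for p in phrases]
```

Each phrase becomes one row of a document-term matrix. Because a short phrase uses only a handful of the terms in the vocabulary, storing just the non-zero counts keeps the matrix small, which is the main reason sparse matrices are used for text. The bigram "not a" appears only in the second phrase, so n-grams preserve some word-order information that single words lose.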

Project Description

With the growing use of social media such as Twitter and of review websites like Yelp and Rotten Tomatoes, it has become important to glean insights from the huge amounts of subjective, opinionated data they generate. The Rotten Tomatoes movie review dataset is a corpus of movie reviews used for sentiment analysis, originally collected by Pang and Lee. In their work on sentiment treebanks, Socher et al. used Amazon's Mechanical Turk to create fine-grained labels for all parsed phrases in the corpus. In this project you will benchmark your sentiment-analysis ideas on the Rotten Tomatoes dataset by labeling phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this data science project challenging.
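
The five-value labeling task can be sketched as a small multinomial Naive Bayes classifier with add-one smoothing. This is a minimal Python illustration under stated assumptions, not the project's R solution: the 0 to 4 integer encoding of the five labels is an assumption, the training phrases are invented toy data rather than entries from the Rotten Tomatoes corpus, and the names train and predict are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Assumed integer encoding of the five sentiment values.
LABELS = {0: "negative", 1: "somewhat negative", 2: "neutral",
          3: "somewhat positive", 4: "positive"}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (phrase, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for phrase, label in examples:
        label_counts[label] += 1
        for word in tokenize(phrase):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, phrase):
    """Return the label with the highest log-probability for the phrase."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(phrase):
            if word in vocab:  # words never seen in training are skipped
                # add-one (Laplace) smoothed log likelihood
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy phrases, one per label; not taken from the real dataset.
toy = [
    ("a terrible boring mess", 0),
    ("somewhat dull", 1),
    ("the film", 2),
    ("fairly enjoyable", 3),
    ("a wonderful brilliant film", 4),
]
model = train(toy)
```

On this toy model, `LABELS[predict(model, "brilliant")]` is "positive", while `predict(model, "not terrible")` still returns 0 (negative): the word "not" was never seen in training and word order is ignored, which is a small instance of the negation obstacle described above.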

Similar Projects

In this machine learning project, you will develop a machine learning model to accurately forecast inventory demand based on historical sales data.

In this machine learning project, we will predict which coupons a customer will buy.

In this data science project, you will work with the German credit dataset, using classification techniques such as decision trees and neural networks to classify loan applications in R.

Curriculum For This Mini Project

30-Jan-2016
05h 45m