In God We Trust - Everyone Else Brings Data

We are adrift in a sea of data, with different protocols feeding us status information and statistics round the clock. There are systems that aggregate data and sort it into critical alert levels and action items. There are systems that correlate data and present the big picture to the human brain, which identifies patterns in what it sees. The problem is that the conclusions people reach this way are at times incorrect. Big data analytics has become the key to preventing back-and-forth arguments and incorrect conclusions, because in today’s decision making the data does the talking. People might never stop lying, but big data and analytics can at least make people more honest about themselves. This article explores how big data is becoming the real decision maker in every activity, regardless of human assumptions.

“You are how you click and not how you speak”

Faking an identity is no longer easy now that clickstream data is being leveraged for analytics to gain competitive advantage. Users can no longer hide which websites they visit or what they do on them. Every time a user visits a webpage or performs an action, it can be tracked by an analytics service such as Google’s, and a “hit” is recorded. A clickstream hit shows the analytics system when and from where a user came to the website, which pages he or she viewed, how much time the user spent on each page, and when and where he or she left. Clickstream data was one of the first unstructured types of data analysed in the early days of Hadoop.
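
To make the idea of a clickstream hit concrete, here is a minimal Python sketch. The record layout (visitor id, timestamp, page, referrer), the sample values and the function name `summarize_sessions` are all assumptions made for illustration; real analytics services use their own formats. It derives where each visitor came from, which page they left on, and the time spent on each page.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hit records: (visitor_id, ISO timestamp, page, referrer).
hits = [
    ("v1", "2016-03-01T10:00:00", "/home", "search-engine"),
    ("v1", "2016-03-01T10:00:40", "/pricing", "/home"),
    ("v1", "2016-03-01T10:02:10", "/signup", "/pricing"),
    ("v2", "2016-03-01T10:05:00", "/home", "newsletter"),
]


def summarize_sessions(hits):
    """Group hits by visitor and derive entry source, exit page and time per page."""
    by_visitor = defaultdict(list)
    for visitor, ts, page, referrer in hits:
        by_visitor[visitor].append((datetime.fromisoformat(ts), page, referrer))

    summary = {}
    for visitor, rows in by_visitor.items():
        rows.sort(key=lambda r: r[0])  # order each visitor's hits by time
        seconds_on_page = {}
        # Time on a page = gap until the visitor's next hit.
        for (t0, page, _), (t1, _, _) in zip(rows, rows[1:]):
            elapsed = (t1 - t0).total_seconds()
            seconds_on_page[page] = seconds_on_page.get(page, 0) + elapsed
        summary[visitor] = {
            "came_from": rows[0][2],
            "exit_page": rows[-1][1],
            "seconds_on_page": seconds_on_page,
        }
    return summary


summary = summarize_sessions(hits)
```

Note that the time spent on the last page of a visit cannot be derived this way, because there is no later hit to measure against.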

In a data-driven world where the spectre of big data and analytics hangs above all of us, lies will be tougher to get away with. Big data and analytics could possibly mean the end of cheating and fraud, since rigorous analytics algorithms can catch outliers in data that would otherwise go unnoticed. Big data gives us something interesting: honesty. Big data and analytics can automatically verify whether a person is acting in a manner that we should trust.

No More Lies on Social Media Platforms - Big Data and Analytics Will Reveal the Truth

Lies on social media platforms can proliferate quickly, with severe consequences. In 2013, the official Twitter account of the Associated Press was hacked. The hackers sent a tweet to approximately 2 million followers that read “Breaking: Two Explosions in the White House and Barack Obama is injured”. The rumour went viral on the web and the Dow plummeted by about 140 points.


Social media platforms have realized that such viral rumours can lead to critical situations. They are therefore adopting big data and analytics solutions to surface reliable information or alert the concerned authorities before things get out of hand.

If your LinkedIn profile contains overstatements and is heaving with partial truths about your graduate degree, you might want to take another look at it before big data analytics tools catch you red-handed. Now that social media has your attention, it would also like to have your trust, which helps businesses leverage big data and analytics.

LinkedIn has recently bought a patented interactive fact-checking system. The system is similar to an automatic spell checker, but for facts.

For instance, suppose you update your LinkedIn profile with “Won First Prize in an International Beauty Pageant”. The analytics system will look up the winners of the beauty pageant and, if it cannot immediately verify your claim, pose more questions.

If the analytics system learns that you did not actually participate in the beauty pageant, the idea is to stop you from spreading false information to others. Such a system at least shows some willingness to curb the spread of lies on social media platforms.

No More Lies with your Boss - Big Data and Analytics will reveal the Truth

You are joking around in the lunchroom, trading wisecracks, and suddenly the conversation turns to gossip about your boss and his micromanagement. You like your job, but the work culture is bothering you.

“What if your boss already knows that you want to quit the job much before you know it?”

You may deny that you have any plan to quit, but with advancements in big data analytics tools such lies are getting tougher to get away with.

Companies like Walmart, Box and Credit Suisse are running analytics algorithms over various data points to find out who is likely to leave a job. The idea is to give employers a heads-up so that they can take the necessary action well before employees send in their resignation letters. The algorithms run through several factors such as performance ratings and reviews, job tenure, employee surveys, personality tests, and communication patterns. Walmart is leveraging all this data to understand the complex picture behind what motivates employees to stay and what makes them leave.

Using big data and analytics on dozens of factors, Credit Suisse found that team size and manager performance have a great influence on employee attrition. It observed a sharp spike in attrition among employees working in large teams under managers with low ratings.
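
As a rough illustration of this kind of segment analysis, the Python sketch below computes attrition rates by team size and manager rating. The records, the thresholds (`large_team=15`, `low_rating=3`) and the function name `attrition_by_segment` are invented for the example; this is not Credit Suisse’s data or method.

```python
from collections import defaultdict

# Hypothetical employee records: (team_size, manager_rating 1-5, left_company).
employees = [
    (25, 2, True), (25, 2, True), (25, 2, False),
    (25, 4, False), (25, 4, True),
    (6, 2, False), (6, 2, True),
    (6, 4, False), (6, 4, False), (6, 4, False),
]


def attrition_by_segment(records, large_team=15, low_rating=3):
    """Return the attrition rate for each (team size, manager rating) segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [leavers, headcount]
    for team_size, rating, left in records:
        segment = (
            "large" if team_size >= large_team else "small",
            "low" if rating < low_rating else "high",
        )
        counts[segment][0] += int(left)
        counts[segment][1] += 1
    return {seg: leavers / total for seg, (leavers, total) in counts.items()}


rates = attrition_by_segment(employees)
worst_segment = max(rates, key=rates.get)
```

With this made-up data the segment with the highest attrition is large teams with low-rated managers. A real analysis would need far more records and a check that the differences between segments are statistically meaningful.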

Big data and analytics can help HR teams quickly check an applicant’s previous employment history and graduation records. Analytics systems are being developed to automatically verify the information given in job applications.

On the other hand, big data and analytics can put an end to certain privileges: good looks and charm will not help candidates get a job when a strictly neutral analytics algorithm deems them unworthy.

No More Lies with your Insurance Claims - Big Data and Analytics will reveal the Truth

Estimates suggest that close to 10% of insurance claims are fraudulent and that the global total of fraudulent claims runs into billions of dollars. Insurance companies are using big data and analytics to develop predictive models based on call centre notes, voice recordings, wages, demographics, attorney fees, medical claims and weather data so that they can identify people who are filing false claims at an early stage.

For example, an individual may claim that his car was submerged in a flood while his social media activity shows that the car was not actually in the city when the flood occurred. Insurance companies can catch the lies claimants tend to tell by analysing social media data which indicates that the events described in an insurance claim might not have taken place on the day in question.

Gadgets and smartphones play a vital role in evaluating the legitimacy of insurance claims. Accelerometers, light sensors and GPS trackers make it possible for data analysts to find out where the claimant was, how fast he or she was moving, and whether he or she was texting when the crash took place. Thus, big data and analytics let insurers rely on insights generated from gadgets that don’t lie instead of depending on the statements of drivers who possibly may.
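
As a small illustration of how speed can be recovered from location data, the Python sketch below estimates average speed between two GPS fixes using the haversine great-circle distance. The fix layout (Unix time, latitude, longitude) and the function names are assumptions for the example; real telematics systems also have to deal with sensor noise and missing fixes.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_kmh(fix_a, fix_b):
    """Average speed between two fixes, each (unix_seconds, latitude, longitude)."""
    t0, lat0, lon0 = fix_a
    t1, lat1, lon1 = fix_b
    if t1 <= t0:
        raise ValueError("fixes must be in increasing time order")
    metres_per_second = haversine_m(lat0, lon0, lat1, lon1) / (t1 - t0)
    return metres_per_second * 3.6  # convert m/s to km/h
```

An analyst could compare the speed estimated this way from the fixes recorded just before a crash with the speed stated in the claim.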

Insurance analytics systems can now watch the manner in which you fill out an online application form to discern signs of fraud. If an individual suddenly changes an answer while filling out the insurance application just to get a cheaper price, the system might flag the application as potentially fraudulent.
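
A minimal Python sketch of this kind of check is shown below. It counts how often each field of a form was changed and returns the fields that exceed a threshold. The event log format, the sample values, the `max_changes` threshold and the function name `fields_to_review` are all hypothetical; a production system would weigh many more signals before flagging anything.

```python
from collections import Counter

# Hypothetical event log from an online form: (field, new_value), in order.
events = [
    ("annual_mileage", "20000"),
    ("postcode", "AB1 2CD"),
    ("annual_mileage", "12000"),
    ("annual_mileage", "8000"),
    ("smoker", "no"),
]


def fields_to_review(events, max_changes=1):
    """Return fields whose value was changed more than max_changes times."""
    last_value = {}
    changes = Counter()
    for field, value in events:
        # Count a change only when a field already had a different value.
        if field in last_value and last_value[field] != value:
            changes[field] += 1
        last_value[field] = value
    return sorted(f for f, n in changes.items() if n > max_changes)


flagged = fields_to_review(events)
```

A flagged field is only a prompt for closer review, since applicants also change answers to correct honest mistakes.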

Lie Detection Innovations with Big Data and Analytics in Action

Several interdisciplinary big data projects are leveraging analytics to keep individuals honest, whether they are posting on social media, being monitored by insurance authorities, or doing anything else.

  • Pheme is an interdisciplinary big data project funded by the European Union. Pheme has partners from the fields of text mining, language processing, web science, social network analysis and information visualization. The main goal of the project is to develop and release open source big data and analytics algorithms to detect lies online.

  • AVATAR, short for Automated Virtual Agent for Truth Assessments in Real-Time, is an interdisciplinary big data project that makes use of non-invasive sensor technologies and artificial intelligence. This analytics system, developed by the University of Arizona’s National Centre for Border Security and Immigration, is specifically meant to make borders safe. The AVATAR system conducts an automated interview and analyses biometric data such as body movements, vocal pitch, eye movement and pupil dilation, along with document data such as travel history and the visa application form, to assess the credibility of an individual. It notifies the authorities of anomalous behaviour and flags suspicious activity that requires the attention of an investigating agent. The system is best suited to airports, visa processing centres, asylum request processing and land ports of entry.
  • CVSA, short for Computer Voice Stress Analysis, is an analytics system that uses vocal pitch as biometric data: changes in vocal pitch help the system detect lies. This lie-detecting technology is widely used to monitor sex offenders on supervised release.

Big data and analytics create an interesting tension: as businesses increasingly leverage big data analytics, they might in reality distance themselves from the “TRUTH” by posing more questions than users can answer. This does not mean that big data is wrong; it is just that at times big data provides only one part of the picture.

“The truth is rarely pure and never simple”, wrote Oscar Wilde. The phrase has real significance in the modern big data world. In theory, the answers to all our questions are within our grasp, and it is the deluge of big data that is taking us closer to the truth. The ubiquitous growth of big data and analytics across business domains has the potential to reveal significant insights.

With big data trends making waves in wearable technology, the Internet of Things (IoT) and sensor-driven applications in smart gadgets, the world is not far from a smartphone that can tell you whether the person on the other end of a call is telling the truth.

People might never stop lying, but big data and analytics can at least make people more honest about themselves. Big data never lies, and it will continue to be leveraged through analytics to discourage untruths.
