Data scientists have always been around; it is just that no one knew the work these people were doing was called data science. Data science as a discipline has emerged only in the last couple of years, but people have long been working in the data science domain as statisticians, mathematicians, machine learning and actuarial scientists, business analytics practitioners, digital analytics consultants, quality analysts and spatial data scientists. The people working in these roles are well equipped with data science skills and are in high demand in the industry.
Data science has fast emerged as a challenging, lucrative and highly rewarding career. While developed countries became familiar with it halfway through the last decade, data science caught attention on a global scale after the exponential growth of e-commerce in developing economies, especially India and China. In the past decade there has been a considerable paradigm shift in the way the world shops, books holidays, makes transactions and does pretty much everything else.
The emergence of mobile technology, combined with a spurt in the growth of affordable smartphones and mobile internet usage, generates vast amounts of data every second. To give you a feel for it, the world has about 2.5 zettabytes of data at present, and by the end of 2020 that figure is expected to cross 8 zettabytes. Organizations are fully aware of the expanding volume of data being generated and are keen to leverage it to their advantage.
Different organizations around the world (government agencies, banking and financial institutions, telecom companies, media and market research companies, etc.) generate different kinds of data, which in turn require different types of data analysis. In this article we look at these types and at the titles data scientists are given, depending on the work profile expected of them.
Data scientists are given different titles in different organizations. According to datasciencecentral, there are 400 different designations assigned to them. A marketing research company would require a statistician to crunch survey data and formulate its strategy, whereas an advertising agency would require a data expert to dig into TRP data and create actionable insights for planning the next stage of an advertising campaign for its clients.
Contrary to popular belief, data science is not entirely about numbers, though it is a lot about them. A statistician, an astrologer, a survey designer and a biostatistician all play a data scientist’s role at some point without being known as one. A number of programming languages and software applications support data analysis functions, and they require different levels of programming skill. The following sections explore the different types of data scientists and the functions they perform:
Statistics is data analysis in the traditional sense; the field has always been about number crunching. A strong statistical base equips you to branch out into a number of data science fields. Hypothesis testing, confidence intervals, Analysis of Variance (ANOVA), data visualization and quantitative research are some of the core skills statisticians possess, and they carry over to the specialised data science fields described in the rest of this article.
Statistics knowledge, when combined with domain knowledge (such as marketing, risk or actuarial science), is the ideal combination to land a statistician’s work profile. Statisticians can develop statistical models from big data analysis, carry out experimental design and apply theories of sampling, clustering and predictive modelling to available data to determine future corporate actions.
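To make two of the core skills mentioned above more concrete, here is a minimal sketch of a confidence interval and a one-way ANOVA F statistic, written with only the Python standard library. The campaign scores and the function names are hypothetical, made up for illustration. The interval uses a normal approximation (z = 1.96), which is a simplifying assumption; for samples this small a t-distribution would normally be more appropriate.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * sem, mean + z * sem


def one_way_anova_f(groups):
    """F statistic for one-way ANOVA: between-group over within-group variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical survey scores for three marketing campaigns.
campaign_a = [4.0, 5.0, 6.0]
campaign_b = [6.0, 7.0, 8.0]
campaign_c = [8.0, 9.0, 10.0]

low, high = mean_confidence_interval(campaign_a)
f_stat = one_way_anova_f([campaign_a, campaign_b, campaign_c])
print(f"Interval for campaign A mean: ({low:.2f}, {high:.2f})")
print(f"ANOVA F statistic: {f_stat:.2f}")
```

A large F statistic suggests that the campaign means differ by more than the variation within each campaign would explain; turning it into a p-value requires the F distribution, which a statistics package would normally supply.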
Mathematicians have conventionally been associated with extensive theoretical research, but the emergence of big data and data science has changed that perception. Mathematicians are gaining more acceptance in the corporate world than ever before, owing to their deep knowledge of operations research and applied mathematics. Businesses seek their services to carry out analytics and optimization in fields such as inventory management, forecasting, pricing algorithms, supply chains, quality control mechanisms and defect control. Defence and military organizations also seek mathematicians to carry out crucial big data assignments such as digital signal processing, series analysis and transformative algorithms.
Data engineers are often confused with data scientists. However, a data engineer’s role is very different from that of a data scientist. A data engineer is responsible for designing, building and managing the information captured by an organization. They are entrusted with putting in place a data handling infrastructure to analyse and process data in line with the organization’s requirements, and they are also responsible for its smooth functioning. Data engineers need to work closely with data scientists, IT managers and other business leaders to translate raw data into actionable insights that give the organization a competitive edge.
Computer systems around the world are increasingly being equipped with artificial intelligence and decision-making capabilities. They use neural networks that are programmed for adaptive learning, meaning they can be trained over a period of time to make the same decisions when the same set of inputs is given to them. Machine learning scientists develop such algorithms, which are used to suggest products and pricing strategies, extract patterns from big data inputs and, most importantly, forecast demand (which in turn supports better inventory management, stronger supply chain networks, etc.).
Actuarial science has been around for a long time. Banks and financial institutions rely heavily on it to predict market conditions and to estimate future income, revenue and profits or losses from mathematical models.
It is possible to be an actuarial scientist without going through data science training, but a data scientist will have a very good grasp of the mathematical and statistical algorithms that actuarial science requires. A lot of companies are now expediting the process by hiring CFAs to do the work of an actuarial scientist.
This is a very specific position which requires data science professionals to apply mathematical and statistical models to BFSI (Banking, Financial Services and Insurance) and other associated professions. One must possess a globally defined skill set and demonstrate it by passing a series of professional examinations before applying for this job. A preliminary requirement is knowledge of a number of interrelated subjects such as probability, statistics, finance, economics, financial engineering and computer programming.
Unlike other positions, actuarial science has existed and evolved over the past few decades, and many universities around the world offer relevant courses at undergraduate and postgraduate levels. Job search website CareerCast ranked it as the No. 1 job in the United States in 2010, and its popularity has grown ever since.
Businesses make the final use of all the number crunching done by data science professionals. As a business analytics professional, it is important to have business acumen as well as to know your numbers. Business analysis is a science as well as an art, and one cannot afford to be driven entirely by either business acumen or insights obtained from data analysis. These professionals sit between the front-end decision-making teams and the back-end analysts.
They work on crucial decision-making tasks such as ROI analysis, ROI optimization, dashboard design, determination of performance metrics, high-level database design, etc.
Unlike traditional coders, this class of professionals has a knack for number crunching through programming. Needless to say, they are adept at logical thinking and, as a result, take to new programming languages as ducks take to water. A number of programming languages and tools, such as R, Python, Apache Hive, Pig and Hadoop, support data analytics and visualization.
Software programming analysts have the programming skills to automate routine big data tasks and reduce computing time. They are also required to handle databases and the associated ETL (Extract, Transform, Load) tools, which extract data, transform it by applying business logic and load it into a target store that feeds visual summaries such as charts, histograms and interactive dashboards.
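The extract, transform and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV content, the `orders` table and the function names are hypothetical, and an in-memory SQLite database stands in for whatever store a real pipeline would target.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """order_id,region,amount
1,north,120.50
2,south,80.00
3,north,not_available
4,south,45.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply business logic (drop bad amounts, normalise region)."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        cleaned.append((int(row["order_id"]), row["region"].upper(), amount))
    return cleaned


def load(records, connection):
    """Load: write the cleaned records into a database table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    connection.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)

# A summary query of the kind that could feed a chart or dashboard.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

Keeping the three stages as separate functions makes each one easy to test and to swap out, for example replacing the CSV reader with a database query without touching the transformation logic.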
The increasing use of GPS-based systems has given rise to a separate category of data scientists: the spatial engineers. Unlike ordinary big data analysis, which largely involves numbers, spatial data needs specialized handling. GPS coordinates need to be stored, mapped and processed differently from scalar numbers. They also need a separate database management system for storage.
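One way to see why coordinates cannot be treated like ordinary scalar numbers is distance. Subtracting two latitude/longitude pairs does not give a meaningful distance; a great-circle formula such as the haversine formula is needed instead. The sketch below is a minimal illustration that assumes a spherical Earth with a mean radius of 6371 km and uses approximate city coordinates; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
delhi = (28.61, 77.21)
mumbai = (19.08, 72.88)

distance = haversine_km(*delhi, *mumbai)
print(f"Delhi to Mumbai: roughly {distance:.0f} km")
```

Spatial database systems typically build this kind of geometry, along with spatial indexing, into the storage layer, which is part of the reason spatial data is kept apart from ordinary numeric columns.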
Google Maps, car navigation systems, Bing Maps and a number of other applications use spatial data for localization, navigation, site selection, situation assessment, etc. Government agencies use spatial data received from satellites to make important decisions related to weather conditions, irrigation, fertilizer usage, etc.
A data scientist needs to be able to define the data in accordance with the business problem, and for this they need to know the business end of the spectrum.
The quality analyst has long been associated with statistical process control in the manufacturing industry. This position is included here to emphasize the importance of data science in core industries. Assembly lines involved in mass production generate large data sets that must be analysed to maintain quality control and meet minimum performance standards. The job has evolved over the years with new analytic tools, which data scientists use to prepare interactive visualizations that serve as key inputs to decision making across teams such as management, business, marketing, sales and customer service.
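As a rough illustration of statistical process control, the sketch below derives control limits from a baseline run and flags measurements that fall outside them. The diameters and function names are hypothetical, and the limits are simply the mean plus or minus three sample standard deviations. This is a simplification: control charts used in practice often estimate the spread differently, for example from moving ranges or subgroup ranges.

```python
import statistics


def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) control limits from baseline measurements."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(measurements, lower, upper):
    """List (index, value) pairs that fall outside the control limits."""
    return [(i, x) for i, x in enumerate(measurements) if x < lower or x > upper]


# Hypothetical part diameters (mm) from a stable baseline run.
baseline = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
lower, center, upper = control_limits(baseline)

# A later batch, checked against the baseline limits.
new_batch = [10.0, 10.1, 10.9, 9.9]
flagged = out_of_control(new_batch, lower, upper)
print(f"Limits: {lower:.2f} to {upper:.2f} mm; flagged: {flagged}")
```

The flagged points are the ones a quality analyst would investigate first, and the same limits can be drawn on a chart to give other teams a quick visual check.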
The data scientist has emerged as an all-inclusive job role that encompasses data mining, data analysis, business analysis, predictive modelling and machine learning. Apart from these, storytelling and data visualization are also skills that a data scientist must have.