Data Scientist # 5177652
* Participate in building smart data systems that ingest, model, and analyze massive flows of data from online, social, mobile, and offline commerce and user activity to set key business attributes for millions of products in real time. You will use cutting-edge machine learning, data mining, and optimization algorithms to analyze this data on top of Hadoop/HBase/Hive.
* You will have the opportunity to create applications that will impact over 1.5 billion customers in a truly global environment. Help invent the next generation of e-commerce: integrated experiences that leverage the store, the web, and mobile, with social identity as the glue.
* Apply strong expertise in Java, machine learning, data mining, and information retrieval to design, prototype, and build the next generation of the client's semantic-based engines and services.
* Build next-generation search and natural language interfaces that apply semantic technology to match user intent and interests with products.
The following skills are important:
- MapReduce - must
- Hive - must
- Python - must
- SQL - must
- Machine Learning - a strong plus
- Data sanitization
- Automation and transformation of data
* PhD in Computer Science, Statistics or related field; OR a Master’s degree or equivalent in Computer Science, Statistics or related field and 2 years of related experience.
* Knowledge of machine learning, information retrieval, data mining, statistics, NLP, or a related field.
* Programming skills in one of the following languages: Java, Scala, C/C++.
* Knowledge of a scripting language such as Python or Perl.
* Experience analyzing and interpreting the results of product experiments. Knowledge of statistical languages such as R.
* Experience working with large data sets and distributed computing tools (MapReduce, Hadoop, Hive, or Spark).
* Working knowledge of relational database systems and SQL.
* Experience managing an end-to-end machine learning pipeline, from data exploration and feature engineering through model building, performance evaluation, and online testing, on large data sets.
* Prior experience in this area with e-commerce or online retail would be a plus.
* Knowledge of Pig, UDFs, and Hadoop Streaming is required.
* US education preferred.
* Proven experience working with big data at a Fortune 1000 company, especially a technology company.
* Predictive modeling is a plus.