Hadoop Developer

Company Name: Capgemini
Location: New York, NY
Date Posted: 27th Oct, 2016
Responsibilities:
  • Design and build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time.
  • Collaborate with other teams to design and develop data tools that support both operations and product use cases.
  • Source huge volumes of data from diverse data platforms into the Hadoop platform.
  • Perform offline analysis of large data sets using components from the Hadoop ecosystem.
  • Evaluate big data technologies and prototype solutions to improve our data processing architecture.
Requirements:
  • Knowledge of the Private Banking & Wealth Management domain is an added advantage
  • 10+ years of hands-on programming experience with 3+ years in Hadoop platform
  • Experience designing and architecting Hadoop-based platforms for building data lakes
  • Knowledge of the various components of the Hadoop ecosystem and experience applying them to practical problems
  • Proficiency in Java and at least one scripting language such as Python or Scala
  • A flair for data, schemas, and data modeling, and for bringing efficiency to the big-data life cycle
  • Experience building ETL frameworks in Hadoop using Pig, Hive, or MapReduce
  • Experience creating custom UDFs and custom input/output formats and SerDes
  • Ability to acquire, compute, store, and provision various types of datasets in the Hadoop platform
  • Understanding of various visualization platforms (Tableau, QlikView, and others)
  • Experience with data warehousing, ETL tools, and MPP database systems
  • Strong object-oriented design and analysis skills
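The ETL and MapReduce bullets above refer to Hadoop's map/shuffle/reduce processing model. As a rough illustration of what that model looks like, here is a minimal word-count sketch in plain Python, with no Hadoop dependencies; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative and not part of any Hadoop API:

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for each word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group emitted values by key, as Hadoop does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    # Run map over every line, shuffle the pairs, then reduce each key group.
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(word_count(["big data big pipelines", "data lakes"]))
# → {'big': 2, 'data': 2, 'pipelines': 1, 'lakes': 1}
```

In real Hadoop jobs the same three phases are expressed as `Mapper` and `Reducer` classes (Java) or as Pig/Hive scripts that compile down to MapReduce, with the framework handling the shuffle across the cluster.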