Hadoop Spark Developer - Remote
- Interface with key stakeholders and apply your technical proficiency across the stages of the Spark code development life cycle, including requirements gathering, implementation of complex componential models, massively parallel processing, advanced data modeling, performance tuning and scalability.
- You will play an important role throughout the development process, from creating high-level design artifacts to the actual implementation.
2+ years of experience in Spark development including data frames, data transformations, performance tuning, memory sizing and tuning Spark Master and Executors.
2+ years of experience in Python and Spark/Python writing streaming and/or batch processing code (for example – coding ETL pipelines).
2+ years of hands-on experience working with Hadoop ecosystem components for data analysis (for example - Impala, Hive, Tez, Presto).
2+ years of hands-on Hadoop data modeling, including table partitioning models and a deep understanding of Hadoop file formats such as Parquet or SequenceFile.
Deep understanding of multi-threading and thread concurrency concepts.
Deep understanding of computer science concepts such as data structures, object-oriented programming and algorithms.
Good understanding of database data structures, data modeling and practices.
Proficient in Linux including scripting (bash, shell) and process scheduling.
Analytical and problem-solving skills in the Big Data domain.
Excellent problem-solving skills, team player.
Experience in the following areas and technologies is highly desirable:
Experience with Java and/or Scala
Education and Requirements
Bachelor’s degree in Computer Science, Engineering or a related field, or equivalent work experience