Data Engineer

Company Name: Blackwood Seven
Location: Greater Los Angeles Area
Date Posted: 21st Feb, 2017
  • Participate in the design and optimization of our Data Lake (Hadoop / Spark / Hive) and Elasticsearch data infrastructure
  • Optimize Airflow ETL jobs and Hive table schemas, while supporting our Data Science team
  • Investigate tools that may accelerate data discovery (e.g., Apache Zeppelin)
  • Collaborate on and document complex flows as easy-to-follow diagrams

  • Bachelor’s Degree (technical or science preferred)
  • 6+ years working with Linux
  • 3+ years working with Hadoop and Hive
  • 3+ years working in Python in a production environment
  • Experience working in Agile / Scrum
  • Experience with scalable architectures and large data processing
  • Experience in evaluating emerging technologies and their applicability
  • Experience in effectively communicating data flows, design choices, and risks
  • Experience in query tuning and optimization
  • Experience in production backend development with Git or a similar source control system
  • Experience with Amazon Web Services a plus
  • Experience with Apache Spark / PySpark a plus
  • Experience with Apache Airflow a plus
  • Experience with Elasticsearch a plus