Senior Data Engineer

Company Name: CyberCoders
Location: Alexandria, VA
Date Posted: 01st Jul, 2016

* Use streaming analytics and machine learning to enable our customers to transform the entire operations of their enterprises
* Tackle fun challenges like real-time ingestion of data (with any need of an API) and self-healing, exactly-once processing at thousands of transactions per node per second. We recently converted from Storm and MapReduce to Spark, both Streaming and batch
* Work in small, multidisciplinary teams of product managers, hardware engineers, data engineers, data scientists, application engineers, and DevOps professionals
* See your work in production in days or weeks—not months
* Get the opportunity and flexibility to explore a wide range of technologies and challenges (you can expand into everything from front-end tech to data engineering to firmware programming)

Required Qualifications:
* BS/BA in computer science, computer engineering, or related degree
* 5+ years of experience developing large-scale, distributed data platforms and data processing solutions for PaaS / DaaS platforms, Internet-scale companies, government agencies, etc.
* Experience working in an agile development environment
* Experience with Scala or Java
* Experience with any queueing technology: Kafka, AMQP, ZeroMQ, Celery, AWS SQS
* Hadoop, HDFS, Hive
* Data ingestion using distributed queuing technologies such as RabbitMQ, AMQP, Kafka, or ZeroMQ
* Streaming analytics and complex event processing using Storm, Spark, IBM InfoSphere Streams, or similar technologies
* Proven experience using MapReduce, Spark, or other ETL/ELT technologies to process TBs of data daily, across hundreds (or even thousands) of continuously running jobs
* Deep understanding of how to design high-performance data models for multiple NoSQL data stores (file stores, wide-column databases, key-value stores, etc.)
* Hands-on experience with Hive, Impala, Presto, and similar tools for SQL-like exploration of large-scale data sets

Ideal Qualifications:
* Cassandra, HBase, Accumulo, Bigtable, Riak
* Neo4j, Titan, Redis
* Enjoy designing interactive analytics solutions
* Are comfortable with the flexibility of highly agile environments
* Prefer using Continuous Integration and Deployment to reduce manual work
* Are interested in blurring the line between Software Engineering & Data Science
* Have experience in geospatial processing at scale

**Candidates with Spark experience are highly desired!**