The successful candidate will be a self-starter who is comfortable with ambiguity, has strong attention to detail, and can work in a fast-paced environment. Candidates should have deep expertise and proven success in the design, creation, management, and business use of extremely large datasets, and should be expert at designing, implementing, and operating stable, scalable, low-cost solutions that flow data from production systems into data stores and on to end-user-facing applications. Successful candidates are passionate about working with huge data sets and eager to learn new solutions that answer business questions and drive change.
Responsibilities of this role include:
- Develop robust, scalable production data products based on prototype algorithms developed by the data science R&D team.
- Help design solutions to novel problems in software development, data engineering, and machine learning.
- Create and maintain optimal data pipeline architecture.
- Transform and convert data streams into the structures needed for algorithm input.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of structured and unstructured data sources using big data technologies.
- Collaborate with other teams to understand the customer and provide valuable input into functional design and usability.