Data Engineer
You will design, build, and optimize scalable data pipelines that ingest, transform, and serve large volumes of web and multimedia data. You will develop and manage ETL/ELT workflows, integrate and tune cloud database infrastructure, automate infrastructure provisioning and data workflows, and monitor and troubleshoot pipelines to ensure data quality and availability.
Responsibilities
- Design scalable data pipelines for batch and real-time processing
- Develop and manage ETL/ELT workflows to transform raw data into structured formats
- Integrate and configure database infrastructure for performance and scalability
- Automate data workflows and infrastructure provisioning using infrastructure as code
- Collaborate with data scientists and analysts to ensure data accessibility
- Monitor, troubleshoot, and improve pipeline and infrastructure performance
- Manage cloud databases, storage, and compute resources efficiently
- Implement data governance, data security, and disaster recovery practices
Requirements
- Bachelor's degree in Computer Science, Information Systems, Data Engineering, or a related field
- Extensive experience with cloud data warehouse systems such as Redshift or Snowflake
- Advanced proficiency in SQL and query performance optimization
- Experience building and managing data pipelines with Airflow, AWS Glue, or similar tools
- Strong understanding of ETL processes and data integration best practices
- Experience with infrastructure automation tools such as Terraform or CloudFormation
- Proficiency in Python, Scala, or Java for pipeline orchestration and data manipulation
- Familiarity with containerization and orchestration using Docker and Kubernetes
- Strong analytical and problem-solving skills
Benefits
- Remote work
- Equity package
- Benefits package