AWS Data Architect
Title – AWS Data Architect
Job Summary
We are seeking a skilled AWS Data Architect for one of our clients, a leading international sports league. The Architect will play a crucial role in designing, implementing, and maintaining data and technology solutions that align with the client’s business goals and objectives. This role requires a deep understanding of AWS data services, a good understanding of AWS infrastructure and operations services, and the ability to translate business requirements into scalable and efficient solutions.
Key Responsibilities
Data Architecture and Cloud Strategy:
- Develop and maintain a comprehensive data architecture and cloud strategy that aligns with the organization's goals and needs.
- Design, implement, and manage cloud-based data infrastructure on AWS, ensuring scalability, reliability, and cost-efficiency.
- Utilize AWS services (S3, Glue, EMR, Redshift, Lambda, Kinesis, MWAA, etc.) to build and optimize data pipelines and storage solutions.
- Champion the use of data lakehouse architecture and optimize its performance for analytical and operational workloads.
- Identify gaps and opportunities in the current system, and suggest and implement improvements to optimize processes and costs.
Data Engineering:
- Lead and guide data engineering teams to develop, maintain, and optimize ETL processes for data ingestion, transformation, and loading.
- Implement real-time data processing solutions using technologies such as Apache Kafka and AWS Kinesis.
- Collaborate with data scientists, business stakeholders and analysts to ensure data availability and quality, enabling effective analytics and reporting.
- Leverage DBT for data modelling and transformation to support self-service analytics and data governance.
Data Integration & Ingestion:
- Architect and implement data integration solutions for API ingestion, enabling data from diverse sources to be captured, transformed, and ingested into our data lakehouse.
- Utilize Airbyte and custom APIs to ensure efficient, reliable, and secure data transfers.
- Manage data integration pipelines to support real-time and batch data processing.
Workflow Orchestration:
- Design, configure, and maintain workflow orchestration using Apache Airflow to automate ETL processes and data pipeline executions.
- Monitor and optimize job scheduling, error handling, and performance of data workflows.
Security and Compliance:
- Implement data security protocols, access controls, and encryption to safeguard sensitive data, especially personally identifiable information (PII).
- Ensure compliance with data privacy regulations and industry standards.
Collaboration and Documentation:
- Collaborate with cross-functional teams to understand data requirements and provide data solutions to meet their needs.
- Maintain comprehensive documentation for data engineering and data architecture processes and solutions.
Infra & Operations:
- Guide the team in setting up cloud infrastructure and automating it with tools such as Terraform, CloudFormation, and Jenkins.
- Guide the operations team in setting up automated monitoring and alerting mechanisms.
Relevant Qualifications
- Bachelor's or higher degree in a relevant field.
- 6+ years of proven experience in data engineering, cloud architecture, and AWS services.
- Extensive knowledge of data lakehouse technologies, Hudi, DBT, Airbyte, Redshift, Glue, Kinesis and Apache Airflow.
- Strong expertise in programming languages such as SQL and Python, and in processing frameworks such as PySpark.
- Strong expertise in real-time data processing.
- Excellent problem-solving and analytical skills.
- Strong communication and teamwork abilities.
- A passion for sports, gaming, or entertainment is preferred.