Data Engineer
As a Data Engineer, you will be engaged from the first client conversation all the way through to delivery: gathering requirements, designing the solution, and seeing it through to completion. You will design, construct, install, test, and maintain highly scalable data management systems and robust data pipelines. Your work will ensure data quality, reliability, and accessibility for our AI/ML engineers and LLM applications, leveraging cloud platforms and modern data engineering practices, including workflow orchestration.

- Design, build, and optimize scalable ETL/ELT data pipelines using Python and cloud-native tools (on AWS, Azure, or GCP).
- Develop data models and schemas optimized for analytical and AI/ML workloads.
- Implement data quality checks and monitoring frameworks.
- Manage and administer data warehouses, data lakes, and databases (SQL/NoSQL).
- Implement and manage workflow orchestration tools (e.g., Airflow, Prefect, Dagster) for scheduling and monitoring data pipelines.
- Collaborate closely with AI/ML Engineers and LLM Engineers to understand their data requirements.
- Ensure data security and compliance standards are met.
- Optimize data storage and processing costs on hyperscaler platforms.
- Write efficient and maintainable Python code for data processing tasks.
- Work independently to troubleshoot and resolve data-related issues.

Requirements

- Strong proficiency in Python for data manipulation and pipeline development (e.g., Pandas, PySpark)
- Expertise in SQL and experience with relational and NoSQL databases
- Hands-on experience with cloud-based data services on at least one hyperscaler (e.g., AWS S3/Glue/Redshift, Azure Data Factory/Synapse, GCP Cloud Storage/Dataflow/BigQuery)
- Experience building and managing data pipelines and ETL/ELT processes
- Familiarity with data warehousing concepts and data modeling
- Understanding of data quality principles
- Ability to work independently and take ownership of data infrastructure components

- Share your philosophy on developing data infrastructure and the methodologies you use, and provide concrete examples of data systems you've built that have delivered tangible results.
- Tell us why you are interested in joining Suzega.