AWS Data Engineer – LATAM
About Distillery
Distillery is a global technology consulting firm that partners with innovative companies to build high-quality software solutions. We specialize in assembling elite, distributed engineering teams that work closely with our clients to solve complex business challenges.
At Distillery, we value craftsmanship, ownership, and continuous learning. Our teams are empowered to make technical decisions, collaborate openly, and deliver real impact. We work with modern technologies, cloud-native architectures, and data-driven organizations across multiple industries.
About the Role
We are seeking an experienced Data Engineer to join our team and drive operational excellence across a large-scale, AWS-based data platform.
You will work with a mature, production-grade data ecosystem consisting of approximately 1,000 Airflow DAGs and AWS Glue jobs, migrated from Hadoop/MapReduce. Your primary focus will be on optimizing performance, reducing costs, improving reliability, and ensuring operational efficiency across the platform.
This role is ideal for someone who enjoys working close to production systems, improving existing pipelines, and partnering with stakeholders to enable data-driven decision-making at scale.
Key Responsibilities
Operational Excellence & Optimization
- Monitor, maintain, and optimize ~1,000 production Airflow DAGs and AWS Glue jobs
- Identify and resolve performance bottlenecks, reduce pipeline execution times, and optimize resource utilization
- Implement cost optimization strategies across AWS services (Redshift, Glue, S3, compute resources)
- Improve pipeline reliability through enhanced error handling, retries, and data validation
- Establish and improve SLAs, monitoring, alerting, and observability across data pipelines
- Reduce technical debt and standardize patterns across the DAG ecosystem
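To give candidates a concrete flavor of the "enhanced error handling, retries, and data validation" work above, here is a minimal, stdlib-only Python sketch of the kind of retry-with-backoff and fail-fast validation logic such pipelines rely on (function and parameter names are illustrative, not part of the actual codebase):

```python
import time

def run_with_retries(task, max_attempts=3, base_delay=1.0):
    """Run a pipeline task, retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # surface the error to alerting after the last attempt
            time.sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...

def validate_rows(rows):
    """Fail fast on empty extracts before loading anything downstream."""
    if not rows:
        raise ValueError("no rows extracted")
    return rows
```

In practice the same idea is usually expressed through orchestrator settings (e.g., Airflow task `retries` and failure callbacks) rather than hand-rolled loops, but the reasoning is the same: bounded retries for transient faults, loud failure for persistent ones.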
Data Infrastructure & Engineering
- Maintain and optimize scalable data architectures on AWS (S3, Redshift, Glue, EMR, Lambda)
- Continuously improve Redshift query performance, data models, and cluster efficiency
- Optimize data partitioning, compression, distribution strategies, and storage costs
- Manage infrastructure as code and implement automated deployment processes
- Ensure data security, compliance, and governance best practices
- Build self-service tooling and capabilities to improve team productivity
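One small, widely used pattern behind bullets like "optimize data partitioning" is Hive-style key layout on S3, which lets engines such as Glue and Redshift Spectrum prune partitions instead of scanning the whole dataset. A generic sketch (table and path names are illustrative):

```python
from datetime import date

def partition_prefix(table: str, day: date) -> str:
    """Build a Hive-style S3 key prefix (year=/month=/day=) so query
    engines can prune partitions by date instead of full scans."""
    return f"{table}/year={day.year}/month={day.month:02d}/day={day.day:02d}/"
```

Zero-padding the month and day keeps prefixes lexicographically sortable, which matters for both listing performance and predictable partition discovery.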
Collaboration & Analytics Support
- Partner with business stakeholders, analysts, and data scientists to understand data requirements
- Translate business needs into robust technical solutions and data models
- Support ad-hoc analysis and data exploration requests
- Document data pipelines, schemas, and processes for cross-functional teams
- Contribute to data governance initiatives and data catalog maintenance
Production Support & Reliability
- Proactively monitor pipeline health and address issues before they impact SLAs
- Troubleshoot and resolve production incidents efficiently
- Implement comprehensive logging, metrics, and alerting for operational visibility
- Drive continuous improvement to reduce failures and operational toil
- Establish and enforce CI/CD practices for safe, automated deployments
- Participate in the on-call rotation, where applicable, to help ensure data platform availability
- Conduct root cause analysis and implement long-term fixes for recurring issues
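As a flavor of the "proactive monitoring" described above, the sketch below tracks recent run outcomes over a sliding window and flags when the failure rate crosses an alerting threshold. This is a simplified, stdlib-only illustration (class name and thresholds are hypothetical); real deployments would typically delegate this to tools like CloudWatch or Datadog:

```python
from collections import deque

class PipelineHealth:
    """Track recent pipeline run outcomes and flag when the failure
    rate over a sliding window reaches an alerting threshold."""

    def __init__(self, window=20, alert_ratio=0.2):
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.alert_ratio = alert_ratio

    def record(self, success: bool) -> None:
        self.outcomes.append(success)

    def should_alert(self) -> bool:
        if not self.outcomes:
            return False
        failures = self.outcomes.count(False)
        return failures / len(self.outcomes) >= self.alert_ratio
```

A windowed ratio like this pages on sustained degradation rather than one-off transient failures, which is one common way to reduce alert fatigue and operational toil.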
Required Qualifications
Technical Skills
- Python: 3+ years of production experience in data engineering (pandas, boto3, SQL libraries)
- AWS: Strong hands-on experience with AWS data services, including:
- Amazon Redshift (query optimization, data modeling, administration)
- AWS Glue (ETL jobs, crawlers, Data Catalog)
- Apache Airflow / MWAA (DAG development, operators, sensors)
- S3, Lambda, Step Functions, EMR (experience or exposure)
- SQL: Advanced SQL skills with experience optimizing complex queries
- Cloud Infrastructure: Solid understanding of networking, IAM, and security concepts
- Version Control: Proficiency with Git and collaborative development workflows
Soft Skills
- Strong communication skills with the ability to explain technical concepts to non-technical stakeholders
- Collaborative mindset and experience working cross-functionally
- Problem-solving orientation with strong attention to detail and data quality
- Ability to manage multiple priorities in a fast-paced environment
Preferred Qualifications
- Experience with dbt (data build tool) for analytics engineering
- Familiarity with alternative orchestration tools (Prefect, Dagster, Step Functions)
- Exposure to streaming technologies (Kinesis, Kafka, Flink)
- Experience with DataOps/MLOps practices and CI/CD for data pipelines
- AWS certifications (Solutions Architect, Data Analytics, or similar)
- Knowledge of data warehousing concepts (Kimball, star schemas, SCDs)
- Experience with infrastructure as code (Terraform, CloudFormation)
- Familiarity with data observability tools (Monte Carlo, Datadog, Great Expectations)
Why You Should Work at Distillery
- Work on large-scale, real-world data platforms with meaningful technical challenges
- Collaborate with talented engineers in a culture that values quality and ownership
- Influence architectural decisions and improve systems already in production
- Grow your career through continuous learning and exposure to modern cloud technologies
- Flexible, remote-friendly environment with a strong emphasis on work-life balance
- Be part of a company that trusts its engineers and values long-term partnerships
