Sr. Data Engineer AWS – MEXICO

LATAM

About Distillery

Distillery Tech Inc accelerates innovation through an unyielding approach to nearshore software development. The world’s most innovative technology teams choose Distillery to help accelerate strategic innovation, fill pressing technology gaps, and hit mission-critical deadlines. We support essential applications, mobile apps, websites, and eCommerce platforms through the placement of senior, strategic technical leaders and by deploying fully managed technology teams that work intimately alongside our clients’ in-house development teams. At Distillery Tech Inc, we’re not here to reinvent nearshore software development; we’re on a mission to perfect it.

Distillery Tech Inc is committed to diversity and inclusion. We actively seek to cultivate a workforce that reflects the rich tapestry of perspectives, backgrounds, and experiences present in our society. Our recruitment efforts are dedicated to promoting equal opportunities for all candidates, regardless of race, etc.


About the Position

As a Data Engineer, you will play a critical role in rebuilding a large-scale AWS-based data lake and analytics platform. You will be responsible for re-implementing ingestion pipelines, EMR/Glue workflows, Redshift loading logic, and integrations with tools such as Databricks and SageMaker, all managed with Terraform. Your work will ensure accuracy, integrity, and consistency across complex data workflows in a new AWS account. This opportunity is exclusively for professionals located in Mexico.


Responsibilities:

● Rebuild and validate data ingestion pipelines using AWS services (Lambda, Kinesis Firehose, MSK, S3).

● Migrate and reconfigure processing jobs in Glue, EMR, and Amazon MWAA (Airflow).

● Recreate and validate table definitions in Glue Data Catalog for downstream Athena queries.

● Develop and manage ingestion from third-party APIs (e.g., Revature, eCommerce Affiliates) via Lambda or Airflow DAGs.

● Partner with ML engineers to re-establish SageMaker and Personalize workflows.

● Collaborate with the DevOps team to ensure Terraform-managed infrastructure supports data pipeline requirements.

● Conduct thorough data validation across the migration process, ensuring object counts, schema consistency, and accurate source-to-target delivery.

● Maintain clear documentation of data flows, transformations, and analytics logic.


Technical Expertise:

● 5+ years of experience in data engineering or analytics engineering roles.

● Advanced AWS expertise: Lambda, Kinesis (Data Streams & Firehose), MSK, S3, Glue, Athena, EMR, Redshift.

● Experience with Airflow (self-hosted or MWAA).

● Proficiency in Python for ETL development and Lambda scripting.

● Strong knowledge of Glue Data Catalog schema design and partitioning strategies.

● Familiarity with common data formats and storage best practices (JSON, Parquet, Avro).

● Experience integrating external APIs securely and managing authentication/secrets.


Preferred Skills:

● Exposure to Martech tools like Sailthru, Zephr, or Databricks.

● Knowledge of SageMaker Pipelines, Feature Store, and endpoint deployments.

● Understanding of cross-account data ingestion within AWS.

● Hands-on experience using Terraform to provision data infrastructure.

● Knowledge of Redshift Spectrum and federated querying strategies.


Why You'll Like Working Here

● Collaborate with multi-national teams committed to our core values: Unyielding Commitment, Relentless Pursuit, Courageous Ambition, and Authentic Connection.

● Enjoy a competitive compensation package, generous vacation, and comprehensive benefits.

● Work remotely in a flexible, supportive environment.

● Access professional and personal development opportunities to advance your career.