Senior Data Engineer (Python) – LATAM
About Distillery
Distillery Tech is committed to diversity and inclusion. We actively seek to cultivate a workforce that reflects the rich tapestry of perspectives, backgrounds, and experiences present in our society. In our recruitment efforts, we are dedicated to promoting equal opportunities for all candidates, regardless of race, ethnicity, gender, sexual orientation, disability, age, or any other dimension of diversity.
Distillery accelerates innovation through an unyielding approach to nearshore software development. The world's most innovative technology teams choose Distillery to help accelerate strategic innovation, fill a pressing technology gap, and hit mission-critical deadlines. We support essential applications, mobile apps, websites, and eCommerce platforms through the placement of senior, strategic technical leaders and by deploying fully managed technology teams that work intimately alongside our clients' in-house development teams. At Distillery, we're not here to reinvent nearshore software development; we're on a mission to perfect it.
About the Position
We are seeking a Data Engineer with expertise in Python, HDFS, and PostgreSQL to join our dynamic team. In this role, you will design, develop, test, and maintain large-scale data systems, including Data Lakes, Data Warehouses, and data ingestion/processing pipelines. You will work closely with cross-functional teams to ensure that data flows efficiently, while adhering to data governance standards and improving data quality.
Responsibilities
- Design, develop, implement, and maintain:
  - Data Pipelines for data ingestion.
  - ETL Processes for data transformation and processing.
  - Data Warehouse Structures for efficient data storage and retrieval.
  - System integration with large-scale processing systems.
- Ensure adherence to Data Governance policies, including the development of data dictionaries, data lineage, and data quality frameworks.
- Analyze data sources and data warehouse structures to identify opportunities for data acquisition, integration, and optimization.
- Handle raw data containing errors, ensuring data reliability and accuracy.
- Test and maintain deployed data services to ensure optimal performance.
- Recommend and implement improvements for data reliability, efficiency, and quality.
Requirements
- Strong proficiency in Python, HDFS, and PostgreSQL.
- Proven experience in designing, implementing, and maintaining data pipelines and ETL processes.
- Hands-on experience with data modeling (conceptual, logical, and physical models) and relational database schemas (e.g., star and snowflake schemas).
- Knowledge of handling structured, semi-structured, and unstructured data.
- Solid understanding of Computer Science principles: algorithms, data structures, time and space complexity, and SOLID principles.
- Experience with Object-Oriented Programming (inheritance, polymorphism, etc.).
- Familiarity with Data Governance and best practices for data quality and documentation.
- Nice to have: Experience with additional programming languages (Java, R, Scala, etc.) and cloud or on-premise ETL tools (MS SSIS, Azure Data Factory, etc.).
Why You’ll Like Working Here
- The opportunity to work on and partner with multi-national teams that embody our core values: Unyielding Commitment, Relentless Pursuit, Courageous Ambition, and Authentic Connection.
- A competitive compensation package that rewards exceptional performers, along with a strong benefits plan and a generous vacation package.
- Remote working environment.
- Professional and personal development opportunities within a fast-growing, innovative company.