Senior SRE (FinOps & Performance oriented) – LATAM
About Distillery
Distillery accelerates innovation through an unyielding approach to nearshore software development. The world’s most innovative technology teams choose Distillery to help accelerate strategic innovation, fill a pressing technology gap, and hit mission-critical deadlines. We support essential applications, mobile apps, websites, and e-commerce platforms by placing senior, strategic technical leaders and deploying fully managed technology teams that work closely alongside our clients’ in-house development teams. At Distillery, we’re not here to reinvent nearshore software development; we’re on a mission to perfect it.
Distillery is committed to diversity and inclusion. We actively seek to cultivate a workforce that reflects the rich tapestry of perspectives, backgrounds, and experiences present in our society. Our recruitment efforts are dedicated to promoting equal opportunities for all candidates, regardless of race, ethnicity, gender, sexual orientation, disability, age, or any other dimension of diversity.
About the Position
We are looking for a systems-minded Site Reliability Engineer (SRE) to join an e-commerce DevOps engineering team. In this role, you will bridge the gap between application development (Java, TypeScript, Next.js) and cloud infrastructure (AWS, EKS, EC2).
As a curious, high-ownership engineer who thrives in multicultural, remote-first environments, you will ensure the platform is resilient under peak load, observable by default, and cost-optimized by design.
This is a hands-on position for someone who enjoys solving complex infrastructure challenges, partnering closely with developers, and building reliable systems at scale in production-critical environments.
Responsibilities
- Solve complex performance bottlenecks across the entire stack, from Linux kernel tuning to cloud-native AWS architectures
- Debug networking issues including TCP/IP, DNS, and distributed system behaviors
- Identify true root causes beyond surface-level fixes and drive long-term reliability improvements
- Partner with software engineers to design and execute load and stress testing strategies
- Take ownership of performance outcomes, supporting infrastructure or application refactoring efforts
- Ensure platform stability during traffic spikes and peak e-commerce events
- Maintain and scale cloud infrastructure using Terraform
- Contribute to automation and platform consistency across environments
- Treat cloud cost as a primary engineering metric alongside performance and uptime
- Lead cost optimization initiatives to ensure sustainable infrastructure spending
- Ensure infrastructure investments directly support reliability and scalability
- Drive best-in-class observability practices across services and infrastructure
- Build monitoring and alerting systems that enable proactive incident prevention
- Build paved roads and self-service reliability tools to empower development teams
- Collaborate with Engineering, Product, Project Management, and CX teams during production incidents
- Mentor peers and promote operational excellence across the organization
Requirements
- 4+ years of experience in SRE or DevOps roles with a strong software engineering background
- Proven ability to build, operate, and scale production systems in business-critical environments
- Deep experience managing large-scale Linux environments on AWS
- Strong debugging skills when abstraction layers fail
- Solid understanding of cloud fundamentals including networking, IAM, storage, compute, databases, and caching
- Hands-on experience with Kubernetes fundamentals, particularly AWS EKS
- High proficiency in Python, Go, and Bash scripting
- Strong bias toward automation, tooling, and operational efficiency
- Familiarity with access boundaries, secrets management, encryption, and cloud security best practices
- Analytical mindset with strong communication skills
- Ownership mentality and ability to thrive in fast-paced, production-focused environments
Nice To Have
- Experience with Java or TypeScript / Next.js
- Familiarity with Pulumi
- Experience with e-commerce platforms, billing, or fulfillment systems
- Exposure to Shopify or similar e-commerce ecosystems
Why You'll Like Working Here
Join a global team committed to Distillery's core values: Unyielding Commitment, Relentless Pursuit, Courageous Ambition, and Authentic Connection.
- 100% Remote Work: Enjoy the freedom to work from anywhere while collaborating with a diverse, multinational team.
- Competitive Compensation: Generous and competitive package in USD, along with a comprehensive benefits plan.
- Flexible Hours: Create a schedule that aligns with your life and priorities.
- Home Office Setup: Receive all the hardware and software needed to succeed from home.
- Innovative Workplace: Collaborate with the global Top 1% of talent in a multicultural and dynamic environment.
- Focus on Growth: Pursue professional and personal development while contributing your unique talents to a team where you can truly shine.
