AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
WHY JOIN US
If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!
ABOUT THE ROLE
As a Lead DevOps Engineer, you’ll own and evolve cloud infrastructure that powers AI-driven products, ensuring scalability, reliability, and security across distributed systems. You’ll lead modernization efforts in AWS, streamline CI/CD pipelines, and champion automation and observability best practices. This role combines hands-on technical leadership with strategic impact, offering the opportunity to shape infrastructure standards and drive innovation in a fast-paced, collaborative environment.
WHAT YOU WILL DO
- Lead the design, maintenance, and evolution of AWS-based infrastructure (ECS, RDS, Lambda, S3, CloudWatch, CDK);
- Upgrade AWS Aurora Postgres clusters to the latest supported versions, ensuring high availability and data integrity;
- Enhance and maintain GitHub Actions CI/CD pipelines for production deployments, including support for Lambda functions and microservices;
- Manage infrastructure as code using AWS CDK (TypeScript) and implement automation best practices;
- Consolidate and optimize utility scripts and shared tooling across multiple repositories;
- Collaborate with engineering teams to drive DevOps culture, automation, and observability.
MUST HAVES
- 5+ years of experience in DevOps or Site Reliability Engineering (SRE), including project ownership or leadership roles;
- Strong expertise with AWS services (ECS, RDS/Aurora, S3, Lambda, CloudWatch, CDK);
- Proficient with Docker, GitHub Actions, and CI/CD best practices;
- Deep knowledge of Postgres administration, including upgrades, backups, and performance tuning;
- Strong scripting and automation skills with TypeScript, Python, or Bash;
- Proven ability to architect scalable, secure, and reliable cloud environments;
- Excellent communication, collaboration, and problem-solving skills;
- Self-driven, detail-oriented, and passionate about delivering high-quality results;
- Upper-intermediate English level.
NICE TO HAVES
- Familiarity with AI/ML workflows or cloud-based AI services;
- Experience with Amazon Bedrock or similar generative AI platforms;
- Exposure to Cursor or other modern AI-enhanced developer tools;
- Understanding of security and scaling best practices for distributed environments;
- Experience with monitoring and observability tools (Datadog, Prometheus, CloudWatch, etc.).
PERKS AND BENEFITS
- Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
- A selection of exciting projects: Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands.
- Flextime: Tailor your schedule for an optimal work-life balance, with the option to work from home or from the office, whichever makes you happiest and most productive.