Senior Software Engineer
CAST AI
Why Cast AI?
Cast AI is the leading Application Performance Automation (APA) platform, enabling customers to cut cloud costs, improve performance, and boost productivity – automatically.
Built originally for Kubernetes, Cast AI goes beyond cost and observability by delivering real-time, autonomous optimization across any cloud environment. The platform continuously analyzes workloads, rightsizes resources, and rebalances clusters without manual intervention, ensuring applications run faster, more reliably, and more efficiently.
Headquartered in Miami, Florida, Cast AI has employees in more than 32 countries worldwide and supports some of the world’s most innovative teams running their applications on all major cloud, hybrid, and on-premises environments. Over 2,100 companies already rely on Cast - from BMW and Akamai to Hugging Face and NielsenIQ.
What’s next? Backed by our $108M Series C, we’re doubling down on making APA the new standard for DevOps and MLOps, and everything in between.
We are hiring across multiple teams!
As a Senior Software Engineer, you will have the opportunity to work on different key features of our product. We are currently hiring Senior Software Engineers for the following teams:
Reporting - Builds a scalable reporting system that ingests millions of rows per second into our time-series databases, providing insights into cost savings, workload efficiencies, and Cast AI automation impact.
Pricing - Drives the synchronization of public and customer cloud resources, availability, and dynamic pricing across all major cloud providers. Empowers autoscaling by leveraging discounts, commitments, and cross-cluster tracking to maximize savings. Provides a reliable source of truth for node pricing, resources, components, discounts, and commitments.
Autoscaler - Automates Kubernetes node autoscaling to optimize clusters, balance workloads, remove underutilized nodes, and dynamically allocate capacity in real-time, thereby reducing cluster costs by half.
Workload Optimization (WOOP) - Automates workload resource management by dynamically adjusting resource allocations, helping developers significantly reduce costs and improve application reliability.
AI Enabler - Helps customers deploy and manage LLMs in their Kubernetes clusters and optimizes their workloads by providing cost visibility and intelligent routing of LLM requests to the most cost-effective compute resources (e.g. Grok, self-hosted LLAMA models).
APA - An intelligent agentic system that not only detects application performance issues but proactively resolves them. By deeply integrating with observability stacks and leveraging Kubernetes, APA automates optimization, scaling, security, and recovery, enabling applications to run faster, cheaper, and more reliably.
Sec Posture - Builds a Kubernetes security product that helps our customers secure their clusters, surfacing the threats with the biggest potential security impact by ingesting millions of data points from vulnerability advisories, image scans and configurations from customers’ environments, application behavior at runtime, and more.
Wire - The De-facto Team is a core platform team that builds and maintains essential services - such as authorization, notifications, audit logs, and feature flags - that enable customers to securely and effectively use the Cast AI platform after purchase. Our work focuses on delivering enterprise-grade capabilities like SSO, granular permissions, and billing, empowering both customers and internal teams to operate at scale.
Here are some of the tools we use daily:
- Languages: GoLang (primary), Python (secondary for some cases)
- Cloud & Orchestration: Kubernetes, AWS, GCP, Azure
- Databases & Storage: PostgreSQL, Cloud Object Storage
- Messaging & APIs: GCP Pub/Sub, gRPC for internal communication, REST for public APIs
- Observability: Prometheus, Grafana, Loki, Tempo
- CI/CD & GitOps: GitLab CI with ArgoCD
Requirements:
- Strong software engineering skills with experience in distributed systems and backend development (ideally GoLang, but not a hard requirement as long as you’re willing to transition to it)
- Strong debugging, optimization, and performance-tuning skills
- Deep understanding of cloud platforms: hands-on experience with providers such as AWS, Google Cloud Platform (GCP), and Microsoft Azure, and with Kubernetes for container orchestration
- Experience with CI/CD and DevOps practices
- Strong English skills, both verbal and written
- Ability to work independently and collaboratively within a team
- Startup mindset: adaptable, proactive, and comfortable with ambiguity
- A proactive, problem-solving mindset with a "yes we can" attitude.
What’s in it for you?
- Competitive salary (€6,500 - €9,000 gross, depending on the level of experience)
- Enjoy a flexible, remote-first global environment.
- Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
- Equity options.
- Private health insurance.
- Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
- Spend 10% of your work time on personal projects or self-improvement.
- Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
- Annual hackathon to spark new ideas and strengthen team bonds.
- Team-building budget and company events to connect with your colleagues.
- Equipment budget to ensure you have everything you need.
- Extra days off to help maintain a healthy work-life balance.