About the role

Software Engineer II at Microsoft

Required Skills

Python, C#, Go, Azure, Kubernetes, Terraform, LLM/agents, SRE, CI/CD

About the Role

This is a Software Engineer II role in Microsoft's CoreAI organization, focused on building and operating AI agents for Azure's app platforms. Responsibilities include designing SRE agents, applying AI to improve reliability, and implementing SRE fundamentals at scale. The role requires experience with cloud services, automation, and LLM/agent systems.

Key Responsibilities

  • Design and implement SRE agents and platform services
  • Build LLM-powered detection, triage, and mitigation loops
  • Define SLIs/SLOs and error budgets for reliability
  • Implement feature flags, canaries, and staged rollouts
  • Participate in on-call, mitigate incidents, and drive post-incident reviews

Required Skills & Qualifications

Must Have:

  • Bachelor's Degree in Computer Science or related field AND 2+ years technical engineering experience with coding in C#, Go, or Python OR equivalent experience
  • 2+ years in Production/Platform Engineering for large-scale cloud services
  • Experience with Azure services (AKS, Container Apps, Functions, etc.), IaC (Terraform/Bicep), and CI/CD (GitHub Actions/Azure DevOps)
  • 6+ months experience building/operating LLM/agent systems (function calling, planning, retrieval, memory)

Nice to Have:

  • Experience with Azure OpenAI, Azure AI Search/vector stores, prompt/response optimization, cost & latency tuning
  • Experience with Linux, containers (Docker), orchestration (Kubernetes), and SRE fundamentals (SLI/SLO, incident management)
  • Keeping up with the latest AI research and blogs
  • Experience with observability tools (Azure Monitor, OpenTelemetry, Prometheus/Grafana)

Benefits & Perks

  • Industry-leading healthcare