About the role

Senior Software Development Engineer - Gen AI Infrastructure and Solutions at Microsoft

Required Skills

Python, distributed systems, cloud infrastructure, DevOps, Docker, Kubernetes, LLM, AI/ML, Azure

About the Role

This Senior Software Development Engineer role focuses on building and optimizing Gen AI infrastructure and solutions. Responsibilities include developing large-scale training infrastructure, fine-tuning LLMs, and building scalable services for AI applications. The role requires expertise in distributed systems, cloud infrastructure, and DevOps practices.

Key Responsibilities

  • Collaborate with senior engineers and researchers to build and optimize training infrastructure and tools for LLMs, SLMs, multimodal, and code-specific models.
  • Design and implement new AI features for Azure Machine Translation and Language services.
  • Design, build, and improve services for high scalability and reliability.
  • Contribute to the deployment and monitoring of services in production environments.
  • Participate in efforts to deliver and improve engineering systems and practices that ensure service quality in complex cloud environments.

Required Skills & Qualifications

Must Have:

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 3+ years designing, developing, and shipping software.
  • 2+ years of experience with distributed systems and cloud-based infrastructure.
  • 1+ year of experience with DevOps practices (CI/CD, automated testing, deployment, etc.).

Nice to Have:

  • Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 1+ year of experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with production ML systems and concepts like model serving, caching, batching, and monitoring.

Benefits & Perks

  • Industry-leading healthcare