Software Engineer II at Microsoft

Required Skills

C++, Python, Rust, distributed systems, high-performance computing, RDMA, networking, AI infrastructure

About the Role

Develop next-generation network transport protocols and RDMA-based communication libraries for high-performance AI supercomputing infrastructure. Focus on backend network design, distributed systems, and optimizing throughput and latency for large-scale AI training.

Key Responsibilities

  • Design, develop, and optimize networking solutions for large-scale AI training infrastructure
  • Benchmark, analyze, and enhance scalability and reliability of networking systems
  • Debug and resolve complex networking issues in high-performance environments
  • Create, implement, optimize, debug, refactor, and reuse code for performance and maintainability
  • Proactively seek new knowledge and adapt to new AI trends and technical solutions

Required Skills & Qualifications

Must Have:

  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including C, C++, Rust, or Python OR equivalent experience
  • 2+ years of experience in software design and development
  • 1+ years of experience with distributed systems
  • Ability to pass Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Nice to Have:

  • Bachelor's Degree in Computer Science or related technical field AND 5+ years technical engineering experience with coding in languages including C, C++, C#, Java, JavaScript, or Python; OR Master's Degree in Computer Science or related technical field AND 3+ years technical engineering experience with coding in languages including C, C++, C#, Java, JavaScript, or Python; OR equivalent experience
  • 2+ years of experience with high-performance computing or machine learning middleware and communication runtimes
  • 2+ years of experience with hardware-software co-design
  • 2+ years of experience with profiling and performance analysis tools
  • Problem-solving skills, analytical capabilities, and attention to detail
  • Familiarity with high performance networking hardware/architecture

Benefits & Perks

  • Industry-leading healthcare