Software Engineer - AI/ML, AWS Neuron Distributed Training

Do you love decomposing problems to develop products that impact millions of people around the world? Would you enjoy identifying, defining, and building software solutions that revolutionize how businesses operate?

The Annapurna Labs team at Amazon Web Services (AWS) is looking for a Software Development Engineer II to build, deliver, and maintain complex products that delight our customers and raise our performance bar. You’ll design fault-tolerant systems that run at massive scale as we continue to innovate best-in-class services and applications in the AWS Cloud.

Annapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, then think of Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and scalable NVMe storage are some of the products we have delivered over the last few years.

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This role is responsible for development, enablement and performance tuning of a wide variety of ML model families, including massive-scale large language models like GPT2, GPT3 and beyond, as well as Stable Diffusion, Vision Transformers and many more. The ML Distributed Training team works side by side with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large models using Python is a must.
FSDP, DeepSpeed and other distributed training libraries are central to this, and extending all of this for the Neuron-based system is key.

Key job responsibilities
This role will help lead the efforts building distributed training support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks. This role will help tune these models to ensure the highest performance and maximize their efficiency when running on the customer AWS Trainium and Inferentia silicon and the Trn1 and Inf1 servers. Strong software development and ML knowledge are both critical to this role.

About the team

Inclusive Team Culture
Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust.

Work/Life Balance
Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.

Mentorship & Career Growth
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship.
We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.

BASIC QUALIFICATIONS
- 3+ years of non-internship professional software development experience
- 3+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- Deep Learning industry experience ...

Sr. Commodity Manager, Strategic Silicon

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Amazon Web Services (AWS) provides a highly reliable, scalable, and low-cost cloud platform that powers thousands of businesses in over 190 countries. AWS’ Infrastructure Supply Chain & Procurement (ISCaP) organization works to deliver cutting-edge solutions to source, build and maintain our socially responsible data center supply chains. We are a team of highly motivated, engaged, and responsive professionals who enable the core sustainable infrastructure of AWS. Come join our team and be a part of history as we deliver results for the largest cloud services company on Earth!

Amazon Web Services (AWS) is seeking a highly effective candidate to identify, create, develop and integrate innovative technology to deliver the best operating, lowest cost infrastructure in the world. This individual will lead the sourcing of strategic silicon used in our servers to better meet our customers' rapidly growing infrastructure needs.
Successful candidates will bring strong knowledge of semiconductor supply chains and a proven ability to manage complex negotiations, program-manage cross-functional teams, and effectively operate at any level of the organization (up to senior executives). AWS serves over a million active customers in more than 190 countries. We are steadily expanding global infrastructure to help our customers achieve lower latency and higher throughput. As our customers grow their businesses, AWS will continue to provide infrastructure that meets their global requirements.

Professional traits that are necessary for Amazon leaders:
- Exhibits excellent judgment
- Has high standards (is never satisfied with the status quo)
- Is able to dive deep and is never out of touch with the details of the business
- Expects and requires innovation of her/his team
- Has passion and convictions and the innate ability to inspire passion in others
- Strong results orientation
- Thinks big

Primary Responsibilities:
- Develop and implement silicon sourcing strategies for AWS hardware infrastructure.
- Understand industry trends for key technologies such as PCIe, microcontrollers, retimers, power modules, etc.
- Identify key technology trends in the industry, drive supplier product roadmaps, and execute on strategic initiatives in those products.
- Negotiate and implement complex supplier agreements and contracts by working with cross-functional stakeholders.
- Develop negotiation strategies to deliver against business objectives and achieve sustainable relationships with suppliers.
- Provide weekly core-team updates, deliver executive-level updates on a monthly basis, and handle vendor escalations to resolve any issues related to his/her category.
- Present to senior leadership via written narratives.

About the team

Diverse Experiences
AWS values diverse experiences.
Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating, which is why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer.
That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS
- 5+ years of developing, negotiating and executing business agreements experience
- 5+ years of professional or military experience
- Bachelor's degree
- Experience developing strategies that influence leadership decisions at the organizational level
- Experience managing programs across cross-functional teams, building processes and coordinating release schedules ...

Sr. Hardware Dev Engineer (AWS Generative AI & ML Servers), AWS Generative AI & ML Servers

Do you want to build the backbone of the Generative AI cloud at AWS? Do you want to build the future of the cloud for AI training and inference? Do you want to do industry-leading work delivering continuous price-performance improvements in the cloud for AI model training for multi-billion-parameter LLMs? Come join us in designing, delivering and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads.

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers.
And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Key job responsibilities
As a member of the Hardware Engineering Services team in this specific function, you will own and lead the design, development and root-cause analysis of a new segment of accelerated servers.

You will work closely with our customers to understand their technical needs and business goals, leveraging your experience with server design and the knowledge of various teams to architect the solutions that we will deploy at scale.

To deliver your products you will work with an interdisciplinary team of component, firmware, test, qualification, and integration engineers, and lead our design and manufacturing partners to bring these servers to the data center. After launch you will oversee the fleet of servers you develop, monitoring their quality and how well they are meeting customer requirements.

A day in the life
Your day-to-day responsibilities will include interfacing with our internal and external customers to understand project requirements and facilitate system development on top of your server design. You will be responsible for learning about operational challenges in our existing fleet, with the goal of improving the current customer experience as well as developing improved systems for future designs. You will work directly with vendors and ODM/JDM design teams to develop and manufacture your product at scale.

About the team
The team comprises Hardware Design Engineers, System Design Engineers, Software Development Engineers and Technical Program Managers, all with the common goal of delivering the best Accelerated Server fleet possible to our customers.

*Why AWS*
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
We pioneered cloud computing and never stopped innovating, which is why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

*Diverse Experiences*
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

*Work/Life Balance*
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

*Inclusive Team Culture*
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

*Mentorship and Career Growth*
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS
- Experience in developing functional specifications, design verification plans and functional test procedures ...

Sr. Hardware Development Engineer, AWS Board Core Design and Services Team

Do you enjoy solving complex problems and driving influential changes? Are you curious about the systems used to run the largest cloud computing infrastructures in the world? Do you thrive in a fast-paced and ever-changing environment? If you answered YES to these questions, then our team is looking for you! The AWS Board Core Design & Services team drives system innovation in the servers used by all of Amazon Web Services, including EC2, S3 and CloudFront. Our engineers solve the hardest problems that fuse software, hardware, and the cloud. We take big bets on new concepts, enabling AWS services to continue to revolutionize the industry.

What you will do:
As a member of the AWS Board Core Design & Services team you will own next-generation server components. You will have demonstrated results in the architecture and development of components used in servers, in areas such as motherboards, graphics, accelerators, and programmable hardware. You will interact with an interdisciplinary team of engineers to design, develop, validate, and launch at large scale. You’ll provide leadership in the application of new technologies to servers in a continuous effort to deliver and improve a world-class customer experience. This is a fast-paced, intellectually challenging position. You will work with thought leaders in multiple technology areas. You have high standards for yourself and everyone you work with. You will be constantly looking for ways to improve performance, quality, and cost. At AWS we are changing an industry and want individuals who are ready for a challenge to reach beyond what is possible today.

Why it matters:
Public cloud IT services represent the majority of growth in the overall IT services market and will continue to do so for several years to come. The scale of AWS creates a unique opportunity for differentiated hardware that will directly benefit customers.

Why you will love it:
You will build next-generation hardware that powers the cloud.
You will deliver improvements for our customers and have a direct impact on our bottom line. You will be part of a growing, fast-paced, and fun team. You will have ownership for the implementation of your work.

A day in the life
Hardware engineers within the AWS Board Core Design & Services team work in a variety of areas. A few examples are: hardware design and development; building the systems that validate hardware quality in manufacturing; and monitoring and improving hardware reliability in data centers and platform. We cover everything from low-level hardware to embedded software and the systems that operate and monitor it. There is no blueprint for how to do what we do, which encourages our engineers to identify and develop simple solutions to complex problems. We encourage durable solutions that look around corners while taking into consideration our customer needs from a cost, performance, and reliability perspective.

About the team
Within the AWS Board Core Design & Services team, our organization is responsible for system innovation in the servers used by all of Amazon Web Services. Beyond product delivery, we actively manage the ever-growing fleet of servers in the data center. This means tracking key business and operational metrics to ensure that we operate smoothly and minimize or eliminate customer impact due to device-related issues for a transparent AWS customer experience.

BASIC QUALIFICATIONS
- BS Electrical/Computer Engineering (or equivalent experience)
- 7+ years of experience in embedded ARM, FPGA, and CPLD applications
- 7+ years of experience with flash, I2C, SPI, UART, and Ethernet interfaces
- 7+ years of experience in collaborating with the ECAD/physical board designer and reviewing PCB designs with Allegro, CAM350, or equivalent
- 7+ years of experience in developing functional specifications, design verification plans and functional test procedures
- 7+ years of experience in board root cause analysis and resolution ...

Sr. Machine Learning - Compiler Engineer III, AWS Neuron, Annapurna Labs

AWS Machine Learning accelerators are at the forefront of AWS innovation and one of several AWS tools used for building Generative AI on AWS. The Inferentia chip delivers best-in-class ML inference performance at the lowest cost in the cloud. Trainium delivers best-in-class ML training performance with the most teraflops (TFLOPS) of compute power for ML in the cloud. This is all enabled by a cutting-edge software stack, the AWS Neuron Software Development Kit (SDK), which includes an ML compiler and runtime and natively integrates into popular ML frameworks, such as PyTorch, TensorFlow and JAX. AWS Neuron is used at scale by customers like Snap, Autodesk, Amazon Alexa, Amazon Rekognition and more customers in various other segments.

The Amazon Annapurna Labs team is responsible for silicon development at AWS. The team covers multiple disciplines including silicon engineering, hardware design and verification, software and operations.

The Neuron Compiler team is developing a deep learning compiler stack that takes neural network descriptions created in frameworks such as TensorFlow, PyTorch, and JAX, and converts them into code suitable for execution. The team is comprised of some of the brightest minds in the engineering, research, and product communities, focused on the ambitious goal of creating a toolchain that will provide a quantum leap in performance.

As a Machine Learning Compiler Engineer II in the AWS Neuron Compiler team, you will be supporting the ground-up development and scaling of a compiler to handle the world's largest ML workloads. Architecting and implementing business-critical features, publishing cutting-edge research, and contributing to a brilliant team of experienced engineers excites and challenges you. You will leverage your technical communication skills as a hands-on partner to AWS ML services teams and you will be involved in pre-silicon design, bringing new products/features to market, and many other exciting projects.
A background in compiler development is strongly preferred. A background in Machine Learning and AI accelerators is preferred, but not required.

In order to be considered for this role, candidates must be currently located in or willing to relocate to Cupertino (preferred), Seattle, or Austin.

BASIC QUALIFICATIONS
- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- 2+ years of experience in developing compiler features and optimizations
- Proficiency with 1 or more of the following programming languages: C++ (preferred), C, Python ...

Sr. Physical Design Methodology Engineer, Annapurna Labs

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.

Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.

Amazon Web Services provides a highly reliable, scalable, low-cost infrastructure platform in the cloud that powers hundreds of thousands of businesses in 190 countries around the world. We have data center locations in the U.S., Europe, Singapore, and Japan, and customers across all industries.

Custom SoCs (System on Chip) live at the heart of AWS Machine Learning servers. As a member of the Cloud-Scale Machine Learning Acceleration team you’ll be responsible for the design and optimization of hardware in our data centers, including AWS Inferentia and Trainium systems (our custom-designed machine learning inference and training datacenter servers). Our success depends on our world-class server infrastructure; we’re handling massive scale and rapid integration of emergent technologies.
We’re looking for an ASIC Physical Design Methodology Engineer to help us trail-blaze new technologies and architectures, while ensuring high design quality and making the right trade-offs.

Key job responsibilities
- Define, develop and deploy innovative physical design methodologies (RTL2GDS) and CAD flows for ML Accelerator chips in advanced nodes
- Drive improvement in RTL2GDS flows/methodology for PPA and TAT improvements
- Create dashboards and central reports for project tracking and visualizing QoR/stats
- Fine-tune cloud infrastructure to improve turnaround times for physical design work
- Interface directly with RTL, Physical Design, Package Design, DFT and other teams to improve methodologies and efficiencies and drive efforts to resolution
- Work with EDA tool vendors to evaluate new methods, solve bugs, improve usability, etc.
- Drive setting up RTL2GDS flows for new nodes, run regressions and quality assurance checks

About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.

Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform.
We pioneered cloud computing and never stopped innovating, which is why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS
- BS + 10yrs or MS + 7yrs in EE/CS
- 5+ years of experience in developing physical design methodology or CAD flows in synthesis, PNR, and sign-off areas for advanced technology nodes.
- Proficient in programming/scripting languages (Perl, Python, C++)
- Solid understanding of ASIC physical design, and methodologies including synthesis, place and route, STA, IR, formal and physical verification.
- Demonstrated level of expertise in PD tools such as Innovus, ICC2, Fusion Compiler, STA, and Sign-Off.
- Proven track record of delivering metric-driven PPA flow development and support. ...

Sr. SDM, ML Acceleration, Neuron Inference Apps

Utility Computing (UC)
AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.

AWS Neuron is the complete software stack for the Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1/Inf2 servers that use them. As the Sr. SDM for the Neuron Inference Customer Enablement Team, you will be responsible for leading a strong team of managers and engineers to help optimize customer or open-source models for inference performance (latency, throughput, scale) on various frameworks such as PyTorch, JAX, and TensorFlow. You will be responsible for the full development life cycle of inference performance improvement and reliability/scalability features in our internal Neuronx_Distributed and Transformers_Neuronx Inference Libraries, as well as contributing to other popular open inference libraries. You will strive to enable our customers to adopt Trainium and Inferentia devices and make them first-class citizens for ML acceleration workloads, including both text and multimodal models. You will lead the way to ensure support for key ML functionality in a combined chip/software platform, and ensure the right thing is being built and delivered to customers.

A successful candidate will have an established background in delivering on ML roadmaps for demanding, fast-changing customers while balancing these against the internal product roadmap.
The candidate will have delivered high-performance models using distributed inference libraries and frameworks. The ideal candidate should have a strong technical ability to work and deliver on a vertically integrated system stack that consists of a combinatorial matrix of hardware, frameworks, and workflows. Deep expertise in framework integrations and development using C++ is a must, along with direct customer-facing experience and a strong motivation to achieve results.

A day in the life
You will work with the executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers and chips to accelerate at the highest scale.

About the team

About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating, which is why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

BASIC QUALIFICATIONS
- 10+ years of engineering experience
- 5+ years of engineering team management experience
- 10+ years of planning, designing, developing and delivering consumer software experience
- Experience partnering with product or program management teams
- Experience managing multiple concurrent programs, projects and development teams in an Agile environment ...

Sr. Software Development Engineer - BMC, AWS Hardware Engineering Services

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.The AWS Firmware team drives system innovation in the servers used by all of Amazon Web Services, including EC2, S3, CloudFront, etc. Our engineers solve the hardest problems that fuse software, hardware, and the cloud. We take big bets on new concepts, enabling AWS services to continue to revolutionize the industry.We are looking for a seasoned Senior Software Development engineer to build and own the server related firmware. As a senior engineer in this team, you will work with a team of world-class software developers who thrive on creating innovative, scalable solutions for real-world data center infrastructure problems. 
You will be part of development efforts to build, validate, and support firmware in diverse technology domains from embedded software to large-scale distributed software systems, using proprietary and open source technologies.Why it mattersPublic cloud IT services represent the majority of growth in the overall IT services market and will continue to do so for several years to come. The scale of AWS, combined with an understanding of how our software and hardware is used, creates a unique opportunity for component customizations that will directly benefit our customers.Why you will love itYou will work with engineers across the company to build software for the next-generation platform. You will have a direct impact on our bottom line and the ability to deliver improvements for our developers. You will be part of a growing, fast-paced, and fun team. You will have ownership for the implementation of your work. You will see direct product improvements based on the results of your work.A day in the lifeAs an experienced Senior Software Development Engineer, you will build and own the server-related firmware. As a Senior Engineer in this team, you will work with a team of world-class software developers who thrive on creating innovative, scalable solutions for real-world data center infrastructure problems. You will be part of development efforts to build, validate, and support firmware in diverse technology domains from embedded software to large-scale distributed software systems, using proprietary and open source technologies.About the team*Why AWS*Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.*Diverse Experiences*Amazon values diverse experiences. 
Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.*Work/Life Balance*We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.*Inclusive Team Culture*Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.*Mentorship and Career Growth*We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- 5+ years of non-internship professional software development experience- 5+ years of programming with at least one software programming language experience- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience- Experience as a mentor, tech lead or leading an engineering team- Experience developing embedded systems- Experience with software development for server platforms- Working knowledge of scripting languages like Python, Shell, or other similar scripting languages- Understanding of Intel architecture- Understanding of server platform design and architecture- Experience with IPMI and BMC development ...

Sr. Software Development Engineer, Annapurna Labs

AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.AWS Cloud Storage offers a complete range of hardware and software for customers to store, access, govern, and analyze their data, reducing costs, increasing agility, and accelerating innovation.The AWS Cloud Storage team is hiring firmware engineers with a background in NVMe memory devices to solve our customers’ toughest problems.As a firmware engineer on the AWS Cloud Storage team, you will be a thought leader at the forefront of consumer storage and networking solutions. 
You should feel equally comfortable in server and embedded environments, possess a deep understanding of computer architecture, Linux OS, and programming sophisticated embedded devices.Every day you will be working alongside brilliant engineers and leaders who obsess about performance, availability, scalability and durability of customer data, with the ambitious goal of improving AWS' industry-leading product.Key job responsibilities- Research, design, implement Firmware to support NVMe subsystem, DMA and Crypto through specialized HW units in Nitro Cards.- Debug complex, system-level, multi-component issues across multiple layers from kernel to application- Profile system performance activity and drive optimizations across our software stack- Deliver production-quality code and support its operation in the production environmentAbout the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop your engineering expertise so you feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.About AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. 
We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Work/Life BalanceWe value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- Experience as a mentor, tech lead or leading an engineering team- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience- 5+ years of experience with programming language: C or C++- 5+ years of experience in embedded Linux systems or NVMe Subsystem ...

Sr. Software Development Engineer, Nitro SSD

The AWS Hardware Engineering team is driving rapid innovation in the server and storage infrastructure used by Amazon Web Services. Our designs are industry-leading in frugality and operational excellence, and are critical to the success of the AWS business and the more than one million customers who use AWS today. Our Firmware Engineers solve challenging technology problems, and build architecturally sound, high-quality components to enable AWS to realize critical business strategies. The ideal candidate for this role will be an innovative self-starter. You will be an SSD firmware expert with experience in making architectural tradeoffs to optimize SSD performance for a variety of use cases. You will work with engineers across the company as well as external companies and lead firmware development efforts on custom solid-state devices. You will collaborate with internal and external development engineers (architecture, hardware, validation, software services). AWS Engineers are shaping the way people use computers and designing the future of cloud computing technology – come help us make history!What you will do: You will be a member of a team designing AWS-specific hardware, firmware and software for non-volatile memory devices, including NAND-based SSDs. You will be a part of the firmware effort from conception, through validation and into production. You will contribute to FW development and support device characterization and benchmarking efforts. You will work closely with AWS software engineers to tailor devices for the AWS environment.Why it matters: Public cloud IT services represent the majority of growth in the overall IT services market and will continue to do so for several years to come. 
The scale of AWS, combined with an understanding of how our hardware is used, creates a unique opportunity for component customizations that will directly benefit our customers.Why you will love it: You will work with engineers across the company to build next-generation devices. You will have a direct impact on our bottom line and the ability to deliver improvements for our developers. You will be part of a growing, fast paced, and fun team. You will have ownership for the implementation of your work. You will see direct product improvements based on the results of your work.Key job responsibilitiesAWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.About the team*Why AWS*Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.*Diverse Experiences*Amazon values diverse experiences. 
Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.*Work/Life Balance*We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.*Inclusive Team Culture*Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.*Mentorship and Career Growth*We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.BASIC QUALIFICATIONS- 5+ years of non-internship professional software development experience- 5+ years of programming with at least one software programming language experience- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience- Experience as a mentor, tech lead or leading an engineering team ...

Sr. Software Development Manager, AWS Neuron ML Frameworks

Utility Computing (UC)AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.Join the team that builds AWS Neuron, the software stack that runs all the leading AI models on the AWS Inferentia and Trainium cloud-scale machine learning accelerators.As the Sr. Software Development Manager of ML Frameworks & Ecosystems you will lead the team that develops and extends Neuron support for leading ML frameworks including PyTorch and JAX. You will develop and deliver the framework plugins and libraries that enable a great user experience for developing and optimizing models on Trainium and Inferentia accelerators, and work closely with the open source ecosystem to drive improvements that enable models to port seamlessly across accelerators.You will work closely with the Neuron compiler, training and inference optimization teams, ML model developers and users to deliver the best performance on top AI models.You should have an established background in AI Frameworks and Machine Learning infrastructure such as PyTorch, PyTorch/XLA, and JAX. Experience with OpenXLA is a significant plus. You should have demonstrated ability to work with open source communities to influence future community direction, a strong technical understanding and a motivation to achieve results. 
Key job responsibilitiesResponsible for the full life cycle of developing and releasing JAX and Pytorch framework support for AWS Neuron.Understand current and future directions of ML framework development, with a focus on enabling and optimizing the latest features of ML frameworks.Work closely with the PyTorch and JAX community to actively drive the future directions to improve the experience of developing and optimizing ML models across multiple platforms.Develop and grow your team to meet the ever-expanding needs of the AI software ecosystem.A day in the lifeYou will work with the executive leadership and other senior management and technical leaders to define strategic directions and deliver new capabilities to ML model developers and users. You will work closely with your team to enhance our current framework support for the latest ML models and for top customers and to grow and advance your team's capabilities. You will solve challenges facing current users to enable the best performance on the latest accelerators. You will collaborate with the PyTorch and JAX community across the AI industry to drive ML framework technology forward.About the teamAbout AWSAmazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Work/Life BalanceWe value work-life harmony. 
Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team CultureHere at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.Mentorship & Career GrowthWe’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. BASIC QUALIFICATIONS- 10+ years of engineering experience- 5+ years of engineering team management experience- 10+ years of planning, designing, developing and delivering consumer software experience- Experience partnering with product or program management teams- Experience managing multiple concurrent programs, projects and development teams in an Agile environment ...

Sr. Software Engineer - AI/ML, AWS Neuron Distributed Training - Next Generation Training

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer responsible for driving and enabling the AWS Neuron software stack to support next generation capabilities such as newer model architectures (like Mamba and Mixture of Experts) and lower precision training techniques.This is a cross functional role where you will be responsible for -- Influencing the Neuron roadmap to support newer model architectures and training techniques based on your technical assessment of state-of-the-art literature.- Working side by side with chip architects, applied scientists, compiler and runtime engineers to build performant support for the next generation models and training techniques (e.g. low precision training).This role requires experience on two dimensions -- Experience training large models using PyTorch/JAX is a must. FSDP, Deepspeed and other distributed training libraries are central to this and extending all of this for the Neuron based system is key.- Experience with profiling and building an understanding of systems bottlenecks and developing solutions (e.g. custom kernels) to improve performance is a must.About the teamAbout UsInclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust.Work/Life BalanceOur team puts a high value on work-life balance. 
It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.BASIC QUALIFICATIONS- 5+ years of non-internship professional software development experience- 5+ years of programming with at least one software programming language experience- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience- Experience as a mentor, tech lead or leading an engineering team ...

Sr. Software Engineer- AI/ML, AWS Neuron Apps

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This role is responsible for development, enablement and performance tuning of a wide variety of ML model families, including massive scale large language models like GPT2, GPT3 and beyond, as well as stable diffusion, Vision Transformers and many more. The ML Apps team works side by side with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large models using Python is a must. FSDP, Deepspeed and other distributed training libraries are central to this and extending all of this for the Neuron based system is key.*Utility Computing (UC)* AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.**Why AWS**Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.**Diverse Experiences**Amazon values diverse experiences. 
Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.**Work/Life Balance**We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.**Inclusive Team Culture**Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.**Mentorship and Career Growth**We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Key job responsibilitiesThis role will help lead the efforts building distributed training and inference support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks. This role will help tune these models to ensure the highest performance and maximize their efficiency when running on the custom AWS Trainium and Inferentia silicon and the Trn1 and Inf1 servers. Strong software development and ML knowledge are both critical to this role.About the teamAbout UsInclusive Team CultureHere at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. 
We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust.Work/Life BalanceOur team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives.Mentorship & Career GrowthOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded professional and enable them to take on more complex tasks in the future.BASIC QUALIFICATIONS- 5+ years of programming using a modern programming language such as Java, C++, or C#, including object-oriented design experience- 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience- Fundamentals of Machine learning and deep learning models, their architecture, training and inference lifecycles along with work experience on some optimizations for improving the model execution. ...

Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio.Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago—even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world.AWS Neuron is the complete software stack for AWS Inferentia (Inf1/Inf2) and Trainium (Trn1), our cloud-scale Machine Learning accelerators. This role is for a senior machine learning engineer in the Distributed Training team for AWS Neuron, responsible for development, enablement and performance tuning of a wide variety of ML model families, including massive-scale Large Language Models (LLM) such as GPT and Llama, as well as Stable Diffusion, Vision Transformers (ViT) and many more.The ML Distributed Training team works side by side with chip architects, compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trainium instances. Experience with training these large models using Python is a must. 
FSDP (Fully-Sharded Data Parallel), Deepspeed and other distributed training libraries are central to this and extending all of this for the Neuron based system is key.Key job responsibilitiesYou will help lead the efforts building distributed training support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks. You will help tune these models to ensure the highest performance and maximize their efficiency when running on the custom AWS Trainium and Inferentia silicon and the Trn1, Inf1/2 servers. Strong software development and Machine Learning knowledge are both critical to this role.About the teamAnnapurna Labs was a startup company acquired by AWS in 2015, and is now fully integrated. If AWS is an infrastructure company, then think of Annapurna Labs as the infrastructure provider of AWS. Our org covers multiple disciplines including silicon engineering, hardware design and verification, software, and operations. AWS Nitro, ENA, EFA, Graviton and F1 EC2 Instances, AWS Neuron, Inferentia and Trainium ML Accelerators, and in storage with scalable NVMe, are some of the products we have delivered over the last few years.About the teamOur team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future.Diverse ExperiencesAWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. 
If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWS: Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture: Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer.
That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science or equivalent
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one software programming language
- 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Experience as a mentor, tech lead or leading an engineering team
- Experience in machine learning, data mining, information retrieval, statistics or natural language processing ...

Sr. Software Engineer, Data Plane, NPD Forwarding Stack, Data Plane

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion. Amazon Web Services is looking for Software Development Engineers to develop state-of-the-art Linux-based networking platforms. You will join a team of engineers developing embedded routing platforms that enable one of the world's largest and most complex networks. We are seeking engineers with a demonstrated track record of designing and implementing Linux-based solutions on embedded devices, ideally for networking products. We want people who are passionate about changing the way data center networking is done. Plenty of complexity and scope for your next challenge! Why would you want to work on network devices for Amazon? Key job responsibilities: You like to get stuff done and solve complex, impactful problems. AWS develops both the network and the devices, allowing us to innovate in a way that others cannot. Amazon’s network is global in scope and it continues to grow: most of the network runs on our switches and we continue to expand our footprint.
Very large impact: these devices are central to Amazon.com (http://amazon.com/), AWS and more AWS customers. A day in the life: There are two main components to forwarding: the Linux kernel and its constructs for L2/L3 forwarding and management, and the underlying hardware. Our goal is to have the kernel state and the hardware state mirror one another, and as such we need people who have expertise in the Linux kernel and in core networking. Ideally you understand both how Linux manages forwarding and how that maps to the underlying forwarding hardware. About the team: We are the Data Plane team, split between Cupertino and Seattle, and we are looking to expand at either site. Our team owns packet forwarding in our networking devices; that is, the core functionality of a networking switch. Why AWS: Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Diverse Experiences: Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture: Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences.
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- 5+ years of non-internship professional software development experience
- 5+ years of programming experience with at least one software programming language
- 5+ years of experience leading design or architecture (design patterns, reliability and scaling) of new and existing systems
- Experience as a mentor, tech lead or leading an engineering team ...

Sr. System Development Engineer, Hardware Engineering

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion. The AWS Hardware Engineering team creates server designs for Amazon’s innovative web services. Our designs are industry-leading in frugality and operational excellence, and are critical to the success of the AWS business and the more than one million customers who use AWS today. Our engineers solve challenging technology problems, and build architecturally sound, high-quality components to enable AWS to realize critical business strategies. The ideal candidate for this role will have a proven track record of rapidly coming up to speed on new engineering disciplines, making impactful decisions within that space, and have experience gluing together components written by more specialized engineers to create a cohesive, well-running engineering product. AWS Engineers are shaping the way people use computers and designing the future of cloud computing technology – come help us make history!
What you will do: You will be a member of the HWEng Storage Drives team and will interact with engineers across the company to lead and develop automation and infrastructure capabilities to improve OEM drive operational health, qualification and integration across all AWS storage server platforms. Key responsibilities for this position include authoring requirements specifications, system design, and development, all while testing at each phase. There are two main areas of focus for this position: software test development and integration to enable storage hardware, and system debugging for production hosts. Why it matters: Public cloud IT services represent the majority of growth in the overall IT services market and will continue to do so for several years to come. The scale of AWS, combined with an understanding of how our hardware is used, creates a unique opportunity for component customization that will directly benefit customers. Why you will love it: You will work with engineers across the company to build next-generation storage servers. You will have a direct impact on our bottom line and the ability to deliver improvements for our developers. You will be part of a growing, fast-paced, and fun team. You will have ownership for the implementation of your work. You will see direct product improvements based on the results of your work.
Key job responsibilities
- Must have leadership experience developing automation software, spanning across multiple teams and multiple organizations.
- Must have experience designing and building system-level software at scale, with an emphasis on durability, availability, security, and diagnostics.
- Must have a strong understanding of OS internals, most notably storage subsystems in a Linux-based environment.
- Must have demonstrated experience with building and troubleshooting device drivers for Linux on ARM and x86.
- Must have demonstrated experience debugging Linux boot and runtime problems on ARM and x86.
- Must have demonstrated experience developing tools to perform testing on Linux-based systems.
- Must have hands-on experience developing automation software in at least one modern language, such as Python, Ruby, Java, or others.
- Desirable to have an understanding of AWS services, specifically distributed storage systems and use of storage hardware.
- Desirable to have an understanding of drive (HDD, SSD) technology.
A day in the life: Lead the Hardware Engineering (HWEng) System Development (SysDE) effort to define and build software and enabling tools, according to defined HWEng software development best practices; track and report progress. Work across internal HWEng teams to ensure the drives chosen for new storage hardware deliver the performance, reliability and operational health needed by the EC2, EBS, and S3 platforms. Work closely with internal customers to identify early any potential problems with onboarding new storage servers into their ecosystem. Build, manage, and deploy pipelines for rapid deployment of new code changes to a variety of org-owned and customer-owned systems. Build monitoring tools and metrics to ensure hardware is running properly in both test and production environments. About the team: System Development Engineers in AWS Hardware Engineering wear many hats. From orchestration tooling development, to hardware integration, to kernel driver debugging, we dive deep into problems across the breadth of AWS.
The ideal candidate will have a proven track record of rapidly coming up to speed on new engineering disciplines, making impactful decisions within that space, and have experience gluing together components written by more specialized engineers to create a cohesive, well-running engineering product. Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- 6+ years of experience deploying and operating in a Linux/Unix environment
- 5+ years of programming experience with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
- 5+ years of non-internship professional software development experience
- 3+ years of experience designing or architecting (design patterns, reliability and scaling) new and existing systems
- 3+ years of systems design, software development, operations, automation, and process improvement experience
- Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production ...

Sr. System Development Engineer, Storage Hardware Engineering

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion. The AWS Hardware Engineering team creates server designs for Amazon’s innovative web services. Our designs are industry-leading in frugality and operational excellence, and are critical to the success of the AWS business and the more than one million customers who use AWS today. Our engineers solve challenging technology problems, and build architecturally sound, high-quality components to enable AWS to realize critical business strategies. The ideal candidate for this role will have a proven track record of rapidly coming up to speed on new engineering disciplines, making impactful decisions within that space, and have experience gluing together components written by more specialized engineers to create a cohesive, well-running engineering product. AWS Engineers are shaping the way people use computers and designing the future of cloud computing technology – come help us make history! 
What you will do: You will be a member of the HWEng Storage Drives team and will interact with engineers across the company to lead and develop automation and infrastructure capabilities to improve OEM drive operational health, qualification and integration across all AWS storage server platforms. Key responsibilities for this position include authoring requirements specifications, system design, and development, all while testing at each phase. There are two main areas of focus for this position: software test development and integration to enable storage hardware, and system debugging for production hosts. Why it matters: Public cloud IT services represent the majority of growth in the overall IT services market and will continue to do so for several years to come. The scale of AWS, combined with an understanding of how our hardware is used, creates a unique opportunity for component customization that will directly benefit customers. Why you will love it: You will work with engineers across the company to build next-generation storage servers. You will have a direct impact on our bottom line and the ability to deliver improvements for our developers. You will be part of a growing, fast-paced, and fun team. You will have ownership for the implementation of your work. You will see direct product improvements based on the results of your work.
Key job responsibilities
- Must have leadership experience developing automation software, spanning across multiple teams and multiple organizations.
- Must have experience designing and building system-level software at scale, with an emphasis on durability, availability, security, and diagnostics.
- Must have a strong understanding of OS internals, most notably storage subsystems in a Linux-based environment.
- Must have demonstrated experience with building and troubleshooting device drivers for Linux on ARM and x86.
- Must have demonstrated experience debugging Linux boot and runtime problems on ARM and x86.
- Must have demonstrated experience developing tools to perform testing on Linux-based systems.
- Must have hands-on experience developing automation software in at least one modern language, such as Python, Ruby, Java, or others.
- Desirable to have an understanding of AWS services, specifically distributed storage systems and use of storage hardware.
- Desirable to have an understanding of drive (HDD, SSD) technology.
A day in the life: Lead the Hardware Engineering (HWEng) System Development (SysDE) effort to define and build software and enabling tools, according to defined HWEng software development best practices; track and report progress. Work across internal HWEng teams to ensure the drives chosen for new storage hardware deliver the performance, reliability and operational health needed by the EC2, EBS, and S3 platforms. Work closely with internal customers to identify early any potential problems with onboarding new storage servers into their ecosystem. Build, manage, and deploy pipelines for rapid deployment of new code changes to a variety of org-owned and customer-owned systems. Build monitoring tools and metrics to ensure hardware is running properly in both test and production environments. About the team: System Development Engineers in AWS Hardware Engineering wear many hats. From orchestration tooling development, to hardware integration, to kernel driver debugging, we dive deep into problems across the breadth of AWS.
The ideal candidate will have a proven track record of rapidly coming up to speed on new engineering disciplines, making impactful decisions within that space, and have experience gluing together components written by more specialized engineers to create a cohesive, well-running engineering product. Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- 6+ years of experience deploying and operating in a Linux/Unix environment
- 5+ years of programming experience with at least one modern language such as C++, C#, Java, Python, Golang, PowerShell, or Ruby
- 5+ years of non-internship professional software development experience
- 3+ years of experience designing or architecting (design patterns, reliability and scaling) new and existing systems
- 3+ years of systems design, software development, operations, automation, and process improvement experience
- Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production ...

Sr. Technical Product Manager - AWS Neuron, Annapurna Labs

AWS Utility Computing (UC) provides product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Additionally, this role may involve exposure to and experience with Amazon's growing suite of generative AI services and other cutting-edge cloud computing offerings across the AWS portfolio. Annapurna Labs (our organization within AWS UC) designs silicon and software that accelerates innovation. Customers choose us to create cloud solutions that solve challenges that were unimaginable a short time ago, even yesterday. Our custom chips, accelerators, and software stacks enable us to take on technical challenges that have never been seen before, and deliver results that help our customers change the world. The Product: AWS Neuron is the software for Trainium and Inferentia, the AWS Machine Learning chips. Inferentia delivers best-in-class ML inference performance at the lowest cost in the cloud to our AWS customers. Trainium is designed to deliver best-in-class ML training performance at the lowest training cost in the cloud, and it’s all being enabled by AWS Neuron. Neuron is cutting-edge software including an ML compiler and native integration into popular ML frameworks. Our products are being used at scale by external customers like Anthropic and Databricks as well as internal customers like Alexa, Amazon Bedrock, Amazon Robotics, Amazon Ads, Amazon Rekognition and many more. The Team: The Amazon Annapurna Labs team is responsible for building innovation in silicon and software for our AWS customers. We are at the forefront of innovation by combining cloud scale with the world’s most talented engineers.
Our team covers multiple disciplines including silicon engineering, hardware design, software and operations. Because of our team's breadth of talent, we have been able to improve AWS cloud infrastructure in high-performance machine learning with AWS Neuron, Inferentia and Trainium ML chips, in networking and security with products such as AWS Nitro, Elastic Network Adapter (ENA), and Elastic Fabric Adapter (EFA), and in computing with AWS Graviton and F1 EC2 instances. You: We’re seeking a hands-on product manager who has a passion for machine learning and developer-focused cloud software and hardware products, and who is willing to work hard for customers. Product Management in Annapurna is an opportunity to collaborate with engineering, design, and sales/business development teams to create state-of-the-art machine learning cloud services. In your role as Neuron product manager, you will be in charge of the customer voice within our team, working closely with multiple internal teams and customers to develop new Neuron features for training and inference, and to support our growing ecosystem. Your mission will be to ensure our customers find our new cutting-edge offerings pleasing and useful for achieving their aggressive business goals. As a member of the Annapurna team you’ll dive deep into our technology and work closely with our internal teams, engage with leading developers and customers, and help Annapurna's products scale to large deployments. We are looking for self-driven individuals who can collaborate with others, and who will continuously work to deliver a world-class customer experience.
This is a fast-paced, hands-on, intellectually challenging position, and you’ll work with thought leaders in multiple business and technology areas. You’re a good fit if (a) you can think big and are able to break down the big picture into measurable goals, (b) you have an instinctive understanding of what makes products successful and easy to deploy, and can raise the bar on delivering features beneficial to our customers, (c) you can dive into technical details and ask engineers insightful questions about the services that you own, and finally (d) you can think long-term, can balance conflicting interests and priorities, and converge on outcomes that earn trust and customer loyalty.
In this role you will:
- Work directly with software engineering teams to define and execute on new features.
- Produce clear, concise documents such as functional or technical specifications.
- Write user stories and perform user acceptance testing.
- Anticipate bottlenecks, manage risk and escalations, and balance business needs against technical constraints.
- Find opportunities to innovate on behalf of our customers, design features related to these opportunities, and always push to improve our product user experience.
- Drive feature discussions with customers, engineering, and other stakeholders.
- Stay connected with industry counterparts and gain insights on technology trends.
About the team: Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help our team members develop their engineering expertise so they feel empowered to take on more complex tasks in the future. Diverse Experiences: AWS values diverse experiences.
Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. About AWS: Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating; that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture: Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Work/Life Balance: We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship & Career Growth: We’re continuously raising our performance bar as we strive to become Earth’s Best Employer.
That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
BASIC QUALIFICATIONS
- Bachelor's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent
- Experience owning/driving roadmap strategy and definition
- Experience with feature delivery and tradeoffs of a product
- Experience contributing to engineering discussions around technology decisions and strategy related to a product
- Experience in representing and advocating for a variety of critical customers and stakeholders during executive-level prioritization and planning
- Experience in technical product management, program management or engineering
- 10+ years of industry experience, with 5+ years in a technical product management or customer-facing role. Knowledge of full product life cycles, including technical specifications, development, go-to-market, pricing, customer-facing presentations and collaboration with engineering and sales teams.
- Solid knowledge of computer architecture fundamentals, operating systems and cloud infrastructure engineering concepts
- Ability to work in a fast-paced and agile work environment with demonstrated collaboration skills and demonstrated strengths in driving through complexity, ambiguity, and unknowns in early-stage programs
- Proven experience in delivering modern software products, preferably collaborative open-source projects ...

Sr. Technical Program Manager, Sr. Technical Program Manager

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion. In this org: Amazon Web Services (AWS) Hardware Engineering Services is a fast-growing and leading-edge research and development team that creates enterprise compute and storage server designs for our innovative web service and e-commerce technology platforms. Our designs are industry-leading in terms of performance, frugality and operational excellence, and are critical to the success of the AWS business and the hundreds of thousands of customers who use utility computing today. We are seeking an experienced Sr. Technical Program Manager (TPM) to define and build the next generation of our cloud platforms. Our success depends on our world-class server infrastructure; we’re handling massive scale and rapid integration of emergent technologies. The ideal TPM candidate will be an experienced program manager who is skilled in managing server hardware product development teams across several geographic regions using a variety of suppliers (CM, ODM/IHV).
The successful candidate will be an innovative self-starter and leader. The candidate will be familiar with the cloud computing industry and AWS offerings and will be passionate about providing the best possible customer experience at the best cost. The candidate will have an obsession with data and precise analysis and will use these as inputs to make decisions.
Key job responsibilities
As a TPM on the Server Hardware Engineering team, you will be responsible for leading multiple simultaneous hardware product development programs in a highly cross-functional environment which includes internal customers, external vendors and technology partners. You’ll provide leadership in the application of new technologies to large-scale server deployments in a continuous effort to deliver a world-class customer experience at a world-class cost point. This is a fast-paced, intellectually challenging position, and you’ll work with thought leaders in multiple technology areas. You’ll have high standards for yourself and everyone you work with, and you’ll be constantly looking for ways to improve your product’s performance, quality and cost. Using data and key metrics, you will also drive and measure process improvements that enhance our operational effectiveness. You will work independently in a dynamic, challenging, and fast-changing organization. We’re changing an industry, and we need individuals who are ready for this challenge and who want to reach beyond what is possible today.
A day in the life
You will have exposure to teams and leaders across the entire company. You will be a part of a leading-edge research and development team. You will see many aspects of the Amazon business. You will have a direct impact on our bottom line and the ability to improve things for our developers. You will be part of a growing, fast-paced, and fun team.
You will have ownership and responsibility for defining and executing processes that deliver both savings and productivity for Amazon.
About the team
The team you will report to brings a diverse set of skills across many technical disciplines, including TPM, hardware engineering, software development, and hardware systems design. We are committed to delivering innovative solutions to our customers through the full life cycle of delivering and operating AWS hardware.
BASIC QUALIFICATIONS
- BS or MS degree in a technical discipline or equivalent experience in an IT-related field.
- 8+ years of server/IT industry experience in a hardware development team
- 8+ years of project/program management experience
- Demonstrated ability to manage project/task prioritization, procurement, project planning and schedule development. ...

Sr. TPM AWS, Annapurna Labs AI Chips GTM, Annapurna Labs

Do you want to help define the future of AWS AI Chips (AWS Inferentia/Trainium) Go to Market (GTM)? You will be part of the core worldwide AWS AI Chips Business and GTM team, driving our most strategic customer and industry partnership engagement programs. Our customers build and deploy GenAI applications on our Chips across many industry segments. You will collaborate with Neuron engineering, business, and product leaders to ensure we meet and exceed our customers' expectations. This work includes managing relationships with leading ML framework and library providers, and working across teams in AWS to accelerate customer adoption of AWS Inferentia and Trainium based instances.
At Annapurna, our TPM role is focused on working among multiple teams to deliver the functionality those teams are responsible for. The TPM is the “glue” that holds teams together and maintains a bird’s eye view over what those teams are delivering and how those deliverables fit together. This involves two aspects, both equally important: The “Technical” part of “TPM” requires identifying dependencies and technical risks that affect the various teams represented. The “Program Manager” part of “TPM” involves scheduling, creating milestones, and reporting status. As part of your project and program ownership, you focus on the larger business and technology picture (i.e., customer experience, processes, opportunities, and/or problems to be solved). You deeply understand the business and technical requirements of the solutions being built and drive the right outcomes. You take the time to understand the needs of engineers (who have to build, maintain, and extend features for the life of the project). You help your customers and the engineering teams make appropriate trade-offs by considering the larger picture (e.g., business goals, user experience, dependency impacts, efficiency, availability).
You partner with technical managers to secure resources, scope technical efforts, set project priorities and milestones, and drive delivery. You determine if success metrics are in place, and if not, work to define them. As part of this role, you will work directly with external technology providers, customers, and partners. To be successful, you have a solid understanding of the design approaches and industry technologies utilized. You make connections (to people and/or technologies) and make sure the right people are part of the conversation. For example, you are able to recognize when a proposed design is too complex or risky (and arrange additional reviews by senior engineers). You will need to be adept at interacting, communicating, and partnering with teams within AWS (product, solutions architecture, sales, marketing, and professional services) and externally with customers and 3rd party model providers.
Key job responsibilities
- Lead internal/external cross-team technical projects to accelerate adoption of AWS AI Chips
- Manage complex customer and partner deliveries
- Drive scale with external partners
- Provide technical direction with limited assistance
- Communicate project status to the executive team
About the team
The Amazon Annapurna Labs team is responsible for building innovation in silicon and software for our AWS customers. We are at the forefront of innovation by combining cloud scale with the world’s most talented engineers. Our team covers multiple disciplines including silicon engineering, hardware design, software and business development.
Along with the AI Chips Inferentia and Trainium, Annapurna Labs has delivered advancements in networking with AWS Nitro, Amazon’s first ARM-based instances with AWS Graviton, and the first FPGA instances in the cloud.
BASIC QUALIFICATIONS
- 5+ years of TPM experience within a large engineering org, or relevant technical partnership management
- 3+ years of proven experience in driving the delivery of large technology programs or products.
- 3+ years of software engineering experience
- Design knowledge and expertise to drive technical decisions and anticipate technical risks
- Experience managing programs across cross-functional teams, building mechanisms of scale
- Working knowledge of one or more ML Frameworks (e.g., PyTorch, JAX) and ML methods including GenAI foundation models, computer vision models, multimodal techniques
- Bachelor's degree or equivalent ...