Amazon is a fast-paced, innovative company that is developing software no one has attempted before. If you are a data engineer who is passionate about writing code and loves to build large-scale data pipelines that are scalable, high-throughput, fault-tolerant and always available, then get in touch with us.

The Item and Offers team is responsible for a variety of services that form a core part of the Amazon eCommerce platform. We are primarily responsible for developing the services that process all of the item information from millions of merchants who want to sell through the Amazon family of websites. Our expertise lies in managing billions of products in the catalog and developing large-scale distributed systems that process hundreds of millions of changes to the catalog every day in real time. The team offers a unique blend of hard computer science problems and an opportunity to help businesses model their new ideas.

Are you passionate about working with large datasets and code? Do you want to build and manage data engineering solutions that process a broad range of data schemas? Do you want to continuously improve the data pipelines that operate at Amazon’s catalog scale while ensuring our customers’ trust? If yes, then come join the Catalog Data Works (CDW) team, whose charter is to provide useful, fresh and historical catalog data that teams at Amazon can analyze and leverage for their business use cases. The datasets published by this team are a critical component of building a catalog that earns our customers’ trust.

As a Sr. Data Engineer in the CDW team, you will own complex big data pipelines and data solutions that provide highly available datasets. You will work with large datasets (in petabytes) and transformations involving multiple data sources to enable downstream analytics for our stakeholders. You will build and manage large datasets to help teams make data-driven decisions through analytical and business metrics dashboards.

The Data Engineer will play a crucial role in designing, developing, and maintaining efficient and scalable data pipelines, data models, and data warehousing solutions. This position will be responsible for ensuring data integrity, quality, and availability across the organization, enabling data-driven decision-making and supporting business analytics and insight initiatives.

Key job responsibilities
• Define and optimize data models for rapid analytics on catalog product data, improving freshness and LLM consumption while reducing costs and undifferentiated work.

• Automate metrics generation to support S-team goals, including pack hierarchy scaling and standard KPIs, while leading strategy for scaling self-serve analysis and dashboards.

• Mentor engineers, establish best practices in data engineering and operational excellence, and stay current with the latest technologies to recommend innovations.

• Conduct comprehensive data discovery, profiling, and performance analysis for various sources, designing effective models for Page0, entitlement, propensity, and other relevant data.

• Collaborate with stakeholders to translate requirements into optimized data structures, while establishing and enforcing data governance policies to maintain quality, consistency, and security.

• Take a long-term view of data solutions, proactively addressing architecture deficiencies and making appropriate trade-offs for usability, security, maintainability, scalability, and extensibility.

• Resolve root causes of endemic problems, unblocking innovation for related teams, and build consensus with stakeholders to influence and determine the best path forward.

BASIC QUALIFICATIONS

- Bachelor's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent
- 5+ years of data engineering experience
- Experience with data modeling, warehousing and building ETL pipelines
- Experience mentoring team members on best practices
- Experience in at least one modern scripting or programming language, such as Python, Java or Scala
- Experience in dimensional data modeling and schema design
- Experience with diverse data formats: Parquet, JSON, big data formats, and table formats like Apache Iceberg

PREFERRED QUALIFICATIONS

- Experience with big data technologies such as: Hadoop, Hive, Spark, EMR
- Experience with BDT toolsets like Cradle, DataCraft, Andes and other products
- Experience in managing data at scale (datasets of hundreds of terabytes)

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Los Angeles County applicants: Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence despite stressful conditions; communicate effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and follow all federal, state, and local laws and Company policies. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company’s reputation. Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $139,100/year in our lowest geographic market up to $240,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.