Job Description
<h3>Description</h3>
• Worth AI, a leader in the computer software industry, is looking for a talented and experienced Principal Data Engineer to join our innovative team. At Worth AI, we are on a mission to revolutionize decision-making with the power of artificial intelligence while fostering an environment of collaboration and adaptability, aiming to make a meaningful impact in the tech landscape. Our team values include extreme ownership, one team, and creating raving fans among both our employees and our customers.
• Worth is looking for a Principal Data Engineer to own the company-wide data architecture and platform. In this role, you will design and scale reliable batch/streaming pipelines, institute data quality and governance, and enable analytics/ML with secure, cost-efficient systems. You will partner with engineering, product, analytics, and security to turn business needs into durable data products.
• Responsibilities
• What you will do:
• Architecture & Strategy
• Define end-to-end data architecture (lake/lakehouse/warehouse, batch/streaming, CDC, metadata).
• Set standards for schemas, contracts, orchestration, storage layers, and semantic/metrics models.
• Publish roadmaps, ADRs/RFCs, and "north star" target states; guide build vs. buy decisions.
• Platform & Pipelines
• Design and build scalable, observable ELT/ETL and event pipelines.
• Establish ingestion patterns (CDC, file, API, message bus) and schema-evolution policies.
• Provide self-service tooling for analysts/scientists (dbt, notebooks, catalogs, feature stores).
• Ensure workflow reliability (idempotency, retries, backfills, SLAs).
• Data Quality & Governance
• Define dataset SLAs/SLOs, freshness, lineage, and data certification tiers.
• Enforce contracts and validation tests; deploy anomaly detection and incident runbooks.
• Partner with governance on cataloging, PII handling, retention, and access policies.
• Reliability, Performance & Cost
• Lead capacity planning, partitioning/clustering, and query optimization.
• Introduce SRE-style practices for data (error budgets, postmortems).
• Drive FinOps for storage/compute; monitor and reduce cost per TB/query/job.
• Security & Compliance
• Implement encryption, tokenization, and row/column-level security; manage secrets and audits.
• Align with SOC 2 and privacy regulations (e.g., GDPR/CCPA; HIPAA if applicable).
• ML & Analytics Enablement
• Deliver versioned, documented datasets/features for BI and ML.
• Operationalize training/serving data flows, drift signals, and feature-store governance.
• Build and maintain the semantic layer and metrics consistency for experimentation/BI.
• Leadership & Collaboration
• Provide technical leadership across squads; mentor senior/staff engineers.
• Run design reviews and drive consensus on complex trade-offs.
• Translate business goals into data products with product/analytics leaders.
<h3>Requirements</h3>
• 10+ years in data engineering (including 3+ years as staff/principal or equivalent scope).
• Proven leadership of company-wide data architecture and platform initiatives.
• Deep experience with at least one cloud (AWS) and a modern warehouse or lakehouse (e.g., Snowflake, Redshift, Databricks).
• Strong SQL and one programming language (Python or Scala/Java).
• Orchestration (Airflow/Dagster/Prefect), transformations (dbt or equivalent), and streaming (Kafka/Kinesis/PubSub).
• Data modeling (3NF, star, data vault) and semantic/metrics layers.
• Data quality testing, lineage, and observability in production environments.
• Security best practices: RBAC/ABAC, encryption, key management, auditability.
<h3>Benefits</h3>
• Health Care Plan (Medical, Dental & Vision)
• Retirement Plan (401k, IRA)
• Life Insurance
• Unlimited Paid Time Off
• 9 Paid Holidays
• Family Leave
• Work From Home
• Free Food & Snacks (Access to Industrious Co-working Membership!)
• Wellness Resources