Data Management Engineer
Ouster

At Ouster, we are pioneering the future of Physical AI by redefining the capabilities of deep learning. Through our groundbreaking digital lidar solutions, including the Ouster Gemini perception platform and BlueCity smart city applications, we deliver the critical intelligence needed to build a safer and more efficient world.
Job Summary
We are seeking a highly organized and detail-oriented Data Management Engineer to own, develop, and operate our entire data pipeline, from collection to validation. In this critical role, you will be the bridge between our engineering teams, stakeholders, and external labeling partners. You will manage the full lifecycle of our datasets, ensuring the quality and integrity required to train and validate our advanced perception systems. The ideal candidate is a proactive problem-solver with a strong technical background in coding and data handling, and a knack for process improvement.
Key Responsibilities
Data Acquisition & Management
- Stakeholder Coordination: Liaise with internal teams to define data collection requirements and priorities, maintaining clear documentation on collection needs.
- Data Collection: Perform on-site data recording using our systems and directly obtain datasets from key stakeholders or customers.
- Data Storage & Archiving: Manage and organize large datasets across various storage solutions, including Google Cloud (GCloud) and local Network Attached Storage (NAS), ensuring data is secure, accessible, and up-to-date.
Data Processing & Validation
- Lidar Pre-processing: Conduct initial alignment and pre-validation of raw Lidar point cloud data to ensure its quality and usability.
- Data Validation & Quality Assurance: Design, develop, and perform rigorous validation processes on labeled data received from vendors to confirm it meets our standards before it is integrated into training or validation sets.
- Dataset Management: Handle the strategic splitting of datasets for training and validation purposes. Identify and document errors or inconsistencies in current datasets and coordinate with labeling teams for corrections.
- Validation & Regression Framework: Add new datasets into our validation framework, which includes creating precise Areas of Interest (AOIs) and generating new baseline performance metrics.
Labeling & Process Improvement
- Labeling Coordination: Manage the end-to-end labeling process by sending data to our labeling partners, tracking their progress, and serving as the primary technical point of contact.
- Documentation: Own and update all labeling documentation, including defining new classes and clarifying labeling instructions to ensure data consistency.
- Metrics & Reporting: Continuously improve the metrics and reports used for our validation, performance and regression testing, adding new parameters as needed to enhance our evaluation framework.
- Automation & Efficiency: Run, maintain, and improve our pre-labeling pipeline to increase the efficiency of our data operations.
- Tool & Industry Research: Conduct industry research to identify and evaluate new tools and technologies that can make our labeling and data management processes more efficient.
Required:
- Bachelor’s degree in Computer Science, Engineering, or a related technical field.
- Proven experience in managing large-scale datasets and complex validation pipelines for machine learning and computer vision applications.
- Proficiency in scripting languages such as Python for automation and data manipulation.
- Familiarity with data labeling processes and managing multiple labeling vendors.
- Strong organizational skills with the ability to manage multiple projects and priorities simultaneously.
- Excellent communication skills, with experience coordinating between technical teams and external vendors.
- Meticulous attention to detail, especially in data validation and quality control.
- Ability to identify gaps in existing processes and define new processes to close them.
Nice to have:
- Strong knowledge of C++ / Rust.
- DevOps or MLOps experience.
- Hands-on experience with Lidar data and point cloud processing.