Data Scientist
Posted 2026-05-06
This is a remote position.
Role Overview
We are looking for a highly skilled Data Scientist to design and deploy end-to-end production pipelines for multimodal data synthesis. You will focus on building sophisticated VAE architectures, advanced video processing modules, and scalable synthetic data generation systems using a phased, Agile delivery approach.
Key Responsibilities
Build standalone, testable modules for video metadata extraction, critical frame selection, and automated scene analysis.
Lead the design of VAE-based systems for complex data imputation and synthesis.
Execute a three-phase delivery model: 1) Extraction & Architecture, 2) Synthesis & Imputation, and 3) Cross-Component Optimization.
Implement rigorous integration testing and quality metrics to ensure the fidelity of synthetic outputs.
Requirements
Deep expertise in VAE architecture design, training, and latent space manipulation for high-dimensional data synthesis and imputation.
Proven experience with 3D CNNs, scene change detection (inter-frame histograms), and motion analysis (optical flow/peak detection).
Ability to generate and fuse synthetic data across tabular, text, audio, and video formats using statistical modeling and Gaussian copulas.
Advanced Python skills with a focus on modular design, production-level pipelines, and ffmpeg integration for audio/video handling.
Familiarity with specialized techniques such as MDH/CDS/DSM exclusion and multimodal merger integration.