Software Engineer, Sensor Data Integration
The Role
At Mach9, Sensor Data Integration Engineers build the algorithms and pipelines that transform large-scale geospatial datasets into structured, accessible formats to power our survey product, Digital Surveyor. You'll work with high-volume data sources (LiDAR point clouds, on-road imagery, and overhead aerial orthophotos) and own the systems that ingest, standardize, and store them for model training and product use. Every piece of data that our customers upload will pass through your systems first.
This role is ideal for an engineer who loves puzzle-hunting: reverse-engineering sparsely documented formats, wrangling coordinate systems and transforms, and hunting down strange camera projection issues.
You'll sit at the divide between our customers and our product, at the front of everything we do: our models are only as good as the data feeding them, and you'll be the one making messy real-world sensor data trustworthy at scale.
Where you'll make an impact
Own the ingestion pipelines that convert point clouds and imagery from hardware vendors into Mach9's standard internal format
Reverse-engineer new vendor formats and updates, often with sparse or missing documentation, to expand what data Mach9 can take in
Build agentic systems to automatically triage failures and reformat data
Build automated checks and regression testing to guarantee the consistency of our data
Optimize the performance of our processing and storage across massive geospatial datasets in the cloud
Work directly with customers and partners to unblock critical customer projects
What you bring
Strong software development and debugging skills
Experience building production software in Python
Comfort operating with ambiguity. You'll need to be able to dig into undocumented or messy data formats and reverse-engineer them.
Strong communication skills, with the ability to work across our ML, product, and customer success teams
A foundation in parallel computing or distributed systems
A bachelor's degree in Computer Science or Engineering, or equivalent experience
Bonus experience
Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching.
Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf).
Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3).
Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch).
Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms).
Experience building data versioning, lineage, or artifact-tracking systems.
Experience operating data pipelines that feed ML training and inference.
Familiarity with C++.
About Mach9
Mach9 is transforming civil infrastructure design with AI-powered geospatial tools. Our platform accelerates the creation of engineering deliverables from raw data, cutting manual drafting time by 96×. Trusted by global leaders in engineering and construction, we're backed by Y Combinator, Quiet Capital, and top founders and executives from Cruise, Autodesk, Adobe, and DoorDash.
We believe a startup benefits from an in-person culture. The team works out of our office in SoMa, with the flexibility to work from home when needed.