Data Engineer
Location
Nashville, TN
Job Level
Mid-Level
Schedule
Full-Time
Job Type
Remote
Company Background
At Edgevanta, our mission is to empower civil contractors to estimate and bid faster and more accurately. Our customers are the unsung heroes building the infrastructure we all rely on - from the water systems that sustain us to the airports that connect us and the roads we drive on every day. Without infrastructure contractors, society wouldn't function.
Our Story
Edgevanta was born from the dirt. Our Founder and CEO, Tristan Wilson, is a fourth-generation contractor who grew up on job sites, estimating and managing heavy civil work.
We believe Estimators Are Awesome! They sift through thousands of pages - specs, plans, bid tabs - making sense of chaos under tight deadlines and tighter margins. But outdated tools slow them down, wasting time on what should be automatic.
Edgevanta exists to give estimators superpowers. We give them their time back with AI - automating the repetitive, surfacing the critical, and unlocking faster, more accurate bids. Because better bids don't just win jobs - they build better infrastructure for the world.
We work with many of the top infrastructure contractors in North America and are growing rapidly.
Our Culture
Edgevanta's success is built on four core values that define everything we do: Make It Happen, Fix Customer Problems, Curiosity, and Candor. If you're passionate about creating meaningful impact and committed to going above and beyond, we want to hear from you.
Summary
We're looking for a skilled Data Engineer to join our team on a project focused on converting unstructured data from PDFs into structured formats like databases or spreadsheets. You will build and optimize ETL pipelines, ensuring the accurate and efficient transformation of data from a variety of document types into usable formats for downstream teams.
Description
Responsibilities
- Analyze various PDF document types (text-based, scanned, tables, forms) to determine optimal extraction strategies.
- Build robust, reusable pipelines to:
  - Extract data from PDFs using code or AI-based tools.
  - Clean, validate, and structure the data.
  - Load data into a relational database (PostgreSQL preferred) or spreadsheet formats (CSV, Excel).
- Recommend and implement tools (Python, JavaScript, or others) and services (e.g., AWS, serverless compute, OCR).
- Ensure high data accuracy, quality, and reliability.
- Document the pipeline and processes clearly for internal use and possible handoff.
- Collaborate with internal engineers to align on data models, API interactions, or downstream consumption.
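To make the clean/validate/structure steps above concrete, here is a minimal, stdlib-only sketch of the middle of such a pipeline. It assumes table rows have already been extracted by a library like pdfplumber; the field names and helper functions are illustrative, not part of Edgevanta's actual stack:

```python
import csv
import io

def clean_rows(raw_rows):
    """Normalize raw table rows extracted from a PDF:
    strip whitespace, replace None cells with '', drop empty rows."""
    cleaned = []
    for row in raw_rows:
        cells = [(c or "").strip() for c in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned

def to_records(rows):
    """Treat the first row as the header; validate column counts."""
    header, *body = rows
    records = []
    for row in body:
        if len(row) != len(header):
            continue  # skip malformed rows rather than corrupt the output
        records.append(dict(zip(header, row)))
    return records

def write_csv(records, fieldnames):
    """Serialize records to CSV text (a stand-in for a PostgreSQL load)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

# Example: a raw table as a PDF extractor might return it
raw = [["Item ", " Qty"], [None, None], ["Pipe", "12 "]]
records = to_records(clean_rows(raw))  # [{"Item": "Pipe", "Qty": "12"}]
```

In a production pipeline, the validation step would also enforce types and units before loading, but the extract → clean → validate → load shape is the same.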
Qualifications
Requirements
- 3+ years of experience as a Data Engineer or similar role.
- Proven experience designing data pipelines that involve unstructured document parsing (especially PDFs).
- Comfortable working with PDF parsing libraries (pdfplumber, Tabula, Camelot, etc.) and/or AI/LLM-based extraction techniques.
- Strong programming background (Python preferred; open to others like JavaScript/TypeScript, Go, or Java).
- Experience with databases (PostgreSQL preferred) and data modeling.
- Familiar with cloud storage (e.g., AWS S3, GCS, Azure Blob).
- Strong communication skills and ability to collaborate in cross-functional teams.
Nice to Have
- Familiarity with our general tech stack: TypeScript, Vercel, Next.js, NeonDB, LangChain, Drizzle ORM.
- Experience working in monorepo environments and modern CI/CD workflows.
- Experience with AI-enhanced document parsing (e.g., OpenAI, AWS Textract, Azure Form Recognizer).
- Terraform or infrastructure-as-code experience for setting up cloud resources.
Success in This Role Looks Like
- Delivered a reliable pipeline that accurately converts various types of PDFs into structured data.
- Chosen tools and frameworks that balance performance, maintainability, and cost.
- Collaborated with internal team members to align data models and future data needs.
- Created clear documentation for ongoing maintenance and scaling.