Fire Protection Lead Intelligence
Built an LLM-powered system to automatically categorize over one million annual construction bids, identifying high-value fire protection opportunities that sales teams would otherwise miss.
The Business Problem
Core & Main sells fire protection equipment (sprinkler systems, fire pumps, alarms) to contractors working on commercial construction projects. The challenge: finding the right projects in a sea of data.
The Numbers:
- 1M+ annual bid postings from construction databases
- 2-3% are relevant fire protection opportunities
- Manual review: impossible at scale
- Missing opportunities: real revenue loss
Traditional keyword filtering caught obvious matches but missed nuanced cases. "Hospital renovation" might need fire protection. "Hospital equipment procurement" definitely doesn't.
Two-Pass LLM Architecture
The key insight: Use cheap models for filtering, expensive models for precision.
Pass 1: Bulk Filtering (GPT-4o-mini)
- Process all 1M+ bids through fast, inexpensive model
- Simple prompt: "Could this project involve fire protection systems?"
- Binary classification: Maybe (keep) or No (discard)
- Reduces dataset by ~90% at minimal cost (see the sketch below)
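A minimal sketch of the pass-1 call using the Azure OpenAI client from the `openai` Python package. The endpoint, deployment name, and prompt wording here are illustrative assumptions, not the production values:

```python
from openai import AzureOpenAI

# Placeholder endpoint and key; use a managed identity in practice.
client = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com",
    api_key="...",
    api_version="2024-06-01",
)

FILTER_PROMPT = (
    "Could the following construction bid involve fire protection systems "
    "(sprinklers, fire pumps, alarms)? Answer exactly MAYBE or NO.\n\n"
)

def pass1_filter(bid_text: str) -> bool:
    """Return True (keep) for MAYBE, False (discard) for NO."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # deployment name; an assumption
        messages=[{"role": "user", "content": FILTER_PROMPT + bid_text}],
        max_tokens=3,
        temperature=0,
    )
    return resp.choices[0].message.content.strip().upper().startswith("MAYBE")
```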
Pass 2: Deep Analysis (GPT-4o)
- Remaining ~100K bids get detailed analysis
- Structured extraction: project type, scope, timeline, location
- Confidence scoring with reasoning explanation
- Fire protection relevance rating (1-5 scale); see the structured-output sketch below
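A companion sketch for pass 2, reusing the client from the pass-1 sketch. The JSON schema and field names are illustrative; the production prompt also carries few-shot examples, omitted here for brevity:

```python
import json

EXTRACT_PROMPT = """Analyze this construction bid for fire protection relevance.
Return JSON with keys: project_type, scope, timeline, location,
relevance (integer 1-5), confidence (0-1), reasoning.

Bid:
"""

def pass2_analyze(bid_text: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o",  # deployment name; an assumption
        messages=[{"role": "user", "content": EXTRACT_PROMPT + bid_text}],
        response_format={"type": "json_object"},  # forces parseable JSON
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)
```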
Why Two Passes?
| Approach | Cost per Million Bids | Accuracy |
|-----------------|----------------------|----------|
| All GPT-4o | $15,000+ | High |
| All GPT-4o-mini | $150 | Medium |
| Two-Pass Hybrid | $2,250 | High |
The result: roughly 85% cost reduction ($2,250 vs. $15,000+) with accuracy equivalent to full GPT-4o processing.
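A back-of-envelope cost model that reproduces the table's totals. The per-bid rates are assumptions chosen to match the figures above; note the assumed pass-2 per-bid cost exceeds the simple all-GPT-4o rate because the extraction prompt is longer (few-shot examples, structured output):

```python
TOTAL_BIDS = 1_000_000
SURVIVAL_RATE = 0.10                 # pass 1 discards ~90%

MINI_PER_BID = 150 / 1_000_000       # $150 per 1M bids with the short filter prompt
GPT4O_DEEP_PER_BID = 0.021           # assumed; longer extraction prompt

hybrid = (TOTAL_BIDS * MINI_PER_BID
          + TOTAL_BIDS * SURVIVAL_RATE * GPT4O_DEEP_PER_BID)
print(f"hybrid: ${hybrid:,.0f}")     # hybrid: $2,250
```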
Technical Implementation
Data Pipeline (Microsoft Fabric)
Bid Sources → Bronze (raw) → Silver (cleaned) → LLM Processing → Gold (enriched)
- Daily incremental ingestion from bid aggregator APIs
- Deduplication and normalization in Silver layer
- PySpark UDFs calling Azure AI Foundry endpoints (sketched after this list)
- Results land in Gold layer with full lineage tracking
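A simplified sketch of the enrichment step as a PySpark UDF, with assumed table and column names. In production you would instantiate the client per executor (e.g., inside a pandas UDF) and batch and rate-limit requests rather than call the endpoint row by row:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import BooleanType

spark = SparkSession.builder.getOrCreate()

# pass1_filter is the function from the earlier sketch.
pass1_udf = F.udf(pass1_filter, BooleanType())

silver = spark.read.table("silver.bids")  # assumed table name
candidates = (
    silver
    .withColumn("maybe_fire_protection", pass1_udf(F.col("bid_text")))
    .filter(F.col("maybe_fire_protection"))
)
candidates.write.mode("append").saveAsTable("gold.fp_candidates")
```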
Prompt Engineering
The prompts evolved through dozens of iterations. Key learnings, illustrated in the sketch after this list:
- Be specific about edge cases: "Hospital" without construction scope = no
- Provide examples: Few-shot prompting improved edge case handling
- Request structured output: JSON responses enable reliable parsing
- Ask for reasoning: Chain-of-thought improves accuracy and enables auditing
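A hypothetical prompt applying all four learnings: explicit edge-case rules, few-shot examples, JSON output, and a reasoning field. The wording and examples are illustrative, not the production prompt:

```python
CLASSIFY_PROMPT = """You classify construction bids for fire protection relevance.

Rules:
- Renovation or new construction that plausibly includes sprinklers, fire pumps,
  or alarms -> MAYBE
- Procurement, services, or maintenance with no construction scope -> NO

Examples:
Bid: "New 4-story office building, full MEP and sprinkler design-build"
{"reasoning": "new commercial construction explicitly includes sprinklers", "label": "MAYBE"}

Bid: "Hospital equipment procurement, imaging devices"
{"reasoning": "equipment purchase, no construction scope", "label": "NO"}

Respond only with JSON: {"reasoning": "...", "label": "MAYBE" or "NO"}"""

def build_prompt(bid_text: str) -> str:
    return CLASSIFY_PROMPT + "\n\nBid: " + bid_text
```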
Quality Monitoring
- Weekly sample audits by sales team
- Precision/recall tracking over time (see the sketch after this list)
- A/B testing prompt variations
- Cost-per-lead metrics
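A minimal sketch of the audit metrics, assuming reviewers label a sample of both kept and discarded bids so recall can be estimated (field names are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class AuditRow:
    predicted_relevant: bool  # the system's call
    actually_relevant: bool   # the sales reviewer's call

def precision_recall(rows: list[AuditRow]) -> tuple[float, float]:
    tp = sum(r.predicted_relevant and r.actually_relevant for r in rows)
    fp = sum(r.predicted_relevant and not r.actually_relevant for r in rows)
    fn = sum(not r.predicted_relevant and r.actually_relevant for r in rows)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```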
Business Impact
- Coverage: 98% of relevant bids now identified (up from ~60% with keywords)
- Efficiency: Sales team focuses on pre-qualified leads, not research
- ROI: First quarter showed revenue attribution exceeding annual system cost
- Speed: Daily lead delivery vs. weekly manual reports
Lessons Learned
- Start with the economics: Model architecture follows from cost constraints, not technical elegance
- Ground truth is expensive: Getting sales team to validate samples took more effort than building the system
- Prompts are code: Version control, testing, and iteration are essential
- LLMs aren't magic: They're pattern matchers. Garbage in, garbage out.
Finding needles in haystacks at scale.