In Property & Casualty insurance, underwriting decisions depend heavily on loss runs — structured claims history documents submitted by brokers in PDF format. The challenge is that every carrier formats these documents differently, making automated parsing difficult. For Shepherd Insurance, a startup scaling its underwriting operations, this meant that processing a single 350-page loss run required an underwriter to spend 3+ hours on manual data entry. With no standardized format across carriers, each document demanded custom interpretation. This bottleneck constrained underwriting throughput and introduced the risk of human error in a domain where data accuracy directly affects pricing and risk decisions.
Shepherd's engineering team built an automated document processing pipeline centered on SenseML from Sensible, a JSON-based domain-specific language that sits on top of AWS and Azure OCR services. Rather than attempting a general solution upfront, the team took a deliberate approach: they identified the 8 carrier formats appearing most often in their submission volume and wrote a dedicated extraction configuration for each. This gave the platform deterministic, auditable parsing logic for the majority of incoming documents. Underwriters could upload PDFs directly into the platform, with claims data automatically extracted and persisted — no manual entry required. The team evaluated LLM-based approaches, including GPT-4 with large context windows, RAG pipelines, and vector database solutions, before concluding that OCR paired with a structured extraction DSL was the right fit for this use case.
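The per-format strategy above can be sketched as a simple dispatch: each known carrier format maps to its dedicated extraction config, and anything unrecognized falls back to manual handling. This is an illustrative sketch under stated assumptions — the function and config names are hypothetical, not Shepherd's code or Sensible's API.

```python
# Hypothetical dispatch step: route an incoming loss run to the
# extraction config written for its carrier format. Unknown formats
# fall back to manual review rather than risking a bad parse.
# KNOWN_CONFIGS and route_loss_run are illustrative names only.

# One deterministic config per known carrier format (the "top 8" approach).
KNOWN_CONFIGS = {
    "carrier_a": "carrier_a_loss_run_v1",
    "carrier_b": "carrier_b_loss_run_v2",
    # ... configs for the remaining known carrier formats
}

def route_loss_run(carrier_id: str) -> tuple[str, str]:
    """Return (route, config_name); unknown carriers go to manual review."""
    config = KNOWN_CONFIGS.get(carrier_id)
    if config is None:
        return ("manual_review", "")
    return ("automated_extraction", config)
```

The explicit fallback is the point of the design: documents matching a known format get fully deterministic parsing, while anything else is flagged instead of silently mis-extracted.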
The automated pipeline delivers a 270x productivity improvement over manual extraction. A 350-page loss run that previously took an underwriter over 3 hours to process manually is now completed in under 40 seconds. For known carrier formats — the top 8 configurations — the system achieves 100% extraction accuracy, eliminating a class of data entry errors that could affect underwriting decisions.
Beyond the raw numbers, underwriters shifted from data entry work to higher-value analysis, and the pipeline removed a key scaling constraint as submission volume grew.