Intelligent Document Processing System
Intelligent document processing system using AWS AI services for automated document analysis and information extraction
AWS Intelligent Document Processing System
📋 Project Overview
State medical insurance policy documents contain complex business rules, and processing them manually is both time-consuming and error-prone. Treatment policy documents in particular require an accurate reading of eligibility conditions and service scope, and those rules must then be translated into executable database queries. We developed a multi-Agent intelligent document processing system based on the AWS Nova model series that automates the full path from PDF documents to SQL scripts. The project was also entered in the 2025 AWS Nova Model Application Hackathon, where it demonstrated a practical application of GenAI in the medical insurance field.
🚀 Key Features
Core Implementation
- Multi-Agent Collaborative Architecture: LangGraph orchestrates specialized agents, including the Business Rule Agent, Revisor Agent, Policy Mapping Agent, and SQL Generation Agent
- AWS Nova Model Series Application: Nova Pro, Lite, or Micro is selected according to task complexity to balance cost and performance
- Intelligent Business Rule Extraction: Analyzes medical insurance policy documents in depth and automatically identifies core business rules and eligibility conditions
- Automated SQL Generation: Converts policy rules into SQL query scripts that support complex condition combinations and data associations
- Quality Assurance Mechanism: Multi-round review checks generated results for accuracy and compliance
Technical Highlights
- Layered Model Strategy: Nova Pro handles complex rule extraction, Nova Lite handles review and mapping, and Nova Micro handles document summarization, which keeps model cost in line with task difficulty
- LangGraph Workflow Engine: A state graph drives Agent collaboration and supports retry and error handling
- Medical Domain Semantic Understanding: Prompts are tuned specifically for medical insurance terminology and policy language
- RESTful API Architecture: FastAPI provides modular interfaces for integration with existing medical information systems
💻 Project Detail
Our multi-Agent intelligent document processing system addresses the core challenges of medical insurance policy understanding. The specific implementation process is as follows:
Intelligent Document Parsing:
- Used PyPDF2 to extract content from medical insurance treatment policy PDF documents
- Identified policy clauses and business rule paragraphs through document structure analysis
- Provided structured text input for subsequent Agent processing
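The parsing stage above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `extract_pdf_text` uses PyPDF2's `PdfReader`, while `split_into_clauses` and the `CLAUSE_START` pattern are hypothetical helpers that assume clauses begin with markers such as "Section 3", "Article 12", or a numbered heading like "4.2".

```python
import re
from typing import List


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page in a PDF using PyPDF2."""
    # Imported lazily so the clause splitter below works without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # extract_text() may return an empty value for image-only pages.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


# Assumed clause markers: "Section 3", "Article 12", or numbered headings like "4.2".
CLAUSE_START = re.compile(
    r"^(?:Section|Article)\s+\d+|^\d+(?:\.\d+)+\s", re.IGNORECASE
)


def split_into_clauses(text: str) -> List[str]:
    """Group non-empty lines into clauses, starting a new clause at each marker."""
    clauses: List[List[str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if CLAUSE_START.match(line) or not clauses:
            clauses.append([line])
        else:
            clauses[-1].append(line)
    return [" ".join(parts) for parts in clauses]
```

In a pipeline like the one described, the output of `split_into_clauses(extract_pdf_text("policy.pdf"))` (the file name is a placeholder) would be the structured text handed to the first Agent. Real policy documents will likely need a richer marker pattern than the one assumed here.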
Multi-Agent Business Rule Extraction:
- Business Rule Agent: Deployed AWS Nova Pro model for deep document analysis, extracting core business rules and eligibility conditions
- Revisor Agent: Used AWS Nova Lite model for rule review and quality control, ensuring extraction completeness
- Policy Mapping Agent: Utilized AWS Nova Lite to map abstract policy rules to specific query requirements
- Orchestrated Agent workflows through LangGraph, achieving state management and task transfer
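To make the state management and task transfer concrete, here is a plain-Python stand-in for the extract, review, and map flow. It deliberately does not use the LangGraph API; in LangGraph the same shape would be expressed as `StateGraph` nodes with a conditional edge after the review step. The three node functions are stubs that replace the model calls, and every name here (`PipelineState`, `route_after_review`, `run_pipeline`, the `max_attempts` limit of 3) is an assumption made for illustration.

```python
from typing import Callable, Dict, List, TypedDict


class PipelineState(TypedDict):
    document_text: str
    rules: List[str]
    review_passed: bool
    attempts: int
    queries: List[str]


def business_rule_node(state: PipelineState) -> PipelineState:
    # Stub for the Business Rule Agent: treat each sentence containing "must" as a rule.
    sentences = [s.strip() for s in state["document_text"].split(".") if s.strip()]
    rules = [s for s in sentences if "must" in s.lower()]
    return {**state, "rules": rules, "attempts": state["attempts"] + 1}


def revisor_node(state: PipelineState) -> PipelineState:
    # Stub for the Revisor Agent: pass only if at least one rule was extracted.
    return {**state, "review_passed": len(state["rules"]) > 0}


def policy_mapping_node(state: PipelineState) -> PipelineState:
    # Stub for the Policy Mapping Agent: restate each rule as a query requirement.
    return {**state, "queries": [f"Find patients where: {r}" for r in state["rules"]]}


def route_after_review(state: PipelineState, max_attempts: int = 3) -> str:
    """Pick the next node: continue on success, retry extraction, or stop."""
    if state["review_passed"]:
        return "map"
    return "extract" if state["attempts"] < max_attempts else "end"


NODES: Dict[str, Callable[[PipelineState], PipelineState]] = {
    "extract": business_rule_node,
    "review": revisor_node,
    "map": policy_mapping_node,
}


def run_pipeline(document_text: str) -> PipelineState:
    state: PipelineState = {
        "document_text": document_text, "rules": [],
        "review_passed": False, "attempts": 0, "queries": [],
    }
    current = "extract"
    while current != "end":
        state = NODES[current](state)
        if current == "extract":
            current = "review"
        elif current == "review":
            current = route_after_review(state)
        else:  # "map" is the last node in this sketch
            current = "end"
    return state
```

Each node receives the full state and returns an updated copy, so the routing function only needs to inspect the state to decide where control goes next. Bounding the retries keeps a document with no extractable rules from looping indefinitely.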
Intelligent SQL Script Generation:
- SQL Generation Agent: Used AWS Nova Pro model to generate corresponding SQL query scripts based on business rules
- Supported complex WHERE conditions, JOIN operations, and aggregate queries
- Generated SQL can be directly used for patient database screening and compliance checking
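As an illustration of the kind of script this stage targets, the sketch below runs a hand-written query combining WHERE, JOIN, and aggregate clauses against an in-memory SQLite database. Everything specific is hypothetical: the `patients` and `claims` tables, the example rule, and the `is_read_only_select` guard are assumptions for the example, since the real schema and generated scripts are not shown in this write-up.

```python
import sqlite3

# Hypothetical schema; the real policy tables are not described in this write-up.
SCHEMA = """
CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, age INTEGER, state TEXT);
CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, patient_id INTEGER,
                     service_code TEXT, amount REAL);
"""

# Example of a script an agent might emit for a rule such as
# "members aged 65+ in a given state with at least two claims for a service".
GENERATED_SQL = """
SELECT p.patient_id, COUNT(c.claim_id) AS claim_count, SUM(c.amount) AS total_amount
FROM patients AS p
JOIN claims AS c ON c.patient_id = p.patient_id
WHERE p.age >= :min_age AND p.state = :state AND c.service_code = :service_code
GROUP BY p.patient_id
HAVING COUNT(c.claim_id) >= :min_claims
ORDER BY p.patient_id
"""


def is_read_only_select(sql: str) -> bool:
    """Crude guard: accept only a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";")
    return statement.upper().startswith("SELECT") and ";" not in statement


def screen_patients(conn: sqlite3.Connection, sql: str, params: dict) -> list:
    """Run a generated query with named parameters after the read-only check."""
    if not is_read_only_select(sql):
        raise ValueError("only single SELECT statements are executed")
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                 [(1, 70, "CA"), (2, 40, "CA"), (3, 68, "NY")])
conn.executemany("INSERT INTO claims VALUES (?, ?, ?, ?)",
                 [(10, 1, "HH01", 120.0), (11, 1, "HH01", 80.0),
                  (12, 2, "HH01", 50.0), (13, 3, "HH01", 60.0)])
rows = screen_patients(
    conn, GENERATED_SQL,
    {"min_age": 65, "state": "CA", "service_code": "HH01", "min_claims": 2},
)
print(rows)  # [(1, 2, 200.0)]
```

Passing thresholds as named parameters keeps policy values out of the SQL text. The guard is intentionally strict (it would also reject a valid query that begins with `WITH`), which is a reasonable default when executing model-generated SQL.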
Multi-Model Collaboration Optimization:
- Intelligently selected Nova models based on task complexity: Pro for complex logic, Lite for medium tasks, Micro for simple summaries
- Managed different Agent prompt strategies through Jinja2 template engine
- Implemented automatic retry mechanism based on review results
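The tiered selection described above amounts to a small lookup from task to model. The sketch below is one possible shape for it. The task names, the `Complexity` enum, and the fallback to the mid tier are assumptions, and the model identifiers are the Bedrock IDs as best known at the time of writing, so they should be checked against the Bedrock console for the target region.

```python
from enum import Enum


class Complexity(Enum):
    SIMPLE = 1
    MEDIUM = 2
    COMPLEX = 3


# Assumed Bedrock model identifiers; verify them for your account and region.
MODEL_BY_COMPLEXITY = {
    Complexity.COMPLEX: "amazon.nova-pro-v1:0",
    Complexity.MEDIUM: "amazon.nova-lite-v1:0",
    Complexity.SIMPLE: "amazon.nova-micro-v1:0",
}

# Task-to-tier assignment following the description: Pro for rule extraction and
# SQL generation, Lite for review and mapping, Micro for summaries.
TASK_COMPLEXITY = {
    "rule_extraction": Complexity.COMPLEX,
    "sql_generation": Complexity.COMPLEX,
    "rule_review": Complexity.MEDIUM,
    "policy_mapping": Complexity.MEDIUM,
    "summarization": Complexity.SIMPLE,
}


def select_model(task: str) -> str:
    """Return the model ID for a task, defaulting to the mid tier if unknown."""
    complexity = TASK_COMPLEXITY.get(task, Complexity.MEDIUM)
    return MODEL_BY_COMPLEXITY[complexity]
```

Keeping the mapping in data rather than in branching code makes it cheap to move a task to a different tier once cost and quality measurements are available.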
System Integration Deployment:
- Used FastAPI framework to provide RESTful interfaces, supporting modular calls
- Integrated AWS Bedrock services, ensuring stable supply of enterprise-grade AI capabilities
- Supported API integration with existing medical information systems
📊 Project Impact
Medical Institution Efficiency Enhancement:
- Cut policy analysis work that traditionally took several hours down to minutes
- Generated SQL scripts automatically and with high accuracy, significantly reducing manual errors and compliance risks
- Gave medical institutions reliable technical support for quickly screening patients who meet policy requirements
AI Technology Innovation Application:
- Validated the practicality and cost-effectiveness of the AWS Nova model series in complex business scenarios
- Demonstrated the advantages of a multi-Agent collaborative architecture for document processing in a specialized domain
- Contributed a medical insurance best-practice case to the 2025 AWS Nova Hackathon
Architecture Design Value:
- The layered model selection strategy offers a cost-optimization reference for similar projects
- The LangGraph-based workflow design is scalable and reusable
- The modular API design supports integration with existing enterprise systems
🛠️ Technology Stack
AI & Machine Learning:
- AWS Bedrock Nova Pro/Lite/Micro (Layered Large Language Models)
- LangGraph (Multi-Agent Workflow Orchestration)
- LangChain (Large Model Application Development Framework)
- AWS Bedrock (Enterprise-grade AI Service Platform)
Multi-Agent Architecture:
- Business Rule Agent (Business Rule Extraction)
- Revisor Agent (Quality Review Verification)
- Policy Mapping Agent (Policy Rule Mapping)
- SQL Generation Agent (SQL Script Generation)
- Summary Agent (Document Summarization)
Backend Development:
- FastAPI (High-performance API Framework)
- Django (Web Application Framework)
- Python (Core Development Language)
- Jinja2 (Template Engine)
Document Processing:
- PyPDF2 (PDF Document Parsing)
- Document Analysis (Document Structure Recognition)
- Text Extraction (Text Content Extraction)
Cloud Infrastructure:
- AWS Bedrock (Managed AI Service)
- AWS IAM (Identity and Access Management)
- SQLite (Development Environment Database)
Data Processing:
- SQL Query Generation (Dynamic Query Generation)
- Medical Policy Analysis (Medical Policy Semantic Understanding)
- Rule-to-Query Mapping (Rule to Query Conversion)
This project demonstrates how a multi-Agent collaborative architecture and tiered model selection can be applied to medical document processing, and it offers a technical reference for digital transformation in the medical insurance industry.