Phase 2 focuses on the document ingestion pipeline, improving capabilities such as multi-format parsing, OCR, and table extraction.
Phase 3 optimizes retrieval and embedding, implementing advanced features like hybrid retrieval (semantic + keyword), re-ranking, and query rewriting.
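
As an illustration of the hybrid retrieval step, the sketch below merges a semantic ranking and a keyword ranking with reciprocal rank fusion (RRF). RRF is one common fusion method and is assumed here only for illustration; the roadmap does not say which method will be used. The function name, the document IDs, and the constant `k=60` are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked highly by several retrievers accumulate the most score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword (e.g. BM25) search.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

A re-ranking stage, if added, would typically take the top entries of `fused` as its candidate set.
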
Phase 4 enhances answer generation, introducing interactive capabilities such as reference verification, multi-turn dialogue, and follow-up clarification.
Phase 5 completes the AWS cloud-native deployment. The target architecture includes:
- Network Layer: VPC divided into public and private subnets; FastAPI services deployed in private subnets
- Storage Layer: RDS PostgreSQL managed database; S3 private buckets for storing original documents
- Compute Layer: EC2 GPU instances running Ollama or vLLM, or alternatively Amazon Bedrock accessed through private endpoints
- Security Layer: Secrets Manager for credential management; Security Groups and IAM roles for access control
- Monitoring Layer: CloudWatch for log and metric collection
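
The network layer split can be sketched with Python's standard `ipaddress` module. This is a minimal planning aid, not the project's deployment code: the VPC CIDR `10.0.0.0/16`, the `/24` subnet size, the subnet counts, and the helper names are all assumptions chosen for illustration.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix=24, public_count=2, private_count=2):
    """Carve a VPC CIDR block into public and private subnets (hypothetical layout)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    needed = public_count + private_count
    if len(blocks) < needed:
        raise ValueError(f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets")
    return {
        "public": blocks[:public_count],
        "private": blocks[public_count:needed],
    }


def in_private_subnet(ip, plan):
    """Return True if the address falls inside one of the planned private subnets."""
    address = ipaddress.ip_address(ip)
    return any(address in subnet for subnet in plan["private"])


plan = plan_subnets("10.0.0.0/16")
for tier, subnets in plan.items():
    print(tier, [str(subnet) for subnet in subnets])

# A FastAPI service address should land in a private subnet, per the target architecture.
print(in_private_subnet("10.0.2.15", plan))
```

A check like `in_private_subnet` could serve as a pre-deployment sanity test that service addresses are not placed in a public subnet.
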