From single-stage vision analysis to hybrid RAG + two-stage OCR.
Built for Ghana's 3G networks. Powered by Claude Sonnet 4.6.
Upload an exercise book photo and get AI-powered gap analysis + remediation exercises.
Four phases of continuous improvement: from infrastructure hardening to multi-country scaling.
Each phase measured, documented, and battle-tested in production.
December 2025
Eliminated session corruption, duplicate processing, and message loss in the SQS-backed worker pipeline. Introduced exception hierarchy, idempotency ledger, and safe requeue ordering.
    # Idempotency guard using PostgreSQL INSERT ... ON CONFLICT
    from sqlalchemy.dialects.postgresql import insert as pg_insert

    stmt = pg_insert(ProcessingLedger).values(
        sqs_message_id=task.message_id,
        task_type=task.task_type,
        student_id=payload.get("student_id"),
    ).on_conflict_do_nothing(constraint="uq_ledger_msg_task")

    result = await db.execute(stmt)
    await db.commit()

    if result.rowcount == 0:
        # Duplicate message - skip processing
        logger.warning("duplicate_message_skipped")
        return
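The exception hierarchy and requeue ordering are not shown on this page, so the following is only a minimal sketch of how they might fit together. Every name here (`WorkerError`, `RetryableError`, `PermanentError`, `InMemoryQueue`, `handle`) is hypothetical, and the in-memory queue stands in for SQS so the example runs without AWS. The assumed meaning of "safe requeue ordering" is that the retry copy is enqueued before the original is deleted, so a crash between the two steps produces a duplicate (which an idempotency ledger can absorb) rather than a lost message.

```python
from dataclasses import dataclass, field


class WorkerError(Exception):
    """Base class for all worker failures (hypothetical name)."""


class RetryableError(WorkerError):
    """Transient failure, e.g. a timeout; the message should be retried."""


class PermanentError(WorkerError):
    """Unrecoverable failure, e.g. a malformed payload; retrying cannot help."""


@dataclass
class InMemoryQueue:
    """Stand-in for an SQS queue so the sketch runs without AWS."""
    messages: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)

    def send(self, body):
        self.messages.append(body)

    def delete(self, body):
        # Removes the oldest matching entry, i.e. the original delivery.
        self.messages.remove(body)


def handle(queue, body, process):
    """Process one message and return the action that was taken."""
    try:
        process(body)
    except RetryableError:
        # Enqueue the retry copy BEFORE deleting the original: a crash
        # between the two steps leaves a duplicate, never a lost message.
        queue.send(body)
        queue.delete(body)
        return "requeued"
    except PermanentError:
        # Park the message for inspection instead of retrying forever.
        queue.dead_letters.append(body)
        queue.delete(body)
        return "dead_lettered"
    queue.delete(body)
    return "done"
```

Splitting failures into retryable and permanent classes lets the worker decide in one place whether a message goes back on the queue or is set aside.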
January 2026
Replaced brute-force curriculum dumping with semantic search + prerequisite graph traversal. pgvector cosine similarity (top-k=15) selects the closest curriculum nodes, and a recursive CTE (depth=2) adds their prerequisites, so only relevant nodes are injected.
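The two retrieval steps can be illustrated in plain Python. This is a sketch of the logic only, not the production queries: in PostgreSQL the ranking would typically use pgvector's `<=>` cosine-distance operator and the traversal a `WITH RECURSIVE` query. The function names and the shape of the `embeddings` and `prerequisites` dictionaries are assumptions made for illustration.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_nodes(query_vec, embeddings, k=15):
    """Return the k node ids whose embeddings are most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda node: cosine_similarity(query_vec, embeddings[node]),
        reverse=True,
    )
    return ranked[:k]


def expand_prerequisites(seeds, prerequisites, max_depth=2):
    """Breadth-first walk over prerequisite edges, at most max_depth hops."""
    seen = set(seeds)
    queue = deque((node, 0) for node in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # Do not follow edges beyond the depth limit.
        for prereq in prerequisites.get(node, []):
            if prereq not in seen:
                seen.add(prereq)
                queue.append((prereq, depth + 1))
    return seen


# Toy data (hypothetical node ids and 2-dimensional embeddings).
embeddings = {
    "add_fractions": [1.0, 0.0],
    "equivalent_fractions": [0.8, 0.6],
    "area": [0.0, 1.0],
}
prerequisites = {
    "add_fractions": ["equivalent_fractions"],
    "equivalent_fractions": ["multiplication"],
    "multiplication": ["counting"],
}

seeds = top_k_nodes([1.0, 0.0], embeddings, k=1)
context_nodes = expand_prerequisites(seeds, prerequisites, max_depth=2)
```

With the depth limit at 2, `counting` (three hops from the seed) is left out, which is what keeps the injected context small.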
February 2026
Separated OCR from diagnosis for clarity. Stage 1 (TRANSCRIPTION-001) extracts structured JSON from handwriting. Stage 2 (ANALYSIS-001) diagnoses gaps from the clean text, with the image as a fallback. Temperature 0.1 keeps OCR output near-deterministic.
    # TRANSCRIPTION-001: Extract structured JSON from handwriting
    response = await ai_client.generate(
        prompt_id="TRANSCRIPTION-001",
        model="claude-sonnet-4-6",
        temperature=0.1,  # Low temperature for near-deterministic OCR
        max_tokens=2048,
        images=[exercise_book_image],
        json_mode=True,
    )

    transcript = response["transcription_result"]
    # Output: {
    #   "questions": [
    #     {"question_number": "1", "question_text": "Add 1/3 + 1/4",
    #      "student_work": "1/3 + 1/4 = 2/7", "teacher_mark": "✗"}
    #   ],
    #   "overall_legibility": "mostly_legible"
    # }
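Because Stage 2 depends on the Stage 1 JSON, it can help to validate the transcript before passing it on. The sketch below checks only the fields visible in the sample output above; the names `Question`, `Transcript`, and `parse_transcript` are hypothetical, and the real schema may contain more fields than are shown here.

```python
import json
from dataclasses import dataclass

REQUIRED_QUESTION_KEYS = (
    "question_number",
    "question_text",
    "student_work",
    "teacher_mark",
)


@dataclass(frozen=True)
class Question:
    question_number: str
    question_text: str
    student_work: str
    teacher_mark: str


@dataclass(frozen=True)
class Transcript:
    questions: list
    overall_legibility: str


def parse_transcript(raw):
    """Validate a Stage 1 result (JSON string or dict) and return a Transcript.

    Raises ValueError when the payload is not valid JSON or a required
    field is missing.
    """
    data = json.loads(raw) if isinstance(raw, str) else raw
    if not isinstance(data, dict):
        raise ValueError("transcript must be a JSON object")
    for key in ("questions", "overall_legibility"):
        if key not in data:
            raise ValueError(f"transcript is missing '{key}'")

    questions = []
    for index, item in enumerate(data["questions"]):
        missing = [k for k in REQUIRED_QUESTION_KEYS if k not in item]
        if missing:
            raise ValueError(f"question {index} is missing {missing}")
        questions.append(
            Question(**{k: str(item[k]) for k in REQUIRED_QUESTION_KEYS})
        )
    return Transcript(
        questions=questions,
        overall_legibility=str(data["overall_legibility"]),
    )
```

Failing fast on a malformed transcript keeps OCR errors from surfacing later as confusing diagnosis results.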
March 2026
Unified grade representations across Ghana, Uganda, Kenya, Nigeria. Canonical B1-B9 format with adjacent-grade filtering (radius=1) for vector search. SQS heartbeat prevents timeout redelivery. Partner config from YAML.
    # Multi-country grade normalization
    GRADE_MAPS = {
        "ghana": {
            "jhs1": "B7", "jhs2": "B8", "jhs3": "B9",
            "primary 4": "B4", "basic 5": "B5",
            # ...
        },
        "uganda": {
            "s1": "B7", "s2": "B8", "s3": "B9",
            "p4": "B4", "p5": "B5",
            # ...
        },
        # Kenya, Nigeria...
    }

    # Not defined in the original snippet; assumed here to be the canonical
    # B1-B9 order for every country.
    GRADE_SEQUENCES = {
        country: [f"B{n}" for n in range(1, 10)] for country in GRADE_MAPS
    }

    def adjacent_grades(grade: str, country: str, radius: int = 1) -> list[str]:
        """Return canonical grades within ±radius.

        Example:
            adjacent_grades("B5", "ghana", radius=1) -> ["B4", "B5", "B6"]
        """
        sequence = GRADE_SEQUENCES[country]
        idx = sequence.index(grade)
        # Clamp the window so B1 and B9 do not run off the ends.
        start = max(0, idx - radius)
        end = min(len(sequence), idx + radius + 1)
        return sequence[start:end]
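The SQS heartbeat mentioned above can be sketched with `asyncio`: while a task runs, a background coroutine periodically extends the message's visibility timeout so SQS does not redeliver it mid-processing. This is an illustrative sketch, not the project's code. `heartbeat`, `process_with_heartbeat`, and the `extend_visibility` callable are hypothetical names; in a real worker that callable would presumably wrap the SQS `ChangeMessageVisibility` operation for the message's receipt handle, and the interval and timeout values shown are placeholders.

```python
import asyncio
import contextlib


async def heartbeat(extend_visibility, interval_s, timeout_s):
    """Extend the visibility timeout every interval_s seconds until cancelled."""
    while True:
        await asyncio.sleep(interval_s)
        await extend_visibility(timeout_s)


async def process_with_heartbeat(work, extend_visibility,
                                 interval_s=20.0, timeout_s=60):
    """Run work() while a background heartbeat keeps the message invisible."""
    beat = asyncio.create_task(
        heartbeat(extend_visibility, interval_s, timeout_s)
    )
    try:
        return await work()
    finally:
        # Stop the heartbeat whether the work succeeded or failed.
        beat.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await beat


async def _demo():
    extensions = []

    async def fake_extend(timeout_s):
        # Stand-in for the real SQS call; records each extension.
        extensions.append(timeout_s)

    async def slow_work():
        await asyncio.sleep(0.05)
        return "ok"

    result = await process_with_heartbeat(
        slow_work, fake_extend, interval_s=0.01, timeout_s=60
    )
    return result, extensions


demo_result, demo_extensions = asyncio.run(_demo())
```

The interval should be comfortably shorter than the visibility timeout so that a single delayed heartbeat does not let the message become visible again.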
7-step pipeline orchestrating AI analysis, hybrid RAG retrieval, and multi-country normalization.
Every step measured, logged, and optimized for Ghana's 3G networks.
Real performance data from production deployment in Ghana, Uganda, Kenya, and Nigeria.
Comprehensive technical specs, architecture docs, and deployment guides.
Complete system architecture with diagrams, data models, and design decisions.
Read Architecture →
Detailed design docs for all 4 phases: requirements, implementation, and testing.
Browse Specs →
AWS ECS Fargate setup, environment variables, and production deployment procedures.
Deploy to AWS →
Full source code on GitHub: Python backend, FastAPI, PostgreSQL, pgvector, SQS workers.
View on GitHub →
Try GapSense live: upload an exercise book photo and get real-time gap analysis.
Try Live Demo →
FastAPI endpoints, request/response schemas, authentication, and rate limits.
API Docs →