System Architecture
Overview
TAXIA is a modular Graph-RAG system designed for Korean tax law question-answering with mandatory legal citations.
┌─────────────────────────────────────────────────────────────┐
│                     TAXIA Architecture                      │
└─────────────────────────────────────────────────────────────┘

┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│   CLI Tool   │   │   REST API   │   │  Python SDK  │
│              │   │  (FastAPI)   │   │              │
└──────┬───────┘   └──────┬───────┘   └──────┬───────┘
       │                  │                  │
       └──────────────────┴──────────────────┘
                          │
                  ┌───────▼──────┐
                  │ TaxiaEngine  │
                  │    (Core)    │
                  └───────┬──────┘
                          │
        ┌─────────────────┼─────────────────┐
        │                 │                 │
 ┌──────▼──────┐ ┌────────▼────────┐ ┌──────▼──────┐
 │   Qdrant    │ │      Neo4j      │ │  LLM (API)  │
 │   Vector    │ │    Graph-RAG    │ │   Claude/   │
 │   Search    │ │  Relationships  │ │    GPT-4    │
 └─────────────┘ └─────────────────┘ └─────────────┘
Core Components
1. TaxiaEngine
The main orchestration layer. It coordinates:
- Query processing
- Document retrieval (vector + graph)
- LLM generation
- Citation validation
- Audit trail management
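The orchestration flow can be sketched as a minimal skeleton. Note that `retriever`, `llm`, `min_citations`, and the returned field names are illustrative assumptions for this sketch, not the actual TaxiaEngine API:

```python
import uuid

class TaxiaEngine:
    """Illustrative orchestration skeleton -- not the real implementation."""

    def __init__(self, retriever, llm, min_citations=2):
        self.retriever = retriever          # vector + graph retrieval
        self.llm = llm                      # LLM client (Claude/GPT)
        self.min_citations = min_citations  # citation policy

    def answer(self, query: str) -> dict:
        trace_id = str(uuid.uuid4())                 # audit trail
        docs = self.retriever(query)                 # document retrieval
        response, citations = self.llm(query, docs)  # LLM generation
        if len(citations) < self.min_citations:      # citation validation
            raise ValueError(f"answer must cite >= {self.min_citations} legal sources")
        return {"answer": response, "citations": citations, "trace_id": trace_id}
```

The key design point is that the engine owns the policy (citations, tracing) while retrieval and generation are injected dependencies.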
2. Vector Search (Qdrant)
High-performance vector similarity search:
- Embedding generation (OpenAI ada-002)
- Fast nearest-neighbor search
- Collection management
- Metadata filtering
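Conceptually, nearest-neighbor retrieval ranks documents by cosine similarity between the query embedding and each document embedding; Qdrant does this at scale with approximate indexes. A toy stdlib-only sketch of the ranking itself:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, docs, k=2):
    """docs: list of (doc_id, embedding). Returns the k nearest doc ids."""
    scored = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```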
3. Graph-RAG (Neo4j)
Graph database for legal document relationships:
- Law → Enforcement Decree relationships
- Enforcement Decree → Enforcement Rules
- Cross-references between articles
- Temporal version tracking
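The Law → Decree → Rules hierarchy amounts to a reachability query over a directed graph. A miniature stdlib sketch (the article identifiers and the Cypher comment use a hypothetical schema, not the project's actual one):

```python
# Hypothetical miniature of the Law -> Decree -> Rules hierarchy.
GRAPH = {
    "소득세법 제12조": ["소득세법 시행령 제8조"],           # Law -> Enforcement Decree
    "소득세법 시행령 제8조": ["소득세법 시행규칙 제5조"],   # Decree -> Enforcement Rules
}

def related_documents(article, graph=GRAPH):
    """Collect everything reachable from an article (breadth-first)."""
    seen, queue = [], [article]
    while queue:
        node = queue.pop(0)
        for child in graph.get(node, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen

# In Neo4j the same traversal would be a Cypher query, e.g. (schema assumed):
# MATCH (l:Law {id: $id})-[:DELEGATES_TO*1..2]->(d) RETURN d
```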
4. LLM Integration
Support for multiple LLM providers:
- Anthropic Claude (Recommended)
  - claude-3-5-sonnet-20241022
  - Best for Korean legal text
- OpenAI GPT
  - gpt-4-turbo-preview
  - Alternative option
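A small sketch of how provider selection might be configured (the table structure and `resolve_model` helper are illustrative; only the model names come from the list above):

```python
# Hypothetical provider table; model names are from the list above.
PROVIDERS = {
    "anthropic": {"model": "claude-3-5-sonnet-20241022"},
    "openai": {"model": "gpt-4-turbo-preview"},
}

def resolve_model(provider: str) -> str:
    """Map a provider name to its configured model, failing loudly on typos."""
    try:
        return PROVIDERS[provider]["model"]
    except KeyError:
        raise ValueError(f"unknown LLM provider: {provider!r}") from None
```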
Data Flow
Query Processing
1. User Query
└─> TaxiaEngine.answer()
2. Vector Search
└─> Qdrant retrieval (top_k documents)
3. Graph Enhancement (optional)
└─> Neo4j related documents
4. Context Assembly
└─> Combine retrieved documents
5. LLM Generation
└─> Claude/GPT with context
└─> Mandatory citation requirement
6. Citation Validation
└─> Verify ≥2 legal sources
7. Response
└─> Answer + Citations + Trace ID
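The final response shape (step 7) can be modeled as a small record type that also enforces the citation rule from step 6. Field names here are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class TaxiaResponse:
    """Illustrative shape of step 7's output -- field names are assumptions."""
    answer: str
    citations: list  # >= 2 legal sources (step 6)
    trace_id: str    # audit trail identifier

    def __post_init__(self):
        if len(self.citations) < 2:
            raise ValueError("citation validation failed: need >= 2 legal sources")
```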
Document Indexing
1. Load Tax Law JSON
└─> Parse law/decree/rules
2. Generate Embeddings
└─> OpenAI ada-002
3. Store in Qdrant
└─> Vector + Metadata
4. Build Graph (optional)
└─> Create Neo4j relationships
└─> Law → Decree → Rules
└─> Cross-references
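Steps 1–3 of the indexing pipeline can be sketched end to end. The JSON field names (`id`, `text`) and the record layout are assumptions, and a hash-based stand-in replaces the real ada-002 embedding call:

```python
import hashlib
import json

def fake_embed(text: str, dim: int = 4):
    """Stand-in for OpenAI ada-002: deterministic pseudo-embedding from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]

def index_documents(raw_json: str):
    """Steps 1-3 above: parse articles, embed, emit vector + metadata records."""
    articles = json.loads(raw_json)
    return [
        {"id": a["id"], "vector": fake_embed(a["text"]), "payload": {"text": a["text"]}}
        for a in articles
    ]
```

Each emitted record mirrors what a vector store expects: an id, a vector, and a metadata payload for filtering.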
Module Structure
src/taxia/
├── __init__.py # Public API
├── engine.py # TaxiaEngine core
├── types.py # Type definitions
├── config.py # Configuration
│
├── retrieval/
│ ├── vector.py # Qdrant integration
│ ├── graph.py # Neo4j integration
│ └── embeddings.py # Embedding generation
│
├── llm/
│ ├── anthropic.py # Claude integration
│ ├── openai.py # OpenAI integration
│ └── prompts.py # Prompt templates
│
├── indexing/
│ ├── loader.py # Document loading
│ ├── parser.py # JSON parsing
│ └── indexer.py # Indexing orchestration
│
├── api/
│ ├── server.py # FastAPI server
│ └── routes.py # API endpoints
│
└── cli/
├── main.py # CLI entry point
└── commands.py # CLI commands
Design Principles
1. Citations Required
Every answer MUST include ≥2 legal citations, drawn from:
- Law articles
- Enforcement decrees
- Enforcement rules
- Interpretations
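One way such a rule might be enforced is a post-generation check. The citation pattern below is a rough hypothetical for Korean statute references (e.g. "소득세법 제12조"), not the project's actual validator:

```python
import re

# Hypothetical pattern for Korean statute citations, e.g. "소득세법 제12조".
CITATION_RE = re.compile(r"\S+법(?:\s시행령|\s시행규칙)?\s제\d+조")

def validate_citations(answer: str, minimum: int = 2) -> bool:
    """Enforce the >= 2 distinct citation rule on a generated answer."""
    return len(set(CITATION_RE.findall(answer))) >= minimum
```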
2. Audit Trail
Complete traceability:
- Unique trace ID per query
- Full context logging
- Citation provenance
- Timestamp tracking
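A minimal sketch of such an audit record, combining all four properties (field names are illustrative):

```python
import uuid
from datetime import datetime, timezone

def make_trace(query: str, citations: list) -> dict:
    """Sketch of an audit record; field names are illustrative."""
    return {
        "trace_id": str(uuid.uuid4()),                        # unique per query
        "timestamp": datetime.now(timezone.utc).isoformat(),  # timestamp tracking
        "query": query,                                       # full context logging
        "citations": citations,                               # citation provenance
    }
```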
3. Modular Architecture
Pluggable components:
- LLM provider swap (Claude ↔ GPT)
- Vector DB swap (Qdrant → alternatives)
- Graph DB optional (Neo4j)
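Provider swapping typically hinges on a shared interface. A sketch with a `typing.Protocol` (class and method names are assumptions; the stubs stand in for real API calls):

```python
from typing import Protocol

class LLMProvider(Protocol):
    """Minimal interface a swappable provider would satisfy (illustrative)."""
    def generate(self, prompt: str) -> str: ...

class ClaudeProvider:
    def generate(self, prompt: str) -> str:
        return f"[claude] {prompt}"   # real code would call the Anthropic API

class GPTProvider:
    def generate(self, prompt: str) -> str:
        return f"[gpt] {prompt}"      # real code would call the OpenAI API

def run(provider: LLMProvider, prompt: str) -> str:
    return provider.generate(prompt)  # engine depends only on the interface
```

Because the engine depends only on the protocol, swapping Claude for GPT (or Qdrant for another vector store, by the same pattern) is a construction-time choice.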
4. Performance
Optimized for production:
- Async I/O for API calls
- Batch indexing support
- Connection pooling
- Caching strategies
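The async I/O point is worth a sketch: independent backend calls can be issued concurrently rather than serially. Here `asyncio.sleep` stands in for network round trips:

```python
import asyncio

async def call_api(name: str, delay: float) -> str:
    await asyncio.sleep(delay)        # stand-in for a network round trip
    return name

async def fan_out():
    """Issue the vector, graph, and LLM calls concurrently instead of serially."""
    return await asyncio.gather(
        call_api("qdrant", 0.01),
        call_api("neo4j", 0.01),
        call_api("llm", 0.01),
    )

results = asyncio.run(fan_out())
```

With `gather`, total latency approaches the slowest call instead of the sum of all three.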
Deployment Options
Development
# Local with demo data
pip install taxia-core
python -c "from taxia import TaxiaEngine; engine = TaxiaEngine()"
Production
# With full infrastructure
docker-compose up -d # Qdrant + Neo4j
taxia index ./koreantaxlaw
taxia server --host 0.0.0.0 --port 8000
Cloud Deployment
Compatible with:
- AWS (ECS, Lambda)
- GCP (Cloud Run, GKE)
- Azure (Container Instances, AKS)
Configuration
See Configuration Guide for details.
API Reference
Performance Considerations
Vector Search
- Collection size: up to 1M documents
- Query latency: < 50ms (p95)
- Embedding batch size: 100 docs
Graph-RAG
- Node count: ~10K (tax laws)
- Relationship count: ~50K
- Cypher query latency: < 100ms
LLM Generation
- Claude Sonnet: ~2-3s per query
- GPT-4 Turbo: ~3-4s per query
- Token usage: ~2-3K tokens/query
Security
- API key management via environment variables
- Rate limiting on API endpoints
- Input validation and sanitization
- Audit logging for compliance
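For key management, a fail-fast loader keeps secrets out of code. The variable name `ANTHROPIC_API_KEY` follows the usual provider convention but is an assumption here:

```python
import os

def load_api_key(var: str = "ANTHROPIC_API_KEY") -> str:
    """Read a provider key from the environment; fail fast if missing."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before starting TAXIA")
    return key
```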