Engine API Reference

TaxiaEngine

Main interface for the TAXIA system.

Constructor

from taxia import TaxiaEngine, TaxiaConfig

engine = TaxiaEngine(config: TaxiaConfig = None)

Parameters:
- config (TaxiaConfig, optional): Configuration object. Uses defaults if not provided.

Example:

# Default configuration
engine = TaxiaEngine()

# Custom configuration
config = TaxiaConfig(
    llm_provider="anthropic",
    enable_graph_rag=True
)
engine = TaxiaEngine(config=config)

Methods

answer()

Ask a tax law question and get an answer with citations.

result = engine.answer(
    query: str,
    top_k: int = 5,
    include_trace: bool = True
) -> QueryResult

Parameters:
- query (str): The question to ask
- top_k (int, optional): Number of documents to retrieve. Default: 5
- include_trace (bool, optional): Include audit trail. Default: True

Returns:
- QueryResult: Object containing answer, citations, and trace ID

Example:

result = engine.answer("What is the corporate tax filing deadline?")

print(result.answer)
# "The corporate tax filing deadline is within 3 months from the end of the fiscal year."

print(result.citations)
# ["Corporate Tax Act Article 60", "Enforcement Decree Article 132"]

print(result.trace_id)
# "trace-2025-01-23-abc123"

index_documents()

Index tax law documents for retrieval.

engine.index_documents(
    data_dir: str,
    collection_name: str = "taxia_documents",
    batch_size: int = 100
) -> IndexResult

Parameters:
- data_dir (str): Path to directory containing JSON files
- collection_name (str, optional): Qdrant collection name. Default: "taxia_documents"
- batch_size (int, optional): Batch size for indexing. Default: 100

Returns:
- IndexResult: Object with indexing statistics

Example:

result = engine.index_documents("./koreantaxlaw/2025")

print(f"Indexed {result.document_count} documents")
print(f"Time: {result.elapsed_time:.2f}s")
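
The batch_size parameter suggests that documents are sent to the index in fixed-size groups. This reference does not show how TAXIA does that internally, so the following is only a minimal sketch of the general batching idea. The helper name iter_batches and the file names are hypothetical and not part of the taxia package.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical file names standing in for the JSON files in data_dir.
files = [f"doc_{i}.json" for i in range(250)]
sizes = [len(batch) for batch in iter_batches(files, batch_size=100)]
print(sizes)
```

A smaller batch_size generally means more round trips with less memory per request; the right value depends on document size and on the Qdrant deployment.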

search()

Search for relevant documents without LLM generation.

docs = engine.search(
    query: str,
    top_k: int = 5,
    filters: dict = None
) -> List[Document]

Parameters:
- query (str): Search query
- top_k (int, optional): Number of results. Default: 5
- filters (dict, optional): Metadata filters

Returns:
- List[Document]: List of relevant documents

Example:

docs = engine.search(
    query="corporate tax rate",
    top_k=3,
    filters={"year": 2025}
)

for doc in docs:
    print(f"{doc.title}: {doc.score:.3f}")

health_check()

Check system health and dependencies.

status = engine.health_check() -> HealthStatus

Returns:
- HealthStatus: System health information

Example:

status = engine.health_check()

print(f"Qdrant: {status.qdrant}")      # "healthy"
print(f"Neo4j: {status.neo4j}")        # "healthy" or "disabled"
print(f"LLM: {status.llm}")            # "configured"

Data Classes

QueryResult

Result from answer() method.

Attributes:
- answer (str): Generated answer
- citations (List[str]): Legal citations (≥2)
- trace_id (str): Unique trace identifier
- confidence (float): Confidence score (0-1)
- retrieved_docs (List[Document]): Source documents
- timestamp (datetime): Query timestamp

Example:

result = engine.answer("What is the dividend income tax rate?")

print(f"Answer: {result.answer}")
print(f"Confidence: {result.confidence:.2%}")
print(f"Trace: {result.trace_id}")
print(f"Time: {result.timestamp}")
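
Because a result carries both a confidence score and a citation list, callers may want a simple gate before showing an answer to users. The sketch below uses a stand-in dataclass with a subset of the documented fields rather than the real taxia class. The needs_review helper and the 0.7 threshold are assumptions made for illustration; only the two-citation minimum comes from the attribute list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryResult:
    """Stand-in with a subset of the documented fields; not imported from taxia."""
    answer: str
    citations: List[str]
    trace_id: str
    confidence: float


def needs_review(result: QueryResult,
                 min_confidence: float = 0.7,
                 min_citations: int = 2) -> bool:
    """Flag answers that are weakly supported or under-cited."""
    too_uncertain = result.confidence < min_confidence
    too_few_citations = len(result.citations) < min_citations
    return too_uncertain or too_few_citations


sample = QueryResult(
    answer="Example answer",
    citations=["Corporate Tax Act Article 60"],
    trace_id="trace-example",
    confidence=0.92,
)
print(needs_review(sample))  # Only one citation, so this is flagged.
```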

IndexResult

Result from index_documents() method.

Attributes:
- document_count (int): Number of documents indexed
- elapsed_time (float): Time in seconds
- collection_name (str): Qdrant collection name
- status (str): "success" or "partial"
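
These fields are enough to derive a throughput figure for logging. A minimal sketch, using a stand-in dataclass that mirrors the documented attributes; the summarize helper is hypothetical and not part of taxia.

```python
from dataclasses import dataclass


@dataclass
class IndexResult:
    """Stand-in mirroring the documented attributes; not imported from taxia."""
    document_count: int
    elapsed_time: float
    collection_name: str
    status: str


def summarize(result: IndexResult) -> str:
    """Build a one-line log message, guarding against a zero elapsed time."""
    rate = result.document_count / result.elapsed_time if result.elapsed_time > 0 else 0.0
    return (
        f"{result.collection_name}: {result.document_count} documents "
        f"in {result.elapsed_time:.2f}s ({rate:.1f} docs/s, status={result.status})"
    )


print(summarize(IndexResult(200, 4.0, "taxia_documents", "success")))
```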

Document

Retrieved document information.

Attributes:
- id (str): Document ID
- title (str): Document title
- content (str): Document content
- metadata (dict): Additional metadata
- score (float): Relevance score
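
Since each document carries a relevance score, results can be post-filtered on the client side. The sketch below works on a stand-in dataclass, not the real taxia class. The top_hits helper and the 0.5 cutoff are hypothetical, and this reference does not state the score range, so any threshold should be calibrated against real results.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Stand-in mirroring the documented attributes; not imported from taxia."""
    id: str
    title: str
    content: str
    metadata: Dict[str, object] = field(default_factory=dict)
    score: float = 0.0


def top_hits(docs: List[Document], min_score: float = 0.5) -> List[Document]:
    """Keep documents at or above min_score, highest score first."""
    kept = [doc for doc in docs if doc.score >= min_score]
    return sorted(kept, key=lambda doc: doc.score, reverse=True)


docs = [
    Document("a", "Title A", "...", score=0.42),
    Document("b", "Title B", "...", score=0.91),
    Document("c", "Title C", "...", score=0.67),
]
print([doc.id for doc in top_hits(docs)])
```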

HealthStatus

System health information.

Attributes:
- qdrant (str): "healthy", "unhealthy", or "disabled"
- neo4j (str): "healthy", "unhealthy", or "disabled"
- llm (str): "configured" or "not_configured"
- version (str): TAXIA version
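
A startup check can reduce these string fields to a single yes/no answer. The policy below is an assumption, not documented TAXIA behavior: Qdrant must be "healthy", Neo4j may be "healthy" or "disabled" (Graph-RAG is off by default), and the LLM must be "configured". The dataclass is a stand-in and is_ready is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class HealthStatus:
    """Stand-in mirroring the documented attributes; not imported from taxia."""
    qdrant: str
    neo4j: str
    llm: str
    version: str


def is_ready(status: HealthStatus) -> bool:
    """Apply the assumed readiness policy described above."""
    vector_ok = status.qdrant == "healthy"
    graph_ok = status.neo4j in ("healthy", "disabled")
    llm_ok = status.llm == "configured"
    return vector_ok and graph_ok and llm_ok


status = HealthStatus(qdrant="healthy", neo4j="disabled", llm="configured", version="0.0.0")
print(is_ready(status))
```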

Configuration

TaxiaConfig

Configuration object for TaxiaEngine.

from taxia import TaxiaConfig

config = TaxiaConfig(
    # LLM settings
    llm_provider: str = "anthropic",
    llm_model: str = "claude-3-5-sonnet-20241022",
    llm_temperature: float = 0.1,
    llm_max_tokens: int = 4000,

    # Vector search
    qdrant_host: str = "localhost",
    qdrant_port: int = 6333,
    qdrant_collection: str = "taxia_documents",
    top_k: int = 5,

    # Graph-RAG
    enable_graph_rag: bool = False,
    neo4j_uri: str = "bolt://localhost:7687",
    neo4j_user: str = "neo4j",
    neo4j_password: str = None,

    # Logging
    log_level: str = "INFO",
    enable_audit_trail: bool = True,
)

See Configuration Guide for details.
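
Values such as neo4j_password are usually better kept out of source code. One option is to collect keyword arguments from environment variables and pass them on as TaxiaConfig(**kwargs). In this minimal sketch the variable names (TAXIA_QDRANT_HOST and so on) and the helper config_kwargs_from_env are hypothetical; this reference does not say whether TAXIA reads any environment variables itself.

```python
import os
from typing import Any, Dict, Mapping, Optional


def config_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect TaxiaConfig keyword arguments from (hypothetical) env variables."""
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}
    if "TAXIA_QDRANT_HOST" in env:
        kwargs["qdrant_host"] = env["TAXIA_QDRANT_HOST"]
    if "TAXIA_QDRANT_PORT" in env:
        kwargs["qdrant_port"] = int(env["TAXIA_QDRANT_PORT"])
    if "TAXIA_ENABLE_GRAPH_RAG" in env:
        flag = env["TAXIA_ENABLE_GRAPH_RAG"].strip().lower()
        kwargs["enable_graph_rag"] = flag in ("1", "true", "yes")
    if "TAXIA_NEO4J_PASSWORD" in env:
        kwargs["neo4j_password"] = env["TAXIA_NEO4J_PASSWORD"]
    return kwargs


# Unset variables are simply omitted, so TaxiaConfig keeps its own defaults.
example = config_kwargs_from_env(
    {"TAXIA_QDRANT_PORT": "6334", "TAXIA_ENABLE_GRAPH_RAG": "true"}
)
print(example)
```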

Error Handling

Exceptions

from taxia.exceptions import (
    TaxiaError,           # Base exception
    ConfigError,          # Configuration error
    VectorSearchError,    # Qdrant error
    GraphError,           # Neo4j error
    LLMError,            # LLM API error
    CitationError,       # Insufficient citations
)

Example:

from taxia import TaxiaEngine
from taxia.exceptions import ConfigError, LLMError, TaxiaError

try:
    engine = TaxiaEngine()
    result = engine.answer("What is the corporate tax filing deadline?")
except ConfigError as e:
    print(f"Configuration error: {e}")
except LLMError as e:
    print(f"LLM API error: {e}")
except TaxiaError as e:
    print(f"General error: {e}")
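
The order of the except clauses matters: the specific exceptions must come before the base class, or the base handler would catch everything first. The sketch below shows this with stand-in classes. It assumes that ConfigError, LLMError, and VectorSearchError all subclass TaxiaError, which the "Base exception" comment suggests but this reference does not spell out. The classify helper is hypothetical.

```python
class TaxiaError(Exception):
    """Stand-in for the documented base exception."""


class ConfigError(TaxiaError):
    """Stand-in; assumed to subclass TaxiaError."""


class LLMError(TaxiaError):
    """Stand-in; assumed to subclass TaxiaError."""


class VectorSearchError(TaxiaError):
    """Stand-in; assumed to subclass TaxiaError."""


def classify(exc: Exception) -> str:
    """Route an exception the same way the try/except chain above does."""
    try:
        raise exc
    except ConfigError:
        return "config"
    except LLMError:
        return "llm"
    except TaxiaError:
        # Catches the base class and every subclass not handled above.
        return "general"


print(classify(ConfigError("missing API key")))
print(classify(VectorSearchError("Qdrant unreachable")))
```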

Advanced Usage

Context Manager

from taxia import TaxiaEngine

with TaxiaEngine() as engine:
    result = engine.answer("What is the corporate tax filing deadline?")
    print(result.answer)
# Automatically closes connections
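
For readers unfamiliar with the protocol behind the with statement, the following stand-in class sketches how such cleanup is typically wired up. It is not TAXIA's implementation; the class ManagedEngine and its close() method are hypothetical.

```python
class ManagedEngine:
    """Stand-in showing the context manager protocol; not the taxia class."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        # A real engine would release its database and HTTP connections here.
        self.closed = True

    def __enter__(self) -> "ManagedEngine":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        # Returning False lets any exception from the with block propagate.
        return False


with ManagedEngine() as managed:
    print(managed.closed)  # Still open inside the block.
print(managed.closed)      # Closed once the block exits.
```

Cleanup in __exit__ runs even when the body raises, which is the main advantage over calling a close method by hand.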

Async Support

from taxia import TaxiaEngine
import asyncio

async def main():
    engine = TaxiaEngine()
    result = await engine.answer_async("What is the corporate tax filing deadline?")
    print(result.answer)

asyncio.run(main())
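
An async method is most useful when several questions are in flight at once. The sketch below shows the standard asyncio.gather pattern. The coroutine fake_answer_async only simulates latency and stands in for engine.answer_async, so the block runs without TAXIA installed; whether the real method is safe to call concurrently on one engine is not stated in this reference.

```python
import asyncio
from typing import List


async def fake_answer_async(query: str) -> str:
    """Stand-in for engine.answer_async; only simulates I/O latency."""
    await asyncio.sleep(0.01)
    return f"answer to: {query}"


async def answer_all(queries: List[str]) -> List[str]:
    """Run all queries concurrently; results keep the input order."""
    return await asyncio.gather(*(fake_answer_async(q) for q in queries))


if __name__ == "__main__":
    answers = asyncio.run(answer_all(["filing deadline?", "VAT rate?"]))
    for answer in answers:
        print(answer)
```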

Batch Queries

from taxia import TaxiaEngine

engine = TaxiaEngine()

queries = [
    "What is the corporate tax filing deadline?",
    "What is the VAT rate?",
    "What is the income tax standard?",
]

results = engine.answer_batch(queries)

for query, result in zip(queries, results):
    print(f"Q: {query}")
    print(f"A: {result.answer}\n")