Engine API Reference
TaxiaEngine
Main interface for the TAXIA system.
Constructor
from taxia import TaxiaEngine, TaxiaConfig
engine = TaxiaEngine(config: TaxiaConfig = None)
Parameters:
- config (TaxiaConfig, optional): Configuration object. Uses defaults if not provided.
Example:
# Default configuration
engine = TaxiaEngine()
# Custom configuration
config = TaxiaConfig(
    llm_provider="anthropic",
    enable_graph_rag=True,
)
engine = TaxiaEngine(config=config)
Methods
answer()
Ask a tax law question and get an answer with citations.
result = engine.answer(
    query: str,
    top_k: int = 5,
    include_trace: bool = True
) -> QueryResult
Parameters:
- query (str): The question to ask
- top_k (int, optional): Number of documents to retrieve. Default: 5
- include_trace (bool, optional): Include audit trail. Default: True
Returns:
- QueryResult: Object containing answer, citations, and trace ID
Example:
result = engine.answer("What is the corporate tax filing deadline?")
print(result.answer)
# "The corporate tax filing deadline is within 3 months from the end of the fiscal year."
print(result.citations)
# ["Corporate Tax Act Article 60", "Enforcement Decree Article 132"]
print(result.trace_id)
# "trace-2025-01-23-abc123"
index_documents()
Index tax law documents for retrieval.
engine.index_documents(
    data_dir: str,
    collection_name: str = "taxia_documents",
    batch_size: int = 100
) -> IndexResult
Parameters:
- data_dir (str): Path to directory containing JSON files
- collection_name (str, optional): Qdrant collection name. Default: "taxia_documents"
- batch_size (int, optional): Batch size for indexing. Default: 100
Returns:
- IndexResult: Object with indexing statistics
Example:
result = engine.index_documents("./koreantaxlaw/2025")
print(f"Indexed {result.document_count} documents")
print(f"Time: {result.elapsed_time:.2f}s")
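Since data_dir is expected to contain JSON files, it can help to confirm that before starting an indexing run. The following is a minimal sketch and not part of TAXIA: find_json_files is a hypothetical helper, and the choice to search subdirectories and match on the .json extension is an assumption, because the reference does not say how the engine walks the directory.

```python
from pathlib import Path
from typing import List


def find_json_files(data_dir: str) -> List[Path]:
    """Return the .json files under data_dir, sorted for a stable order.

    Hypothetical pre-flight helper; TAXIA's own discovery rules may differ.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{data_dir!r} is not a directory")
    # rglob walks subdirectories too; compare the suffix case-insensitively.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".json"
    )
```

If the returned list is empty there is nothing to index, so the call to engine.index_documents(data_dir) can be skipped or reported as a setup problem.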
search()
Search for relevant documents without LLM generation.
docs = engine.search(
    query: str,
    top_k: int = 5,
    filters: dict = None
) -> List[Document]
Parameters:
- query (str): Search query
- top_k (int, optional): Number of results. Default: 5
- filters (dict, optional): Metadata filters
Returns:
- List[Document]: List of relevant documents
Example:
docs = engine.search(
    query="corporate tax rate",
    top_k=3,
    filters={"year": 2025},
)
for doc in docs:
    print(f"{doc.title}: {doc.score:.3f}")
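Each result carries a score, so weak matches can be dropped before they are shown or passed on. This is a minimal sketch under stated assumptions: the Document class below is a stand-in that mirrors the attributes listed under Data Classes so the snippet runs on its own, filter_by_score is a hypothetical helper, and it assumes that a higher score means a more relevant document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Stand-in with the documented attributes (not TAXIA's own class)."""
    id: str
    title: str
    content: str
    metadata: dict = field(default_factory=dict)
    score: float = 0.0


def filter_by_score(docs: List[Document], min_score: float) -> List[Document]:
    """Keep documents scoring at least min_score, best first."""
    kept = [d for d in docs if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)
```

A sensible min_score depends on the embedding model and distance metric in use, so it is best chosen by inspecting real search output rather than fixed in advance.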
health_check()
Check system health and dependencies.
status = engine.health_check() -> HealthStatus
Returns:
- HealthStatus: System health information
Example:
status = engine.health_check()
print(f"Qdrant: {status.qdrant}") # "healthy"
print(f"Neo4j: {status.neo4j}") # "healthy" or "disabled"
print(f"LLM: {status.llm}") # "configured"
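The status strings can be folded into a single readiness check, for example at application startup. The sketch below is illustrative only: HealthStatus here is a stand-in with the documented fields, is_ready is a hypothetical helper, and the policy that Neo4j may be "disabled" while Qdrant must be "healthy" is an assumption based on Graph-RAG being optional.

```python
from dataclasses import dataclass


@dataclass
class HealthStatus:
    """Stand-in with the documented fields (not TAXIA's own class)."""
    qdrant: str
    neo4j: str
    llm: str
    version: str = "unknown"


def is_ready(status: HealthStatus) -> bool:
    """Return True when every required dependency is usable.

    Assumed policy: Qdrant must be healthy, Neo4j may be healthy or
    disabled, and the LLM must be configured.
    """
    return (
        status.qdrant == "healthy"
        and status.neo4j in ("healthy", "disabled")
        and status.llm == "configured"
    )
```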
Data Classes
QueryResult
Result from answer() method.
Attributes:
- answer (str): Generated answer
- citations (List[str]): Legal citations (≥2)
- trace_id (str): Unique trace identifier
- confidence (float): Confidence score (0-1)
- retrieved_docs (List[Document]): Source documents
- timestamp (datetime): Query timestamp
Example:
result = engine.answer("What is the dividend income tax rate?")
print(f"Answer: {result.answer}")
print(f"Confidence: {result.confidence:.2%}")
print(f"Trace: {result.trace_id}")
print(f"Time: {result.timestamp}")
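For display or logging, the attributes above can be combined into one plain-text report. This is a sketch under assumptions: QueryResult below is a stand-in holding a subset of the documented attributes, format_result is a hypothetical helper, and the 0.5 confidence cutoff is an arbitrary illustrative value, not a TAXIA default.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryResult:
    """Stand-in with a subset of the documented attributes."""
    answer: str
    citations: List[str]
    trace_id: str
    confidence: float


def format_result(result: QueryResult, min_confidence: float = 0.5) -> str:
    """Render the answer, numbered citations, and trace ID as plain text."""
    lines = [result.answer, "", "Citations:"]
    lines += [f"  [{i}] {c}" for i, c in enumerate(result.citations, start=1)]
    lines.append(f"Trace: {result.trace_id}")
    if result.confidence < min_confidence:
        # The cutoff is an illustrative choice, not a TAXIA default.
        lines.append(f"Warning: low confidence ({result.confidence:.0%})")
    return "\n".join(lines)
```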
IndexResult
Result from index_documents() method.
Attributes:
- document_count (int): Number of documents indexed
- elapsed_time (float): Time in seconds
- collection_name (str): Qdrant collection name
- status (str): "success" or "partial"
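These fields are enough to derive a throughput figure and to flag runs that did not fully succeed. A minimal sketch, assuming a stand-in IndexResult with the attributes listed above; summarize_index is a hypothetical helper and not part of TAXIA.

```python
from dataclasses import dataclass


@dataclass
class IndexResult:
    """Stand-in with the documented attributes (not TAXIA's own class)."""
    document_count: int
    elapsed_time: float
    collection_name: str
    status: str


def summarize_index(result: IndexResult) -> str:
    """Return a one-line summary including documents per second."""
    # Guard against a zero elapsed time to avoid ZeroDivisionError.
    if result.elapsed_time > 0:
        rate = result.document_count / result.elapsed_time
    else:
        rate = 0.0
    summary = (
        f"{result.collection_name}: {result.document_count} documents "
        f"in {result.elapsed_time:.2f}s ({rate:.1f} docs/s)"
    )
    if result.status != "success":
        summary += f" [status: {result.status}]"
    return summary
```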
Document
Retrieved document information.
Attributes:
- id (str): Document ID
- title (str): Document title
- content (str): Document content
- metadata (dict): Additional metadata
- score (float): Relevance score
HealthStatus
System health information.
Attributes:
- qdrant (str): "healthy", "unhealthy", or "disabled"
- neo4j (str): "healthy", "unhealthy", or "disabled"
- llm (str): "configured" or "not_configured"
- version (str): TAXIA version
Configuration
TaxiaConfig
Configuration object for TaxiaEngine.
from taxia import TaxiaConfig
config = TaxiaConfig(
    # LLM settings
    llm_provider: str = "anthropic",
    llm_model: str = "claude-3-5-sonnet-20241022",
    llm_temperature: float = 0.1,
    llm_max_tokens: int = 4000,
    # Vector search
    qdrant_host: str = "localhost",
    qdrant_port: int = 6333,
    qdrant_collection: str = "taxia_documents",
    top_k: int = 5,
    # Graph-RAG
    enable_graph_rag: bool = False,
    neo4j_uri: str = "bolt://localhost:7687",
    neo4j_user: str = "neo4j",
    neo4j_password: str = None,
    # Logging
    log_level: str = "INFO",
    enable_audit_trail: bool = True,
)
See Configuration Guide for details.
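Secrets such as neo4j_password are usually better read at runtime than written into source files. The sketch below shows one possible way to build the keyword arguments from environment variables. The TAXIA_* variable names and the config_kwargs_from_env helper are hypothetical; only the keyword names come from the listing above, and the reference does not say whether TAXIA reads any environment variables itself.

```python
import os
from typing import Any, Dict, Mapping, Optional


def config_kwargs_from_env(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Build TaxiaConfig keyword arguments from environment variables.

    The TAXIA_* names are hypothetical. Unset variables are omitted so
    the documented defaults still apply.
    """
    env = os.environ if env is None else env
    kwargs: Dict[str, Any] = {}
    if "TAXIA_QDRANT_HOST" in env:
        kwargs["qdrant_host"] = env["TAXIA_QDRANT_HOST"]
    if "TAXIA_QDRANT_PORT" in env:
        kwargs["qdrant_port"] = int(env["TAXIA_QDRANT_PORT"])
    if "TAXIA_ENABLE_GRAPH_RAG" in env:
        flag = env["TAXIA_ENABLE_GRAPH_RAG"].strip().lower()
        kwargs["enable_graph_rag"] = flag in ("1", "true", "yes")
    if "TAXIA_NEO4J_PASSWORD" in env:
        # Keep secrets out of source code by reading them at runtime.
        kwargs["neo4j_password"] = env["TAXIA_NEO4J_PASSWORD"]
    return kwargs
```

Assuming TaxiaConfig accepts the keywords as listed, the result could be passed as TaxiaConfig(**config_kwargs_from_env()).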
Error Handling
Exceptions
from taxia.exceptions import (
TaxiaError, # Base exception
ConfigError, # Configuration error
VectorSearchError, # Qdrant error
GraphError, # Neo4j error
LLMError, # LLM API error
CitationError, # Insufficient citations
)
Example:
from taxia import TaxiaEngine
from taxia.exceptions import ConfigError, LLMError, TaxiaError

try:
    engine = TaxiaEngine()
    result = engine.answer("What is the corporate tax filing deadline?")
except ConfigError as e:
    print(f"Configuration error: {e}")
except LLMError as e:
    print(f"LLM API error: {e}")
except TaxiaError as e:
    # Base class goes last so the specific handlers above run first
    print(f"General error: {e}")
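LLM API failures are often transient (rate limits, timeouts), so callers sometimes wrap the call in a retry loop. The following is a sketch of exponential backoff, not TAXIA behavior: with_retries is a hypothetical helper, the LLMError class is a local stand-in so the snippet runs without the library, and whether a given LLMError is actually worth retrying is an assumption to verify.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class LLMError(Exception):
    """Stand-in for taxia.exceptions.LLMError so the sketch runs alone."""


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on LLMError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except LLMError:
            if attempt == attempts - 1:
                raise  # Out of retries: let the caller handle it.
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the real exception imported in place of the stand-in, a call might look like with_retries(lambda: engine.answer("What is the VAT rate?")).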
Advanced Usage
Context Manager
from taxia import TaxiaEngine

with TaxiaEngine() as engine:
    result = engine.answer("What is the corporate tax filing deadline?")
    print(result.answer)
# Connections are closed automatically when the block exits
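For readers unfamiliar with the protocol behind the with statement, the sketch below shows how a class can guarantee cleanup even when the block raises. It is a generic illustration: ManagedResource and its close() method are hypothetical, and the reference does not document how TaxiaEngine implements this internally.

```python
class ManagedResource:
    """Minimal context manager; close() is a hypothetical cleanup hook."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "ManagedResource":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs on normal exit and when the block raises.
        self.close()
        return False  # Do not swallow exceptions raised inside the block.
```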
Async Support
import asyncio

from taxia import TaxiaEngine

async def main():
    engine = TaxiaEngine()
    result = await engine.answer_async("What is the corporate tax filing deadline?")
    print(result.answer)

asyncio.run(main())
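The main benefit of an async method is running several queries concurrently. The sketch below uses asyncio.gather with a semaphore to cap the number of in-flight requests. StubEngine is a stand-in so the example runs without TAXIA, answer_all is a hypothetical helper, and it assumes that answer_async can safely be called concurrently on one engine, which the reference does not state.

```python
import asyncio
from typing import List


class StubEngine:
    """Stand-in exposing answer_async(); swap in a real engine to use it."""

    async def answer_async(self, query: str) -> str:
        await asyncio.sleep(0)  # Yield control, as a network call would.
        return f"answer to: {query}"


async def answer_all(engine, queries: List[str], max_concurrency: int = 4) -> list:
    """Run answer_async() for every query, limited to max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(query: str):
        async with semaphore:
            return await engine.answer_async(query)

    # gather() returns results in the same order as the input queries.
    return await asyncio.gather(*(one(q) for q in queries))
```

A small max_concurrency is a cautious starting point, since LLM providers typically enforce rate limits.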
Batch Queries
queries = [
    "What is the corporate tax filing deadline?",
    "What is the VAT rate?",
    "What is the income tax standard?",
]
results = engine.answer_batch(queries)
for query, result in zip(queries, results):
    print(f"Q: {query}")
    print(f"A: {result.answer}\n")
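For long query lists it may be preferable to submit work in smaller groups, for instance to report progress or to limit how much is lost if one call fails. This is a minimal sketch of a chunking helper. chunked is hypothetical and not part of TAXIA, and the reference does not say whether answer_batch imposes a size limit of its own.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each chunk can then be passed to engine.answer_batch, extending a single results list so that its order still lines up with the original queries.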