- Release documentation update

- Try to move Add... functionality to overviews
- finished add_document on Zapier interface
2024-12-13 11:18:07 +01:00 · 2024-12-13 10:41:29 +01:00 · 2024-12-13 10:40:57 +01:00 · 2024-12-12 16:36:41 +01:00 · 2024-12-05 15:19:37 +01:00 · 2024-11-29 14:11:49 +01:00
186 changed files with 14554 additions and 3235 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -43,3 +43,6 @@ scripts/.DS_Store
 scripts/__pycache__/run_eveai_app.cpython-312.pyc
 /eveai_repo.txt
 *repo.txt
 /docker/eveai_logs/
 /integrations/Wordpress/eveai_sync.zip
 /integrations/Wordpress/eveai-chat.zip
--- a/.repopackignore_eveai_api
+++ b/.repopackignore_eveai_api
@@ -4,8 +4,7 @@ eveai_beat/
 eveai_chat/
 eveai_chat_workers/
 eveai_entitlements/
 eveai_workers/
 instance/
-integrations/
+integrations/Wordpress/eveai-chat
 nginx/
 scripts/
--- a/.repopackignore_eveai_chat
+++ b/.repopackignore_eveai_chat
@@ -1,11 +1,10 @@
 docker/
 eveai_api/
 eveai_app/
 eveai_beat/
 eveai_chat_workers/
 eveai_entitlements/
 eveai_workers/
 instance/
-integrations/
+integrations/Wordpress/eveai_sync
 nginx/
 scripts/
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -25,6 +25,75 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Security
 - In case of vulnerabilities.
 ## [2.0.1-alfa]
 ### Added
 - Zapier Integration (partial - only adding files).
 - Addition of general chunking parameters (chunking_heading_level and chunking_patterns)
 - Addition of DocX and markdown Processor Types
 ### Changed
 - For changes in existing functionality.
 ### Deprecated
 - For soon-to-be removed features.
 ### Removed
 - For now removed features.
 ### Fixed
 - Ensure the RAG Specialist is using the detailed_question
 - Wordpress Chat Plugin: languages dropdown filled again
 - OpenAI update - proxies no longer supported
 - Build & Release script for Wordpress Plugins (including end user download folder)
 ### Security
 - In case of vulnerabilities.
 ## [2.0.0-alfa]
 ### Added
 - Introduction of dynamic Retrievers & Specialists
 - Introduction of dynamic Processors
 - Introduction of caching system
 - Introduction of a better template manager
 - Modernisation of external API/Socket authentication using projects
 - Creation of new eveai_chat WordPress plugin to support specialists
 ### Changed
 - Update of eveai_sync WordPress plugin
 ### Fixed
 - Set default language when registering Documents or URLs.
 ### Security
 - Security improvements to Docker images
 ## [1.0.14-alfa]
 ### Added
 - New release script added to tag images with release number
 - Allow the addition of multiple types of Catalogs
 - Generic functionality to enable dynamic fields
 - Addition of Retrievers to allow for smart collection of information in Catalogs
 - Add dynamic fields to Catalog / Retriever / DocumentVersion
 ### Changed
 - Processing parameters defined at Catalog level instead of Tenant level
 - Reroute 'blank' paths to 'admin'
 ### Deprecated
 - For soon-to-be removed features.
 ### Removed
 - For now removed features.
 ### Fixed
 - Set default language when registering Documents or URLs.
 ### Security
 - In case of vulnerabilities.
 ## [1.0.13-alfa]
 ### Added
--- a/common/extensions.py
+++ b/common/extensions.py
@@ -12,6 +12,8 @@ from flask_wtf import CSRFProtect
 from flask_restx import Api
 from prometheus_flask_exporter import PrometheusMetrics
 from .langchain.templates.template_manager import TemplateManager
 from .utils.cache.eveai_cache_manager import EveAICacheManager
 from .utils.simple_encryption import SimpleEncryption
 from .utils.minio_utils import MinioClient
@@ -32,3 +34,5 @@ api_rest = Api()
 simple_encryption = SimpleEncryption()
 minio_client = MinioClient()
 metrics = PrometheusMetrics.for_app_factory()
 template_manager = TemplateManager()
 cache_manager = EveAICacheManager()
--- a/common/langchain/outputs/base.py
+++ b/common/langchain/outputs/base.py
@@ -0,0 +1,23 @@
 # Output Schema Management - common/langchain/outputs/base.py
 from typing import Dict, Type, Any
 from pydantic import BaseModel
 class BaseSpecialistOutput(BaseModel):
    """Base class for all specialist outputs"""
    pass
 class OutputRegistry:
    """Registry for specialist output schemas"""
    _schemas: Dict[str, Type[BaseSpecialistOutput]] = {}
    @classmethod
    def register(cls, specialist_type: str, schema_class: Type[BaseSpecialistOutput]):
        cls._schemas[specialist_type] = schema_class
    @classmethod
    def get_schema(cls, specialist_type: str) -> Type[BaseSpecialistOutput]:
        if specialist_type not in cls._schemas:
            raise ValueError(f"No output schema registered for {specialist_type}")
        return cls._schemas[specialist_type]
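The registry above keeps its schemas in a class-level dict, so a schema has to be registered (typically at import time) before it can be looked up. A minimal sketch of that register-then-lookup flow follows. It uses plain classes as stand-ins for the pydantic models so it runs without dependencies; `BaseOutput`, `Registry`, `DemoOutput` and the key `"DEMO_SPECIALIST"` are illustrative names, not part of this diff.

```python
from typing import Dict, Type


class BaseOutput:
    """Stand-in for the pydantic-based BaseSpecialistOutput (assumption)."""


class Registry:
    """Same shape as OutputRegistry: a class-level dict keyed by specialist type."""
    _schemas: Dict[str, Type[BaseOutput]] = {}

    @classmethod
    def register(cls, specialist_type: str, schema_class: Type[BaseOutput]) -> None:
        cls._schemas[specialist_type] = schema_class

    @classmethod
    def get_schema(cls, specialist_type: str) -> Type[BaseOutput]:
        if specialist_type not in cls._schemas:
            raise ValueError(f"No output schema registered for {specialist_type}")
        return cls._schemas[specialist_type]


class DemoOutput(BaseOutput):
    """Hypothetical schema used only for this illustration."""


# Register once, then resolve the schema class by its type key
Registry.register("DEMO_SPECIALIST", DemoOutput)
schema = Registry.get_schema("DEMO_SPECIALIST")

# An unregistered key raises ValueError rather than returning None
try:
    Registry.get_schema("UNKNOWN")
    lookup_error = None
except ValueError as exc:
    lookup_error = str(exc)
```

Because the dict lives on the class, every caller in the process sees the same registrations.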
--- a/common/langchain/outputs/rag.py
+++ b/common/langchain/outputs/rag.py
@@ -0,0 +1,22 @@
 # RAG Specialist Output - common/langchain/outputs/rag.py
 from typing import List
 from pydantic import Field
 from .base import BaseSpecialistOutput
 class RAGOutput(BaseSpecialistOutput):
    """Output schema for RAG specialist"""
    # Placeholder docstring; to be replaced with the actual prompt
    answer: str = Field(
        ...,
        description="The answer to the user question, based on the given sources",
    )
    citations: List[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources that were used to generate the answer"
    )
    insufficient_info: bool = Field(
        False,  # Default value is set to False
        description="A boolean indicating whether given sources were sufficient or not to generate the answer"
    )
--- a/common/langchain/retrievers/eveai_default_rag_retriever.py
+++ b/common/langchain/retrievers/eveai_default_rag_retriever.py
@@ -1,145 +0,0 @@
 from langchain_core.retrievers import BaseRetriever
 from sqlalchemy import func, and_, or_, desc
 from sqlalchemy.exc import SQLAlchemyError
 from pydantic import BaseModel, Field, PrivateAttr
 from typing import Any, Dict
 from flask import current_app
 from common.extensions import db
 from common.models.document import Document, DocumentVersion
 from common.utils.datetime_utils import get_date_in_timezone
 from common.utils.model_utils import ModelVariables
 class EveAIDefaultRagRetriever(BaseRetriever, BaseModel):
    _catalog_id: int = PrivateAttr()
    _model_variables: ModelVariables = PrivateAttr()
    _tenant_info: Dict[str, Any] = PrivateAttr()
    def __init__(self, catalog_id: int, model_variables: ModelVariables, tenant_info: Dict[str, Any]):
        super().__init__()
        current_app.logger.debug(f'Model variables type: {type(model_variables)}')
        self._catalog_id = catalog_id
        self._model_variables = model_variables
        self._tenant_info = tenant_info
    @property
    def catalog_id(self) -> int:
        return self._catalog_id
    @property
    def model_variables(self) -> ModelVariables:
        return self._model_variables
    @property
    def tenant_info(self) -> Dict[str, Any]:
        return self._tenant_info
    def _get_relevant_documents(self, query: str):
        current_app.logger.debug(f'Retrieving relevant documents for query: {query}')
        query_embedding = self._get_query_embedding(query)
        current_app.logger.debug(f'Model Variables Private: {type(self._model_variables)}')
        current_app.logger.debug(f'Model Variables Property: {type(self.model_variables)}')
        db_class = self.model_variables['embedding_db_model']
        similarity_threshold = self.model_variables['similarity_threshold']
        k = self.model_variables['k']
        if self.model_variables['rag_tuning']:
            try:
                current_date = get_date_in_timezone(self.tenant_info['timezone'])
                current_app.rag_tuning_logger.debug(f'Current date: {current_date}\n')
                # Debug query to show similarity for all valid documents (without chunk text)
                debug_query = (
                    db.session.query(
                        Document.id.label('document_id'),
                        DocumentVersion.id.label('version_id'),
                        db_class.id.label('embedding_id'),
                        (1 - db_class.embedding.cosine_distance(query_embedding)).label('similarity')
                    )
                    .join(DocumentVersion, db_class.doc_vers_id == DocumentVersion.id)
                    .join(Document, DocumentVersion.doc_id == Document.id)
                    .filter(
                        or_(Document.valid_from.is_(None), func.date(Document.valid_from) <= current_date),
                        or_(Document.valid_to.is_(None), func.date(Document.valid_to) >= current_date)
                    )
                    .order_by(desc('similarity'))
                )
                debug_results = debug_query.all()
                current_app.logger.debug("Debug: Similarity for all valid documents:")
                for row in debug_results:
                    current_app.rag_tuning_logger.debug(f"Doc ID: {row.document_id}, "
                                                        f"Version ID: {row.version_id}, "
                                                        f"Embedding ID: {row.embedding_id}, "
                                                        f"Similarity: {row.similarity}")
                current_app.rag_tuning_logger.debug(f'---------------------------------------\n')
            except SQLAlchemyError as e:
                current_app.logger.error(f'Error generating overview: {e}')
                db.session.rollback()
        if self.model_variables['rag_tuning']:
            current_app.rag_tuning_logger.debug(f'Parameters for Retrieval of documents: \n')
            current_app.rag_tuning_logger.debug(f'Similarity Threshold: {similarity_threshold}\n')
            current_app.rag_tuning_logger.debug(f'K: {k}\n')
            current_app.rag_tuning_logger.debug(f'---------------------------------------\n')
        try:
            current_date = get_date_in_timezone(self.tenant_info['timezone'])
            # Subquery to find the latest version of each document
            subquery = (
                db.session.query(
                    DocumentVersion.doc_id,
                    func.max(DocumentVersion.id).label('latest_version_id')
                )
                .group_by(DocumentVersion.doc_id)
                .subquery()
            )
            # Main query to filter embeddings
            query_obj = (
                db.session.query(db_class,
                                 (1 - db_class.embedding.cosine_distance(query_embedding)).label('similarity'))
                .join(DocumentVersion, db_class.doc_vers_id == DocumentVersion.id)
                .join(Document, DocumentVersion.doc_id == Document.id)
                .join(subquery, DocumentVersion.id == subquery.c.latest_version_id)
                .filter(
                    or_(Document.valid_from.is_(None), func.date(Document.valid_from) <= current_date),
                    or_(Document.valid_to.is_(None), func.date(Document.valid_to) >= current_date),
                    (1 - db_class.embedding.cosine_distance(query_embedding)) > similarity_threshold,
                    Document.catalog_id == self._catalog_id
                )
                .order_by(desc('similarity'))
                .limit(k)
            )
            if self.model_variables['rag_tuning']:
                current_app.rag_tuning_logger.debug(f'Query executed for Retrieval of documents: \n')
                current_app.rag_tuning_logger.debug(f'{query_obj.statement}\n')
                current_app.rag_tuning_logger.debug(f'---------------------------------------\n')
            res = query_obj.all()
            if self.model_variables['rag_tuning']:
                current_app.rag_tuning_logger.debug(f'Retrieved {len(res)} relevant documents \n')
                current_app.rag_tuning_logger.debug(f'Data retrieved: \n')
                current_app.rag_tuning_logger.debug(f'{res}\n')
                current_app.rag_tuning_logger.debug(f'---------------------------------------\n')
            result = []
            for doc in res:
                if self.model_variables['rag_tuning']:
                    current_app.rag_tuning_logger.debug(f'Document ID: {doc[0].id} - Similarity: {doc[1]}\n')
                    current_app.rag_tuning_logger.debug(f'Chunk: \n {doc[0].chunk}\n\n')
                result.append(f'SOURCE: {doc[0].id}\n\n{doc[0].chunk}\n\n')
        except SQLAlchemyError as e:
            current_app.logger.error(f'Error retrieving relevant documents: {e}')
            db.session.rollback()
            return []
        return result
    def _get_query_embedding(self, query: str):
        embedding_model = self.model_variables['embedding_model']
        query_embedding = embedding_model.embed_query(query)
        return query_embedding
--- a/common/langchain/retrievers/eveai_dossier_retriever.py
+++ b/common/langchain/retrievers/eveai_dossier_retriever.py
@@ -1,154 +0,0 @@
 from langchain_core.retrievers import BaseRetriever
 from sqlalchemy import func, and_, or_, desc, cast, JSON
 from sqlalchemy.exc import SQLAlchemyError
 from pydantic import BaseModel, Field, PrivateAttr
 from typing import Any, Dict, List, Optional
 from flask import current_app
 from contextlib import contextmanager
 from common.extensions import db
 from common.models.document import Document, DocumentVersion, Catalog
 from common.utils.datetime_utils import get_date_in_timezone
 from common.utils.model_utils import ModelVariables
 class EveAIDossierRetriever(BaseRetriever, BaseModel):
    _catalog_id: int = PrivateAttr()
    _model_variables: ModelVariables = PrivateAttr()
    _tenant_info: Dict[str, Any] = PrivateAttr()
    _active_filters: Optional[Dict[str, Any]] = PrivateAttr()
    def __init__(self, catalog_id: int, model_variables: ModelVariables, tenant_info: Dict[str, Any]):
        super().__init__()
        self._catalog_id = catalog_id
        self._model_variables = model_variables
        self._tenant_info = tenant_info
        self._active_filters = None
    @contextmanager
    def filtering(self, metadata_filters: Dict[str, Any]):
        """Context manager for temporarily setting metadata filters"""
        previous_filters = self._active_filters
        self._active_filters = metadata_filters
        try:
            yield self
        finally:
            self._active_filters = previous_filters
    def _build_metadata_filter_conditions(self, query):
        """Build SQL conditions for metadata filtering"""
        if not self._active_filters:
            return query
        conditions = []
        for field, value in self._active_filters.items():
            if value is None:
                continue
            # Handle both single values and lists of values
            if isinstance(value, (list, tuple)):
                # Multiple values - create OR condition
                or_conditions = []
                for val in value:
                    or_conditions.append(
                        cast(DocumentVersion.user_metadata[field].astext, JSON) == str(val)
                    )
                if or_conditions:
                    conditions.append(or_(*or_conditions))
            else:
                # Single value - direct comparison
                conditions.append(
                    cast(DocumentVersion.user_metadata[field].astext, JSON) == str(value)
                )
        if conditions:
            query = query.filter(and_(*conditions))
        return query
    def _get_relevant_documents(self, query: str):
        current_app.logger.debug(f'Retrieving relevant documents for dossier query: {query}')
        if self._active_filters:
            current_app.logger.debug(f'Using metadata filters: {self._active_filters}')
        query_embedding = self._get_query_embedding(query)
        db_class = self.model_variables['embedding_db_model']
        similarity_threshold = self.model_variables['similarity_threshold']
        k = self.model_variables['k']
        try:
            current_date = get_date_in_timezone(self.tenant_info['timezone'])
            # Subquery to find the latest version of each document
            subquery = (
                db.session.query(
                    DocumentVersion.doc_id,
                    func.max(DocumentVersion.id).label('latest_version_id')
                )
                .group_by(DocumentVersion.doc_id)
                .subquery()
            )
            # Build base query
            query_obj = (
                db.session.query(db_class,
                                 (1 - db_class.embedding.cosine_distance(query_embedding)).label('similarity'))
                .join(DocumentVersion, db_class.doc_vers_id == DocumentVersion.id)
                .join(Document, DocumentVersion.doc_id == Document.id)
                .join(subquery, DocumentVersion.id == subquery.c.latest_version_id)
                .filter(
                    or_(Document.valid_from.is_(None), func.date(Document.valid_from) <= current_date),
                    or_(Document.valid_to.is_(None), func.date(Document.valid_to) >= current_date),
                    (1 - db_class.embedding.cosine_distance(query_embedding)) > similarity_threshold,
                    Document.catalog_id == self._catalog_id
                )
            )
            # Apply metadata filters
            query_obj = self._build_metadata_filter_conditions(query_obj)
            # Order and limit results
            query_obj = query_obj.order_by(desc('similarity')).limit(k)
            # Debug logging for RAG tuning if enabled
            if self.model_variables['rag_tuning']:
                self._log_rag_tuning(query_obj, query_embedding)
            res = query_obj.all()
            result = []
            for doc in res:
                if self.model_variables['rag_tuning']:
                    current_app.logger.debug(f'Document ID: {doc[0].id} - Similarity: {doc[1]}\n')
                    current_app.logger.debug(f'Chunk: \n {doc[0].chunk}\n\n')
                result.append(f'SOURCE: {doc[0].id}\n\n{doc[0].chunk}\n\n')
        except SQLAlchemyError as e:
            current_app.logger.error(f'Error retrieving relevant documents: {e}')
            db.session.rollback()
            return []
        return result
    def _log_rag_tuning(self, query_obj, query_embedding):
        """Log debug information for RAG tuning"""
        current_app.rag_tuning_logger.debug("Debug: Query execution plan:")
        current_app.rag_tuning_logger.debug(f"{query_obj.statement}")
        if self._active_filters:
            current_app.rag_tuning_logger.debug("Debug: Active metadata filters:")
            current_app.rag_tuning_logger.debug(f"{self._active_filters}")
    def _get_query_embedding(self, query: str):
        """Get embedding for the query text"""
        embedding_model = self.model_variables['embedding_model']
        query_embedding = embedding_model.embed_query(query)
        return query_embedding
    @property
    def model_variables(self) -> ModelVariables:
        return self._model_variables
    @property
    def tenant_info(self) -> Dict[str, Any]:
        return self._tenant_info
--- a/common/langchain/retrievers/eveai_history_retriever.py
+++ b/common/langchain/retrievers/eveai_history_retriever.py
@@ -1,52 +0,0 @@
 from langchain_core.retrievers import BaseRetriever
 from sqlalchemy import asc
 from sqlalchemy.exc import SQLAlchemyError
 from pydantic import Field, BaseModel, PrivateAttr
 from typing import Any, Dict
 from flask import current_app
 from common.extensions import db
 from common.models.interaction import ChatSession, Interaction
 from common.utils.model_utils import ModelVariables
 class EveAIHistoryRetriever(BaseRetriever, BaseModel):
    _model_variables: ModelVariables = PrivateAttr()
    _session_id: str = PrivateAttr()
    def __init__(self, model_variables: ModelVariables, session_id: str):
        super().__init__()
        self._model_variables = model_variables
        self._session_id = session_id
    @property
    def model_variables(self) -> ModelVariables:
        return self._model_variables
    @property
    def session_id(self) -> str:
        return self._session_id
    def _get_relevant_documents(self, query: str):
        current_app.logger.debug(f'Retrieving history of interactions for query: {query}')
        try:
            query_obj = (
                db.session.query(Interaction)
                .join(ChatSession, Interaction.chat_session_id == ChatSession.id)
                .filter(ChatSession.session_id == self.session_id)
                .order_by(asc(Interaction.id))
            )
            interactions = query_obj.all()
            result = []
            for interaction in interactions:
                result.append(f'HUMAN:\n{interaction.detailed_question}\n\nAI: \n{interaction.answer}\n\n')
        except SQLAlchemyError as e:
            current_app.logger.error(f'Error retrieving history of interactions: {e}')
            db.session.rollback()
            return []
        return result
--- a/common/langchain/retrievers/eveai_retriever.py
+++ b/common/langchain/retrievers/eveai_retriever.py
@@ -1,40 +0,0 @@
 from pydantic import BaseModel, PrivateAttr
 from typing import Dict, Any
 from common.utils.model_utils import ModelVariables
 class EveAIRetriever(BaseModel):
    _catalog_id: int = PrivateAttr()
    _user_metadata: Dict[str, Any] = PrivateAttr()
    _system_metadata: Dict[str, Any] = PrivateAttr()
    _configuration: Dict[str, Any] = PrivateAttr()
    _tenant_info: Dict[str, Any] = PrivateAttr()
    _model_variables: ModelVariables = PrivateAttr()
    _tuning: bool = PrivateAttr()
    def __init__(self, catalog_id: int, user_metadata: Dict[str, Any], system_metadata: Dict[str, Any],
                 configuration: Dict[str, Any]):
        super().__init__()
        self._catalog_id = catalog_id
        self._user_metadata = user_metadata
        self._system_metadata = system_metadata
        self._configuration = configuration
    @property
    def catalog_id(self):
        return self._catalog_id
    @property
    def user_metadata(self):
        return self._user_metadata
    @property
    def system_metadata(self):
        return self._system_metadata
    @property
    def configuration(self):
        return self._configuration
    # Any common methods that should be shared among retrievers can go here.
--- a/common/langchain/templates/template_manager.py
+++ b/common/langchain/templates/template_manager.py
@@ -0,0 +1,154 @@
 import os
 import yaml
 from typing import Dict, Optional, Any
 from packaging import version
 from dataclasses import dataclass
 from flask import current_app, Flask
 from common.utils.os_utils import get_project_root
 @dataclass
 class PromptTemplate:
    """Represents a versioned prompt template"""
    content: str
    version: str
    metadata: Dict[str, Any]
 class TemplateManager:
    """Manages versioned prompt templates"""
    def __init__(self):
        self.templates_dir = None
        self._templates = None
        self.app = None
    def init_app(self, app: Flask) -> None:
        # Initialize template manager
        base_dir = "/app"
        self.templates_dir = os.path.join(base_dir, 'config', 'prompts')
        app.logger.debug(f'Loading templates from {self.templates_dir}')
        self.app = app
        self._templates = self._load_templates()
        # Log available templates for each supported model
        for llm in app.config['SUPPORTED_LLMS']:
            try:
                available_templates = self.list_templates(llm)
                app.logger.info(f"Loaded templates for {llm}: {available_templates}")
            except ValueError:
                app.logger.warning(f"No templates found for {llm}")
    def _load_templates(self) -> Dict[str, Dict[str, Dict[str, PromptTemplate]]]:
        """
        Load all template versions from the templates directory.
        Structure: {provider.model -> {template_name -> {version -> template}}}
        Directory structure:
        prompts/
          ├── provider/
          │   └── model/
          │       └── template_name/
          │           └── version.yaml
        """
        templates = {}
        # Iterate through providers (anthropic, openai)
        for provider in os.listdir(self.templates_dir):
            provider_path = os.path.join(self.templates_dir, provider)
            if not os.path.isdir(provider_path):
                continue
            # Iterate through models (claude-3, gpt-4o)
            for model in os.listdir(provider_path):
                model_path = os.path.join(provider_path, model)
                if not os.path.isdir(model_path):
                    continue
                provider_model = f"{provider}.{model}"
                templates[provider_model] = {}
                # Iterate through template types (rag, summary, etc.)
                for template_name in os.listdir(model_path):
                    template_path = os.path.join(model_path, template_name)
                    if not os.path.isdir(template_path):
                        continue
                    template_versions = {}
                    # Load all version files for this template
                    for version_file in os.listdir(template_path):
                        if not version_file.endswith('.yaml'):
                            continue
                        version_str = version_file[:-5]  # Remove .yaml
                        if not self._is_valid_version(version_str):
                            current_app.logger.warning(
                                f"Invalid version format for {template_name}: {version_str}")
                            continue
                        try:
                            with open(os.path.join(template_path, version_file)) as f:
                                template_data = yaml.safe_load(f)
                                # Verify required fields
                                if not template_data.get('content'):
                                    raise ValueError("Template content is required")
                                template_versions[version_str] = PromptTemplate(
                                    content=template_data['content'],
                                    version=version_str,
                                    metadata=template_data.get('metadata', {})
                                )
                        except Exception as e:
                            current_app.logger.error(
                                f"Error loading template {template_name} version {version_str}: {e}")
                            continue
                    if template_versions:
                        templates[provider_model][template_name] = template_versions
        return templates
    def _is_valid_version(self, version_str: str) -> bool:
        """Validate that the string parses as a version (PEP 440, via packaging.version)"""
        try:
            version.parse(version_str)
            return True
        except version.InvalidVersion:
            return False
    def get_template(self,
                     provider_model: str,
                     template_name: str,
                     template_version: Optional[str] = None) -> PromptTemplate:
        """
        Get a specific template version. If version not specified,
        returns the latest version.
        """
        if provider_model not in self._templates:
            raise ValueError(f"Unknown provider.model: {provider_model}")
        if template_name not in self._templates[provider_model]:
            raise ValueError(f"Unknown template: {template_name}")
        versions = self._templates[provider_model][template_name]
        if template_version:
            if template_version not in versions:
                raise ValueError(f"Template version {template_version} not found")
            return versions[template_version]
        # Return latest version
        latest = max(versions.keys(), key=version.parse)
        return versions[latest]
    def list_templates(self, provider_model: str) -> Dict[str, list]:
        """
        List all available templates and their versions for a provider.model
        Returns: {template_name: [version1, version2, ...]}
        """
        if provider_model not in self._templates:
            raise ValueError(f"Unknown provider.model: {provider_model}")
        return {
            template_name: sorted(versions.keys(), key=version.parse)
            for template_name, versions in self._templates[provider_model].items()
        }
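`get_template` and `list_templates` both order versions with `key=version.parse` instead of comparing the raw strings. The sketch below illustrates why that matters. It is not the manager's code: `parse_simple` is a simplified, dependency-free stand-in for `packaging.version.parse` that handles dotted integers only, and the version numbers are made up.

```python
from typing import Dict, Tuple


def parse_simple(version_str: str) -> Tuple[int, ...]:
    """Simplified stand-in for packaging.version.parse: dotted integers only."""
    return tuple(int(part) for part in version_str.split("."))


# Hypothetical version -> content mapping, as one template directory might yield
versions: Dict[str, str] = {"1.2": "old", "1.9": "newer", "1.10": "newest"}

# Plain string comparison is lexicographic, so "1.9" sorts above "1.10"
latest_by_string = max(versions.keys())

# A version-aware key compares components numerically
latest_by_version = max(versions.keys(), key=parse_simple)
```

Here `latest_by_string` is `"1.9"` while `latest_by_version` is `"1.10"`, which is the release a caller would expect as "latest".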
--- a/common/langchain/tracked_transcribe.py
+++ b/common/langchain/tracked_transcribe.py
@@ -1,27 +0,0 @@
 import time
 from common.utils.business_event_context import current_event
 def tracked_transcribe(client, *args, **kwargs):
    start_time = time.time()
    # Extract the file and model from kwargs if present, otherwise use defaults
    file = kwargs.get('file')
    model = kwargs.get('model', 'whisper-1')
    duration = kwargs.pop('duration', 600)
    result = client.audio.transcriptions.create(*args, **kwargs)
    end_time = time.time()
    # Token usage for transcriptions is actually the duration in seconds we pass, as the whisper model is priced per second transcribed
    metrics = {
        'total_tokens': duration,
        'prompt_tokens': 0,  # For transcriptions, all tokens are considered "completion"
        'completion_tokens': duration,
        'time_elapsed': end_time - start_time,
        'interaction_type': 'ASR',
    }
    current_event.log_llm_metrics(metrics)
    return result
--- a/common/langchain/tracked_transcription.py
+++ b/common/langchain/tracked_transcription.py
@@ -0,0 +1,77 @@
 # common/langchain/tracked_transcription.py
 from typing import Any, Optional, Dict
 import time
 from openai import OpenAI
 from common.utils.business_event_context import current_event
 class TrackedOpenAITranscription:
    """Wrapper for OpenAI transcription with metric tracking"""
    def __init__(self, api_key: str, **kwargs: Any):
        """Initialize with OpenAI client settings"""
        self.client = OpenAI(api_key=api_key)
        self.model = kwargs.get('model', 'whisper-1')
    def transcribe(self,
                   file: Any,
                   model: Optional[str] = None,
                   language: Optional[str] = None,
                   prompt: Optional[str] = None,
                   response_format: Optional[str] = None,
                   temperature: Optional[float] = None,
                   duration: Optional[int] = None) -> str:
        """
        Transcribe audio with metrics tracking
        Args:
            file: Audio file to transcribe
            model: Model to use (defaults to whisper-1)
            language: Optional language of the audio
            prompt: Optional prompt to guide transcription
            response_format: Response format (json, text, etc)
            temperature: Sampling temperature
            duration: Duration of audio in seconds for metrics
        Returns:
            Transcription text
        """
        start_time = time.time()
        try:
            # Create transcription options
            options = {
                "file": file,
                "model": model or self.model,
            }
            if language:
                options["language"] = language
            if prompt:
                options["prompt"] = prompt
            if response_format:
                options["response_format"] = response_format
            if temperature is not None:
                options["temperature"] = temperature
            response = self.client.audio.transcriptions.create(**options)
            # Calculate metrics
            end_time = time.time()
            # Token usage for transcriptions is based on audio duration
            metrics = {
                'total_tokens': duration or 600,  # Default to 10 minutes if duration not provided
                'prompt_tokens': 0,  # For transcriptions, all tokens are completion
                'completion_tokens': duration or 600,
                'time_elapsed': end_time - start_time,
                'interaction_type': 'ASR',
            }
            current_event.log_llm_metrics(metrics)
            # Return text from response
            if isinstance(response, str):
                return response
            return response.text
        except Exception as e:
            raise Exception(f"Transcription failed: {str(e)}") from e
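Review note: the sketch below illustrates one way to assemble the optional request parameters that `transcribe` forwards. `build_options` is a hypothetical helper, not part of the wrapper. It filters on `is not None`, so a temperature of `0.0` is still forwarded, whereas a plain truthiness check would drop it.

```python
from typing import Any, Dict, Optional


def build_options(file: Any,
                  model: str,
                  language: Optional[str] = None,
                  prompt: Optional[str] = None,
                  response_format: Optional[str] = None,
                  temperature: Optional[float] = None) -> Dict[str, Any]:
    """Hypothetical helper: collect only the parameters that were actually set."""
    options: Dict[str, Any] = {"file": file, "model": model}
    optional = {
        "language": language,
        "prompt": prompt,
        "response_format": response_format,
        "temperature": temperature,
    }
    # Compare against None so that falsy but valid values (0.0) are kept.
    options.update({k: v for k, v in optional.items() if v is not None})
    return options


opts = build_options("audio.mp3", "whisper-1", temperature=0.0)
```

Here `opts` keeps the `temperature` key even though its value is falsy.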
--- a/common/models/document.py
+++ b/common/models/document.py
@@ -10,24 +10,33 @@ class Catalog(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(50), nullable=False)
    description = db.Column(db.Text, nullable=True)
-    type = db.Column(db.String(50), nullable=False, default="DEFAULT_CATALOG")
+    type = db.Column(db.String(50), nullable=False, default="STANDARD_CATALOG")
    # Embedding variables
    html_tags = db.Column(ARRAY(sa.String(10)), nullable=True, default=['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li'])
    html_end_tags = db.Column(ARRAY(sa.String(10)), nullable=True, default=['p', 'li'])
    html_included_elements = db.Column(ARRAY(sa.String(50)), nullable=True)
    html_excluded_elements = db.Column(ARRAY(sa.String(50)), nullable=True)
    html_excluded_classes = db.Column(ARRAY(sa.String(200)), nullable=True)
    min_chunk_size = db.Column(db.Integer, nullable=True, default=2000)
    max_chunk_size = db.Column(db.Integer, nullable=True, default=3000)
-    # Chat variables ==> Move to Specialist?
+    # Meta Data
-    chat_RAG_temperature = db.Column(db.Float, nullable=True, default=0.3)
+    user_metadata = db.Column(JSONB, nullable=True)
-    chat_no_RAG_temperature = db.Column(db.Float, nullable=True, default=0.5)
+    system_metadata = db.Column(JSONB, nullable=True)
    configuration = db.Column(JSONB, nullable=True)
    # Versioning Information
    created_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now())
    created_by = db.Column(db.Integer, db.ForeignKey(User.id), nullable=True)
    updated_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now(), onupdate=db.func.now())
    updated_by = db.Column(db.Integer, db.ForeignKey(User.id))
 class Processor(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(50), nullable=False)
    description = db.Column(db.Text, nullable=True)
    catalog_id = db.Column(db.Integer, db.ForeignKey('catalog.id'), nullable=True)
    type = db.Column(db.String(50), nullable=False)
    sub_file_type = db.Column(db.String(50), nullable=True)
    # Tuning enablers
-    embed_tuning = db.Column(db.Boolean, nullable=True, default=False)
+    tuning = db.Column(db.Boolean, nullable=True, default=False)
    # Meta Data
    user_metadata = db.Column(JSONB, nullable=True)
@@ -46,13 +55,14 @@ class Retriever(db.Model):
    name = db.Column(db.String(50), nullable=False)
    description = db.Column(db.Text, nullable=True)
    catalog_id = db.Column(db.Integer, db.ForeignKey('catalog.id'), nullable=True)
-    type = db.Column(db.String(50), nullable=False, default="DEFAULT_RAG")
+    type = db.Column(db.String(50), nullable=False, default="STANDARD_RAG")
    tuning = db.Column(db.Boolean, nullable=True, default=False)
    # Meta Data
    user_metadata = db.Column(JSONB, nullable=True)
    system_metadata = db.Column(JSONB, nullable=True)
    configuration = db.Column(JSONB, nullable=True)
    arguments = db.Column(JSONB, nullable=True)
    # Versioning Information
    created_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now())
@@ -89,6 +99,7 @@ class DocumentVersion(db.Model):
    bucket_name = db.Column(db.String(255), nullable=True)
    object_name = db.Column(db.String(200), nullable=True)
    file_type = db.Column(db.String(20), nullable=True)
    sub_file_type = db.Column(db.String(50), nullable=True)
    file_size = db.Column(db.Float, nullable=True)
    language = db.Column(db.String(2), nullable=False)
    user_context = db.Column(db.Text, nullable=True)
--- a/common/models/interaction.py
+++ b/common/models/interaction.py
@@ -1,6 +1,8 @@
 from sqlalchemy.dialects.postgresql import JSONB
 from ..extensions import db
 from .user import User, Tenant
-from .document import Embedding
+from .document import Embedding, Retriever
 class ChatSession(db.Model):
@@ -18,14 +20,32 @@ class ChatSession(db.Model):
        return f"<ChatSession {self.id} by {self.user_id}>"
 class Specialist(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(50), nullable=False)
    description = db.Column(db.Text, nullable=True)
    type = db.Column(db.String(50), nullable=False, default="STANDARD_RAG")
    tuning = db.Column(db.Boolean, nullable=True, default=False)
    configuration = db.Column(JSONB, nullable=True)
    arguments = db.Column(JSONB, nullable=True)
    # Relationship to retrievers through the association table
    retrievers = db.relationship('SpecialistRetriever', backref='specialist', lazy=True,
                                 cascade="all, delete-orphan")
    # Versioning Information
    created_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now())
    created_by = db.Column(db.Integer, db.ForeignKey(User.id), nullable=True)
    updated_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now(), onupdate=db.func.now())
    updated_by = db.Column(db.Integer, db.ForeignKey(User.id))
 class Interaction(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    chat_session_id = db.Column(db.Integer, db.ForeignKey(ChatSession.id), nullable=False)
-    question = db.Column(db.Text, nullable=False)
+    specialist_id = db.Column(db.Integer, db.ForeignKey(Specialist.id), nullable=True)
-    detailed_question = db.Column(db.Text, nullable=True)
+    specialist_arguments = db.Column(JSONB, nullable=True)
-    answer = db.Column(db.Text, nullable=True)
+    specialist_results = db.Column(JSONB, nullable=True)
    algorithm_used = db.Column(db.String(20), nullable=True)
    language = db.Column(db.String(2), nullable=False)
    timezone = db.Column(db.String(30), nullable=True)
    appreciation = db.Column(db.Integer, nullable=True)
@@ -44,3 +64,10 @@ class Interaction(db.Model):
 class InteractionEmbedding(db.Model):
    interaction_id = db.Column(db.Integer, db.ForeignKey(Interaction.id, ondelete='CASCADE'), primary_key=True)
    embedding_id = db.Column(db.Integer, db.ForeignKey(Embedding.id, ondelete='CASCADE'), primary_key=True)
 class SpecialistRetriever(db.Model):
    specialist_id = db.Column(db.Integer, db.ForeignKey(Specialist.id, ondelete='CASCADE'), primary_key=True)
    retriever_id = db.Column(db.Integer, db.ForeignKey(Retriever.id, ondelete='CASCADE'), primary_key=True)
    retriever = db.relationship("Retriever", backref="specialist_retrievers")
--- a/common/models/user.py
+++ b/common/models/user.py
@@ -34,36 +34,8 @@ class Tenant(db.Model):
    embedding_model = db.Column(db.String(50), nullable=True)
    llm_model = db.Column(db.String(50), nullable=True)
    # # Embedding variables ==> To be removed once all migrations (dev + prod) have been done
    # html_tags = db.Column(ARRAY(sa.String(10)), nullable=True, default=['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li'])
    # html_end_tags = db.Column(ARRAY(sa.String(10)), nullable=True, default=['p', 'li'])
    # html_included_elements = db.Column(ARRAY(sa.String(50)), nullable=True)
    # html_excluded_elements = db.Column(ARRAY(sa.String(50)), nullable=True)
    # html_excluded_classes = db.Column(ARRAY(sa.String(200)), nullable=True)
    #
    # min_chunk_size = db.Column(db.Integer, nullable=True, default=2000)
    # max_chunk_size = db.Column(db.Integer, nullable=True, default=3000)
    #
    # # Embedding search variables
    # es_k = db.Column(db.Integer, nullable=True, default=5)
    # es_similarity_threshold = db.Column(db.Float, nullable=True, default=0.7)
    #
    # # Chat variables
    # chat_RAG_temperature = db.Column(db.Float, nullable=True, default=0.3)
    # chat_no_RAG_temperature = db.Column(db.Float, nullable=True, default=0.5)
    fallback_algorithms = db.Column(ARRAY(sa.String(50)), nullable=True)
    # Licensing Information
    encrypted_chat_api_key = db.Column(db.String(500), nullable=True)
    encrypted_api_key = db.Column(db.String(500), nullable=True)
    # # Tuning enablers
    # embed_tuning = db.Column(db.Boolean, nullable=True, default=False)
    # rag_tuning = db.Column(db.Boolean, nullable=True, default=False)
    # Entitlements
    currency = db.Column(db.String(20), nullable=True)
    usage_email = db.Column(db.String(255), nullable=True)
    storage_dirty = db.Column(db.Boolean, nullable=True, default=False)
    # Relations
@@ -96,9 +68,7 @@ class Tenant(db.Model):
            'allowed_languages': self.allowed_languages,
            'embedding_model': self.embedding_model,
            'llm_model': self.llm_model,
            'fallback_algorithms': self.fallback_algorithms,
            'currency': self.currency,
            'usage_email': self.usage_email,
        }
@@ -140,6 +110,8 @@ class User(db.Model, UserMixin):
    fs_uniquifier = db.Column(db.String(255), unique=True, nullable=False)
    confirmed_at = db.Column(db.DateTime, nullable=True)
    valid_to = db.Column(db.Date, nullable=True)
    is_primary_contact = db.Column(db.Boolean, nullable=True, default=False)
    is_financial_contact = db.Column(db.Boolean, nullable=True, default=False)
    # Security Trackable Information
    last_login_at = db.Column(db.DateTime, nullable=True)
@@ -180,3 +152,29 @@ class TenantDomain(db.Model):
    def __repr__(self):
        return f"<TenantDomain {self.id}: {self.domain}>"
 class TenantProject(db.Model):
    __bind_key__ = 'public'
    __table_args__ = {'schema': 'public'}
    id = db.Column(db.Integer, primary_key=True)
    tenant_id = db.Column(db.Integer, db.ForeignKey('public.tenant.id'), nullable=False)
    name = db.Column(db.String(50), nullable=False)
    description = db.Column(db.Text, nullable=True)
    services = db.Column(ARRAY(sa.String(50)), nullable=False)
    encrypted_api_key = db.Column(db.String(500), nullable=True)
    visual_api_key = db.Column(db.String(20), nullable=True)
    active = db.Column(db.Boolean, nullable=False, default=True)
    responsible_email = db.Column(db.String(255), nullable=True)
    # Versioning Information
    created_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now())
    created_by = db.Column(db.Integer, db.ForeignKey('public.user.id'), nullable=True)
    updated_at = db.Column(db.DateTime, nullable=False, server_default=db.func.now(), onupdate=db.func.now())
    updated_by = db.Column(db.Integer, db.ForeignKey('public.user.id'))
    # Relations
    tenant = db.relationship('Tenant', backref='projects')
    def __repr__(self):
        return f"<TenantProject {self.id}: {self.name}>"
--- a/common/utils/business_event.py
+++ b/common/utils/business_event.py
@@ -4,7 +4,6 @@ from contextlib import contextmanager
 from datetime import datetime
 from typing import Dict, Any, Optional
 from datetime import datetime as dt, timezone as tz
 from portkey_ai import Portkey, Config
 import logging
 from .business_event_context import BusinessEventContext
--- a/common/langchain/retrievers/__init__.py
+++ b/common/langchain/retrievers/__init__.py
--- a/common/utils/cache/base.py
+++ b/common/utils/cache/base.py
@@ -0,0 +1,89 @@
 # common/utils/cache/base.py
 from typing import Any, Dict, List, Optional, TypeVar, Generic, Type
 from dataclasses import dataclass
 from flask import Flask
 from dogpile.cache import CacheRegion
 T = TypeVar('T')
@dataclass
 class CacheKey:
    """Represents a cache key with multiple components"""
    components: Dict[str, Any]
    def __str__(self) -> str:
        return ":".join(f"{k}={v}" for k, v in sorted(self.components.items()))
 class CacheInvalidationManager:
    """Manages cache invalidation subscriptions"""
    def __init__(self):
        self._subscribers = {}
    def subscribe(self, model: str, handler: 'CacheHandler', key_fields: List[str]):
        if model not in self._subscribers:
            self._subscribers[model] = []
        self._subscribers[model].append((handler, key_fields))
    def notify_change(self, model: str, **identifiers):
        if model in self._subscribers:
            for handler, key_fields in self._subscribers[model]:
                if all(field in identifiers for field in key_fields):
                    handler.invalidate_by_model(model, **identifiers)
 class CacheHandler(Generic[T]):
    """Base cache handler implementation"""
    def __init__(self, region: CacheRegion, prefix: str):
        self.region = region
        self.prefix = prefix
        self._key_components = []
    def configure_keys(self, *components: str):
        self._key_components = components
        return self
    def subscribe_to_model(self, model: str, key_fields: List[str]):
        invalidation_manager.subscribe(model, self, key_fields)
        return self
    def generate_key(self, **identifiers) -> str:
        missing = set(self._key_components) - set(identifiers.keys())
        if missing:
            raise ValueError(f"Missing key components: {missing}")
        key = CacheKey({k: identifiers[k] for k in self._key_components})
        return f"{self.prefix}:{str(key)}"
    def get(self, creator_func, **identifiers) -> T:
        cache_key = self.generate_key(**identifiers)
        def creator():
            instance = creator_func(**identifiers)
            return self.to_cache_data(instance)
        cached_data = self.region.get_or_create(
            cache_key,
            creator,
            should_cache_fn=self.should_cache
        )
        return self.from_cache_data(cached_data, **identifiers)
    def invalidate(self, **identifiers):
        cache_key = self.generate_key(**identifiers)
        self.region.delete(cache_key)
    def invalidate_by_model(self, model: str, **identifiers):
        try:
            self.invalidate(**identifiers)
        except ValueError:
            pass
 # Create global invalidation manager
 invalidation_manager = CacheInvalidationManager()
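Review note: a minimal, framework-free sketch of the key scheme used by `CacheKey` and `CacheHandler.generate_key`, with no dogpile region involved. The free function `generate_key`, the prefix `specialist` and the identifier names are assumptions chosen for illustration. It shows that components are sorted by name, that identifiers outside the configured components are ignored, and that a missing component raises `ValueError`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Sequence


@dataclass
class CacheKey:
    """Same shape as the CacheKey above: named components joined in sorted order."""
    components: Dict[str, Any]

    def __str__(self) -> str:
        return ":".join(f"{k}={v}" for k, v in sorted(self.components.items()))


def generate_key(prefix: str, key_components: Sequence[str], **identifiers: Any) -> str:
    """Stand-alone version of the key generation logic (illustrative only)."""
    missing = set(key_components) - set(identifiers)
    if missing:
        raise ValueError(f"Missing key components: {missing}")
    key = CacheKey({k: identifiers[k] for k in key_components})
    return f"{prefix}:{key}"


example = generate_key(
    "specialist",
    ["tenant_id", "specialist_id"],
    tenant_id=7,
    specialist_id=42,
    extra="ignored",  # not a configured component, so it does not affect the key
)
```

Because the components are sorted, the resulting key does not depend on the order in which identifiers are passed.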
--- a/common/utils/cache/eveai_cache_manager.py
+++ b/common/utils/cache/eveai_cache_manager.py
@@ -0,0 +1,39 @@
 from typing import Type
 from flask import Flask
 from common.utils.cache.base import CacheHandler
 class EveAICacheManager:
    """Cache manager with registration capabilities"""
    def __init__(self):
        self._regions = {}
        self._handlers = {}
    def init_app(self, app: Flask):
        """Initialize cache regions"""
        from common.utils.cache.regions import create_cache_regions
        self._regions = create_cache_regions(app)
        # Store regions in instance
        for region_name, region in self._regions.items():
            setattr(self, f"{region_name}_region", region)
        # Initialize all registered handlers with their regions
        for handler_class, region_name in self._handlers.items():
            region = self._regions[region_name]
            handler_instance = handler_class(region)
            handler_name = getattr(handler_class, 'handler_name', None)
            if handler_name:
                app.logger.debug(f"{handler_name} is registered")
                setattr(self, handler_name, handler_instance)
        app.logger.info('Cache regions initialized: ' + ', '.join(self._regions.keys()))
    def register_handler(self, handler_class: Type[CacheHandler], region: str):
        """Register a cache handler class with its region"""
        if not hasattr(handler_class, 'handler_name'):
            raise ValueError("Cache handler must define handler_name class attribute")
        self._handlers[handler_class] = region
--- a/common/utils/cache/regions.py
+++ b/common/utils/cache/regions.py
@@ -0,0 +1,65 @@
 # common/utils/cache/regions.py
 from dogpile.cache import make_region
 from urllib.parse import urlparse
 import os
 def get_redis_config(app):
    """
    Create Redis configuration dict based on app config
    Handles both authenticated and non-authenticated setups
    """
    # Parse the REDIS_BASE_URI to get all components
    redis_uri = urlparse(app.config['REDIS_BASE_URI'])
    config = {
        'host': redis_uri.hostname,
        'port': int(redis_uri.port or 6379),
        'db': 4,  # Keep this for later use
        'redis_expiration_time': 3600,
        'distributed_lock': True,
        'thread_local_lock': False,
    }
    # Add authentication if provided
    if redis_uri.username and redis_uri.password:
        config.update({
            'username': redis_uri.username,
            'password': redis_uri.password
        })
    return config
 def create_cache_regions(app):
    """Initialize all cache regions with app config"""
    redis_config = get_redis_config(app)
    regions = {}
    # Region for model-related caching (ModelVariables etc)
    model_region = make_region(name='eveai_model').configure(
        'dogpile.cache.redis',
        arguments=redis_config,
        replace_existing_backend=True
    )
    regions['eveai_model'] = model_region
    # Region for eveai_chat_workers components (Specialists, Retrievers, ...)
    eveai_chat_workers_region = make_region(name='eveai_chat_workers').configure(
        'dogpile.cache.redis',
        arguments=redis_config,  # arguments={**redis_config, 'db': 4},  # Different DB
        replace_existing_backend=True
    )
    regions['eveai_chat_workers'] = eveai_chat_workers_region
    # Region for eveai_workers components (Processors, ...)
    eveai_workers_region = make_region(name='eveai_workers').configure(
        'dogpile.cache.redis',
        arguments=redis_config,  # Same config for now
        replace_existing_backend=True
    )
    regions['eveai_workers'] = eveai_workers_region
    return regions
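Review note: the sketch below mirrors the URI handling in `get_redis_config` using only the standard library. `redis_connection_parts` and the example URIs are hypothetical. It applies the same condition as above (both username and password present), which means a password-only URI such as `redis://:secret@host` yields no credentials, because `urlparse` reports an empty username for it.

```python
from urllib.parse import urlparse


def redis_connection_parts(uri: str) -> dict:
    """Illustrative helper: extract host, port and optional credentials from a Redis URI."""
    parsed = urlparse(uri)
    parts = {
        "host": parsed.hostname,
        "port": int(parsed.port or 6379),  # fall back to the default Redis port
    }
    # Same condition as in get_redis_config: both values must be non-empty.
    if parsed.username and parsed.password:
        parts.update({"username": parsed.username, "password": parsed.password})
    return parts


plain = redis_connection_parts("redis://cache.internal")
with_auth = redis_connection_parts("redis://svc:secret@cache.internal:6380/0")
password_only = redis_connection_parts("redis://:secret@cache.internal:6379")
```

If password-only deployments need to be supported, the condition would have to test the password on its own; whether that is required here is not clear from this diff.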
--- a/common/utils/celery_utils.py
+++ b/common/utils/celery_utils.py
@@ -8,8 +8,6 @@ celery_app = Celery()
 def init_celery(celery, app, is_beat=False):
    celery_app.main = app.name
    app.logger.debug(f'CELERY_BROKER_URL: {app.config["CELERY_BROKER_URL"]}')
    app.logger.debug(f'CELERY_RESULT_BACKEND: {app.config["CELERY_RESULT_BACKEND"]}')
    celery_config = {
        'broker_url': app.config.get('CELERY_BROKER_URL', 'redis://localhost:6379/0'),
--- a/common/utils/config_field_types.py
+++ b/common/utils/config_field_types.py
@@ -0,0 +1,662 @@
 from typing import Optional, List, Union, Dict, Any, Pattern
 from pydantic import BaseModel, field_validator, model_validator
 from typing_extensions import Annotated
 import re
 from datetime import datetime
 import json
 from textwrap import dedent
 import yaml
 from dataclasses import dataclass
 class TaggingField(BaseModel):
    """Represents a single tagging field configuration"""
    type: str
    required: bool = False
    description: Optional[str] = None
    allowed_values: Optional[List[Any]] = None  # for enum type
    min_value: Optional[Union[int, float]] = None  # for numeric types
    max_value: Optional[Union[int, float]] = None  # for numeric types
    @field_validator('type', mode='before')
    @classmethod
    def validate_type(cls, v: str) -> str:
        valid_types = ['string', 'integer', 'float', 'date', 'enum']
        if v not in valid_types:
            raise ValueError(f'type must be one of {valid_types}')
        return v
    @model_validator(mode='after')
    def validate_field_constraints(self) -> 'TaggingField':
        # Validate enum constraints
        if self.type == 'enum':
            if not self.allowed_values:
                raise ValueError('allowed_values must be provided for enum type')
        elif self.allowed_values is not None:
            raise ValueError('allowed_values only valid for enum type')
        # Validate numeric constraints
        if self.type not in ('integer', 'float'):
            if self.min_value is not None or self.max_value is not None:
                raise ValueError('min_value/max_value only valid for numeric types')
        else:
            if self.min_value is not None and self.max_value is not None and self.min_value >= self.max_value:
                raise ValueError('min_value must be less than max_value')
        return self
 class TaggingFields(BaseModel):
    """Represents a collection of tagging fields, mapped by their names"""
    fields: Dict[str, TaggingField]
    @classmethod
    def from_dict(cls, data: Dict[str, Dict[str, Any]]) -> 'TaggingFields':
        return cls(fields={
            field_name: TaggingField(**field_config)
            for field_name, field_config in data.items()
        })
    def to_dict(self) -> Dict[str, Dict[str, Any]]:
        return {
            field_name: field.model_dump(exclude_none=True)
            for field_name, field in self.fields.items()
        }
 class ChunkingPatternsField(BaseModel):
    """Represents a set of chunking patterns"""
    patterns: List[str]
    @field_validator('patterns')
    @classmethod
    def validate_patterns(cls, patterns: List[str]) -> List[str]:
        for pattern in patterns:
            try:
                re.compile(pattern)
            except re.error as e:
                raise ValueError(f"Invalid regex pattern '{pattern}': {str(e)}")
        return patterns
 class ArgumentConstraint(BaseModel):
    """Base class for all argument constraints"""
    description: Optional[str] = None
    error_message: Optional[str] = None
 class NumericConstraint(ArgumentConstraint):
    """Constraints for numeric values (int/float)"""
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    include_min: bool = True  # True for >= min_value, False for > min_value
    include_max: bool = True  # True for <= max_value, False for < max_value
    @model_validator(mode='after')
    def validate_ranges(self) -> 'NumericConstraint':
        if self.min_value is not None and self.max_value is not None:
            if self.min_value > self.max_value:
                raise ValueError("min_value must be less than or equal to max_value")
        return self
    def validate(self, value: Union[int, float]) -> bool:
        if self.min_value is not None:
            if self.include_min and value < self.min_value:
                return False
            if not self.include_min and value <= self.min_value:
                return False
        if self.max_value is not None:
            if self.include_max and value > self.max_value:
                return False
            if not self.include_max and value >= self.max_value:
                return False
        return True
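Review note: a pydantic-free sketch of the inclusive/exclusive bound check performed by `NumericConstraint.validate`. The `Bounds` dataclass and its `accepts` method are hypothetical names used only for this illustration; the comparison logic is intended to match the method above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Bounds:
    """Illustrative stand-in for NumericConstraint, without pydantic."""
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    include_min: bool = True  # True means value >= min_value is accepted
    include_max: bool = True  # True means value <= max_value is accepted

    def accepts(self, value: Union[int, float]) -> bool:
        if self.min_value is not None:
            if value < self.min_value:
                return False
            if not self.include_min and value == self.min_value:
                return False
        if self.max_value is not None:
            if value > self.max_value:
                return False
            if not self.include_max and value == self.max_value:
                return False
        return True


closed = Bounds(min_value=0.0, max_value=1.0)
half_open = Bounds(min_value=0.0, max_value=1.0, include_min=False)
```

With these two instances, `0.0` is accepted by `closed` but rejected by `half_open`, which is the only difference `include_min=False` makes.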
 class StringConstraint(ArgumentConstraint):
    """Constraints for string values"""
    min_length: Optional[int] = None
    max_length: Optional[int] = None
    patterns: Optional[List[str]] = None  # List of regex patterns to match
    pattern_match_all: bool = False  # If True, string must match all patterns
    forbidden_patterns: Optional[List[str]] = None  # List of regex patterns that must not match
    allow_empty: bool = False
    @field_validator('patterns', 'forbidden_patterns')
    @classmethod
    def validate_patterns(cls, v: Optional[List[str]]) -> Optional[List[str]]:
        if v is not None:
            # Validate each pattern compiles
            for pattern in v:
                try:
                    re.compile(pattern)
                except re.error as e:
                    raise ValueError(f"Invalid regex pattern '{pattern}': {str(e)}")
        return v
    def validate(self, value: str) -> bool:
        if not self.allow_empty and not value:
            return False
        if self.min_length is not None and len(value) < self.min_length:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        if self.patterns:
            matches = [bool(re.search(pattern, value)) for pattern in self.patterns]
            if self.pattern_match_all and not all(matches):
                return False
            if not self.pattern_match_all and not any(matches):
                return False
        if self.forbidden_patterns:
            for pattern in self.forbidden_patterns:
                if re.search(pattern, value):
                    return False
        return True
 class DateConstraint(ArgumentConstraint):
    """Constraints for date values"""
    min_date: Optional[datetime] = None
    max_date: Optional[datetime] = None
    include_min: bool = True
    include_max: bool = True
    allowed_formats: Optional[List[str]] = None  # List of allowed date formats
    @model_validator(mode='after')
    def validate_ranges(self) -> 'DateConstraint':
        if self.min_date and self.max_date and self.min_date > self.max_date:
            raise ValueError("min_date must be less than or equal to max_date")
        return self
    def validate(self, value: datetime) -> bool:
        if self.min_date is not None:
            if self.include_min and value < self.min_date:
                return False
            if not self.include_min and value <= self.min_date:
                return False
        if self.max_date is not None:
            if self.include_max and value > self.max_date:
                return False
            if not self.include_max and value >= self.max_date:
                return False
        return True
 class EnumConstraint(ArgumentConstraint):
    """Constraints for enum values"""
    allowed_values: List[Any]
    case_sensitive: bool = True  # For string enums
    allow_multiple: bool = False  # If True, value can be a list of allowed values
    min_selections: Optional[int] = None  # When allow_multiple is True
    max_selections: Optional[int] = None  # When allow_multiple is True
    @model_validator(mode='after')
    def validate_selections(self) -> 'EnumConstraint':
        if self.allow_multiple:
            if self.min_selections is not None and self.max_selections is not None:
                if self.min_selections > self.max_selections:
                    raise ValueError("min_selections must be less than or equal to max_selections")
                if self.max_selections > len(self.allowed_values):
                    raise ValueError("max_selections cannot be greater than number of allowed values")
        return self
    def validate(self, value: Union[Any, List[Any]]) -> bool:
        if self.allow_multiple:
            if not isinstance(value, list):
                return False
            if self.min_selections is not None and len(value) < self.min_selections:
                return False
            if self.max_selections is not None and len(value) > self.max_selections:
                return False
            for v in value:
                if not self._validate_single_value(v):
                    return False
        else:
            return self._validate_single_value(value)
        return True
    def _validate_single_value(self, value: Any) -> bool:
        if isinstance(value, str) and not self.case_sensitive:
            return any(str(value).lower() == str(v).lower() for v in self.allowed_values)
        return value in self.allowed_values
 class ArgumentDefinition(BaseModel):
    """Defines an argument with its type and constraints"""
    name: str
    type: str
    description: Optional[str] = None
    required: bool = False
    default: Optional[Any] = None
    constraints: Optional[Union[NumericConstraint, StringConstraint, DateConstraint, EnumConstraint]] = None
    @field_validator('type')
    @classmethod
    def validate_type(cls, v: str) -> str:
        valid_types = ['string', 'integer', 'float', 'date', 'enum']
        if v not in valid_types:
            raise ValueError(f'type must be one of {valid_types}')
        return v
    @model_validator(mode='after')
    def validate_constraints(self) -> 'ArgumentDefinition':
        if self.constraints:
            expected_constraint_types = {
                'string': StringConstraint,
                'integer': NumericConstraint,
                'float': NumericConstraint,
                'date': DateConstraint,
                'enum': EnumConstraint
            }
            expected_type = expected_constraint_types.get(self.type)
            if not isinstance(self.constraints, expected_type):
                raise ValueError(f'Constraints for type {self.type} must be of type {expected_type.__name__}')
            if self.default is not None:
                if not self.constraints.validate(self.default):
                    raise ValueError(f'Default value does not satisfy constraints for {self.name}')
        return self
 class ArgumentDefinitions(BaseModel):
    """Collection of argument definitions"""
    arguments: Dict[str, ArgumentDefinition]
    @classmethod
    def from_dict(cls, data: Dict[str, Dict[str, Any]]) -> 'ArgumentDefinitions':
        return cls(arguments={
            arg_name: ArgumentDefinition(**arg_config)
            for arg_name, arg_config in data.items()
        })
    def to_dict(self) -> Dict[str, Dict[str, Any]]:
        return {
            arg_name: arg.model_dump(exclude_none=True)
            for arg_name, arg in self.arguments.items()
        }
    def validate_argument_values(self, values: Dict[str, Any]) -> Dict[str, str]:
        """
        Validate a set of argument values against their definitions
        Returns a dictionary of error messages for invalid arguments
        """
        errors = {}
        # Check for required arguments
        for name, arg_def in self.arguments.items():
            if arg_def.required and name not in values:
                errors[name] = "Required argument missing"
                continue
            if name in values:
                value = values[name]
                # Validate type
                try:
                    if arg_def.type == 'integer':
                        value = int(value)
                    elif arg_def.type == 'float':
                        value = float(value)
                    elif arg_def.type == 'date' and isinstance(value, str):
                        if arg_def.constraints and arg_def.constraints.allowed_formats:
                            for fmt in arg_def.constraints.allowed_formats:
                                try:
                                    value = datetime.strptime(value, fmt)
                                    break
                                except ValueError:
                                    continue
                            else:
                                errors[name] = (
                                    f"Invalid date format. Allowed formats: {arg_def.constraints.allowed_formats}"
                                )
                                continue
                except (ValueError, TypeError):
                    errors[name] = f"Invalid type. Expected {arg_def.type}"
                    continue
                # Validate constraints
                if arg_def.constraints and not arg_def.constraints.validate(value):
                    errors[name] = arg_def.constraints.error_message or "Value does not satisfy constraints"
        return errors
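Review note: the date branch of `validate_argument_values` tries each allowed format in turn and reports an error only when none of them parses. The sketch below isolates that idea with the standard library. `parse_date` is a hypothetical helper, and returning `None` on failure is a choice made for this illustration rather than the behavior of the method above, which records an error message instead.

```python
from datetime import datetime
from typing import List, Optional


def parse_date(value: str, allowed_formats: List[str]) -> Optional[datetime]:
    """Return the first successful parse, or None if no format matches."""
    for fmt in allowed_formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    return None


formats = ["%d/%m/%Y", "%Y-%m-%d"]
parsed = parse_date("2024-12-13", formats)
rejected = parse_date("13.12.2024", formats)
```

The order of `allowed_formats` matters when a string could match more than one format, since the first match wins.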
@dataclass
 class DocumentationFormat:
    """Constants for documentation formats"""
    MARKDOWN = "markdown"
    JSON = "json"
    YAML = "yaml"
@dataclass
 class DocumentationVersion:
    """Constants for documentation versions"""
    BASIC = "basic"  # Original documentation without retriever info
    EXTENDED = "extended"  # Including retriever documentation
 def _generate_argument_constraints(field_config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Generate possible argument constraints based on field type"""
    constraints = []
    base_constraint = {
        "description": f"Constraint for {field_config.get('description', 'field')}",
        "error_message": "Optional custom error message"
    }
    if field_config["type"] in ("integer", "float"):
        constraints.append({
            **base_constraint,
            "type": "NumericConstraint",
            "possible_constraints": {
                "min_value": "number",
                "max_value": "number",
                "include_min": "boolean",
                "include_max": "boolean"
            },
            "example": {
                "min_value": field_config.get("min_value", 0),
                "max_value": field_config.get("max_value", 100),
                "include_min": True,
                "include_max": True
            }
        })
    elif field_config["type"] == "string":
        constraints.append({
            **base_constraint,
            "type": "StringConstraint",
            "possible_constraints": {
                "min_length": "integer",
                "max_length": "integer",
                "patterns": "list[str]",
                "pattern_match_all": "boolean",
                "forbidden_patterns": "list[str]",
                "allow_empty": "boolean"
            },
            "example": {
                "min_length": 1,
                "max_length": 100,
                "patterns": ["^[A-Za-z0-9]+$"],
                "pattern_match_all": False,
                "forbidden_patterns": ["^test_", "_temp$"],
                "allow_empty": False
            }
        })
    elif field_config["type"] == "enum":
        constraints.append({
            **base_constraint,
            "type": "EnumConstraint",
            "possible_constraints": {
                "allowed_values": f"list[{field_config.get('allowed_values', ['value1', 'value2'])}]",
                "case_sensitive": "boolean",
                "allow_multiple": "boolean",
                "min_selections": "integer",
                "max_selections": "integer"
            },
            "example": {
                "allowed_values": field_config.get("allowed_values", ["value1", "value2"]),
                "case_sensitive": True,
                "allow_multiple": True,
                "min_selections": 1,
                "max_selections": 2
            }
        })
    elif field_config["type"] == "date":
        constraints.append({
            **base_constraint,
            "type": "DateConstraint",
            "possible_constraints": {
                "min_date": "datetime",
                "max_date": "datetime",
                "include_min": "boolean",
                "include_max": "boolean",
                "allowed_formats": "list[str]"
            },
            "example": {
                "min_date": "2024-01-01T00:00:00",
                "max_date": "2024-12-31T23:59:59",
                "include_min": True,
                "include_max": True,
                "allowed_formats": ["%Y-%m-%d", "%Y/%m/%d"]
            }
        })
    return constraints
 def generate_field_documentation(
        tagging_fields: Dict[str, Any],
        format: str = "markdown",
        version: str = "basic"
 ) -> str:
    """
    Generate documentation for tagging fields configuration.
    Args:
        tagging_fields: Dictionary containing tagging fields configuration
        format: Output format ("markdown", "json", or "yaml")
        version: Documentation version ("basic" or "extended")
    Returns:
        str: Formatted documentation
    """
    if version not in [DocumentationVersion.BASIC, DocumentationVersion.EXTENDED]:
        raise ValueError(f"Unsupported documentation version: {version}")
    # Normalize fields configuration
    normalized_fields = {}
    for field_name, field_config in tagging_fields.items():
        field_doc = {
            "name": field_name,
            "type": field_config["type"],
            "required": field_config.get("required", False),
            "description": field_config.get("description", "No description provided"),
            "constraints": []
        }
        # Only include possible arguments in extended version
        if version == DocumentationVersion.EXTENDED:
            field_doc["possible_arguments"] = _generate_argument_constraints(field_config)
        # Add type-specific constraints
        if field_config["type"] == "integer" or field_config["type"] == "float":
            if "min_value" in field_config:
                field_doc["constraints"].append(
                    f"Minimum value: {field_config['min_value']}")
            if "max_value" in field_config:
                field_doc["constraints"].append(
                    f"Maximum value: {field_config['max_value']}")
        elif field_config["type"] == "string":
            if "min_length" in field_config:
                field_doc["constraints"].append(
                    f"Minimum length: {field_config['min_length']}")
            if "max_length" in field_config:
                field_doc["constraints"].append(
                    f"Maximum length: {field_config['max_length']}")
            if "patterns" in field_config:
                field_doc["constraints"].append(
                    f"Must match patterns: {', '.join(field_config['patterns'])}")
        elif field_config["type"] == "enum":
            if "allowed_values" in field_config:
                field_doc["constraints"].append(
                    f"Allowed values: {', '.join(str(v) for v in field_config['allowed_values'])}")
        elif field_config["type"] == "date":
            if "min_date" in field_config:
                field_doc["constraints"].append(
                    f"Minimum date: {field_config['min_date']}")
            if "max_date" in field_config:
                field_doc["constraints"].append(
                    f"Maximum date: {field_config['max_date']}")
            if "allowed_formats" in field_config:
                field_doc["constraints"].append(
                    f"Allowed formats: {', '.join(field_config['allowed_formats'])}")
        normalized_fields[field_name] = field_doc
    # Generate documentation in requested format
    if format == DocumentationFormat.MARKDOWN:
        return _generate_markdown_docs(normalized_fields, version)
    elif format == DocumentationFormat.JSON:
        return _generate_json_docs(normalized_fields, version)
    elif format == DocumentationFormat.YAML:
        return _generate_yaml_docs(normalized_fields, version)
    else:
        raise ValueError(f"Unsupported documentation format: {format}")
 def _generate_markdown_docs(fields: Dict[str, Any], version: str) -> str:
    """Generate markdown documentation"""
    docs = ["# Tagging Fields Documentation\n"]
    # Add overview table
    docs.append("## Fields Overview\n")
    docs.append("| Field Name | Type | Required | Description |")
    docs.append("|------------|------|----------|-------------|")
    for field_name, field in fields.items():
        docs.append(
            f"| {field_name} | {field['type']} | "
            f"{'Yes' if field['required'] else 'No'} | {field['description']} |"
        )
    # Add detailed field specifications
    docs.append("\n## Detailed Field Specifications\n")
    for field_name, field in fields.items():
        docs.append(f"### {field_name}\n")
        docs.append(f"**Type:** {field['type']}")
        docs.append(f"**Required:** {'Yes' if field['required'] else 'No'}")
        docs.append(f"**Description:** {field['description']}\n")
        if field["constraints"]:
            docs.append("**Field Constraints:**")
            for constraint in field["constraints"]:
                docs.append(f"- {constraint}")
            docs.append("")
        # Add retriever argument documentation only in extended version
        if version == DocumentationVersion.EXTENDED and "possible_arguments" in field:
            docs.append("**Possible Retriever Arguments:**")
            for arg_constraint in field["possible_arguments"]:
                docs.append(f"\n*{arg_constraint['type']}*")
                docs.append(f"Description: {arg_constraint['description']}")
                docs.append("\nPossible constraints:")
                for const_name, const_type in arg_constraint["possible_constraints"].items():
                    docs.append(f"- `{const_name}`: {const_type}")
                docs.append("\nExample:")
                docs.append("```json")
                docs.append(json.dumps(arg_constraint["example"], indent=2))
                docs.append("```\n")
    # Add example retriever configuration only in extended version
    if version == DocumentationVersion.EXTENDED:
        docs.append("\n## Example Retriever Configuration\n")
        docs.append("```json")
        example_config = {
            "metadata_filters": {
                field_name: field["possible_arguments"][0]["example"]
                for field_name, field in fields.items()
                if "possible_arguments" in field
            }
        }
        docs.append(json.dumps(example_config, indent=2))
        docs.append("```")
    return "\n".join(docs)
 def _generate_json_docs(fields: Dict[str, Any], version: str) -> str:
    """Generate JSON documentation"""
    doc = {
        "tagging_fields_documentation": {
            "version": version,
            "fields": fields
        }
    }
    if version == DocumentationVersion.EXTENDED:
        doc["tagging_fields_documentation"]["example_retriever_config"] = {
            "metadata_filters": {
                field_name: field["possible_arguments"][0]["example"]
                for field_name, field in fields.items()
                if "possible_arguments" in field
            }
        }
    return json.dumps(doc, indent=2)
 def _generate_yaml_docs(fields: Dict[str, Any], version: str) -> str:
    """Generate YAML documentation"""
    doc = {
        "tagging_fields_documentation": {
            "version": version,
            "fields": fields
        }
    }
    if version == DocumentationVersion.EXTENDED:
        doc["tagging_fields_documentation"]["example_retriever_config"] = {
            "metadata_filters": {
                field_name: field["possible_arguments"][0]["example"]
                for field_name, field in fields.items()
                if "possible_arguments" in field
            }
        }
    return yaml.dump(doc, sort_keys=False, default_flow_style=False)
 def patterns_to_json(text_area_content: str) -> str:
    """Convert line-based patterns to JSON"""
    text_area_content = text_area_content.strip()
    if len(text_area_content) == 0:
        return json.dumps([])
    # Split on newlines and remove empty lines
    patterns = [line.strip() for line in text_area_content.split('\n') if line.strip()]
    return json.dumps(patterns)
 def json_to_patterns(json_content: str) -> str:
    """Convert JSON patterns list to text area content"""
    try:
        patterns = json.loads(json_content)
        if not isinstance(patterns, list):
            raise ValueError("JSON must contain a list of patterns")
        # Join with newlines
        return '\n'.join(patterns)
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON format: {e}")
 def json_to_pattern_list(json_content: str) -> list:
    """Convert JSON patterns list to a Python list of pattern strings"""
    try:
        patterns = json.loads(json_content)
        if not isinstance(patterns, list):
            raise ValueError("JSON must contain a list of patterns")
        # Unescape if needed
        patterns = [pattern.replace('\\\\', '\\') for pattern in patterns]
        return patterns
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON format: {e}")
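As a rough illustration of how the text-area helpers above are meant to round-trip, the sketch below re-declares simplified copies of `patterns_to_json` and `json_to_patterns` so that it runs on its own. The sample patterns are made up for the example.

```python
import json


def patterns_to_json(text_area_content: str) -> str:
    # One pattern per line; blank lines and surrounding whitespace are dropped
    lines = [line.strip() for line in text_area_content.split("\n") if line.strip()]
    return json.dumps(lines)


def json_to_patterns(json_content: str) -> str:
    patterns = json.loads(json_content)
    if not isinstance(patterns, list):
        raise ValueError("JSON must contain a list of patterns")
    return "\n".join(patterns)


# Hypothetical text-area input: two regex patterns with a blank line and stray spaces
text = "^Chapter \\d+\n\n  ^Appendix [A-Z]$  \n"
encoded = patterns_to_json(text)
restored = json_to_patterns(encoded)

print(encoded)
print(restored)
```

The round trip normalizes whitespace and blank lines, so `restored` equals the cleaned-up input rather than the raw text-area content.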
--- a/common/utils/cors_utils.py
+++ b/common/utils/cors_utils.py
@@ -1,14 +1,14 @@
 from flask import request, current_app, session
 from flask_jwt_extended import decode_token, verify_jwt_in_request, get_jwt_identity
 from common.models.user import Tenant, TenantDomain
 def get_allowed_origins(tenant_id):
    session_key = f"allowed_origins_{tenant_id}"
    if session_key in session:
        current_app.logger.debug(f"Fetching allowed origins for tenant {tenant_id} from session")
        return session[session_key]
    current_app.logger.debug(f"Fetching allowed origins for tenant {tenant_id} from database")
    tenant_domains = TenantDomain.query.filter_by(tenant_id=int(tenant_id)).all()
    allowed_origins = [domain.domain for domain in tenant_domains]
@@ -18,51 +18,52 @@ def get_allowed_origins(tenant_id):
 def cors_after_request(response, prefix):
    current_app.logger.debug(f'CORS after request: {request.path}, prefix: {prefix}')
    current_app.logger.debug(f'request.headers: {request.headers}')
    current_app.logger.debug(f'request.args: {request.args}')
    current_app.logger.debug(f'request is json?: {request.is_json}')
    # Exclude health checks from checks
    if request.path.startswith('/healthz') or request.path.startswith('/_healthz'):
        current_app.logger.debug('Skipping CORS headers for health checks')
        response.headers.add('Access-Control-Allow-Origin', '*')
        response.headers.add('Access-Control-Allow-Headers', '*')
        response.headers.add('Access-Control-Allow-Methods', '*')
        return response
    # Handle OPTIONS preflight requests
    if request.method == 'OPTIONS':
        response.headers.add('Access-Control-Allow-Origin', '*')
        response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization,X-Tenant-ID')
        response.headers.add('Access-Control-Allow-Methods', 'GET,POST,PUT,DELETE,OPTIONS')
        response.headers.add('Access-Control-Allow-Credentials', 'true')
        return response
    tenant_id = None
    allowed_origins = []
-    # Try to get tenant_id from JSON payload
-    json_data = request.get_json(silent=True)
-    current_app.logger.debug(f'request.get_json(silent=True): {json_data}')
-
-    if json_data and 'tenant_id' in json_data:
-        tenant_id = json_data['tenant_id']
+    # Check Socket.IO connection
+    if 'socket.io' in request.path:
+        token = request.args.get('token')
+        if token:
+            try:
+                decoded = decode_token(token)
                tenant_id = decoded['sub']
            except Exception as e:
                current_app.logger.error(f'Error decoding token: {e}')
                return response
    else:
-        # Fallback to get tenant_id from query parameters or headers if JSON is not available
-        tenant_id = request.args.get('tenant_id') or request.args.get('tenantId') or request.headers.get('X-Tenant-ID')
-
-    current_app.logger.debug(f'Identified tenant_id: {tenant_id}')
+        # Regular API requests
+        try:
+            if verify_jwt_in_request(optional=True):
+                tenant_id = get_jwt_identity()
        except Exception as e:
            current_app.logger.error(f'Error verifying JWT: {e}')
            return response
    if tenant_id:
        allowed_origins = get_allowed_origins(tenant_id)
        current_app.logger.debug(f'Allowed origins for tenant {tenant_id}: {allowed_origins}')
    else:
        current_app.logger.warning('tenant_id not found in request')
        origin = request.headers.get('Origin')
-    current_app.logger.debug(f'Origin: {origin}')
+        allowed_origins = get_allowed_origins(tenant_id)
        if origin in allowed_origins:
            response.headers.add('Access-Control-Allow-Origin', origin)
            response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization')
            response.headers.add('Access-Control-Allow-Methods', 'GET,POST,PUT,DELETE,OPTIONS')
            response.headers.add('Access-Control-Allow-Credentials', 'true')
        current_app.logger.debug(f'CORS headers set for origin: {origin}')
    else:
        current_app.logger.warning(f'Origin {origin} not allowed')
    return response
--- a/common/utils/debug_utils.py
+++ b/common/utils/debug_utils.py
@@ -1,6 +1,7 @@
 from flask import request, session
 import time
 from flask_security import current_user
 import json
 def log_request_middleware(app):
@@ -36,7 +37,7 @@ def log_request_middleware(app):
    @app.before_request
    def log_session_state_before():
-        app.logger.debug(f'Session state before request: {session.items()}')
+        pass
    # @app.after_request
    # def log_response_info(response):
@@ -58,5 +59,98 @@ def log_request_middleware(app):
    @app.after_request
    def log_session_state_after(response):
        app.logger.debug(f'Session state after request: {session.items()}')
        return response
 def register_request_debugger(app):
    @app.before_request
    def debug_request_info():
        """Log consolidated request information for debugging"""
        # Skip health check endpoints
        if request.path.startswith('/_healthz') or request.path.startswith('/healthz'):
            return
        # Gather all request information in a structured way
        debug_info = {
            "basic_info": {
                "method": request.method,
                "path": request.path,
                "content_type": request.content_type,
                "content_length": request.content_length
            },
            "environment": {
                "remote_addr": request.remote_addr,
                "user_agent": str(request.user_agent)
            }
        }
        # Add headers (excluding sensitive ones)
        safe_headers = {k: v for k, v in request.headers.items()
                       if k.lower() not in ('authorization', 'cookie', 'x-api-key')}
        debug_info["headers"] = safe_headers
        # Add authentication info (presence only)
        auth_header = request.headers.get('Authorization', '')
        debug_info["auth_info"] = {
            "has_auth_header": bool(auth_header),
            "auth_type": auth_header.split(' ')[0] if auth_header else None,
            "token_length": len(auth_header.split(' ')[1]) if auth_header and len(auth_header.split(' ')) > 1 else 0,
            "header_format": 'Valid format' if auth_header.startswith('Bearer ') else 'Invalid format',
            "raw_header": auth_header[:10] + '...' if auth_header else None  # Show first 10 chars only
        }
        # Add request data based on type
        if request.is_json:
            try:
                json_data = request.get_json()
                if isinstance(json_data, dict):
                    # Remove sensitive fields from logging
                    safe_json = {k: v for k, v in json_data.items()
                               if not any(sensitive in k.lower()
                                        for sensitive in ['password', 'token', 'secret', 'key'])}
                    debug_info["request_data"] = {
                        "type": "json",
                        "content": safe_json
                    }
            except Exception as e:
                debug_info["request_data"] = {
                    "type": "json",
                    "error": str(e)
                }
        elif request.form:
            safe_form = {k: v for k, v in request.form.items()
                        if not any(sensitive in k.lower()
                                 for sensitive in ['password', 'token', 'secret', 'key'])}
            debug_info["request_data"] = {
                "type": "form",
                "content": safe_form
            }
        # Add file information if present
        if request.files:
            debug_info["files"] = {
                name: {
                    "filename": f.filename,
                    "content_type": f.content_type,
                    "content_length": f.content_length if hasattr(f, 'content_length') else None
                }
                for name, f in request.files.items()
            }
        # Add CORS information if present
        cors_headers = {
            "origin": request.headers.get('Origin'),
            "request_method": request.headers.get('Access-Control-Request-Method'),
            "request_headers": request.headers.get('Access-Control-Request-Headers')
        }
        if any(cors_headers.values()):
            debug_info["cors"] = {k: v for k, v in cors_headers.items() if v is not None}
        # Format the debug info as a pretty-printed JSON string with indentation
        formatted_debug_info = json.dumps(debug_info, indent=2, sort_keys=True)
        # Log everything in a single statement
        app.logger.debug(
            "Request Debug Information\n%s",
            formatted_debug_info
        )
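The request debugger above filters sensitive keys by substring match. A small sketch of that filter in isolation shows a side effect worth knowing about; `drop_sensitive` and the sample payload are hypothetical and exist only for this example.

```python
from typing import Any, Dict

# Same substrings as the middleware uses for its JSON and form filtering
SENSITIVE_MARKERS = ("password", "token", "secret", "key")


def drop_sensitive(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `data` without keys that contain a sensitive marker."""
    return {
        k: v for k, v in data.items()
        if not any(marker in k.lower() for marker in SENSITIVE_MARKERS)
    }


payload = {"name": "report.pdf", "API_Key": "abc123", "Password": "hunter2", "keywords": "a,b"}
safe = drop_sensitive(payload)
print(safe)
```

Because the match is on substrings, a harmless field such as `keywords` is dropped along with `API_Key`. That errs on the safe side for logging, but it can hide fields you expected to see in the debug output.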
--- a/common/utils/document_utils.py
+++ b/common/utils/document_utils.py
@@ -3,27 +3,36 @@ from datetime import datetime as dt, timezone as tz
 from sqlalchemy import desc
 from sqlalchemy.exc import SQLAlchemyError
 from werkzeug.utils import secure_filename
-from common.models.document import Document, DocumentVersion
+from common.models.document import Document, DocumentVersion, Catalog
 from common.extensions import db, minio_client
 from common.utils.celery_utils import current_celery
 from flask import current_app
 from flask_security import current_user
 import requests
-from urllib.parse import urlparse, unquote
+from urllib.parse import urlparse, unquote, urlunparse
 import os
-from .eveai_exceptions import EveAIInvalidLanguageException, EveAIDoubleURLException, EveAIUnsupportedFileType
+from .eveai_exceptions import (EveAIInvalidLanguageException, EveAIDoubleURLException, EveAIUnsupportedFileType,
                               EveAIInvalidCatalog, EveAIInvalidDocument, EveAIInvalidDocumentVersion, EveAIException)
 from ..models.user import Tenant
 def create_document_stack(api_input, file, filename, extension, tenant_id):
    # Create the Document
    catalog_id = int(api_input.get('catalog_id'))
    catalog = Catalog.query.get(catalog_id)
    if not catalog:
        raise EveAIInvalidCatalog(tenant_id, catalog_id)
    new_doc = create_document(api_input, filename, catalog_id)
    db.session.add(new_doc)
    url = api_input.get('url', '')
    if url != '':
        url = cope_with_local_url(url)
    # Create the DocumentVersion
    new_doc_vers = create_version_for_document(new_doc, tenant_id,
-                                               api_input.get('url', ''),
+                                               url,
                                               api_input.get('sub_file_type', ''),
                                               api_input.get('language', 'en'),
                                               api_input.get('user_context', ''),
                                               api_input.get('user_metadata'),
@@ -64,7 +73,8 @@ def create_document(form, filename, catalog_id):
    return new_doc
-def create_version_for_document(document, tenant_id, url, language, user_context, user_metadata, catalog_properties):
+def create_version_for_document(document, tenant_id, url, sub_file_type, language, user_context, user_metadata,
                                catalog_properties):
    new_doc_vers = DocumentVersion()
    if url != '':
        new_doc_vers.url = url
@@ -83,6 +93,9 @@ def create_version_for_document(document, tenant_id, url, language, user_context
    if catalog_properties != '' and catalog_properties is not None:
        new_doc_vers.catalog_properties = catalog_properties
    if sub_file_type != '':
        new_doc_vers.sub_file_type = sub_file_type
    new_doc_vers.document = document
    set_logging_information(new_doc_vers, dt.now(tz.utc))
@@ -163,6 +176,8 @@ def get_extension_from_content_type(content_type):
 def process_url(url, tenant_id):
    url = cope_with_local_url(url)
    response = requests.head(url, allow_redirects=True)
    content_type = response.headers.get('Content-Type', '').split(';')[0]
@@ -194,38 +209,6 @@ def process_url(url, tenant_id):
    return file_content, filename, extension
 def process_multiple_urls(urls, tenant_id, api_input):
    results = []
    for url in urls:
        try:
            file_content, filename, extension = process_url(url, tenant_id)
            url_input = api_input.copy()
            url_input.update({
                'url': url,
                'name': f"{api_input['name']}-{filename}" if api_input['name'] else filename
            })
            new_doc, new_doc_vers = create_document_stack(url_input, file_content, filename, extension, tenant_id)
            task_id = start_embedding_task(tenant_id, new_doc_vers.id)
            results.append({
                'url': url,
                'document_id': new_doc.id,
                'document_version_id': new_doc_vers.id,
                'task_id': task_id,
                'status': 'success'
            })
        except Exception as e:
            current_app.logger.error(f"Error processing URL {url}: {str(e)}")
            results.append({
                'url': url,
                'status': 'error',
                'message': str(e)
            })
    return results
 def start_embedding_task(tenant_id, doc_vers_id):
    task = current_celery.send_task('create_embeddings',
                                    args=[tenant_id, doc_vers_id,],
@@ -236,14 +219,6 @@ def start_embedding_task(tenant_id, doc_vers_id):
    return task.id
 def validate_file_type(extension):
    current_app.logger.debug(f'Validating file type {extension}')
    current_app.logger.debug(f'Supported file types: {current_app.config["SUPPORTED_FILE_TYPES"]}')
    if extension not in current_app.config['SUPPORTED_FILE_TYPES']:
        raise EveAIUnsupportedFileType(f"Filetype {extension} is currently not supported. "
                                       f"Supported filetypes: {', '.join(current_app.config['SUPPORTED_FILE_TYPES'])}")
 def get_filename_from_url(url):
    parsed_url = urlparse(url)
    path_parts = parsed_url.path.split('/')
@@ -261,10 +236,15 @@ def get_documents_list(page, per_page):
    return pagination
-def edit_document(document_id, name, valid_from, valid_to):
+def edit_document(tenant_id, document_id, name, valid_from, valid_to):
-    doc = Document.query.get_or_404(document_id)
+    doc = Document.query.get(document_id)
    if not doc:
        raise EveAIInvalidDocument(tenant_id, document_id)
    if name:
        doc.name = name
    if valid_from:
        doc.valid_from = valid_from
    if valid_to:
        doc.valid_to = valid_to
    update_logging_information(doc, dt.now(tz.utc))
@@ -277,8 +257,10 @@ def edit_document(document_id, name, valid_from, valid_to):
        return None, str(e)
-def edit_document_version(version_id, user_context, catalog_properties):
+def edit_document_version(tenant_id, version_id, user_context, catalog_properties):
-    doc_vers = DocumentVersion.query.get_or_404(version_id)
+    doc_vers = DocumentVersion.query.get(version_id)
    if not doc_vers:
        raise EveAIInvalidDocumentVersion(tenant_id, version_id)
    doc_vers.user_context = user_context
    doc_vers.catalog_properties = catalog_properties
    update_logging_information(doc_vers, dt.now(tz.utc))
@@ -293,15 +275,17 @@ def edit_document_version(version_id, user_context, catalog_properties):
 def refresh_document_with_info(doc_id, tenant_id, api_input):
-    doc = Document.query.get_or_404(doc_id)
+    doc = Document.query.get(doc_id)
    if not doc:
        raise EveAIInvalidDocument(tenant_id, doc_id)
    old_doc_vers = DocumentVersion.query.filter_by(doc_id=doc_id).order_by(desc(DocumentVersion.id)).first()
    if not old_doc_vers.url:
        return None, "This document has no URL. Only documents with a URL can be refreshed."
    new_doc_vers = create_version_for_document(
        doc, tenant_id,
        old_doc_vers.url,
        old_doc_vers.sub_file_type,
        api_input.get('language', old_doc_vers.language),
        api_input.get('user_context', old_doc_vers.user_context),
        api_input.get('user_metadata', old_doc_vers.user_metadata),
@@ -317,11 +301,12 @@ def refresh_document_with_info(doc_id, tenant_id, api_input):
        db.session.rollback()
        return None, str(e)
-    response = requests.head(old_doc_vers.url, allow_redirects=True)
+    url = cope_with_local_url(old_doc_vers.url)
    response = requests.head(url, allow_redirects=True)
    content_type = response.headers.get('Content-Type', '').split(';')[0]
    extension = get_extension_from_content_type(content_type)
-    response = requests.get(old_doc_vers.url)
+    response = requests.get(url)
    response.raise_for_status()
    file_content = response.content
@@ -357,3 +342,124 @@ def mark_tenant_storage_dirty(tenant_id):
    db.session.commit()
 def cope_with_local_url(url):
    current_app.logger.debug(f'Incoming URL: {url}')
    parsed_url = urlparse(url)
    # Check if this is an internal WordPress URL (TESTING) and rewrite it
    if parsed_url.netloc in [current_app.config['EXTERNAL_WORDPRESS_BASE_URL']]:
        parsed_url = parsed_url._replace(
            scheme=current_app.config['WORDPRESS_PROTOCOL'],
            netloc=f"{current_app.config['WORDPRESS_HOST']}:{current_app.config['WORDPRESS_PORT']}"
        )
        url = urlunparse(parsed_url)
        current_app.logger.debug(f'Translated Wordpress URL to: {url}')
    return url
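The rewrite in `cope_with_local_url` can be sketched without Flask using only `urllib.parse`. In the sketch below, `rewrite_host` and its parameters are hypothetical stand-ins for the config values. It also assumes the external setting holds a bare host name, since `netloc` never includes the scheme.

```python
from urllib.parse import urlparse, urlunparse


def rewrite_host(url: str, external_netloc: str, scheme: str, host: str, port: int) -> str:
    """Swap scheme and host when the URL points at the external host; keep path and query."""
    parsed = urlparse(url)
    if parsed.netloc == external_netloc:
        # ParseResult is a namedtuple, so _replace returns a modified copy
        parsed = parsed._replace(scheme=scheme, netloc=f"{host}:{port}")
    return urlunparse(parsed)


print(rewrite_host("https://blog.example.com/wp-content/a.pdf?x=1",
                   "blog.example.com", "http", "wordpress", 8080))
print(rewrite_host("https://other.example.org/a.pdf",
                   "blog.example.com", "http", "wordpress", 8080))
```

If the configured value were a full URL such as `https://blog.example.com`, the `netloc` comparison would never match and no rewrite would happen.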
 def lookup_document(tenant_id: int, lookup_criteria: dict, metadata_type: str) -> tuple[Document, DocumentVersion]:
    """
    Look up a document using metadata criteria
    Args:
        tenant_id: ID of the tenant
        lookup_criteria: Dictionary of key-value pairs to match in metadata
        metadata_type: Which metadata to search in ('user_metadata' or 'system_metadata')
    Returns:
        Tuple of (Document, DocumentVersion) if found
    Raises:
        ValueError: If invalid metadata_type provided
        EveAIException: If lookup fails
    """
    if metadata_type not in ['user_metadata', 'system_metadata']:
        raise ValueError(f"Invalid metadata_type: {metadata_type}")
    try:
        # Query for the latest document version matching the criteria
        query = (db.session.query(Document, DocumentVersion)
                 .join(DocumentVersion)
                 .filter(Document.id == DocumentVersion.doc_id)
                 .order_by(DocumentVersion.id.desc()))
        # Add metadata filtering using PostgreSQL JSONB operators
        metadata_field = getattr(DocumentVersion, metadata_type)
        for key, value in lookup_criteria.items():
            query = query.filter(metadata_field[key].astext == str(value))
        # Get first result
        result = query.first()
        if not result:
            raise EveAIException(
                f"No document found matching criteria in {metadata_type}",
                status_code=404
            )
        return result
     except EveAIException:
         # Let our own exceptions (such as the 404 above) propagate unchanged
         raise
     except SQLAlchemyError as e:
        current_app.logger.error(f'Database error during document lookup for tenant {tenant_id}: {e}')
        raise EveAIException(
            "Database error during document lookup",
            status_code=500
        )
    except Exception as e:
        current_app.logger.error(f'Error during document lookup for tenant {tenant_id}: {e}')
        raise EveAIException(
            "Error during document lookup",
            status_code=500
        )
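One thing to watch in handlers shaped like `lookup_document`: an application exception raised inside the `try` block is itself an `Exception`, so a trailing `except Exception` clause will catch it and can turn a 404 into a 500. The sketch below shows the usual ordering that avoids this; `AppError` and `find` are hypothetical names, not part of this codebase.

```python
class AppError(Exception):
    """Hypothetical application error carrying an HTTP-style status code."""

    def __init__(self, message: str, status_code: int = 400):
        super().__init__(message)
        self.status_code = status_code


def find(items, key):
    try:
        if key not in items:
            raise AppError("not found", status_code=404)
        return items[key]
    except AppError:
        raise  # pass our own errors through with their original status code
    except Exception as exc:
        # Anything unexpected becomes a generic 500, chained to the cause
        raise AppError("lookup failed", status_code=500) from exc


try:
    find({}, "x")
except AppError as err:
    print(err.status_code)
```

Without the `except AppError: raise` clause, the 404 raised inside the `try` would be reported as a 500 by the broad handler.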
 # Add to common/utils/document_utils.py
 def refresh_document_with_content(doc_id: int, tenant_id: int, file_content: bytes, api_input: dict) -> tuple:
    """
    Refresh document with new content
    Args:
        doc_id: Document ID
        tenant_id: Tenant ID
        file_content: New file content
        api_input: Additional document information
    Returns:
        Tuple of (new_version, task_id), or (None, error_message) if the database commit fails
    """
    doc = Document.query.get(doc_id)
    if not doc:
        raise EveAIInvalidDocument(tenant_id, doc_id)
    old_doc_vers = DocumentVersion.query.filter_by(doc_id=doc_id).order_by(desc(DocumentVersion.id)).first()
    # Create new version with same file type as original
    extension = old_doc_vers.file_type
    new_doc_vers = create_version_for_document(
        doc, tenant_id,
        '',  # No URL for content-based updates
        old_doc_vers.sub_file_type,
        api_input.get('language', old_doc_vers.language),
        api_input.get('user_context', old_doc_vers.user_context),
        api_input.get('user_metadata', old_doc_vers.user_metadata),
        api_input.get('catalog_properties', old_doc_vers.catalog_properties),
    )
    try:
        db.session.add(new_doc_vers)
        db.session.commit()
    except SQLAlchemyError as e:
        db.session.rollback()
        return None, str(e)
    # Upload new content
    upload_file_for_version(new_doc_vers, file_content, extension, tenant_id)
    # Start embedding task
    task = current_celery.send_task('create_embeddings', args=[tenant_id, new_doc_vers.id], queue='embeddings')
    current_app.logger.info(f'Embedding creation started for document {doc_id} on version {new_doc_vers.id} '
                            f'with task id: {task.id}.')
    return new_doc_vers, task.id
--- a/common/utils/eveai_exceptions.py
+++ b/common/utils/eveai_exceptions.py
@@ -10,8 +10,12 @@ class EveAIException(Exception):
    def to_dict(self):
        rv = dict(self.payload or ())
        rv['message'] = self.message
        rv['error'] = self.__class__.__name__
        return rv
    def __str__(self):
        return self.message  # Return the message when the exception is converted to a string
 class EveAIInvalidLanguageException(EveAIException):
    """Raised when an invalid language is provided"""
@@ -41,3 +45,83 @@ class EveAINoLicenseForTenant(EveAIException):
        super().__init__(message, status_code, payload)
 class EveAITenantNotFound(EveAIException):
    """Raised when a tenant is not found"""
    def __init__(self, tenant_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        message = f"Tenant {tenant_id} not found"
        super().__init__(message, status_code, payload)
 class EveAITenantInvalid(EveAIException):
    """Raised when a tenant is invalid"""
    def __init__(self, tenant_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        # Construct the message dynamically
        message = f"Tenant with ID '{tenant_id}' is not valid. Please contact the System Administrator."
        super().__init__(message, status_code, payload)
 class EveAINoActiveLicense(EveAIException):
    """Raised when a tenant has no active licenses"""
    def __init__(self, tenant_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        # Construct the message dynamically
        message = f"Tenant with ID '{tenant_id}' has no active licenses. Please contact the System Administrator."
        super().__init__(message, status_code, payload)
 class EveAIInvalidCatalog(EveAIException):
    """Raised when a catalog cannot be found"""
    def __init__(self, tenant_id, catalog_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        self.catalog_id = catalog_id
        # Construct the message dynamically
        message = f"Tenant with ID '{tenant_id}' has no valid catalog with ID {catalog_id}. Please contact the System Administrator."
        super().__init__(message, status_code, payload)
 class EveAIInvalidProcessor(EveAIException):
    """Raised when no valid processor can be found for a given Catalog ID"""
    def __init__(self, tenant_id, catalog_id, file_type, status_code=400, payload=None):
        self.tenant_id = tenant_id
        self.catalog_id = catalog_id
        self.file_type = file_type
        # Construct the message dynamically
        message = (f"Tenant with ID '{tenant_id}' has no valid {file_type} processor for catalog with ID {catalog_id}. "
                   f"Please contact the System Administrator.")
        super().__init__(message, status_code, payload)
 class EveAIInvalidDocument(EveAIException):
    """Raised when a tenant has no document with given ID"""
    def __init__(self, tenant_id, document_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        self.document_id = document_id
        # Construct the message dynamically
        message = f"Tenant with ID '{tenant_id}' has no document with ID {document_id}."
        super().__init__(message, status_code, payload)
 class EveAIInvalidDocumentVersion(EveAIException):
    """Raised when a tenant has no document version with given ID"""
    def __init__(self, tenant_id, document_version_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        self.document_version_id = document_version_id
        # Construct the message dynamically
        message = f"Tenant with ID '{tenant_id}' has no document version with ID {document_version_id}."
        super().__init__(message, status_code, payload)
 class EveAISocketInputException(EveAIException):
    """Raised when a socket call receives an invalid payload"""
    def __init__(self, message, status_code=400, payload=None):
        super().__init__(message, status_code, payload)
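The exception classes above all follow one pattern: store the identifying fields, build a message, and delegate to the base class so that `to_dict()` can produce a serializable error body. A minimal sketch of that pattern follows. The base-class constructor is an assumption (only `to_dict` and `__str__` are visible in this diff), and `error_response` is a hypothetical helper added for illustration.

```python
class EveAIException(Exception):
    """Stand-in base class; the constructor signature is assumed, not taken from the diff."""

    def __init__(self, message, status_code=400, payload=None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.payload = payload

    def to_dict(self):
        rv = dict(self.payload or ())
        rv['message'] = self.message
        rv['error'] = self.__class__.__name__
        return rv

    def __str__(self):
        return self.message


class EveAIInvalidDocument(EveAIException):
    """Raised when a tenant has no document with given ID"""

    def __init__(self, tenant_id, document_id, status_code=400, payload=None):
        self.tenant_id = tenant_id
        self.document_id = document_id
        message = f"Tenant with ID '{tenant_id}' has no document with ID {document_id}."
        # Note the call on super(): writing super.__init__ would raise a TypeError here
        super().__init__(message, status_code, payload)


def error_response(exc):
    """Hypothetical helper: turn an exception into a (body, status) pair."""
    return exc.to_dict(), exc.status_code
```

Because `to_dict()` reads `self.__class__.__name__`, each subclass reports its own name in the `error` field without extra code.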
--- a/common/utils/middleware.py
+++ b/common/utils/middleware.py
@@ -24,9 +24,6 @@ def mw_before_request():
    if not tenant_id:
        raise Exception('Cannot switch schema for tenant: no tenant defined in session')
    for role in current_user.roles:
        current_app.logger.debug(f'In middleware: User {current_user.email} has role {role.name}')
    # user = User.query.get(current_user.id)
    if current_user.has_role('Super User') or current_user.tenant_id == tenant_id:
        Database(tenant_id).switch_schema()
--- a/common/utils/model_utils.py
+++ b/common/utils/model_utils.py
@@ -1,249 +1,36 @@
 import os
 from typing import Dict, Any, Optional
 import langcodes
 from flask import current_app
 from langchain_openai import OpenAIEmbeddings, ChatOpenAI
 from langchain_anthropic import ChatAnthropic
 from langchain_core.pydantic_v1 import BaseModel, Field
 from typing import List, Any, Iterator
 from collections.abc import MutableMapping
 from openai import OpenAI
 from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL
 from portkey_ai.langchain.portkey_langchain_callback_handler import LangchainCallbackHandler
 from common.langchain.llm_metrics_handler import LLMMetricsHandler
 from common.langchain.templates.template_manager import TemplateManager
 from langchain_openai import OpenAIEmbeddings, ChatOpenAI, OpenAI
 from langchain_anthropic import ChatAnthropic
 from flask import current_app
 from datetime import datetime as dt, timezone as tz
 from common.langchain.tracked_openai_embeddings import TrackedOpenAIEmbeddings
-from common.langchain.tracked_transcribe import tracked_transcribe
+from common.langchain.tracked_transcription import TrackedOpenAITranscription
 from common.models.document import EmbeddingSmallOpenAI, EmbeddingLargeOpenAI, Catalog
 from common.models.user import Tenant
 from common.utils.cache.base import CacheHandler
 from config.model_config import MODEL_CONFIG
-from common.utils.business_event_context import current_event
+from common.extensions import template_manager, cache_manager
 from common.models.document import EmbeddingLargeOpenAI, EmbeddingSmallOpenAI
 from common.utils.eveai_exceptions import EveAITenantNotFound
-class CitedAnswer(BaseModel):
+def create_language_template(template: str, language: str) -> str:
-    """Default docstring - to be replaced with actual prompt"""
+    """
    Replace language placeholder in template with specified language
-    answer: str = Field(
+    Args:
-        ...,
+        template: Template string with {language} placeholder
-        description="The answer to the user question, based on the given sources",
+        language: Language code to insert
    )
    citations: List[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources that were used to generate the answer"
    )
    insufficient_info: bool = Field(
        False,  # Default value is set to False
        description="A boolean indicating whether given sources were sufficient or not to generate the answer"
    )
-
+    Returns:
-def set_language_prompt_template(cls, language_prompt):
+        str: Template with language placeholder replaced
-    cls.__doc__ = language_prompt
+    """
 class ModelVariables(MutableMapping):
    def __init__(self, tenant: Tenant, catalog_id=None):
        self.tenant = tenant
        self.catalog_id = catalog_id
        self._variables = self._initialize_variables()
        self._embedding_model = None
        self._llm = None
        self._llm_no_rag = None
        self._transcription_client = None
        self._prompt_templates = {}
        self._embedding_db_model = None
        self.llm_metrics_handler = LLMMetricsHandler()
        self._transcription_client = None
    def _initialize_variables(self):
        variables = {}
        # Get the Catalog if catalog_id is passed
        if self.catalog_id:
            catalog = Catalog.query.get_or_404(self.catalog_id)
            # We initialize the variables that are available knowing the tenant.
            variables['embed_tuning'] = catalog.embed_tuning or False
            # Set HTML Chunking Variables
            variables['html_tags'] = catalog.html_tags
            variables['html_end_tags'] = catalog.html_end_tags
            variables['html_included_elements'] = catalog.html_included_elements
            variables['html_excluded_elements'] = catalog.html_excluded_elements
            variables['html_excluded_classes'] = catalog.html_excluded_classes
            # Set Chunk Size variables
            variables['min_chunk_size'] = catalog.min_chunk_size
            variables['max_chunk_size'] = catalog.max_chunk_size
        # Set the RAG Context (will have to change once specialists are defined)
        variables['rag_context'] = self.tenant.rag_context or " "
        # Temporary setting until we have Specialists
        variables['rag_tuning'] = False
        variables['RAG_temperature'] = 0.3
        variables['no_RAG_temperature'] = 0.5
        variables['k'] = 8
        variables['similarity_threshold'] = 0.4
        # Set model providers
        variables['embedding_provider'], variables['embedding_model'] = self.tenant.embedding_model.rsplit('.', 1)
        variables['llm_provider'], variables['llm_model'] = self.tenant.llm_model.rsplit('.', 1)
        variables["templates"] = current_app.config['PROMPT_TEMPLATES'][(f"{variables['llm_provider']}."
                                                                         f"{variables['llm_model']}")]
        current_app.logger.info(f"Loaded prompt templates: \n")
        current_app.logger.info(f"{variables['templates']}")
        # Set model-specific configurations
        model_config = MODEL_CONFIG.get(variables['llm_provider'], {}).get(variables['llm_model'], {})
        variables.update(model_config)
        variables['annotation_chunk_length'] = current_app.config['ANNOTATION_TEXT_CHUNK_LENGTH'][self.tenant.llm_model]
        if variables['tool_calling_supported']:
            variables['cited_answer_cls'] = CitedAnswer
        variables['max_compression_duration'] = current_app.config['MAX_COMPRESSION_DURATION']
        variables['max_transcription_duration'] = current_app.config['MAX_TRANSCRIPTION_DURATION']
        variables['compression_cpu_limit'] = current_app.config['COMPRESSION_CPU_LIMIT']
        variables['compression_process_delay'] = current_app.config['COMPRESSION_PROCESS_DELAY']
        return variables
    @property
    def embedding_model(self):
        api_key = os.getenv('OPENAI_API_KEY')
        model = self._variables['embedding_model']
        self._embedding_model = TrackedOpenAIEmbeddings(api_key=api_key,
                                                        model=model,
                                                        )
        self._embedding_db_model = EmbeddingSmallOpenAI \
            if model == 'text-embedding-3-small' \
            else EmbeddingLargeOpenAI
        return self._embedding_model
    @property
    def llm(self):
        api_key = self.get_api_key_for_llm()
        self._llm = ChatOpenAI(api_key=api_key,
                               model=self._variables['llm_model'],
                               temperature=self._variables['RAG_temperature'],
                               callbacks=[self.llm_metrics_handler])
        return self._llm
    @property
    def llm_no_rag(self):
        api_key = self.get_api_key_for_llm()
        self._llm_no_rag = ChatOpenAI(api_key=api_key,
                                      model=self._variables['llm_model'],
                                      temperature=self._variables['RAG_temperature'],
                                      callbacks=[self.llm_metrics_handler])
        return self._llm_no_rag
    def get_api_key_for_llm(self):
        if self._variables['llm_provider'] == 'openai':
            api_key = os.getenv('OPENAI_API_KEY')
        else:  # self._variables['llm_provider'] == 'anthropic'
            api_key = os.getenv('ANTHROPIC_API_KEY')
        return api_key
    @property
    def transcription_client(self):
        api_key = os.getenv('OPENAI_API_KEY')
        self._transcription_client = OpenAI(api_key=api_key, )
        self._variables['transcription_model'] = 'whisper-1'
        return self._transcription_client
    def transcribe(self, *args, **kwargs):
        return tracked_transcribe(self._transcription_client, *args, **kwargs)
    @property
    def embedding_db_model(self):
        if self._embedding_db_model is None:
            self._embedding_db_model = self.get_embedding_db_model()
        return self._embedding_db_model
    def get_embedding_db_model(self):
        current_app.logger.debug("In get_embedding_db_model")
        if self._embedding_db_model is None:
            self._embedding_db_model = EmbeddingSmallOpenAI \
                if self._variables['embedding_model'] == 'text-embedding-3-small' \
                else EmbeddingLargeOpenAI
        current_app.logger.debug(f"Embedding DB Model: {self._embedding_db_model}")
        return self._embedding_db_model
    def get_prompt_template(self, template_name: str) -> str:
        current_app.logger.info(f"Getting prompt template for {template_name}")
        if template_name not in self._prompt_templates:
            self._prompt_templates[template_name] = self._load_prompt_template(template_name)
        return self._prompt_templates[template_name]
    def _load_prompt_template(self, template_name: str) -> str:
        # In the future, this method will make an API call to Portkey
        # For now, we'll simulate it with a placeholder implementation
        # You can replace this with your current prompt loading logic
        return self._variables['templates'][template_name]
    def __getitem__(self, key: str) -> Any:
        current_app.logger.debug(f"ModelVariables: Getting {key}")
        # Support older template names (suffix = _template)
        if key.endswith('_template'):
            key = key[:-len('_template')]
            current_app.logger.debug(f"ModelVariables: Getting modified {key}")
        if key == 'embedding_model':
            return self.embedding_model
        elif key == 'embedding_db_model':
            return self.embedding_db_model
        elif key == 'llm':
            return self.llm
        elif key == 'llm_no_rag':
            return self.llm_no_rag
        elif key == 'transcription_client':
            return self.transcription_client
        elif key in self._variables.get('prompt_templates', []):
            return self.get_prompt_template(key)
        else:
            value = self._variables.get(key)
            if value is not None:
                return value
            else:
                raise KeyError(f'Variable {key} does not exist in ModelVariables')
    def __setitem__(self, key: str, value: Any) -> None:
        self._variables[key] = value
    def __delitem__(self, key: str) -> None:
        del self._variables[key]
    def __iter__(self) -> Iterator[str]:
        return iter(self._variables)
    def __len__(self):
        return len(self._variables)
    def get(self, key: str, default: Any = None) -> Any:
        return self.__getitem__(key) or default
    def update(self, **kwargs) -> None:
        self._variables.update(kwargs)
    def items(self):
        return self._variables.items()
    def keys(self):
        return self._variables.keys()
    def values(self):
        return self._variables.values()
 def select_model_variables(tenant, catalog_id=None):
    model_variables = ModelVariables(tenant=tenant, catalog_id=catalog_id)
    return model_variables
 def create_language_template(template, language):
    try:
        full_language = langcodes.Language.make(language=language)
        language_template = template.replace('{language}', full_language.display_name())
@@ -253,5 +40,249 @@ def create_language_template(template, language):
    return language_template
-def replace_variable_in_template(template, variable, value):
+def replace_variable_in_template(template: str, variable: str, value: str) -> str:
-    return template.replace(variable, value)
+    """
    Replace a variable placeholder in template with specified value
    Args:
        template: Template string with variable placeholder
        variable: Variable placeholder to replace (e.g. "{tenant_context}")
        value: Value to insert
    Returns:
        str: Template with variable placeholder replaced
    """
    return template.replace(variable, value or "")
 class ModelVariables:
    """Manages model-related variables and configurations"""
    def __init__(self, tenant_id: int, variables: Dict[str, Any] = None):
        """
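        Initialize ModelVariables for a tenant, optionally from pre-computed variables
        Args:
            tenant_id: ID of the tenant the variables belong to
            variables: Optional pre-computed variables (e.g. restored from cache);
                       when None, they are initialized from the tenant's configuration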
        """
        current_app.logger.info(f'Model variables initialized with tenant {tenant_id} and variables \n{variables}')
        self.tenant_id = tenant_id
        self._variables = variables if variables is not None else self._initialize_variables()
        current_app.logger.info(f'Model _variables initialized to {self._variables}')
        self._embedding_model = None
        self._embedding_model_class = None
        self._llm_instances = {}
        self.llm_metrics_handler = LLMMetricsHandler()
        self._transcription_model = None
    def _initialize_variables(self) -> Dict[str, Any]:
        """Initialize the variables dictionary"""
        variables = {}
        tenant = Tenant.query.get(self.tenant_id)
        if not tenant:
            raise EveAITenantNotFound(self.tenant_id)
        # Set model providers (split on the first dot only, so model names that contain dots stay intact)
        variables['embedding_provider'], variables['embedding_model'] = tenant.embedding_model.split('.', 1)
        variables['llm_provider'], variables['llm_model'] = tenant.llm_model.split('.', 1)
        variables['llm_full_model'] = tenant.llm_model
        # Set model-specific configurations
        model_config = MODEL_CONFIG.get(variables['llm_provider'], {}).get(variables['llm_model'], {})
        variables.update(model_config)
        # Additional configurations
        variables['annotation_chunk_length'] = current_app.config['ANNOTATION_TEXT_CHUNK_LENGTH'][tenant.llm_model]
        variables['max_compression_duration'] = current_app.config['MAX_COMPRESSION_DURATION']
        variables['max_transcription_duration'] = current_app.config['MAX_TRANSCRIPTION_DURATION']
        variables['compression_cpu_limit'] = current_app.config['COMPRESSION_CPU_LIMIT']
        variables['compression_process_delay'] = current_app.config['COMPRESSION_PROCESS_DELAY']
        return variables
    @property
    def embedding_model(self):
        """Get the embedding model instance"""
        if self._embedding_model is None:
            api_key = os.getenv('OPENAI_API_KEY')
            self._embedding_model = TrackedOpenAIEmbeddings(
                api_key=api_key,
                model=self._variables['embedding_model']
            )
        return self._embedding_model
    @property
    def embedding_model_class(self):
        """Get the embedding model class"""
        if self._embedding_model_class is None:
            if self._variables['embedding_model'] == 'text-embedding-3-large':
                self._embedding_model_class = EmbeddingLargeOpenAI
            else:   # text-embedding-3-small
                self._embedding_model_class = EmbeddingSmallOpenAI
        return self._embedding_model_class
    @property
    def annotation_chunk_length(self):
        return self._variables['annotation_chunk_length']
    @property
    def max_compression_duration(self):
        return self._variables['max_compression_duration']
    @property
    def max_transcription_duration(self):
        return self._variables['max_transcription_duration']
    @property
    def compression_cpu_limit(self):
        return self._variables['compression_cpu_limit']
    @property
    def compression_process_delay(self):
        return self._variables['compression_process_delay']
    def get_llm(self, temperature: float = 0.3, **kwargs) -> Any:
        """
        Get an LLM instance with specific configuration
        Args:
            temperature: The temperature for the LLM
            **kwargs: Additional configuration parameters
        Returns:
            An instance of the configured LLM
        """
        cache_key = f"{temperature}_{hash(frozenset(kwargs.items()))}"
        if cache_key not in self._llm_instances:
            provider = self._variables['llm_provider']
            model = self._variables['llm_model']
            if provider == 'openai':
                self._llm_instances[cache_key] = ChatOpenAI(
                    api_key=os.getenv('OPENAI_API_KEY'),
                    model=model,
                    temperature=temperature,
                    callbacks=[self.llm_metrics_handler],
                    **kwargs
                )
            elif provider == 'anthropic':
                self._llm_instances[cache_key] = ChatAnthropic(
                    api_key=os.getenv('ANTHROPIC_API_KEY'),
                    model=current_app.config['ANTHROPIC_LLM_VERSIONS'][model],
                    temperature=temperature,
                    callbacks=[self.llm_metrics_handler],
                    **kwargs
                )
            else:
                raise ValueError(f"Unsupported LLM provider: {provider}")
        return self._llm_instances[cache_key]
    @property
    def transcription_model(self) -> TrackedOpenAITranscription:
        """Get the transcription model instance"""
        if self._transcription_model is None:
            api_key = os.getenv('OPENAI_API_KEY')
            self._transcription_model = TrackedOpenAITranscription(
                api_key=api_key,
                model='whisper-1'
            )
        return self._transcription_model
    # The old transcription-related accessors now only raise; transcription is handled by TrackedOpenAITranscription
    @property
    def transcription_client(self):
        raise DeprecationWarning("Use transcription_model instead")
    def transcribe(self, *args, **kwargs):
        raise DeprecationWarning("Use transcription_model.transcribe() instead")
    def get_template(self, template_name: str, version: Optional[str] = None) -> str:
        """
        Get a template for the tenant's configured LLM
        Args:
            template_name: Name of the template to retrieve
            version: Optional specific version to retrieve
        Returns:
            The template content
        """
        try:
            template = template_manager.get_template(
                self._variables['llm_full_model'],
                template_name,
                version
            )
            return template.content
        except Exception as e:
            current_app.logger.error(f"Error getting template {template_name}: {str(e)}")
            # Fall back to old template loading if template_manager fails
            if template_name in self._variables.get('templates', {}):
                return self._variables['templates'][template_name]
            raise
 class ModelVariablesCacheHandler(CacheHandler[ModelVariables]):
    handler_name = 'model_vars_cache'  # Used to access handler instance from cache_manager
    def __init__(self, region):
        super().__init__(region, 'model_variables')
        self.configure_keys('tenant_id')
        self.subscribe_to_model('Tenant', ['tenant_id'])
    def to_cache_data(self, instance: ModelVariables) -> Dict[str, Any]:
        return {
            'tenant_id': instance.tenant_id,
            'variables': instance._variables,
            'last_updated': dt.now(tz=tz.utc).isoformat()
        }
    def from_cache_data(self, data: Dict[str, Any], tenant_id: int, **kwargs) -> ModelVariables:
        instance = ModelVariables(tenant_id, data.get('variables'))
        return instance
    def should_cache(self, value: Dict[str, Any]) -> bool:
        required_fields = {'tenant_id', 'variables'}
        return all(field in value for field in required_fields)
 # Register the handler with the cache manager
 cache_manager.register_handler(ModelVariablesCacheHandler, 'eveai_model')
 # Helper function to get cached model variables
 def get_model_variables(tenant_id: int) -> ModelVariables:
    return cache_manager.model_vars_cache.get(
        lambda tenant_id: ModelVariables(tenant_id),    # function to create ModelVariables if required
        tenant_id=tenant_id
    )
 # Written in a long format, without lambda
 # def get_model_variables(tenant_id: int) -> ModelVariables:
 #     """
 #     Get ModelVariables instance, either from cache or newly created
 #
 #     Args:
 #         tenant_id: The tenant's ID
 #
 #     Returns:
 #         ModelVariables: Instance with either cached or fresh data
 #
 #     Raises:
 #         EveAITenantNotFound: If tenant doesn't exist
 #         CacheStateError: If cached data is invalid
 #     """
 #
 #     def create_new_instance(tenant_id: int) -> ModelVariables:
 #         """Creator function that's called when cache miss occurs"""
 #         return ModelVariables(tenant_id)  # This will initialize fresh variables
 #
 #     return cache_manager.model_vars_cache.get(
 #         create_new_instance,  # Function to create new instance if needed
 #         tenant_id=tenant_id  # Parameters passed to both get() and create_new_instance
 #     )
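`get_llm()` memoizes LLM instances per configuration: the cache key combines the temperature with a hash of the keyword arguments. The sketch below isolates that scheme so its behavior can be checked without any LLM library. `LLMInstanceCache` and `fake_llm` are hypothetical names; a real factory would build `ChatOpenAI` or `ChatAnthropic` as in the diff.

```python
from typing import Any, Callable, Dict


class LLMInstanceCache:
    """Hypothetical stand-in for the _llm_instances handling in get_llm()."""

    def __init__(self, factory: Callable[..., Any]):
        self._factory = factory
        self._instances: Dict[str, Any] = {}

    def get(self, temperature: float = 0.3, **kwargs) -> Any:
        # Same key scheme as the diff: temperature plus a hash of the keyword arguments
        cache_key = f"{temperature}_{hash(frozenset(kwargs.items()))}"
        if cache_key not in self._instances:
            self._instances[cache_key] = self._factory(temperature=temperature, **kwargs)
        return self._instances[cache_key]


def fake_llm(**config):
    """Placeholder factory returning a fresh object per call."""
    return dict(config)


cache = LLMInstanceCache(fake_llm)
```

One consequence of this key scheme is that every keyword value must be hashable: passing a list (for example a list of stop sequences) raises `TypeError` when the `frozenset` is built, so such values would need to be passed as tuples.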
--- a/common/utils/os_utils.py
+++ b/common/utils/os_utils.py
@@ -1,4 +1,6 @@
 import os
 import sys
 import gevent
 import time
 from flask import current_app
@@ -28,3 +30,17 @@ def sync_folder(file_path):
    dir_fd = os.open(file_path, os.O_RDONLY)
    os.fsync(dir_fd)
    os.close(dir_fd)
 def get_project_root():
    """Get the root directory of the project."""
    # Use the module that's actually running (not this file)
    module = sys.modules['__main__']
    if hasattr(module, '__file__'):
        # Get the path to the main module
        main_path = os.path.abspath(module.__file__)
        # Get the root directory (where the main module is located)
        return os.path.dirname(main_path)
    else:
        # Fallback: use current working directory
        return os.getcwd()
--- a/common/utils/security.py
+++ b/common/utils/security.py
@@ -1,10 +1,15 @@
 from flask import session, current_app
 from sqlalchemy import and_
 from common.models.user import Tenant
 from common.models.entitlements import License
 from common.utils.database import Database
 from common.utils.eveai_exceptions import EveAITenantNotFound, EveAITenantInvalid, EveAINoActiveLicense
 from datetime import datetime as dt, timezone as tz
 # Definition of Trigger Handlers
 def set_tenant_session_data(sender, user, **kwargs):
    current_app.logger.debug(f"Setting tenant session data for user {user.id}")
    tenant = Tenant.query.filter_by(id=user.tenant_id).first()
    session['tenant'] = tenant.to_dict()
    session['default_language'] = tenant.default_language
@@ -17,3 +22,24 @@ def clear_tenant_session_data(sender, user, **kwargs):
    session.pop('default_language', None)
    session.pop('default_embedding_model', None)
    session.pop('default_llm_model', None)
 def is_valid_tenant(tenant_id):
    if tenant_id == 1:  # The 'root' tenant is always valid
        return True
    tenant = Tenant.query.get(tenant_id)
    if tenant is None:
        raise EveAITenantNotFound(tenant_id)
    # Switch schema only once the tenant is known to exist
    Database(tenant_id).switch_schema()
    if tenant.type == 'Inactive':
        raise EveAITenantInvalid(tenant_id)
    else:
        current_date = dt.now(tz=tz.utc).date()
        active_license = (License.query.filter_by(tenant_id=tenant_id)
                          .filter(and_(License.start_date <= current_date,
                                       License.end_date >= current_date))
                          .one_or_none())
        if not active_license:
            raise EveAINoActiveLicense(tenant_id)
    return True
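The license check in `is_valid_tenant` selects the license whose date window contains today and uses `one_or_none()`, which returns `None` for no match and raises when more than one row matches. A minimal sketch of the same rule in plain Python, without a database, is shown below. `find_active_license` and the `(start_date, end_date)` tuples are hypothetical stand-ins for the `License` query.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

LicensePeriod = Tuple[date, date]  # (start_date, end_date), both inclusive


def find_active_license(periods: Iterable[LicensePeriod], today: date) -> Optional[LicensePeriod]:
    """Return the single period containing `today`, None if there is none."""
    active = [p for p in periods if p[0] <= today <= p[1]]
    if len(active) > 1:
        # Mirrors one_or_none(): overlapping licenses are an error, not a silent pick
        raise ValueError("More than one active license for the same date")
    return active[0] if active else None
```

Both boundaries are inclusive, so a license is still active on its `end_date`. Overlapping license periods for one tenant would surface as an exception instead of a validation result, which may be worth handling explicitly.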
--- a/common/utils/security_utils.py
+++ b/common/utils/security_utils.py
@@ -11,7 +11,7 @@ def confirm_token(token, expiration=3600):
    try:
        email = serializer.loads(token, salt=current_app.config['SECURITY_PASSWORD_SALT'], max_age=expiration)
    except Exception as e:
-        current_app.logger.debug(f'Error confirming token: {e}')
+        current_app.logger.error(f'Error confirming token: {e}')
        raise
    return email
@@ -35,14 +35,11 @@ def generate_confirmation_token(email):
 def send_confirmation_email(user):
    current_app.logger.debug(f'Sending confirmation email to {user.email}')
    if not test_smtp_connection():
        raise Exception("Failed to connect to SMTP server")
    token = generate_confirmation_token(user.email)
    confirm_url = prefixed_url_for('security_bp.confirm_email', token=token, _external=True)
    current_app.logger.debug(f'Confirmation URL: {confirm_url}')
    html = render_template('email/activate.html', confirm_url=confirm_url)
    subject = "Please confirm your email"
@@ -56,10 +53,8 @@ def send_confirmation_email(user):
 def send_reset_email(user):
    current_app.logger.debug(f'Sending reset email to {user.email}')
    token = generate_reset_token(user.email)
    reset_url = prefixed_url_for('security_bp.reset_password', token=token, _external=True)
    current_app.logger.debug(f'Reset URL: {reset_url}')
    html = render_template('email/reset_password.html', reset_url=reset_url)
    subject = "Reset Your Password"
@@ -98,4 +93,3 @@ def test_smtp_connection():
    except Exception as e:
        current_app.logger.error(f"Failed to connect to SMTP server: {str(e)}")
        return False
--- a/common/utils/simple_encryption.py
+++ b/common/utils/simple_encryption.py
@@ -4,7 +4,7 @@ from flask import Flask
 def generate_api_key(prefix="EveAI-Chat"):
-    parts = [str(random.randint(1000, 9999)) for _ in range(5)]
+    parts = [str(random.randint(1000, 9999)) for _ in range(8)]
    return f"{prefix}-{'-'.join(parts)}"
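The change above lengthens the key from five to eight four-digit groups but still draws them from `random`, which the Python documentation advises against for security purposes. As an alternative sketch, not what this commit does, the same key format can be produced with the standard-library `secrets` module. The `groups` parameter is an assumption added for illustration.

```python
import secrets


def generate_api_key(prefix: str = "EveAI-Chat", groups: int = 8) -> str:
    """Build '<prefix>-dddd-dddd-...' from a cryptographically strong source."""
    # randbelow(9000) yields 0..8999, so each part lies in 1000..9999 like randint(1000, 9999)
    parts = [str(secrets.randbelow(9000) + 1000) for _ in range(groups)]
    return f"{prefix}-{'-'.join(parts)}"
```

The output format is unchanged, so existing keys and any format validation would keep working.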
--- a/common/utils/string_list_converter.py
+++ b/common/utils/string_list_converter.py
@@ -0,0 +1,112 @@
 from typing import List, Union
 import re
 class StringListConverter:
    """Utility class for converting between comma-separated strings and lists"""
    @staticmethod
    def string_to_list(input_string: Union[str, None], allow_empty: bool = True) -> List[str]:
        """
        Convert a comma-separated string to a list of strings.
        Args:
            input_string: Comma-separated string to convert
            allow_empty: If True, returns empty list for None/empty input
                        If False, raises ValueError for None/empty input
        Returns:
            List of stripped strings
        Raises:
            ValueError: If input is None/empty and allow_empty is False
        """
        if not input_string:
            if allow_empty:
                return []
            raise ValueError("Input string cannot be None or empty")
        return [item.strip() for item in input_string.split(',') if item.strip()]
    @staticmethod
    def list_to_string(input_list: Union[List[str], None], allow_empty: bool = True) -> str:
        """
        Convert a list of strings to a comma-separated string.
        Args:
            input_list: List of strings to convert
            allow_empty: If True, returns empty string for None/empty input
                        If False, raises ValueError for None/empty input
        Returns:
            Comma-separated string
        Raises:
            ValueError: If input is None/empty and allow_empty is False
        """
        if not input_list:
            if allow_empty:
                return ''
            raise ValueError("Input list cannot be None or empty")
        return ', '.join(str(item).strip() for item in input_list)
    @staticmethod
    def validate_format(input_string: str,
                        allowed_chars: str = r'a-zA-Z0-9_\-',
                        min_length: int = 1,
                        max_length: int = 50) -> bool:
        """
        Validate the format of items in a comma-separated string.
        Args:
            input_string: String to validate
            allowed_chars: String of allowed characters (for regex pattern)
            min_length: Minimum length for each item
            max_length: Maximum length for each item
        Returns:
            bool: True if format is valid, False otherwise
        """
        if not input_string:
            return False
        # Create regex pattern for individual items
        pattern = f'^[{allowed_chars}]{{{min_length},{max_length}}}$'
        try:
            # Convert to list and check each item
            items = StringListConverter.string_to_list(input_string)
            return all(bool(re.match(pattern, item)) for item in items)
        except Exception:
            return False
    @staticmethod
    def validate_and_convert(input_string: str,
                             allowed_chars: str = r'a-zA-Z0-9_\-',
                             min_length: int = 1,
                             max_length: int = 50) -> List[str]:
        """
        Validate and convert a comma-separated string to a list.
        Args:
            input_string: String to validate and convert
            allowed_chars: String of allowed characters (for regex pattern)
            min_length: Minimum length for each item
            max_length: Maximum length for each item
        Returns:
            List of validated and converted strings
        Raises:
            ValueError: If input string format is invalid
        """
        if not StringListConverter.validate_format(
                input_string, allowed_chars, min_length, max_length
        ):
            raise ValueError(
                f"Invalid format. Items must be {min_length}-{max_length} characters "
                f"long and contain only these characters: {allowed_chars}"
            )
        return StringListConverter.string_to_list(input_string)
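The behaviour of the new converter can be sketched with two standalone functions that mirror `string_to_list` and `validate_format` as shown in the diff. These are condensed copies for illustration, not the class itself, and the sample inputs are assumptions.

```python
import re
from typing import List, Optional


def string_to_list(input_string: Optional[str]) -> List[str]:
    # Mirrors StringListConverter.string_to_list with allow_empty=True:
    # split on commas, strip whitespace, drop empty items.
    if not input_string:
        return []
    return [item.strip() for item in input_string.split(',') if item.strip()]


def validate_format(input_string: Optional[str],
                    allowed_chars: str = r'a-zA-Z0-9_\-',
                    min_length: int = 1,
                    max_length: int = 50) -> bool:
    # Mirrors StringListConverter.validate_format: every item must match the pattern.
    if not input_string:
        return False
    pattern = f'^[{allowed_chars}]{{{min_length},{max_length}}}$'
    return all(re.match(pattern, item) for item in string_to_list(input_string))


tags = string_to_list(" alpha, beta ,,gamma ")   # ['alpha', 'beta', 'gamma']
joined = ', '.join(tags)                         # 'alpha, beta, gamma'
ok = validate_format("alpha, beta_2")            # True
bad = validate_format("alpha, be ta")            # False: space is not an allowed character
only_commas = validate_format(",,")              # True: no items remain, so all() is vacuously true
```

The last case is worth noting: a string made up only of separators passes validation, because no items are left to check, and converting it produces an empty list.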
--- a/common/utils/token_validation.py
+++ b/common/utils/token_validation.py
@@ -0,0 +1,60 @@
 from dataclasses import dataclass
 from typing import Optional
 from datetime import datetime
 from flask_jwt_extended import decode_token, verify_jwt_in_request
 from flask import current_app
@dataclass
 class TokenValidationResult:
    """Clean, simple validation result"""
    is_valid: bool
    tenant_id: Optional[int] = None
    error_message: Optional[str] = None
 class TokenValidator:
    """Simplified token validator focused on JWT validation"""
    def validate_token(self, token: str) -> TokenValidationResult:
        """
        Validate JWT token
        Args:
            token: The JWT token to validate
        Returns:
            TokenValidationResult with validation status and tenant_id if valid
        """
        try:
            # Decode and validate token
            decoded_token = decode_token(token)
            # Extract tenant_id from token subject
            tenant_id = decoded_token.get('sub')
            if not tenant_id:
                return TokenValidationResult(
                    is_valid=False,
                    error_message="Missing tenant ID in token"
                )
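            # Verify token timestamps against the current POSIX time.
            # datetime.utcnow() is naive, so .timestamp() would interpret it as local
            # time and skew the check on hosts whose timezone is not UTC.
            now = datetime.now().timestamp()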
            if not (decoded_token.get('exp', 0) > now >= decoded_token.get('nbf', 0)):
                return TokenValidationResult(
                    is_valid=False,
                    error_message="Token expired or not yet valid"
                )
            # Token is valid
            return TokenValidationResult(
                is_valid=True,
                tenant_id=tenant_id
            )
        except Exception as e:
            current_app.logger.error(f"Token validation error: {str(e)}")
            return TokenValidationResult(
                is_valid=False,
                error_message=str(e)
            )
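The timestamp check in `validate_token` reduces to a single chained comparison over the `exp` and `nbf` claims. A minimal sketch of that comparison as a pure function is given below. The name `within_validity_window` and the sample claims are hypothetical, and the sketch deliberately leaves out JWT decoding and signature verification.

```python
import time
from typing import Mapping, Optional


def within_validity_window(claims: Mapping[str, float],
                           now: Optional[float] = None) -> bool:
    """Same comparison as the validator: nbf <= now < exp, with missing claims read as 0."""
    if now is None:
        now = time.time()  # POSIX timestamp, independent of the host timezone
    return claims.get('exp', 0) > now >= claims.get('nbf', 0)


claims = {'nbf': 100, 'exp': 200}
inside = within_validity_window(claims, now=150)      # True
at_expiry = within_validity_window(claims, now=200)   # False: exp is exclusive
too_early = within_validity_window(claims, now=99)    # False: before nbf
no_exp = within_validity_window({'nbf': 0}, now=1)    # False: missing exp defaults to 0
```

Because a missing `exp` defaults to 0, a token without an expiry claim is always rejected by this comparison.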
--- a/common/utils/view_assistants.py
+++ b/common/utils/view_assistants.py
@@ -44,7 +44,7 @@ def form_validation_failed(request, form):
        for fieldName, errorMessages in form.errors.items():
            for err in errorMessages:
                flash(f"Error in {fieldName}: {err}", 'danger')
-                current_app.logger.debug(f"Error in {fieldName}: {err}")
+                current_app.logger.error(f"Error in {fieldName}: {err}")
 def form_to_dict(form):
--- a/config/config.py
+++ b/config/config.py
@@ -1,3 +1,4 @@
 import os
 from os import environ, path
 from datetime import timedelta
 import redis
@@ -54,7 +55,6 @@ class Config(object):
    # file upload settings
    MAX_CONTENT_LENGTH = 50 * 1024 * 1024
    UPLOAD_EXTENSIONS = ['.txt', '.pdf', '.png', '.jpg', '.jpeg', '.gif']
    # supported languages
    SUPPORTED_LANGUAGES = ['en', 'fr', 'nl', 'de', 'es']
@@ -68,9 +68,6 @@ class Config(object):
    ANTHROPIC_LLM_VERSIONS = {'claude-3-5-sonnet': 'claude-3-5-sonnet-20240620', }
    # Load prompt templates dynamically
    PROMPT_TEMPLATES = {model: load_prompt_templates(model) for model in SUPPORTED_LLMS}
    # Annotation text chunk length
    ANNOTATION_TEXT_CHUNK_LENGTH = {
        'openai.gpt-4o': 10000,
@@ -87,9 +84,6 @@ class Config(object):
    # Anthropic API Keys
    ANTHROPIC_API_KEY = environ.get('ANTHROPIC_API_KEY')
    # Portkey API Keys
    PORTKEY_API_KEY = environ.get('PORTKEY_API_KEY')
    # Celery settings
    CELERY_TASK_SERIALIZER = 'json'
    CELERY_RESULT_SERIALIZER = 'json'
@@ -138,16 +132,16 @@ class Config(object):
    MAIL_USE_SSL = True
    MAIL_USERNAME = environ.get('MAIL_USERNAME')
    MAIL_PASSWORD = environ.get('MAIL_PASSWORD')
-    MAIL_DEFAULT_SENDER = ('eveAI Admin', MAIL_USERNAME)
+    MAIL_DEFAULT_SENDER = ('Evie', MAIL_USERNAME)
    # Email settings for API key notifications
    PROMOTIONAL_IMAGE_URL = 'https://askeveai.com/wp-content/uploads/2024/07/Evie-Call-scaled.jpg'  # Replace with your actual URL
    # Langsmith settings
    LANGCHAIN_TRACING_V2 = True
    LANGCHAIN_ENDPOINT = 'https://api.smith.langchain.com'
    LANGCHAIN_PROJECT = "eveai"
    SUPPORTED_FILE_TYPES = ['pdf', 'html', 'md', 'txt', 'mp3', 'mp4', 'ogg', 'srt']
    TENANT_TYPES = ['Active', 'Demo', 'Inactive', 'Test']
    # The maximum number of seconds allowed for audio compression (to save resources)
@@ -159,6 +153,13 @@ class Config(object):
    # Delay between compressing chunks in seconds
    COMPRESSION_PROCESS_DELAY = 1
    # WordPress Integration Settings
    WORDPRESS_PROTOCOL = os.environ.get('WORDPRESS_PROTOCOL', 'http')
    WORDPRESS_HOST = os.environ.get('WORDPRESS_HOST', 'host.docker.internal')
    WORDPRESS_PORT = os.environ.get('WORDPRESS_PORT', '10003')
    WORDPRESS_BASE_URL = f"{WORDPRESS_PROTOCOL}://{WORDPRESS_HOST}:{WORDPRESS_PORT}"
    EXTERNAL_WORDPRESS_BASE_URL = 'localhost:10003'
 class DevConfig(Config):
    DEVELOPMENT = True
@@ -181,13 +182,21 @@ class DevConfig(Config):
    # file upload settings
    # UPLOAD_FOLDER = '/app/tenant_files'
    # Redis Settings
    REDIS_URL = 'redis'
    REDIS_PORT = '6379'
    REDIS_BASE_URI = f'redis://{REDIS_URL}:{REDIS_PORT}'
    # Celery settings
    # eveai_app Redis Settings
-    CELERY_BROKER_URL = 'redis://redis:6379/0'
+    CELERY_BROKER_URL = f'{REDIS_BASE_URI}/0'
-    CELERY_RESULT_BACKEND = 'redis://redis:6379/0'
+    CELERY_RESULT_BACKEND = f'{REDIS_BASE_URI}/0'
    # eveai_chat Redis Settings
-    CELERY_BROKER_URL_CHAT = 'redis://redis:6379/3'
+    CELERY_BROKER_URL_CHAT = f'{REDIS_BASE_URI}/3'
-    CELERY_RESULT_BACKEND_CHAT = 'redis://redis:6379/3'
+    CELERY_RESULT_BACKEND_CHAT = f'{REDIS_BASE_URI}/3'
    # eveai_chat_workers cache Redis Settings
    CHAT_WORKER_CACHE_URL = f'{REDIS_BASE_URI}/4'
    # Unstructured settings
    # UNSTRUCTURED_API_KEY = 'pDgCrXumYhM3CNvjvwV8msMldXC3uw'
@@ -195,7 +204,7 @@ class DevConfig(Config):
    # UNSTRUCTURED_FULL_URL = 'https://flowitbv-16c4us0m.api.unstructuredapp.io/general/v0/general'
    # SocketIO settings
-    SOCKETIO_MESSAGE_QUEUE = 'redis://redis:6379/1'
+    SOCKETIO_MESSAGE_QUEUE = f'{REDIS_BASE_URI}/1'
    SOCKETIO_CORS_ALLOWED_ORIGINS = '*'
    SOCKETIO_LOGGER = True
    SOCKETIO_ENGINEIO_LOGGER = True
@@ -211,7 +220,7 @@ class DevConfig(Config):
    GC_CRYPTO_KEY = 'envelope-encryption-key'
    # Session settings
-    SESSION_REDIS = redis.from_url('redis://redis:6379/2')
+    SESSION_REDIS = redis.from_url(f'{REDIS_BASE_URI}/2')
    # PATH settings
    ffmpeg_path = '/usr/bin/ffmpeg'
@@ -278,6 +287,8 @@ class ProdConfig(Config):
    # eveai_chat Redis Settings
    CELERY_BROKER_URL_CHAT = f'{REDIS_BASE_URI}/3'
    CELERY_RESULT_BACKEND_CHAT = f'{REDIS_BASE_URI}/3'
    # eveai_chat_workers cache Redis Settings
    CHAT_WORKER_CACHE_URL = f'{REDIS_BASE_URI}/4'
    # Session settings
    SESSION_REDIS = redis.from_url(f'{REDIS_BASE_URI}/2')
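The config hunks above replace hard-coded Redis URLs with values derived from `REDIS_BASE_URI`, using one database index per purpose. A small sketch of that mapping, based on the indices visible in the diff, is shown below. The `REDIS_DATABASES` dictionary, its key names and the `redis_uri` helper are hypothetical and not part of the project's config.

```python
REDIS_URL = 'redis'
REDIS_PORT = '6379'
REDIS_BASE_URI = f'redis://{REDIS_URL}:{REDIS_PORT}'

# Database indices as they appear in the diff (the key names are illustrative only)
REDIS_DATABASES = {
    'celery_app': 0,         # CELERY_BROKER_URL / CELERY_RESULT_BACKEND
    'socketio': 1,           # SOCKETIO_MESSAGE_QUEUE
    'session': 2,            # SESSION_REDIS
    'celery_chat': 3,        # CELERY_BROKER_URL_CHAT / CELERY_RESULT_BACKEND_CHAT
    'chat_worker_cache': 4,  # CHAT_WORKER_CACHE_URL
}


def redis_uri(purpose: str) -> str:
    """Build the full Redis URI for one purpose from the shared base URI."""
    return f'{REDIS_BASE_URI}/{REDIS_DATABASES[purpose]}'


chat_broker = redis_uri('celery_chat')   # 'redis://redis:6379/3'
```

Deriving every URL from one base means that changing the host or port only requires touching `REDIS_URL` or `REDIS_PORT`.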
--- a/config/logging_config.py
+++ b/config/logging_config.py
@@ -1,4 +1,8 @@
 import json
 import os
 from datetime import datetime as dt, timezone as tz
 from flask import current_app
 from graypy import GELFUDPHandler
 import logging
 import logging.config
@@ -9,24 +13,173 @@ GRAYLOG_PORT = int(os.environ.get('GRAYLOG_PORT', 12201))
 env = os.environ.get('FLASK_ENV', 'development')
-class CustomLogRecord(logging.LogRecord):
+class TuningLogRecord(logging.LogRecord):
    """Extended LogRecord that handles both tuning and business event logging"""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Initialize extra fields after parent initialization
        self._extra_fields = {}
        self._is_tuning_log = False
        self._tuning_type = None
        self._tuning_tenant_id = None
        self._tuning_catalog_id = None
        self._tuning_specialist_id = None
        self._tuning_retriever_id = None
        self._tuning_processor_id = None
        self.component = os.environ.get('COMPONENT_NAME', 'eveai_app')
-    def __setattr__(self, name, value):
+    def getMessage(self):
-        if name not in {'event_type', 'tenant_id', 'trace_id', 'span_id', 'span_name', 'parent_span_id',
+        """
-                        'document_version_id', 'chat_session_id', 'interaction_id', 'environment'}:
+        Override getMessage to handle both string and dict messages
-            super().__setattr__(name, value)
+        """
        msg = str(self.msg)  # coerce non-string messages (e.g. dicts) before %-formatting
        if self.args:
            msg = msg % self.args
        return msg
    @property
    def is_tuning_log(self):
        return self._is_tuning_log
    @is_tuning_log.setter
    def is_tuning_log(self, value):
        object.__setattr__(self, '_is_tuning_log', value)
    @property
    def tuning_type(self):
        return self._tuning_type
    @tuning_type.setter
    def tuning_type(self, value):
        object.__setattr__(self, '_tuning_type', value)
    def get_tuning_data(self):
        """Get all tuning-related data if this is a tuning log"""
        if not self._is_tuning_log:
            return {}
        return {
            'is_tuning_log': self._is_tuning_log,
            'tuning_type': self._tuning_type,
            'tuning_tenant_id': self._tuning_tenant_id,
            'tuning_catalog_id': self._tuning_catalog_id,
            'tuning_specialist_id': self._tuning_specialist_id,
            'tuning_retriever_id': self._tuning_retriever_id,
            'tuning_processor_id': self._tuning_processor_id,
        }
    def set_tuning_data(self, tenant_id=None, catalog_id=None, specialist_id=None,
                        retriever_id=None, processor_id=None):
        """Set tuning-specific data"""
        object.__setattr__(self, '_tuning_tenant_id', tenant_id)
        object.__setattr__(self, '_tuning_catalog_id', catalog_id)
        object.__setattr__(self, '_tuning_specialist_id', specialist_id)
        object.__setattr__(self, '_tuning_retriever_id', retriever_id)
        object.__setattr__(self, '_tuning_processor_id', processor_id)
-def custom_log_record_factory(*args, **kwargs):
+class TuningFormatter(logging.Formatter):
-    record = CustomLogRecord(*args, **kwargs)
+    """Universal formatter for all tuning logs"""
-    return record
+
    def __init__(self, fmt=None, datefmt=None):
        super().__init__(fmt or '%(asctime)s [%(levelname)s] %(name)s: %(message)s',
                         datefmt or '%Y-%m-%d %H:%M:%S')
    def format(self, record):
        # First format with the default formatter to handle basic fields
        formatted_msg = super().format(record)
        # If this is a tuning log, add the additional context
        if getattr(record, 'is_tuning_log', False):
            try:
                identifiers = []
                if hasattr(record, 'tenant_id') and record.tenant_id:
                    identifiers.append(f"Tenant: {record.tenant_id}")
                if hasattr(record, 'catalog_id') and record.catalog_id:
                    identifiers.append(f"Catalog: {record.catalog_id}")
                if hasattr(record, 'processor_id') and record.processor_id:
                    identifiers.append(f"Processor: {record.processor_id}")
                formatted_msg = (
                    f"{formatted_msg}\n"
                    f"[TUNING {record.tuning_type}] [{' | '.join(identifiers)}]"
                )
                if hasattr(record, 'tuning_data') and record.tuning_data:
                    formatted_msg += f"\nData: {json.dumps(record.tuning_data, indent=2)}"
            except Exception as e:
                return f"{formatted_msg} (Error formatting tuning data: {str(e)})"
        return formatted_msg
 class GraylogFormatter(logging.Formatter):
    """Maintains existing Graylog formatting while adding tuning fields"""
    def format(self, record):
        if getattr(record, 'is_tuning_log', False):
            # Add tuning-specific fields to Graylog
            record.tuning_fields = {
                'is_tuning_log': True,
                'tuning_type': record.tuning_type,
                'tenant_id': record.tenant_id,
                'catalog_id': record.catalog_id,
                'specialist_id': record.specialist_id,
                'retriever_id': record.retriever_id,
                'processor_id': record.processor_id,
            }
        return super().format(record)
 class TuningLogger:
    """Helper class to manage tuning logs with consistent structure"""
    def __init__(self, logger_name, tenant_id=None, catalog_id=None, specialist_id=None, retriever_id=None, processor_id=None):
        self.logger = logging.getLogger(logger_name)
        self.tenant_id = tenant_id
        self.catalog_id = catalog_id
        self.specialist_id = specialist_id
        self.retriever_id = retriever_id
        self.processor_id = processor_id
    def log_tuning(self, tuning_type: str, message: str, data=None, level=logging.DEBUG):
        """Log a tuning event with structured data"""
        try:
            # Create a standard LogRecord for tuning
            record = logging.LogRecord(
                name=self.logger.name,
                level=level,
                pathname='',
                lineno=0,
                msg=message,
                args=(),
                exc_info=None
            )
            # Add tuning-specific attributes
            record.is_tuning_log = True
            record.tuning_type = tuning_type
            record.tenant_id = self.tenant_id
            record.catalog_id = self.catalog_id
            record.specialist_id = self.specialist_id
            record.retriever_id = self.retriever_id
            record.processor_id = self.processor_id
            if data:
                record.tuning_data = data
            # Process the record
            self.logger.handle(record)
        except Exception as e:
            fallback_logger = logging.getLogger('eveai_workers')
            fallback_logger.exception(f"Failed to log tuning message: {str(e)}")
 # Set the custom log record factory
-logging.setLogRecordFactory(custom_log_record_factory)
+logging.setLogRecordFactory(TuningLogRecord)
 LOGGING = {
@@ -38,7 +191,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_app.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_workers': {
@@ -46,7 +199,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_workers.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_chat': {
@@ -54,7 +207,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_chat.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_chat_workers': {
@@ -62,7 +215,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_chat_workers.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_api': {
@@ -70,7 +223,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_api.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_beat': {
@@ -78,7 +231,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_beat.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_entitlements': {
@@ -86,7 +239,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/eveai_entitlements.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_sqlalchemy': {
@@ -94,7 +247,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/sqlalchemy.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_mailman': {
@@ -102,7 +255,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/mailman.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_security': {
@@ -110,7 +263,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/security.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_rag_tuning': {
@@ -118,7 +271,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/rag_tuning.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_embed_tuning': {
@@ -126,7 +279,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/embed_tuning.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'file_business_events': {
@@ -134,7 +287,7 @@ LOGGING = {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/business_events.log',
            'maxBytes': 1024 * 1024 * 1,  # 1MB
-            'backupCount': 10,
+            'backupCount': 2,
            'formatter': 'standard',
        },
        'console': {
@@ -142,25 +295,38 @@ LOGGING = {
            'level': 'DEBUG',
            'formatter': 'standard',
        },
        'tuning_file': {
            'level': 'DEBUG',
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': 'logs/tuning.log',
            'maxBytes': 1024 * 1024 * 3,  # 3MB
            'backupCount': 3,
            'formatter': 'tuning',
        },
        'graylog': {
            'level': 'DEBUG',
            'class': 'graypy.GELFUDPHandler',
            'host': GRAYLOG_HOST,
            'port': GRAYLOG_PORT,
-            'debugging_fields': True,  # Set to True if you want to include debugging fields
+            'debugging_fields': True,
-            'extra_fields': True,  # Set to True if you want to include extra fields
+            'formatter': 'graylog'
        },
    },
    'formatters': {
        'standard': {
-            'format': '%(asctime)s [%(levelname)s] %(name)s (%(component)s) [%(module)s:%(lineno)d in %(funcName)s] '
+            'format': '%(asctime)s [%(levelname)s] %(name)s (%(component)s) [%(module)s:%(lineno)d]: %(message)s',
-                      '[Thread: %(threadName)s]: %(message)s'
+            'datefmt': '%Y-%m-%d %H:%M:%S'
        },
        'graylog': {
            'format': '[%(levelname)s] %(name)s (%(component)s) [%(module)s:%(lineno)d in %(funcName)s] '
                      '[Thread: %(threadName)s]: %(message)s',
            'datefmt': '%Y-%m-%d %H:%M:%S',
            '()': GraylogFormatter
        },
        'tuning': {
            '()': TuningFormatter,
            'datefmt': '%Y-%m-%d %H:%M:%S UTC'
        }
    },
    'loggers': {
        'eveai_app': {  # logger for the eveai_app
@@ -213,21 +379,17 @@ LOGGING = {
            'level': 'DEBUG',
            'propagate': False
        },
        'rag_tuning': {  # logger for the rag_tuning
            'handlers': ['file_rag_tuning', 'graylog', ] if env == 'production' else ['file_rag_tuning', ],
            'level': 'DEBUG',
            'propagate': False
        },
        'embed_tuning': {  # logger for the embed_tuning
            'handlers': ['file_embed_tuning', 'graylog', ] if env == 'production' else ['file_embed_tuning', ],
            'level': 'DEBUG',
            'propagate': False
        },
        'business_events': {
            'handlers': ['file_business_events', 'graylog'],
            'level': 'DEBUG',
            'propagate': False
        },
        # Single tuning logger
        'tuning': {
            'handlers': ['tuning_file', 'graylog'] if env == 'production' else ['tuning_file'],
            'level': 'DEBUG',
            'propagate': False,
        },
        '': {  # root logger
            'handlers': ['console'],
            'level': 'WARNING',  # Set higher level for root to minimize noise
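The tuning log flow introduced in `config/logging_config.py` can be sketched in isolation: a record is built by hand, tagged with tuning attributes, and passed to `Logger.handle`, after which a formatter appends the tuning context. The version below is simplified and self-contained. `SketchTuningFormatter`, the `tuning_sketch` logger name and the sample values are assumptions, and the output goes to an in-memory stream instead of `logs/tuning.log`.

```python
import io
import json
import logging


class SketchTuningFormatter(logging.Formatter):
    """Simplified stand-in for TuningFormatter: appends tuning context when present."""

    def format(self, record):
        text = super().format(record)
        if getattr(record, 'is_tuning_log', False):
            text += f"\n[TUNING {record.tuning_type}] [Tenant: {record.tenant_id}]"
            data = getattr(record, 'tuning_data', None)
            if data:
                text += f"\nData: {json.dumps(data)}"
        return text


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(SketchTuningFormatter('[%(levelname)s] %(name)s: %(message)s'))

logger = logging.getLogger('tuning_sketch')
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(handler)

# Build the record manually, as TuningLogger.log_tuning does in the diff
record = logging.LogRecord(name=logger.name, level=logging.DEBUG, pathname='',
                           lineno=0, msg='Chunking finished', args=(), exc_info=None)
record.is_tuning_log = True
record.tuning_type = 'CHUNKING'
record.tenant_id = 42
record.tuning_data = {'chunks': 3}
logger.handle(record)

output = stream.getvalue()
```

`Logger.handle` passes the record to the handlers without applying the logger's own level check, so in this pattern the level filtering is left to the handlers.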
--- a/config/prompts/anthropic/claude-3-5-sonnet.yaml
+++ b/config/prompts/anthropic/claude-3-5-sonnet.yaml
@@ -1,88 +0,0 @@
 html_parse: |
  You are a top administrative assistant specialized in transforming given HTML into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the HTML.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - Sub-headers can be used as lists. This is true when a header is followed by a series of sub-headers without content (paragraphs or listed items). Present those sub-headers as a list.  
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input html file. Answer with the pure markdown, without any other text.
  HTML is between triple backticks.
  ```{html}```  
 pdf_parse: |
  You are a top administrative aid specialized in transforming given PDF-files into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the PDF.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - When headings are numbered, show the numbering and define the header level. 
  - A new item is started when a <return> is found before a full line is reached. In order to know the number of characters in a line, please check the document and the context within the document (e.g. an image could limit the number of characters temporarily).
  - Paragraphs are to be stripped of newlines so they become easily readable.
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input pdf content. Answer with the pure markdown, without any other text.
  PDF content is between triple backticks.
  ```{pdf_content}```
 summary: |
  Write a concise summary of the text in {language}. The text is delimited between triple backticks.
  ```{text}```
 rag: |
  Answer the question based on the following context, delimited between triple backticks. 
  {tenant_context}
  Use the following {language} in your communication, and cite the sources used.
  If the question cannot be answered using the given context, say "I have insufficient information to answer this question."
  Context:
  ```{context}```
  Question:
  {question}
 history: |
  You are a helpful assistant that details a question based on a previous context,
  in such a way that the question is understandable without the previous context. 
  The context is a conversation history, with the HUMAN asking questions, the AI answering questions.
  The history is delimited between triple backticks.
  You answer by stating the question in {language}.
  History:
  ```{history}```
  Question to be detailed:
  {question}
 encyclopedia: |
  You have a lot of background knowledge, and as such you are some kind of 
  'encyclopedia' to explain general terminology. Only answer if you have a clear understanding of the question. 
  If not, say you do not have sufficient information to answer the question. Use the {language} in your communication.
  Question:
  {question}
 transcript: |
  """You are a top administrative assistant specialized in transforming given transcriptions into markdown formatted files. Your task is to process and improve the given transcript, not to summarize it.
  IMPORTANT INSTRUCTIONS:
  1. DO NOT summarize the transcript and don't make your own interpretations. Return the FULL, COMPLETE transcript with improvements.
  2. Improve any errors in the transcript based on context.
  3. Respect the original wording and language(s) used in the transcription. Main Language used is {language}.
  4. Divide the transcript into paragraphs for better readability. Each paragraph ONLY contains ORIGINAL TEXT.
  5. Group related paragraphs into logical sections.
  6. Add appropriate headers (using markdown syntax) to each section in {language}.
  7. We do not need an overall title. Just add logical headers
  8. Ensure that the entire transcript is included in your response, from start to finish.
  REMEMBER: 
  - Your output should be the complete transcript in markdown format, NOT A SUMMARY OR ANALYSIS. 
  - Include EVERYTHING from the original transcript, just organized and formatted better.
  - Just return the markdown version of the transcript, without any other text such as an introduction or a summary.
  Here is the transcript to process (between triple backticks):
  ```{transcript}```
  Process this transcript according to the instructions above and return the full, formatted markdown version.
  """
--- a/config/prompts/openai/gpt-4o-mini.yaml
+++ b/config/prompts/openai/gpt-4o-mini.yaml
@@ -1,79 +0,0 @@
 html_parse: |
  You are a top administrative assistant specialized in transforming given HTML into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the HTML.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - Sub-headers can be used as lists. This is true when a header is followed by a series of sub-headers without content (paragraphs or listed items). Present those sub-headers as a list.  
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input html file. Answer with the pure markdown, without any other text.
  HTML is between triple backquotes.
  ```{html}```  
 pdf_parse: |
  You are a top administrative aid specialized in transforming given PDF-files into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the PDF.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - When headings are numbered, show the numbering and define the header level. 
  - A new item is started when a <return> is found before a full line is reached. In order to know the number of characters in a line, please check the document and the context within the document (e.g. an image could limit the number of characters temporarily).
  - Paragraphs are to be stripped of newlines so they become easily readable.
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input pdf content. Answer with the pure markdown, without any other text.
  PDF content is between triple backquotes.
  ```{pdf_content}```
 summary: |
  Write a concise summary of the text in {language}. The text is delimited between triple backquotes.
  ```{text}```
 rag: |
  Answer the question based on the following context, delimited between triple backquotes. 
  {tenant_context}
  Use the following {language} in your communication, and cite the sources used.
  If the question cannot be answered using the given context, say "I have insufficient information to answer this question."
  Context:
  ```{context}```
  Question:
  {question}
 history: |
  You are a helpful assistant that details a question based on a previous context,
  in such a way that the question is understandable without the previous context. 
  The context is a conversation history, with the HUMAN asking questions, the AI answering questions.
  The history is delimited between triple backquotes.
  You answer by stating the question in {language}.
  History:
  ```{history}```
  Question to be detailed:
  {question}
 encyclopedia: |
  You have a lot of background knowledge, and as such you are some kind of 
  'encyclopedia' to explain general terminology. Only answer if you have a clear understanding of the question. 
  If not, say you do not have sufficient information to answer the question. Use the {language} in your communication.
  Question:
  {question}
 transcript: |
  You are a top administrative assistant specialized in transforming given transcriptions into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system. The transcriptions originate from podcasts, videos and similar material.
  # Best practices and steps are:
  - Respect wordings and language(s) used in the transcription. Main language is {language}.
  - Sometimes, the transcript contains speech of several people participating in a conversation. Although these are not obvious from reading the file, try to detect when other people are speaking.    
  - Divide the transcript into several logical parts. Ensure questions and their answers are in the same logical part.
  - annotate the text to identify these logical parts using headings in {language}.
  - improve errors in the transcript given the context, but do not change the meaning and intentions of the transcription.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of processing the complete input transcription. Answer with the pure markdown, without any other text.
  The transcript is between triple backquotes.
  ```{transcript}```
--- a/config/prompts/openai/gpt-4o.yaml
+++ b/config/prompts/openai/gpt-4o.yaml
@@ -1,84 +0,0 @@
 html_parse: |
  You are a top administrative assistant specialized in transforming given HTML into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the HTML.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - Sub-headers can be used as lists. This is true when a header is followed by a series of sub-headers without content (paragraphs or listed items). Present those sub-headers as a list.  
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input html file. Answer with the pure markdown, without any other text.
  HTML is between triple backquotes.
  ```{html}```  
 pdf_parse: |
  You are a top administrative aid specialized in transforming given PDF-files into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  The content you get is already processed (some markdown already generated), but needs to be corrected. For large files, you may receive only portions of the full file. Consider this when processing the content.
  # Best practices are:
  - Respect wordings and language(s) used in the provided content.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - When headings are numbered, show the numbering and define the header level. You may have to correct current header levels, as preprocessing is known to make errors.
  - A new item is started when a <return> is found before a full line is reached. In order to know the number of characters in a line, please check the document and the context within the document (e.g. an image could limit the number of characters temporarily).
  - Paragraphs are to be stripped of newlines so they become easily readable.
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input pdf content. Answer with the pure markdown, without any other text.
  PDF content is between triple backquotes.
  ```{pdf_content}```
 summary: |
  Write a concise summary of the text in {language}. The text is delimited between triple backquotes.
  ```{text}```
 rag: |
  Answer the question based on the following context, delimited between triple backquotes. 
  {tenant_context}
  Use the following {language} in your communication, and cite the sources used.
  If the question cannot be answered using the given context, say "I have insufficient information to answer this question."
  Context:
  ```{context}```
  Question:
  {question}
 history: |
  You are a helpful assistant that details a question based on a previous context,
  in such a way that the question is understandable without the previous context. 
  The context is a conversation history, with the HUMAN asking questions, the AI answering questions.
  The history is delimited between triple backquotes.
  You answer by stating the question in {language}.
  History:
  ```{history}```
  Question to be detailed:
  {question}
 encyclopedia: |
  You have a lot of background knowledge, and as such you are some kind of 
  'encyclopedia' to explain general terminology. Only answer if you have a clear understanding of the question. 
  If not, say you do not have sufficient information to answer the question. Use the {language} in your communication.
  Question:
  {question}
 transcript: |
  You are a top administrative assistant specialized in transforming given transcriptions into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system. The transcriptions originate from podcasts, videos and similar material.
  You may receive information in different chunks. If you're not receiving the first chunk, you'll get the last part of the previous chunk, including its title, between triple $. Consider this last part and the title as the start of the new chunk.
  # Best practices and steps are:
  - Respect wordings and language(s) used in the transcription. Main language is {language}.
  - Sometimes, the transcript contains speech of several people participating in a conversation. Although these are not obvious from reading the file, try to detect when other people are speaking.    
  - Divide the transcript into several logical parts. Ensure questions and their answers are in the same logical part. Don't make logical parts too small. They should contain at least 7 or 8 sentences.
  - annotate the text to identify these logical parts using headings in {language}.
  - improve errors in the transcript given the context, but do not change the meaning and intentions of the transcription.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of processing the complete input transcription. Answer with the pure markdown, without any other text.
  The transcript is between triple backquotes.
  $$${previous_part}$$$
  ```{transcript}```
--- a/config/prompts/openai/gpt-4o/encyclopedia/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/encyclopedia/1.0.0.yaml
@@ -0,0 +1,12 @@
 version: "1.0.0"
 content: |
  You have a lot of background knowledge, and as such you are some kind of 
  'encyclopedia' to explain general terminology. Only answer if you have a clear understanding of the question. 
  If not, say you do not have sufficient information to answer the question. Use the {language} in your communication.
  Question:
  {question}
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "A background information retriever for Evie"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/history/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/history/1.0.0.yaml
@@ -0,0 +1,16 @@
 version: "1.0.0"
 content: |
  You are a helpful assistant that details a question based on a previous context,
  in such a way that the question is understandable without the previous context. 
  The context is a conversation history, with the HUMAN asking questions, the AI answering questions.
  The history is delimited between triple backquotes.
  You answer by stating the question in {language}.
  History:
  ```{history}```
  Question to be detailed:
  {question}
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "Prompt to further detail a question based on the previous conversation"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/html_parse/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/html_parse/1.0.0.yaml
@@ -0,0 +1,20 @@
 version: "1.0.0"
 content: |
  You are a top administrative assistant specialized in transforming given HTML into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  # Best practices are:
  - Respect wordings and language(s) used in the HTML.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - Sub-headers can be used as lists. This is true when a header is followed by a series of sub-headers without content (paragraphs or listed items). Present those sub-headers as a list.  
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input html file. Answer with the pure markdown, without any other text.
  HTML is between triple backquotes.
  ```{html}```
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "An aid in transforming HTML-based inputs to markdown"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/pdf_parse/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/pdf_parse/1.0.0.yaml
@@ -0,0 +1,23 @@
 version: "1.0.0"
 content: |
  You are a top administrative aid specialized in transforming given PDF-files into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system.
  The content you get is already processed (some markdown already generated), but needs to be corrected. For large files, you may receive only portions of the full file. Consider this when processing the content.
  # Best practices are:
  - Respect wordings and language(s) used in the provided content.
  - The following items need to be considered: headings, paragraphs, listed items (numbered or not) and tables. Images can be neglected.
  - When headings are numbered, show the numbering and define the header level. You may have to correct current header levels, as preprocessing is known to make errors.
  - A new item is started when a <return> is found before a full line is reached. In order to know the number of characters in a line, please check the document and the context within the document (e.g. an image could limit the number of characters temporarily).
  - Paragraphs are to be stripped of newlines so they become easily readable.
  - Be careful of encoding of the text. Everything needs to be human readable.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of the processing of the complete input pdf content. Answer with the pure markdown, without any other text.
  PDF content is between triple backquotes.
  ```{pdf_content}```
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "An assistant to parse PDF-content into markdown"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/rag/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/rag/1.0.0.yaml
@@ -0,0 +1,15 @@
 version: "1.0.0"
 content: |
  Answer the question based on the following context, delimited between triple backquotes. 
  {tenant_context}
  Use the following {language} in your communication, and cite the sources used at the end of the full conversation.
  If the question cannot be answered using the given context, say "I have insufficient information to answer this question."
  Context:
  ```{context}```
  Question:
  {question}
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "The Main RAG retriever"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/summary/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/summary/1.0.0.yaml
@@ -0,0 +1,9 @@
 version: "1.0.0"
 content: |
  Write a concise summary of the text in {language}. The text is delimited between triple backquotes.
  ```{text}```
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "An assistant to create a summary when multiple chunks are required for 1 file"
  changes: "Initial version migrated from flat file structure"
--- a/config/prompts/openai/gpt-4o/transcript/1.0.0.yaml
+++ b/config/prompts/openai/gpt-4o/transcript/1.0.0.yaml
@@ -0,0 +1,25 @@
 version: "1.0.0"
 content: |
  You are a top administrative assistant specialized in transforming given transcriptions into markdown formatted files. The generated files will be used to generate embeddings in a RAG-system. The transcriptions originate from podcasts, videos and similar material.
  You may receive information in different chunks. If you're not receiving the first chunk, you'll get the last part of the previous chunk, including its title, between triple $. Consider this last part and the title as the start of the new chunk.
  # Best practices and steps are:
  - Respect wordings and language(s) used in the transcription. Main language is {language}.
  - Sometimes, the transcript contains speech of several people participating in a conversation. Although these are not obvious from reading the file, try to detect when other people are speaking.    
  - Divide the transcript into several logical parts. Ensure questions and their answers are in the same logical part. Don't make logical parts too small. They should contain at least 7 or 8 sentences.
  - annotate the text to identify these logical parts using headings in {language}.
  - improve errors in the transcript given the context, but do not change the meaning and intentions of the transcription.
  Process the file carefully, and take a stepped approach. The resulting markdown should be the result of processing the complete input transcription. Answer with the pure markdown, without any other text.
  The transcript is between triple backquotes.
  $$${previous_part}$$$
  ```{transcript}```
 metadata:
  author: "Josako"
  date_added: "2024-11-10"
  description: "An assistant to transform a transcript to markdown."
  changes: "Initial version migrated from flat file structure"
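The versioned layout introduced above (`config/prompts/openai/gpt-4o/<prompt>/<version>.yaml`, each file carrying `version`, `content` and `metadata`) implies that some loader picks one version per prompt and fills in the `{...}` placeholders. The diff does not show how the application does this, so the following is only a minimal sketch under stated assumptions: the helper names `parse_version`, `latest_prompt_file` and `render_prompt` are hypothetical, version selection is assumed to be "highest semantic version wins", and the template used here is a simplified stand-in for the `summary` prompt.

```python
from pathlib import PurePosixPath


def parse_version(filename: str) -> tuple[int, ...]:
    """Turn a file name such as '1.0.0.yaml' into (1, 0, 0).

    Comparing tuples of integers avoids the string-ordering trap
    where '1.10.0' would sort before '1.2.0'.
    """
    stem = PurePosixPath(filename).stem  # strips only the final '.yaml'
    return tuple(int(part) for part in stem.split("."))


def latest_prompt_file(filenames: list[str]) -> str:
    """Return the file name with the highest version (hypothetical policy)."""
    return max(filenames, key=parse_version)


def render_prompt(content: str, **variables: str) -> str:
    """Fill the {placeholders} of a prompt's `content` field.

    str.format raises KeyError when a placeholder has no value,
    which surfaces a missing variable early.
    """
    return content.format(**variables)


if __name__ == "__main__":
    files = ["1.0.0.yaml", "1.2.0.yaml", "1.10.0.yaml"]
    print(latest_prompt_file(files))

    # Simplified stand-in for the `summary` prompt content.
    template = "Write a concise summary of the text in {language}.\n{text}"
    print(render_prompt(template, language="en", text="Some document text."))
```

Keeping version parsing separate from rendering means a different selection policy (for example, pinning a version per tenant) could replace `latest_prompt_file` without touching the rest.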
--- a/config/type_defs/__init__.py
+++ b/config/type_defs/__init__.py
--- a/config/type_defs/catalog_types.py
+++ b/config/type_defs/catalog_types.py
@@ -1,9 +1,10 @@
 # Catalog Types
 CATALOG_TYPES = {
-    "DEFAULT": {
+    "STANDARD_CATALOG": {
-        "name": "Default Catalog",
+        "name": "Standard Catalog",
        "Description": "A Catalog with information in Evie's Library, to be considered as a whole",
-        "configuration": {}
+        "configuration": {},
        "document_version_configurations": []
    },
    "DOSSIER": {
        "name": "Dossier Catalog",
@@ -21,31 +22,6 @@ CATALOG_TYPES = {
                                - min_value/max_value: range limits (for numeric types only)""",
                "required": True,
                "default": {},
                "field_properties": {
                    "type": {
                        "allowed_values": ["string", "integer", "float", "date", "enum"],
                        "required": True
                    },
                    "required": {
                        "type": "boolean",
                        "default": False
                    },
                    "description": {
                        "type": "string"
                    },
                    "allowed_values": {
                        "type": "list",
                        "description": "For enum type fields only"
                    },
                    "min_value": {
                        "type": "number",
                        "description": "For numeric fields only"
                    },
                    "max_value": {
                        "type": "number",
                        "description": "For numeric fields only"
                    }
                }
            }
        },
        "document_version_configurations": ["tagging_fields"]
--- a/config/type_defs/processor_types.py
+++ b/config/type_defs/processor_types.py
@@ -0,0 +1,168 @@
 # Processor Types
 PROCESSOR_TYPES = {
    "HTML_PROCESSOR": {
        "name": "HTML Processor",
        "file_types": "html",
        "Description": "A processor for HTML files",
        "configuration": {
            "chunking_patterns": {
                "name": "Chunking Patterns",
                "description": "A list of Patterns used to chunk files into logical pieces",
                "type": "chunking_patterns",
                "required": False
            },
            "chunking_heading_level": {
                "name": "Chunking Heading Level",
                "type": "integer",
                "description": "Maximum heading level to consider for chunking (1-6)",
                "required": False,
                "default": 2
            },
            "html_tags": {
                "name": "HTML Tags",
                "type": "string",
                "description": "A comma-separated list of HTML tags",
                "required": True,
                "default": "p, h1, h2, h3, h4, h5, h6, li, table, thead, tbody, tr, td"
            },
            "html_end_tags": {
                "name": "HTML End Tags",
                "type": "string",
                "description": "A comma-separated list of HTML end tags (where can the chunk end)",
                "required": True,
                "default": "p, li, table"
            },
            "html_included_elements": {
                "name": "HTML Included Elements",
                "type": "string",
                "description": "A comma-separated list of elements to be included",
                "required": True,
                "default": "article, main"
            },
            "html_excluded_elements": {
                "name": "HTML Excluded Elements",
                "type": "string",
                "description": "A comma-separated list of elements to be excluded",
                "required": False,
                "default": "header, footer, nav, script"
            },
            "html_excluded_classes": {
                "name": "HTML Excluded Classes",
                "type": "string",
                "description": "A comma-separated list of classes to be excluded",
                "required": False,
            },
        },
    },
    "PDF_PROCESSOR": {
        "name": "PDF Processor",
        "file_types": "pdf",
        "Description": "A Processor for PDF files",
        "configuration": {
            "chunking_patterns": {
                "name": "Chunking Patterns",
                "description": "A list of Patterns used to chunk files into logical pieces",
                "type": "chunking_patterns",
                "required": False
            },
            "chunking_heading_level": {
                "name": "Chunking Heading Level",
                "type": "integer",
                "description": "Maximum heading level to consider for chunking (1-6)",
                "required": False,
                "default": 2
            },
        },
    },
    "AUDIO_PROCESSOR": {
        "name": "AUDIO Processor",
        "file_types": "mp3, mp4, ogg",
        "Description": "A Processor for audio files",
        "configuration": {}
    },
    "MARKDOWN_PROCESSOR": {
        "name": "Markdown Processor",
        "file_types": "md",
        "Description": "A Processor for markdown files",
        "configuration": {
            "chunking_patterns": {
                "name": "Chunking Patterns",
                "description": "A list of Patterns used to chunk files into logical pieces",
                "type": "chunking_patterns",
                "required": False
            },
            "chunking_heading_level": {
                "name": "Chunking Heading Level",
                "type": "integer",
                "description": "Maximum heading level to consider for chunking (1-6)",
                "required": False,
                "default": 2
            },
        }
    },
    "DOCX_PROCESSOR": {
        "name": "DOCX Processor",
        "file_types": "docx",
        "Description": "A processor for DOCX files",
        "configuration": {
            "chunking_patterns": {
                "name": "Chunking Patterns",
                "description": "A list of Patterns used to chunk files into logical pieces",
                "type": "chunking_patterns",
                "required": False
            },
            "chunking_heading_level": {
                "name": "Chunking Heading Level",
                "type": "integer",
                "description": "Maximum heading level to consider for chunking (1-6)",
                "required": False,
                "default": 2
            },
            "extract_comments": {
                "name": "Extract Comments",
                "type": "boolean",
                "description": "Whether to include document comments in the markdown",
                "required": False,
                "default": False
            },
            "extract_headers_footers": {
                "name": "Extract Headers/Footers",
                "type": "boolean",
                "description": "Whether to include headers and footers in the markdown",
                "required": False,
                "default": False
            },
            "preserve_formatting": {
                "name": "Preserve Formatting",
                "type": "boolean",
                "description": "Whether to preserve bold, italic, and other text formatting",
                "required": False,
                "default": True
            },
            "list_style": {
                "name": "List Style",
                "type": "enum",
                "description": "How to format lists in markdown",
                "required": False,
                "default": "dash",
                "allowed_values": ["dash", "asterisk", "plus"]
            },
            "image_handling": {
                "name": "Image Handling",
                "type": "enum",
                "description": "How to handle embedded images",
                "required": False,
                "default": "skip",
                "allowed_values": ["skip", "extract", "placeholder"]
            },
            "table_alignment": {
                "name": "Table Alignment",
                "type": "enum",
                "description": "How to align table contents",
                "required": False,
                "default": "left",
                "allowed_values": ["left", "center", "preserve"]
            }
        }
    }
 }
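Each entry in `PROCESSOR_TYPES` describes its configuration fields declaratively (`type`, `required`, `default`, and `allowed_values` for enums). One way such a definition could be consumed is to merge user-supplied values with the declared defaults and reject values that violate the declaration. The sketch below is an illustration only: `resolve_configuration` is a hypothetical helper that does not appear in this change set, and the dictionary is a two-field excerpt of the `DOCX_PROCESSOR` entry rather than the full definition.

```python
# Two-field excerpt of the DOCX_PROCESSOR definition, for illustration.
PROCESSOR_TYPES = {
    "DOCX_PROCESSOR": {
        "name": "DOCX Processor",
        "file_types": "docx",
        "configuration": {
            "chunking_heading_level": {
                "type": "integer",
                "required": False,
                "default": 2,
            },
            "list_style": {
                "type": "enum",
                "required": False,
                "default": "dash",
                "allowed_values": ["dash", "asterisk", "plus"],
            },
        },
    },
}


def resolve_configuration(processor_type: str, user_config: dict) -> dict:
    """Merge user values with declared defaults and validate them.

    Hypothetical helper: the validation rules are assumptions
    derived from the field declarations, not the project's code.
    """
    spec = PROCESSOR_TYPES[processor_type]["configuration"]

    unknown = set(user_config) - set(spec)
    if unknown:
        raise ValueError(f"Unknown configuration keys: {sorted(unknown)}")

    resolved = {}
    for key, field in spec.items():
        if key in user_config:
            value = user_config[key]
        elif "default" in field:
            value = field["default"]
        elif field.get("required", False):
            raise ValueError(f"Missing required configuration key: {key}")
        else:
            continue  # optional field without a default stays unset

        # bool is a subclass of int in Python, so exclude it explicitly.
        if field["type"] == "integer" and (
            isinstance(value, bool) or not isinstance(value, int)
        ):
            raise TypeError(f"{key} must be an integer")
        if field["type"] == "enum" and value not in field["allowed_values"]:
            raise ValueError(f"{key} must be one of {field['allowed_values']}")

        resolved[key] = value
    return resolved


if __name__ == "__main__":
    print(resolve_configuration("DOCX_PROCESSOR", {"list_style": "plus"}))
```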
--- a/config/type_defs/retriever_types.py
+++ b/config/type_defs/retriever_types.py
@@ -1,7 +1,7 @@
 # Retriever Types
 RETRIEVER_TYPES = {
-    "DEFAULT_RAG": {
+    "STANDARD_RAG": {
-        "name": "Default RAG",
+        "name": "Standard RAG Retriever",
         "description": "Retrieves all embeddings that match the query",
        "configuration": {
            "es_k": {
@@ -18,6 +18,14 @@ RETRIEVER_TYPES = {
                "required": True,
                "default": 0.3,
            },
        },
        "arguments": {
            "query": {
                "name": "query",
                "type": "str",
                "description": "Query to retrieve embeddings",
                "required": True,
            },
        }
    }
 }
--- a/config/type_defs/service_types.py
+++ b/config/type_defs/service_types.py
@@ -0,0 +1,11 @@
 # Service Types
 SERVICE_TYPES = {
    "CHAT": {
        "name": "CHAT",
         "description": "Service that allows the use of CHAT functionality.",
    },
    "DOCAPI": {
        "name": "DOCAPI",
         "description": "Service that allows the use of document API functionality.",
    },
 }
--- a/config/type_defs/specialist_types.py
+++ b/config/type_defs/specialist_types.py
@@ -0,0 +1,62 @@
 # Specialist Types
 SPECIALIST_TYPES = {
    "STANDARD_RAG": {
        "name": "Q&A RAG Specialist",
        "description": "Standard Q&A through RAG Specialist",
        "configuration": {
            "specialist_context": {
                "name": "Specialist Context",
                "type": "text",
                "description": "The context to be used by the specialist.",
                "required": False,
            },
            "temperature": {
                "name": "Temperature",
                "type": "number",
                "description": "The inference temperature to be used by the specialist.",
                "required": False,
                "default": 0.3
            }
        },
        "arguments": {
            "language": {
                "name": "Language",
                "type": "str",
                "description": "Language code to be used for receiving questions and giving answers",
                "required": True,
            },
            "query": {
                "name": "query",
                "type": "str",
                "description": "Query to answer",
                "required": True,
            }
        },
        "results": {
            "detailed_query": {
                "name": "detailed_query",
                "type": "str",
                "description": "The query detailed with the Chat Session History.",
                "required": True,
            },
            "answer": {
                "name": "answer",
                "type": "str",
                "description": "Answer to the query",
                "required": True,
            },
            "citations": {
                "name": "citations",
                "type": "List[str]",
                "description": "List of citations",
                "required": False,
            },
            "insufficient_info": {
                "name": "insufficient_info",
                "type": "bool",
                 "description": "Whether the available information was insufficient to answer the query",
                "required": True,
            },
        }
    }
 }
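The `arguments` and `results` blocks of `SPECIALIST_TYPES` mark each field as required or optional, which makes it possible to check a call before a specialist is executed. The following is a minimal sketch of such a check, assuming the caller passes arguments as a plain dictionary; `missing_required` is a hypothetical helper, and the dictionary is an excerpt of the `STANDARD_RAG` arguments reduced to the keys the check uses.

```python
# Excerpt of SPECIALIST_TYPES["STANDARD_RAG"]["arguments"], for illustration.
STANDARD_RAG_ARGUMENTS = {
    "language": {"name": "Language", "type": "str", "required": True},
    "query": {"name": "query", "type": "str", "required": True},
}


def missing_required(spec: dict, supplied: dict) -> list[str]:
    """Return the sorted names of required fields absent from `supplied`.

    Hypothetical helper: a field counts as required only when its
    declaration sets "required" to True.
    """
    return sorted(
        name
        for name, field in spec.items()
        if field.get("required", False) and name not in supplied
    )


if __name__ == "__main__":
    call_arguments = {"query": "What is a Dossier Catalog?"}
    print(missing_required(STANDARD_RAG_ARGUMENTS, call_arguments))
```

The same function could be applied to the `results` block to verify that a specialist returned `detailed_query`, `answer` and `insufficient_info` before the response is passed on.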
--- a/docker/compose_dev.yaml
+++ b/docker/compose_dev.yaml
@@ -18,16 +18,17 @@ x-common-variables: &common-variables
  FLASK_DEBUG: true
  SECRET_KEY: '97867c1491bea5ee6a8e8436eb11bf2ba6a69ff53ab1b17ecba450d0f2e572e1'
  SECURITY_PASSWORD_SALT: '228614859439123264035565568761433607235'
-  MAIL_USERNAME: eveai_super@flow-it.net
+  MAIL_USERNAME: evie@askeveai.com
-  MAIL_PASSWORD: '$$6xsWGbNtx$$CFMQZqc*'
+  MAIL_PASSWORD: 'D**0z@UGfJOI@yv3eC5'
  MAIL_SERVER: mail.flow-it.net
  MAIL_PORT: 465
  REDIS_URL: redis
  REDIS_PORT: '6379'
  FLOWER_USER: 'Felucia'
  FLOWER_PASSWORD: 'Jungles'
  OPENAI_API_KEY: 'sk-proj-8R0jWzwjL7PeoPyMhJTZT3BlbkFJLb6HfRB2Hr9cEVFWEhU7'
  GROQ_API_KEY: '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'
  ANTHROPIC_API_KEY: 'sk-ant-api03-c2TmkzbReeGhXBO5JxNH6BJNylRDonc9GmZd0eRbrvyekec2'
  PORTKEY_API_KEY: 'T2Dt4QTpgCvWxa1OftYCJtj7NcDZ'
  JWT_SECRET_KEY: 'bsdMkmQ8ObfMD52yAFg4trrvjgjMhuIqg2fjDpD/JqvgY0ccCcmlsEnVFmR79WPiLKEA3i8a5zmejwLZKl4v9Q=='
  API_ENCRYPTION_KEY: 'xfF5369IsredSrlrYZqkM9ZNrfUASYYS6TCcAR9UKj4='
  MINIO_ENDPOINT: minio:9000
@@ -36,11 +37,6 @@ x-common-variables: &common-variables
  NGINX_SERVER_NAME: 'localhost http://macstudio.ask-eve-ai-local.com/'
  LANGCHAIN_API_KEY: "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"
 networks:
  eveai-network:
    driver: bridge
 services:
  nginx:
    image: josakola/nginx:latest
@@ -60,9 +56,10 @@ services:
      - ../nginx/sites-enabled:/etc/nginx/sites-enabled
      - ../nginx/static:/etc/nginx/static
      - ../nginx/public:/etc/nginx/public
-      - ../integrations/Wordpress/eveai-chat-widget/css/eveai-chat-style.css:/etc/nginx/static/css/eveai-chat-style.css
+      - ../integrations/Wordpress/eveai-chat/assets/css/eveai-chat-style.css:/etc/nginx/static/css/eveai-chat-style.css
-      - ../integrations/Wordpress/eveai-chat-widget/js/eveai-chat-widget.js:/etc/nginx/static/js/eveai-chat-widget.js
+      - ../integrations/Wordpress/eveai-chat/assets/js/eveai-chat-widget.js:/etc/nginx/static/js/eveai-chat-widget.js
-      - ../integrations/Wordpress/eveai-chat-widget/js/eveai-sdk.js:/etc/nginx/static/js/eveai-sdk.js
+      - ../integrations/Wordpress/eveai-chat/assets/js/eveai-chat-widget.js:/etc/nginx/static/js/eveai-token-manager.js
      - ../integrations/Wordpress/eveai-chat/assets/js/eveai-sdk.js:/etc/nginx/static/js/eveai-sdk.js
      - ./logs/nginx:/var/log/nginx
    depends_on:
      - eveai_app
@@ -90,7 +87,7 @@ services:
      - ../migrations:/app/migrations
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
       db:
         condition: service_healthy
@@ -124,7 +121,7 @@ services:
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      db:
        condition: service_healthy
@@ -154,7 +151,7 @@ services:
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      db:
        condition: service_healthy
@@ -186,7 +183,7 @@ services:
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      db:
        condition: service_healthy
@@ -208,13 +205,16 @@ services:
    environment:
      <<: *common-variables
      COMPONENT_NAME: eveai_api
      WORDPRESS_HOST: host.docker.internal
      WORDPRESS_PORT: 10003
      WORDPRESS_PROTOCOL: http
    volumes:
      - ../eveai_api:/app/eveai_api
      - ../common:/app/common
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      db:
        condition: service_healthy
@@ -248,7 +248,7 @@ services:
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      redis:
        condition: service_healthy
@@ -272,7 +272,7 @@ services:
      - ../config:/app/config
      - ../scripts:/app/scripts
      - ../patched_packages:/app/patched_packages
-      - eveai_logs:/app/logs
+      - ./eveai_logs:/app/logs
    depends_on:
      db:
        condition: service_healthy
@@ -283,7 +283,6 @@ services:
    networks:
      - eveai-network
  db:
    hostname: db
    image: ankane/pgvector
@@ -308,8 +307,8 @@ services:
  redis:
    image: redis:7.2.5
    restart: always
-    expose:
+    ports:
-      - 6379
+      - "6379:6379"
    volumes:
      - ./db/redis:/data
    healthcheck:
@@ -359,6 +358,13 @@ services:
    networks:
      - eveai-network
 networks:
  eveai-network:
    driver: bridge
    # This enables the containers to access the host network
    driver_opts:
      com.docker.network.bridge.host_ipc: "true"
 volumes:
  minio_data:
  eveai_logs:
--- a/docker/compose_stackhero.yaml
+++ b/docker/compose_stackhero.yaml
@@ -31,7 +31,6 @@ x-common-variables: &common-variables
  OPENAI_API_KEY: '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'
  GROQ_API_KEY: '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'
  ANTHROPIC_API_KEY: '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'
  PORTKEY_API_KEY: 'XvmvBFIVbm76opUxA7MNP14QmdQj'
  JWT_SECRET_KEY: '0d99e810e686ea567ef305d8e9b06195c4db482952e19276590a726cde60a408'
  API_ENCRYPTION_KEY: 'Ly5XYWwEKiasfAwEqdEMdwR-k0vhrq6QPYd4whEROB0='
  GRAYLOG_HOST: de4zvu.stackhero-network.com
--- a/docker/eveai_api/Dockerfile
+++ b/docker/eveai_api/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_app/Dockerfile
+++ b/docker/eveai_app/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_beat/Dockerfile
+++ b/docker/eveai_beat/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_chat/Dockerfile
+++ b/docker/eveai_chat/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_chat_workers/Dockerfile
+++ b/docker/eveai_chat_workers/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_entitlements/Dockerfile
+++ b/docker/eveai_entitlements/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/eveai_workers/Dockerfile
+++ b/docker/eveai_workers/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 # Prevents Python from writing pyc files.
--- a/docker/flower/Dockerfile
+++ b/docker/flower/Dockerfile
@@ -1,4 +1,4 @@
-ARG PYTHON_VERSION=3.12.3
+ARG PYTHON_VERSION=3.12.7
 FROM python:${PYTHON_VERSION}-slim as base
 ENV PYTHONDONTWRITEBYTECODE=1
--- a/docker/nginx/Dockerfile
+++ b/docker/nginx/Dockerfile
@@ -10,9 +10,10 @@ COPY ../../nginx/mime.types /etc/nginx/mime.types
 # Copy static & public files
 RUN mkdir -p /etc/nginx/static /etc/nginx/public
 COPY ../../nginx/static /etc/nginx/static
-COPY ../../integrations/Wordpress/eveai-chat-widget/css/eveai-chat-style.css /etc/nginx/static/css/
+COPY ../../integrations/Wordpress/eveai-chat/assets/css/eveai-chat-style.css /etc/nginx/static/css/
-COPY ../../integrations/Wordpress/eveai-chat-widget/js/eveai-chat-widget.js /etc/nginx/static/js/
+COPY ../../integrations/Wordpress/eveai-chat/assets/js/eveai-chat-widget.js /etc/nginx/static/js/
-COPY ../../integrations/Wordpress/eveai-chat-widget/js/eveai-sdk.js /etc/nginx/static/js
+COPY ../../integrations/Wordpress/eveai-chat/assets/js/eveai-token-manager.js /etc/nginx/static/js/
 COPY ../../integrations/Wordpress/eveai-chat/assets/js/eveai-sdk.js /etc/nginx/static/js
 COPY ../../nginx/public /etc/nginx/public
 # Copy site-specific configurations
--- a/eveai_api/init.py
+++ b/eveai_api/init.py
@@ -1,9 +1,16 @@
 import traceback
 from flask import Flask, jsonify, request
 from flask_jwt_extended import get_jwt_identity, verify_jwt_in_request
-from common.extensions import db, api_rest, jwt, minio_client, simple_encryption
+from sqlalchemy.exc import SQLAlchemyError
 from werkzeug.exceptions import HTTPException
 from common.extensions import db, api_rest, jwt, minio_client, simple_encryption, cors
 import os
 import logging.config
 from common.models.user import TenantDomain
 from common.utils.cors_utils import get_allowed_origins
 from common.utils.database import Database
 from config.logging_config import LOGGING
 from .api.document_api import document_ns
@@ -11,6 +18,7 @@ from .api.auth import auth_ns
 from config.config import get_config
 from common.utils.celery_utils import make_celery, init_celery
 from common.utils.eveai_exceptions import EveAIException
 from common.utils.debug_utils import register_request_debugger
 def create_app(config_file=None):
@@ -45,44 +53,49 @@ def create_app(config_file=None):
    # Register Blueprints
    register_blueprints(app)
-    # Error handler for the API
+    # Register Error Handlers
-    @app.errorhandler(EveAIException)
+    register_error_handlers(app)
-    def handle_eveai_exception(error):
+
-        return {'message': str(error)}, error.status_code
+    # Register Request Debugger
    register_request_debugger(app)
    @app.before_request
-    def before_request():
+    def check_cors():
-        app.logger.debug(f'Before request: {request.method} {request.path}')
+        if request.method == 'OPTIONS':
-        app.logger.debug(f'Request URL: {request.url}')
+            app.logger.debug("Handling OPTIONS request")
-        app.logger.debug(f'Request headers: {dict(request.headers)}')
+            return '', 200  # Allow OPTIONS to pass through
-        # Log JSON data if the content type is application/json
+        origin = request.headers.get('Origin')
-        if request.is_json:
+        if not origin:
-            app.logger.debug(f'JSON data: {request.json}')
+            return  # Not a CORS request
-        # Log raw data for other content types
+        # Get tenant ID from request
-        if request.data:
+        if verify_jwt_in_request():
-            app.logger.debug(f'Raw data: {request.data}')
+            tenant_id = get_jwt_identity()
-
+            if not tenant_id:
-        # Check if this is a request to the token endpoint
+                return
-        if request.path == '/api/v1/auth/token' and request.method == 'POST':
+        else:
            app.logger.debug('Token request detected, skipping JWT verification')
            return
        # Check if origin is allowed for this tenant
        allowed_origins = get_allowed_origins(tenant_id)
        if origin not in allowed_origins:
            app.logger.warning(f'Origin {origin} not allowed for tenant {tenant_id}')
            return {'error': 'Origin not allowed'}, 403
    @app.before_request
    def set_tenant_schema():
        # Check if this is a health check request
        if request.path.startswith('/_healthz') or request.path.startswith('/healthz'):
-            app.logger.debug('Health check request detected, skipping JWT verification')
+            pass
        else:
            try:
                verify_jwt_in_request(optional=True)
                tenant_id = get_jwt_identity()
                app.logger.debug(f'Tenant ID from JWT: {tenant_id}')
                if tenant_id:
                    Database(tenant_id).switch_schema()
                    app.logger.debug(f'Switched to schema for tenant {tenant_id}')
                else:
                    app.logger.debug('No tenant ID found in JWT')
            except Exception as e:
                app.logger.error(f'Error in before_request: {str(e)}')
                # Don't raise the exception here, let the request continue
@@ -101,6 +114,17 @@ def register_extensions(app):
    jwt.init_app(app)
    minio_client.init_app(app)
    simple_encryption.init_app(app)
    cors.init_app(app, resources={
        r"/api/v1/*": {
            "origins": "*",
            "methods": ["GET", "POST", "PUT", "OPTIONS"],
            "allow_headers": ["Content-Type", "Authorization", "X-Requested-With"],
            "expose_headers": ["Content-Length", "Content-Range"],
            "supports_credentials": True,
            "max_age": 1728000,  # 20 days
            "allow_credentials": True
        }
    })
 def register_namespaces(app):
@@ -112,3 +136,61 @@ def register_blueprints(app):
    from .views.healthz_views import healthz_bp
    app.register_blueprint(healthz_bp)
 def register_error_handlers(app):
    @app.errorhandler(Exception)
    def handle_exception(e):
        """Handle all unhandled exceptions with detailed error responses"""
        # Get the current exception info
        exc_info = traceback.format_exc()
        # Log the full exception details
        app.logger.error(f"Unhandled exception: {str(e)}\n{exc_info}")
        # Start with a default error response
        response = {
            "error": "Internal Server Error",
            "message": str(e),
            "type": e.__class__.__name__
        }
        status_code = 500
        # Handle specific types of exceptions
        if isinstance(e, HTTPException):
            status_code = e.code
            response["error"] = e.name
        elif isinstance(e, SQLAlchemyError):
            response["error"] = "Database Error"
            response["details"] = str(e.__cause__ or e)
        elif isinstance(e, ValueError):
            status_code = 400
            response["error"] = "Invalid Input"
        # In development, include additional debug information
        if app.debug:
            response["debug"] = {
                "exception": exc_info,
                "class": e.__class__.__name__,
                "module": e.__class__.__module__
            }
        return jsonify(response), status_code
    @app.errorhandler(404)
    def not_found_error(e):
        return jsonify({
            "error": "Not Found",
            "message": str(e),
            "type": "NotFoundError"
        }), 404
    @app.errorhandler(400)
    def bad_request_error(e):
        return jsonify({
            "error": "Bad Request",
            "message": str(e),
            "type": "BadRequestError"
        }), 400
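The catch-all handler registered in `register_error_handlers` maps exception types to a JSON body and a status code. A minimal sketch of that mapping as a pure function, with no Flask dependency, might look like the following. `build_error_response` and `DatabaseError` are hypothetical names used only for illustration (`DatabaseError` stands in for `SQLAlchemyError`), and the `HTTPException` branch from the diff is omitted to keep the sketch stdlib-only.

```python
from typing import Any, Dict, Tuple


class DatabaseError(Exception):
    """Hypothetical stand-in for sqlalchemy.exc.SQLAlchemyError."""


def build_error_response(exc: Exception, debug: bool = False) -> Tuple[Dict[str, Any], int]:
    """Map an exception to a (body, status_code) pair, mirroring the handler's branches."""
    # Default response: treat anything unknown as an internal server error.
    response: Dict[str, Any] = {
        "error": "Internal Server Error",
        "message": str(exc),
        "type": exc.__class__.__name__,
    }
    status_code = 500

    if isinstance(exc, DatabaseError):
        # Database failures keep status 500 but expose the underlying cause.
        response["error"] = "Database Error"
        response["details"] = str(exc.__cause__ or exc)
    elif isinstance(exc, ValueError):
        # Invalid input is reported as a client error.
        status_code = 400
        response["error"] = "Invalid Input"

    if debug:
        # Extra detail is only included when debugging is enabled.
        response["debug"] = {
            "class": exc.__class__.__name__,
            "module": exc.__class__.__module__,
        }
    return response, status_code
```

Keeping the mapping in a plain function like this would allow it to be unit-tested without an application context; the Flask handler would then only need to wrap the result in `jsonify`.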
--- a/eveai_api/api/auth.py
+++ b/eveai_api/api/auth.py
@@ -1,10 +1,11 @@
-from datetime import timedelta
+from datetime import timedelta, datetime as dt, timezone as tz
 from flask_restx import Namespace, Resource, fields
-from flask_jwt_extended import create_access_token
+from flask_jwt_extended import create_access_token, verify_jwt_in_request, get_jwt, get_jwt_identity, jwt_required
-from common.models.user import Tenant
+from common.models.user import Tenant, TenantProject
 from common.extensions import simple_encryption
-from flask import current_app, request
+from flask import current_app, jsonify, request
 from functools import wraps
 auth_ns = Namespace('auth', description='Authentication related operations')
@@ -18,6 +19,12 @@ token_response = auth_ns.model('TokenResponse', {
    'expires_in': fields.Integer(description='Token expiration time in seconds')
 })
 token_verification = auth_ns.model('TokenVerification', {
    'is_valid': fields.Boolean(description='Token validity status'),
    'expires_in': fields.Integer(description='Seconds until token expiration'),
    'tenant_id': fields.Integer(description='Tenant ID from token')
 })
@auth_ns.route('/token')
 class Token(Resource):
@@ -30,42 +37,53 @@ class Token(Resource):
        """
        Get JWT token
        """
        current_app.logger.debug(f"Token endpoint called with data: {request.json}")
        try:
-            tenant_id = auth_ns.payload['tenant_id']
+            tenant_id = int(auth_ns.payload['tenant_id'])
            api_key = auth_ns.payload['api_key']
        except KeyError as e:
            current_app.logger.error(f"Missing required field: {e}")
            return {'message': f"Missing required field: {e}"}, 400
        current_app.logger.debug(f"Querying database for tenant: {tenant_id}")
        tenant = Tenant.query.get(tenant_id)
        if not tenant:
            current_app.logger.error(f"Tenant not found: {tenant_id}")
-            return {'message': "Tenant not found"}, 404
+            return {'message': f"Authentication invalid for tenant {tenant_id}"}, 404
-        current_app.logger.debug(f"Tenant found: {tenant.id}")
+        projects = TenantProject.query.filter_by(
            tenant_id=tenant_id,
            active=True
        ).all()
        # Find project with matching API key
        matching_project = None
        for project in projects:
            try:
-            current_app.logger.debug("Attempting to decrypt API key")
+                decrypted_key = simple_encryption.decrypt_api_key(project.encrypted_api_key)
-            decrypted_api_key = simple_encryption.decrypt_api_key(tenant.encrypted_api_key)
+                if decrypted_key == api_key:
                    matching_project = project
                    break
            except Exception as e:
-            current_app.logger.error(f"Error decrypting API key: {e}")
+                current_app.logger.error(f"Error decrypting API key for project {project.id}: {e}")
-            return {'message': "Internal server error"}, 500
+                continue
-        if api_key != decrypted_api_key:
+        if not matching_project:
-            current_app.logger.error(f"Invalid API key for tenant: {tenant_id}")
+            current_app.logger.error(f"Project for given API key not found for Tenant: {tenant_id}")
            return {'message': "Invalid API key"}, 401
        # Get the JWT_ACCESS_TOKEN_EXPIRES setting from the app config
        expires_delta = current_app.config.get('JWT_ACCESS_TOKEN_EXPIRES', timedelta(minutes=15))
        try:
-            current_app.logger.debug(f"Creating access token for tenant: {tenant_id}")
+            additional_claims = {
-            access_token = create_access_token(identity=tenant_id, expires_delta=expires_delta)
+                'services': matching_project.services,
-            current_app.logger.debug("Access token created successfully")
+            }
            access_token = create_access_token(
                identity=tenant_id,
                expires_delta=expires_delta,
                additional_claims=additional_claims
            )
            current_app.logger.debug(f"Created token: {access_token}")
            return {
                'access_token': access_token,
                'expires_in': expires_delta.total_seconds()
@@ -73,3 +91,108 @@ class Token(Resource):
        except Exception as e:
            current_app.logger.error(f"Error creating access token: {e}")
            return {'message': "Internal server error"}, 500
@auth_ns.route('/verify')
 class TokenVerification(Resource):
    @auth_ns.doc('verify_token')
    @auth_ns.response(200, 'Token verification result', token_verification)
    @auth_ns.response(401, 'Invalid token')
    def get(self):
        """Verify a token's validity and get expiration information"""
        try:
            verify_jwt_in_request()
            jwt_data = get_jwt()
            # Get expiration timestamp from token
            exp_timestamp = jwt_data['exp']
            current_timestamp = dt.now().timestamp()
            return {
                'is_valid': True,
                'expires_in': int(exp_timestamp - current_timestamp),
                'tenant_id': jwt_data['sub']  # tenant_id is stored in 'sub' claim
            }, 200
        except Exception as e:
            current_app.logger.error(f"Token verification failed: {str(e)}")
            return {
                'is_valid': False,
                'message': 'Invalid token'
            }, 401
@auth_ns.route('/refresh')
 class TokenRefresh(Resource):
    @auth_ns.doc('refresh_token')
    @auth_ns.response(200, 'New token', token_response)
    @auth_ns.response(401, 'Invalid token')
    def post(self):
        """Get a new token before the current one expires"""
        try:
            verify_jwt_in_request()
            jwt_data = get_jwt()
            tenant_id = jwt_data['sub']
            # Optional: Add additional verification here if needed
            # Create new token
            expires_delta = current_app.config.get('JWT_ACCESS_TOKEN_EXPIRES', timedelta(minutes=15))
            new_token = create_access_token(
                identity=tenant_id,
                expires_delta=expires_delta
            )
            return {
                'access_token': new_token,
                'expires_in': int(expires_delta.total_seconds())
            }, 200
        except Exception as e:
            current_app.logger.error(f"Token refresh failed: {str(e)}")
            return {'message': 'Token refresh failed'}, 401
@auth_ns.route('/services')
 class Services(Resource):
    @jwt_required()
    @auth_ns.doc(security='Bearer')
    @auth_ns.response(200, 'Success', {
        'services': fields.List(fields.String, description='List of allowed services for this token'),
        'tenant_id': fields.Integer(description='Tenant ID associated with this token')
    })
    @auth_ns.response(401, 'Invalid or expired token')
    def get(self):
        """
        Get allowed services for the current token
        """
        # Log the incoming authorization header
        auth_header = request.headers.get('Authorization')
        current_app.logger.debug(f"Received Authorization header: {auth_header}")
        claims = get_jwt()
        tenant_id = get_jwt_identity()
        return {
            'services': claims.get('services', []),
            'tenant_id': tenant_id
        }, 200
 # Decorator factory to check that the token grants a particular service
 def requires_service(service_name):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # Get the JWT claims
            claims = get_jwt()
            services = claims.get('services', [])
            if service_name not in services:
                return {
                    'message': f'This endpoint requires the {service_name} service',
                    'error': 'Insufficient permissions'
                }, 403
            return fn(*args, **kwargs)
        return wrapper
    return decorator
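The `requires_service` decorator above rejects a request with 403 when the token's `services` claim does not list the required service. The sketch below shows the same pattern in isolation. It assumes the claims are supplied by an injected `get_claims` callable (a hypothetical parameter added here so the example runs without Flask or a JWT); in the diff the claims come from `get_jwt()`.

```python
from functools import wraps
from typing import Any, Callable, Dict


def requires_service(service_name: str, get_claims: Callable[[], Dict[str, Any]]):
    """Decorator factory: allow the call only if the claims list service_name."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # A missing 'services' claim is treated as "no services granted".
            services = get_claims().get("services", [])
            if service_name not in services:
                return {
                    "message": f"This endpoint requires the {service_name} service",
                    "error": "Insufficient permissions",
                }, 403
            return fn(*args, **kwargs)
        return wrapper
    return decorator


def fake_claims() -> Dict[str, Any]:
    """Hypothetical claims source standing in for a decoded JWT."""
    return {"sub": 1, "services": ["DOCAPI"]}


@requires_service("DOCAPI", fake_claims)
def list_documents():
    return {"documents": []}, 200


@requires_service("CHAT", fake_claims)
def start_chat():
    return {"ok": True}, 200
```

Here `list_documents()` passes the check because `DOCAPI` is in the claims, while `start_chat()` returns the 403 response. Note that in the diff the decorator is stacked under `@jwt_required()`, so the token has already been verified by the time the claims are read.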
--- a/eveai_api/api/document_api.py
+++ b/eveai_api/api/document_api.py
@@ -1,7 +1,9 @@
 import io
 import json
 from datetime import datetime
 import pytz
 import requests
 from flask import current_app, request
 from flask_restx import Namespace, Resource, fields, reqparse
 from flask_jwt_extended import jwt_required, get_jwt_identity
@@ -9,10 +11,12 @@ from werkzeug.datastructures import FileStorage
 from werkzeug.utils import secure_filename
 from common.utils.document_utils import (
    create_document_stack, process_url, start_embedding_task,
-    validate_file_type, EveAIInvalidLanguageException, EveAIDoubleURLException, EveAIUnsupportedFileType,
+    EveAIInvalidLanguageException, EveAIDoubleURLException, EveAIUnsupportedFileType,
-    process_multiple_urls, get_documents_list, edit_document, refresh_document, edit_document_version,
+    get_documents_list, edit_document, refresh_document, edit_document_version,
-    refresh_document_with_info
+    refresh_document_with_info, lookup_document, refresh_document_with_content
 )
 from common.utils.eveai_exceptions import EveAIException
 from eveai_api.api.auth import requires_service
 def validate_date(date_str):
@@ -58,25 +62,31 @@ add_document_response = document_ns.model('AddDocumentResponse', {
@document_ns.route('/add_document')
 class AddDocument(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(upload_parser)
    @document_ns.response(201, 'Document added successfully', add_document_response)
    @document_ns.response(400, 'Validation Error')
    @document_ns.response(500, 'Internal Server Error')
    def post(self):
        """
-        Add a new document
+        Add a new document by providing the content of a file (Multipart/form-data).
        """
        tenant_id = get_jwt_identity()
        current_app.logger.info(f'Adding document for tenant {tenant_id}')
        try:
            args = upload_parser.parse_args()
        except Exception as e:
            current_app.logger.error(f"Error parsing arguments: {str(e)}")
            current_app.logger.error(f"Exception type: {type(e)}")
            raise
        try:
            file = args['file']
            filename = secure_filename(file.filename)
            extension = filename.rsplit('.', 1)[1].lower()
-            validate_file_type(extension)
+            # validate_file_type(extension)
            api_input = {
                'catalog_id': args.get('catalog_id'),
@@ -106,6 +116,113 @@ class AddDocument(Resource):
            document_ns.abort(500, 'Error adding document')
 # Models for AddDocumentThroughURL
 add_document_through_url = document_ns.model('AddDocumentThroughURL', {
    'catalog_id': fields.Integer(required=True, description='ID of the catalog the URL needs to be added to'),
    'temp_url': fields.String(required=True, description='Temporary URL of the document to add'),
    'name': fields.String(required=True, description='Name of the document'),
    'language': fields.String(required=True, description='Language of the document'),
    'user_context': fields.String(required=False, description='User context for the document'),
    'valid_from': fields.String(required=False, description='Valid from date for the document'),
    'user_metadata': fields.String(required=False, description='User metadata for the document'),
    'system_metadata': fields.String(required=False, description='System metadata for the document'),
    'catalog_properties': fields.String(required=False, description='The catalog configuration to be passed along (JSON '
                                                                    'format). Validity against catalog requirements '
                                                                    'is not checked, and is the responsibility of the '
                                                                    'calling client.'),
 })
 add_document_through_url_response = document_ns.model('AddDocumentThroughURLResponse', {
    'message': fields.String(description='Status message'),
    'document_id': fields.Integer(description='ID of the created document'),
    'document_version_id': fields.Integer(description='ID of the created document version'),
    'task_id': fields.String(description='ID of the embedding task')
 })
@document_ns.route('/add_document_through_url')
 class AddDocumentThroughURL(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(add_document_through_url)
    @document_ns.response(201, 'Document added successfully', add_document_through_url)
    @document_ns.response(400, 'Validation Error')
    @document_ns.response(422, 'File could not be processed')
    @document_ns.response(500, 'Internal Server Error')
    def post(self):
        """
        Add a new document using a URL. The URL can be temporary, and will not be stored.
        Mainly used for passing temporary URLs, such as those provided by Zapier.
        """
        tenant_id = get_jwt_identity()
        current_app.logger.info(f'Adding document through url for tenant {tenant_id}')
        try:
            args = document_ns.payload
        except Exception as e:
            current_app.logger.error(f"Error parsing arguments: {str(e)}")
            current_app.logger.error(f"Exception type: {type(e)}")
            raise
        try:
            # Step 1: Download from stashed URL
            stashed_url = args['temp_url']
            current_app.logger.info(f"Downloading stashed file from URL: {stashed_url}")
            response = requests.get(stashed_url, stream=True)
            response.raise_for_status()
            hydration_url = response.text.strip()
            current_app.logger.info(f"Downloading actual file from URL: {hydration_url}")
            # Step 2: Download from hydration URL
            actual_file_response = requests.get(hydration_url, stream=True)
            actual_file_response.raise_for_status()
            hydrated_file_content = actual_file_response.content
            # Derive the filename and extension from the provided document name
            filename = secure_filename(args.get('name'))
            extension = filename.rsplit('.', 1)[1].lower() if '.' in filename else ''
            # Create FileStorage object from downloaded content
            file_content = io.BytesIO(hydrated_file_content)
            file = FileStorage(
                stream=file_content,
                filename=filename,
                content_type=actual_file_response.headers.get('content-type', 'application/octet-stream')
            )
            current_app.logger.info(f"Successfully downloaded file: {filename}")
        except requests.RequestException as e:
            current_app.logger.error(f"Error downloading file: {str(e)}")
            return {'message': f'Error downloading file: {str(e)}'}, 422
        try:
            # Prepare API input
            api_input = {
                'catalog_id': args.get('catalog_id'),
                'name': args.get('name') or filename,
                'language': args.get('language'),
                'user_context': args.get('user_context'),
                'valid_from': args.get('valid_from'),
                'user_metadata': args.get('user_metadata'),
                'catalog_properties': args.get('catalog_properties'),
            }
            new_doc, new_doc_vers = create_document_stack(api_input, file, filename, extension, tenant_id)
            task_id = start_embedding_task(tenant_id, new_doc_vers.id)
            return {
                'message': f'Processing on document {new_doc.name}, version {new_doc_vers.id} started. Task ID: {task_id}.',
                'document_id': new_doc.id,
                'document_version_id': new_doc_vers.id,
                'task_id': task_id
            }, 201
        except (EveAIInvalidLanguageException, EveAIUnsupportedFileType) as e:
            current_app.logger.error(f'Error adding document: {str(e)}')
            return {'message': str(e)}, 400
        except Exception as e:
            current_app.logger.error(f'Error adding document: {str(e)}')
            return {'message': 'Error adding document'}, 500
 # Models for AddURL
 add_url_model = document_ns.model('AddURL', {
    'catalog_id': fields.Integer(required=True, description='ID of the catalog the URL needs to be added to'),
@@ -133,13 +250,15 @@ add_url_response = document_ns.model('AddURLResponse', {
@document_ns.route('/add_url')
 class AddURL(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(add_url_model)
    @document_ns.response(201, 'Document added successfully', add_url_response)
    @document_ns.response(400, 'Validation Error')
    @document_ns.response(500, 'Internal Server Error')
    def post(self):
        """
-        Add a new document from URL
+        Add a new document from URL. The URL in this case is stored and can be used to refresh the document.
        As a consequence, this must be a permanent and accessible URL.
        """
        tenant_id = get_jwt_identity()
        current_app.logger.info(f'Adding document from URL for tenant {tenant_id}')
@@ -189,6 +308,7 @@ document_list_model = document_ns.model('DocumentList', {
@document_ns.route('/list')
 class DocumentList(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.doc('list_documents')
    @document_ns.marshal_list_with(document_list_model, envelope='documents')
    def get(self):
@@ -209,23 +329,35 @@ edit_document_model = document_ns.model('EditDocument', {
@document_ns.route('/<int:document_id>')
 class DocumentResource(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.doc('edit_document')
    @document_ns.expect(edit_document_model)
    @document_ns.response(200, 'Document updated successfully')
    @document_ns.response(400, 'Validation Error')
    @document_ns.response(404, 'Document not found')
    @document_ns.response(500, 'Internal Server Error')
    def put(self, document_id):
-        """Edit a document"""
+        """Edit a document. The content of the document will not be refreshed!"""
        try:
            current_app.logger.debug(f'Editing document {document_id}')
            data = request.json
-        updated_doc, error = edit_document(document_id, data['name'], data.get('valid_from'), data.get('valid_to'))
+            tenant_id = get_jwt_identity()
            updated_doc, error = edit_document(tenant_id, document_id, data.get('name', None),
                                               data.get('valid_from', None), data.get('valid_to', None))
            if updated_doc:
                return {'message': f'Document {updated_doc.id} updated successfully'}, 200
            else:
                return {'message': f'Error updating document: {error}'}, 400
        except EveAIException as e:
            return e.to_dict(), e.status_code
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.doc('refresh_document')
    @document_ns.response(200, 'Document refreshed successfully')
    def post(self, document_id):
-        """Refresh a document"""
+        """Refresh a document. In this case, the content of the document will be refreshed! This requires the document
        version to have a permanent and accessible URL!"""
        tenant_id = get_jwt_identity()
        new_version, result = refresh_document(document_id, tenant_id)
        if new_version:
@@ -243,13 +375,15 @@ edit_document_version_model = document_ns.model('EditDocumentVersion', {
@document_ns.route('/version/<int:version_id>')
 class DocumentVersionResource(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.doc('edit_document_version')
    @document_ns.expect(edit_document_version_model)
    @document_ns.response(200, 'Document version updated successfully')
    def put(self, version_id):
        """Edit a document version"""
        data = request.json
-        updated_version, error = edit_document_version(version_id, data['user_context'], data.get('catalog_properties'))
+        tenant_id = get_jwt_identity()
        updated_version, error = edit_document_version(tenant_id, version_id, data['user_context'], data.get('catalog_properties'))
        if updated_version:
            return {'message': f'Document Version {updated_version.id} updated successfully'}, 200
        else:
@@ -269,11 +403,13 @@ refresh_document_model = document_ns.model('RefreshDocument', {
@document_ns.route('/<int:document_id>/refresh')
 class RefreshDocument(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.response(200, 'Document refreshed successfully')
    @document_ns.response(404, 'Document not found')
    def post(self, document_id):
        """
-        Refresh a document without additional information
+        Refresh a document without additional information. In this case, the content of the document will be refreshed!
        This requires the document version to have a permanent and accessible URL!
        """
        tenant_id = get_jwt_identity()
        current_app.logger.info(f'Refreshing document {document_id} for tenant {tenant_id}')
@@ -299,13 +435,14 @@ class RefreshDocument(Resource):
@document_ns.route('/<int:document_id>/refresh_with_info')
 class RefreshDocumentWithInfo(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(refresh_document_model)
    @document_ns.response(200, 'Document refreshed successfully')
    @document_ns.response(400, 'Validation Error')
    @document_ns.response(404, 'Document not found')
    def post(self, document_id):
        """
-        Refresh a document with new information
+        Refresh a document with new version information.
        """
        tenant_id = get_jwt_identity()
        current_app.logger.info(f'Refreshing document {document_id} with info for tenant {tenant_id}')
@@ -327,3 +464,112 @@ class RefreshDocumentWithInfo(Resource):
        except Exception as e:
            current_app.logger.error(f'Error refreshing document with info: {str(e)}')
            return {'message': 'Internal server error'}, 500
 # Define models for lookup requests
 lookup_model = document_ns.model('DocumentLookup', {
    'lookup_criteria': fields.Raw(required=True,
                                  description='JSON object containing key-value pairs to match in metadata. '
                                              'Example: {"external_id": "123", "source": "zapier", "source_type": "google_docs"}'),
    'metadata_type': fields.String(required=True, enum=['user_metadata', 'system_metadata'],
                                   description='Which metadata field to search in')
 })
 lookup_response = document_ns.model('DocumentLookupResponse', {
    'document_id': fields.Integer(description='ID of the found document'),
    'document_version_id': fields.Integer(description='ID of the latest document version'),
    'name': fields.String(description='Document name'),
    'metadata': fields.Raw(description='Full metadata of the found document')
 })
@document_ns.route('/lookup')
 class DocumentLookup(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(lookup_model)
    @document_ns.marshal_with(lookup_response)
    @document_ns.response(200, 'Document found', lookup_response)
    @document_ns.response(404, 'No document found matching criteria')
    def post(self):
        """
        Look up a document using metadata criteria
        """
        tenant_id = get_jwt_identity()
        try:
            data = request.json
            document, version = lookup_document(
                tenant_id,
                data['lookup_criteria'],
                data['metadata_type']
            )
            return {
                'document_id': document.id,
                'document_version_id': version.id,
                'name': document.name,
                'metadata': getattr(version, data['metadata_type'])
            }
        except EveAIException as e:
            return e.to_dict(), e.status_code
        except KeyError as e:
            return {'message': f'Missing required field: {str(e)}'}, 400
 refresh_content_model = document_ns.model('RefreshDocumentContent', {
    'file_content': fields.Raw(required=True, description='The new file content'),
    'language': fields.String(required=False, description='Language of the document'),
    'user_context': fields.String(required=False, description='User context for the document'),
    'user_metadata': fields.Raw(required=False, description='Custom metadata fields'),
    'catalog_properties': fields.Raw(required=False, description='Catalog-specific properties'),
    'trigger_service': fields.String(required=False, description='Service that triggered the update')
 })
@document_ns.route('/<int:document_id>/refresh_content')
 class RefreshDocumentContent(Resource):
    @jwt_required()
    @requires_service('DOCAPI')
    @document_ns.expect(refresh_content_model)
    @document_ns.response(200, 'Document refreshed successfully')
    def post(self, document_id):
        """Refresh a document with new content"""
        tenant_id = get_jwt_identity()
        try:
            data = request.json
            file_content = data['file_content']
            # Build user_metadata from the metadata supplied in the request (if any),
            # then add the Zapier-specific fields. Existing document metadata is not merged here.
            user_metadata = data.get('user_metadata') or {}
            user_metadata.update({
                'source': 'zapier',
                'trigger_service': data.get('trigger_service')
            })
            data['user_metadata'] = user_metadata
            # catalog_properties is passed through unchanged in data.
            # We could add validation here against catalog configuration.
            new_version, task_id = refresh_document_with_content(
                document_id,
                tenant_id,
                file_content,
                data
            )
            return {
                'message': f'Document refreshed successfully. New version: {new_version.id}. Task ID: {task_id}',
                'document_id': document_id,
                'document_version_id': new_version.id,
                'task_id': task_id
            }, 200
        except EveAIException as e:
            return e.to_dict(), e.status_code
--- a/eveai_app/init.py
+++ b/eveai_app/init.py
@@ -7,7 +7,7 @@ from werkzeug.middleware.proxy_fix import ProxyFix
 import logging.config
 from common.extensions import (db, migrate, bootstrap, security, mail, login_manager, cors, csrf, session,
-                               minio_client, simple_encryption, metrics)
+                               minio_client, simple_encryption, metrics, cache_manager)
 from common.models.user import User, Role, Tenant, TenantDomain
 import common.models.interaction
 import common.models.entitlements
@@ -119,6 +119,7 @@ def register_extensions(app):
    simple_encryption.init_app(app)
    session.init_app(app)
    minio_client.init_app(app)
    cache_manager.init_app(app)
    metrics.init_app(app)
--- a/eveai_app/errors.py
+++ b/eveai_app/errors.py
@@ -1,4 +1,4 @@
-from flask import render_template, request, jsonify, redirect
+from flask import render_template, request, jsonify, redirect, current_app
 from flask_login import current_user
 from common.utils.nginx_utils import prefixed_url_for
@@ -6,24 +6,28 @@ from common.utils.nginx_utils import prefixed_url_for
 def not_found_error(error):
    if not current_user.is_authenticated:
        return redirect(prefixed_url_for('security.login'))
    current_app.logger.error(f"Not Found Error: {error}")
    return render_template('error/404.html'), 404
 def internal_server_error(error):
    if not current_user.is_authenticated:
        return redirect(prefixed_url_for('security.login'))
    current_app.logger.error(f"Internal Server Error: {error}")
    return render_template('error/500.html'), 500
 def not_authorised_error(error):
    if not current_user.is_authenticated:
        return redirect(prefixed_url_for('security.login'))
    current_app.logger.error(f"Not Authorised Error: {error}")
    return render_template('error/401.html'), 401
 def access_forbidden(error):
    if not current_user.is_authenticated:
        return redirect(prefixed_url_for('security.login'))
    current_app.logger.error(f"Access Forbidden: {error}")
    return render_template('error/403.html'), 403
@@ -32,6 +36,7 @@ def key_error_handler(error):
    if str(error) == "'tenant'":
        return redirect(prefixed_url_for('security.login'))
    # For other KeyErrors, you might want to log the error and return a generic error page
    current_app.logger.error(f"Key Error: {error}")
    return render_template('error/generic.html', error_message="An unexpected error occurred"), 500
--- a/eveai_app/templates/document/catalogs.html
+++ b/eveai_app/templates/document/catalogs.html
@@ -10,7 +10,7 @@
 {% block content %}
 <div class="container">
    <form method="POST" action="{{ url_for('document_bp.handle_catalog_selection') }}">
-        {{ render_selectable_table(headers=["Catalog ID", "Name"], rows=rows, selectable=True, id="catalogsTable") }}
+        {{ render_selectable_table(headers=["Catalog ID", "Name", "Type"], rows=rows, selectable=True, id="catalogsTable") }}
        <div class="form-group mt-3">
            <button type="submit" name="action" value="set_session_catalog" class="btn btn-primary">Set Session Catalog</button>
            <button type="submit" name="action" value="edit_catalog" class="btn btn-primary">Edit Catalog</button>
--- a/eveai_app/templates/document/edit_catalog.html
+++ b/eveai_app/templates/document/edit_catalog.html
@@ -26,7 +26,7 @@ When you change chunking of embedding information, you'll need to manually refre
                {{ render_field(field, disabled_fields, exclude_fields) }}
            {% endfor %}
        {% endfor %}
-        <button type="submit" class="btn btn-primary">Save Retriever</button>
+        <button type="submit" class="btn btn-primary">Save Catalog</button>
    </form>
 {% endblock %}
--- a/eveai_app/templates/document/edit_processor.html
+++ b/eveai_app/templates/document/edit_processor.html
@@ -0,0 +1,33 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Edit Processor{% endblock %}
 {% block content_title %}Edit Processor{% endblock %}
 {% block content_description %}Edit a Processor (for a Catalog){% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {% set disabled_fields = ['type'] %}
        {% set exclude_fields = [] %}
        <!-- Render Static Fields -->
        {% for field in form.get_static_fields() %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <!-- Render Dynamic Fields -->
        {% for collection_name, fields in form.get_dynamic_fields().items() %}
            {% if fields|length > 0 %}
                <h4 class="mt-4">{{ collection_name }}</h4>
            {% endif %}
            {% for field in fields %}
                {{ render_field(field, disabled_fields, exclude_fields) }}
            {% endfor %}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Save Processor</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/document/library_operations.html
+++ b/eveai_app/templates/document/library_operations.html
@@ -10,6 +10,17 @@
 <div class="container">
    <form method="POST" action="{{ url_for('document_bp.handle_library_selection') }}">
        <div class="form-group mt-3">
            <h2>Create Default RAG Library</h2>
            <p>This function will create a default library setup for RAG purposes. More specifically, it will create:</p>
            <ul>
                <li>A default RAG Catalog</li>
                <li>A default HTML Processor</li>
                <li>A default RAG Retriever</li>
                <li>A default RAG Specialist</li>
            </ul>
            <p>This enables a quick start-up for standard Ask Eve AI functionality. All elements can be changed later on an individual basis.</p>
            <button type="submit" name="action" value="create_default_rag_library" class="btn btn-danger">Create Default RAG Library</button>
            <h2>Re-Embed Latest Versions</h2>
            <p>This functionality will re-apply embeddings on the latest versions of all documents in the library.
                This is useful only while tuning the embedding parameters, or when changing embedding algorithms.
@@ -17,6 +28,7 @@
                use it with caution!
            </p>
            <button type="submit" name="action" value="re_embed_latest_versions" class="btn btn-danger">Re-embed Latest Versions (expensive)</button>
            <h2>Refresh all documents</h2>
            <p>This operation will create new versions of all documents in the library with a URL. Documents that were uploaded directly,
                cannot be automatically refreshed. This is an expensive operation, and impacts the performance of the system in future use.
--- a/eveai_app/templates/document/processor.html
+++ b/eveai_app/templates/document/processor.html
@@ -0,0 +1,23 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Processor Registration{% endblock %}
 {% block content_title %}Register Processor{% endblock %}
 {% block content_description %}Define a new processor (for a catalog){% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {%  set disabled_fields = [] %}
        {%  set exclude_fields = [] %}
        {% for field in form %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Register Processor</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/document/processors.html
+++ b/eveai_app/templates/document/processors.html
@@ -0,0 +1,23 @@
 {% extends 'base.html' %}
 {%  from 'macros.html' import render_selectable_table, render_pagination %}
 {% block title %}Processors{% endblock %}
 {% block content_title %}Processors{% endblock %}
 {% block content_description %}View Processors for Tenant{% endblock %}
 {% block content_class %}<div class="col-xl-12 col-lg-5 col-md-7 mx-auto"></div>{% endblock %}
 {% block content %}
 <div class="container">
    <form method="POST" action="{{ url_for('document_bp.handle_processor_selection') }}">
        {{ render_selectable_table(headers=["Processor ID", "Name", "Type", "Catalog ID"], rows=rows, selectable=True, id="processorsTable") }}
        <div class="form-group mt-3">
            <button type="submit" name="action" value="edit_processor" class="btn btn-primary">Edit Processor</button>
        </div>
    </form>
 </div>
 {% endblock %}
 {% block content_footer %}
    {{ render_pagination(pagination, 'document_bp.processors') }}
 {% endblock %}
--- a/eveai_app/templates/document/retrievers.html
+++ b/eveai_app/templates/document/retrievers.html
@@ -4,13 +4,13 @@
 {% block title %}Retrievers{% endblock %}
 {% block content_title %}Retrievers{% endblock %}
-{% block content_description %}View Retrieers for Tenant{% endblock %}
+{% block content_description %}View Retrievers for Tenant{% endblock %}
 {% block content_class %}<div class="col-xl-12 col-lg-5 col-md-7 mx-auto"></div>{% endblock %}
 {% block content %}
 <div class="container">
    <form method="POST" action="{{ url_for('document_bp.handle_retriever_selection') }}">
-        {{ render_selectable_table(headers=["Retriever ID", "Name", "Type", "Catalog ID"], rows=rows, selectable=True, id="retrieverssTable") }}
+        {{ render_selectable_table(headers=["Retriever ID", "Name", "Type", "Catalog ID"], rows=rows, selectable=True, id="retrieversTable") }}
        <div class="form-group mt-3">
            <button type="submit" name="action" value="edit_retriever" class="btn btn-primary">Edit Retriever</button>
        </div>
--- a/eveai_app/templates/email/api_key_notification.html
+++ b/eveai_app/templates/email/api_key_notification.html
@@ -0,0 +1,28 @@
 {% extends "email/base.html" %}
 {% block content %}
 <p>Hello,</p>
 <p>A new API project has been created for your Ask Eve AI tenant. Here are the details:</p>
 <div class="info-box">
    <p><strong>Tenant ID:</strong> {{ tenant_id }}</p>
    <p><strong>Tenant Name:</strong> {{ tenant_name }}</p>
    <p><strong>Project Name:</strong> {{ project_name }}</p>
    <p><strong>API Key:</strong> <span style="font-family: monospace; background-color: #f0f0f0; padding: 5px;">{{ api_key }}</span></p>
    <div style="margin-top: 15px;">
        <p><strong>Enabled Services:</strong></p>
        <ul style="list-style-type: none; padding-left: 0;">
            {% for service in services %}
            <li>✓ {{ service }}</li>
            {% endfor %}
        </ul>
    </div>
 </div>
 <div class="warning-box">
    <strong>Important:</strong> Please store this API key securely. It cannot be retrieved again once this email has been deleted.
 </div>
 <p>You can start using this API key right away to interact with our services. For documentation and usage examples, please visit our <a href="https://docs.askeveai.com">documentation</a>.</p>
 {% endblock %}
--- a/eveai_app/templates/email/base.html
+++ b/eveai_app/templates/email/base.html
@@ -0,0 +1,106 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{{ subject|default('Message from Ask Eve AI') }}</title>
    <style>
        .email-container {
            font-family: Tahoma, Geneva, sans-serif;
            max-width: 600px;
            margin: 0 auto;
        }
        .header {
            text-align: center;
            padding: 20px;
        }
        .header img {
            max-width: 200px;
        }
        .footer {
            text-align: center;
            padding: 20px;
            background-color: #f8f9fa;
        }
        .signature {
            font-style: italic;
            margin: 20px 0;
        }
        .footer-text {
            font-size: 12px;
            color: #666;
        }
        .footer img {
            max-width: 100%;
            height: auto;
            width: 600px; /* Match the container width */
            display: block;
            margin: 20px auto;
        }
        @media only screen and (max-width: 600px) {
            .footer img {
                width: 100%;
            }
        }
        .social-links {
            margin: 20px 0;
        }
        .social-links a {
            margin: 0 10px;
            color: #0066cc;
            text-decoration: none;
        }
        .info-box {
            background-color: #f8f9fa;
            border-left: 4px solid #0066cc;
            padding: 15px;
            margin: 20px 0;
        }
        .warning-box {
            background-color: #fff3cd;
            border-left: 4px solid #ffc107;
            padding: 15px;
            margin: 20px 0;
        }
    </style>
 </head>
 <body>
    <div class="email-container">
        <div class="header">
            <img src="https://askeveai.com/wp-content/uploads/2024/07/Logo-Square-small.png" alt="Ask Eve AI Logo">
        </div>
        <div class="content-wrapper">
            {% block content %}{% endblock %}
        </div>
        <div class="footer">
            <div class="signature">
                Best regards,<br>
                Evie
            </div>
            {% if promo_image_url %}
            <a href="https://www.askeveai.com">
                <img src="{{ promo_image_url }}" alt="Ask Eve AI Promotion">
            </a>
            {% endif %}
            <div class="social-links">
                <a href="https://twitter.com/askeveai">Twitter</a>
                <a href="https://linkedin.com/company/ask-eve-ai">LinkedIn</a>
            </div>
            <div class="footer-text">
                © {{ year }} Ask Eve AI. All rights reserved.<br>
                <a href="https://www.askeveai.com/privacy">Privacy Policy</a> |
                <a href="https://www.askeveai.com/terms">Terms of Service</a>
                {% if unsubscribe_url %}
                | <a href="{{ unsubscribe_url }}">Unsubscribe</a>
                {% endif %}
            </div>
        </div>
    </div>
 </body>
 </html>
--- a/eveai_app/templates/entitlements/view_licenses.html
+++ b/eveai_app/templates/entitlements/view_licenses.html
@@ -0,0 +1,25 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_selectable_table, render_pagination %}
 {% block title %}View Licenses{% endblock %}
 {%  block content_title %}View Licenses{% endblock %}
 {% block content_description %}View Licenses{% endblock %}
 {% block content %}
 <form action="{{ url_for('entitlements_bp.handle_license_selection') }}" method="POST">
    {{ render_selectable_table(headers=["License ID", "Name", "Start Date", "End Date", "Active"], rows=rows, selectable=True, id="licensesTable") }}
    <div class="form-group mt-3">
        <button type="submit" name="action" value="edit_license" class="btn btn-primary">Edit License</button>
        <!-- Additional buttons can be added here for other actions -->
    </div>
 </form>
 {% endblock %}
 {% block content_footer %}
    {{ render_pagination(pagination, 'entitlements_bp.view_licenses') }}
 {% endblock %}
 {% block scripts %}
 {% endblock %}
--- a/eveai_app/templates/entitlements/view_usages.html
+++ b/eveai_app/templates/entitlements/view_usages.html
@@ -7,7 +7,7 @@
 {% block content_description %}View License Usage{% endblock %}
 {% block content %}
-<form action="{{ url_for('user_bp.handle_user_action') }}" method="POST">
+<form action="{{ url_for('entitlements_bp.handle_usage_selection') }}" method="POST">
    {{ render_selectable_table(headers=["Usage ID", "Start Date", "End Date", "Storage (MiB)", "Embedding (MiB)", "Interaction (tokens)"], rows=rows, selectable=False, id="usagesTable") }}
 <!--    <div class="form-group mt-3">-->
 <!--        <button type="submit" name="action" value="edit_user" class="btn btn-primary">Edit Selected User</button>-->
@@ -20,7 +20,7 @@
 {% endblock %}
 {% block content_footer %}
-    {{ render_pagination(pagination, 'user_bp.select_tenant') }}
+    {{ render_pagination(pagination, 'entitlements_bp.view_usages') }}
 {% endblock %}
 {% block scripts %}
--- a/eveai_app/templates/interaction/edit_specialist.html
+++ b/eveai_app/templates/interaction/edit_specialist.html
@@ -0,0 +1,33 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Edit Specialist{% endblock %}
 {% block content_title %}Edit Specialist{% endblock %}
 {% block content_description %}Edit a Specialist{% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {% set disabled_fields = ['type'] %}
        {% set exclude_fields = [] %}
        <!-- Render Static Fields -->
        {% for field in form.get_static_fields() %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <!-- Render Dynamic Fields -->
        {% for collection_name, fields in form.get_dynamic_fields().items() %}
            {% if fields|length > 0 %}
                <h4 class="mt-4">{{ collection_name }}</h4>
            {% endif %}
            {% for field in fields %}
                {{ render_field(field, disabled_fields, exclude_fields) }}
            {% endfor %}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Save Specialist</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/interaction/specialist.html
+++ b/eveai_app/templates/interaction/specialist.html
@@ -0,0 +1,23 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Specialist Registration{% endblock %}
 {% block content_title %}Register Specialist{% endblock %}
 {% block content_description %}Define a new specialist{% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {%  set disabled_fields = [] %}
        {%  set exclude_fields = [] %}
        {% for field in form %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Register Specialist</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/interaction/specialists.html
+++ b/eveai_app/templates/interaction/specialists.html
@@ -0,0 +1,23 @@
 {% extends 'base.html' %}
 {%  from 'macros.html' import render_selectable_table, render_pagination %}
 {% block title %}Specialists{% endblock %}
 {% block content_title %}Specialists{% endblock %}
 {% block content_description %}View Specialists for Tenant{% endblock %}
 {% block content_class %}<div class="col-xl-12 col-lg-5 col-md-7 mx-auto"></div>{% endblock %}
 {% block content %}
 <div class="container">
    <form method="POST" action="{{ url_for('interaction_bp.handle_specialist_selection') }}">
        {{ render_selectable_table(headers=["Specialist ID", "Name", "Type"], rows=rows, selectable=True, id="specialistsTable") }}
        <div class="form-group mt-3">
            <button type="submit" name="action" value="edit_specialist" class="btn btn-primary">Edit Specialist</button>
        </div>
    </form>
 </div>
 {% endblock %}
 {% block content_footer %}
    {{ render_pagination(pagination, 'interaction_bp.specialists') }}
 {% endblock %}
--- a/eveai_app/templates/interaction/view_chat_session.html
+++ b/eveai_app/templates/interaction/view_chat_session.html
@@ -1,6 +1,4 @@
 {% extends "base.html" %}
 {% from "macros.html" import render_field %}
 {% block title %}Session Overview{% endblock %}
 {% block content_title %}Session Overview{% endblock %}
@@ -8,7 +6,7 @@
 {% block content %}
 <div class="container mt-5">
-    <h2>Chat Session Details</h2>
+    <h4>Chat Session Details</h4>
    <div class="card mb-4">
        <div class="card-header">
            <h5>Session Information</h5>
@@ -21,44 +19,73 @@
        </div>
    </div>
-    <h3>Interactions</h3>
+    <h5>Interactions</h5>
    <div class="accordion" id="interactionsAccordion">
-        {% for interaction in interactions %}
+        {% for interaction, id, question_at, specialist_arguments, specialist_results, specialist_name, specialist_type in interactions %}
        <div class="accordion-item">
-            <h2 class="accordion-header" id="heading{{ loop.index }}">
+            <p class="accordion-header" id="heading{{ loop.index }}">
                <button class="accordion-button collapsed" type="button" data-bs-toggle="collapse"
                        data-bs-target="#collapse{{ loop.index }}" aria-expanded="false"
                        aria-controls="collapse{{ loop.index }}">
-                    <div class="d-flex justify-content-between align-items-center w-100">
+                    <div class="interaction-header">
-                        <span class="interaction-question">{{ interaction.question | truncate(50) }}</span>
+                        <div class="interaction-metadata">
-                        <span class="interaction-icons">
+                            <div class="interaction-time text-muted">
-                            <i class="material-icons algorithm-icon {{ interaction.algorithm_used | lower }}">fingerprint</i>
+                                {{ question_at | to_local_time(chat_session.timezone) }}
-                            <i class="material-icons thumb-icon {% if interaction.appreciation == 100 %}filled{% else %}outlined{% endif %}">thumb_up</i>
+                            </div>
-                            <i class="material-icons thumb-icon {% if interaction.appreciation == 0 %}filled{% else %}outlined{% endif %}">thumb_down</i>
+                            <div class="specialist-info">
-                        </span>
+                                <span class="badge bg-primary">{{ specialist_name if specialist_name else 'No Specialist' }}</span>
                                <span class="badge bg-secondary">{{ specialist_type if specialist_type else '' }}</span>
                            </div>
                        </div>
                        <div class="interaction-question">
                            {{ specialist_results.detailed_query if specialist_results and specialist_results.detailed_query else specialist_arguments.query }}
                        </div>
                    </div>
                </button>
-            </h2>
+            </p>
            <div id="collapse{{ loop.index }}" class="accordion-collapse collapse" aria-labelledby="heading{{ loop.index }}"
                 data-bs-parent="#interactionsAccordion">
                <div class="accordion-body">
-                    <h6>Detailed Question:</h6>
+                    <!-- Arguments Section -->
-                    <p>{{ interaction.detailed_question }}</p>
+                    {% if specialist_arguments %}
-                    <h6>Answer:</h6>
+                    <div class="mb-4">
-                    <div class="markdown-content">{{ interaction.answer | safe }}</div>
+                        <h6 class="mb-3">Specialist Arguments:</h6>
-                    {% if embeddings_dict.get(interaction.id) %}
+                        <div class="code-wrapper">
                            <pre><code class="language-json" style="width: 100%;">{{ specialist_arguments | tojson(indent=2) }}</code></pre>
                        </div>
                    </div>
                    {% endif %}
                    <!-- Results Section -->
                    {% if specialist_results %}
                    <div class="mb-4">
                        <h6 class="mb-3">Specialist Results:</h6>
                        <div class="code-wrapper">
                            <pre><code class="language-json" style="width: 100%;">{{ specialist_results | tojson(indent=2) }}</code></pre>
                        </div>
                    </div>
                    {% endif %}
                    <!-- Related Documents Section -->
                    {% if embeddings_dict.get(id) %}
                    <div class="mt-4">
                        <h6>Related Documents:</h6>
-                    <ul>
+                        <ul class="list-group">
-                        {% for embedding in embeddings_dict[interaction.id] %}
+                            {% for embedding in embeddings_dict[id] %}
-                        <li>
+                            <li class="list-group-item">
                                {% if embedding.url %}
-                            <a href="{{ embedding.url }}" target="_blank">{{ embedding.url }}</a>
+                                <a href="{{ embedding.url }}" target="_blank" class="text-decoration-none">
                                    <i class="material-icons align-middle me-2">link</i>
                                    {{ embedding.url }}
                                </a>
                                {% else %}
                                <i class="material-icons align-middle me-2">description</i>
                                {{ embedding.object_name }}
                                {% endif %}
                            </li>
                            {% endfor %}
                        </ul>
                    </div>
                    {% endif %}
                </div>
            </div>
@@ -68,14 +95,166 @@
 </div>
 {% endblock %}
 {% block styles %}
 {{ super() }}
 <style>
    .interaction-header {
        font-size: 0.9rem;
        display: flex;
        flex-direction: column;
        width: 100%;
        padding: 0.5rem 0;
    }
    .interaction-metadata {
        display: flex;
        gap: 1rem;
        align-items: center;
        margin-bottom: 0.5rem;
    }
    .interaction-time {
        font-size: 0.9rem;
    }
    .specialist-info {
        display: flex;
        gap: 0.5rem;
        align-items: center;
    }
    .interaction-question {
        font-size: 0.9rem;
        font-weight: bold;
        line-height: 1.4;
    }
    .badge {
        font-size: 0.9rem;
        padding: 0.35em 0.65em;
        white-space: nowrap;
    }
    .accordion-button {
        padding: 0.5rem 1rem;
    }
    .accordion-button::after {
        margin-left: 1rem;
    }
    .json-display {
        background-color: #f8f9fa;
        border-radius: 4px;
        padding: 15px;
        margin: 0;
        white-space: pre-wrap;
        word-wrap: break-word;
        font-family: monospace;
        font-size: 0.85rem;
        line-height: 1.5;
        max-width: 100%;
        overflow-x: auto;
    }
    .list-group-item {
        font-size: 0.9rem;
    }
    .material-icons {
        font-size: 1.1rem;
    }
    pre {
        margin: 0;
        padding: 0;
        white-space: pre-wrap !important;       /* Force wrapping */
        word-wrap: break-word !important;       /* Break long words if necessary */
        max-width: 100%;                        /* Ensure container doesn't overflow */
    }
    pre, code {
        margin: 0;
        padding: 0;
        white-space: pre-wrap !important;       /* Force wrapping */
        word-wrap: break-word !important;       /* Break long words if necessary */
        max-width: 100%;                        /* Ensure container doesn't overflow */
    }
    pre code {
        padding: 1rem !important;
        border-radius: 4px;
        font-size: 0.75rem;
        line-height: 1.5;
        white-space: pre-wrap !important;       /* Force wrapping in code block */
    }
    .code-wrapper {
        position: relative;
        width: 100%;
    }
    /* Override all possible highlight.js white-space settings */
    .code-wrapper pre,
    .code-wrapper pre code,
    .code-wrapper pre code.hljs,
    .code-wrapper .hljs {
        white-space: pre-wrap !important;
        overflow-wrap: break-word !important;
        word-wrap: break-word !important;
        word-break: break-word !important;
        max-width: 100% !important;
        overflow-x: hidden !important;
    }
    .code-wrapper pre {
        margin: 0;
        background: #f8f9fa;
        border-radius: 4px;
    }
    .code-wrapper pre code {
        padding: 1rem !important;
        font-family: monospace;
        font-size: 0.9rem;
        line-height: 1.5;
        display: block;
    }
    /* Override highlight.js default nowrap behavior */
    .hljs {
        background: #f8f9fa !important;
        white-space: pre-wrap !important;
        word-wrap: break-word !important;
    }
    /* Color theme */
    .hljs-string {
        color: #0a3069 !important;
    }
    .hljs-attr {
        color: #953800 !important;
    }
    .hljs-number {
        color: #116329 !important;
    }
    .hljs-boolean {
        color: #0550ae !important;
    }
 </style>
 {% endblock %}
 {% block scripts %}
-<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+{{ super() }}
 <script>
-    document.addEventListener('DOMContentLoaded', function() {
+document.addEventListener('DOMContentLoaded', function() {
-        var markdownElements = document.querySelectorAll('.markdown-content');
+    // Initialize syntax highlighting
-        markdownElements.forEach(function(el) {
+    document.querySelectorAll('pre code').forEach((block) => {
-            el.innerHTML = marked.parse(el.textContent);
+        hljs.highlightElement(block);
        });
    });
 });
 </script>
 {% endblock %}
--- a/eveai_app/templates/macros.html
+++ b/eveai_app/templates/macros.html
@@ -1,35 +1,4 @@
-<!--{% macro render_field(field, disabled_fields=[], exclude_fields=[], class='') %}-->
+{% macro render_field_content(field, disabled=False, class='') %}
 <!--    {% set disabled = field.name in disabled_fields %}-->
 <!--    {% set exclude_fields = exclude_fields + ['csrf_token', 'submit'] %}-->
 <!--    {% if field.name not in exclude_fields %}-->
 <!--        {% if field.type == 'BooleanField' %}-->
 <!--            <div class="form-check">-->
 <!--                {{ field(class="form-check-input " + class, type="checkbox", id="flexSwitchCheckDefault") }}-->
 <!--                {{ field.label(class="form-check-label", for="flexSwitchCheckDefault", disabled=disabled) }}-->
 <!--            </div>-->
 <!--        {% else %}-->
 <!--            <div class="form-group">-->
 <!--                {{ field.label(class="form-label") }}-->
 <!--                {{ field(class="form-control " + class, disabled=disabled) }}-->
 <!--                {% if field.errors %}-->
 <!--                    <div class="invalid-feedback">-->
 <!--                        {% for error in field.errors %}-->
 <!--                            {{ error }}-->
 <!--                        {% endfor %}-->
 <!--                    </div>-->
 <!--                {% endif %}-->
 <!--            </div>-->
 <!--        {% endif %}-->
 <!--    {% endif %}-->
 <!--{% endmacro %}-->
 {% macro render_field_old(field, disabled_fields=[], exclude_fields=[], class='') %}
    <!-- Debug info -->
    <!-- Field name: {{ field.name }}, Field type: {{ field.__class__.__name__ }} -->
    {% set disabled = field.name in disabled_fields %}
    {% set exclude_fields = exclude_fields + ['csrf_token', 'submit'] %}
    {% if field.name not in exclude_fields %}
    {% if field.type == 'BooleanField' %}
        <div class="form-group">
            <div class="form-check form-switch">
@@ -39,8 +8,20 @@
                                 **{'data-bs-toggle': 'tooltip',
                                    'data-bs-placement': 'right',
                                    'title': field.description}) }}
                    {% if field.flags.required %}
                        <span class="required-field-indicator" aria-hidden="true">
                            <i class="material-symbols-outlined required-icon">check_circle</i>
                        </span>
                        <span class="visually-hidden">Required field</span>
                    {% endif %}
                {% else %}
                    {{ field.label(class="form-check-label") }}
                    {% if field.flags.required %}
                        <span class="required-field-indicator" aria-hidden="true">
                            <i class="material-symbols-outlined required-icon">check_circle</i>
                        </span>
                        <span class="visually-hidden">Required field</span>
                    {% endif %}
                {% endif %}
            </div>
            {% if field.errors %}
@@ -58,66 +39,20 @@
                             **{'data-bs-toggle': 'tooltip',
                                'data-bs-placement': 'right',
                                'title': field.description}) }}
                {% if field.flags.required %}
                    <span class="required-field-indicator" aria-hidden="true">
                        <i class="material-symbols-outlined required-icon">check_circle</i>
                    </span>
                    <span class="visually-hidden">Required field</span>
                {% endif %}
            {% else %}
                {{ field.label(class="form-label") }}
                {% if field.flags.required %}
                    <span class="required-field-indicator" aria-hidden="true">
                        <i class="material-symbols-outlined required-icon">check_circle</i>
                    </span>
                    <span class="visually-hidden">Required field</span>
                {% endif %}
                {% if field.type == 'TextAreaField' and 'json-editor' in class %}
                    <div id="{{ field.id }}-editor" class="json-editor-container"></div>
                    {{ field(class="form-control d-none " + class, disabled=disabled) }}
                {% else %}
                    {{ field(class="form-control " + class, disabled=disabled) }}
                {% endif %}
                {% if field.errors %}
                    <div class="invalid-feedback d-block">
                        {% for error in field.errors %}
                            {{ error }}
                        {% endfor %}
                    </div>
                {% endif %}
            </div>
        {% endif %}
    {% endif %}
 {% endmacro %}
 {% macro render_field(field, disabled_fields=[], exclude_fields=[], class='') %}
    <!-- Debug info -->
    <!-- Field name: {{ field.name }}, Field type: {{ field.__class__.__name__ }} -->
    {% set disabled = field.name in disabled_fields %}
    {% set exclude_fields = exclude_fields + ['csrf_token', 'submit'] %}
    {% if field.name not in exclude_fields %}
        {% if field.type == 'BooleanField' %}
            <div class="form-group">
                <div class="form-check form-switch">
                    {{ field(class="form-check-input " + class, disabled=disabled) }}
                    {% if field.description %}
                        {{ field.label(class="form-check-label",
                                     **{'data-bs-toggle': 'tooltip',
                                        'data-bs-placement': 'right',
                                        'title': field.description}) }}
                    {% else %}
                        {{ field.label(class="form-check-label") }}
                    {% endif %}
                </div>
                {% if field.errors %}
                    <div class="invalid-feedback d-block">
                        {% for error in field.errors %}
                            {{ error }}
                        {% endfor %}
                    </div>
                {% endif %}
            </div>
        {% else %}
            <div class="form-group">
                {% if field.description %}
                    {{ field.label(class="form-label",
                                 **{'data-bs-toggle': 'tooltip',
                                    'data-bs-placement': 'right',
                                    'title': field.description}) }}
                {% else %}
                    {{ field.label(class="form-label") }}
            {% endif %}
            {% if field.type == 'TextAreaField' and 'json-editor' in class %}
@@ -138,30 +73,24 @@
            {% endif %}
        </div>
    {% endif %}
 {% endmacro %}
 {% macro render_field(field, disabled_fields=[], exclude_fields=[], class='') %}
    <!-- Debug info -->
    <!-- Field name: {{ field.name }}, Field type: {{ field.__class__.__name__ }} -->
    {% set disabled = field.name in disabled_fields %}
    {% set exclude_fields = exclude_fields + ['csrf_token', 'submit'] %}
    {% if field.name not in exclude_fields %}
        {{ render_field_content(field, disabled, class) }}
    {% endif %}
 {% endmacro %}
-{% macro render_included_field(field, disabled_fields=[], include_fields=[]) %}
+{% macro render_included_field(field, disabled_fields=[], include_fields=[], class='') %}
    {% set disabled = field.name in disabled_fields %}
    {% if field.name in include_fields %}
-        {% if field.type == 'BooleanField' %}
+        {{ render_field_content(field, disabled, class) }}
            <div class="form-check">
                {{ field(class="form-check-input", type="checkbox", id="flexSwitchCheckDefault") }}
                {{ field.label(class="form-check-label", for="flexSwitchCheckDefault", disabled=disabled) }}
            </div>
        {% else %}
            <div class="form-group">
                {{ field.label(class="form-label") }}
                {{ field(class="form-control", disabled=disabled) }}
                {% if field.errors %}
                    <div class="invalid-feedback">
                        {% for error in field.errors %}
                            {{ error }}
                        {% endfor %}
                    </div>
                {% endif %}
            </div>
        {% endif %}
    {% endif %}
 {% endmacro %}
--- a/eveai_app/templates/navbar.html
+++ b/eveai_app/templates/navbar.html
@@ -75,6 +75,8 @@
                                    {'name': 'Edit Tenant', 'url': '/user/tenant/' ~ session['tenant'].get('id'), 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Tenant Domains', 'url': '/user/view_tenant_domains', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Tenant Domain Registration', 'url': '/user/tenant_domain', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Tenant Projects', 'url': '/user/tenant_projects', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Tenant Project Registration', 'url': '/user/tenant_project', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'User List', 'url': '/user/view_users', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'User Registration', 'url': '/user/user', 'roles': ['Super User', 'Tenant Admin']},
                                ]) }}
@@ -83,6 +85,8 @@
                                {{ dropdown('Document Mgmt', 'note_stack', [
                                    {'name': 'Add Catalog', 'url': '/document/catalog', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'All Catalogs', 'url': '/document/catalogs', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Add Processor', 'url': '/document/processor', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'All Processors', 'url': '/document/processors', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Add Retriever', 'url': '/document/retriever', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'All Retrievers', 'url': '/document/retrievers', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Add Document', 'url': '/document/add_document', 'roles': ['Super User', 'Tenant Admin']},
@@ -95,6 +99,8 @@
                            {% endif %}
                            {% if current_user.is_authenticated %}
                                {{ dropdown('Interactions', 'hub', [
                                    {'name': 'Add Specialist', 'url': '/interaction/specialist', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'All Specialists', 'url': '/interaction/specialists', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Chat Sessions', 'url': '/interaction/chat_sessions', 'roles': ['Super User', 'Tenant Admin']},
                                ]) }}
                            {% endif %}
@@ -103,6 +109,7 @@
                                    {'name': 'License Tier Registration', 'url': '/entitlements/license_tier', 'roles': ['Super User']},
                                    {'name': 'All License Tiers', 'url': '/entitlements/view_license_tiers', 'roles': ['Super User']},
                                    {'name': 'Trigger Actions', 'url': '/administration/trigger_actions', 'roles': ['Super User']},
                                    {'name': 'All Licenses', 'url': '/entitlements/view_licenses', 'roles': ['Super User', 'Tenant Admin']},
                                    {'name': 'Usage', 'url': '/entitlements/view_usages', 'roles': ['Super User', 'Tenant Admin']},
                                ]) }}
                            {% endif %}
@@ -118,22 +125,11 @@
                            {% endif %}
                        </ul>
                        {% if current_user.is_authenticated %}
                            <ul class="navbar-nav d-lg-block d-none">
                                <li class="nav-item">
                                    <a href="/document/catalogs" class="btn btn-sm  bg-gradient-primary  mb-0 me-2">
                                        {% if 'catalog_name' in session %}
                                            CATALOG: {{ session['catalog_name'] }}
                                        {% else %}
                                            CHOOSE CATALOG
                                        {% endif %}
                                    </a>
                                </li>
                            </ul>
                            <ul class="navbar-nav d-lg-block d-none">
                                <li class="nav-item">
                                    <a href="/session_defaults" class="btn btn-sm  bg-gradient-primary  mb-0">
                                        {% if 'tenant' in session %}
-                                            TENANT: {{ session['tenant'].get('name', 'None') }}
+                                            TENANT {{ session['tenant'].get('id', 'None') }}: {{ session['tenant'].get('name', 'None') }}
                                        {% endif %}
                                    </a>
                                </li>
--- a/eveai_app/templates/scripts.html
+++ b/eveai_app/templates/scripts.html
@@ -59,6 +59,93 @@ document.addEventListener('DOMContentLoaded', function() {
    });
 });
 </script>
 <script>
 document.addEventListener('DOMContentLoaded', function() {
    // Get all forms; the tab handling below only applies when an invalid field sits inside a .tab-pane
    const formsWithTabs = document.querySelectorAll('form');
    formsWithTabs.forEach(form => {
        // Handle the form's submit event
        form.addEventListener('submit', function(event) {
            const invalidFields = form.querySelectorAll(':invalid');
            if (invalidFields.length > 0) {
                // Prevent form submission
                event.preventDefault();
                // Find which tab contains the first invalid field
                const firstInvalidField = invalidFields[0];
                const tabPane = firstInvalidField.closest('.tab-pane');
                if (tabPane) {
                    // Get the tab ID
                    const tabId = tabPane.id;
                    // Find and click the corresponding tab button
                    const tabButton = document.querySelector(`[data-bs-toggle="tab"][data-bs-target="#${tabId}"]`);
                    if (tabButton) {
                        const tab = new bootstrap.Tab(tabButton);
                        tab.show();
                    }
                    // Scroll the invalid field into view and focus it
                    firstInvalidField.scrollIntoView({ behavior: 'smooth', block: 'center' });
                    firstInvalidField.focus();
                }
                // Optional: Show a message about validation errors
                const errorCount = invalidFields.length;
                const message = `Please fill in all required fields (${errorCount} ${errorCount === 1 ? 'error' : 'errors'} found)`;
                if (typeof Swal !== 'undefined') {
                    // If SweetAlert2 is available
                    Swal.fire({
                        title: 'Validation Error',
                        text: message,
                        icon: 'error',
                        confirmButtonText: 'OK'
                    });
                } else {
                    // Fallback to browser alert
                    alert(message);
                }
            }
        });
        // Optional: Real-time validation as user switches tabs
        const tabButtons = document.querySelectorAll('[data-bs-toggle="tab"]');
        tabButtons.forEach(button => {
            button.addEventListener('shown.bs.tab', function() {
                const previousTabPane = document.querySelector(button.getAttribute('data-bs-target'));
                if (previousTabPane) {
                    const invalidFields = previousTabPane.querySelectorAll(':invalid');
                    if (invalidFields.length > 0) {
                        // Add visual indicator to tab
                        button.classList.add('has-error');
                    } else {
                        button.classList.remove('has-error');
                    }
                }
            });
        });
    });
 });
 </script>
 <script>
 document.addEventListener('DOMContentLoaded', function() {
    document.querySelectorAll('textarea[data-handle-enter="true"]').forEach(function(textarea) {
        textarea.addEventListener('keydown', function(e) {
            if (e.key === 'Enter' && e.shiftKey) {
                e.preventDefault();
                const start = this.selectionStart;
                const end = this.selectionEnd;
                this.value = this.value.substring(0, start) + '\n' + this.value.substring(end);
                this.selectionStart = this.selectionEnd = start + 1;
            }
        });
    });
 });
 </script>
 <style>
 .json-editor-container {
--- a/eveai_app/templates/user/confirm_delete_tenant_project.html
+++ b/eveai_app/templates/user/confirm_delete_tenant_project.html
@@ -0,0 +1,28 @@
 {% extends 'base.html' %}
 {% block title %}Delete Tenant Project{% endblock %}
 {% block content_title %}Delete Tenant Project{% endblock %}
 {% block content_description %}Are you sure you want to delete this tenant project?{% endblock %}
 {% block content %}
 <div class="container">
    <div class="alert alert-warning">
        <p><strong>Warning:</strong> You are about to delete the following tenant project:</p>
        <ul>
            <li><strong>Name:</strong> {{ tenant_project.name }}</li>
            <li><strong>API Key:</strong> {{ tenant_project.visual_api_key }}</li>
            <li><strong>Responsible:</strong> {{ tenant_project.responsible_email or 'Not specified' }}</li>
        </ul>
        <p>This action cannot be undone.</p>
    </div>
    <form method="POST">
        {{ form.csrf_token if form }}
        <div class="form-group mt-3">
            <a href="{{ url_for('user_bp.tenant_projects') }}" class="btn btn-secondary">Cancel</a>
            <button type="submit" class="btn btn-danger">Confirm Delete</button>
        </div>
    </form>
 </div>
 {% endblock %}
--- a/eveai_app/templates/user/edit_tenant_project.html
+++ b/eveai_app/templates/user/edit_tenant_project.html
@@ -0,0 +1,26 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Edit Tenant Project{% endblock %}
 {% block content_title %}Edit Tenant Project{% endblock %}
 {% block content_description %}Edit a Tenant Project. The existing API key cannot be viewed or renewed.
 To get a new key, invalidate the current project and create a new one.
 {% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {% set disabled_fields = [] %}
        {% set exclude_fields = [] %}
        <!-- Render Static Fields -->
        {% for field in form %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Save Tenant Project</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/user/select_tenant.html
+++ b/eveai_app/templates/user/select_tenant.html
@@ -24,10 +24,13 @@
 <!-- Tenant Selection Form -->
 <form method="POST" action="{{ url_for('user_bp.handle_tenant_selection') }}">
    {{ render_selectable_table(headers=["Tenant ID", "Tenant Name", "Website", "Type"], rows=rows, selectable=True, id="tenantsTable") }}
-    <div class="form-group mt-3">
+    <div class="form-group mt-3 d-flex justify-content-between">
        <div>
            <button type="submit" name="action" value="select_tenant" class="btn btn-primary">Set Session Tenant</button>
            <button type="submit" name="action" value="edit_tenant" class="btn btn-secondary">Edit Tenant</button>
        </div>
        <button type="submit" name="action" value="new_tenant" class="btn btn-secondary">New Tenant</button>
    </div>
 </form>
 {% endblock %}
--- a/eveai_app/templates/user/tenant.html
+++ b/eveai_app/templates/user/tenant.html
@@ -9,65 +9,12 @@
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
-        <!-- Main Tenant Information -->
+        {%  set disabled_fields = [] %}
-        {% set main_fields = ['name', 'website', 'default_language', 'allowed_languages', 'timezone','rag_context', 'type'] %}
+        {% set exclude_fields = [] %}
        {% for field in form %}
-            {{ render_included_field(field, disabled_fields=[], include_fields=main_fields) }}
+            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <!-- Nav Tabs -->
        <div class="row mt-5">
            <div class="col-lg-12">
                <div class="nav-wrapper position-relative end-0">
                    <ul class="nav nav-pills nav-fill p-1" role="tablist">
                        <li class="nav-item" role="presentation">
                            <a class="nav-link mb-0 px-0 py-1 active" data-toggle="tab" href="#model-info-tab" role="tab" aria-controls="model-info" aria-selected="true">
                                Model Information
                            </a>
                        </li>
                        <li class="nav-item">
                            <a class="nav-link mb-0 px-0 py-1" data-toggle="tab" href="#license-info-tab" role="tab" aria-controls="license-info" aria-selected="false">
                                License Information
                            </a>
                        </li>
                    </ul>
                </div>
                <div class="tab-content tab-space">
                    <!-- Model Information Tab -->
                    <div class="tab-pane fade show active" id="model-info-tab" role="tabpanel">
                        {% set model_fields = ['embedding_model', 'llm_model'] %}
                        {% for field in form %}
                            {{ render_included_field(field, disabled_fields=[], include_fields=model_fields) }}
                        {% endfor %}
                    </div>
                    <!-- License Information Tab -->
                    <div class="tab-pane fade" id="license-info-tab" role="tabpanel">
                        {% set license_fields = ['currency', 'usage_email', ] %}
                        {% for field in form %}
                            {{ render_included_field(field, disabled_fields=[], include_fields=license_fields) }}
                        {% endfor %}
                        <!-- Register API Key Button -->
                        <div class="form-group">
                            <button type="button" class="btn btn-primary" onclick="generateNewChatApiKey()">Register Chat API Key</button>
                            <button type="button" class="btn btn-primary" onclick="generateNewApiKey()">Register API Key</button>
                        </div>
                        <!-- API Key Display Field -->
                        <div id="chat-api-key-field" style="display:none;">
                            <label for="chat-api-key">Chat API Key:</label>
                            <input type="text" id="chat-api-key" class="form-control" readonly>
                            <button type="button" id="copy-chat-button" class="btn btn-primary">Copy to Clipboard</button>
                            <p id="copy-chat-message" style="display:none;color:green;">Chat API key copied to clipboard</p>
                        </div>
                        <div id="api-key-field" style="display:none;">
                            <label for="api-key">API Key:</label>
                            <input type="text" id="api-key" class="form-control" readonly>
                            <button type="button" id="copy-api-button" class="btn btn-primary">Copy to Clipboard</button>
                            <p id="copy-message" style="display:none;color:green;">API key copied to clipboard</p>
                        </div>
                    </div>
                </div>
            </div>
        </div>
        <button type="submit" class="btn btn-primary">Save Tenant</button>
    </form>
 {% endblock %}
@@ -78,88 +25,6 @@
 {% endblock %}
 {% block scripts %}
 <script>
    // Function to generate a new Chat API Key
    function generateNewChatApiKey() {
        generateApiKey('/admin/user/generate_chat_api_key', '#chat-api-key', '#chat-api-key-field');
    }
    // Function to generate a new general API Key
    function generateNewApiKey() {
        generateApiKey('/admin/user/generate_api_api_key', '#api-key', '#api-key-field');
    }
    // Reusable function to handle API key generation
    function generateApiKey(url, inputSelector, fieldSelector) {
        $.ajax({
            url: url,
            type: 'POST',
            contentType: 'application/json',
            success: function(response) {
                $(inputSelector).val(response.api_key);
                $(fieldSelector).show();
            },
            error: function(error) {
                alert('Error generating new API key: ' + error.responseText);
            }
        });
    }
    // Function to copy text to clipboard
    function copyToClipboard(selector, messageSelector) {
        const element = document.querySelector(selector);
        if (element) {
            const text = element.value;
            if (navigator.clipboard && navigator.clipboard.writeText) {
                navigator.clipboard.writeText(text).then(function() {
                    showCopyMessage(messageSelector);
                }).catch(function(error) {
                    alert('Failed to copy text: ' + error);
                });
            } else {
                fallbackCopyToClipboard(text, messageSelector);
            }
        } else {
            console.error('Element not found for selector:', selector);
        }
    }
    // Fallback method for copying text to clipboard
    function fallbackCopyToClipboard(text, messageSelector) {
        const textArea = document.createElement('textarea');
        textArea.value = text;
        document.body.appendChild(textArea);
        textArea.focus();
        textArea.select();
        try {
            document.execCommand('copy');
            showCopyMessage(messageSelector);
        } catch (err) {
            alert('Fallback: Oops, unable to copy', err);
        }
        document.body.removeChild(textArea);
    }
    // Function to show copy confirmation message
    function showCopyMessage(messageSelector) {
        const message = document.querySelector(messageSelector);
        if (message) {
            message.style.display = 'block';
            setTimeout(function() {
                message.style.display = 'none';
            }, 2000);
        }
    }
    // Event listeners for copy buttons
    document.getElementById('copy-chat-button').addEventListener('click', function() {
        copyToClipboard('#chat-api-key', '#copy-chat-message');
    });
    document.getElementById('copy-api-button').addEventListener('click', function() {
        copyToClipboard('#api-key', '#copy-message');
    });
 </script>
 <script>
    // JavaScript to detect user's timezone
    document.addEventListener('DOMContentLoaded', (event) => {
--- a/eveai_app/templates/user/tenant_project.html
+++ b/eveai_app/templates/user/tenant_project.html
@@ -0,0 +1,23 @@
 {% extends 'base.html' %}
 {% from "macros.html" import render_field %}
 {% block title %}Tenant Project Registration{% endblock %}
 {% block content_title %}Register Tenant Project{% endblock %}
 {% block content_description %}Define a new tenant project to enable APIs{% endblock %}
 {% block content %}
    <form method="post">
        {{ form.hidden_tag() }}
        {%  set disabled_fields = [] %}
        {%  set exclude_fields = [] %}
        {% for field in form %}
            {{ render_field(field, disabled_fields, exclude_fields) }}
        {% endfor %}
        <button type="submit" class="btn btn-primary">Register Tenant Project</button>
    </form>
 {% endblock %}
 {% block content_footer %}
 {% endblock %}
--- a/eveai_app/templates/user/tenant_projects.html
+++ b/eveai_app/templates/user/tenant_projects.html
@@ -0,0 +1,25 @@
 {% extends 'base.html' %}
 {%  from 'macros.html' import render_selectable_table, render_pagination %}
 {% block title %}Tenant Projects{% endblock %}
 {% block content_title %}Tenant Projects{% endblock %}
 {% block content_description %}View Tenant Projects for Tenant{% endblock %}
 {% block content_class %}<div class="col-xl-12 col-lg-5 col-md-7 mx-auto"></div>{% endblock %}
 {% block content %}
 <div class="container">
    <form method="POST" action="{{ url_for("user_bp.handle_tenant_project_selection") }}">
        {{ render_selectable_table(headers=["Tenant Project ID", "Name", "API Clue", "Responsible", "Active"], rows=rows, selectable=True, id="catalogsTable") }}
        <div class="form-group mt-3">
            <button type="submit" name="action" value="edit_tenant_project" class="btn btn-primary">Edit Tenant Project</button>
            <button type="submit" name="action" value="invalidate_tenant_project" class="btn btn-primary">Invalidate Tenant Project</button>
            <button type="submit" name="action" value="delete_tenant_project" class="btn btn-danger">Delete Tenant Project</button>
        </div>
    </form>
 </div>
 {% endblock %}
 {% block content_footer %}
    {{ render_pagination(pagination, "user_bp.tenant_projects") }}
 {% endblock %}
--- a/eveai_app/views/basic_forms.py
+++ b/eveai_app/views/basic_forms.py
@@ -2,7 +2,12 @@ from flask import session
 from flask_security import current_user
 from flask_wtf import FlaskForm
 from wtforms import StringField, SelectField
-from wtforms.validators import DataRequired
+from wtforms.validators import DataRequired, Optional
 from wtforms_sqlalchemy.fields import QuerySelectField
 from common.models.document import Catalog
 from common.models.user import Tenant
 from common.utils.database import Database
 class SessionDefaultsForm(FlaskForm):
@@ -13,11 +18,32 @@ class SessionDefaultsForm(FlaskForm):
    tenant_name = StringField('Tenant Name', validators=[DataRequired()])
    default_language = SelectField('Default Language', choices=[], validators=[DataRequired()])
    # Default Catalog - initialize as a regular SelectField
    catalog = SelectField('Catalog', choices=[], validators=[Optional()])
    def __init__(self):
        super().__init__()
        # Set basic fields first (these don't require DB access)
        self.user_name.data = current_user.user_name
        self.user_email.data = current_user.email
        self.tenant_name.data = session.get('tenant').get('name')
        self.default_language.choices = [(lang, lang.lower()) for lang in
                                         session.get('tenant').get('allowed_languages')]
        self.default_language.data = session.get('default_language')
        # Get a new session for catalog queries
        tenant_id = session.get('tenant').get('id')
        tenant_session = Database(tenant_id).get_session()
        try:
            # Populate catalog choices using tenant session
            catalogs = tenant_session.query(Catalog).all()
            self.catalog.choices = [(str(c.id), c.name) for c in catalogs]
            self.catalog.choices.insert(0, ('', 'Select a Catalog'))  # Add empty choice
            # Set current catalog if exists
            catalog_id = session.get('catalog_id')
            if catalog_id:
                self.catalog.data = str(catalog_id)
        finally:
            tenant_session.close()
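
Review note: the choice-building step in `SessionDefaultsForm.__init__` above can be sketched in isolation. This is a minimal sketch rather than the project's code; `CatalogRow` and `build_catalog_choices` are hypothetical names standing in for the `Catalog` model and the inline list comprehension.

```python
from typing import NamedTuple


class CatalogRow(NamedTuple):
    """Hypothetical stand-in for the Catalog model (only id and name are used)."""
    id: int
    name: str


def build_catalog_choices(catalogs):
    """Return (value, label) pairs for a select field, with a blank first choice."""
    # Select field values are submitted as strings, so ids are stringified up front.
    choices = [(str(c.id), c.name) for c in catalogs]
    # The empty value represents "no catalog selected".
    return [('', 'Select a Catalog')] + choices
```

Building the list in one expression avoids the separate `insert(0, ...)` call, which is a stylistic choice and not a behavioural change.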
--- a/eveai_app/views/basic_views.py
+++ b/eveai_app/views/basic_views.py
@@ -1,7 +1,11 @@
-from flask import request, render_template, Blueprint, session, current_app, jsonify
+from flask import request, render_template, Blueprint, session, current_app, jsonify, flash, redirect
 from flask_security import roles_required, roles_accepted
 from flask_wtf.csrf import generate_csrf
 from common.models.document import Catalog
 from common.models.user import Tenant
 from common.utils.database import Database
 from common.utils.nginx_utils import prefixed_url_for
 from .basic_forms import SessionDefaultsForm
 basic_bp = Blueprint('basic_bp', __name__)
@@ -9,12 +13,11 @@ basic_bp = Blueprint('basic_bp', __name__)
@basic_bp.before_request
 def log_before_request():
-    current_app.logger.debug(f"Before request (basic_bp): {request.method} {request.url}")
+    current_app.logger.debug(f'Before request: {request.path} =====================================')
@basic_bp.after_request
 def log_after_request(response):
    current_app.logger.debug(f"After request  (basic_bp): {request.method} {request.url} - Status: {response.status}")
    return response
@@ -36,13 +39,40 @@ def confirm_email_fail():
@basic_bp.route('/session_defaults', methods=['GET', 'POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def session_defaults():
    try:
        # Get tenant session
        tenant_id = session.get('tenant').get('id')
        tenant_db = Database(tenant_id)
        tenant_session = tenant_db.get_session()
        try:
            form = SessionDefaultsForm()
            if form.validate_on_submit():
                session['default_language'] = form.default_language.data
                if form.catalog.data:
                    catalog_id = int(form.catalog.data)
                    catalog = tenant_session.query(Catalog).get(catalog_id)
                    if catalog:
                        session['catalog_id'] = catalog.id
                        session['catalog_name'] = catalog.name
                else:
                    session.pop('catalog_id', None)
                    session.pop('catalog_name', None)
                flash('Session defaults updated successfully', 'success')
                return redirect(prefixed_url_for('basic_bp.index'))
            return render_template('basic/session_defaults.html', form=form)
        finally:
            tenant_session.close()
    except Exception as e:
        current_app.logger.error(f"Error in session_defaults: {str(e)}")
        flash('Error accessing catalog data. Please ensure your session is valid.', 'danger')
        return redirect(prefixed_url_for('security_bp.login'))
@basic_bp.route('/set_user_timezone', methods=['POST'])
 def set_user_timezone():
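
Review note: the session update performed by `session_defaults()` above can be tested without Flask if it is pulled into a small helper. The sketch below assumes a plain dict in place of the Flask session; `apply_session_defaults` is a hypothetical name, and it simplifies the view by treating a catalog that cannot be found the same as no selection.

```python
def apply_session_defaults(session, default_language, catalog=None):
    """Store the chosen language and catalog in a session-like dict.

    `catalog` is a (catalog_id, catalog_name) tuple, or None to clear the selection.
    """
    session['default_language'] = default_language
    if catalog is not None:
        session['catalog_id'], session['catalog_name'] = catalog
    else:
        # pop() with a default never raises, even if the keys were never set.
        session.pop('catalog_id', None)
        session.pop('catalog_name', None)
    return session
```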
--- a/eveai_app/views/document_forms.py
+++ b/eveai_app/views/document_forms.py
@@ -1,29 +1,20 @@
-from flask import session, current_app, request
+from flask import session, current_app
 from flask_wtf import FlaskForm
-from wtforms import (StringField, BooleanField, SubmitField, DateField, IntegerField, FloatField, SelectMultipleField,
+from wtforms import (StringField, BooleanField, SubmitField, DateField, IntegerField, SelectField, TextAreaField, URLField)
                     SelectField, FieldList, FormField, TextAreaField, URLField)
 from wtforms.validators import DataRequired, Length, Optional, URL, ValidationError, NumberRange
-from flask_wtf.file import FileField, FileAllowed, FileRequired
+from flask_wtf.file import FileField, FileRequired
 import json
 from wtforms_sqlalchemy.fields import QuerySelectField
 from common.extensions import db
 from common.models.document import Catalog
-from config.catalog_types import CATALOG_TYPES
+from config.type_defs.catalog_types import CATALOG_TYPES
-from config.retriever_types import RETRIEVER_TYPES
+from config.type_defs.processor_types import PROCESSOR_TYPES
 from config.type_defs.retriever_types import RETRIEVER_TYPES
 from .dynamic_form_base import DynamicFormBase
 def allowed_file(form, field):
    if field.data:
        filename = field.data.filename
        allowed_extensions = current_app.config.get('SUPPORTED_FILE_TYPES', [])
        if not ('.' in filename and filename.rsplit('.', 1)[1].lower() in allowed_extensions):
            raise ValidationError('Unsupported file type.')
 def validate_json(form, field):
    if field.data:
        try:
@@ -39,29 +30,14 @@ class CatalogForm(FlaskForm):
    # Select Field for Catalog Type (Uses the CATALOG_TYPES defined in config)
    type = SelectField('Catalog Type', validators=[DataRequired()])
    # Metadata fields
    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json])
    configuration = TextAreaField('Configuration', validators=[Optional(), validate_json])
    # HTML Embedding Variables
    html_tags = StringField('HTML Tags', validators=[DataRequired()],
                            default='p, h1, h2, h3, h4, h5, h6, li, , tbody, tr, td')
    html_end_tags = StringField('HTML End Tags', validators=[DataRequired()],
                                default='p, li')
    html_included_elements = StringField('HTML Included Elements', validators=[Optional()], default='article, main')
    html_excluded_elements = StringField('HTML Excluded Elements', validators=[Optional()],
                                         default='header, footer, nav, script')
    html_excluded_classes = StringField('HTML Excluded Classes', validators=[Optional()])
    min_chunk_size = IntegerField('Minimum Chunk Size (2000)', validators=[NumberRange(min=0), Optional()],
                                  default=2000)
    max_chunk_size = IntegerField('Maximum Chunk Size (3000)', validators=[NumberRange(min=0), Optional()],
                                  default=3000)
-    # Chat Variables
+
-    chat_RAG_temperature = FloatField('RAG Temperature', default=0.3, validators=[NumberRange(min=0, max=1)])
+    # Metadata fields
-    chat_no_RAG_temperature = FloatField('No RAG Temperature', default=0.5, validators=[NumberRange(min=0, max=1)])
+    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
-    # Tuning variables
+    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json])
    embed_tuning = BooleanField('Enable Embedding Tuning', default=False)
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
@@ -76,28 +52,78 @@ class EditCatalogForm(DynamicFormBase):
    # Select Field for Catalog Type (Uses the CATALOG_TYPES defined in config)
    type = StringField('Catalog Type', validators=[DataRequired()], render_kw={'readonly': True})
    # Metadata fields
    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json],)
    # HTML Embedding Variables
    html_tags = StringField('HTML Tags', validators=[DataRequired()],
                            default='p, h1, h2, h3, h4, h5, h6, li, , tbody, tr, td')
    html_end_tags = StringField('HTML End Tags', validators=[DataRequired()],
                                default='p, li')
    html_included_elements = StringField('HTML Included Elements', validators=[Optional()], default='article, main')
    html_excluded_elements = StringField('HTML Excluded Elements', validators=[Optional()],
                                         default='header, footer, nav, script')
    html_excluded_classes = StringField('HTML Excluded Classes', validators=[Optional()])
    min_chunk_size = IntegerField('Minimum Chunk Size (2000)', validators=[NumberRange(min=0), Optional()],
                                  default=2000)
    max_chunk_size = IntegerField('Maximum Chunk Size (3000)', validators=[NumberRange(min=0), Optional()],
                                  default=3000)
-    # Chat Variables
+
-    chat_RAG_temperature = FloatField('RAG Temperature', default=0.3, validators=[NumberRange(min=0, max=1)])
+    # Metadata fields
-    chat_no_RAG_temperature = FloatField('No RAG Temperature', default=0.5, validators=[NumberRange(min=0, max=1)])
+    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
-    # Tuning variables
+    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json],)
-    embed_tuning = BooleanField('Enable Embedding Tuning', default=False)
+
 class ProcessorForm(FlaskForm):
    name = StringField('Name', validators=[DataRequired(), Length(max=50)])
    description = TextAreaField('Description', validators=[Optional()])
    # Catalog for the Processor
    catalog = QuerySelectField(
        'Catalog ID',
        query_factory=lambda: Catalog.query.all(),
        allow_blank=True,
        get_label='name',
        validators=[Optional()],
    )
    # Select Field for Processor Type (Uses the PROCESSOR_TYPES defined in config)
    type = SelectField('Processor Type', validators=[DataRequired()])
    sub_file_type = StringField('Sub File Type', validators=[Optional(), Length(max=50)])
    min_chunk_size = IntegerField('Minimum Chunk Size (2000)', validators=[NumberRange(min=0), Optional()],
                                  default=2000)
    max_chunk_size = IntegerField('Maximum Chunk Size (3000)', validators=[NumberRange(min=0), Optional()],
                                  default=3000)
    tuning = BooleanField('Enable Embedding Tuning', default=False)
    # Metadata fields
    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json])
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Dynamically populate the 'type' field using the constructor
        self.type.choices = sorted(
            [(key, value['name']) for key, value in PROCESSOR_TYPES.items()],
            key=lambda x: x[1],
        )
 class EditProcessorForm(DynamicFormBase):
    name = StringField('Name', validators=[DataRequired(), Length(max=50)])
    description = TextAreaField('Description', validators=[Optional()])
    # Catalog for the Processor
    catalog = QuerySelectField(
        'Catalog ID',
        query_factory=lambda: Catalog.query.all(),
        allow_blank=True,
        get_label='name',
        validators=[Optional()],
    )
    type = StringField('Processor Type', validators=[DataRequired()], render_kw={'readonly': True})
    sub_file_type = StringField('Sub File Type', validators=[Optional(), Length(max=50)])
    min_chunk_size = IntegerField('Minimum Chunk Size (2000)', validators=[NumberRange(min=0), Optional()],
                                  default=2000)
    max_chunk_size = IntegerField('Maximum Chunk Size (3000)', validators=[NumberRange(min=0), Optional()],
                                  default=3000)
    tuning = BooleanField('Enable Embedding Tuning', default=False)
    # Metadata fields
    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json])
 class RetrieverForm(FlaskForm):
@@ -137,22 +163,18 @@ class EditRetrieverForm(DynamicFormBase):
        validators=[Optional()],
    )
    # Select Field for Retriever Type (Uses the RETRIEVER_TYPES defined in config)
-    type = SelectField('Retriever Type', validators=[DataRequired()], render_kw={'readonly': True})
+    type = StringField('Retriever Type', validators=[DataRequired()], render_kw={'readonly': True})
    tuning = BooleanField('Enable Tuning', default=False)
    # Metadata fields
    user_metadata = TextAreaField('User Metadata', validators=[Optional(), validate_json])
    system_metadata = TextAreaField('System Metadata', validators=[Optional(), validate_json])
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Set the retriever type choices (loaded from config)
        self.type.choices = [(key, value['name']) for key, value in RETRIEVER_TYPES.items()]
 class AddDocumentForm(DynamicFormBase):
-    file = FileField('File', validators=[FileRequired(), allowed_file])
+    file = FileField('File', validators=[FileRequired()])
    catalog = StringField('Catalog', render_kw={'readonly': True})
    sub_file_type = StringField('Sub File Type', validators=[Optional(), Length(max=50)])
    name = StringField('Name', validators=[Length(max=100)])
    language = SelectField('Language', choices=[], validators=[Optional()])
    user_context = TextAreaField('User Context', validators=[Optional()])
@@ -166,9 +188,13 @@ class AddDocumentForm(DynamicFormBase):
        if not self.language.data:
            self.language.data = session.get('tenant').get('default_language')
        self.catalog.data = session.get('catalog_name', '')
 class AddURLForm(DynamicFormBase):
    url = URLField('URL', validators=[DataRequired(), URL()])
    catalog = StringField('Catalog', render_kw={'readonly': True})
    sub_file_type = StringField('Sub File Type', validators=[Optional(), Length(max=50)])
    name = StringField('Name', validators=[Length(max=100)])
    language = SelectField('Language', choices=[], validators=[Optional()])
    user_context = TextAreaField('User Context', validators=[Optional()])
@@ -182,22 +208,7 @@ class AddURLForm(DynamicFormBase):
        if not self.language.data:
            self.language.data = session.get('tenant').get('default_language')
-
+        self.catalog.data = session.get('catalog_name', '')
 class AddURLsForm(FlaskForm):
    urls = TextAreaField('URL(s) (one per line)', validators=[DataRequired()])
    name = StringField('Name Prefix', validators=[Length(max=100)])
    language = SelectField('Language', choices=[], validators=[Optional()])
    user_context = TextAreaField('User Context', validators=[Optional()])
    valid_from = DateField('Valid from', id='form-control datepicker', validators=[Optional()])
    submit = SubmitField('Submit')
    def __init__(self):
        super().__init__()
        self.language.choices = [(language, language) for language in
                                 session.get('tenant').get('allowed_languages')]
        if not self.language.data:
            self.language.data = session.get('tenant').get('default_language')
 class EditDocumentForm(FlaskForm):
@@ -209,6 +220,7 @@ class EditDocumentForm(FlaskForm):
 class EditDocumentVersionForm(DynamicFormBase):
    sub_file_type = StringField('Sub File Type', validators=[Optional(), Length(max=50)])
    language = StringField('Language')
    user_context = TextAreaField('User Context', validators=[Optional()])
    system_context = TextAreaField('System Context', validators=[Optional()])
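
Review note: `ProcessorForm.__init__` above fills its `type` choices from `PROCESSOR_TYPES`, sorted by display name. A minimal sketch of that step follows. `EXAMPLE_TYPES` is made-up data (the real definitions live in `config/type_defs/processor_types.py` and are not shown in this diff), and `type_choices` is a hypothetical helper name.

```python
# Hypothetical type definitions; only the 'name' key is assumed here.
EXAMPLE_TYPES = {
    'PDF_PROCESSOR': {'name': 'PDF Processor'},
    'DOCX_PROCESSOR': {'name': 'DocX Processor'},
    'HTML_PROCESSOR': {'name': 'HTML Processor'},
}


def type_choices(type_defs):
    """Build (key, label) pairs for a select field, ordered by label."""
    return sorted(
        [(key, value['name']) for key, value in type_defs.items()],
        key=lambda pair: pair[1],  # sort on the human-readable label, not the key
    )
```

A shared helper like this could also serve the catalog and retriever forms, which build their choices from `CATALOG_TYPES` and `RETRIEVER_TYPES` in the same shape.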
--- a/eveai_app/views/document_views.py
+++ b/eveai_app/views/document_views.py
@@ -1,7 +1,6 @@
 import ast
 from datetime import datetime as dt, timezone as tz
 from babel.messages.setuptools_frontend import update_catalog
 from flask import request, redirect, flash, render_template, Blueprint, session, current_app
 from flask_security import roles_accepted, current_user
 from sqlalchemy import desc
@@ -10,40 +9,38 @@ from werkzeug.utils import secure_filename
 from sqlalchemy.exc import SQLAlchemyError
 import requests
 from requests.exceptions import SSLError
 from urllib.parse import urlparse, unquote
 import io
 import json
-from common.models.document import Document, DocumentVersion, Catalog, Retriever
+from common.models.document import Document, DocumentVersion, Catalog, Retriever, Processor
-from common.extensions import db, minio_client
+from common.extensions import db
-from common.utils.document_utils import validate_file_type, create_document_stack, start_embedding_task, process_url, \
+from common.models.interaction import Specialist, SpecialistRetriever
-    process_multiple_urls, get_documents_list, edit_document, \
+from common.utils.document_utils import create_document_stack, start_embedding_task, process_url, \
    edit_document, \
    edit_document_version, refresh_document
 from common.utils.eveai_exceptions import EveAIInvalidLanguageException, EveAIUnsupportedFileType, \
    EveAIDoubleURLException
-from .document_forms import AddDocumentForm, AddURLForm, EditDocumentForm, EditDocumentVersionForm, AddURLsForm, \
+from config.type_defs.processor_types import PROCESSOR_TYPES
-    CatalogForm, EditCatalogForm, RetrieverForm, EditRetrieverForm
+from .document_forms import AddDocumentForm, AddURLForm, EditDocumentForm, EditDocumentVersionForm, \
    CatalogForm, EditCatalogForm, RetrieverForm, EditRetrieverForm, ProcessorForm, EditProcessorForm
 from common.utils.middleware import mw_before_request
 from common.utils.celery_utils import current_celery
 from common.utils.nginx_utils import prefixed_url_for
-from common.utils.view_assistants import form_validation_failed, prepare_table_for_macro, form_to_dict
+from common.utils.view_assistants import form_validation_failed, prepare_table_for_macro
 from .document_list_view import DocumentListView
 from .document_version_list_view import DocumentVersionListView
-from config.catalog_types import CATALOG_TYPES
+from config.type_defs.catalog_types import CATALOG_TYPES
-from config.retriever_types import RETRIEVER_TYPES
+from config.type_defs.retriever_types import RETRIEVER_TYPES
 document_bp = Blueprint('document_bp', __name__, url_prefix='/document')
@document_bp.before_request
 def log_before_request():
-    current_app.logger.debug(f"Before request (document_bp): {request.method} {request.url}")
+    current_app.logger.debug(f'Before request: {request.path} =====================================')
@document_bp.after_request
 def log_after_request(response):
    current_app.logger.debug(
        f"After request  (document_bp): {request.method} {request.url} - Status: {response.status}")
    return response
@@ -53,8 +50,6 @@ def before_request():
        mw_before_request()
    except Exception as e:
        current_app.logger.error(f'Error switching schema in Document Blueprint: {e}')
        for role in current_user.roles:
            current_app.logger.debug(f'User {current_user.email} has role {role.name}')
        raise
@@ -67,16 +62,6 @@ def catalog():
        tenant_id = session.get('tenant').get('id')
        new_catalog = Catalog()
        form.populate_obj(new_catalog)
        # Handle Embedding Variables
        new_catalog.html_tags = [tag.strip() for tag in form.html_tags.data.split(',')] if form.html_tags.data else []
        new_catalog.html_end_tags = [tag.strip() for tag in form.html_end_tags.data.split(',')] \
            if form.html_end_tags.data else []
        new_catalog.html_included_elements = [tag.strip() for tag in form.html_included_elements.data.split(',')] \
            if form.html_included_elements.data else []
        new_catalog.html_excluded_elements = [tag.strip() for tag in form.html_excluded_elements.data.split(',')] \
            if form.html_excluded_elements.data else []
        new_catalog.html_excluded_classes = [cls.strip() for cls in form.html_excluded_classes.data.split(',')] \
            if form.html_excluded_classes.data else []
        set_logging_information(new_catalog, dt.now(tz.utc))
        try:
@@ -84,6 +69,8 @@ def catalog():
            db.session.commit()
            flash('Catalog successfully added!', 'success')
            current_app.logger.info(f'Catalog {new_catalog.name} successfully added for tenant {tenant_id}!')
            # Enable step 2 of creation of catalog - add configuration of the catalog (dependent on type)
            return redirect(prefixed_url_for('document_bp.catalog', catalog_id=new_catalog.id))
        except SQLAlchemyError as e:
            db.session.rollback()
            flash(f'Failed to add catalog. Error: {e}', 'danger')
@@ -140,27 +127,8 @@ def edit_catalog(catalog_id):
    configuration_config = CATALOG_TYPES[catalog.type]["configuration"]
    form.add_dynamic_fields("configuration", configuration_config, catalog.configuration)
    # Convert arrays to comma-separated strings for display
    if request.method == 'GET':
        form.html_tags.data = ', '.join(catalog.html_tags or '')
        form.html_end_tags.data = ', '.join(catalog.html_end_tags or '')
        form.html_included_elements.data = ', '.join(catalog.html_included_elements or '')
        form.html_excluded_elements.data = ', '.join(catalog.html_excluded_elements or '')
        form.html_excluded_classes.data = ', '.join(catalog.html_excluded_classes or '')
    if request.method == 'POST' and form.validate_on_submit():
        form.populate_obj(catalog)
        # Handle Embedding Variables
        catalog.html_tags = [tag.strip() for tag in form.html_tags.data.split(',')] if form.html_tags.data else []
        catalog.html_end_tags = [tag.strip() for tag in form.html_end_tags.data.split(',')] \
            if form.html_end_tags.data else []
        catalog.html_included_elements = [tag.strip() for tag in form.html_included_elements.data.split(',')] \
            if form.html_included_elements.data else []
        catalog.html_excluded_elements = [tag.strip() for tag in form.html_excluded_elements.data.split(',')] \
            if form.html_excluded_elements.data else []
        catalog.html_excluded_classes = [cls.strip() for cls in form.html_excluded_classes.data.split(',')] \
            if form.html_excluded_classes.data else []
        catalog.configuration = form.get_dynamic_data('configuration')
        update_logging_information(catalog, dt.now(tz.utc))
        try:
@@ -180,6 +148,116 @@ def edit_catalog(catalog_id):
    return render_template('document/edit_catalog.html', form=form, catalog_id=catalog_id)
@document_bp.route('/processor', methods=['GET', 'POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def processor():
    form = ProcessorForm()
    if form.validate_on_submit():
        tenant_id = session.get('tenant').get('id')
        new_processor = Processor()
        form.populate_obj(new_processor)
        # The catalog field allows a blank selection, so guard against None
        new_processor.catalog_id = form.catalog.data.id if form.catalog.data else None
        set_logging_information(new_processor, dt.now(tz.utc))
        try:
            db.session.add(new_processor)
            db.session.commit()
            flash('Processor successfully added!', 'success')
            current_app.logger.info(f'Processor {new_processor.name} successfully added for tenant {tenant_id}!')
            # Enable step 2 of creation of processor - add configuration of the processor (dependent on type)
            return redirect(prefixed_url_for('document_bp.edit_processor', processor_id=new_processor.id))
        except SQLAlchemyError as e:
            db.session.rollback()
            flash(f'Failed to add processor. Error: {e}', 'danger')
            current_app.logger.error(f'Failed to add processor {new_processor.name} '
                                     f'for tenant {tenant_id}. Error: {str(e)}')
    return render_template('document/processor.html', form=form)
@document_bp.route('/processor/<int:processor_id>', methods=['GET', 'POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def edit_processor(processor_id):
    """Edit an existing processorr configuration."""
    # Get the processor or return 404
    processor = Processor.query.get_or_404(processor_id)
    if processor.catalog_id:
        # If catalog_id is just an ID, fetch the Catalog object
        processor.catalog = Catalog.query.get(processor.catalog_id)
    else:
        processor.catalog = None
    # Create form instance with the processor
    form = EditProcessorForm(request.form, obj=processor)
    configuration_config = PROCESSOR_TYPES[processor.type]["configuration"]
    form.add_dynamic_fields("configuration", configuration_config, processor.configuration)
    if form.validate_on_submit():
        # Update basic fields
        form.populate_obj(processor)
        processor.configuration = form.get_dynamic_data('configuration')
        # Update catalog relationship
        processor.catalog_id = form.catalog.data.id if form.catalog.data else None
        # Update logging information
        update_logging_information(processor, dt.now(tz.utc))
        # Save changes to database
        try:
            db.session.add(processor)
            db.session.commit()
            flash('Processor updated successfully!', 'success')
            current_app.logger.info(f'Processor {processor.id} updated successfully')
        except SQLAlchemyError as e:
            db.session.rollback()
            flash(f'Failed to update processor. Error: {str(e)}', 'danger')
            current_app.logger.error(f'Failed to update processor {processor_id}. Error: {str(e)}')
            return render_template('document/edit_processor.html', form=form, processor_id=processor_id)
        return redirect(prefixed_url_for('document_bp.processors'))
    else:
        form_validation_failed(request, form)
    return render_template('document/edit_processor.html', form=form, processor_id=processor_id)
@document_bp.route('/processors', methods=['GET', 'POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def processors():
    page = request.args.get('page', 1, type=int)
    per_page = request.args.get('per_page', 10, type=int)
    query = Processor.query.order_by(Processor.id)
    pagination = query.paginate(page=page, per_page=per_page)
    the_processors = pagination.items
    # prepare table data
    rows = prepare_table_for_macro(the_processors,
                                   [('id', ''), ('name', ''), ('type', ''), ('catalog_id', '')])
    # Render the processors in a template
    return render_template('document/processors.html', rows=rows, pagination=pagination)
@document_bp.route('/handle_processor_selection', methods=['POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def handle_processor_selection():
    processor_identification = request.form.get('selected_row')
    processor_id = ast.literal_eval(processor_identification).get('value')
    action = request.form['action']
    if action == 'edit_processor':
        return redirect(prefixed_url_for('document_bp.edit_processor', processor_id=processor_id))
    return redirect(prefixed_url_for('document_bp.processors'))
@document_bp.route('/retriever', methods=['GET', 'POST'])
@roles_accepted('Super User', 'Tenant Admin')
 def retriever():
@@ -198,15 +276,14 @@ def retriever():
            db.session.commit()
            flash('Retriever successfully added!', 'success')
            current_app.logger.info(f'Retriever {new_retriever.name} successfully added for tenant {tenant_id}!')
            # Enable step 2 of creation of retriever - add configuration of the retriever (dependent on type)
            return redirect(prefixed_url_for('document_bp.edit_retriever', retriever_id=new_retriever.id))
        except SQLAlchemyError as e:
            db.session.rollback()
            flash(f'Failed to add retriever. Error: {e}', 'danger')
            current_app.logger.error(f'Failed to add retriever {new_retriever.name} '
                                     f'for tenant {tenant_id}. Error: {str(e)}')
        # Enable step 2 of creation of retriever - add configuration of the retriever (dependent on type)
        return redirect(prefixed_url_for('document_bp.retriever', retriever_id=new_retriever.id))
    return render_template('document/retriever.html', form=form)
@@ -228,8 +305,6 @@ def edit_retriever(retriever_id):
    configuration_config = RETRIEVER_TYPES[retriever.type]["configuration"]
    form.add_dynamic_fields("configuration", configuration_config, retriever.configuration)
    if request.method == 'POST':
        current_app.logger.debug(f'Received POST request with {request.form}')
    if form.validate_on_submit():
        # Update basic fields
@@ -258,7 +333,6 @@ def edit_retriever(retriever_id):
    else:
        form_validation_failed(request, form)
    current_app.logger.debug(f"Rendering Template for {retriever_id}")
    return render_template('document/edit_retriever.html', form=form, retriever_id=retriever_id)
@@ -300,7 +374,7 @@ def add_document():
    form = AddDocumentForm(request.form)
    catalog_id = session.get('catalog_id', None)
    if catalog_id is None:
-        flash('You need to set a Session Catalog before adding Documents or URLs')
+        flash('You need to set a Session Catalog before adding Documents or URLs', 'warning')
        return redirect(prefixed_url_for('document_bp.catalogs'))
    catalog = Catalog.query.get_or_404(catalog_id)
@@ -314,11 +388,9 @@ def add_document():
            current_app.logger.info(f'Adding Document for {catalog_id}')
            tenant_id = session['tenant']['id']
            file = form.file.data
            sub_file_type = form.sub_file_type.data
            filename = secure_filename(file.filename)
            extension = filename.rsplit('.', 1)[1].lower()
            validate_file_type(extension)
            catalog_properties = {}
            document_version_configurations = CATALOG_TYPES[catalog.type]['document_version_configurations']
            for config in document_version_configurations:
@@ -327,14 +399,13 @@ def add_document():
            api_input = {
                'catalog_id': catalog_id,
                'name': form.name.data,
                'sub_file_type': form.sub_file_type.data,
                'language': form.language.data,
                'user_context': form.user_context.data,
                'valid_from': form.valid_from.data,
                'user_metadata': json.loads(form.user_metadata.data) if form.user_metadata.data else None,
                'catalog_properties': catalog_properties,
            }
            current_app.logger.debug(f'Creating document stack with input {api_input}')
            new_doc, new_doc_vers = create_document_stack(api_input, file, filename, extension, tenant_id)
            task_id = start_embedding_task(tenant_id, new_doc_vers.id)
@@ -344,6 +415,7 @@ def add_document():
        except (EveAIInvalidLanguageException, EveAIUnsupportedFileType) as e:
            flash(str(e), 'error')
            current_app.logger.error(f"Error adding document: {str(e)}")
        except Exception as e:
            current_app.logger.error(f'Error adding document: {str(e)}')
            flash('An error occurred while adding the document.', 'error')
@@ -357,7 +429,7 @@ def add_url():
    form = AddURLForm(request.form)
    catalog_id = session.get('catalog_id', None)
    if catalog_id is None:
-        flash('You need to set a Session Catalog before adding Documents or URLs')
+        flash('You need to set a Session Catalog before adding Documents or URLs', 'warning')
        return redirect(prefixed_url_for('document_bp.catalogs'))
    catalog = Catalog.query.get_or_404(catalog_id)
@@ -381,6 +453,7 @@ def add_url():
            api_input = {
                'catalog_id': catalog_id,
                'name': form.name.data or filename,
                'sub_file_type': form.sub_file_type.data,
                'url': url,
                'language': form.language.data,
                'user_context': form.user_context.data,
@@ -469,6 +542,7 @@ def edit_document_view(document_id):
    if form.validate_on_submit():
        updated_doc, error = edit_document(
            session.get('tenant').get('id', 0),
            document_id,
            form.name.data,
            form.valid_from.data,
@@ -491,10 +565,8 @@ def edit_document_version_view(document_version_id):
    doc_vers = DocumentVersion.query.get_or_404(document_version_id)
    form = EditDocumentVersionForm(request.form, obj=doc_vers)
-    catalog_id = session.get('catalog_id', None)
-    if catalog_id is None:
-        flash('You need to set a Session Catalog before adding Documents or URLs')
-        return redirect(prefixed_url_for('document_bp.catalogs'))
+    doc_vers = DocumentVersion.query.get_or_404(document_version_id)
+    catalog_id = doc_vers.document.catalog_id
    catalog = Catalog.query.get_or_404(catalog_id)
    if catalog.configuration and len(catalog.configuration) > 0:
@@ -509,6 +581,7 @@ def edit_document_version_view(document_version_id):
            catalog_properties[config] = form.get_dynamic_data(config)
        updated_version, error = edit_document_version(
            session.get('tenant').get('id', 0),
            document_version_id,
            form.user_context.data,
            catalog_properties,
@@ -565,8 +638,6 @@ def handle_document_version_selection():
    action = request.form['action']
    current_app.logger.debug(f'Triggered Document Version Action: {action}')
    match action:
        case 'edit_document_version':
            return redirect(prefixed_url_for('document_bp.edit_document_version_view', document_version_id=doc_vers_id))
@@ -590,6 +661,8 @@ def handle_library_selection():
    action = request.form['action']
    match action:
        case 'create_default_rag_library':
            create_default_rag_library()
        case 're_embed_latest_versions':
            re_embed_latest_versions()
        case 'refresh_all_documents':
@@ -598,12 +671,90 @@ def handle_library_selection():
    return redirect(prefixed_url_for('document_bp.library_operations'))
 def create_default_rag_library():
    # Check that no catalog exists. If none exist, no processors, retrievers or specialists can exist
    catalogs = Catalog.query.all()
    if catalogs:
        flash("Default RAG Library can only be created if no catalogs are defined!", 'danger')
        return redirect(prefixed_url_for('document_bp.library_operations'))
    timestamp = dt.now(tz=tz.utc)
    try:
        cat = Catalog(
            name='Default RAG Catalog',
            description='Default RAG Catalog',
            type="STANDARD_CATALOG",
            min_chunk_size=2000,
            max_chunk_size=3000,
        )
        set_logging_information(cat, timestamp)
        db.session.add(cat)
        db.session.commit()
        proc = Processor(
            name='Default HTML Processor',
            description='Default HTML Processor',
            catalog_id=cat.id,
            type="HTML Processor",
            configuration={
                "html_tags": "p, h1, h2, h3, h4, h5, h6, li, table, thead, tbody, tr, td",
                "html_end_tags": "p, li, table",
                "html_excluded_classes": "",
                "html_excluded_elements": "header, footer, nav, script",
                "html_included_elements": "article, main"
            }
        )
        set_logging_information(proc, timestamp)
        retr = Retriever(
            name='Default RAG Retriever',
            description='Default RAG Retriever',
            catalog_id=cat.id,
            type="STANDARD_RAG",
            configuration={
                "es_k": "8",
                "es_similarity_threshold": 0.3
            }
        )
        set_logging_information(retr, timestamp)
        db.session.add(proc)
        db.session.add(retr)
        db.session.commit()
        spec = Specialist(
            name='Default RAG Specialist',
            description='Default RAG Specialist',
            type='STANDARD_RAG',
            configuration={"temperature": "0.3", "specialist_context": "To be specified"}
        )
        set_logging_information(spec, timestamp)
        db.session.add(spec)
        db.session.commit()
        spec_retr = SpecialistRetriever(
            specialist_id=spec.id,
            retriever_id=retr.id,
        )
        db.session.add(spec_retr)
        db.session.commit()
    except SQLAlchemyError as e:
        db.session.rollback()
        flash(f'Failed to create Default RAG Library. Error: {e}', 'danger')
        current_app.logger.error(f'Failed to create Default RAG Library '
                                 f"for tenant {session['tenant']['id']}. Error: {str(e)}")
    return redirect(prefixed_url_for('document_bp.library_operations'))
@document_bp.route('/document_versions_list', methods=['GET'])
@roles_accepted('Super User', 'Tenant Admin')
 def document_versions_list():
    current_app.logger.debug('Getting document versions list')
    view = DocumentVersionListView(DocumentVersion, 'document/document_versions_list_view.html', per_page=20)
    current_app.logger.debug('Got document versions list')
    return view.get()
@@ -656,8 +807,9 @@ def update_logging_information(obj, timestamp):
 def log_session_state(session, msg=""):
-    current_app.logger.debug(f"{msg} - Session dirty: {session.dirty}")
-    current_app.logger.debug(f"{msg} - Session new: {session.new}")
+    pass
+    # current_app.logger.info(f"{msg} - Session dirty: {session.dirty}")
+    # current_app.logger.info(f"{msg} - Session new: {session.new}")
 def fetch_html(url):
--- a/eveai_app/views/dynamic_form_base.py
+++ b/eveai_app/views/dynamic_form_base.py
@@ -5,6 +5,46 @@ import json
 from wtforms.fields.choices import SelectField
 from wtforms.fields.datetime import DateField
 from common.utils.config_field_types import TaggingFields, json_to_patterns, patterns_to_json
 class TaggingFieldsField(TextAreaField):
    def __init__(self, *args, **kwargs):
        kwargs['render_kw'] = {
            'class': 'chunking-patterns-field',
            'data-handle-enter': 'true'
        }
        super().__init__(*args, **kwargs)
    # def _value(self):
    #     if self.data:
    #         return json.dumps(self.data)
    #     return ''
    #
    # def process_formdata(self, valuelist):
    #     if valuelist and valuelist[0]:
    #         try:
    #             self.data = json.loads(valuelist[0])
    #         except json.JSONDecodeError as e:
    #             raise ValueError('Not valid JSON content')
 class ChunkingPatternsField(TextAreaField):
    def __init__(self, *args, **kwargs):
        kwargs['render_kw'] = {
            'class': 'chunking-patterns-field',
            'data-handle-enter': 'true'
        }
        super().__init__(*args, **kwargs)
    # def _value(self):
    #     if self.data:
    #         return '\n'.join(self.data)
    #     return ''
    #
    # def process_formdata(self, valuelist):
    #     if valuelist and valuelist[0]:
    #         self.data = [line.strip() for line in valuelist[0].split('\n') if line.strip()]
 class DynamicFormBase(FlaskForm):
@@ -38,14 +78,35 @@ class DynamicFormBase(FlaskForm):
                        message=f"Value must be between {min_value or '-∞'} and {max_value or '∞'}"
                    )
                )
        elif field_type == 'tagging_fields':
            validators_list.append(self._validate_tagging_fields)
        return validators_list
    def _validate_tagging_fields(self, form, field):
        """Validate the tagging fields structure"""
        if not field.data:
            return
        try:
            # Parse JSON data
            fields_data = json.loads(field.data)
            # Validate using TaggingFields model
            try:
                TaggingFields.from_dict(fields_data)
            except ValueError as e:
                raise ValidationError(str(e))
        except json.JSONDecodeError:
            raise ValidationError("Invalid JSON format")
        except Exception as e:
            raise ValidationError(f"Invalid field definition: {str(e)}")
    def add_dynamic_fields(self, collection_name, config, initial_data=None):
        """Add dynamic fields to the form based on the configuration."""
        self.dynamic_fields[collection_name] = []
        for field_name, field_def in config.items():
            current_app.logger.debug(f"{field_name}: {field_def}")
            # Prefix the field name with the collection name
            full_field_name = f"{collection_name}_{field_name}"
            label = field_def.get('name', field_name)
@@ -58,8 +119,7 @@ class DynamicFormBase(FlaskForm):
            # Handle special case for tagging_fields
            if field_type == 'tagging_fields':
-                field_class = TextAreaField
+                field_class = TaggingFieldsField
                field_validators.append(validate_tagging_fields)
                extra_classes = 'json-editor'
                field_kwargs = {}
            elif field_type == 'enum':
@@ -68,6 +128,10 @@ class DynamicFormBase(FlaskForm):
                choices = [(str(val), str(val)) for val in allowed_values]
                extra_classes = ''
                field_kwargs = {'choices': choices}
            elif field_type == 'chunking_patterns':
                field_class = ChunkingPatternsField
                extra_classes = ['monospace-text', 'pattern-input']
                field_kwargs = {}
            else:
                extra_classes = ''
                field_class = {
@@ -75,6 +139,7 @@ class DynamicFormBase(FlaskForm):
                    'float': FloatField,
                    'boolean': BooleanField,
                    'string': StringField,
                    'text': TextAreaField,
                    'date': DateField,
                }.get(field_type, StringField)
                field_kwargs = {}
@@ -89,6 +154,12 @@ class DynamicFormBase(FlaskForm):
                    except (TypeError, ValueError) as e:
                        current_app.logger.error(f"Error converting initial data to JSON: {e}")
                        field_data = "{}"
                elif field_type == 'chunking_patterns':
                    try:
                        field_data = json_to_patterns(field_data)
                    except (TypeError, ValueError) as e:
                        current_app.logger.error(f"Error converting initial data to a list of patterns: {e}")
                        field_data = {}
            elif default is not None:
                field_data = default
@@ -144,23 +215,24 @@ class DynamicFormBase(FlaskForm):
    def get_dynamic_data(self, collection_name):
        """Retrieve the data from dynamic fields of a specific collection."""
        data = {}
        current_app.logger.debug(f"{collection_name} in {self.dynamic_fields}?")
        if collection_name not in self.dynamic_fields:
            return data
        prefix_length = len(collection_name) + 1  # +1 for the underscore
        for full_field_name in self.dynamic_fields[collection_name]:
            current_app.logger.debug(f"{full_field_name}: {full_field_name}")
            original_field_name = full_field_name[prefix_length:]
            current_app.logger.debug(f"{original_field_name}: {original_field_name}")
            field = getattr(self, full_field_name)
            current_app.logger.debug(f"{field}: {field}")
            # Parse JSON for tagging_fields type
-            if isinstance(field, TextAreaField) and field.data:
+            if isinstance(field, TaggingFieldsField) and field.data:
                try:
                    data[original_field_name] = json.loads(field.data)
                except json.JSONDecodeError:
                    # Validation should catch this, but just in case
                    data[original_field_name] = field.data
            elif isinstance(field, ChunkingPatternsField):
                try:
                    data[original_field_name] = patterns_to_json(field.data)
                except Exception as e:
                    current_app.logger.error(f"Error converting initial data to patterns: {e}")
            else:
                data[original_field_name] = field.data
        return data
@@ -212,5 +284,3 @@ def validate_tagging_fields(form, field):
    except (TypeError, ValueError) as e:
        raise ValidationError(f"Invalid field definition: {str(e)}")
Author	SHA1	Message	Date
Josako	53c625599a	- Release documentation update	2024-12-13 11:18:07 +01:00
Josako	88ee4f482b	- Try to move Add... functionality to overviews	2024-12-13 10:41:29 +01:00
Josako	3176b95323	- finished add_document on Zapier interface	2024-12-13 10:40:57 +01:00
Josako	46c60b36a0	- First 'working' version of the Zapier plugin. Needs further debugging and needs additional functionalty (only add_document.js)	2024-12-12 16:36:41 +01:00
Josako	d35ec9f5ae	- Addition of general chunking parameters chunking_heading_level and chunking patterns - Addition of Processor types docx and markdown	2024-12-05 15:19:37 +01:00
Josako	311927d5ea	Just because	2024-11-29 14:11:49 +01:00
Josako	fb798501b9	- Build and Release script for WordPress plugins	2024-11-29 14:11:36 +01:00
Josako	99135c9b02	- Updated OpenAI client due to 'proxies' error (no longer supported).	2024-11-29 14:11:07 +01:00
Josako	425b580f15	- Corrected problem where Language Dropdown was not filled	2024-11-29 11:49:50 +01:00
Josako	b658e68e65	- Minor bugfixes	2024-11-29 11:24:32 +01:00
Josako	b8e07bec77	- RAG Specialist was not using detailed_question, but original question. Resulting in Evie not returning good or correct answers.	2024-11-29 11:23:54 +01:00
Josako	344ea26ecc	- Security improvements to Docker images (Docker Scout advise)	2024-11-27 12:27:28 +01:00
Josako	98cb4e4f2f	- Created a new eveai_chat plugin to support the new dynamic possibilities of the Specialists. Currently only supports standard Rag retrievers (i.e. no extra arguments).	2024-11-27 12:26:49 +01:00
Josako	07d89d204f	- Created a new eveai_chat plugin to support the new dynamic possibilities of the Specialists. Currently only supports standard Rag retrievers (i.e. no extra arguments).	2024-11-26 13:35:29 +01:00
Josako	7702a6dfcc	- Modernized authentication with the introduction of TenantProject - Created a base mail template - Adapt and improve document API to usage of catalogs and processors - Adapt eveai_sync to new authentication mechanism and usage of catalogs and processors	2024-11-21 17:24:33 +01:00
Josako	4c009949b3	- Changes to support SpecialistID being passed iso CatalogID - Removed error that stopped sync	2024-11-15 13:13:45 +01:00
Josako	aa4ac3ec7c	- Changes to support SpecialistID being passed iso CatalogID - Removed error that stopped sync	2024-11-15 13:13:33 +01:00
Josako	1807435339	- Introduction of dynamic Retrievers & Specialists - Introduction of dynamic Processors - Introduction of caching system - Introduction of a better template manager - Adaptation of ModelVariables to support dynamic Processors / Retrievers / Specialists - Start adaptation of chat client	2024-11-15 10:00:53 +01:00
Josako	55a8a95f79	- Finalisation of the Specialist model, forms and views	2024-11-04 11:22:40 +01:00
Josako	503ea7965d	- Temporary checkin to branch for the rest of the introduction of experts	2024-11-03 16:18:14 +01:00