---
id: retrievers
title: Understanding Retrievers
description: Learn how retrievers find and extract relevant information from your documents
sidebar_label: Retrievers
sidebar_position: 3
---

# Understanding Retrievers

## Overview

Retrievers are essential components in Evie's Library that find and extract relevant information from your documents. Think of a retriever as an intelligent search engine that understands the meaning behind your question and returns the most relevant content from your stored documents.

```mermaid
classDiagram
    class Catalog {
        +id: Integer
        +name: String
        +description: Text
        +type: String
        +min_chunk_size: Integer
        +max_chunk_size: Integer
        +user_metadata: JSON
    }
    class Retriever {
        +id: Integer
        +name: String
        +description: Text
        +catalog_id: Integer
        +type: String
        +tuning: Boolean
        +configuration: JSON
        +arguments: JSON
    }
    class StandardRAGRetriever {
        +configuration
        es_k: Integer
        es_similarity_threshold: Float
        +arguments
        query: String
    }
    class DossierRetriever {
        +configuration
        es_k: Integer
        es_similarity_threshold: Float
        tag_conditions: JSON
        +arguments
        query: String
    }
    Catalog "1" -- "*" Retriever : has
    Retriever <|-- StandardRAGRetriever
    Retriever <|-- DossierRetriever
    note for StandardRAGRetriever "Default similarity threshold: 0.3\nDefault es_k: 8"
    note for DossierRetriever "Coming soon\nSpecialized for Dossier catalogs"
```

## Key Concepts

### What is a Retriever?

A retriever is responsible for:

- Understanding the meaning of your questions
- Searching through document chunks in your catalog
- Finding the most relevant information based on semantic similarity
- Providing context for Evie's responses

```mermaid
flowchart LR
    A[User Question] --> B[Retriever]
    B --> C[Document Chunks]
    C --> D[Most Relevant Information]
    D --> E[Evie's Response]
    style A fill:#9c2d66,stroke:#333,stroke-width:2px
    style B fill:#423372,stroke:#333,stroke-width:2px
    style C fill:#423372,stroke:#333,stroke-width:2px
    style D fill:#423372,stroke:#333,stroke-width:2px
    style E fill:#9c2d66,stroke:#333,stroke-width:2px
```

### How Retrievers Work

When you ask Evie a question, the retriever:

1. Analyzes your question to understand its meaning
2. Compares it with stored document chunks
3. Assigns a similarity score to each chunk
4. Returns the most relevant chunks based on its configuration settings

## Types of Retrievers

### Standard RAG Retriever

The Standard RAG (Retrieval-Augmented Generation) Retriever is the default option and is suitable for most use cases. It searches through all documents in a catalog to find relevant information.

Configuration options include:

- **Maximum Results (`es_k`)**: Controls how many document chunks to retrieve (default: 8)
- **Similarity Threshold (`es_similarity_threshold`)**: Determines how closely chunks must match your question (default: 0.3)
  - Lower threshold = stricter matching
  - Higher threshold = more permissive matching

### Dossier Retriever (Coming Soon)

A specialized retriever for Dossier catalogs that will allow:

- Filtering by document tags
- Creating specific "viewpoints" based on tag combinations
- Combining semantic search with tag-based filtering

## Setting Up Retrievers

### Creating a New Retriever

To create a retriever:

1. Enter the standard values, such as name and description
2. Select the target catalog
3. Choose the retriever type
4. Save the retriever, then set the type-specific configuration, which becomes available after saving

### Configuration Best Practices

1. **Similarity Threshold Tuning**:
   - Start with the default threshold of 0.3
   - If you receive too much information, lower the threshold
   - If you receive too little information, raise the threshold

2. **Multiple Retrievers**: You can create multiple retrievers for the same catalog to serve different purposes. For example:
   - A broad retriever with a higher threshold for general questions
   - A strict retriever with a lower threshold for specific queries
   - Different retrievers for different document subsets (in Dossier catalogs)

## Practical Examples

### Standard RAG Retriever Example

```json
{
  "name": "General Knowledge Retriever",
  "type": "STANDARD_RAG",
  "configuration": {
    "es_k": 8,
    "es_similarity_threshold": 0.3
  }
}
```

### Future Dossier Retriever Example

```json
{
  "name": "Quarterly Reports 2024",
  "type": "DOSSIER_RAG",
  "configuration": {
    "es_k": 8,
    "es_similarity_threshold": 0.3,
    "tag_conditions": {
      "document_type": "quarterly_report",
      "year": 2024
    }
  }
}
```

## Tips for Optimal Retrieval

1. **Name Retrievers Clearly**: Use descriptive names that indicate each retriever's purpose and configuration

2. **Monitor Performance**:
   - If answers are missing important information, consider:
     - Increasing the similarity threshold
     - Increasing the maximum results (`es_k`)
   - If answers contain irrelevant information, consider:
     - Decreasing the similarity threshold
     - Decreasing the maximum results (`es_k`)

3. **Use Multiple Retrievers**: Create specialized retrievers for different use cases within the same catalog
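
### Illustrative Sketch: How `es_k` and the Threshold Interact

The four steps listed under "How Retrievers Work" can be illustrated with a small, self-contained Python sketch. This is not Evie's actual implementation, which this page does not show. The `Chunk` class and the `cosine_distance` and `retrieve` functions are hypothetical names introduced only for illustration. The sketch also assumes that the threshold acts as a maximum cosine distance, which is one reading consistent with the guidance that a lower threshold means stricter matching; the real scoring may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored document chunk and its embedding."""
    text: str
    embedding: list[float]


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity (0.0 means the vectors point the same way)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(
    query_embedding: list[float],
    chunks: list[Chunk],
    es_k: int = 8,
    es_similarity_threshold: float = 0.3,
) -> list[tuple[float, Chunk]]:
    """Score every chunk, filter by the threshold, and keep the closest es_k."""
    # Steps 2 and 3: compare the query with each chunk and assign a score.
    scored = [(cosine_distance(query_embedding, c.embedding), c) for c in chunks]
    # ASSUMPTION: the threshold is a maximum distance, so lower = stricter.
    kept = [(d, c) for d, c in scored if d <= es_similarity_threshold]
    # Step 4: return the closest matches, capped at es_k results.
    kept.sort(key=lambda pair: pair[0])
    return kept[:es_k]


# Toy two-dimensional embeddings, invented for this example.
chunks = [
    Chunk("Refund policy", [1.0, 0.0]),
    Chunk("Shipping times", [0.8, 0.6]),
    Chunk("Office locations", [0.0, 1.0]),
]
query = [1.0, 0.0]

default_hits = [c.text for _, c in retrieve(query, chunks)]
strict_hits = [c.text for _, c in retrieve(query, chunks, es_similarity_threshold=0.1)]
```

Under these assumptions, the default settings keep the two closest chunks and drop the unrelated one, while lowering the threshold to 0.1 keeps only the exact match. Lowering `es_k` instead would cap the number of results without changing which chunks pass the threshold.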