[RAG] Operations

[RAG] Adhoc Load Document

The Rag adhoc load document operation retrieves information from a file based on a plain text prompt, using embeddings and an LLM.

Input Fields

Module Configuration

This refers to the Einstein AI configuration created in the Getting Started section.

General Operation Fields

  • Prompt: The plain text query or question to ask about the file.
  • Input Stream: A file input stream for the document to be converted into embeddings. Use the File Connector to read the file into an input stream.

Additional Properties

  • Model Name: The model name to be used (default is OpenAI Ada 002).
  • File Type: The type of the document to be ingested into the embedding store. Currently, four file types are supported:
    • text: any type of text file (JSON, XML, TXT, CSV, etc.)
    • pdf: system-generated PDFs only
    • csv: comma-separated values
    • url: a single URL only
  • Option Type: Defines how the document is split before ingestion into the embedding/vector database.
  • Probability: The probability that the model stays accurate (default is 0.8).
  • Locale: Localization information, which can include the default locale, input locale(s), and expected output locale(s) (default is en_US).

XML Configuration

Below is the XML configuration for this operation:

<mac-einstein:rag-adhoc-load-document 
  doc:name="Rag adhoc load document" 
  doc:id="edaea124-a8aa-4d4a-8f85-0f32ee4c9858" 
  config-ref="Einstein_AI" 
  prompt="#[payload.prompt]" 
  filePath="#[payload.filePath]" 
  optionType="PARAGRAPH"
/>
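
For context, the operation can be placed in a flow that receives the prompt and file path in the request body. The fragment below is a minimal sketch, not a definitive setup: the flow name, the HTTP Listener configuration name (HTTP_Listener_config), and the /rag path are hypothetical, and the namespace declarations of the enclosing Mule configuration file are omitted. The mac-einstein element reuses only the attributes shown in the configuration above.

```xml
<!-- Sketch only: flow name, listener config name, and path are assumptions -->
<flow name="rag-adhoc-load-document-flow">
  <!-- Expects a JSON request body with "prompt" and "filePath" fields -->
  <http:listener config-ref="HTTP_Listener_config" path="/rag" />

  <!-- Reads both values from the incoming payload, as in the configuration above -->
  <mac-einstein:rag-adhoc-load-document
    doc:name="Rag adhoc load document"
    config-ref="Einstein_AI"
    prompt="#[payload.prompt]"
    filePath="#[payload.filePath]"
    optionType="PARAGRAPH" />
</flow>
```

With this arrangement, the DataWeave expressions #[payload.prompt] and #[payload.filePath] resolve against the JSON body of the incoming request, so the caller supplies both the question and the location of the document.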

Output Field

This operation responds with a JSON payload.

Example Use Cases

The Rag adhoc load document operation can be used in various scenarios, such as:

  • Customer Service: Quickly retrieve relevant information from documents to answer customer queries accurately.
  • Legal Teams: Load legal documents and query specific sections to find relevant case information.
  • Research: Access specific sections of large research documents based on queries to support ongoing studies.