[Embedding] Operations

[Embedding] Adhoc File Query

The Embedding adhoc file query operation takes a document and a query, ingests the document into the vector database, and scores the document content against the query. The output of this operation is a set of scores together with the document content that most likely answers the query. The vector database holds the numeric representation of the content, which is used to compute each score.
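The idea of scoring document content against a query can be sketched in Python as follows. This is a conceptual illustration only, not the connector's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for the real embedding model, and cosine similarity is an assumed scoring measure that the documentation does not specify.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (assumed scoring measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adhoc_query(segments: list[str], prompt: str, top_k: int = 2) -> list[tuple[float, str]]:
    """Score every segment against the prompt and return the best matches first."""
    query_vec = toy_embed(prompt)
    scored = [(cosine(toy_embed(s), query_vec), s) for s in segments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


# Illustrative document segments, not taken from any real file
segments = [
    "Refunds are issued within 14 days of a return request.",
    "Our support desk is open Monday to Friday.",
    "Shipping is free for orders above 50 euros.",
]
results = adhoc_query(segments, "How many days does a refund take?")
for score, segment in results:
    print(f"{score:.3f}  {segment}")
```

The highest-scoring segment is listed first, which mirrors the "scores plus content" shape described for this operation. A real embedding model captures meaning rather than word overlap, so its rankings will differ from this toy version.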

Input Fields

Module Configuration

This refers to the Einstein AI configuration set up in the getting started section.

General Operation Fields

  • Prompt: The query, prompt, or question to ask about the file.
  • File Path: The full path of the document to be ingested into the embedding store. Ensure the file path is accessible. You can also use a DataWeave expression for this field, such as mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf".

Additional Properties

  • Model Name: The model name to be used (default is OpenAI Ada 002).
  • File Type: The type of the document to be ingested into the embedding store. Currently, four file types are supported:
    • text: any type of text file (JSON, XML, TXT, CSV, etc.)
    • pdf: system-generated PDFs only
    • csv: comma-separated values
    • url: a single URL only
  • Option Type: This field defines how the document is going to be split prior to ingestion into the embedding/vector database.
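To make the effect of splitting concrete, here is a minimal Python sketch of two splitting strategies. The option names "FULL" and "PARAGRAPH" and the function `split_document` are hypothetical and chosen only for illustration; consult the connector for the values that Option Type actually accepts.

```python
def split_document(text: str, option: str = "FULL") -> list[str]:
    """Split text into the segments that would each be embedded.

    "FULL" and "PARAGRAPH" are hypothetical option names used for illustration.
    """
    if option == "FULL":
        # Keep the whole document as a single segment
        stripped = text.strip()
        return [stripped] if stripped else []
    if option == "PARAGRAPH":
        # One segment per block of text separated by a blank line
        return [p.strip() for p in text.split("\n\n") if p.strip()]
    raise ValueError(f"unknown option: {option}")


doc = "First paragraph about billing.\n\nSecond paragraph about shipping.\n"
full = split_document(doc, "FULL")
paragraphs = split_document(doc, "PARAGRAPH")
print(len(full), len(paragraphs))
```

Smaller segments generally let a query match a specific passage, while a single full-document segment keeps all context together at the cost of less precise matching.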

XML Configuration

Below is the XML configuration for this operation:

<mac-einstein:embedding-adhoc-file-query 
  doc:name="Embedding adhoc file query" 
  doc:id="b68d63c5-9780-4d56-add0-1543a86f08d7" 
  config-ref="Einstein_AI" 
  prompt="#[payload.prompt]" 
  filePath="#[payload.filePath]"
/>

Output Field

This operation responds with a JSON payload.

Example Use Cases for [Embedding] Adhoc File Query

The Embedding adhoc file query operation can be used in various scenarios, such as:

  • Customer Service: Quickly find relevant information in large documents to respond to customer inquiries.
  • Legal Teams: Search through legal documents to find references to specific cases or laws.
  • Research: Extract relevant sections from academic papers or reports to support research findings.

[Embedding] Generate from File

The Embedding generate from file operation takes a document and ingests it into the vector database. The output of this operation is a numeric representation of the content.

Input Fields

Module Configuration

This refers to the Einstein AI configuration set up in the getting started section.

General Operation Fields

  • File Path: The full path of the document to be ingested into the embedding store. Ensure the file path is accessible. You can also use a DataWeave expression for this field, such as mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf".

Additional Properties

  • Model Name: The model name to be used (default is OpenAI Ada 002).
  • File Type: The type of the document to be ingested into the embedding store. Currently, four file types are supported:
    • text: any type of text file (JSON, XML, TXT, CSV, etc.)
    • pdf: system-generated PDFs only
    • csv: comma-separated values
    • url: a single URL only
  • Option Type: This field defines how the document is going to be split prior to ingestion into the embedding/vector database.
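When the source of a document varies at runtime, the caller has to decide which of the four file types applies. The following Python helper is a sketch of one possible mapping; `guess_file_type` is a hypothetical function, and the lowercase return values simply reuse the labels from the list above, which may be spelled differently in the connector itself.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def guess_file_type(source: str) -> str:
    """Map a path or URL to one of the four documented type labels (assumed spelling)."""
    # Anything with an http(s) scheme is treated as a single URL
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".csv":
        return "csv"
    # JSON, XML, TXT, and other text formats fall back to the generic text type
    return "text"


for source in ("/opt/mule/apps/demo/customer-service.pdf", "https://example.com/faq"):
    print(source, "->", guess_file_type(source))
```

The helper cannot tell a system-generated PDF from a scanned one, so that constraint still has to be checked separately.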

XML Configuration

Below is the XML configuration for this operation:

<mac-einstein:embedding-generate-from-file 
  doc:name="Embedding generate from file" 
  doc:id="d39a9c2d-25b2-4d56-add0-1543a86f08d7" 
  config-ref="Einstein_AI" 
  filePath="#[payload.filePath]"
/>

Output Field

This operation responds with a JSON payload.

Example Use Cases for [Embedding] Generate from File

The Embedding generate from file operation can be used in various scenarios, such as:

  • Document Management: Create searchable embeddings for documents to improve retrieval accuracy.
  • Knowledge Bases: Ingest documents into a vector database to enhance knowledge base systems with semantic search capabilities.
  • Content Analysis: Generate embeddings for content analysis, clustering, or categorization in various domains.

[Embedding] Generate from Text

The Embedding generate from text operation takes text and ingests it into the vector database. The output of this operation is a numeric representation of the content.

Input Fields

Module Configuration

This refers to the Einstein AI configuration set up in the getting started section.

General Operation Fields

  • Text: This field contains the text to be ingested into the vector database.

Additional Properties

  • Model Name: The model name to be used (default is OpenAI Ada 002).

XML Configuration

Below is the XML configuration for this operation:

<mac-einstein:embedding-generate-from-text 
  doc:name="Embedding generate from text" 
  doc:id="2e7d40a6-5c4c-41e2-851e-22849bee9752" 
  config-ref="Einstein_AI" 
  text="#[payload.text]" 
/>

Output Field

This operation responds with a JSON payload.

Example Use Cases for [Embedding] Generate from Text

The Embedding generate from text operation can be used in various scenarios, such as:

  • Chatbots: Ingest text data to improve chatbot responses through semantic understanding.
  • Content Recommendation: Use text embeddings to recommend similar articles or products based on semantic similarity.
  • Sentiment Analysis: Generate embeddings for text data to support sentiment analysis and natural language understanding applications.
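As an illustration of the content recommendation use case, the Python sketch below picks the item whose embedding is closest to a given item. The three-dimensional vectors are made up for readability (real embeddings have far more dimensions), the helper names are hypothetical, and cosine similarity is an assumed comparison measure rather than something the connector prescribes.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(target: str, vectors: dict[str, list[float]]) -> str:
    """Return the key whose vector is closest to the target's vector."""
    candidates = (key for key in vectors if key != target)
    return max(candidates, key=lambda key: cosine_similarity(vectors[target], vectors[key]))


# Illustrative vectors only; in practice these would come from the embedding operation
vectors = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee grinder": [0.0, 0.1, 0.9],
}
match = most_similar("running shoes", vectors)
print(match)
```

For large catalogs, a vector database performs this nearest-neighbor search far more efficiently than the linear scan shown here.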