[Embedding] Operations
[Embedding] Adhoc File Query
The Embedding adhoc file query operation takes a document and a query, and ingests the document into the vector database. The output of this operation is a set of scores together with the document content that is most likely to answer the query. The content is first converted to its numeric representation in the vector database, and the scores are then computed from that representation.
Input Fields
Module Configuration
This refers to the Einstein AI configuration set up in the getting started section.
General Operation Fields
- Prompt: The query, prompt, or question to ask about the file.
- File Path: The full file path of the document to be ingested into the embedding store. Ensure the file path is accessible. You can also use a DataWeave expression for this field, such as mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf".
Additional Properties
- Model Name: The model name to be used (default is OpenAI Ada 002).
- File Type: The type of the document to be ingested into the embedding store. Currently, four file types are supported:
  - text: any type of text file (json, xml, txt, csv, etc.),
  - pdf: only system-generated,
  - csv: comma-separated values,
  - url: only a single URL supported.
- Option Type: This field defines how the document is going to be split prior to ingestion into the embedding/vector database.
XML Configuration
Below is the XML configuration for this operation:
<mac-einstein:embedding-adhoc-file-query
doc:name="Embedding adhoc file query"
doc:id="b68d63c5-9780-4d56-add0-1543a86f08d7"
config-ref="Einstein_AI"
prompt="#[payload.prompt]"
filePath="#[payload.filePath]"
/>
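As a minimal sketch of how this operation might sit inside a flow, the fragment below first builds a payload with the prompt and filePath fields that the expressions above read, then calls the operation and logs the result. The flow name, the sample question, and the logger step are illustrative assumptions and not part of the connector. Namespace declarations on the enclosing mule element are omitted.

```xml
<!-- Sketch only: the flow name and the sample question are hypothetical. -->
<flow name="adhoc-file-query-flow">
    <!-- Build the payload that the operation's expressions read from. -->
    <set-payload
        doc:name="Set prompt and file path"
        value='#[{
            prompt: "What is the refund policy?",
            filePath: mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf"
        }]' />

    <!-- Same attributes as the configuration shown above. -->
    <mac-einstein:embedding-adhoc-file-query
        doc:name="Embedding adhoc file query"
        config-ref="Einstein_AI"
        prompt="#[payload.prompt]"
        filePath="#[payload.filePath]" />

    <!-- Log the JSON payload returned by the operation. -->
    <logger level="INFO" message="#[payload]" doc:name="Log result" />
</flow>
```

Building the payload in a separate step keeps the operation's attributes identical to the reference configuration, so only the Set Payload step changes between use cases.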
Output Field
This operation responds with a JSON payload.
Example Use Cases for [Embedding] Adhoc File Query
The Embedding adhoc file query operation can be used in various scenarios, such as:
- Customer Service: Quickly find relevant information in large documents to respond to customer inquiries.
- Legal Teams: Search through legal documents to find references to specific cases or laws.
- Research: Extract relevant sections from academic papers or reports to support research findings.
[Embedding] Generate from File
The Embedding generate from file operation takes a document and ingests it into the vector database. The output of this operation is a numeric representation of the content.
Input Fields
Module Configuration
This refers to the Einstein AI configuration set up in the getting started section.
General Operation Fields
- File Path: The full file path of the document to be ingested into the embedding store. Ensure the file path is accessible. You can also use a DataWeave expression for this field, such as mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf".
Additional Properties
- Model Name: The model name to be used (default is OpenAI Ada 002).
- File Type: The type of the document to be ingested into the embedding store. Currently, four file types are supported:
  - text: any type of text file (json, xml, txt, csv, etc.),
  - pdf: only system-generated,
  - csv: comma-separated values,
  - url: only a single URL supported.
- Option Type: This field defines how the document is going to be split prior to ingestion into the embedding/vector database.
XML Configuration
Below is the XML configuration for this operation:
<mac-einstein:embedding-generate-from-file
doc:name="Embedding generate from file"
doc:id="d39a9c2d-25b2-4d56-add0-1543a86f08d7"
config-ref="Einstein_AI"
filePath="#[payload.filePath]"
/>
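The File Path field also accepts a DataWeave expression directly. The fragment below is a hedged sketch of that variant: it computes the path inline with the expression from the field description, then stores the response in a variable for later steps. The flow name and the variable name fileEmbedding are hypothetical, and namespace declarations on the enclosing mule element are omitted.

```xml
<!-- Sketch only: the flow name and variable name are hypothetical. -->
<flow name="generate-embedding-from-file-flow">
    <!-- The file path is computed inline with a DataWeave expression. -->
    <mac-einstein:embedding-generate-from-file
        doc:name="Embedding generate from file"
        config-ref="Einstein_AI"
        filePath='#[mule.home ++ "/apps/" ++ app.name ++ "/customer-service.pdf"]' />

    <!-- Keep the JSON response available to later steps in the flow. -->
    <set-variable
        doc:name="Store embedding"
        variableName="fileEmbedding"
        value="#[payload]" />
</flow>
```

The attribute value is wrapped in single quotes so that the double-quoted string literals inside the DataWeave expression do not need to be escaped.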
Output Field
This operation responds with a JSON payload.
Example Use Cases for [Embedding] Generate from File
The Embedding generate from file operation can be used in various scenarios, such as:
- Document Management: Create searchable embeddings for documents to improve retrieval accuracy.
- Knowledge Bases: Ingest documents into a vector database to enhance knowledge base systems with semantic search capabilities.
- Content Analysis: Generate embeddings for content analysis, clustering, or categorization in various domains.
[Embedding] Generate from Text
The Embedding generate from text operation takes text and ingests it into the vector database. The output of this operation is a numeric representation of the content.
Input Fields
Module Configuration
This refers to the Einstein AI configuration set up in the getting started section.
General Operation Fields
- Text: This field contains the text to be ingested into the vector database.
Additional Properties
- Model Name: The model name to be used (default is OpenAI Ada 002).
XML Configuration
Below is the XML configuration for this operation:
<mac-einstein:embedding-generate-from-text
doc:name="Embedding generate from text"
doc:id="2e7d40a6-5c4c-41e2-851e-22849bee9752"
config-ref="Einstein_AI"
text="#[payload.text]"
/>
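The expression payload.text assumes the incoming payload carries a text field. One way this could be wired up, shown here only as a sketch, is an HTTP listener that receives a JSON body and passes it to the operation. The listener configuration name HTTP_Listener_config, the /embed path, and the flow name are hypothetical, and namespace declarations on the enclosing mule element are omitted.

```xml
<!-- Sketch only: HTTP_Listener_config, the /embed path, and the flow name are hypothetical. -->
<flow name="generate-embedding-from-text-flow">
    <!-- Accepts a JSON request body such as {"text": "Some text to embed"}. -->
    <http:listener
        doc:name="Listener"
        config-ref="HTTP_Listener_config"
        path="/embed" />

    <!-- Same attributes as the configuration shown above. -->
    <mac-einstein:embedding-generate-from-text
        doc:name="Embedding generate from text"
        config-ref="Einstein_AI"
        text="#[payload.text]" />
</flow>
```

With no explicit response configured, the listener would normally return the flow's final payload, which here is the JSON produced by the operation.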
Output Field
This operation responds with a JSON payload.
Example Use Cases for [Embedding] Generate from Text
The Embedding generate from text operation can be used in various scenarios, such as:
- Chatbots: Ingest text data to improve chatbot responses through semantic understanding.
- Content Recommendation: Use text embeddings to recommend similar articles or products based on semantic similarity.
- Sentiment Analysis: Generate embeddings for text data to support sentiment analysis and natural language understanding applications.