Media Operations

Supported Storage Options

Use the media operations to load a single media file or a list of media files. The media is first loaded and then optionally processed (e.g., an image is resized).

These operations are meant to be followed by an [Embedding] Generate from media operation. Their output payload can be passed to that operation without any transformation.

Media | Load single

The [Media] Load single operation loads a media file and optionally processes it (for example, to resize an image).

How to Use

Add Media to Store

The [Media] Load single operation should be followed by an [Embedding] Generate from media operation. The output payload is ready to be used by the [Embedding] Generate from media operation without any transformation.

Input Fields

Module Configuration

This refers to the MuleSoft Vectors Storage Configuration set up in the Getting Started section.

Media Fields

  • Context Path: Behaviour changes based on storage type.

    • Local: Contains the path of the media file to load. Ensure the file path is accessible. You can also use a DataWeave expression for this field, e.g., mule.home ++ "/apps/" ++ app.name ++ "/".
    • AZURE_BLOB: Contains the container name and blob item name in the format <container-name>/<blob-item-name> (e.g., ms-vectors-container/invoicesample.pdf, ms-vectors-container/folder/invoicesample.pdf, ...).
    • S3: Contains the AWS S3 bucket and object key in the format s3://<s3-bucket>/<s3-object-key> (e.g., s3://ms-vectors-bucket/setup.adoc, s3://ms-vectors-bucket/folder/setup.adoc, ...).
  • Media Type: Contains the type of the media to be loaded and processed.

    • image: .png, .jpeg, .jpg, .gif, .bmp
  • Processor Settings:
    • Target Width (pixels): The width, in pixels, that the image is resized to.
    • Target Height (pixels): The height, in pixels, that the image is resized to.
    • Compression Quality: The compression quality for media (between 0.0 and 1.0, where 1.0 is highest quality).
    • Scale Strategy:
      • Fit (Default): Resizes the image to fit within the specified width and height while maintaining the aspect ratio. The image is padded with a background color to fit the specified width and height.
      • Fill: Resizes the image to cover the specified width and height while maintaining the aspect ratio. The parts of the image that overflow the target width and height are cropped.
      • Stretch: Resizes the image to exactly the specified width and height without maintaining the aspect ratio.
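
The difference between the three scale strategies can be sketched with plain arithmetic. The following Python snippet is only an illustration of the descriptions above, not the connector's implementation: the function name scaled_size, the lowercase strategy names, and the rounding behaviour are assumptions made for this example.

```python
def scaled_size(src_w, src_h, target_w, target_h, strategy="fit"):
    """Return the (width, height) the image is scaled to before any padding or cropping.

    Hypothetical helper for illustration; not part of the connector.
    """
    if strategy == "stretch":
        # Aspect ratio is ignored: the image takes the target size exactly.
        return target_w, target_h

    ratio_w = target_w / src_w
    ratio_h = target_h / src_h

    if strategy == "fit":
        # Smaller ratio: the whole image fits inside the target box,
        # and the remaining area is padded with a background color.
        scale = min(ratio_w, ratio_h)
    elif strategy == "fill":
        # Larger ratio: the image covers the whole target box,
        # and whatever overflows the box is cropped.
        scale = max(ratio_w, ratio_h)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return round(src_w * scale), round(src_h * scale)


# A 1024x768 source image with a 512x512 target:
print(scaled_size(1024, 768, 512, 512, "fit"))      # (512, 384), then padded to 512x512
print(scaled_size(1024, 768, 512, 512, "fill"))     # (683, 512), then cropped to 512x512
print(scaled_size(1024, 768, 512, 512, "stretch"))  # (512, 512)
```

With Fit the final image keeps all of the original content, while Fill trades some content at the edges for an image with no padding.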

XML Configuration

Below is the XML configuration for this operation:

<ms-vectors:media-load-single
  doc:name="[Local] [Media] Load single"
  doc:id="365baf3a-b43c-4b2d-ada6-d2e201ff82fb"
  contextPath="#[payload.contextPath]">
    <ms-vectors:media-processor-parameters>
        <ms-vectors:image-processor-parameters targetWidth="512" targetHeight="512" compressionQuality="1"/>
    </ms-vectors:media-processor-parameters>
</ms-vectors:media-load-single>

Output Fields

Payload

This operation responds with a JSON payload.

Example

Here is an example of the JSON output.

{
    "metadata": {
        "absolute_directory_path": "/Users/tbolis/Downloads",
        "source": "file:///Users/tbolis/Downloads/4866963-200.png",
        "media_type": "image",
        "mime_type": "image/png",
        "file_type": "png",
        "file_name": "4866963-200.png"
    },
    "base64Data": "iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAI..."
}
  • metadata: The metadata key-value pairs.
    • absolute_directory_path: The full path of the directory that contains the media file.
    • source: The source location of the media file, for example a file:// URI for local storage or the path set by a cloud storage service (e.g., Amazon S3).
    • media_type: The type of the media (e.g., image).
    • mime_type: The MIME type of the media.
    • file_type: The file/source type.
    • file_name: The name of the media file.
  • base64Data: The base64 encoded media data.
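
If the payload needs to be inspected outside the flow, the media bytes can be recovered by decoding the base64Data field. The Python snippet below is a minimal sketch under stated assumptions: the field names follow the example above, but the sample payload, the file name example.png, and the helper decode_media are hypothetical, and the base64Data value is a short placeholder rather than a real image.

```python
import base64
import json

# Hypothetical payload: same field names as the documented example,
# with an 8-byte PNG signature standing in for the real image data.
sample_payload = json.dumps({
    "metadata": {
        "media_type": "image",
        "mime_type": "image/png",
        "file_type": "png",
        "file_name": "example.png",
    },
    "base64Data": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
})


def decode_media(payload_json):
    """Return (file_name, mime_type, raw_bytes) from a single media payload."""
    payload = json.loads(payload_json)
    metadata = payload["metadata"]
    raw = base64.b64decode(payload["base64Data"])
    return metadata["file_name"], metadata["mime_type"], raw


name, mime, raw = decode_media(sample_payload)
print(name, mime, len(raw))  # example.png image/png 8
```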

Attributes

  • MediaResponseAttributes:
    • fileType: Contains the file type of the loaded media.
    • contextPath: The context path used to load the media. Its format depends on the storage type.
    • media_type: The type of the media (e.g., image).
    • mime_type: The media mime type.
    • url: The URL of the media file.

Media | Load list

The [Media] Load list operation loads a list of media files and optionally processes them (for example, to resize all images to the same size).

How to Use

Add Media Folder to Store

The [Media] Load list operation can be followed by a Batch Job, For Each, or Parallel For Each scope that contains an [Embedding] Generate from media operation. Each item in the output payload is ready to be used by the [Embedding] Generate from media operation without any transformation.

Input Fields

Module Configuration

This refers to the MuleSoft Vectors Storage Configuration set up in the Getting Started section.

Media Fields

  • Context Path: Behaviour changes based on storage type.

    • Local: Contains the path of the folder with the media files to load. Ensure the file path is accessible. You can also use a DataWeave expression for this field, e.g., mule.home ++ "/apps/" ++ app.name ++ "/".
    • AZURE_BLOB: Contains the container name and blob item name in the format <container-name>/<blob-item-name> (e.g., ms-vectors-container/invoicesample.pdf, ms-vectors-container/folder/invoicesample.pdf, ...).
    • S3: Contains the AWS S3 bucket and object key in the format s3://<s3-bucket>/<s3-object-key> (e.g., s3://ms-vectors-bucket/setup.adoc, s3://ms-vectors-bucket/folder/setup.adoc, ...).
  • Media Type: Contains the type of the media to be loaded and processed.

    • image: .png, .jpeg, .jpg, .gif, .bmp
  • Processor Settings:
    • Target Width (pixels): The width, in pixels, that each image is resized to.
    • Target Height (pixels): The height, in pixels, that each image is resized to.
    • Compression Quality: The compression quality for media (between 0.0 and 1.0, where 1.0 is highest quality).
    • Scale Strategy:
      • Fit (Default): Resizes the image to fit within the specified width and height while maintaining the aspect ratio. The image is padded with a background color to fit the specified width and height.
      • Fill: Resizes the image to cover the specified width and height while maintaining the aspect ratio. The parts of the image that overflow the target width and height are cropped.
      • Stretch: Resizes the image to exactly the specified width and height without maintaining the aspect ratio.
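
The Context Path formats listed above differ per storage type, which is easy to get wrong when the value is built dynamically. The Python snippet below is a rough sketch of how the three formats break down into their parts. The function parse_context_path and the returned dictionary keys are hypothetical and only mirror the formats described above; they are not part of the connector.

```python
def parse_context_path(context_path, storage_type):
    """Split a context path into its storage-specific parts (illustrative only)."""
    if storage_type == "S3":
        prefix = "s3://"
        if not context_path.startswith(prefix):
            raise ValueError("an S3 context path must start with s3://")
        # Everything up to the first "/" is the bucket, the rest is the object key.
        bucket, _, key = context_path[len(prefix):].partition("/")
        return {"bucket": bucket, "key": key}
    if storage_type == "AZURE_BLOB":
        # Everything up to the first "/" is the container, the rest is the blob item name.
        container, _, blob = context_path.partition("/")
        return {"container": container, "blob": blob}
    if storage_type == "Local":
        # Local paths are used as-is.
        return {"path": context_path}
    raise ValueError(f"unknown storage type: {storage_type}")


print(parse_context_path("s3://ms-vectors-bucket/folder/setup.adoc", "S3"))
print(parse_context_path("ms-vectors-container/folder/invoicesample.pdf", "AZURE_BLOB"))
```

Note that for both cloud storage types only the first path segment names the bucket or container; any further slashes belong to the object key or blob item name.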

XML Configuration

Below is the XML configuration for this operation:

<ms-vectors:media-load-list
  doc:name="[S3] [Media] Load list"
  doc:id="b1f7b1a0-3c67-46a5-b6fe-4b0d7e7eb8e9"
  config-ref="Storage_Config_Amazon_S3"
  contextPath="#[vars.contextPath]">
    <ms-vectors:media-processor-parameters>
        <ms-vectors:image-processor-parameters targetWidth="512" targetHeight="512" compressionQuality="1"/>
    </ms-vectors:media-processor-parameters>
</ms-vectors:media-load-list>

Output Fields

Payload

This operation responds with a JSON payload.

Example

Here is an example of the JSON output.

[
    {
        "metadata": {
            "absolute_directory_path": "/Users/tbolis/Downloads",
            "source": "file:///Users/tbolis/Downloads/4866963-200.png",
            "media_type": "image",
            "mime_type": "image/png",
            "file_type": "png",
            "file_name": "4866963-200.png"
        },
        "base64Data": "iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAI..."
    }
]
  • list-item (media):
    • metadata: The metadata key-value pairs.
      • absolute_directory_path: The full path of the directory that contains the media file.
      • source: The source location of the media file, for example a file:// URI for local storage or the path set by a cloud storage service (e.g., Amazon S3).
      • media_type: The type of the media (e.g., image).
      • mime_type: The MIME type of the media.
      • file_type: The file/source type.
      • file_name: The name of the media file.
    • base64Data: The base64 encoded media data.
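
Because the payload is a list, each item is typically handled one at a time, much like a For Each scope would do inside the flow. The Python snippet below is a minimal sketch of iterating over such a list and optionally filtering by MIME type. The sample payload, the file names, and the helper iter_media are hypothetical, and the base64Data values are placeholders rather than real images.

```python
import json

# Hypothetical list payload: same structure as the documented example,
# with placeholder base64Data values.
sample_payload = json.dumps([
    {
        "metadata": {"media_type": "image", "mime_type": "image/png", "file_name": "a.png"},
        "base64Data": "AAAA",
    },
    {
        "metadata": {"media_type": "image", "mime_type": "image/gif", "file_name": "b.gif"},
        "base64Data": "AAAA",
    },
])


def iter_media(payload_json, allowed_mime_types=None):
    """Yield media items one by one, optionally keeping only the given MIME types."""
    for item in json.loads(payload_json):
        mime_type = item["metadata"]["mime_type"]
        if allowed_mime_types is None or mime_type in allowed_mime_types:
            yield item


png_names = [item["metadata"]["file_name"] for item in iter_media(sample_payload, {"image/png"})]
all_names = [item["metadata"]["file_name"] for item in iter_media(sample_payload)]
print(png_names)  # ['a.png']
print(all_names)  # ['a.png', 'b.gif']
```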

Attributes

  • MediaResponseAttributes:
    • fileType: Contains the file type of the loaded media.
    • contextPath: The context path used to load the media. Its format depends on the storage type.
    • media_type: The type of the media (e.g., image).
    • mime_type: The media mime type.
    • url: The URL of the media file.