
How to upload files

AgentFlow supports multimodal messages — messages that contain images, audio, documents, and other binary files alongside text. This guide shows you how to upload a file, reference it in a message, and send it to the agent.

Prerequisites

  • A configured AgentFlowClient. See how-to/client/create-client.
  • The API server running with a media storage backend configured (check with getMultimodalConfig()).
  • The underlying LLM must support the file type you are uploading (e.g. GPT-4V for images, Gemini for audio).

Step 1: Check the server's multimodal configuration

Before uploading, confirm the server accepts files and check the size limit:

const config = await client.getMultimodalConfig();
console.log('Storage backend:', config.data.media_storage_type); // 'local', 's3', etc.
console.log('Max size (MB):', config.data.media_max_size_mb); // e.g. 10
console.log('Document handling:', config.data.document_handling); // 'extract' or 'raw'

If media_storage_type is empty or the endpoint returns an error, media uploads are not configured on the server.
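
Because the config reports the limit in megabytes, a small pre-flight check can reject oversized files before any bytes are sent. The sketch below is illustrative only: fitsSizeLimit is a hypothetical helper, not part of the client, and it assumes the server counts 1 MB as 1024 × 1024 bytes. Adjust the constant if your server uses decimal megabytes.

```typescript
// Hypothetical helper, not part of @10xscale/agentflow-client.
// Assumption: the server treats 1 MB as 1024 * 1024 bytes.
const BYTES_PER_MB = 1024 * 1024;

function fitsSizeLimit(sizeBytes: number, maxSizeMb: number): boolean {
  // A missing or non-positive limit is treated as "unknown": the check
  // passes and the server remains the source of truth.
  if (!Number.isFinite(maxSizeMb) || maxSizeMb <= 0) return true;
  return sizeBytes <= maxSizeMb * BYTES_PER_MB;
}
```

In a browser this might be called as fitsSizeLimit(file.size, config.data.media_max_size_mb) before uploadFile(); the server can still answer with status 413 if its accounting differs.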


Step 2: Upload a file

From a browser file input

const input = document.querySelector<HTMLInputElement>('input[type=file]')!;
const file = input.files![0];

const upload = await client.uploadFile(file);
console.log('File ID:', upload.data.file_id);
console.log('Access URL:', upload.data.url);
console.log('MIME type:', upload.data.mime_type);

From a Blob (Node.js or programmatic)

import { readFileSync } from 'fs';

const buffer = readFileSync('./diagram.png');
const blob = new Blob([buffer], { type: 'image/png' });

const upload = await client.uploadFile({ data: blob, filename: 'diagram.png' });
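
Unlike a browser File, a Blob built from a buffer carries only the MIME type you pass it. As a rough sketch, the type can be derived from the filename extension. The helper name mimeTypeForFilename and the extension table are assumptions made for illustration; they are not part of the client, and the table covers only the common types listed later in this guide.

```typescript
// Hypothetical lookup, not part of the client library.
// A Map is used so that extensions such as "constructor" cannot hit
// inherited object properties.
const MIME_BY_EXTENSION = new Map<string, string>([
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp'],
  ['mp3', 'audio/mpeg'],
  ['wav', 'audio/wav'],
  ['ogg', 'audio/ogg'],
  ['mp4', 'video/mp4'],
  ['webm', 'video/webm'], // .webm can also hold audio only; video is assumed here
  ['pdf', 'application/pdf'],
  ['txt', 'text/plain'],
  ['md', 'text/markdown'],
]);

const FALLBACK_MIME = 'application/octet-stream';

function mimeTypeForFilename(filename: string): string {
  const dot = filename.lastIndexOf('.');
  // No extension, a dotfile such as ".env", or a trailing dot: use the fallback.
  if (dot <= 0 || dot === filename.length - 1) return FALLBACK_MIME;
  const ext = filename.slice(dot + 1).toLowerCase();
  return MIME_BY_EXTENSION.get(ext) ?? FALLBACK_MIME;
}
```

The result can then be passed as the type option when constructing the Blob, for example new Blob([buffer], { type: mimeTypeForFilename('diagram.png') }).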

From a URL (fetch first)

const imageResponse = await fetch('https://example.com/photo.jpg');
const blob = await imageResponse.blob();

const upload = await client.uploadFile({ data: blob, filename: 'photo.jpg' });
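
When the source URL is not known in advance, the filename can be taken from the URL path instead of being hard-coded. This is a minimal sketch: filenameFromUrl is a hypothetical helper, not a client method, and it relies only on the standard URL class.

```typescript
// Hypothetical helper, not part of the client library.
// Throws a TypeError if `url` is not an absolute URL (behaviour of `new URL`).
function filenameFromUrl(url: string, fallback = 'download'): string {
  // Parsing with URL separates the path from the query string and fragment.
  const { pathname } = new URL(url);
  const lastSegment = pathname.split('/').filter(Boolean).pop();
  if (!lastSegment) return fallback;
  try {
    return decodeURIComponent(lastSegment);
  } catch {
    // Malformed percent-escape: keep the raw segment rather than failing.
    return lastSegment;
  }
}
```

The returned name can be used as the filename field passed to uploadFile().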

Step 3: Build a multimodal message

Use the file_id or url from the upload in a MediaRef inside the appropriate block:

Image message

import { Message, ImageBlock, TextBlock, MediaRef } from '@10xscale/agentflow-client';

const imageMsg = new Message('user', [
  new TextBlock('What is shown in this image?'),
  new ImageBlock(
    new MediaRef('url', upload.data.url),
    'Uploaded image' // alt text
  ),
]);

Document message (PDF, DOCX)

import { DocumentBlock } from '@10xscale/agentflow-client';

const docMsg = new Message('user', [
  new TextBlock('Summarise the key points from this document.'),
  new DocumentBlock(
    new MediaRef('url', upload.data.url)
  ),
]);

If document_handling is 'extract', the server extracts text from the document at upload time and returns it in upload.data.extracted_text.

Audio message

import { AudioBlock } from '@10xscale/agentflow-client';

const audioMsg = new Message('user', [
  new TextBlock('Please transcribe and summarise this audio.'),
  new AudioBlock(
    new MediaRef('url', upload.data.url)
  ),
]);
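
If your application accepts arbitrary uploads, the choice between ImageBlock, AudioBlock, and DocumentBlock can be driven by the MIME type returned from the upload. The following sketch only decides which kind of block to build. blockKindForMimeType and the BlockKind type are hypothetical names, and the mapping of MIME types to block kinds is an assumption based on the three examples above, not a rule taken from the library.

```typescript
// Hypothetical helper; the mapping below is an assumption for illustration.
type BlockKind = 'image' | 'audio' | 'document';

const DOCX_MIME =
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document';

function blockKindForMimeType(mimeType: string): BlockKind | null {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const semi = mimeType.indexOf(';');
  const base = (semi === -1 ? mimeType : mimeType.slice(0, semi))
    .trim()
    .toLowerCase();

  if (base.startsWith('image/')) return 'image';
  if (base.startsWith('audio/')) return 'audio';
  if (base === 'application/pdf' || base === DOCX_MIME || base.startsWith('text/')) {
    return 'document';
  }
  // Anything else (for example video/*) has no block shown in this guide.
  return null;
}
```

A caller could switch on the result of blockKindForMimeType(upload.data.mime_type) to construct the matching block, and reject the file when the result is null.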

Step 4: Send the message to the agent

const result = await client.invoke([imageMsg]);

const answer = result.messages
  .filter(m => m.role === 'assistant')
  .flatMap(m => m.content)
  .filter(b => b.type === 'text')
  .map(b => (b as any).text as string)
  .join('');

console.log('Agent says:', answer);
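
The cast to any in the extraction above can be avoided with a type guard. The sketch below uses minimal structural types (MessageLike, ContentBlockLike, TextBlockLike) that are assumptions standing in for the client's real types, which may be shaped differently; only the role, content, type, and text fields used in this guide are modelled.

```typescript
// Minimal structural stand-ins for this sketch; the client's own types may differ.
interface ContentBlockLike {
  type: string;
}

interface TextBlockLike extends ContentBlockLike {
  type: 'text';
  text: string;
}

interface MessageLike {
  role: string;
  content: ContentBlockLike[];
}

// Type guard: narrows a block to a text block only if it really carries a string.
function isTextBlock(block: ContentBlockLike): block is TextBlockLike {
  return block.type === 'text' && typeof (block as TextBlockLike).text === 'string';
}

// Concatenates the text of every assistant message, skipping non-text blocks.
function assistantText(messages: MessageLike[]): string {
  return messages
    .filter((m) => m.role === 'assistant')
    .flatMap((m) => m.content)
    .filter(isTextBlock)
    .map((b) => b.text)
    .join('');
}
```

If the result returned by invoke() is structurally compatible, assistantText(result.messages) would replace the chained expression shown in this step.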

Step 5: Verify file metadata

Check what the server knows about an uploaded file:

const info = await client.getFileInfo(upload.data.file_id);
console.log(info.data.mime_type);
console.log(info.data.size_bytes);
console.log(info.data.extracted_text); // Non-null for documents with text extraction

Step 6: Get a fresh access URL

For cloud-backed storage (S3, GCS, Azure Blob), the initial URL is a signed URL that expires. Refresh it before rendering any file that a user might open long after it was uploaded:

const urlInfo = await client.getFileAccessUrl(upload.data.file_id);

// Check if the URL is still valid
const isExpired = urlInfo.data.expires_at
  ? Date.now() > urlInfo.data.expires_at
  : false;

const freshUrl = isExpired
  ? (await client.getFileAccessUrl(upload.data.file_id)).data.url
  : urlInfo.data.url;

renderImage(freshUrl);
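
A URL that expires a second after the check is of little use, so it can help to treat URLs as expired slightly early. The sketch below is a hypothetical helper, not a client method. It assumes expires_at is a Unix timestamp in milliseconds (which the comparison with Date.now() above implies) and that a missing value means the URL does not expire; verify both against your server's responses.

```typescript
// Hypothetical helper, not part of the client library.
// Assumption: expiresAt is a Unix timestamp in milliseconds; null or
// undefined means the URL never expires (e.g. local storage).
function isUrlExpired(
  expiresAt: number | null | undefined,
  nowMs: number = Date.now(),
  safetyMarginMs: number = 30000,
): boolean {
  if (expiresAt == null) return false;
  // Count the URL as expired once fewer than safetyMarginMs remain.
  return nowMs + safetyMarginMs >= expiresAt;
}
```

With this in place, the refresh call is needed only when isUrlExpired(urlInfo.data.expires_at) returns true. The 30-second default margin is arbitrary and should be tuned to how long rendering or downloading takes.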

Step 7: Download a file

Retrieve the raw file bytes as a Blob:

const blob = await client.getFile(upload.data.file_id);

// Create a download link in the browser
const objUrl = URL.createObjectURL(blob);
const link = document.createElement('a');
link.href = objUrl;
link.download = 'downloaded-file';
link.click();
URL.revokeObjectURL(objUrl);

Complete end-to-end example

import {
  AgentFlowClient,
  Message,
  ImageBlock,
  TextBlock,
  MediaRef,
} from '@10xscale/agentflow-client';

const client = new AgentFlowClient({ baseUrl: 'http://localhost:8000' });

async function describeImage(imageFile: File): Promise<string> {
  // 1. Upload
  const upload = await client.uploadFile(imageFile);

  // 2. Build message
  const msg = new Message('user', [
    new TextBlock('Describe this image in detail.'),
    new ImageBlock(
      new MediaRef('url', upload.data.url),
      imageFile.name
    ),
  ]);

  // 3. Invoke
  const result = await client.invoke([msg]);

  // 4. Extract text response
  return result.messages
    .filter(m => m.role === 'assistant')
    .flatMap(m => m.content)
    .filter(b => b.type === 'text')
    .map(b => (b as any).text as string)
    .join('');
}

Supported file types

The server accepts any MIME type permitted by the configured storage backend. Common types:

| Category  | MIME types                                   |
|-----------|----------------------------------------------|
| Images    | image/jpeg, image/png, image/gif, image/webp |
| Audio     | audio/mpeg, audio/wav, audio/ogg, audio/webm |
| Video     | video/mp4, video/webm                        |
| Documents | application/pdf, text/plain, text/markdown   |

The LLM also determines which types it can process. Check your model's documentation for supported media types.


Common errors

| Error | Cause | Fix |
|-------|-------|-----|
| AgentFlowError status 413 | File exceeds media_max_size_mb. | Compress the file or raise the server limit. |
| AgentFlowError status 415 | Unsupported MIME type. | Use a supported file type (see table above). |
| AgentFlowError status 404 on download | file_id not found (deleted or wrong). | Re-upload the file. |
| Signed URL expired | direct_url_expires_at is in the past. | Call getFileAccessUrl(file_id) to get a fresh URL. |
| extracted_text is null | Document handling is 'raw' or the file has no extractable text. | Set document_handling: 'extract' on the server, or pass the text manually in the message. |
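
The status-based rows above can be turned into user-facing hints with a small lookup. This is a sketch under assumptions: uploadErrorHint is a hypothetical function, and it takes the numeric HTTP status directly because this guide does not show which property of AgentFlowError carries it.

```typescript
// Hypothetical helper, not part of the client library.
// Takes the HTTP status code; how to read it from AgentFlowError is
// not shown in this guide, so the caller must extract it.
function uploadErrorHint(status: number): string {
  switch (status) {
    case 413:
      return 'File exceeds media_max_size_mb. Compress the file or raise the server limit.';
    case 415:
      return 'Unsupported MIME type. Use a supported file type.';
    case 404:
      return 'file_id not found. Re-upload the file.';
    default:
      return `Unexpected error (status ${status}).`;
  }
}
```

A catch block around uploadFile() or getFile() could pass the error's status code to this function and show the returned hint to the user.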

What you learned

  • uploadFile() accepts File, Blob, or { data: Blob; filename: string }.
  • Reference uploaded files in messages via MediaRef('url', upload.data.url).
  • For cloud-backed storage, refresh signed URLs with getFileAccessUrl() before rendering.
  • getFileInfo() returns metadata including extracted text from documents.

Next step

See how-to/client/register-remote-tools to learn how to register browser-side functions that the agent can call remotely.