VecML File System RAG API

Introduction

The VecML File System RAG API provides a hierarchical, file system-based approach to document management and RAG queries. Upload documents to your personal file system, organize them in folders, and query them using natural language. Everything is synchronized with the chat.vecml.com AI Knowledge Hub, where all of your operations can be visualized.

File System

Organize files in folders with a familiar experience, much like Finder or Windows Explorer

Flexible Queries

Query individual files or entire folders, in any combination

Auto Synchronization

Files are automatically synchronized to the chat.vecml.com AI Knowledge Hub

Before You Start

Get Your API Key

First, you need to obtain an API key from VecML:

  1. Sign up for a new account or log in to your existing account at https://account.vecml.com
  2. Visit https://account.vecml.com/user-api-keys
  3. Generate a new API key for your application and save it in a secure location

File System

Base URL

All API requests should be made to the following base URL:

https://filesystem.vecml.com/api

Upload File

Upload a file to your file system. The file will be automatically indexed for RAG queries by default, and will be synchronized to the chat.vecml.com AI Knowledge Hub.

The following sections cover uploading a single file, multiple files, and a folder with its structure preserved.

Endpoint

POST /upload

Input (multipart/form-data)

  • file (file, required): The file to upload
  • target_path (string, optional): Target directory path (default: "/" for root). Directories are created automatically if they don't exist.
  • use_mm (boolean, optional): Enable multimodal indexing for images/figures (default: false)
  • auto_index (boolean, optional): Automatically index the file after upload (default: true)

Supported File Types & Size Limits

Documents
  • PDF: up to 100MB
  • DOCX: up to 10MB
  • DOC: up to 10MB
  • PPTX: up to 25MB
Text Files
  • TXT: up to 10MB
  • CSV: up to 10MB
  • JSON/JSONL: up to 10MB
  • Markdown: up to 10MB
Images
  • PNG, JPEG, GIF: up to 10MB
  • BMP, WebP: up to 10MB
  • TIFF: up to 10MB
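
The limits above can be pre-checked on the client before uploading, avoiding a round trip for files the server would reject. A minimal sketch (the extension-to-limit map simply mirrors the table above; the helper name is our own):

```python
import os

# Size limits per file extension, mirroring the table above (in MB)
SIZE_LIMITS_MB = {
    ".pdf": 100, ".docx": 10, ".doc": 10, ".pptx": 25,
    ".txt": 10, ".csv": 10, ".json": 10, ".jsonl": 10, ".md": 10,
    ".png": 10, ".jpg": 10, ".jpeg": 10, ".gif": 10,
    ".bmp": 10, ".webp": 10, ".tiff": 10,
}

def check_uploadable(filepath):
    """Raise ValueError if the file type is unsupported or the file is too large."""
    ext = os.path.splitext(filepath)[1].lower()
    limit_mb = SIZE_LIMITS_MB.get(ext)
    if limit_mb is None:
        raise ValueError(f"Unsupported file type: {ext}")
    size_mb = os.path.getsize(filepath) / (1024 * 1024)
    if size_mb > limit_mb:
        raise ValueError(f"{filepath} is {size_mb:.1f}MB, above the {limit_mb}MB limit")
```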

Output

  • success (boolean): Whether the upload was successful
  • message (string): Status message
  • file (object): File information including:
    • file_id (string): Unique identifier for the uploaded file (use this for /force-reindex and other operations)
    • filename (string): Name of the file
    • path (string): Relative path in your file system
    • size_bytes (integer): File size in bytes
    • mime_type (string): Detected MIME type
    • is_indexed (boolean): Whether the file is indexed for RAG
    • is_indexing (boolean): Whether the file is currently being indexed

Uploading a Single File

Python Example

Python
import requests

# Get your API key from https://account.vecml.com/user-api-keys
api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"

headers = {
    "X-API-Key": api_key
}

# Upload a file to your file system
files = {
    'file': ('document.pdf', open('path/to/document.pdf', 'rb'), 'application/pdf')
}
data = {
    'target_path': '/my_documents',  # Target directory (will be created if not exists)
    'use_mm': 'false',               # Multimodal indexing for images/figures (default: false)
    'auto_index': 'true'             # Automatically index after upload
}

response = requests.post(
    f"{base_url}/upload",
    headers=headers,
    files=files,
    data=data
)

result = response.json()
print(f"File ID: {result['file']['file_id']}")
print(f"Uploaded: {result['file']['filename']}")
print(f"Path: {result['file']['path']}")
print(f"Indexed: {result['file']['is_indexed']}")

cURL Example

Bash
# Upload a file to your file system
curl -X POST "https://filesystem.vecml.com/api/upload" \
  -H "X-API-Key: your_api_key_here" \
  -F "file=@/path/to/document.pdf" \
  -F "target_path=/my_documents" \
  -F "use_mm=false" \
  -F "auto_index=true"

Uploading Multiple Files

To upload multiple files, make separate API calls for each file. For best performance:

⚡ Best Practices
  • Concurrency: Limit to 5 concurrent uploads to avoid overwhelming the server
  • Timeout: Set timeout to at least 1800 seconds (30 minutes) for large files
  • Error handling: Implement retry logic for failed uploads
  • Progress tracking: Track successful/failed uploads for user feedback

Python Example (Concurrent)

Python
import requests
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

# List of files to upload
files_to_upload = [
    "document1.pdf",
    "document2.pdf", 
    "report.docx",
    "data.csv"
]

def upload_file(filepath, target_path="/my_documents"):
    """Upload a single file with proper timeout"""
    with open(filepath, 'rb') as f:
        response = requests.post(
            f"{base_url}/upload",
            headers=headers,
            files={'file': (os.path.basename(filepath), f)},
            data={'target_path': target_path, 'auto_index': 'true'},
            timeout=1800  # 30 minutes timeout for large files
        )
    return filepath, response.json()

# Upload with max 5 concurrent connections
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {executor.submit(upload_file, f): f for f in files_to_upload}
    
    for future in as_completed(futures):
        filepath, result = future.result()
        if result.get('success'):
            file_id = result['file']['file_id']
            print(f"✓ Uploaded: {filepath} (file_id: {file_id})")
        else:
            print(f"✗ Failed: {filepath} - {result.get('detail')}")
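
The retry logic recommended in the best practices is not shown in the concurrent example above. One way to add it is a thin wrapper with linear backoff (a sketch; the wrapper name and the choice to retry only on requests network errors are our assumptions):

```python
import time
import requests

def upload_with_retry(upload_fn, filepath, max_retries=3, backoff_seconds=5):
    """Call upload_fn(filepath), retrying on network errors with linear backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return upload_fn(filepath)
        except requests.RequestException:
            if attempt == max_retries:
                raise
            time.sleep(backoff_seconds * attempt)  # back off 5s, 10s, ...
```

It can wrap the `upload_file` function from the example above, e.g. `executor.submit(upload_with_retry, upload_file, f)`.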

Bash Example (Concurrent)

Bash
# Upload multiple files using a bash loop
# Recommended: max 5 concurrent uploads, 1800s timeout

FILES=("doc1.pdf" "doc2.pdf" "report.docx")

for file in "${FILES[@]}"; do
  curl -X POST "https://filesystem.vecml.com/api/upload" \
    -H "X-API-Key: your_api_key_here" \
    -F "file=@${file}" \
    -F "target_path=/my_documents" \
    -F "auto_index=true" \
    --max-time 1800 &
  
  # Limit to 5 concurrent uploads
  if [[ $(jobs -r -p | wc -l) -ge 5 ]]; then
    wait -n
  fi
done
wait
echo "All uploads complete!"

Uploading a Folder (with Structure)

To upload a folder while preserving its directory structure, iterate through all files and use the target_path parameter to specify each file's destination directory. Directories are created automatically if they don't exist.

📁 How It Works
  1. Walk through your local folder to collect all files
  2. For each file, calculate its relative path within the folder
  3. Upload each file with target_path set to the remote directory path
  4. The server automatically creates any missing directories

Python Example (Folder Upload)

Python
import requests
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

def upload_folder(local_folder, remote_base_path="/"):
    """
    Upload a local folder while preserving directory structure.
    The target_path parameter auto-creates directories if they don't exist.
    """
    files_to_upload = []
    
    # Collect all files with their remote paths
    for root, dirs, files in os.walk(local_folder):
        # Calculate relative path from local_folder
        rel_path = os.path.relpath(root, local_folder)
        
        if rel_path == ".":
            remote_dir = remote_base_path.rstrip("/")
        else:
            remote_dir = f"{remote_base_path.rstrip('/')}/{rel_path}".replace("\\", "/")
        
        for filename in files:
            local_file = os.path.join(root, filename)
            files_to_upload.append((local_file, remote_dir, filename))
    
    def upload_single(args):
        local_file, remote_dir, filename = args
        with open(local_file, 'rb') as f:
            response = requests.post(
                f"{base_url}/upload",
                headers=headers,
                files={'file': (filename, f)},
                data={'target_path': remote_dir, 'auto_index': 'true'},
                timeout=1800  # 30 minutes timeout
            )
        return f"{remote_dir}/{filename}", response.json()
    
    # Upload with max 5 concurrent connections
    results = {"success": 0, "failed": 0}
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = [executor.submit(upload_single, args) for args in files_to_upload]
        
        for future in as_completed(futures):
            path, result = future.result()
            if result.get('success'):
                results["success"] += 1
                file_id = result['file']['file_id']
                print(f"✓ {path} (file_id: {file_id})")
            else:
                results["failed"] += 1
                print(f"✗ {path}: {result.get('detail')}")
    
    print(f"\nDone! {results['success']} succeeded, {results['failed']} failed")

# Usage: Upload local "my_project" folder to remote "/projects/my_project"
upload_folder("./my_project", "/projects/my_project")

Create Directory

Create a directory (or nested directories) in your file system. This works like mkdir -p: all intermediate directories are created automatically if they don't exist. If the directory already exists, the call returns success without error.

Endpoint

POST /mkdir

Input (JSON Body)

  • path (string, required): Directory path to create (e.g., "/my_folder/nested/subfolder")

Output

  • success (boolean): Whether the operation was successful
  • message (string): Success or error message
  • directory (object): Created directory info (file_id, filename, path, is_directory)

Usage Notes

  • Useful for pre-creating folder structure before uploading files
  • The /upload endpoint also auto-creates directories if target_path doesn't exist
  • Idempotent: calling with the same path multiple times is safe

Python Example

Python
# Create a directory (supports nested paths)
import requests
import json

response = requests.post(
    f"{base_url}/mkdir",
    headers={**headers, "Content-Type": "application/json"},
    data=json.dumps({
        "path": "/my_folder/nested/subfolder"  # Creates all intermediate directories
    })
)

result = response.json()
print(f"Directory created: {result['directory']['path']}")

cURL Example

Bash
# Create a directory (supports nested paths like mkdir -p)
curl -X POST "https://filesystem.vecml.com/api/mkdir" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"path": "/my_folder/nested/subfolder"}'

List Files

List files and folders in a directory with sorting and pagination support. Listings are synchronized with the chat.vecml.com AI Knowledge Hub.

Endpoint

GET /list

Input (Query Parameters)

  • path (string, optional): Directory path to list (default: "/" for root)
  • recursive (boolean, optional): Include subdirectories (default: false)
  • sort_by (string, optional): Sort field — name, updated_at, size, type, or status (default: "name")
  • sort_dir (string, optional): Sort direction — asc or desc (default: "asc")
  • offset (integer, optional): Number of items to skip for pagination (default: 0)
  • limit (integer, optional): Maximum number of items to return, 1–500 (default: 100)

Output

  • success (boolean): Whether the operation was successful
  • path (string): The listed directory path
  • files (array): List of files and folders with metadata (filename, path, size, is_directory, is_indexed, is_indexing)
  • total_count (integer): Total number of items matching the query
  • offset (integer): Current offset
  • limit (integer): Current limit
  • has_more (boolean): Whether there are more items beyond this page

Tip: Directories are always listed first, then files. Both groups are sorted independently by the selected field.

Python Example

Python
# List files in a directory with sorting and pagination
response = requests.get(
    f"{base_url}/list",
    headers=headers,
    params={
        "path": "/my_documents",  # Directory path (use "/" for root)
        "recursive": False,       # Set to True to include subdirectories
        "sort_by": "name",        # Sort by: name, updated_at, size, type, status
        "sort_dir": "asc",        # Sort direction: asc or desc
        "offset": 0,              # Skip N items (for pagination)
        "limit": 100              # Max items to return (1-500)
    }
)

result = response.json()
print(f"Path: {result['path']}")
print(f"Total: {result['total_count']} items (showing {len(result['files'])})")
print(f"Has more: {result['has_more']}")
for file_info in result['files']:
    file_type = "DIR" if file_info['is_directory'] else "FILE"
    status = "indexing" if file_info['is_indexing'] else ("indexed" if file_info['is_indexed'] else "not indexed")
    print(f"  [{file_type}] {file_info['filename']} - {status}")

cURL Example

Bash
# List files with sorting and pagination
curl -X GET "https://filesystem.vecml.com/api/list?path=/my_documents&sort_by=name&sort_dir=asc&offset=0&limit=100" \
  -H "X-API-Key: your_api_key_here"
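
For directories with more than one page of items, the offset, limit, and has_more fields combine naturally into a pagination loop. A sketch, reusing the base_url and headers from the earlier examples (the helper name is our own):

```python
import requests

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

def list_all(path="/", page_size=100):
    """Page through /list until has_more is False, collecting every item."""
    items, offset = [], 0
    while True:
        response = requests.get(
            f"{base_url}/list",
            headers=headers,
            params={"path": path, "offset": offset, "limit": page_size}
        )
        result = response.json()
        items.extend(result['files'])
        if not result['has_more']:
            return items
        offset += page_size
```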

Search Files

Search for files by name using fuzzy matching. Supports both CJK (Chinese, Japanese, Korean) substring search and similarity-based search for other languages.

Endpoint

GET /search

Input (Query Parameters)

  • query (string, required): Search query to match against filenames
  • path (string, optional): Directory scope to search within (default: "/" for all files)
  • limit (integer, optional): Maximum number of results, 1–100 (default: 20)

Output

  • success (boolean): Whether the search was successful
  • query (string): The search query used
  • results (array): List of matching files, each with:
    • file_id, filename, path, size_bytes, mime_type
    • is_directory, is_indexed, is_indexing
    • similarity_score (float): How closely the filename matches (0–1)

Tip: Results are sorted by similarity score (highest first). Use the path parameter to narrow the search to a specific folder.

Python Example

Python
# Search for files by name
response = requests.get(
    f"{base_url}/search",
    headers=headers,
    params={
        "query": "report",           # Search query (matches filename)
        "path": "/my_documents",     # Directory scope (default: "/" for all)
        "limit": 20                  # Max results (1-100, default: 20)
    }
)

result = response.json()
print(f"Search: '{result['query']}'")
print(f"Found {len(result['results'])} results:")
for item in result['results']:
    print(f"  {item['filename']} (score: {item['similarity_score']}, path: {item['path']})")

cURL Example

Bash
# Search for files by name
curl -X GET "https://filesystem.vecml.com/api/search?query=report&path=/my_documents&limit=20" \
  -H "X-API-Key: your_api_key_here"

Get File Info

Get detailed information about a specific file or folder, including indexing status.

Endpoint

GET /file

Input (Query Parameters)

  • path (string, required): Relative path of the file or folder

Output (for files)

  • file_id (string): Unique identifier
  • filename (string): Name of the file
  • path (string): Relative path
  • size_bytes (integer): File size in bytes
  • is_directory (boolean): Whether it's a directory
  • is_indexed (boolean): Whether it's indexed for RAG
  • is_indexing (boolean): Whether it's currently being indexed
  • created_at, updated_at (string): Timestamps
  • indexing_status: null for files

Output (for directories)

In addition to the fields above, directories include an indexing_status object:

  • indexing_status.total_files (integer): Total number of files under this directory
  • indexing_status.indexed_files (integer): Number of files that are indexed
  • indexing_status.indexing_files (integer): Number of files currently being indexed
  • indexing_status.unindexed_files (integer): Number of files not yet indexed
  • indexing_status.is_complete (boolean): Whether all files are indexed

File Info Example

Python Example (File)

Python
# Get information about a specific file
response = requests.get(
    f"{base_url}/file",
    headers=headers,
    params={"path": "/my_documents/document.pdf"}
)

result = response.json()
print(f"Filename: {result['filename']}")
print(f"Path: {result['path']}")
print(f"Size: {result['size_bytes']} bytes")
print(f"Is Directory: {result['is_directory']}")
print(f"Is Indexed: {result['is_indexed']}")
print(f"Is Indexing: {result['is_indexing']}")

cURL Example

Bash
# Get file information
curl -X GET "https://filesystem.vecml.com/api/file?path=/my_documents/document.pdf" \
  -H "X-API-Key: your_api_key_here"

# Get directory info (includes indexing_status summary)
curl -X GET "https://filesystem.vecml.com/api/file?path=/my_documents" \
  -H "X-API-Key: your_api_key_here"

Directory Indexing Status Example

Python Example (Directory)

Python
# Get directory info with indexing status summary
response = requests.get(
    f"{base_url}/file",
    headers=headers,
    params={"path": "/my_documents"}
)

result = response.json()
print(f"Directory: {result['filename']}")
status = result['indexing_status']
print(f"Total files: {status['total_files']}")
print(f"Indexed: {status['indexed_files']}")
print(f"Indexing: {status['indexing_files']}")
print(f"Unindexed: {status['unindexed_files']}")
print(f"Complete: {status['is_complete']}")

Download File

Download a file or directory from your file system. Single files are returned directly, while directories are automatically packaged as zip archives.

Endpoint

GET /download

Input (Query Parameters)

  • path (string, required): Relative path of the file or directory to download

Response

  • For single files: Returns the file with Content-Disposition: attachment header
  • For directories: Returns a zip archive containing all files and subdirectories
  • The filename is automatically set in the response headers
  • Appropriate Content-Type header is included (e.g., application/pdf, application/zip)

Note: When downloading directories, the zip file is created on-the-fly and includes the complete directory structure. Large directories may take a moment to compress.

Python Example

Python
# Download a single file
response = requests.get(
    f"{base_url}/download",
    headers=headers,
    params={"path": "/my_documents/document.pdf"}
)

# Save the downloaded file
with open("downloaded_document.pdf", "wb") as f:
    f.write(response.content)
print("File downloaded successfully!")

# Download a directory (returns as zip)
response = requests.get(
    f"{base_url}/download",
    headers=headers,
    params={"path": "/my_documents/"}
)

# Save the downloaded zip file
with open("my_documents.zip", "wb") as f:
    f.write(response.content)
print("Directory downloaded as zip successfully!")

cURL Example

Bash
# Download a single file
curl -X GET "https://filesystem.vecml.com/api/download?path=/my_documents/document.pdf" \
  -H "X-API-Key: your_api_key_here" \
  -o downloaded_document.pdf

# Download a directory (returns as zip)
curl -X GET "https://filesystem.vecml.com/api/download?path=/my_documents/" \
  -H "X-API-Key: your_api_key_here" \
  -o my_documents.zip
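
The Python example above buffers the whole response in memory. For a large directory zip, the response can instead be streamed to disk in chunks (a sketch, reusing base_url and headers from the earlier examples; the helper name and chunk size are our choices):

```python
import requests

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

def download_to_disk(remote_path, local_path, chunk_bytes=1 << 20):
    """Stream a file or directory zip to disk in 1 MB chunks."""
    with requests.get(f"{base_url}/download", headers=headers,
                      params={"path": remote_path}, stream=True) as response:
        response.raise_for_status()
        with open(local_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_bytes):
                f.write(chunk)
```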

Index File / Directory

Index an existing file or an entire directory for RAG queries. Use this if you uploaded files with auto_index=false. If a file is already indexed, it will be skipped. When a directory path is provided, all unindexed files within it are indexed recursively.

Endpoint

POST /index

Input (multipart/form-data)

  • path (string, required): Relative path of the file or directory to index
  • use_mm (boolean, optional): Enable multimodal indexing (default: false)

Output (Single File)

  • success (boolean): Whether indexing was successful
  • message (string): Status message

Output (Directory)

  • success (boolean): Whether the overall operation succeeded
  • message (string): Summary message
  • path (string): Directory path that was indexed
  • total_files (integer): Total unindexed files found
  • indexed_count (integer): Files successfully indexed
  • already_indexed_count (integer): Files that were already indexed
  • failed_count (integer): Files that failed to index
  • results (array): Per-file results, each with file_id, filename, path, success, message

Tip: Directory indexing runs files concurrently for faster processing. Use the GET /file endpoint on the directory path to check overall indexing progress.

Index a Single File

Python Example

Python
# Index a single file for RAG queries
response = requests.post(
    f"{base_url}/index",
    headers=headers,
    data={
        "path": "/my_documents/document.pdf",
        "use_mm": "false"  # Multimodal indexing for images/figures (default: false)
    }
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")

cURL Example

Bash
# Index a single file
curl -X POST "https://filesystem.vecml.com/api/index" \
  -H "X-API-Key: your_api_key_here" \
  -F "path=/my_documents/document.pdf" \
  -F "use_mm=false"

# Index an entire directory (recursive)
curl -X POST "https://filesystem.vecml.com/api/index" \
  -H "X-API-Key: your_api_key_here" \
  -F "path=/my_documents" \
  -F "use_mm=false"

Index an Entire Directory

Python Example (Directory)

Python
# Index all unindexed files in a directory (recursive)
response = requests.post(
    f"{base_url}/index",
    headers=headers,
    data={
        "path": "/my_documents",   # Pass a directory path
        "use_mm": "false"
    }
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Total files: {result['total_files']}")
print(f"Indexed: {result['indexed_count']}")
print(f"Already indexed: {result['already_indexed_count']}")
print(f"Failed: {result['failed_count']}")

# Per-file results
for r in result.get('results', []):
    status = "OK" if r['success'] else "FAIL"
    print(f"  [{status}] {r['filename']}: {r['message']}")
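
As the tip above suggests, overall progress of a directory indexing run can be polled through GET /file until indexing_status.is_complete turns true. A sketch (the helper name, poll interval, and timeout are our choices):

```python
import time
import requests

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

def wait_until_indexed(path, poll_seconds=10, timeout_seconds=1800):
    """Poll GET /file until indexing_status reports completion, or time out."""
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        response = requests.get(f"{base_url}/file", headers=headers, params={"path": path})
        status = response.json().get('indexing_status')
        if status and status['is_complete']:
            return True
        time.sleep(poll_seconds)
    return False
```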

Force Reindex File / Directory

Force reindex a file or directory. Use this endpoint when you suspect a file or directory was not indexed correctly.

Endpoint

POST /force-reindex

Input (multipart/form-data)

  • file_id (string, required): ID of the file or directory to force reindex

Output (Single File)

  • success (boolean): Whether force reindexing was successful
  • message (string): Status message

Output (Directory)

  • success (boolean): Whether the overall operation succeeded
  • message (string): Summary message
  • path (string): Directory path that was indexed
  • total_files (integer): Total unindexed files found
  • indexed_count (integer): Files successfully indexed
  • already_indexed_count (integer): Files that were already indexed
  • failed_count (integer): Files that failed to index
  • results (array): Per-file results, each with file_id, filename, path, success, message

Tip: Directory indexing runs files concurrently for faster processing. Use the GET /file endpoint on the directory path to check overall indexing progress.

Force Reindex a Single File

Python Example

Python
# Force reindex a single file
response = requests.post(
    f"{base_url}/force-reindex",
    headers=headers,
    data={
        "file_id": "file_id"  # file_id from /upload, /search, or /file
    }
)
result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")

cURL Example

Bash
# Force reindex a single file
curl -X POST "https://filesystem.vecml.com/api/force-reindex" \
  -H "X-API-Key: your_api_key_here" \
  -F "file_id=file_id"

Force Reindex an Entire Directory

Python Example (Directory)

Python
# Force reindex an entire directory
response = requests.post(
    f"{base_url}/force-reindex",
    headers=headers,
    data={
        "file_id": "file_id"  # file_id of the directory (e.g., from /file or /search)
    }
)
result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")

Query

Query your indexed documents using natural language and get AI-generated answers. This single endpoint supports non-streaming responses, real-time streaming, and multimodal image attachments — controlled by the input parameters.

Note: This is a stateless, single-turn Q&A endpoint — it does not maintain conversation history. If you need multi-turn chat with files, please see the Chat System section below.

Endpoint

POST /query

Input (JSON)

  • query (string, required): Natural language question or search query
  • file_paths (array, required): List of file/folder paths to query (e.g., ["/docs/report.pdf", "/notes/"])
  • llm_model (string, optional): LLM model to use (default: "qwen3_8b"). gpt-4.1 is suggested for better performance.
  • streaming (boolean, optional): Enable real-time streaming response (default: false)
  • temperature (float, optional): Response creativity 0.0-1.0 (default: 0.7)
  • max_retrieve_tokens (integer, optional): Maximum RAG context tokens (default: 5000)
  • system_prompt (string, optional): Custom system prompt for the LLM
  • additional_attachments (array, optional): Base64-encoded images as data URIs (e.g., "data:image/png;base64,..."). When provided, the multimodal LLM is used for visual understanding. Supported formats: PNG, JPEG, GIF, WebP, TIFF.
  • custom_base_url (string, optional): Custom base URL for the file reference links embedded in RAG answers. By default, answers include clickable file links that point to VecML's hosted viewer (e.g., [report.pdf](https://chat.vecml.com/files?file_id=xxx)). When set, those links point to your own server instead (e.g., [report.pdf](https://myapp.example.com/files?file_id=xxx)). This is useful when you host your own file preview, download, or viewer page: retrieve the file_id from the /upload response in your backend and map it to your own routes for a fully branded experience.

Available Models

  • qwen3_8b
  • qwen3_4b
  • gpt-4.1-nano
  • gemini-2.0-flash
  • gpt-4o-mini
  • gpt-4.1-mini
  • gpt-4.1
  • gemini-3-pro
  • claude-4-5-sonnet
  • gpt-5.2
  • claude-4-5-opus
  • claude-4-6-opus

Output (Non-streaming)

  • answer (string): AI-generated answer based on retrieved content
  • usage (object, optional): Token usage information with prompt_tokens, completion_tokens, and total_tokens

Output (Streaming: streaming: true)

  • Text stream: Real-time streaming of the LLM-generated answer
  • Content-Type: text/plain

Tip: You can query entire folders by specifying a folder path. All indexed files within the folder will be searched. Use --no-buffer with curl and stream=True with requests to see streaming responses as they are generated.

Multimodal: When additional_attachments is provided, the system automatically uses the multimodal LLM for visual understanding. You can ask questions about image content, extract text (OCR), or analyze charts and diagrams. Images can be combined with file_paths for RAG + image analysis.

Note: Credit balance is checked before each query. If you have insufficient credits for the selected model, you will receive an HTTP 402 (Payment Required) error. Use the /usage endpoint to check your available credits, or use free models (qwen3_8b, qwen3_4b) which don't require credits.
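
The 402 case above can be handled explicitly on the client so that a credit shortfall is distinguished from other errors. A sketch (the wrapper name is our own):

```python
import requests

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key, "Content-Type": "application/json"}

def query_checked(query_data):
    """POST /query, raising a clear error when credits are insufficient (HTTP 402)."""
    response = requests.post(f"{base_url}/query", headers=headers, json=query_data)
    if response.status_code == 402:
        raise RuntimeError(
            "Insufficient credits for this model; check /usage or "
            "fall back to a free model (qwen3_8b, qwen3_4b)."
        )
    response.raise_for_status()
    return response.json()
```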

Non-streaming

Python Example

Python
# Query your files using natural language
query_data = {
    "query": "What is the main topic of this document?",
    "file_paths": ["/my_documents/document.pdf"],  # Can include files and/or folders
    "llm_model": "qwen3_8b",      # Options: qwen3_8b, gemini-2.0-flash, gpt-4o-mini, etc.
    "streaming": False,
    "temperature": 0.7,
    "max_retrieve_tokens": 5000,  # Maximum tokens for RAG context
    "system_prompt": "You are a helpful assistant.",  # Optional custom prompt
    # "custom_base_url": "https://myapp.example.com"  # Optional: override file link URLs in answers
}

response = requests.post(
    f"{base_url}/query",
    headers={**headers, "Content-Type": "application/json"},
    json=query_data
)

result = response.json()
print("Answer:", result['answer'])
if result.get('usage'):
    print(f"Token usage: {result['usage']['total_tokens']} total tokens")

cURL Example

Bash
# Query files with RAG
curl -X POST "https://filesystem.vecml.com/api/query" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the main topic of this document?",
    "file_paths": ["/my_documents/document.pdf"],
    "llm_model": "qwen3_8b",
    "streaming": false,
    "temperature": 0.7,
    "custom_base_url": "https://myapp.example.com"
  }'

Streaming

Python Example

Python
# Streaming query example
import requests

query_data = {
    "query": "Summarize this document in detail.",
    "file_paths": ["/my_documents"],  # Query all files in a folder
    "llm_model": "qwen3_8b",
    "streaming": True,
    "temperature": 0.7
}

response = requests.post(
    f"{base_url}/query",
    headers={**headers, "Content-Type": "application/json"},
    json=query_data,
    stream=True
)

print("Streaming response:")
for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
    if chunk:
        print(chunk, end='', flush=True)
print()  # New line after streaming

cURL Example

Bash
# Streaming query
curl -X POST "https://filesystem.vecml.com/api/query" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Summarize this document in detail.",
    "file_paths": ["/my_documents"],
    "llm_model": "qwen3_8b",
    "streaming": true,
    "temperature": 0.7
  }' \
  --no-buffer

With Image Attachments (Multimodal)

Python Example

Python
import requests
import base64

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

# Encode images as base64 data URIs
def encode_image(image_path):
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return f"data:image/png;base64,{b64}"

image_uris = [
    encode_image("./chart.png"),
    encode_image("./screenshot.png"),
]

# Query with images + optional RAG context
query_data = {
    "query": "What information is shown in these images?",
    "file_paths": ["/my_documents"],           # Optional: combine with RAG
    "llm_model": "qwen3_8b",
    "additional_attachments": image_uris,       # Base64 data URIs
    "streaming": False
}

response = requests.post(
    f"{base_url}/query",
    headers={**headers, "Content-Type": "application/json"},
    json=query_data
)

result = response.json()
print("Answer:", result['answer'])

Rename

Rename a file or folder. The item stays in the same directory but gets a new name.

Endpoint

POST /rename

Input (JSON Body)

  • path (string, required): Relative path of the file or folder to rename (e.g., "/folder/old_name.pdf")
  • new_name (string, required): New name for the file or folder (e.g., "new_name.pdf")

Output

  • success (boolean): Whether the rename was successful
  • message (string): Status message
  • file (object): Updated file information with the new name and path

Tip: When renaming a folder, all children paths are automatically updated. The RAG index references are also updated.

Python Example

Python
# Rename a file or folder
import json

response = requests.post(
    f"{base_url}/rename",
    headers={**headers, "Content-Type": "application/json"},
    data=json.dumps({
        "path": "/my_documents/old_name.pdf",
        "new_name": "new_name.pdf"
    })
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")
print(f"New path: {result['file']['path']}")

cURL Example

Bash
# Rename a file or folder
curl -X POST "https://filesystem.vecml.com/api/rename" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"path": "/my_documents/old_name.pdf", "new_name": "new_name.pdf"}'

Move

Move a file or folder to a different directory. The destination directory will be created automatically if it doesn't exist.

Endpoint

POST /move

Input (JSON Body)

  • source_path (string, required): Relative path of the file or folder to move (e.g., "/folder/file.pdf")
  • destination_path (string, required): Relative path of the target directory (e.g., "/archive/")

Output

  • success (boolean): Whether the move was successful
  • message (string): Status message
  • file (object): Updated file information with the new path

Tip: Moving a folder also moves all of its contents. RAG index references are automatically updated.

Python Example

Python
# Move a file or folder to a new directory
response = requests.post(
    f"{base_url}/move",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "source_path": "/my_documents/document.pdf",
        "destination_path": "/archive/"
    }
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")
print(f"New path: {result['file']['path']}")

cURL Example

Bash
# Move a file or folder
curl -X POST "https://filesystem.vecml.com/api/move" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"source_path": "/my_documents/document.pdf", "destination_path": "/archive/"}'

Delete File

Delete a file or folder from your file system. Deleting a folder removes all its contents.

Endpoint

DELETE /file

Input (Query Parameters)

  • path (string, required): Relative path of the file or folder to delete

Output

  • success (boolean): Whether deletion was successful
  • message (string): Status message

Warning: This action cannot be undone. Deleting a folder will permanently remove all files and subfolders within it.

Python Example

Python
# Delete a file or folder
response = requests.delete(
    f"{base_url}/file",
    headers=headers,
    params={"path": "/my_documents/document.pdf"}
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")

cURL Example

Bash
# Delete a file or folder
curl -X DELETE "https://filesystem.vecml.com/api/file?path=/my_documents/document.pdf" \
  -H "X-API-Key: your_api_key_here"
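Because deletion is irreversible, it can help to wrap the call in a small helper that surfaces HTTP and API errors instead of failing silently. This is a sketch, not part of the official client; delete_path is a hypothetical name, and the session parameter (defaulting to requests) exists only to make the helper easy to test or swap out.

```python
import requests

base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": "your_api_key_here"}

def delete_path(path, session=requests):
    """Delete a file or folder; return (success, message) instead of raising."""
    response = session.delete(
        f"{base_url}/file",
        headers=headers,
        params={"path": path},
    )
    if response.status_code != 200:
        return False, f"HTTP {response.status_code}"
    result = response.json()
    return result.get("success", False), result.get("message", "")

# ok, msg = delete_path("/my_documents/document.pdf")
```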

Usage & Credits

Get your current credit balances and storage usage. This helps you monitor your usage and plan accordingly.

Endpoint

GET /usage

Output

  • credits (object): Credit information for each tier
    • low_tier: Remaining credits and models (gpt-4.1-nano)
    • middle_tier: Remaining credits and models (gpt-4o-mini, gpt-4.1-mini, gemini-2.0-flash)
    • high_tier: Remaining credits and models (gpt-4o, gpt-4.1, gpt-5, etc.)
    • free_models: Models that don't consume credits (qwen3_8b, qwen3_4b)
    • available_models: All models you can currently use based on your credits
    • total_credit_usage: Total credits consumed to date
  • storage (object): Storage usage information
    • used_bytes, used_mb: Current storage used
    • limit_bytes, limit_mb: Your storage limit
    • remaining_bytes, remaining_mb: Available storage
    • usage_percentage: Percentage of storage used
  • user_level (string): Your account level (normal, plus, pro)

Credit Tiers

Free Tier
  • qwen3_8b
  • qwen3_4b
Low/Middle Tier
  • gpt-4.1-nano (low)
  • gpt-4o-mini (middle)
  • gemini-2.0-flash (middle)
High Tier
  • gpt-4.1, gpt-5.2
  • claude-4-5-sonnet/opus
  • claude-4-6-opus, gemini-3-pro

Credit Allocation

  • Monthly Base: $1.50 per tier (resets on the 1st of each month)
  • Daily Login Bonus: +$0.10 per tier (once per day when you visit chat.vecml.com)
  • Maximum Monthly: Up to $4.50 per tier if logging in daily ($1.50 + $0.10 × 30 days)

Storage Limits

  • Normal users: 200 MB total storage
  • Plus/Pro users: 5 GB total storage

Python Example

Python
# Get your credit and storage usage
response = requests.get(
    f"{base_url}/usage",
    headers=headers
)

result = response.json()

# Credit information
print("=== Credits ===")
print(f"Low tier remaining: {result['credits']['low_tier']['remaining']}")
print(f"Middle tier remaining: {result['credits']['middle_tier']['remaining']}")
print(f"High tier remaining: {result['credits']['high_tier']['remaining']}")
print(f"Free models: {result['credits']['free_models']}")
print(f"Available models: {result['credits']['available_models']}")

# Storage information
print("\n=== Storage ===")
print(f"Used: {result['storage']['used_mb']} MB / {result['storage']['limit_mb']} MB")
print(f"Remaining: {result['storage']['remaining_mb']} MB")
print(f"Usage: {result['storage']['usage_percentage']}%")

cURL Example

Bash
# Get your credit and storage usage
curl -X GET "https://filesystem.vecml.com/api/usage" \
  -H "X-API-Key: your_api_key_here"
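Before a large upload, you can check the /usage response to see whether the file will fit in your remaining storage. The helper below is a sketch operating on the output fields documented above; fits_in_storage is a hypothetical name, not part of the API.

```python
def fits_in_storage(usage, file_size_bytes):
    """Given a parsed /usage response, check whether a file of this size fits."""
    return file_size_bytes <= usage["storage"]["remaining_bytes"]

# Example usage with a live /usage response:
# usage = requests.get(f"{base_url}/usage", headers=headers).json()
# if fits_in_storage(usage, os.path.getsize("big_report.pdf")):
#     ...proceed with the upload...
```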

Chat System

The Chat System API provides multi-turn conversation capabilities on top of the File System RAG. Create chat sessions, send messages with or without document context, and maintain full conversation history. Chat sessions are synchronized with chat.vecml.com and will appear in the website's chat history.

Create Chat Session

Create a new chat session and get a session ID for subsequent messages.

Endpoint

POST /chat/create

Input (JSON Body)

  • title (string, optional): Initial title for the chat session (default: "API Chat Session"). The chat system automatically generates and updates the title based on your messages and LLM responses after each turn, so leaving this as default is perfectly fine. This parameter is useful when you want to assign a custom initial title (e.g., "New Chat", "Project Q&A") before any messages are sent.

Output

  • success (boolean): Whether creation was successful
  • chat_id (string): Unique session ID — use this in all subsequent calls
  • title (string): The chat title
  • created_at (string): ISO 8601 timestamp

Python Example

Python
import requests

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

# Create a new chat session
response = requests.post(
    f"{base_url}/chat/create",
    headers={**headers, "Content-Type": "application/json"},
    json={"title": "My Research Chat"}  # Optional, defaults to "API Chat Session"
)

result = response.json()
chat_id = result["chat_id"]
print(f"Chat ID: {chat_id}")
print(f"Title: {result['title']}")
print(f"Created at: {result['created_at']}")

cURL Example

Bash
# Create a new chat session
curl -X POST "https://filesystem.vecml.com/api/chat/create" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"title": "My Research Chat"}'

Send Chat Message

Send a message to a chat session and get an LLM response. This single endpoint supports non-streaming responses, real-time streaming, and multimodal image attachments — controlled by the input parameters. It also supports RAG context via file_paths and maintains full conversation history.

Endpoint

POST /chat/message

Input (JSON Body)

  • chat_id (string, required): Session ID from /chat/create
  • query (string, required): User message / question
  • file_paths (array, optional): File/folder paths for RAG context this turn. Empty or null means no RAG.
  • llm_model (string, optional): LLM model to use (default: "qwen3_8b"). gpt-4.1 is suggested for better performance.
  • streaming (boolean, optional): Enable real-time streaming response (default: false)
  • temperature (float, optional): Response creativity 0.0–1.0 (default: 0.7)
  • max_retrieve_tokens (integer, optional): Maximum RAG context tokens (default: 5000)
  • system_prompt (string, optional): Custom system prompt
  • additional_attachments (array, optional): Base64-encoded images as data URIs (e.g., "data:image/png;base64,..."). When provided, the multimodal LLM is used for visual understanding. Supported formats: PNG, JPEG, GIF, WebP, TIFF.
  • custom_base_url (string, optional): Custom base URL for the file reference links embedded in RAG answers. By default, answers include clickable file links pointing to VecML's hosted viewer (e.g., [report.pdf](https://chat.vecml.com/files?file_id=xxx)). When set, those links point to your own server instead (e.g., [report.pdf](https://myapp.example.com/files?file_id=xxx)). This is useful when you want to host your own file preview, download, or viewer page: retrieve the file_id from the /upload response in your backend server and map it to your own routes for a fully branded experience.

Output (Non-streaming)

  • answer (string): LLM-generated answer
  • chat_id (string): The session ID
  • message_id (string): ID of the assistant message
  • usage (object, optional): Token usage info

Output (Streaming, when streaming is true)

  • Text stream: Real-time streaming of the LLM-generated answer
  • Content-Type: text/plain

How RAG works per turn: The file_paths you send with each message determine whether RAG is used for that specific turn. If file_paths is empty or omitted, the LLM answers using only the conversation history (no document retrieval). This gives you full control over when RAG context is applied.

Multimodal: When additional_attachments is provided, the system automatically uses the multimodal LLM for visual understanding. You can ask questions about image content, extract text (OCR), or analyze charts and diagrams. Images can be combined with file_paths for RAG + image analysis.

Streaming Tip: Use --no-buffer with curl and stream=True with requests to see the response as it's generated. The message is automatically saved to the chat history after streaming completes.

Non-streaming

Python Example

Python
# Send a message to a chat session (non-streaming)
response = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,                           # Required: from /chat/create
        "query": "What is the main topic of the document?",
        "file_paths": ["/my_documents/report.pdf"],    # Optional: paths for RAG context
        "llm_model": "qwen3_8b",                      # Optional: default "qwen3_8b"
        "streaming": False,                            # Optional: default False
        "temperature": 0.7,                            # Optional: default 0.7
        "max_retrieve_tokens": 5000,                   # Optional: default 5000
        "system_prompt": "You are a helpful assistant.", # Optional: custom prompt
        # "custom_base_url": "https://myapp.example.com"  # Optional: override file link URLs
    }
)

result = response.json()
print(f"Answer: {result['answer']}")
print(f"Chat ID: {result['chat_id']}")
if result.get("usage"):
    print(f"Tokens used: {result['usage']['total_tokens']}")

cURL Example

Bash
# Send a message to a chat session
curl -X POST "https://filesystem.vecml.com/api/chat/message" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_id": "your-chat-id",
    "query": "What is the main topic of the document?",
    "file_paths": ["/my_documents/report.pdf"],
    "llm_model": "qwen3_8b",
    "streaming": false,
    "temperature": 0.7,
    "custom_base_url": "https://myapp.example.com"
  }'
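Since file_paths controls RAG on a per-turn basis, a small helper that builds the request body and omits file_paths entirely for no-RAG turns keeps call sites tidy. build_message_body is a hypothetical helper sketched here, not part of the API.

```python
def build_message_body(chat_id, query, file_paths=None,
                       llm_model="qwen3_8b", **extra):
    """Build a /chat/message JSON body; omit file_paths for a no-RAG turn."""
    body = {"chat_id": chat_id, "query": query, "llm_model": llm_model, **extra}
    if file_paths:
        body["file_paths"] = file_paths
    return body

# RAG turn:
# requests.post(f"{base_url}/chat/message",
#               headers={**headers, "Content-Type": "application/json"},
#               json=build_message_body(chat_id, "What are the key findings?",
#                                       file_paths=["/my_documents/report.pdf"]))
# Pure-LLM turn: simply leave file_paths out.
```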

Streaming

Python Example

Python
# Streaming chat message
response = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "query": "Summarize this document in detail.",
        "file_paths": ["/my_documents"],
        "llm_model": "qwen3_8b",
        "streaming": True
    },
    stream=True
)

print("Streaming response:")
for chunk in response.iter_content(chunk_size=None, decode_unicode=True):
    if chunk:
        print(chunk, end='', flush=True)
print()

cURL Example

Bash
# Streaming chat message
curl -X POST "https://filesystem.vecml.com/api/chat/message" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_id": "your-chat-id",
    "query": "Summarize this document in detail.",
    "file_paths": ["/my_documents"],
    "llm_model": "qwen3_8b",
    "streaming": true
  }' \
  --no-buffer

With Image Attachments (Multimodal)

Python Example

Python
import base64

# Read images and encode as data URIs
def encode_image(image_path):
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return f"data:image/png;base64,{b64}"

image_uris = [
    encode_image("./image1.png"),
    encode_image("./image2.png"),
]

# Send message with image attachments (triggers multimodal LLM)
response = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "query": "What do you see in these images?",
        "llm_model": "qwen3_8b",
        "additional_attachments": image_uris  # Base64 data URIs
    }
)

result = response.json()
print(f"Answer: {result['answer']}")

cURL Example

Bash
# Note: For images, it's easier to use Python or another language
# to encode images as base64 data URIs.
# The additional_attachments field accepts an array of data URIs:
# "data:image/png;base64,iVBORw0KGgo..."

curl -X POST "https://filesystem.vecml.com/api/chat/message" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "chat_id": "your-chat-id",
    "query": "What do you see in this image?",
    "llm_model": "qwen3_8b",
    "additional_attachments": ["data:image/png;base64,iVBORw0KGgo..."]
  }'
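The encode_image helper above hardcodes the image/png prefix; if you attach JPEG, GIF, WebP, or TIFF files, the data URI prefix should match the actual format. A sketch using the standard library's mimetypes module (encode_attachment is a hypothetical name):

```python
import base64
import mimetypes

def encode_attachment(image_path):
    """Encode an image file as a data URI with the correct MIME type."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {image_path}")
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return f"data:{mime};base64,{b64}"
```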

List Chat Sessions

List all your chat sessions with pagination support. Only returns chats that have at least one message.

Endpoint

GET /chat/list

Input (Query Parameters)

  • offset (integer, optional): Number of items to skip for pagination (default: 0)
  • limit (integer, optional): Maximum items to return, 1–200 (default: 50)

Output

  • success (boolean): Whether the operation was successful
  • chats (array): List of chat sessions, each with:
    • chat_id (string): Session ID
    • title (string): Chat title
    • created_at (string): Creation timestamp
    • updated_at (string): Last activity timestamp
  • total_count (integer): Total number of chat sessions
  • offset, limit (integer): Current pagination values
  • has_more (boolean): Whether there are more items beyond this page

Tip: Chats are sorted by most recently updated first. This list includes chats created via both the API and the chat.vecml.com website.

Python Example

Python
# List all chat sessions with pagination
response = requests.get(
    f"{base_url}/chat/list",
    headers=headers,
    params={
        "offset": 0,    # Skip N items (for pagination)
        "limit": 50      # Max items to return (1-200)
    }
)

result = response.json()
print(f"Total chats: {result['total_count']}")
print(f"Has more: {result['has_more']}")
for chat in result['chats']:
    print(f"  [{chat['chat_id'][:8]}...] {chat['title']} (updated: {chat['updated_at']})")

cURL Example

Bash
# List all chat sessions
curl -X GET "https://filesystem.vecml.com/api/chat/list?offset=0&limit=50" \
  -H "X-API-Key: your_api_key_here"
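To walk every page of /chat/list, loop until has_more is false, advancing offset by limit each time. list_all_chats below is a sketch; it takes a fetch_page callable so the paging logic stays separate from the HTTP call.

```python
import requests

base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": "your_api_key_here"}

def list_all_chats(fetch_page, limit=200):
    """Collect every chat session by following has_more/offset pagination."""
    chats, offset = [], 0
    while True:
        page = fetch_page(offset, limit)
        chats.extend(page["chats"])
        if not page.get("has_more"):
            return chats
        offset += limit

def fetch_page(offset, limit):
    response = requests.get(
        f"{base_url}/chat/list",
        headers=headers,
        params={"offset": offset, "limit": limit},
    )
    return response.json()

# all_chats = list_all_chats(fetch_page)
```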

Get Chat Messages

Retrieve the full message history of a chat session, along with the currently attached files.

Endpoint

GET /chat/messages

Input (Query Parameters)

  • chat_id (string, required): Session ID of the chat

Output

  • success (boolean): Whether the operation was successful
  • chat_id (string): The session ID
  • messages (array): Ordered list of messages, each with:
    • id (string): Message ID
    • role (string): "user" or "assistant"
    • content (string): Message text
    • created_at (string): Timestamp
  • attached_files (array or null): Currently attached file paths (most recent selection)

Python Example

Python
# Get full message history of a chat session
response = requests.get(
    f"{base_url}/chat/messages",
    headers=headers,
    params={"chat_id": chat_id}
)

result = response.json()
print(f"Chat ID: {result['chat_id']}")
print(f"Attached files: {result.get('attached_files')}")
print(f"Messages ({len(result['messages'])}):")
for msg in result['messages']:
    role = msg['role'].upper()
    content = msg['content'][:100]
    print(f"  [{role}] {content}...")

cURL Example

Bash
# Get message history of a chat session
curl -X GET "https://filesystem.vecml.com/api/chat/messages?chat_id=your-chat-id" \
  -H "X-API-Key: your_api_key_here"

Rename Chat

Rename a chat session. Useful for organizing your conversations with descriptive titles. Note that our chat system automatically generates a chat title based on your messages and the LLM response, so you may not need this endpoint if you are satisfied with the auto-generated titles.

Endpoint

POST /chat/rename

Input (JSON Body)

  • chat_id (string, required): Session ID of the chat to rename
  • new_name (string, required): New title for the chat session (cannot be empty)

Output

  • success (boolean): Whether the rename was successful
  • chat_id (string): The session ID
  • title (string): The new title
  • message (string): Status message

Python Example

Python
# Rename a chat session
response = requests.post(
    f"{base_url}/chat/rename",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "new_name": "Q4 Financial Analysis"
    }
)

result = response.json()
print(f"Success: {result['success']}")
print(f"New title: {result['title']}")

cURL Example

Bash
# Rename a chat session
curl -X POST "https://filesystem.vecml.com/api/chat/rename" \
  -H "X-API-Key: your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{"chat_id": "your-chat-id", "new_name": "Q4 Financial Analysis"}'

Delete Chat

Delete a chat session and all its messages permanently.

Endpoint

DELETE /chat

Input (Query Parameters)

  • chat_id (string, required): Session ID of the chat to delete

Output

  • success (boolean): Whether deletion was successful
  • message (string): Status message

Warning: This action cannot be undone. All messages in the chat session will be permanently deleted.

Python Example

Python
# Delete a chat session and all its messages
response = requests.delete(
    f"{base_url}/chat",
    headers=headers,
    params={"chat_id": chat_id}
)

result = response.json()
print(f"Success: {result['success']}")
print(f"Message: {result['message']}")

cURL Example

Bash
# Delete a chat session
curl -X DELETE "https://filesystem.vecml.com/api/chat?chat_id=your-chat-id" \
  -H "X-API-Key: your_api_key_here"

Full Chat Lifecycle Example

A complete example showing the typical workflow: create a session → send messages with RAG → follow-up questions → rename → review history.

Python Example (Complete Workflow)

Python
import requests
import base64

api_key = "your_api_key_here"
base_url = "https://filesystem.vecml.com/api"
headers = {"X-API-Key": api_key}

# 1. Create a chat session
create_resp = requests.post(
    f"{base_url}/chat/create",
    headers={**headers, "Content-Type": "application/json"},
    json={"title": "Document Analysis"}
)
chat_id = create_resp.json()["chat_id"]
print(f"Created chat: {chat_id}")

# 2. Ask questions with RAG context
msg1 = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "query": "What are the key findings in this report?",
        "file_paths": ["/my_documents/report.pdf"],
        "llm_model": "qwen3_8b"
    }
)
print(f"Answer 1: {msg1.json()['answer'][:200]}...")

# 3. Follow-up question (uses chat history automatically)
msg2 = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "query": "Can you elaborate on the second point?",
        "file_paths": ["/my_documents/report.pdf"],
        "llm_model": "qwen3_8b"
    }
)
print(f"Answer 2: {msg2.json()['answer'][:200]}...")

# 4. Ask without RAG (no file_paths = pure LLM)
msg3 = requests.post(
    f"{base_url}/chat/message",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "chat_id": chat_id,
        "query": "Summarize everything we discussed so far.",
        "llm_model": "qwen3_8b"
    }
)
print(f"Summary: {msg3.json()['answer'][:200]}...")

# 5. Rename the chat
requests.post(
    f"{base_url}/chat/rename",
    headers={**headers, "Content-Type": "application/json"},
    json={"chat_id": chat_id, "new_name": "Q4 Report Analysis"}
)
print("Chat renamed!")

# 6. View all messages
msgs = requests.get(
    f"{base_url}/chat/messages",
    headers=headers,
    params={"chat_id": chat_id}
).json()
print(f"Total messages: {len(msgs['messages'])}")

# 7. Clean up (optional)
# requests.delete(f"{base_url}/chat", headers=headers, params={"chat_id": chat_id})
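The examples above assume every call succeeds. In production you may want a single place that turns HTTP errors and success: false responses into exceptions; api_json below is a hypothetical helper along those lines, relying on requests' standard raise_for_status.

```python
def api_json(response):
    """Parse an API response, raising on HTTP or API-level failure."""
    response.raise_for_status()              # HTTP errors (401, 404, 413, ...)
    data = response.json()
    if data.get("success") is False:         # API-level failure with a message
        raise RuntimeError(data.get("message", "API call failed"))
    return data

# result = api_json(requests.get(f"{base_url}/usage", headers=headers))
```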