Architecture overview

Understanding the Multimodal Editor

The Multimodal Editor allows you to build flexible annotation interfaces that incorporate text, media, and custom logic. At a high level, the system consists of:

A Client Front End that renders and executes the interface and its code in the browser.
A SuperAnnotate Editor Backend that manages form definitions, data persistence, workflow states, and visibility rules.
A SuperAnnotate Orchestrate Backend for more advanced pipelines, triggers, and integrations.

Client Front End

The front end is responsible for rendering the annotation UI and executing custom logic. The user’s browser requests the interface definition (layout, components, etc.) from the Editor Backend, then runs any Python code in Pyodide, a build of the CPython interpreter compiled to WebAssembly. This inline code can handle lightweight tasks such as:

Updating component values in response to user interactions.
Making synchronous or asynchronous requests to external APIs.
Adding logic to show or hide interface elements based on form state.

Because Python code runs in a browser-based WebAssembly environment, only a subset of the Python ecosystem is available. For more resource-intensive tasks, or for Python libraries that are not available in Pyodide, you can integrate with Orchestrate (explained below).
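Because library availability differs between the browser and a regular Python install, inline code can guard optional imports. The following is a minimal sketch, not part of the editor's API: `running_in_pyodide`, `has_package`, and `safe_mean` are hypothetical helper names, and the check relies on Pyodide reporting `emscripten` as its platform.

```python
import importlib.util
import statistics
import sys


def running_in_pyodide() -> bool:
    """Return True when the interpreter reports the Emscripten platform."""
    return sys.platform == "emscripten"


def has_package(name: str) -> bool:
    """Return True if a top-level package can be imported in this runtime."""
    return importlib.util.find_spec(name) is not None


def safe_mean(values):
    """Average a list of numbers, using NumPy only when it is importable."""
    if has_package("numpy"):
        import numpy as np  # optional dependency, may be absent in the browser
        return float(np.mean(values))
    # Standard-library fallback that works in any Python runtime.
    return float(statistics.fmean(values))
```

If a task cannot be written with such a fallback, that is usually a sign it belongs in Orchestrate rather than in inline code.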

SuperAnnotate Editor Backend

The Editor Backend is the core hub for:

Form Definitions and Custom Code: Storing and serving the UI layout and the associated Python code.
Data Persistence: Saving item data, annotation metadata, and any custom fields you define.
Workflow Orchestration and Permissions: Handling item assignment, role-based visibility, and status transitions.

In practical terms, the Editor Backend ensures that the correct form structure, code, and data go to each user based on their project role and the item’s status in the workflow.

SuperAnnotate Orchestrate Backend

For actions that go beyond lightweight code in the browser—such as long-running tasks or large-scale data transformation—you can use the Orchestrate Backend. This environment allows you to:

Create dockerized Python scripts.
Expose these scripts as API endpoints that can be triggered from front-end code.
Link scripts into orchestration pipelines that automatically run on specific events (e.g., item status change, external triggers).
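As an illustration of the kind of logic such a script might contain, here is a minimal sketch of a handler that reacts to an item status change. The event payload shape (`item`, `status`), the folder names, and the returned action dictionary are all assumptions made for this example; they are not the Orchestrate API.

```python
from typing import Any

# Hypothetical mapping from item status to a destination folder.
DESTINATIONS = {"Completed": "done", "Returned": "rework"}


def handle_event(event: dict[str, Any]) -> dict[str, Any]:
    """Decide what to do with an item after its status changes."""
    item = event.get("item")
    target = DESTINATIONS.get(event.get("status"))
    if target is None:
        # Statuses without a configured destination are left untouched.
        return {"action": "noop", "item": item}
    return {"action": "move", "item": item, "folder": target}
```

For example, `handle_event({"item": "doc-1", "status": "Completed"})` returns a move action targeting the `done` folder, while an `InProgress` event produces a no-op.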

How to think about the split

Consider the nature of each feature when deciding whether to place logic in the Code Editor (front end) or to implement it as an Orchestrate action (backend pipeline):

In the Code Editor
- Interaction-Focused: If annotators or QA specialists need direct, immediate access to certain logic during the annotation process—e.g., clicking a button to trigger a quick model inference, controlling video frames with event markers—implement this code inline.
- Lightweight Tasks: Ideal for asynchronous calls that return within a few seconds and for dynamic, UI-specific logic.
In Orchestrate Pipelines
- Post-Annotation or Bulk Operations: For tasks that happen after the annotation process or at scale—e.g., automatically moving items to a different folder based on status change, running batch validations, or sending data to an LLM for spell-checking.
- Long-Running or Resource-Heavy: Deploy large Python libraries or heavier workflows using dockerized scripts.
- Event-Driven: Integrate with external triggers or internal events like “folder complete” or “status updated.” Pipelines can be run automatically or manually via the SuperAnnotate UI.

By splitting logic between the inline code editor and Orchestrate pipelines, you can maintain a responsive, user-centric annotation interface while still leveraging robust automation and external integrations.
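The guidelines above can be summarized as a small decision helper. This is only a sketch of the reasoning: the `Task` fields and the 5-second cutoff are assumptions chosen for illustration, not values defined by the platform.

```python
from dataclasses import dataclass


@dataclass
class Task:
    interactive: bool            # the user needs the result while annotating
    expected_seconds: float      # rough estimate of run time
    needs_heavy_libraries: bool  # depends on libraries unavailable in Pyodide
    event_driven: bool           # triggered by a status change or external event


def placement(task: Task, fast_cutoff_seconds: float = 5.0) -> str:
    """Return "editor" for inline code or "orchestrate" for a pipeline."""
    if task.needs_heavy_libraries or task.event_driven:
        return "orchestrate"
    if task.expected_seconds > fast_cutoff_seconds:
        return "orchestrate"
    if task.interactive:
        return "editor"
    # Non-interactive work is treated as post-annotation or bulk processing.
    return "orchestrate"
```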

Annotation User Interface

The Annotation User Interface (UI) is defined at the project level. It consists of two main setup steps:

UI Form Builder – A drag-and-drop interface (or JSON import) for creating layout structures and components.
Code Editor – An optional Python environment where you can add custom functionality and dynamic behaviors to UI components.

Categories of Components

Organizational Components
- Examples: Groups, Dividers, Grids.
- Purpose: Arrange and group other components visually or logically.
- Dynamic Behavior: Groups, in particular, can be repeated or removed programmatically (e.g., to handle multi-step chat interfaces or alternate user paths).
Data Components
- Examples: Markdown, Number, Code, Text Input, Paragraph, Media (images, PDFs, videos, etc.).
- Purpose: Store, display, and/or collect data from users. Some components are user-editable by default (e.g., Text Input), while others may only display dynamic or static information.
- Note on Media: Media is referenced via external source URLs (e.g., AWS, Azure, GCP). URLs must be publicly accessible or pre-signed to allow viewing in the browser. If needed, you can define a custom signing function in the Code Editor to handle secure or temporary URLs.
Action Components
- Examples: Buttons, Select fields, Toggles, Checkboxes.
- Purpose: Enable user-triggered events, such as button clicks that execute code or open modals.

Visibility and Permissions

Each component can be configured with Role-Based and Item Status-Based permissions, specifying whether it’s visible, hidden, or read-only for particular roles or stages in the workflow. These properties ensure that users only see and edit components relevant to their role and the item’s lifecycle stage.
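One way to picture how role-based and status-based settings combine is the sketch below. The rule structure (`by_role`, `by_status`) and the "most restrictive setting wins" policy are assumptions for illustration; the editor's actual resolution order may differ.

```python
# Higher number means more restrictive.
RESTRICTIVENESS = {"visible": 0, "read_only": 1, "hidden": 2}


def resolve_mode(rule: dict, role: str, status: str) -> str:
    """Combine role and status settings, keeping the more restrictive one."""
    role_mode = rule.get("by_role", {}).get(role, "visible")
    status_mode = rule.get("by_status", {}).get(status, "visible")
    return max(role_mode, status_mode, key=RESTRICTIVENESS.__getitem__)


# Hypothetical rule for a single component, e.g. a QA comment field.
qa_comment_rule = {
    "by_role": {"Annotator": "read_only", "QA": "visible"},
    "by_status": {"NotStarted": "hidden", "Completed": "read_only"},
}
```

With this rule, a QA user can edit the field while the item is in progress, but everyone sees it as read-only once the item is `Completed`.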

Custom Components

You can also create fully custom interactive components using HTML, JavaScript, and iframes. For more information on creating or importing custom components, see [Custom Components].

Component IDs

Every component has a unique component ID (editable in the Form Builder). This ID is critical for:

Retrieving/Modifying values in the Code Editor.
Aligning data fields when importing/exporting item data in JSON format.

UIs can be built entirely via the drag-and-drop Form Builder or by uploading JSON files that define the layout. You can export and import Form Templates to reuse or share your custom annotation setups.
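Because component IDs tie the form to imported and exported data, it can help to check that the two agree before an import. The sketch below assumes, purely for illustration, that item data is a flat JSON object keyed by component ID; the real export format may be nested differently.

```python
import json


def check_alignment(component_ids: list[str], item_json: str) -> dict[str, list[str]]:
    """Report component IDs without data and data keys without a component."""
    data_keys = set(json.loads(item_json))
    known = set(component_ids)
    return {
        "missing": sorted(known - data_keys),   # components with no value supplied
        "unknown": sorted(data_keys - known),   # keys that match no component
    }


# Hypothetical form with three components and an item that misnames one of them.
report = check_alignment(
    ["prompt", "response", "rating"],
    '{"prompt": "Hi", "response": "Hello", "score": 4}',
)
```

Here `report` flags `rating` as missing and `score` as unknown, which points to a renamed or mistyped component ID.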

Extending functionality with code

After configuring the UI in the Form Builder, you can optionally add Python code in the Code Editor to implement custom logic. This code runs in the browser using Pyodide (Python compiled to WebAssembly), so it has limited library support. It is ideal for:

User Interactions: Register event handlers (e.g., on_<component_id>_click) that respond when a button is clicked, a field value changes, etc.
Dynamic Updates: Programmatically set or read component values, add or remove rows in Groups, show or hide UI elements, or fetch data from external APIs.
Lightweight Processing: Validate user input, do quick calculations, or parse small amounts of data on the fly.

For heavier or long-running tasks, consider using the Orchestrate Backend to leverage dockerized scripts with full Python library support.
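The handler naming pattern `on_<component_id>_click` can be illustrated outside the editor with a small stand-in. In this sketch, `FormState` and `dispatch` are hypothetical substitutes for what the editor runtime provides; only the handler naming convention comes from the text above.

```python
class FormState:
    """In-memory stand-in for the editor's component values (hypothetical)."""

    def __init__(self, values: dict):
        self._values = dict(values)
        self.hidden: set[str] = set()

    def get(self, component_id: str):
        return self._values[component_id]

    def set(self, component_id: str, value) -> None:
        self._values[component_id] = value


form = FormState({"score": 0})


def on_increment_click() -> None:
    """Handler for a button whose component ID is "increment"."""
    form.set("score", form.get("score") + 1)
    if form.get("score") >= 3:
        # Hide the button once the limit is reached.
        form.hidden.add("increment")


def dispatch(component_id: str, event: str, handlers: dict) -> bool:
    """Call the handler named on_<component_id>_<event>, if one is registered."""
    handler = handlers.get(f"on_{component_id}_{event}")
    if handler is None:
        return False
    handler()
    return True


HANDLERS = {"on_increment_click": on_increment_click}
```

Calling `dispatch("increment", "click", HANDLERS)` three times raises the score to 3 and marks the button as hidden; a click on a component without a handler is simply ignored.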

Items

Once your Annotation UI and any optional code are ready, you can publish them to an annotation project. You’ll then add Items to the project:

Creating Empty Items: Start with a blank state, using default or placeholder values defined in your form.
Populating Items from External Sources: Import data from external URLs or storage systems (e.g., AWS S3) to initialize component values. If you’re displaying media (images, PDFs, videos), be sure to provide valid, accessible URLs for each item.

Annotators, QAs, or Admins can then open these items in the Multimodal Editor, see the configured interface, and input or modify data according to the project’s requirements.
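Before importing items that reference media, a quick structural check of the URLs can catch obvious mistakes. This sketch assumes each item is a dictionary and that `media_fields` names the component IDs holding URLs (both hypothetical). It only checks that a URL is a well-formed HTTP(S) address; it does not confirm that the URL is publicly accessible or correctly signed.

```python
from urllib.parse import urlparse


def invalid_media_urls(items: list[dict], media_fields: list[str]) -> list[tuple[int, str]]:
    """Return (item index, field) pairs whose URL is missing or not HTTP(S)."""
    problems = []
    for index, item in enumerate(items):
        for field in media_fields:
            parsed = urlparse(item.get(field) or "")
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append((index, field))
    return problems


items = [
    {"image": "https://example.com/a.png"},
    {"image": "s3://bucket/a.png"},   # storage URI, not viewable in a browser
    {},                               # media URL missing entirely
]
problems = invalid_media_urls(items, ["image"])
```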

Workflow

A project is always tied to a specific Project Workflow that defines:

Roles: E.g., Annotator, QA, Project Admin.
Item Statuses: E.g., NotStarted, InProgress, QualityCheck, Completed, Returned, Skipped.
Allowed Transitions: Which statuses an item can move to next, and which roles are allowed to make each move.

These workflows control which user roles can view or edit items at certain stages. They also enable advanced features like transition-based event triggers for Orchestrate pipelines (e.g., automatically moving items to a new stage when they reach “QualityCheck” or sending items for review).

Default Workflow: If you don’t need custom roles or item statuses, the system provides a default set (Annotator, QA, Admin).
Custom Workflows: You can create a JSON file defining custom roles, statuses, and transitions. Import this workflow before creating a project that uses it. (Workflows cannot currently be changed once a project is created.)
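To make the three parts of a workflow concrete, the sketch below loads a workflow from JSON and checks it for consistency. The schema shown (`roles`, `statuses`, `transitions` with `from`, `to`, `roles`) is an assumption for illustration only; consult the workflow import documentation for the actual file format.

```python
import json

# Hypothetical workflow file; the real schema may use different keys.
WORKFLOW_JSON = """
{
  "roles": ["Annotator", "QA", "Admin"],
  "statuses": ["NotStarted", "InProgress", "QualityCheck", "Completed", "Returned", "Skipped"],
  "transitions": [
    {"from": "NotStarted", "to": "InProgress", "roles": ["Annotator"]},
    {"from": "InProgress", "to": "QualityCheck", "roles": ["Annotator"]},
    {"from": "QualityCheck", "to": "Completed", "roles": ["QA", "Admin"]},
    {"from": "QualityCheck", "to": "Returned", "roles": ["QA"]}
  ]
}
"""


def validate_workflow(workflow: dict) -> list[str]:
    """Return a list of problems: transitions naming unknown statuses or roles."""
    errors = []
    statuses = set(workflow["statuses"])
    roles = set(workflow["roles"])
    for index, transition in enumerate(workflow["transitions"]):
        for key in ("from", "to"):
            if transition[key] not in statuses:
                errors.append(f"transition {index}: unknown status {transition[key]!r}")
        for role in transition["roles"]:
            if role not in roles:
                errors.append(f"transition {index}: unknown role {role!r}")
    return errors


def can_transition(workflow: dict, role: str, current: str, target: str) -> bool:
    """Return True if the role may move an item from current to target."""
    return any(
        t["from"] == current and t["to"] == target and role in t["roles"]
        for t in workflow["transitions"]
    )


workflow = json.loads(WORKFLOW_JSON)
```

Since workflows cannot be changed after project creation, running a check like `validate_workflow` before importing is a cheap way to catch typos in status or role names.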

Data Management

During the annotation process, item data is stored in the SuperAnnotate backend. You can also connect your project to external data storage to initialize or update item states. Stored item data changes through the following paths:

Browser State: When a user opens an item, the front end loads all visible data into the browser. Edits remain local in the browser state until one of the following occurs:
- The user manually saves.
- The user changes the item status.
SuperAnnotate SDK: You can also modify item data programmatically using the SDK. Any newly uploaded data will overwrite the existing data in the editor, so ensure your changes are intentional and aligned with the current project state.
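Since uploaded data overwrites what is already stored, one cautious pattern is to merge your changes into the current item data and review what will be replaced before uploading. The sketch below covers only that merge step with plain dictionaries; fetching and uploading through the SDK are deliberately left out, and `plan_update` is a hypothetical helper name.

```python
def plan_update(current: dict, changes: dict) -> tuple[dict, list[str]]:
    """Merge changes into current data and list the keys whose values change."""
    merged = {**current, **changes}
    overwritten = sorted(
        key for key in changes if key in current and current[key] != changes[key]
    )
    return merged, overwritten


current = {"prompt": "Hi", "rating": 3, "comment": "ok"}
changes = {"rating": 5, "reviewed": True}
merged, overwritten = plan_update(current, changes)
```

In this example only `rating` is reported as overwritten, `reviewed` is added, and the untouched `prompt` and `comment` values are preserved in `merged`.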

You can import data from external data sources such as Snowflake, Databricks, and AWS using Integrations.

Next Steps

See Building a UI for a deeper dive into the Form Builder interface and component properties.
Explore Behavior Design & Code Editor for details on event handlers, lifecycle hooks, and best practices in Python code.
Refer to Orchestrate Documentation if you need to schedule or automate tasks beyond the editor’s scope.