
Schemas In-Depth

This page covers schemas, data blocks, data points, data types, edges, collections, import and export, AI generation, and JSON Schema output.


A schema defines the structure of data that your workflows process. Schemas are reusable across multiple processes and serve as the blueprint for AI-powered data extraction (schematization).


Schema Structure

Schema
 ├── name: string
 ├── description: string
 ├── Data Blocks[]
 │    ├── name: string
 │    ├── description: string
 │    ├── position: { x, y }
 │    └── Data Points[]
 │         ├── name: string
 │         ├── type: string
 │         ├── description: string
 │         └── config: object
 └── Edges[]
      ├── sourceDataBlockID
      ├── targetDataBlockID
      ├── sourceToTargetFunction (optional)
      └── targetToSourceFunction (optional)
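
As a reference, the structure above can be sketched as Python dataclasses (a minimal sketch for illustration only; class and field names here are not the platform's API):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DataPoint:
    name: str
    type: str            # "string", "number", "enum", "datablock", ...
    description: str
    config: dict = field(default_factory=dict)

@dataclass
class DataBlock:
    name: str
    description: str
    position: dict       # {"x": ..., "y": ...}
    data_points: list = field(default_factory=list)

@dataclass
class Edge:
    source_data_block_id: str
    target_data_block_id: str
    source_to_target_function: Optional[str] = None
    target_to_source_function: Optional[str] = None

@dataclass
class Schema:
    name: str
    description: str
    data_blocks: list = field(default_factory=list)
    edges: list = field(default_factory=list)
```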

Data Blocks

A data block is a named group of fields representing an entity, message, or document type.

Creating a Data Block

  1. Open a schema in the editor.
  2. Right-click the canvas or use the toolbar to add a data block.
  3. Set the name (used as a key in schematization output and template references).
  4. Set the description (fed to the LLM during extraction -- be specific).

Naming

Data block names become keys in the schematization output. If your data block is named Order Request, the output is:

```json
{
  "Order Request": {
    "field1": "value1",
    "field2": "value2"
  }
}
```

Downstream template references use this name: {{input["Schematize Node"]["Order Request"]["field1"]}}.
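
To illustrate, a reference like this resolves against nested node output by walking the bracketed keys in order. The `resolve_template` helper below is hypothetical, for illustration only; the platform resolves these references internally:

```python
import re

def resolve_template(ref: str, node_outputs: dict):
    """Resolve a reference like {{input["Node"]["Block"]["field"]}}
    against nested node output. (Hypothetical helper.)"""
    keys = re.findall(r'\["([^"]+)"\]', ref)  # extract each quoted key
    value = node_outputs
    for key in keys:
        value = value[key]
    return value

outputs = {"Schematize Node": {"Order Request": {"field1": "value1"}}}
resolve_template('{{input["Schematize Node"]["Order Request"]["field1"]}}', outputs)  # → "value1"
```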


Data Points (Fields)

Each data block contains data points. A data point defines a single field.

Data Point Properties

| Property | Required | Description |
|---|---|---|
| name | Yes | Field name (becomes JSON key in output) |
| type | Yes | Data type (see table below) |
| description | Yes | What this field contains -- this is read by the LLM during schematization, so be descriptive |
| config | No | Type-specific configuration |

Data Types

| Type | JSON Output | Config Options | Example Value |
|---|---|---|---|
| string | "text" | None | "John Smith" |
| number | 123 or 45.67 | None | 5000 |
| boolean | true/false | None | true |
| date | "YYYY-MM-DD" | None | "2026-04-12" |
| date-time | "ISO 8601" | None | "2026-04-12T14:30:00Z" |
| time | "HH:MM:SS" | None | "14:30:00" |
| enum | "value" | values: list of allowed values | "standard" |
| datablock | {...} | datablockId: reference to another data block | nested object |

Enum Configuration

For enum-type data points, provide the allowed values in the config:

```json
{
  "values": ["standard", "rush", "emergency"]
}
```

The LLM will constrain its extraction to one of these values.
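
For illustration, a consumer can apply the same constraint when post-checking an extracted value (`check_enum` is a hypothetical helper, not part of the platform):

```python
def check_enum(value, config):
    """Reject an extracted value that is not in the enum's allowed values.
    (Hypothetical post-check; the LLM is already constrained upstream.)"""
    allowed = config.get("values", [])
    if value not in allowed:
        raise ValueError(f"{value!r} not in allowed enum values {allowed}")
    return value

check_enum("rush", {"values": ["standard", "rush", "emergency"]})  # → "rush"
```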

Nested Data Blocks (datablock type)

A data point of type datablock references another data block in the same schema, creating a nested structure. Set the datablockId in the config to the target data block's ID.

Example: An Invoice data block with a line_items field of type datablock referencing a Line Item data block produces:

```json
{
  "Invoice": {
    "invoice_number": "INV-001",
    "line_items": [
      {
        "description": "Carbon Steel",
        "quantity": 5000,
        "unit_price": 0.45
      }
    ]
  }
}
```

Schema Edges

Edges connect two data blocks within a schema. They define relationships and can carry coupling functions that transform data between related blocks.

Creating an Edge

  1. In the schema editor, drag from one data block's output port to another data block's input port.
  2. The edge appears as a connection line.

Edge Properties

| Property | Description |
|---|---|
| sourceDataBlockID | The originating data block |
| targetDataBlockID | The destination data block |
| sourceToTargetFunction | Optional function that derives the target from the source |
| targetToSourceFunction | Optional function that derives the source from the target |

How Edges Affect Schematization

When a schematization node extracts multiple data blocks from the same schema:

  • Data blocks without edges (solo) are extracted independently in parallel.
  • Data blocks connected by edges (coupled) are extracted as a group. The engine:
    1. Picks a root data block
    2. Extracts the root using the LLM
    3. Runs the coupling function on each edge to derive related blocks
    4. Continues until all coupled blocks are extracted

This ensures related data stays consistent (e.g., an invoice header and its line items).
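
The grouping logic above can be sketched as follows. This is a simplified illustration: `extract_with_llm` and `run_coupling` are hypothetical stand-ins for the engine's LLM call and coupling-function execution, edges are shortened to `source`/`target` keys, and direction handling is simplified:

```python
from collections import deque

def extract_schema(blocks, edges, extract_with_llm, run_coupling):
    """Extract solo blocks independently; extract coupled blocks as a
    group by picking a root and traversing edges."""
    adj = {}  # block id -> edges touching it
    for e in edges:
        adj.setdefault(e["source"], []).append(e)
        adj.setdefault(e["target"], []).append(e)

    results, seen = {}, set()
    for block_id in blocks:
        if block_id in seen:
            continue
        # Solo block: no edges, extract independently.
        if block_id not in adj:
            results[block_id] = extract_with_llm(block_id)
            seen.add(block_id)
            continue
        # Coupled group: this block becomes the root.
        results[block_id] = extract_with_llm(block_id)
        seen.add(block_id)
        queue = deque([block_id])
        while queue:
            current = queue.popleft()
            for e in adj.get(current, []):
                other = e["target"] if e["source"] == current else e["source"]
                if other in seen:
                    continue
                # Derive the related block via the coupling function.
                results[other] = run_coupling(e, results[current])
                seen.add(other)
                queue.append(other)
    return results
```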


JSON Schema Output

Each data block can be exported as a JSON Schema. This is used internally by the schematization node for structured LLM output.

API endpoint: GET /schemas/data-blocks/by-id/{dataBlockId}/json-schema

Example output for the Order Request block:

```json
{
  "type": "object",
  "properties": {
    "customer_name": {
      "type": "string",
      "description": "Name of the requesting customer"
    },
    "customer_email": {
      "type": "string",
      "description": "Email address of the customer"
    },
    "steel_type": {
      "type": "string",
      "description": "Type of steel requested (e.g., carbon, stainless, alloy)"
    },
    "weight_lbs": {
      "type": "number",
      "description": "Weight of steel requested in pounds"
    },
    "urgency": {
      "type": "string",
      "enum": ["standard", "rush", "emergency"],
      "description": "Urgency level of the order"
    }
  },
  "required": ["customer_name", "customer_email", "steel_type", "weight_lbs", "urgency"]
}
```
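
The mapping from data point types to this JSON Schema output can be illustrated with a small converter. This is a sketch under stated assumptions: the `format` annotations for date/time types and the every-field-required rule mirror the example above but are not confirmed by the docs, and `datablock` references are omitted:

```python
# Assumed mapping from data point types to JSON Schema fragments.
TYPE_MAP = {
    "string": {"type": "string"},
    "number": {"type": "number"},
    "boolean": {"type": "boolean"},
    "date": {"type": "string", "format": "date"},
    "date-time": {"type": "string", "format": "date-time"},
    "time": {"type": "string", "format": "time"},
}

def data_block_to_json_schema(data_points):
    """Build a JSON Schema object from a list of data point dicts.
    (Illustrative sketch; datablock nesting not handled.)"""
    properties = {}
    for dp in data_points:
        if dp["type"] == "enum":
            prop = {"type": "string", "enum": dp["config"]["values"]}
        else:
            prop = dict(TYPE_MAP[dp["type"]])
        prop["description"] = dp["description"]
        properties[dp["name"]] = prop
    return {
        "type": "object",
        "properties": properties,
        "required": [dp["name"] for dp in data_points],
    }
```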

AI Schema Generation

You can generate schemas automatically in two ways:

From a Text Description

  1. Click Generate with AI in the schema editor.
  2. Describe your data in plain English:
    I need a schema for tracking steel orders. Each order has a customer name,
    email, the type of steel (carbon, stainless, alloy, galvanized, tool),
    weight in lbs, and an urgency level (standard, rush, emergency).
  3. The AI creates data blocks and data points with appropriate types and descriptions.

From a PDF

  1. Click Generate from PDF.
  2. Upload a sample document (invoice, order form, report).
  3. The AI analyzes the document structure and creates a matching schema.

In both cases, review and adjust the generated schema before using it in processes.


Schema Validation (Linting)

The lint endpoint checks a schema for issues:

API endpoint: POST /schemas/by-id/{schemaId}/lint

Checks performed:

  • Data blocks have at least one data point
  • Data point names are unique within a block
  • Enum types have at least one value defined
  • Datablock references point to valid blocks
  • Edge source/target blocks exist
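
For illustration, the listed checks could be implemented against a schema dict like this (the exact exported shape, including the block `id` fields, is an assumption):

```python
def lint_schema(schema):
    """Run the documented lint checks over a schema dict; return issue strings.
    (Illustrative sketch; the schema dict shape is assumed.)"""
    issues = []
    block_ids = {b["id"] for b in schema["dataBlocks"]}
    for block in schema["dataBlocks"]:
        points = block.get("dataPoints", [])
        if not points:
            issues.append(f'{block["name"]}: no data points')
        names = [dp["name"] for dp in points]
        if len(names) != len(set(names)):
            issues.append(f'{block["name"]}: duplicate data point names')
        for dp in points:
            cfg = dp.get("config", {})
            if dp["type"] == "enum" and not cfg.get("values"):
                issues.append(f'{block["name"]}.{dp["name"]}: enum has no values')
            if dp["type"] == "datablock" and cfg.get("datablockId") not in block_ids:
                issues.append(f'{block["name"]}.{dp["name"]}: invalid datablock reference')
    for edge in schema.get("edges", []):
        if edge["sourceDataBlockID"] not in block_ids or edge["targetDataBlockID"] not in block_ids:
            issues.append("edge references a missing data block")
    return issues
```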

Collections

Schemas can be organized into collections (folders). Collections support:

  • Nesting (a collection can have a parent collection)
  • Multiple membership (a schema can be in multiple collections)
  • Descriptions for documentation

Managing Collections

| Operation | How |
|---|---|
| Create | Schemas page > New Collection |
| Add schema to collection | Drag schema onto collection, or use the API |
| Remove schema from collection | Right-click > Remove from Collection |
| Nest collections | Set parentCollectionID when creating |

Import and Export

Export

Export a schema as JSON for backup or sharing:

API endpoint: GET /schemas/by-id/{schemaId}/export

The export includes:

  • Schema metadata (name, description)
  • All data blocks with positions
  • All data points with types and configs
  • All edges with coupling function references

Import

Import a schema from JSON:

API endpoint: POST /schemas/bulk-import/{orgId}

Request body:

```json
{
  "name": "Steel Order",
  "description": "Schema for steel ordering workflow",
  "dataBlocks": [
    {
      "name": "Order Request",
      "description": "Incoming steel order details",
      "position": { "x": 100, "y": 200 },
      "dataPoints": [
        {
          "name": "customer_name",
          "type": "string",
          "description": "Name of the requesting customer"
        },
        {
          "name": "weight_lbs",
          "type": "number",
          "description": "Weight of steel in pounds"
        }
      ]
    }
  ],
  "edges": []
}
```
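
A sketch of calling this endpoint with Python's standard library. `BASE_URL` and `ORG_ID` are placeholders, and authentication headers are omitted; only the endpoint path comes from the docs:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder, not the real host
ORG_ID = "org_123"                    # placeholder org ID

payload = {
    "name": "Steel Order",
    "description": "Schema for steel ordering workflow",
    "dataBlocks": [
        {
            "name": "Order Request",
            "description": "Incoming steel order details",
            "position": {"x": 100, "y": 200},
            "dataPoints": [
                {"name": "customer_name", "type": "string",
                 "description": "Name of the requesting customer"}
            ],
        }
    ],
    "edges": [],
}

req = urllib.request.Request(
    f"{BASE_URL}/schemas/bulk-import/{ORG_ID}",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to send (requires auth and network)
```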

Using Schemas in Processes

Schemas are referenced in two node types:

Schematization Node

The schematization node uses a schema to extract structured data from unstructured input. You select:

  • Which schema to use
  • Which data blocks to extract
  • An LLM model and prompt

The node outputs one key per selected data block, with field values extracted by the LLM.

API Input Node (Webhook)

The API input node can reference a schema to validate incoming webhook payloads against a specific data block's structure.
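
A minimal sketch of such validation against a data block's JSON Schema, checking required keys and primitive types only (a real deployment would use a full JSON Schema validator):

```python
def validate_payload(payload, json_schema):
    """Check a webhook payload against a JSON Schema: required keys
    plus primitive type checks. Returns a list of error strings.
    (Simplified sketch; note that bool is a subclass of int in Python,
    so True/False would pass a "number" check here.)"""
    type_check = {"string": str, "number": (int, float), "boolean": bool}
    errors = []
    for key in json_schema.get("required", []):
        if key not in payload:
            errors.append(f"missing required field: {key}")
    for key, spec in json_schema.get("properties", {}).items():
        if key in payload and spec["type"] in type_check:
            if not isinstance(payload[key], type_check[spec["type"]]):
                errors.append(f"{key}: expected {spec['type']}")
    return errors
```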