# LangChain

## 🎯 Purpose
The LangChain node connects workflows to multiple data sources and prepares documents for AI analysis.
It can load content from external systems, process raw text, and split large documents into chunks optimized for AI models.
Think of it as a document preparation assistant that transforms unstructured data into AI-ready input.
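The node itself is configured through its UI, but conceptually it wraps the load-then-split pipeline of the LangChain Python library. A minimal sketch, assuming plain-text files on disk (the path, glob pattern, and chunking values below are illustrative, not settings prescribed by the node):

```python
# Conceptual sketch only: load documents from an external source, then
# split them into chunks sized for an AI model. The directory path and
# glob pattern are hypothetical examples.
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load raw content (here: plain-text files on disk).
loader = DirectoryLoader("./contracts", glob="**/*.txt", loader_cls=TextLoader)
documents = loader.load()

# Split large documents into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

for chunk in chunks:
    print(len(chunk.page_content), chunk.metadata)
```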
## 📥 Inputs

- Text Input (String, Optional): When using Input String, the node accepts text data from a previous node.
- Data Loader Input (Config, Optional): When using Data Loaders, the node fetches content directly from external systems.
## 📤 Outputs
The node generates two standard properties for downstream nodes:
- `page_content`: Extracted or chunked text ready for AI processing.
- `metadata`: Source information (file name, creation date, document type, etc.).
If the operation is set to "Split into chunks", multiple records are output (one per chunk).
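The property names match the fields of LangChain's `Document` objects, so downstream nodes can expect records shaped like this sketch (the metadata values are invented for illustration):

```python
# Shape of a single output record: page_content carries the text,
# metadata carries source information. All values here are made up.
from langchain_core.documents import Document

record = Document(
    page_content="Extracted or chunked text ready for AI processing.",
    metadata={"source": "report.pdf", "created": "2024-01-15", "type": "pdf"},
)

print(record.page_content)  # text to feed into the AI model
print(record.metadata)      # e.g. file name, creation date, document type
```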
## ⚙️ Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| Data Source | Dropdown | ✅ Yes | Data Loaders | Select the source of data: Data Loaders (external sources) or Input String (workflow text). |
| Input Property Name | String | No | – | Property name of the text input from the previous node (only used if Input String is selected). |
| Data Loader | Dropdown | No | – | Choose a specific integration (e.g., Airtable, Confluence, CSV, etc.). |
| Operation | Dropdown | No | Split into chunks | Processing mode: Single value (full document) or Split into chunks. |
| chunkSize | Number | No | 1000 | Size of each chunk (1–10,000 characters). Active only if chunking is enabled. |
| chunkOverlap | Number | No | 200 | Overlap between chunks (0–1,000 characters). Prevents loss of context. |
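Assuming the node's chunking behaves like LangChain's recursive character splitter (an assumption, since the node's internals aren't documented here), the defaults above correspond roughly to this configuration:

```python
# Rough equivalent of the defaults: Operation = Split into chunks,
# chunkSize = 1000, chunkOverlap = 200, taking the Input String path.
# The input_text value is a placeholder for data from the previous node.
from langchain_text_splitters import RecursiveCharacterTextSplitter

input_text = "..."  # text arriving via the configured Input Property Name

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # chunkSize parameter (1-10,000 characters)
    chunk_overlap=200,  # chunkOverlap parameter (0-1,000 characters)
)

# "Split into chunks" yields one record per chunk; "Single value" would
# instead pass the whole document through as a single record.
records = splitter.create_documents([input_text], metadatas=[{"source": "workflow"}])
```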
## 📌 Best Practices
- Chunking (see the splitter presets sketched after this list):
  - Q&A Systems: size 500–800, overlap 150–200.
  - Summarization: size 2000–3000, overlap 100.
  - Analysis: size 1000–1500, overlap 300–400.
- Performance: Process documents in batches. Keep chunk sizes aligned with your AI model's token limits.
- Output Handling: Always reference `page_content` and `metadata` in downstream nodes.
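The chunking recommendations can be captured as splitter presets. A sketch, assuming LangChain's recursive character splitter; the concrete numbers are just midpoints of the recommended ranges, not required values:

```python
# Hypothetical presets derived from the recommended ranges above.
from langchain_text_splitters import RecursiveCharacterTextSplitter

PRESETS = {
    "qa":            {"chunk_size": 650,  "chunk_overlap": 175},  # Q&A: 500-800 / 150-200
    "summarization": {"chunk_size": 2500, "chunk_overlap": 100},  # Summarization: 2000-3000 / 100
    "analysis":      {"chunk_size": 1250, "chunk_overlap": 350},  # Analysis: 1000-1500 / 300-400
}

def make_splitter(task: str) -> RecursiveCharacterTextSplitter:
    """Return a text splitter tuned to one of the presets above."""
    return RecursiveCharacterTextSplitter(**PRESETS[task])

qa_chunks = make_splitter("qa").split_text("...long source text...")
```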
## 🧪 Test Cases
- Given: Input string `"Customer review: Product arrived late."`, Operation = Split into chunks, chunkSize = 20 (reproduced in the sketch after this list).
  → Expected: Output chunks of max 20 characters, with overlap.
- Given: DirectoryLoader pointing to 5 contracts.
  → Expected: 5 documents processed into `page_content` chunks, with `metadata` showing file names.
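A sketch of the first case run against LangChain's splitter directly. One caveat: the splitter rejects an overlap larger than the chunk size, so the node's default overlap of 200 cannot apply when chunkSize is 20; a small overlap of 5 is assumed here, and the node's exact splitting behavior may differ.

```python
# First test case: split a short review into chunks of at most 20 characters.
# chunk_overlap is 5 because the splitter requires overlap < chunk size.
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = "Customer review: Product arrived late."

splitter = RecursiveCharacterTextSplitter(chunk_size=20, chunk_overlap=5)
chunks = splitter.split_text(text)

assert all(len(c) <= 20 for c in chunks)  # every chunk respects the 20-char limit
print(chunks)  # e.g. ['Customer review:', 'Product arrived', 'late.']
```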