
LangChain

🎯 Purpose

The LangChain node connects workflows to multiple data sources and prepares documents for AI analysis.
It can load content from external systems, process raw text, and split large documents into chunks optimized for AI models.

Think of it as a document preparation assistant that transforms unstructured data into AI-ready input.

📥 Inputs

  • Text Input (String, Optional): When using Input String, the node accepts text data from a previous node.
  • Data Loader Input (Config, Optional): When using Data Loaders, the node fetches content directly from external systems.

📤 Outputs

Each output record carries two standard properties for downstream nodes:

  • page_content → Extracted or chunked text ready for AI processing.
  • metadata → Source information (file name, creation date, document type, etc.).

If the operation is set to "Split into chunks", multiple records are output (one per chunk).
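To make the one-record-per-chunk behavior concrete, here is a minimal sketch (not the node's internal code; the function name and record shape are illustrative) of how a single document becomes multiple records under "Split into chunks":

```python
def split_into_chunks(text, metadata, chunk_size=1000, chunk_overlap=200):
    """Illustrative sliding-window splitter: one record per chunk,
    each carrying page_content plus a copy of the source metadata."""
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    records = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        records.append({"page_content": chunk, "metadata": dict(metadata)})
        if start + chunk_size >= len(text):
            break  # the document is fully covered
    return records

# A 2,500-character document with the defaults yields 3 overlapping records.
records = split_into_chunks("A" * 2500, {"file_name": "report.txt"})
print(len(records))
```

Downstream nodes then read `page_content` and `metadata` from each record individually.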

βš™οΈ Parameters​

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Data Source | Dropdown | ✅ Yes | Data Loaders | Select source of data: Data Loaders (external sources) or Input String (workflow text). |
| Input Property Name | String | No | — | Property name of the text input from the previous node (only used if Input String is selected). |
| Data Loader | Dropdown | No | — | Choose a specific integration (e.g., Airtable, Confluence, CSV, etc.). |
| Operation | Dropdown | No | Split into chunks | Processing mode: Single value (full document) or Split into chunks. |
| chunkSize | Number | No | 1000 | Size of each chunk (1–10,000 characters). Active only if chunking is enabled. |
| chunkOverlap | Number | No | 200 | Overlap between chunks (0–1,000 characters). Prevents loss of context. |
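For orientation, a chunking configuration might look like the following sketch. The dict form and key names are illustrative only; in the workflow editor these are set as form fields:

```python
# Hypothetical parameter set mirroring the table above.
params = {
    "dataSource": "Input String",
    "inputPropertyName": "text",      # name of the property emitted by the previous node (assumed)
    "operation": "Split into chunks",
    "chunkSize": 1000,                # 1–10,000 characters
    "chunkOverlap": 200,              # 0–1,000 characters
}

# Overlap must stay smaller than the chunk size, or chunking cannot advance.
assert 0 <= params["chunkOverlap"] < params["chunkSize"] <= 10_000
```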

📘 Best Practices

  • Chunking:
    • Q&A Systems: 500–800 size, overlap 150–200.
    • Summarization: 2000–3000 size, overlap 100.
    • Analysis: 1000–1500 size, overlap 300–400.
  • Performance: Process documents in batches. Keep chunk sizes aligned with your AI model's token limits.
  • Output Handling: Always reference page_content and metadata in downstream nodes.
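When aligning chunk sizes with token limits, it helps to estimate how many chunks a document will produce. Assuming a simple sliding-window splitter (the function below is a sketch, not the node's implementation), the count follows from the chunk size and overlap:

```python
import math

def estimated_chunk_count(n_chars: int, chunk_size: int, chunk_overlap: int) -> int:
    """Estimate how many chunks a sliding-window splitter produces.

    Each new chunk advances by (chunk_size - chunk_overlap) characters,
    so roughly ceil((n_chars - chunk_overlap) / step) chunks are needed.
    """
    step = chunk_size - chunk_overlap
    return max(1, math.ceil((n_chars - chunk_overlap) / step))

# A 10,000-character document with a Q&A-style preset (size 800, overlap 200):
print(estimated_chunk_count(10_000, 800, 200))  # → 17
```

Larger overlaps preserve more context between chunks but increase the total chunk count, and with it the number of downstream AI calls.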

🧪 Test Cases

  • Given: Input string "Customer review: Product arrived late.", Operation = Split into chunks, chunkSize = 20.
    → Expected: Output chunks of max 20 chars with overlap.
  • Given: DirectoryLoader pointing to 5 contracts.
    → Expected: 5 documents processed into page_content chunks with metadata showing file names.
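The first test case can be checked mechanically. Assuming a plain character-window splitter and an overlap of 5 (the test case leaves the overlap unspecified, so that value is an assumption; the node's actual strategy may be separator-aware), a short script verifies the 20-character bound:

```python
def split_chars(text: str, size: int, overlap: int) -> list[str]:
    """Simple character-window splitter, used only to illustrate the test case."""
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

chunks = split_chars("Customer review: Product arrived late.", size=20, overlap=5)
assert all(len(c) <= 20 for c in chunks)   # max-size bound holds
assert chunks[0][15:] == chunks[1][:5]     # consecutive chunks overlap by 5 chars
```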