Web Search Node - Synthreo Builder

Web Search node for Builder - perform live internet searches from within an AI agent workflow to retrieve current information, news, and web data for LLM context enrichment.

Overview

The Web Search node enables your workflow to automatically search the internet and extract information from web pages. It supports two distinct modes: performing live searches using the Bing Search Engine API, and scraping specific known web pages using XPath selectors. Both modes allow workflows to gather current, real-world data without manual research.

Use this node when your AI agent needs up-to-date information from the web, when you want to monitor specific websites for changes, or when you need to enrich workflow data with content from publicly available sources.

What This Node Does

The Web Search node acts as your workflow’s research layer, automatically:

Searching the web using the Bing Search Engine API to find pages matching a query.
Extracting specific data from known web pages using XPath selectors.
Processing search results and formatting them for use in subsequent workflow steps.
Handling multiple search queries and organizing results for downstream processing.

Parameters

Search Method Selection

Field Name	Type	Default	Description
`mode`	Dropdown	Custom URL with XPath parser	The search and extraction method to use.

Mode	Description
Custom URL with XPath parser	Scrapes data from specific, known web pages using XPath selectors to extract precise content from defined page elements.
Bing Search Engine API	Performs live web searches using Microsoft’s Bing search engine and returns structured results including titles, descriptions, and URLs.

Custom URL with XPath Parser

Use this mode when you know exactly which web pages to extract data from and need to pull specific elements rather than full page content.

When to Use

Monitoring competitor product pages for price changes.
Extracting news headlines from a specific publication’s website.
Pulling contact details from company directory pages.
Reading structured data from a site that does not provide an API.

Key Benefits

Precise extraction from specific page elements using XPath expressions.
No API costs or usage limits.
Works with any publicly accessible web page.
Highly targeted - pulls only the data you need.

XPath Configuration

XPath (XML Path Language) is a query language for selecting elements within an HTML or XML document. To configure the XPath parser:

Identify the page element containing the data you need (use browser developer tools to inspect the page).
Write or generate an XPath expression that targets that element (for example, //div[@class="price"]/text() to extract price text from a div with class price).
Enter the target URL and XPath expression in the node configuration.

Common XPath examples:

//h1/text()                          - Extract the main page heading
//div[@id="product-price"]/text()    - Extract text from a specific ID
//table//tr/td[2]/text()             - Extract the second column of every table row
//a[@class="article-link"]/@href     - Extract href values from links with a given class
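
As a rough illustration of what these expressions select, the sketch below runs close equivalents against a small, made-up page using Python's standard-library xml.etree.ElementTree. This is not how the node is implemented. ElementTree supports only a subset of XPath, so text() and @href are emulated here with .text and .get(), and the sample markup and its values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page; real pages are usually messier HTML.
SAMPLE_PAGE = """
<html>
  <body>
    <h1>Acme Widget</h1>
    <div id="product-price">$24.99</div>
    <table>
      <tr><td>SKU</td><td>AW-100</td></tr>
      <tr><td>Stock</td><td>12</td></tr>
    </table>
    <a class="article-link" href="/news/launch">Launch news</a>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE_PAGE)

# Equivalent of //h1/text()
heading = root.find(".//h1").text

# Equivalent of //div[@id="product-price"]/text()
price = root.find(".//div[@id='product-price']").text

# Equivalent of //table//tr/td[2]/text()
second_column = [td.text for td in root.findall(".//table//tr/td[2]")]

# Equivalent of //a[@class="article-link"]/@href
links = [a.get("href") for a in root.findall(".//a[@class='article-link']")]

print(heading, price, second_column, links)
```

Trying an expression against a saved copy of the page in this way can help confirm that it targets the intended element before it is entered in the node configuration.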

Bing Search Engine API

Use this mode when you need to search the entire web rather than scrape known pages. This is the right choice for research, content discovery, brand monitoring, and competitive intelligence gathering.

When to Use

Finding companies in specific industries for lead generation.
Researching trending topics or keywords in a given domain.
Monitoring brand mentions across the web.
Discovering new competitors or market opportunities.
Gathering broad intelligence on any subject where the sources are not known in advance.

Key Benefits

Access to Bing’s search index covering billions of web pages.
Structured results with titles, descriptions, and URLs for each result.
Supports advanced search filtering and query customization.
Enterprise-grade infrastructure for reliable search at scale.

API Configuration

To use the Bing Search Engine API mode, you need a Bing Search API key from Microsoft Azure Cognitive Services. Configure the node with:

Your Bing API key in the authentication field.
The search query (supports dynamic variables from upstream nodes, for example {{searchTerm}}).
Optional result count and filtering parameters.
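
For orientation, the sketch below shows how a request carrying these three settings could be assembled by hand, following the public Bing Web Search v7 REST conventions (the Ocp-Apim-Subscription-Key header and the q and count query parameters). How the node calls Bing internally is not documented here, so the endpoint choice, the function name build_bing_request, and the placeholder key are assumptions for illustration only. The request is built but not sent.

```python
import urllib.parse
import urllib.request

# Public Bing Web Search v7 endpoint; whether the node uses this exact URL is an assumption.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_bing_request(api_key: str, query: str, count: int = 10) -> urllib.request.Request:
    """Build (but do not send) a GET request for a Bing web search."""
    params = urllib.parse.urlencode({"q": query, "count": count})
    request = urllib.request.Request(f"{BING_ENDPOINT}?{params}")
    # Bing authenticates with the subscription key in this header.
    request.add_header("Ocp-Apim-Subscription-Key", api_key)
    return request


# "YOUR_API_KEY" is a placeholder, not a real credential.
request = build_bing_request("YOUR_API_KEY", "workflow automation vendors", count=5)
print(request.full_url)
```

Keeping the key in a header rather than in the URL means it does not end up in logged query strings.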

Output

The node outputs the search results or extracted content as structured data for downstream nodes.

Output Format (Bing API example):

{
  "web_search_result": [
    {
      "title": "Result page title",
      "description": "Brief description or snippet from the page",
      "url": "https://example.com/page"
    }
  ]
}

Output Format (XPath example):

{
  "web_search_result": "$24.99"
}
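
Because the two modes return different shapes under the same property (a list of objects for Bing, a single string for XPath), downstream logic may need to handle both. A minimal Python sketch, assuming the property is named web_search_result as in the examples above; the function name normalize_results and the "text" key are hypothetical:

```python
from typing import Any


def normalize_results(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Return a list of result dicts regardless of which mode produced the payload."""
    value = payload.get("web_search_result")
    if value is None:
        return []
    if isinstance(value, str):
        # XPath mode: a single extracted string, wrapped under a hypothetical "text" key.
        return [{"text": value}]
    # Bing mode: a list of objects with "title", "description", and "url".
    return [dict(item) for item in value]


print(normalize_results({"web_search_result": "$24.99"}))
```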

Real-World Use Cases

Market Research Automation

A marketing agency monitors competitor pricing across multiple e-commerce sites daily to adjust client pricing strategies.

Configuration:

Mode: Custom URL with XPath parser
Target URLs: competitor product pages
XPath: expression targeting the price element on each page

Outcome: The workflow visits each product page, extracts the current price, and passes the data to a comparison and reporting node.

Lead Generation Through Web Research

A B2B sales team automatically finds and qualifies potential customers by searching for companies that use specific technologies.

Configuration:

Mode: Bing Search Engine API
Search Query: {{targetTechnology}} vendor site:linkedin.com
Result Count: 10 per query

Outcome: The workflow returns a list of companies matching the search criteria, which are then passed to an LLM node for qualification scoring.

Content Research and Brand Monitoring

A content marketing team tracks brand mentions and industry trend coverage across the web.

Configuration:

Mode: Bing Search Engine API
Search Query: "{{brandName}}" -site:owneddomain.com
Scheduled to run daily

Outcome: New brand mentions and industry articles are collected and passed to an email or Slack notification node for team review.

Real Estate Market Data Collection

A real estate investment team pulls property listing data from multiple listing sites.

Configuration:

Mode: Custom URL with XPath parser
Target URLs: specific property listing pages
XPath: expressions targeting price, square footage, and address elements

Outcome: Structured property data is extracted and stored in a database node for comparative analysis.

Step-by-Step Configuration

Setting Up the Node

Drag the Web Search node from the node panel onto your workflow canvas.
Connect it to the node that provides search terms or URLs.
Click the node to open the settings panel.

Configuring Custom URL with XPath Parser

In the Type dropdown (the `mode` parameter described under Parameters), select Custom URL with XPath parser.
Enter the target URL in the URL field. Use {{variableName}} to make URLs dynamic.
Enter your XPath expression to target the specific page element you want to extract.
Test with a sample URL to verify the XPath returns the expected content.
Set a descriptive Result Property Name and save.
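
To illustrate the idea behind {{variableName}} placeholders, the sketch below substitutes values into a URL or query template. It is a simplified stand-in, not Builder's actual substitution logic: the render function, the sample URL, and the choice to leave unknown placeholders untouched are all assumptions.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} with its value; unknown names are left untouched."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return variables.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


url = render("https://example.com/products/{{productId}}", {"productId": "aw-100"})
print(url)  # https://example.com/products/aw-100
```

The same approach applies to search queries, for example rendering "{{brandName}}" -site:owneddomain.com with a brand name supplied by an upstream node.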

Configuring Bing Search Engine API

In the Type dropdown (the `mode` parameter described under Parameters), select Bing Search Engine API.
Enter your Bing API key in the authentication field.
Enter the search query in the query field. Use {{searchTerm}} to pass dynamic queries from upstream nodes.
Set the desired number of results per query.
Configure any additional filtering options shown in the settings panel.
Test with a sample query, review the results, and save.

Integration with Other Nodes

Recommended Upstream Nodes

Form inputs or triggers - providing search terms or target URLs.
Database nodes - supplying lists of competitor URLs or monitored keywords.
LLM nodes - generating search queries based on prior analysis.

Recommended Downstream Nodes

LangChain - for chunking and preparing extracted content for AI analysis.
OpenAI GPT - for summarizing or classifying search results.
CRUD Integration - for storing findings in Monday.com, Airtable, or similar platforms.
Send Email / Send SMS - for alerting teams to relevant search findings.
HTTP Client - for making follow-up API calls to URLs found in search results.

Troubleshooting

Issue	Likely Cause	Resolution
XPath returns empty results	XPath expression is incorrect or the page structure has changed	Use browser developer tools to re-inspect the page and update the XPath expression.
Bing API returns no results	API key is invalid or the query has no matching results	Verify the API key in the Azure portal, then run the same query directly on Bing to confirm that results exist.
Page content changes between runs	The target site has updated its HTML structure	Re-inspect the page and update the XPath selector to match the new structure.
Rate limiting errors from Bing	Too many API calls in a short window	Reduce the frequency of workflow runs or add a delay between batched search queries.
Search results are irrelevant	Query is too broad	Refine the search query with more specific terms, site filters, or date restrictions.
Dynamic page content not extracted	Site renders content via JavaScript	For JavaScript-heavy pages, consider using the URL Scraper node with the Chrome Browser engine instead.
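
The rate-limiting resolution above (adding a delay between batched queries) can be sketched outside Builder as a retry with exponential backoff. This is an illustrative pattern only: the RateLimited exception and the search callable are hypothetical stand-ins for whatever performs the search, and the delays are example values.

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error raised by a search call when the API throttles it."""


def search_with_backoff(
    search: Callable[[str], list],
    query: str,
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call search(query), waiting base_delay, then 2x, then 4x after throttled attempts."""
    for attempt in range(retries + 1):
        try:
            return search(query)
        except RateLimited:
            if attempt == retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    return []  # Not reached; keeps type checkers satisfied.
```

Passing sleep as a parameter makes the waiting behavior easy to replace in tests without actually pausing.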

Best Practices

Search Method Selection

Use Custom URL with XPath parser when you have specific, known pages to monitor and need precise data from defined elements.
Use Bing Search Engine API when you need broad web coverage and the relevant sources are not known in advance.
The Custom URL method incurs no Bing API costs and is not subject to Bing API quotas, making it more economical for high-volume monitoring of known pages. Target sites may still throttle frequent requests.

Data Quality

Test XPath expressions against live pages before deploying to production. Page structures change, and expressions that work today may break when a site is redesigned.
Validate search results with spot checks on a regular basis.
Set up error handling (for example, using a conditional node) to detect when expected content is missing.
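
The kind of check a conditional node might perform can be sketched as follows. The function name find_problems and the specific rules (non-empty text, at least one result, a url on every result) are assumptions chosen for illustration, and the default property name follows the Output examples.

```python
def find_problems(payload: dict, key: str = "web_search_result") -> list[str]:
    """Return human-readable reasons the payload looks incomplete (empty list means OK)."""
    if key not in payload:
        return [f"missing '{key}' property"]
    value = payload[key]
    if isinstance(value, str):
        # XPath mode: whitespace-only text usually means the selector matched nothing useful.
        return [] if value.strip() else ["XPath extraction returned empty text"]
    if not value:
        return ["search returned no results"]
    problems = []
    for index, item in enumerate(value):
        if not item.get("url"):
            problems.append(f"result {index} has no url")
    return problems


print(find_problems({"web_search_result": "   "}))
```

An empty list indicates that the payload passed every check; any other return value can be routed to an alerting or fallback branch.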

Performance Optimization

Limit the number of Bing API results to what is actually needed to reduce processing time and API costs.
Use specific search queries with additional filters to reduce irrelevant results that require downstream filtering.
Schedule search-intensive workflows during off-peak hours.

Compliance and Ethics

Respect the terms of service of websites you scrape with the XPath method.
Only extract publicly available information.
Implement reasonable delays between requests to avoid placing excessive load on target servers.
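
One way to reason about "reasonable delays" is a per-host minimum gap between requests. The sketch below is a generic pattern, not a Builder feature: the HostThrottle class and the gap value are hypothetical, and the clock and sleep functions are injectable so the logic can be checked without real waiting.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum gap between requests to the same host."""

    def __init__(self, min_gap_seconds: float, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = min_gap_seconds
        self.clock = clock
        self.sleep = sleep
        self.last_request: dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting url; return the seconds waited."""
        host = urlparse(url).netloc
        now = self.clock()
        previous = self.last_request.get(host)
        delay = 0.0
        if previous is not None:
            delay = max(0.0, self.min_gap - (now - previous))
        if delay:
            self.sleep(delay)
        # Record when this request is allowed to start.
        self.last_request[host] = now + delay
        return delay
```

Calling throttle.wait(url) before each page fetch spaces out requests to a single site while leaving requests to different hosts unaffected.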

Related Nodes

URL Scraper - for full page content extraction when XPath is too narrow or when PDF processing is needed.
HTTP Client - for calling a structured API endpoint at a URL discovered through search results.
LangChain - for preparing extracted web content for AI model consumption.
OpenAI GPT - for analyzing, summarizing, or classifying content retrieved by this node.
