Web Search Node - Synthreo Builder

Web Search node for Builder - perform live internet searches from within an AI agent workflow to retrieve current information, news, and web data for LLM context enrichment.

The Web Search node enables your workflow to automatically search the internet and extract information from web pages. It supports two distinct modes: performing live searches using the Bing Search Engine API, and scraping specific known web pages using XPath selectors. Both modes allow workflows to gather current, real-world data without manual research.

Use this node when your AI agent needs up-to-date information from the web, when you want to monitor specific websites for changes, or when you need to enrich workflow data with content from publicly available sources.


The Web Search node acts as your workflow’s research layer, automatically:

  • Searching the web using the Bing Search Engine API to find pages matching a query.
  • Extracting specific data from known web pages using XPath selectors.
  • Processing search results and formatting them for use in subsequent workflow steps.
  • Handling multiple search queries and organizing results for downstream processing.

Field Name | Type | Default | Description
mode | Dropdown | Custom URL with XPath parser | The search and extraction method to use.

Mode | Description
Custom URL with XPath parser | Scrapes data from specific, known web pages using XPath selectors to extract precise content from defined page elements.
Bing Search Engine API | Performs live web searches using Microsoft’s Bing search engine and returns structured results including titles, descriptions, and URLs.

Use the Custom URL with XPath parser mode when you know exactly which web pages to extract data from and need to pull specific elements rather than full page content.

Typical use cases:

  • Monitoring competitor product pages for price changes.
  • Extracting news headlines from a specific publication’s website.
  • Pulling contact details from company directory pages.
  • Reading structured data from a site that does not provide an API.

Advantages:

  • Precise extraction from specific page elements using XPath expressions.
  • No API costs or usage limits.
  • Works with any publicly accessible web page.
  • Highly targeted - pulls only the data you need.

XPath (XML Path Language) is a query language for selecting elements within an HTML or XML document. To configure the XPath parser:

  1. Identify the page element containing the data you need (use browser developer tools to inspect the page).
  2. Write or generate an XPath expression that targets that element (for example, //div[@class="price"]/text() to extract price text from a div with class price).
  3. Enter the target URL and XPath expression in the node configuration.

Common XPath examples:

//h1/text() - Extract the main page heading
//div[@id="product-price"]/text() - Extract the text of the div whose id is product-price
//table//tr/td[2]/text() - Extract the text of the second cell in every table row
//a[@class="article-link"]/@href - Extract href values from links with a given class
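
The expressions above can be tried offline before they go into the node. The following is a minimal sketch, not the node's own parser: it uses Python's standard-library ElementTree, which supports only a subset of XPath, so the hypothetical helper `evaluate` handles a trailing `/text()` or `/@attr` step by hand. The sample page and its values are invented for illustration, and real pages are rarely well-formed XML, so an HTML-aware parser would be needed in practice.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page used only for this illustration.
SAMPLE_PAGE = """
<html>
  <body>
    <h1>Acme Widget</h1>
    <div id="product-price">$24.99</div>
    <table>
      <tr><td>SKU</td><td>W-100</td></tr>
      <tr><td>Stock</td><td>12</td></tr>
    </table>
    <a class="article-link" href="/news/launch">Launch</a>
    <a class="other" href="/about">About</a>
  </body>
</html>
""".strip()


def evaluate(root, xpath):
    """Evaluate an element path with an optional trailing /text() or /@attr step."""
    path, _, last_step = xpath.rpartition("/")
    if last_step == "text()":
        # ElementTree has no text() step, so read .text from each matched element.
        return [(el.text or "").strip() for el in root.findall("." + path)]
    if last_step.startswith("@"):
        name = last_step[1:]
        matches = root.findall("." + path)
        return [el.get(name) for el in matches if el.get(name) is not None]
    return root.findall("." + xpath)


root = ET.fromstring(SAMPLE_PAGE)
for expression in (
    '//h1/text()',
    '//div[@id="product-price"]/text()',
    '//table//tr/td[2]/text()',
    '//a[@class="article-link"]/@href',
):
    print(expression, "->", evaluate(root, expression))
```

Running the four expressions against a saved copy of the target page is a quick way to confirm that each one selects what its description says before configuring the node.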

Use the Bing Search Engine API mode when you need to search the entire web rather than scrape known pages. This is the right choice for research, content discovery, brand monitoring, and competitive intelligence gathering.

Typical use cases:

  • Finding companies in specific industries for lead generation.
  • Researching trending topics or keywords in a given domain.
  • Monitoring brand mentions across the web.
  • Discovering new competitors or market opportunities.
  • Gathering broad intelligence on any subject where the sources are not known in advance.

Advantages:

  • Access to Bing’s search index covering billions of web pages.
  • Structured results with titles, descriptions, and URLs for each result.
  • Supports advanced search filtering and query customization.
  • Enterprise-grade infrastructure for reliable search at scale.

To use the Bing Search Engine API mode, you need a Bing Search API key from Microsoft Azure Cognitive Services. Configure the node with:

  • Your Bing API key in the authentication field.
  • The search query (supports dynamic variables from upstream nodes, for example {{searchTerm}}).
  • Optional result count and filtering parameters.
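
To make the dynamic-variable idea concrete, here is a small sketch of how a `{{name}}` placeholder in a query could be filled from upstream values. How Builder itself resolves placeholders is not described here, so the function `render_query` and its error behavior are assumptions made purely for illustration.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_query(template, variables):
    """Replace each {{name}} placeholder with its value; fail loudly on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


print(render_query("{{searchTerm}}", {"searchTerm": "industrial 3D printers"}))
print(render_query('"{{brandName}}" -site:owneddomain.com', {"brandName": "Acme"}))
```

Raising on a missing variable, rather than silently sending a query that still contains braces, makes a misconfigured upstream node easier to spot during testing.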

The node outputs the search results or extracted content as structured data for downstream nodes.

Output Format (Bing API example):

{
  "web_search_result": [
    {
      "title": "Result page title",
      "description": "Brief description or snippet from the page",
      "url": "https://example.com/page"
    }
  ]
}

Output Format (XPath example):

{
  "web_search_result": "$24.99"
}
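
Because the two example outputs have different shapes (a list of objects versus a single string), downstream logic may want to treat them uniformly. The sketch below assumes the output key is `web_search_result`, as in the examples above; the helper `result_items` and the `value` field it introduces are hypothetical and not part of the node.

```python
import json


def result_items(node_output, key="web_search_result"):
    """Return the node payload as a list of dicts, whichever mode produced it."""
    payload = node_output.get(key)
    if payload is None:
        return []
    if isinstance(payload, str):
        # XPath example: a single extracted string.
        return [{"value": payload}]
    # Bing example: already a list of result objects.
    return list(payload)


bing_output = json.loads("""
{
  "web_search_result": [
    {
      "title": "Result page title",
      "description": "Brief description or snippet from the page",
      "url": "https://example.com/page"
    }
  ]
}
""")
xpath_output = json.loads('{"web_search_result": "$24.99"}')

for item in result_items(bing_output):
    print(item["title"], item["url"])
print(result_items(xpath_output))
```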

A marketing agency monitors competitor pricing across multiple e-commerce sites daily to adjust client pricing strategies.

Configuration:

  • Mode: Custom URL with XPath parser
  • Target URLs: competitor product pages
  • XPath: expression targeting the price element on each page

Outcome: The workflow visits each product page, extracts the current price, and passes the data to a comparison and reporting node.
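
The comparison step in this use case could look roughly like the following sketch. It assumes scraped prices arrive as strings such as "$24.99", with a period as the decimal separator, and that the previous run's prices are available keyed by URL; `parse_price`, `price_changes`, and the example URLs are illustrative names, not Builder features.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Pull the first number out of a scraped price string such as '$24.99'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    # Drop thousands separators before converting.
    return Decimal(match.group(0).replace(",", ""))


def price_changes(previous, scraped):
    """Return {url: (old_price, new_price)} for every URL whose price changed."""
    changes = {}
    for url, raw in scraped.items():
        current = parse_price(raw)
        old = previous.get(url)
        if old is not None and old != current:
            changes[url] = (old, current)
    return changes


previous = {"https://example.com/widget": Decimal("24.99")}
scraped = {
    "https://example.com/widget": "$22.49",
    "https://example.com/gadget": "1,299.00 USD",
}
print(price_changes(previous, scraped))
```

Using `Decimal` rather than `float` avoids rounding surprises when comparing currency values between runs.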

A B2B sales team automatically finds and qualifies potential customers by searching for companies that use specific technologies.

Configuration:

  • Mode: Bing Search Engine API
  • Search Query: {{targetTechnology}} vendor site:linkedin.com
  • Result Count: 10 per query

Outcome: The workflow returns a list of companies matching the search criteria, which are then passed to an LLM node for qualification scoring.

A content marketing team tracks brand mentions and industry trend coverage across the web.

Configuration:

  • Mode: Bing Search Engine API
  • Search Query: "{{brandName}}" -site:owneddomain.com
  • Scheduled to run daily

Outcome: New brand mentions and industry articles are collected and passed to an email or Slack notification node for team review.
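
For a daily run, only mentions that were not reported on an earlier day are usually worth sending. A minimal sketch of that filtering step follows, assuming results have the `url` field shown in the Bing output example and that the set of previously seen URLs is persisted somewhere between runs; the helper names and the normalization rules are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), parts.query, "")
    )


def new_mentions(results, seen_urls):
    """Return results whose URL has not been seen before, updating seen_urls in place."""
    fresh = []
    for item in results:
        key = normalize(item["url"])
        if key not in seen_urls:
            seen_urls.add(key)
            fresh.append(item)
    return fresh


seen = set()
day_one = [{"title": "Review", "url": "https://Example.com/page/"}]
day_two = [
    {"title": "Review", "url": "https://example.com/page#top"},
    {"title": "Roundup", "url": "https://example.org/roundup"},
]
print([m["title"] for m in new_mentions(day_one, seen)])
print([m["title"] for m in new_mentions(day_two, seen)])
```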

A real estate investment team pulls property listing data from multiple listing sites.

Configuration:

  • Mode: Custom URL with XPath parser
  • Target URLs: specific property listing pages
  • XPath: expressions targeting price, square footage, and address elements

Outcome: Structured property data is extracted and stored in a database node for comparative analysis.


  1. Drag the Web Search node from the node panel onto your workflow canvas.
  2. Connect it to the node that provides search terms or URLs.
  3. Click the node to open the settings panel.

For Custom URL with XPath parser mode:

  1. In the Type dropdown, select Custom URL with XPath parser.
  2. Enter the target URL in the URL field. Use {{variableName}} to make URLs dynamic.
  3. Enter your XPath expression to target the specific page element you want to extract.
  4. Test with a sample URL to verify the XPath returns the expected content.
  5. Set a descriptive Result Property Name and save.

For Bing Search Engine API mode:

  1. In the Type dropdown, select Bing Search Engine API.
  2. Enter your Bing API key in the authentication field.
  3. Enter the search query in the query field. Use {{searchTerm}} to pass dynamic queries from upstream nodes.
  4. Set the desired number of results per query.
  5. Configure any additional filtering options shown in the settings panel.
  6. Test with a sample query, review the results, and save.

Typically receives input from:

  • Form inputs or triggers - providing search terms or target URLs.
  • Database nodes - supplying lists of competitor URLs or monitored keywords.
  • LLM nodes - generating search queries based on prior analysis.

Typically sends output to:

  • LangChain - for chunking and preparing extracted content for AI analysis.
  • OpenAI GPT - for summarizing or classifying search results.
  • CRUD Integration - for storing findings in Monday.com, Airtable, or similar platforms.
  • Send Email / Send SMS - for alerting teams to relevant search findings.
  • HTTP Client - for making follow-up API calls to URLs found in search results.

Issue | Likely Cause | Resolution
XPath returns empty results | XPath expression is incorrect or the page structure has changed | Use browser developer tools to re-inspect the page and update the XPath expression.
Bing API returns no results | API key is invalid or the query has no matching results | Verify the API key in the Azure portal and test the query directly in a Bing search to confirm results exist.
Page content changes between runs | The target site has updated its HTML structure | Re-inspect the page and update the XPath selector to match the new structure.
Rate limiting errors from Bing | Too many API calls in a short window | Reduce the frequency of workflow runs or add a delay between batched search queries.
Search results are irrelevant | Query is too broad | Refine the search query with more specific terms, site filters, or date restrictions.
Dynamic page content not extracted | Site renders content via JavaScript | For JavaScript-heavy pages, consider using the URL Scraper node with the Chrome Browser engine instead.
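
The rate-limiting resolution in the table (adding a delay between batched queries) can be sketched outside the workflow as follows. Everything here is hypothetical: `search` stands in for whatever performs one query, `RateLimited` is an invented exception for a rate-limit response, and the delay values are arbitrary. Inside Builder the equivalent spacing would be configured in the workflow itself.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by `search` when the API reports rate limiting."""


def run_queries(queries, search, delay=1.0, max_retries=3, sleep=time.sleep):
    """Run queries one by one with a fixed gap, backing off when rate limited."""
    results = {}
    for index, query in enumerate(queries):
        if index:
            sleep(delay)  # fixed gap between consecutive queries
        for attempt in range(max_retries + 1):
            try:
                results[query] = search(query)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise
                # Exponential backoff: 2x, 4x, 8x the base delay.
                sleep(delay * 2 ** (attempt + 1))
    return results


def fake_search(query):
    """Stand-in search function that simply echoes the query."""
    return [f"result for {query}"]


# Passing a no-op sleep keeps this demonstration instant.
print(run_queries(["alpha", "beta"], fake_search, sleep=lambda seconds: None))
```

The `sleep` parameter is injectable so the pacing logic can be checked without actually waiting.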

  • Use Custom URL with XPath parser when you have specific, known pages to monitor and need precise data from defined elements.
  • Use Bing Search Engine API when you need broad web coverage and the relevant sources are not known in advance.
  • The Custom URL method incurs no API costs and is not subject to Bing API rate limits, making it more economical for high-volume monitoring of known pages.
  • Test XPath expressions against live pages before deploying to production. Page structures change, and expressions that work today may break when a site is redesigned.
  • Validate search results with spot checks on a regular basis.
  • Set up error handling (for example, using a conditional node) to detect when expected content is missing.
  • Limit the number of Bing API results to what is actually needed to reduce processing time and API costs.
  • Use specific search queries with additional filters to reduce irrelevant results that require downstream filtering.
  • Schedule search-intensive workflows during off-peak hours.
  • Respect the terms of service of websites you scrape with the XPath method.
  • Only extract publicly available information.
  • Implement reasonable delays between requests to avoid placing excessive load on target servers.

  • URL Scraper - for full page content extraction when XPath is too narrow or when PDF processing is needed.
  • HTTP Client - for calling a structured API endpoint at a URL discovered through search results.
  • LangChain - for preparing extracted web content for AI model consumption.
  • OpenAI GPT - for analyzing, summarizing, or classifying content retrieved by this node.