Overview

The Crawl Web block is designed to systematically explore websites and extract (nested) links for various purposes, such as analysis, SEO enhancements, or competitive research.


Inputs & Outputs

I/O      Feature                       Type      Simple Explanation
input    url                           string    The URL of the website you wish to crawl.
input    depth                         number    How many layers deep the crawler will go; the maximum value is 3.
input    domain_whitelist (optional)   list      Restricts crawling to the specified domains only.
output   url_list                      string[]  A collection of all URLs found during the crawl.
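Conceptually, the block performs a breadth-first, depth-limited crawl. The sketch below illustrates that behavior only; it is not the block's actual implementation, and the `fetch_links` helper (which would return the absolute URLs found on a page) is hypothetical.

```python
from urllib.parse import urlparse

def crawl(url, depth, domain_whitelist=None, fetch_links=None):
    """Breadth-first crawl up to `depth` levels, returning every URL found."""
    allowed = {d.lower() for d in (domain_whitelist or [])}
    seen, frontier, found = {url}, [url], []
    for _ in range(depth):
        next_frontier = []
        for page in frontier:
            for link in fetch_links(page):  # absolute URLs on the page
                domain = urlparse(link).netloc.lower()
                if allowed and domain not in allowed:
                    continue  # skip links outside the domain whitelist
                found.append(link)
                if link not in seen:   # only follow each URL once
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return found
```

Note that `url_list` collects every link encountered at every level, while the `seen` set keeps the crawler from re-fetching the same page twice.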

Setting a depth of 3 can significantly increase processing time: the crawler follows nested links, so the number of pages visited can grow exponentially with depth. Use it wisely.
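To see why, here is a rough upper bound on the pages visited, assuming (hypothetically) an average of `links_per_page` links on each page:

```python
def max_pages(links_per_page: int, depth: int) -> int:
    """Worst-case number of pages visited by a depth-limited crawl.

    Depth 1 visits the start page's links, depth 2 visits their links, etc.
    """
    return sum(links_per_page ** d for d in range(1, depth + 1))

print(max_pages(20, 1))  # 20
print(max_pages(20, 3))  # 20 + 400 + 8000 = 8420
```

With just 20 links per page, moving from depth 1 to depth 3 grows the worst case from 20 pages to 8,420.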

The domain whitelist accepts domains such as “keyflow.space”, “www.keyflow.space”, or “docs.keyflow.space”. You can also pass a full URL such as “https://keyflow.space”; the crawler automatically reduces it to its domain name.
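That normalization step can be sketched with the standard library's `urllib.parse` (again, an illustration of the described behavior, not the block's actual code):

```python
from urllib.parse import urlparse

def normalize_whitelist_entry(entry: str) -> str:
    """Reduce a whitelist entry (bare domain or full URL) to a domain name."""
    # urlparse only fills `netloc` when a scheme (or "//") is present,
    # so prefix bare domains like "keyflow.space" before parsing.
    parsed = urlparse(entry if "//" in entry else f"//{entry}")
    return parsed.netloc.lower()

print(normalize_whitelist_entry("https://keyflow.space"))  # keyflow.space
print(normalize_whitelist_entry("docs.keyflow.space"))     # docs.keyflow.space
```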


Use Cases

Consider how this block can be beneficial in various scenarios:

  • Link Collection: Ideal for gathering all hyperlinks on a targeted website, which can assist marketers or researchers looking for resources.
  • Website Structure Analysis: Use this tool to visualize and understand how different pages interconnect within a site, aiding developers in optimizations.
  • SEO and Web Audit: Conduct thorough checks on internal and external links found across a website to improve SEO strategies or identify broken links.
  • Competitive Insight: Map out the link structure of competitor sites or analyze your own sites for strategic content placement based on observed patterns.

In summary, whenever you need to surface or analyze the link structure of a website, the Crawl Web block proves invaluable!