Crawl Tool

Finds links on the current page using a selector and returns new download tasks to continue the crawl; supports notifying on retries.

Arguments

| Name | Type | Description |
| --- | --- | --- |
| task | DownloadTask | Required. A task from the previous Start or Crawl tool response |
| selector | String | Required. Selector for getting interesting links on a web page |
| attributeName | String | Optional. Attribute name to get data from. Use val to get inner text. Default value: href |

Remarks

The selector argument has the following format: CSS|XPATH: selector. The first part defines the selector type; the second part is a selector expression written in that type's syntax. Supported types: CSS and XPATH.
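The format described above can be illustrated with a small client-side check. This is only a sketch: `parse_selector` and `SUPPORTED_TYPES` are hypothetical names that are not part of the tool, and the case-insensitive handling of the type prefix is an assumption. It splits on the first colon only, because XPath expressions can contain colons themselves (for example in axes such as `following-sibling::a`).

```python
# Hypothetical helper, not part of the Crawl tool: validates a selector string
# of the form "CSS: <expr>" or "XPATH: <expr>" before it is sent.
SUPPORTED_TYPES = ("CSS", "XPATH")


def parse_selector(raw: str) -> tuple[str, str]:
    """Split 'TYPE: expression' into (TYPE, expression)."""
    # Split on the FIRST colon only; the expression may contain more colons.
    kind, sep, expression = raw.partition(":")
    kind = kind.strip().upper()  # assumption: the prefix is case-insensitive
    expression = expression.strip()
    if not sep or kind not in SUPPORTED_TYPES or not expression:
        raise ValueError(f"expected 'CSS|XPATH: selector', got {raw!r}")
    return kind, expression
```

For example, `parse_selector("CSS: a.next")` yields the type `CSS` and the expression `a.next`, while a string with no type prefix raises `ValueError`.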

DownloadTask

Represents a single page download request produced by a crawl or scrape job.

Fields:

| Name | Type | Description |
| --- | --- | --- |
| Id | String | Required. Task Id |
| Url | String | Required. Page URL |

Return Type

Array of DownloadTask
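As a rough illustration of how the arguments and the return type fit together, the sketch below models a DownloadTask and builds the argument set for one Crawl call. Everything here is an assumption made for illustration: the helper names `build_crawl_args` and `parse_tasks` are hypothetical, and the wire format (a JSON-like object using the field names `Id` and `Url` from the table above) is not confirmed by this page.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class DownloadTask:
    """Client-side mirror of the DownloadTask fields (Id, Url)."""
    id: str
    url: str


def build_crawl_args(
    task: DownloadTask, selector: str, attribute_name: str = "href"
) -> dict[str, Any]:
    """Assemble the three documented Crawl arguments (hypothetical helper)."""
    return {
        "task": {"Id": task.id, "Url": task.url},  # assumed field casing
        "selector": selector,
        "attributeName": attribute_name,  # documented default: href
    }


def parse_tasks(payload: list[dict[str, str]]) -> list[DownloadTask]:
    """Convert an assumed array-of-objects response into DownloadTask values."""
    return [DownloadTask(id=item["Id"], url=item["Url"]) for item in payload]
```

Each task returned by one call can then be passed as the task argument of the next Crawl call, which is how the crawl continues from page to page.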
