Crawl Tool
Finds links on the current page using a selector and returns new download tasks to continue the crawl; supports notifying on retries.
Arguments
Name | Type | Description |
---|---|---|
task | DownloadTask | Required. A task from the previous Start or Crawl tool response |
selector | String | Required. Selector for getting interesting links on a web page |
attributeName | String | Optional. Attribute name to get data from. Use val to get inner text. Default value: href |
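As an illustration of how the three arguments fit together, here is a minimal sketch that assembles them into a single dictionary. The helper name `build_crawl_arguments`, the sample task values, and the idea of passing the arguments as a JSON object are assumptions made for this example; only the argument names and the `href` default come from the table above.

```python
import json


def build_crawl_arguments(task, selector, attribute_name="href"):
    """Assemble the Crawl arguments (hypothetical helper, not part of the tool)."""
    # task and selector are marked Required in the arguments table.
    if not task or not selector:
        raise ValueError("task and selector are required")
    # attributeName is Optional and defaults to href.
    return {"task": task, "selector": selector, "attributeName": attribute_name}


# Hypothetical task values; a real task comes from a Start or Crawl response.
example_task = {"Id": "task-1", "Url": "https://example.com/"}

arguments = build_crawl_arguments(example_task, "CSS: a.next")
print(json.dumps(arguments, indent=2))
```

Passing `attribute_name="val"` instead would request the inner text of the matched elements rather than an attribute value.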
Remarks
The selector argument uses the following format: CSS|XPATH: selector. The first part defines the selector type; the second part is a selector expression of that type.
Supported types: CSS, XPATH
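The selector format described above can be illustrated with a small parsing sketch. The function `parse_selector` is hypothetical and is not how the tool itself is implemented; treating the type prefix as case-insensitive and trimming surrounding whitespace are also assumptions.

```python
def parse_selector(value):
    """Split 'TYPE: expression' into (type, expression). Hypothetical helper."""
    # Split on the first colon only, so colons inside the expression
    # (CSS pseudo-classes, XPath axes such as following-sibling::a) survive.
    kind, sep, expression = value.partition(":")
    kind = kind.strip().upper()  # assumption: prefix is case-insensitive
    expression = expression.strip()
    if not sep or kind not in ("CSS", "XPATH") or not expression:
        raise ValueError(f"expected 'CSS: ...' or 'XPATH: ...', got {value!r}")
    return kind, expression


print(parse_selector("CSS: ul.pagination a"))
print(parse_selector("XPATH: //a[@class='next']"))
```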
DownloadTask
Represents a single page download request produced by a crawl or scrape job.
Fields:
Name | Type | Description |
---|---|---|
Id | String | Required. Task Id |
Url | String | Required. Page URL |
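For client code, the two fields above could be modelled as a small immutable record. This is only a sketch: the `from_dict` constructor and the assumption that tasks arrive as plain dictionaries keyed by `Id` and `Url` are illustrative, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DownloadTask:
    """Client-side mirror of the DownloadTask fields (illustrative only)."""
    Id: str
    Url: str

    @classmethod
    def from_dict(cls, data):
        # Both fields are marked Required, so reject missing or empty values.
        missing = [name for name in ("Id", "Url") if not data.get(name)]
        if missing:
            raise ValueError(f"missing required field(s): {', '.join(missing)}")
        return cls(Id=str(data["Id"]), Url=str(data["Url"]))


# Hypothetical values for demonstration.
task = DownloadTask.from_dict({"Id": "task-1", "Url": "https://example.com/"})
print(task)
```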
Return Type
Array of DownloadTask
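Because each call returns an array of new tasks, a caller can feed those tasks back into further Crawl calls. The sketch below shows one way to do that with a breadth-first queue. Everything in it is an assumption for illustration: `crawl_all` is a hypothetical driver, `fake_crawl` is a stand-in for the real tool backed by an in-memory site map, and the URL de-duplication is done on the client side because the source does not say whether the tool de-duplicates.

```python
from collections import deque


def crawl_all(start_task, crawl, selector, max_pages=100):
    """Breadth-first traversal driven by a crawl(task, selector) callable."""
    seen = {start_task["Url"]}
    queue = deque([start_task])
    visited = []
    while queue and len(visited) < max_pages:
        task = queue.popleft()
        visited.append(task)
        # Each call yields an array of DownloadTask-like dictionaries.
        for new_task in crawl(task, selector):
            if new_task["Url"] not in seen:
                seen.add(new_task["Url"])
                queue.append(new_task)
    return visited


# In-memory stand-in for real pages (hypothetical data).
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b"],
}


def fake_crawl(task, selector):
    """Stand-in for the Crawl tool: returns one task per link on the page."""
    links = SITE.get(task["Url"], [])
    return [{"Id": f"{task['Id']}.{i}", "Url": url} for i, url in enumerate(links)]


pages = crawl_all({"Id": "0", "Url": "https://example.com/"}, fake_crawl, "CSS: a")
print([page["Url"] for page in pages])
```

The `max_pages` cap is a safeguard chosen for this sketch so that a site with many links cannot make the loop run unbounded.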