JobConfig
Defines the top‑level job configuration when starting via MSSQL: connection to the WDS API, entry URLs, job type, and request/runtime settings (headers, cookies, HTTPS, proxies, error handling, domain scope).
Name | Type | Description |
---|---|---|
Server | ServerConfig | Required. WDS API Server connection parameters |
StartUrls | Array of Strings | Required. Initial URLs. Crawling entry points |
JobName | String | Optional. Job name. If not specified, a randomly generated value is used |
JobType | JobTypes | Optional. Job type |
Headers | HeadersConfig | Optional. Headers settings |
Restart | RestartConfig | Optional. Job restart settings |
Https | HttpsConfig | Optional. HTTPS settings |
Cookies | CookiesConfig | Optional. Cookies settings |
Proxy | ProxiesConfig | Optional. Proxy settings |
DownloadErrorHandling | DownloadErrorHandling | Optional. Download error handling settings |
CrawlersProtectionBypass | CrawlersProtectionBypass | Optional. Crawler protection countermeasure settings |
CrossDomainAccess | CrossDomainAccess | Optional. Cross-domain access settings |
Initialization String Format
An instance can be initialized with a string of the following format:
JobName: jobname; Server: serverConnectionString; StartUrls: URL1, URL2
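As a minimal sketch of this format with two start URLs: the job name, server address, and URLs below are placeholders. The `Server` key is taken from the format line above, while the example in the Examples section uses `ServerCS`, so confirm which key your version accepts.

```sql
-- Placeholder values; key names follow the documented format string.
-- Multiple start URLs are separated by commas within the StartUrls field.
DECLARE @jobConfig wds.JobConfig =
    'JobName: TestJob2; Server: wds://localhost:2807; StartUrls: http://playground, http://example.com';
```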
Methods
Methods that help with initialization.
AddStartUrl
Adds a new start URL
Syntax
AddStartUrl( url )
Arguments
Name | Type | Description |
---|---|---|
url | String | Start URL to add |
Return type
JobConfig
Return value
Returns the JobConfig instance on which it was called
Examples
Creating a new instance initialized from a string:
DECLARE @jobConfig wds.JobConfig = 'JobName: TestJob1; ServerCS: wds://localhost:2807; StartUrls: http://playground';
Adding one more start URL:
SET @jobConfig = @jobConfig.AddStartUrl('http://example.com');
JobTypes
Specifies how and where the crawler operates. Choose the mode that matches the environment your job targets.
Restrictions on the allowed values, and the default value applied to all jobs, can be configured in the Dapi service.
In addition, the Crawler service must be configured correctly to handle jobs of each type.
Values
Name | Description |
---|---|
Internet | Crawl data from internet sources via request gateways (proxy addresses, host IP addresses, etc.) |
Intranet | Crawl data from intranet sources with no limits |
Examples
Changing the job type:
SET @jobConfig.JobType = 'Intranet';
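Putting the pieces of this section together, a job configuration might be built up as in the sketch below. It uses only the calls shown in the examples above; the job name, server address, and URLs are the same placeholder values, and whether `Intranet` is permitted depends on how the Dapi and Crawler services are configured.

```sql
-- 1. Create the configuration from an initialization string (placeholder values).
DECLARE @jobConfig wds.JobConfig = 'JobName: TestJob1; ServerCS: wds://localhost:2807; StartUrls: http://playground';

-- 2. Add a second crawling entry point; AddStartUrl returns the updated instance.
SET @jobConfig = @jobConfig.AddStartUrl('http://example.com');

-- 3. Switch the job type from the service default to Intranet.
SET @jobConfig.JobType = 'Intranet';
```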