Overview
The WDS API Server consists of two groups of components:
- Core Services
- Auxiliary Services
Core Services
Core Services handle the main workload and support two deployment modes:
- Single Service Mode
- Multi Service Mode
At a Glance
- Dapi: public REST API and orchestration gateway for jobs and tasks
- Datakeeper: durable storage and cache of downloaded pages and metadata
- Crawler: high-throughput HTTP downloader with proxy, cookie, and HTTPS controls
- Scraper: selector‑driven extraction of text and attributes from pages
- Idealer: unique ID generation and consistency across entities
Single Service Mode
In this mode, the entire application is packaged into a single service called Solidstack. Only one instance can run at a time. It suits quick evaluation, development, and small test tasks because it is simple to deploy and maintain and consumes fewer resources. The trade-offs are that it does not scale horizontally and that its availability is tied to a single node: a node restart causes a service outage until the instance starts again.
Multi Service Mode
In this mode, the application runs as a set of independent, horizontally scalable services that make up the core stack: Dapi, Datakeeper, Crawler, Scraper, and Idealer.
- Scalability: run multiple instances of each service to handle workload spikes and grow with demand.
- Resilience: isolates failures; individual services can restart without taking the whole system down.
- Availability: when deployed to Kubernetes with the Helm Chart, the platform tolerates node restarts and continues operating.
- Operations: enables rolling upgrades and resource isolation per service.
This mode is available starting from the Business plan and is the recommended setup for staging and production environments.
Choosing a Mode
- Choose Single Service (Solidstack) for quick trials, demos, and small non‑critical workloads.
- Choose Multi Service for environments that require scaling, fault isolation, and zero‑downtime operations.
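The guidance above can be expressed as a simple decision rule. The following Python sketch is illustrative only and is not part of WDS; the function name `recommend_mode` and its parameters are hypothetical.

```python
# Hypothetical helper, not part of the WDS API Server.
def recommend_mode(
    needs_scaling: bool,
    needs_fault_isolation: bool,
    needs_zero_downtime: bool,
) -> str:
    """Return the deployment mode suggested by the guidance above."""
    # Any one of these requirements points to Multi Service Mode.
    if needs_scaling or needs_fault_isolation or needs_zero_downtime:
        return "Multi Service"
    # Quick trials, demos, and small non-critical workloads.
    return "Single Service (Solidstack)"
```

Keep in mind that Multi Service Mode is available starting from the Business plan, so the result is a recommendation rather than a guarantee that the mode can be used.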
Third-Party Components
Core Services have one required third‑party dependency: MongoDB, which stores all system data.
Optionally, to optimize cost and performance, you can use:
- S3-compatible storage: caches downloaded web pages so they can be reused. If it is not configured, MongoDB is used for this purpose.
MongoDB
Both in-cluster and managed (SaaS) MongoDB deployments work well with the WDS API Server.
- Supported versions: 6.x, 7.x, 8.x
- Supported deployments: Atlas, Enterprise, Community
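A deployment script can check the MongoDB version against the supported versions listed above before starting the services. The following is a minimal Python sketch, not part of WDS. The function name `is_supported_mongodb_version` is hypothetical, and the sketch assumes the version string has the usual `major.minor.patch` form.

```python
# Hypothetical pre-flight check, not part of the WDS API Server.
SUPPORTED_MAJOR_VERSIONS = {6, 7, 8}  # 6.x, 7.x, 8.x from the list above


def is_supported_mongodb_version(version: str) -> bool:
    """Return True if the major component of `version` is 6, 7, or 8."""
    # Take everything before the first dot as the major version.
    major_text = version.strip().split(".", 1)[0]
    # Reject empty or non-numeric input instead of raising an error.
    if not (major_text.isascii() and major_text.isdigit()):
        return False
    return int(major_text) in SUPPORTED_MAJOR_VERSIONS
```

With PyMongo, for example, the version string could be read from `MongoClient(...).server_info()["version"]` and passed to this function.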
S3-Compatible Storage
NOTE: S3-compatible storage is available only in Multi Service Mode. See the Datakeeper documentation for configuration instructions.
Any S3‑compatible storage can be used, for example:
- AWS S3
- MinIO
- Other compatible services
The MinIO .NET Client is used to integrate with S3-compatible storage.
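The rules described above for where downloaded pages are cached can be summarized in a short Python sketch. It is illustrative only and does not reflect the actual Datakeeper implementation. The names `Mode` and `page_cache_backend` are hypothetical, and the MongoDB fallback in Single Service Mode is an inference from the two statements that S3-compatible storage is available only in Multi Service Mode and that MongoDB is used when S3-compatible storage is not configured.

```python
# Hypothetical model of the page cache selection, not WDS code.
from enum import Enum


class Mode(Enum):
    SINGLE_SERVICE = "single"
    MULTI_SERVICE = "multi"


def page_cache_backend(mode: Mode, s3_configured: bool) -> str:
    """Return "s3" or "mongodb" according to the rules described above."""
    # S3-compatible storage is available only in Multi Service Mode,
    # and only when it has been configured.
    if mode is Mode.MULTI_SERVICE and s3_configured:
        return "s3"
    # Otherwise MongoDB stores the cached pages (assumed for Single Service Mode).
    return "mongodb"
```

For example, `page_cache_backend(Mode.SINGLE_SERVICE, True)` returns `"mongodb"` under this model, because the S3 option does not apply in that mode.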
Auxiliary Services
Auxiliary Services help with testing and evaluation:
- Playground: a stable environment for testing queries and prompts idempotently
- Docs: a local copy of the documentation for cases when the website is not accessible, for example in air-gapped environments