# Hawk Workers

Workers are services for processing Hawk's background tasks.
## Requirements

- Registry (RabbitMQ)

More info on setting up the Registry is here.

For simplicity, Hawk workers can be used as part of the Mono repository.
## How to write a Worker

- Inherit from the `Worker` class and implement the `handle` method, which processes tasks from the Registry (see more in `lib/worker.js`).
- Define `type`: the worker type (e.g. `errors/nodejs`), which is also the Registry queue from which the worker pulls tasks.
- Edit the `.env` file (see more below).
- Use `worker.start()` to start your worker. You should write a simple runner for this.
- Set `LOG_LEVEL` to `verbose` if you want message logs. You can also use `worker.logger`, which is a `winston.Logger`, to log something.
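Putting the steps together, a minimal worker might look like the sketch below. The tiny `Worker` stub only stands in for the real base class from `lib/worker.js` so the example is self-contained; check that file for the actual API.

```javascript
// Self-contained sketch. The Worker stub below only stands in for the real
// base class from lib/worker.js; the actual signatures may differ.
class Worker {
  constructor() {
    this.logger = console; // real workers get a winston.Logger here
  }
  async start() {
    // the real implementation connects to the Registry queue named by
    // `this.type` and passes each incoming task to handle()
  }
}

class NodeJSWorker extends Worker {
  constructor() {
    super();
    // worker type, which is also the Registry queue to consume from
    this.type = 'errors/nodejs';
  }

  /**
   * Processes a single task pulled from the Registry.
   */
  async handle(task) {
    this.logger.log(`Handling task: ${JSON.stringify(task.payload)}`);
  }
}

// A simple runner
const worker = new NodeJSWorker();
worker.start();
```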
## How to run workers

- Make sure you are in the Workers root directory.
- Add the worker package as a Yarn workspace: add the worker's path to the "workspaces" section of the root `package.json`.
- Run `yarn install`.
- Run `yarn worker worker-package-name` (the package name from the worker's `package.json`). You can pass several workers separated by spaces.
- (Optionally) You can add your own script to run a specific worker, for example:
  `"run-js": "ts-node ./runner.ts hawk-worker-javascript"`

Note: you can override some env variables when running a worker:

```shell
SIMULTANEOUS_TASKS=1 yarn worker hawk-worker-sourcemaps
```
## Running workers with PM2

- Install PM2: `yarn global add pm2` or `npm i -g pm2`.
- If you've written your own worker, add it to `ecosystem.config.js` like the existing ones.
- Edit the `.env` files.
- Run it:

```shell
# Run all workers
pm2 start

# Run a specific worker, e.g. nodejs
pm2 start nodejs
```
Feel free to tune your settings in the `ecosystem.config.js` file (see the PM2 documentation for more info).
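For reference, an entry in `ecosystem.config.js` typically has a shape like the sketch below. The name, script, and args here are only illustrative; mirror the existing entries in the repo's own file.

```javascript
// Hypothetical ecosystem.config.js entry. Copy the shape of the existing
// entries in the repo; the values below are examples, not the real config.
module.exports = {
  apps: [
    {
      name: 'nodejs',                  // name used with `pm2 start nodejs`
      script: './runner.ts',           // the worker runner
      interpreter: 'ts-node',          // run TypeScript directly
      args: 'hawk-worker-javascript',  // worker package to launch
    },
  ],
};
```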
## Running workers with Docker

Basic configuration is in `docker-compose.dev.yml`.

Pull the image from https://hub.docker.com/r/codexteamuser/hawk-workers:

```shell
docker-compose -f docker-compose.dev.yml pull
```

If you run MongoDB and RabbitMQ with the hawk.mono repository, by default your Docker network will be named `hawkmono_default`. This network name is listed as external for the workers.

Run the chosen worker (say, `hawk-worker-javascript`):

```shell
docker-compose -f docker-compose.dev.yml up hawk-worker-javascript
```
### Adding new workers

Make sure that your `.env` configuration exists.

Add a new section to the `docker-compose.{dev,prod}.yml` files:

```yaml
hawk-worker-telegram:
  image: "codexteamuser/hawk-workers:prod"
  env_file:
    - .env
    - workers/telegram/.env
  restart: unless-stopped
  entrypoint: /usr/local/bin/node runner.js hawk-worker-telegram
```
## Error handling

If an error is thrown inside the `handle` method, it will be ignored, unless it is a `CriticalError` or a `NonCriticalError`:

- On `CriticalError`, the message being processed will be requeued to the same queue in the Registry using the `Worker.requeue` method.
- On `NonCriticalError`, the message being processed will be sent to the stash queue in the Registry using the `Worker.sendToStash` method.
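A sketch of how these error classes are meant to be used inside `handle`. The stub classes below only make the example self-contained; in the repo you would import the real ones.

```javascript
// Stubs standing in for the error classes provided by the repo.
class CriticalError extends Error {}
class NonCriticalError extends Error {}

// Inside a worker's handle method:
async function handle(task) {
  if (!task.payload) {
    // Malformed message that retrying won't fix: park it in the stash queue
    throw new NonCriticalError('Task payload is missing');
  }

  try {
    await saveToDatabase(task.payload);
  } catch (err) {
    // Transient failure (e.g. the DB is down): requeue the message for retry
    throw new CriticalError(err.message);
  }
}

// Hypothetical persistence helper, shown only to complete the example
async function saveToDatabase(payload) {
  /* persistence logic would go here */
}
```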
## Env vars
| Variable | Description | Default value |
|---|---|---|
| REGISTRY_URL | RabbitMQ connection URL | amqp://localhost |
| SIMULTANEOUS_TASKS | RabbitMQ Consumer prefetch value (How many tasks can do simultaneously) | 1 |
| LOG_LEVEL | Log level (`error`, `warn`, `info`, `verbose`, `debug`, `silly`) | info |
> **IMPORTANT**
>
> The `.env` file in the root acts as global preferences for all workers in the `workers` folder. If a variable is present in the root `.env` file, it is NOT overwritten by the local `.env` in a worker's folder. This allows you to set global MongoDB and/or RabbitMQ connection settings while keeping the possibility to set local options for each worker.
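The precedence rule can be illustrated with a small sketch. The real loading is done by the repo's bootstrap code; this only demonstrates the merge order (root values win).

```javascript
// Illustration of the precedence rule: variables from the root .env are NOT
// overwritten by the worker's local .env. This is not the repo's actual
// loader, just a demonstration of the merge order.
function mergeEnv(rootEnv, localEnv) {
  // Spread local values first, then root values on top, so root wins
  return { ...localEnv, ...rootEnv };
}

const merged = mergeEnv(
  { MONGO_URL: 'mongodb://global-db:27017/hawk' },                            // root .env
  { MONGO_URL: 'mongodb://localhost:27017/hawk-dev', LOG_LEVEL: 'verbose' }   // worker's .env
);
// merged.MONGO_URL comes from the root file; LOG_LEVEL stays local
```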
## Testing

- Make a `.env` file with test settings.
- Run `yarn test:<component name>`
| Component | Command | Requirements |
|---|---|---|
| Base worker (`lib`) | `yarn test:base` | RabbitMQ |
| NodeJS worker (`workers/nodejs`) | `yarn test:nodejs` | None |
## Database controller

A MongoDB controller is bundled (see `lib/db`).

You can tweak it (add schemas, etc.) and use it in your workers to handle database communication.
Example:

```javascript
const db = require("lib/db/mongoose-controller");

await db.connect(); // Requires `MONGO_URL`

await db.saveEvent(event);
```

### Env vars
| Variable | Description | Default value |
|---|---|---|
| MONGO_URL | MongoDB connection URL | mongodb://localhost:27017/hawk-dev |
### Testing

```shell
yarn test:db
```
## Worker message format
```json
{
  // Access token with `projectId` in payload
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  // Worker-specific payload
  "payload": {
    "title": "Error: ..."
    // other fields
  }
}
```

## Cache controller
To reduce the number of requests, or for other performance improvements, you can use `lib/cache/controller`.

To use it in a worker:

- Call `this.prepareCache();` somewhere in the worker to activate the cache module, for example in the `start()` method.
- Use `this.cache.get(key, resolver?, ttl?)` or `this.cache.set(key, value, ttl?)`.
Available methods:

- `set(key: string, value: any, ttl?: number)`: cache data
- `get(key: string, resolver?: Function, ttl?: number)`: get cached data (or resolve and cache). If you pass a resolver, you may also pass a ttl (used by the internal set call)
- `del(key: string|string[])`: delete cached data
- `flushAll()`: flush all cached data

`ttl` (time to live) is measured in seconds.
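A toy in-memory implementation that mirrors the interface above. The real controller lives in `lib/cache/controller` and may be backed by a shared store; this sketch only illustrates the documented method signatures.

```javascript
// Toy cache mirroring the documented interface. The real
// lib/cache/controller may differ in implementation details.
class CacheController {
  constructor() {
    this.store = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttl) {
    // ttl is in seconds; no ttl means the entry never expires
    const expiresAt = ttl ? Date.now() + ttl * 1000 : Infinity;
    this.store.set(key, { value, expiresAt });
  }

  async get(key, resolver, ttl) {
    const entry = this.store.get(key);
    if (entry && entry.expiresAt > Date.now()) {
      return entry.value;
    }
    if (resolver) {
      // Cache miss: resolve the value and cache it with the given ttl
      const value = await resolver();
      this.set(key, value, ttl);
      return value;
    }
    return undefined;
  }

  del(keys) {
    // Accepts a single key or an array of keys
    for (const key of [].concat(keys)) {
      this.store.delete(key);
    }
  }

  flushAll() {
    this.store.clear();
  }
}

const cache = new CacheController();
cache.set('greeting', 'hello', 60);
```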
## Migrations
To create a new migration, use:

```shell
yarn migration create {migrationName}
```

Each migration file contains two methods: `up` and `down`. The `up` method applies the revision and increases the database version. The `down` method rolls the database changes back.
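A migration file with its `up`/`down` pair might look like the sketch below. The export shape expected by the repo's migration tooling is an assumption here, as are the collection and field names; use a generated migration file as the authoritative template.

```javascript
// Hypothetical migration sketch. The export shape expected by the migration
// tooling may differ; the collection and field names are examples only.
const migration = {
  async up(db) {
    // Apply the revision, e.g. add a new field to every event
    await db.collection('events').updateMany({}, { $set: { migrated: true } });
  },

  async down(db) {
    // Roll the revision back
    await db.collection('events').updateMany({}, { $unset: { migrated: '' } });
  },
};

module.exports = migration;
```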
To execute migrations, run:

```shell
yarn migrate
```

## Todo
- Refactor mongo-migrate commands to provide the ability to create or roll back migrations

