Ingest plugin #3708
- Add ingest plugin that sends applied operations to an external HTTP endpoint
- Add name_from_type helper function in operations.cpp to extract operation names from type names
- Plugin supports a configurable endpoint URL and sends operation JSON via HTTP POST
- Includes CMakeLists.txt, plugin.json, and header/implementation files
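
As an illustration of the name_from_type helper mentioned above, here is a minimal sketch of one way such a function could work. It assumes type names look like `steem::protocol::vote_operation` and that the desired result is the bare name (`vote`); both the input format and the function body are assumptions, not the code added in this PR.

```cpp
#include <string>

// Sketch only: strips a namespace qualifier and a trailing "_operation"
// suffix from a C++ type name. The real helper in operations.cpp may differ.
std::string name_from_type(const std::string& type_name) {
    // Keep everything after the last ':' (drops "steem::protocol::").
    const std::string::size_type colon = type_name.find_last_of(':');
    std::string name =
        (colon == std::string::npos) ? type_name : type_name.substr(colon + 1);

    // Remove the "_operation" suffix if present.
    const std::string suffix = "_operation";
    if (name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0) {
        name.erase(name.size() - suffix.size());
    }
    return name;
}
```

Under these assumptions, `name_from_type("steem::protocol::vote_operation")` yields `vote`, and a name without the suffix is returned unchanged.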
…s for ingest plugin

Client-side improvements:
- Add automatic retry mechanism with 3-second delay (max 5 attempts)
- Implement batch sending support (configurable batch size and timeout)
- Add connection pooling for better performance
- Implement block-only records for blocks without operations
- Add post_apply_block signal handler to detect empty blocks
- Implement blocking queue to protect API endpoint from overload
- Add http_retry_worker thread for asynchronous retry processing
- Add send_http_batch method for batch endpoint support
- Add get_or_create_connection for connection pool management
- Add parse_endpoint_url for URL parsing
- Update configuration options: --ingest-batch-size, --ingest-batch-timeout
- Update README.md documentation with new features
- Update plugin initialization to start HTTP worker early

These changes ensure reliable data delivery with automatic retries and complete block coverage in MongoDB. Client will block replay when queue is full to prevent API overload. Failed HTTP requests are automatically retried with exponential backoff.
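
To make the URL-parsing step concrete, here is a rough sketch of what a function like parse_endpoint_url might do before a connection is opened. The `Endpoint` struct, the signature, and the restriction to plain `http://` URLs (no IPv6 literals, no userinfo) are assumptions for illustration. The PR's actual implementation is not shown in this conversation.

```cpp
#include <string>

// Hypothetical result type: the pieces an HTTP client needs to resolve
// the host and build the request line.
struct Endpoint {
    std::string host;
    std::string port;    // kept as a string, as resolvers accept service names
    std::string target;  // request path, "/" when the URL has none
};

// Sketch only: splits "http://host[:port][/path]" into its parts.
// Returns false for anything it does not understand.
bool parse_endpoint_url(const std::string& url, Endpoint& out) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;

    const std::string::size_type begin = scheme.size();
    const std::string::size_type slash = url.find('/', begin);
    const std::string authority =
        (slash == std::string::npos) ? url.substr(begin)
                                     : url.substr(begin, slash - begin);

    Endpoint ep;
    ep.target = (slash == std::string::npos) ? "/" : url.substr(slash);

    const std::string::size_type colon = authority.rfind(':');
    if (colon == std::string::npos) {
        ep.host = authority;
        ep.port = "80";  // default HTTP port
    } else {
        ep.host = authority.substr(0, colon);
        ep.port = authority.substr(colon + 1);
    }
    if (ep.host.empty() || ep.port.empty()) return false;

    out = ep;
    return true;
}
```

For example, a hypothetical endpoint `http://localhost:8080/ingest/applied_ops` would split into host `localhost`, port `8080`, and target `/ingest/applied_ops`.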
- Add curl to apt-get install for HTTP client support in ingest plugin
- Update all Dockerfiles (azurelinux3.0, debian13, ubuntu20.04, ubuntu22.04, ubuntu24.04)
- Required for boost::beast HTTP client functionality

This ensures the ingest plugin can make HTTP requests to the external ingest service.
Nice idea, I hope you have a use case for this. I haven't run or tested it yet. From what I've seen in the code, the plugin doesn't seem to differentiate between replay and resync. That would slow the process down considerably whenever the plugin is activated.
I'm currently refactoring the SteemDB project. I developed this plugin to speed up database recovery during a cold start: it retrieves the required data directly from the block_log. The plugin is still under testing, and the relevant test code is here: https://github.com/steemit/steemdb/tree/next/steemdb-sync/test/docker-compose
…oint, use batch endpoint only

- Removed send_http_post() method
- Removed HandleAppliedOp() handler
- Removed /ingest/applied_op endpoint support
- All operations sent as arrays via /ingest/applied_ops endpoint
- Updated default endpoint to /ingest/applied_ops
- Updated all documentation and test code
- Fixed HTTP error retry mechanism (400 errors now retry)
- Added service readiness check on startup
- Added configurable timeout for queue draining and service waiting
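
The retry behaviour described in these commits (up to 5 attempts, a 3-second delay, and HTTP 400 treated as retryable) can be sketched as follows. The sketch assumes a fixed delay and treats every non-2xx status as retryable. `RetryPolicy`, `RetryResult`, and `send_with_retry` are hypothetical names, and the send and wait steps are passed in as callables so the loop can be exercised without a network.

```cpp
#include <chrono>
#include <functional>

// Hypothetical policy type; the defaults mirror the numbers in the commit message.
struct RetryPolicy {
    int max_attempts = 5;
    std::chrono::seconds delay{3};
};

struct RetryResult {
    bool delivered;  // true once a 2xx status was seen
    int attempts;    // number of send attempts made
};

inline bool is_success(int http_status) {
    return http_status >= 200 && http_status < 300;
}

// Sketch only: calls send_once until it returns a 2xx status or the
// attempt budget is used up. Any other status, including 400, is retried.
RetryResult send_with_retry(
    const std::function<int()>& send_once,
    const std::function<void(std::chrono::seconds)>& wait,
    const RetryPolicy& policy = RetryPolicy()) {
    for (int attempt = 1; attempt <= policy.max_attempts; ++attempt) {
        if (is_success(send_once())) {
            return RetryResult{true, attempt};
        }
        // No point in waiting after the final failed attempt.
        if (attempt < policy.max_attempts) {
            wait(policy.delay);
        }
    }
    return RetryResult{false, policy.max_attempts};
}
```

In a worker thread, `wait` would typically wrap `std::this_thread::sleep_for`. Injecting it keeps the loop testable and makes it easy to swap in a growing delay if exponential backoff is wanted instead.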
)

During blockchain synchronization, receiving blocks from forks is expected behavior as multiple peers may be on different forks. Previously, these were logged as errors via elog(), causing log noise and false alerts.

This change:
- Checks sync_mode in the unlinkable_block_exception handler
- Uses fc_wlog (warning level) for sync mode instead of elog (error level)
- Returns false immediately in sync mode, avoiding unnecessary exception propagation to node.cpp
- Preserves existing error behavior for non-sync mode (triggers sync restart)

Fixes: Error when pushing block elog noise during sync
Reference: i-05d97adb280d9914e (steemit-production-steemd-002)
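
The control flow described above can be sketched in isolation. This is a simplified model, not the actual handler: the exception type, `LogEntry`, and `handle_block` are stand-ins defined here, and an in-memory log replaces the fc_wlog and elog macros.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the real exception type (assumption: the real one is an fc exception).
struct unlinkable_block_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

enum class LogLevel { warn, error };

struct LogEntry {
    LogLevel level;
    std::string message;
};

// Sketch only: push_block is whatever applies the block and may throw.
// In sync mode an unlinkable block is logged as a warning and reported as
// "not accepted"; otherwise it is logged as an error and rethrown so the
// caller can react (for example by restarting sync).
bool handle_block(bool sync_mode,
                  std::vector<LogEntry>& log,
                  const std::function<void()>& push_block) {
    try {
        push_block();
        return true;
    } catch (const unlinkable_block_exception& e) {
        if (sync_mode) {
            log.push_back({LogLevel::warn, e.what()});
            return false;  // expected during sync, do not propagate
        }
        log.push_back({LogLevel::error, e.what()});
        throw;  // preserve the pre-existing error path
    }
}
```

The point of the split is that fork blocks seen during sync are routine, so they should not reach error-level logs or alerting, while the same exception outside sync still signals a real problem.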
No description provided.