Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required
An event-driven monitoring tool that can consume messages from Apache Kafka clusters and display the aggregated data on a dashboard for analysis and maintenance.
The main goal is to play with Kafka Connect and Streams. We have store-api that inserts/updates records in MySQL; Source connectors that monitor inserted/updated records in MySQL and push messages related to those changes to Kafka; Sink connectors that read messages from Kafka and insert documents in ES; Store-streams that listens for messages in Kafka, treats them using Kafka Streams and push new messages back to Kafka.
Simulated production environment running Kubernetes targeting Apache Kafka and Confluent components on Confluent Cloud. Managed by declarative infrastructure and GitOps.
The title might seem a bit vague but I don't know how to describe it any better tbh :-)
Anyway this is what happened: I got some 500 responses from the schema registry and all I could see in the logs was :
The logs di