Kafka Interaction - Overview

Interaction Architecture

┌─────────────────────────┐           ┌──────────┐         ┌───────────────────────┐
│  Main Application       │ ───────>  │  Kafka   │────────>│ Analytics Microservice│
│ (Producer and Consumer) │ Events    │          │ Events  │    (Consumer)         │
└─────────────────────────┘           └──────────┘         └───────────────────────┘
         ▲                                                          │
         │                                                          │
         │                  ┌──────────┐                            │
         └──────────────────│  Kafka   │<───────────────────────────┘
          Aggregated        │          │   Aggregated data
             data           └──────────┘    (every 1 minute)

Forward Flow: Main Application → Kafka → Analytics Microservice

1. Sending Events (Producer)

Main Application sends an event to Kafka whenever a user performs one of the following actions:

Action	Controller/Service	Topic	Partitioning Key
View book	`BookController.getBookById()`	`book.views`	`bookId`
Download book	`BookFileController.downloadBook()`	`book.downloads`	`userId`
Purchase book	`StripeService.handlePaymentSuccess()`	`book.purchases`	`userId`
Create/update review	`ReviewController`	`book.reviews`	`bookId`
Create/update rating	`RatingController`	`book.ratings`	`bookId`
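
The table above can be read as a lookup from action to topic and partitioning key. A minimal sketch of that lookup, using only the JDK; the names EventRoutingSketch, EventType and keyFor are hypothetical, and sending the key as a string is an assumption rather than something the project confirms:

```java
class EventRoutingSketch {

    /** One constant per table row: the topic and which id is used as the partitioning key. */
    enum EventType {
        VIEW("book.views", true),
        DOWNLOAD("book.downloads", false),
        PURCHASE("book.purchases", false),
        REVIEW("book.reviews", true),
        RATING("book.ratings", true);

        final String topic;
        final boolean keyedByBook; // true: key is bookId, false: key is userId

        EventType(String topic, boolean keyedByBook) {
            this.topic = topic;
            this.keyedByBook = keyedByBook;
        }

        /** Picks the partitioning key for an event of this type (string form is an assumption). */
        String keyFor(long bookId, long userId) {
            return Long.toString(keyedByBook ? bookId : userId);
        }
    }

    public static void main(String[] args) {
        long bookId = 42L;
        long userId = 7L; // made-up ids for illustration
        for (EventType type : EventType.values()) {
            System.out.println(type + " -> topic=" + type.topic + ", key=" + type.keyFor(bookId, userId));
        }
    }
}
```

Keeping the mapping in one place makes it harder for a controller to publish to the right topic with the wrong key.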

2. Processing Events (Consumer)

Analytics Microservice subscribes to the topics above and polls Kafka for new events:

How it works:

Kafka uses a pull model, not push: the consumer asks for data instead of having it delivered
The microservice repeatedly polls Kafka for new messages in its subscribed topics
Events are distributed across partitions based on their key (the producer computes the partition when it sends)
The consumer reads events from the partitions assigned to it
Statistics are updated in the microservice's memory (ConcurrentHashMap)
Kafka tracks the committed offset (reading position) for each consumer group and partition

Important: Kafka does not send data on its own; the microservice fetches it by calling poll()
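
The in-memory update step might look roughly like the sketch below. It assumes view events have already been polled and reduced to book ids; ViewStatsSketch, applyBatch and viewsFor are hypothetical names, and the choice of LongAdder as the map value is an assumption (the project only states that a ConcurrentHashMap is used):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class ViewStatsSketch {
    // bookId -> number of view events seen so far
    private final ConcurrentHashMap<Long, LongAdder> viewsByBook = new ConcurrentHashMap<>();

    /** Applies one batch of events, as a poll loop would after each poll() call. */
    void applyBatch(List<Long> polledBookIds) {
        for (long bookId : polledBookIds) {
            // computeIfAbsent creates the counter atomically the first time a book is seen
            viewsByBook.computeIfAbsent(bookId, id -> new LongAdder()).increment();
        }
    }

    /** Returns the current count, or 0 for a book with no events yet. */
    long viewsFor(long bookId) {
        LongAdder counter = viewsByBook.get(bookId);
        return counter == null ? 0L : counter.sum();
    }

    public static void main(String[] args) {
        ViewStatsSketch stats = new ViewStatsSketch();
        stats.applyBatch(List.of(1L, 2L, 1L)); // first poll result
        stats.applyBatch(List.of(1L));         // second poll result
        System.out.println("views for book 1: " + stats.viewsFor(1L)); // prints "views for book 1: 3"
    }
}
```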

Reverse Flow: Analytics Microservice → Kafka → Main Application

1. Data Aggregation

Important: These are two different processes in the analytics microservice:

Continuous event reading (see "Processing Events" section above):
- The microservice continuously requests (polls) events from Kafka
- Each event is processed immediately and updates statistics in memory
- This happens in real-time, not once per minute
Periodic aggregation (every 1 minute):
- A separate scheduled task runs once per minute
- Aggregates the data already accumulated in memory
- Sends aggregated results back to Kafka
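
The split between the two processes can be sketched with a JDK scheduler. This is an illustration under assumptions: AggregationSketch, onViewEvent and snapshot are hypothetical names, the snapshot is printed instead of being sent to Kafka, and counters are not reset after aggregation (the source does not say whether they are):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

class AggregationSketch {
    private final ConcurrentHashMap<Long, LongAdder> viewsByBook = new ConcurrentHashMap<>();

    /** Process 1: called for every event as soon as it has been polled. */
    void onViewEvent(long bookId) {
        viewsByBook.computeIfAbsent(bookId, id -> new LongAdder()).increment();
    }

    /** Process 2: builds a point-in-time copy of the counters (they are not reset here). */
    Map<Long, Long> snapshot() {
        Map<Long, Long> copy = new TreeMap<>();
        viewsByBook.forEach((bookId, counter) -> copy.put(bookId, counter.sum()));
        return copy;
    }

    public static void main(String[] args) {
        AggregationSketch stats = new AggregationSketch();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // In the real service the task body would publish the snapshot to Kafka.
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("aggregated: " + stats.snapshot()), 1, 1, TimeUnit.MINUTES);

        stats.onViewEvent(1L);
        stats.onViewEvent(1L);
        stats.onViewEvent(2L);
        System.out.println("on-demand snapshot: " + stats.snapshot()); // prints "on-demand snapshot: {1=2, 2=1}"
        scheduler.shutdownNow(); // stop the demo instead of waiting a minute
    }
}
```

Event handling never waits for the scheduled task: both sides only touch the concurrent map.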

Types of aggregated data:

BOOK_STATS - statistics for each book
SYSTEM_OVERVIEW - overall system statistics
POPULAR_BOOKS - list of popular books

2. Receiving and Saving (Consumer)

Main Application also runs a Kafka consumer, which polls aggregated data from the analytics.aggregated-stats topic:

How it works:

Main application subscribes to the analytics.aggregated-stats topic
Its consumer repeatedly polls Kafka for new aggregated data
When data arrives, the main application saves it to the database (tables book_analytics and system_analytics)
This is the same pull model as in the forward flow: the main application requests data itself rather than receiving it automatically

Key Concepts

Partitioning

Why: Parallel processing and guaranteed order for related events

How it works:

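The partition is computed from the key: partition = hash(key) % numberOfPartitions (the producer does this when sending; Kafka's default partitioner hashes the serialized key with murmur2)
Events with the same key go to the same partition, as long as the number of partitions does not change
Kafka preserves order within a partition, so events for the same book/user are read in the order they were written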

Example:

Topic book.views (3 partitions):
Partition 0: [bookId=3] [bookId=6]
Partition 1: [bookId=1] [bookId=1] ← All bookId=1 events here
Partition 2: [bookId=2] [bookId=5]
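
The formula can be tried out with a small sketch. Long.hashCode is used here only as a dependency-free stand-in for Kafka's real key hash, so partition numbers in a running cluster will differ; PartitionSketch and partitionFor are hypothetical names. With this stand-in the result happens to match the layout shown above:

```java
class PartitionSketch {
    /**
     * Stand-in for hash(key) % numberOfPartitions.
     * Long.hashCode is NOT Kafka's hash; it only keeps the sketch self-contained.
     */
    static int partitionFor(long key, int numberOfPartitions) {
        // floorMod keeps the result in [0, numberOfPartitions) even for negative hashes
        return Math.floorMod(Long.hashCode(key), numberOfPartitions);
    }

    public static void main(String[] args) {
        for (long bookId : new long[] {1, 1, 2, 3, 5, 6}) {
            System.out.println("bookId=" + bookId + " -> partition " + partitionFor(bookId, 3));
        }
    }
}
```

Whatever hash is used, the property that matters is the same: equal keys always produce the same partition number.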

Consumer Groups

Why: Load distribution between multiple service instances

How it works:

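All consumers in one group share the topic's partitions among themselves (the group can contain several instances of the microservice or different services; in this project it contains a single instance)
Each partition is assigned to exactly one consumer in the group, so each message is processed by only one consumer from the group
If a consumer fails, its partitions are reassigned to the remaining consumers in the group

Example (if three instances were running):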

Consumer Group: analytics-service-group
- Consumer 1 → processes Partition 0
- Consumer 2 → processes Partition 1
- Consumer 3 → processes Partition 2
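
How partitions move when a consumer fails can be illustrated with a simplified round-robin assignment. Kafka's built-in assignors are more elaborate, so treat this as a model of the idea rather than of the actual algorithm; GroupAssignmentSketch, assign and the consumer names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupAssignmentSketch {
    /** Deals partitions 0..n-1 out to the consumers round-robin. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numberOfPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numberOfPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // prints "{consumer-1=[0], consumer-2=[1], consumer-3=[2]}"
        System.out.println(assign(List.of("consumer-1", "consumer-2", "consumer-3"), 3));
        // consumer-2 has failed: its partition moves to a surviving consumer
        // prints "{consumer-1=[0, 2], consumer-3=[1]}"
        System.out.println(assign(List.of("consumer-1", "consumer-3"), 3));
    }
}
```

In both runs every partition has exactly one owner, which is why a message is never processed by two members of the same group.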

Offset

Why: Tracking reading position to avoid losing messages

How it works:

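Kafka stores the committed offset (position of the next message to read) for each consumer group and partition
On restart, the consumer continues from the last committed offset
If offsets are committed only after a message has been processed, every message is processed at least once (a message may be processed again after a failure)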
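
Resuming from a committed offset can be modelled with an in-memory list standing in for one partition. This is a toy model, not the Kafka client API; OffsetSketch, Partition, poll and consumeOnce are hypothetical names, and committing after processing is an assumption about how the consumer is configured:

```java
import java.util.ArrayList;
import java.util.List;

class OffsetSketch {
    /** In-memory stand-in for one partition plus the group's committed offset. */
    static class Partition {
        final List<String> log = new ArrayList<>();
        long committedOffset = 0; // offset of the next message the group should read

        /** Pull model: returns up to maxRecords messages starting at fromOffset. */
        List<String> poll(long fromOffset, int maxRecords) {
            int start = (int) Math.min(fromOffset, log.size());
            int end = Math.min(start + maxRecords, log.size());
            return new ArrayList<>(log.subList(start, end));
        }
    }

    /** Reads from the committed offset, processes, then commits. Returns what was processed. */
    static List<String> consumeOnce(Partition partition, int maxRecords) {
        List<String> batch = partition.poll(partition.committedOffset, maxRecords);
        // ... process the batch here; committing afterwards gives at-least-once behaviour
        partition.committedOffset += batch.size();
        return batch;
    }

    public static void main(String[] args) {
        Partition partition = new Partition();
        partition.log.addAll(List.of("view#1", "view#2", "view#3"));
        System.out.println(consumeOnce(partition, 2)); // prints "[view#1, view#2]"
        // "Restart": a new consumer instance resumes from the committed offset
        System.out.println(consumeOnce(partition, 2)); // prints "[view#3]"
    }
}
```

If the process crashed between processing and committing, the same batch would be returned again on the next poll, which is the duplicate case that at-least-once delivery allows.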

Full Data Cycle

1. User performs an action (view, purchase, etc.)
2. Main Application sends an event to Kafka. The send is asynchronous: the main application does not wait for the analytics microservice to process the message
3. Kafka saves the event in the corresponding topic and partition
4. Analytics Microservice, which polls Kafka continuously, receives the event and immediately updates statistics in memory (it has no database; everything is stored in RAM)
5. Every minute, a separate scheduled task aggregates the data accumulated in memory and sends the results back to Kafka (to the analytics.aggregated-stats topic)
6. Main Application, which also polls continuously, reads the aggregated data from the analytics.aggregated-stats topic and saves it to the database
7. Admin Panel receives statistics from the database via REST API

Kafka Topics

Topic	Partitions	Key	Purpose
`book.views`	3	`bookId`	Book view events
`book.downloads`	3	`userId`	Book download events
`book.purchases`	2	`userId`	Book purchase events
`book.reviews`	2	`bookId`	Review creation/update events
`book.ratings`	2	`bookId`	Rating creation/update events
`analytics.aggregated-stats`	2	`aggregationType`	Aggregated statistics (reverse flow)

Advantages of the Approach

Non-blocking processing - main application sends an event and immediately returns a response to the user, without waiting for analytics processing
Scalability - multiple instances of the microservice can run in parallel
Reliability - Kafka persists events for the configured retention period, so they are not lost if the analytics service fails and can be read once it is back up
Separation of concerns - analytics lives in its own microservice
Data history - aggregated data is saved to the database for analysis

Monitoring

Kafka UI (http://localhost:8089, credentials: admin / admin):

View topics and partitions
View messages
Monitor consumer groups and offsets
Track consumer lag (how many messages a consumer group still has to process)
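
Lag is conventionally the difference between a partition's end offset and the group's committed offset. A minimal sketch of that arithmetic, with made-up numbers; LagSketch and lagPerPartition are hypothetical names, and treating a partition with no committed offset as starting from 0 is an assumption:

```java
import java.util.Map;
import java.util.TreeMap;

class LagSketch {
    /** Lag per partition = log end offset - committed offset (never below zero). */
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> endOffsets, Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        endOffsets.forEach((partition, end) ->
                // assumption: no committed offset yet means the group starts from offset 0
                lag.put(partition, Math.max(0L, end - committed.getOrDefault(partition, 0L))));
        return lag;
    }

    public static void main(String[] args) {
        Map<Integer, Long> endOffsets = Map.of(0, 120L, 1, 95L, 2, 40L); // made-up numbers
        Map<Integer, Long> committed = Map.of(0, 120L, 1, 80L);
        System.out.println(lagPerPartition(endOffsets, committed)); // prints "{0=0, 1=15, 2=40}"
    }
}
```

A lag that keeps growing on one partition usually means the consumer owning it cannot keep up or has stopped polling.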