High Volume Messaging Bots - WhatsApp

This guide explains how to handle high-volume messaging bots on the official WhatsApp platform using the API and webhooks.

Understand WhatsApp Docker, Core App & Database

WhatsApp uses an interesting solution that is completely different from other messengers such as Messenger or Telegram.

To run a bot on WhatsApp, we need to run a node in an instance of our own. This node maintains connectivity to WhatsApp over a long-lived TCP connection.

Our backend connects to this node, not directly to the WhatsApp servers. The architecture is explained in more detail below.

Instance / Node

Each WhatsApp number needs a dedicated machine to run the WhatsApp client. On this machine we run three components: CoreApp, MySQL and WebApp.


CoreApp is the application that does the core work. It connects to the WhatsApp servers, keeps the encryption keys (WhatsApp is end-to-end encrypted, and the node stores these keys in MySQL), manages incoming and outgoing messages, backs up contacts, and so on.


MySQL is where WhatsApp stores all incoming and outgoing messages, along with media, in encrypted format.


WebApp is the application that sits between your backend and CoreApp. It is essentially a token-authenticated web service for sending messages out.

By default, every WhatsApp messaging account is configured on a single Docker container that can handle 10 to 25 concurrent messages. WhatsApp offers High Availability and Multiconnect setups to scale up, at extra cost.

Understand Your Allocated TPS

Throughput Per Second (TPS) plays a big role in handling high messaging volumes. Based on your selected plan, we allocate a dedicated TPS, such as 10, 15 or 20, to each number.

The TPS is calculated based on the following events:

  1. Receiving Message

  2. Sending Message

  3. Getting Delivery Reports

  4. Getting Read Reports

For example, if you have 10 TPS, you can handle up to 10 interactions per second. Please note that delivery and read reports also count towards the TPS limit.
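To make the TPS budget concrete, here is a minimal sketch of how much of it is left for sending. It assumes that every outgoing message also produces one delivery report and one read report, each counted as one interaction; the function name and the `reports_per_message` parameter are illustrative and not part of any API.

```python
def max_outgoing_rate(tps: float, incoming_per_sec: float,
                      reports_per_message: float = 2.0) -> float:
    """Estimate how many messages per second can be sent within the TPS.

    Assumption: each sent message costs 1 interaction for the send itself
    plus `reports_per_message` interactions for delivery/read reports.
    """
    if tps < 0 or incoming_per_sec < 0 or reports_per_message < 0:
        raise ValueError("all inputs must be non-negative")
    # Whatever incoming messages do not use is left for outgoing traffic.
    remaining = max(tps - incoming_per_sec, 0.0)
    return remaining / (1.0 + reports_per_message)


# 10 TPS with about 1 incoming message per second leaves room for
# roughly 3 outgoing messages per second when both reports arrive.
print(max_outgoing_rate(10, 1))
```

If delivery report events are disabled, pass a smaller `reports_per_message` to see how much sending capacity is freed.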

The Webhook pushes incoming messages to your server in real time.

The Event Webhook pushes delivery and read reports to your server when they are available.

The Push API sends messages out. We recommend using the Webhook instant reply if you want to reply to the user immediately.

Prepare your Server & Application

Please make sure your web server and application can handle the concurrent load implied by your allocated TPS, especially for incoming messages, as that load is not predictable.

Below is the basic server configuration your web server needs to handle concurrent requests. Please note that it is for reference only and was prepared for an Apache web server on a Linux machine with no other applications running.
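One common way to cope with unpredictable incoming load is to acknowledge each webhook call immediately and do the real work in the background. The sketch below uses only the Python standard library; the port, the buffer size, the status codes returned and the `handle` callback are assumptions for illustration, so check how your provider treats non-200 responses before relying on them.

```python
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Bounded buffer between the HTTP handler and the slower business logic.
EVENTS: "queue.Queue" = queue.Queue(maxsize=10_000)


def enqueue_event(raw_body: bytes) -> int:
    """Parse a webhook body, buffer it, and return the HTTP status to send."""
    try:
        event = json.loads(raw_body)
    except ValueError:
        return 400  # malformed JSON
    try:
        EVENTS.put_nowait(event)
    except queue.Full:
        return 503  # buffer is full; report overload instead of blocking
    return 200


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status = enqueue_event(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def worker(handle) -> None:
    """Drain the buffer; `handle` is your own (hypothetical) processing function."""
    while True:
        event = EVENTS.get()
        try:
            handle(event)
        finally:
            EVENTS.task_done()


def main() -> None:
    # Start one background worker, then serve webhook calls on port 8080.
    threading.Thread(target=worker, args=(print,), daemon=True).start()
    ThreadingHTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```

Calling `main()` starts the server. The handler only parses and buffers the payload, so response time stays short even when processing is slow.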

Server Configuration

10 TPS: 2 CPU, 2 GB RAM, SSD or I/O-optimised hard disk recommended

15 TPS: 4 CPU, 4 GB RAM, SSD or I/O-optimised hard disk recommended

20 TPS: 6 CPU, 6 GB RAM, SSD or I/O-optimised hard disk recommended

Make sure you have enough bandwidth if your bot sends and receives many media files.

How Queuing Works

Incoming Queue

Incoming messages are queued at the WhatsApp end once the allocated TPS is reached. If the queue exceeds the Core App's capacity, WhatsApp starts rejecting incoming messages, and those messages are lost permanently.

Delivery and read receipts are also part of the incoming message queue, since WhatsApp pushes them to the Docker container when they are available.

If your server gets overloaded, try removing the Event Webhook under Settings -> Webhook -> Event Webhook. This may decrease the load by 50%, because each outgoing message triggers 1 to 2 event webhooks to your server (Delivered and Read). Disabling it gives more priority to the incoming message webhook.

Delivery report events can also be disabled in the WhatsApp Docker container, which increases the usable TPS. We recommend this if your bot expects a high volume of messages. Delivery report events are enabled by default; please contact us if you need them disabled. Currently it is not possible to disable delivery reports for group messages.

Outgoing Queue

There is no queue at the WhatsApp end for outgoing messages, so you need to manage the outgoing message queue at your end based on your available TPS and message priority.

By default we do not queue messages from API requests, as we do not know their priority. If you would like us to enable the queue at our end, please get in touch with us. We can only queue messages sent through the Push API; we cannot queue messages sent as an instant response to the webhook.
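If you manage the outgoing queue at your end, a priority queue combined with simple pacing is one possible approach. The sketch below is an illustration only: the `OutgoingQueue` class, its priority convention and the `send` callback are hypothetical, and `send` stands in for whatever function calls the Push API in your application.

```python
import heapq
import itertools
import time
from typing import Callable, List, Tuple


class OutgoingQueue:
    """Priority queue that releases at most `rate` messages per second."""

    def __init__(self, rate: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._heap: List[Tuple[int, int, dict]] = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order per priority
        self._next_slot = 0.0

    def put(self, message: dict, priority: int = 10) -> None:
        """Queue a message; a lower number means a higher priority."""
        heapq.heappush(self._heap, (priority, next(self._counter), message))

    def drain(self, send: Callable[[dict], None]) -> None:
        """Send everything queued, spacing calls so the rate is not exceeded."""
        while self._heap:
            wait = self._next_slot - self._clock()
            if wait > 0:
                self._sleep(wait)
            _, _, message = heapq.heappop(self._heap)
            send(message)
            # The next send may start one interval after this one finished.
            self._next_slot = self._clock() + self._interval
```

Set `rate` below your allocated TPS, since incoming messages and delivery / read reports share the same limit. Urgent messages such as one-time passwords can be queued with a low priority number so they leave before bulk traffic.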

High Availability & MultiConnect

The standard WhatsApp Business API Client solution runs on a single Docker container. If you want to split the load and have multiple servers send and receive messages to and from WhatsApp, you can use the multiconnect solution offered by WhatsApp on top of it.

High Availability

The standard WhatsApp Business API Client solution runs on a single Docker container. High availability allows you to keep Docker containers on stand-by in case the primary Docker container goes down.

A high availability cluster requires at least two Master nodes and two Coreapp nodes as seen in the following diagram:

A High Availability setup does not increase the TPS, but it ensures business continuity: if one WhatsApp Docker container goes down, another automatically starts processing the messages.

We charge extra for configuring a high availability WhatsApp infrastructure.

MultiConnect Instances

With high availability, only one Docker container is responsible for sending and receiving messages from WhatsApp servers. If messaging traffic exceeds the maximum throughput of a single Docker container, there will be a backlog of message sends and message delivery latency will increase. To scale out the WhatsApp Business API Client, multiconnect supports sharding to spread loads across multiple Docker containers. Currently, WhatsApp only supports static sharding with a shard number of 1, 2, 4, 8, 16, or 32. High availability is a special case of multiconnect where the shard number is 1.

Please contact us if you wish to set up a multiconnect infrastructure. This service is charged extra.
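As a generic illustration of static sharding, the sketch below maps a key to one of a fixed number of shards with a stable hash. It is not a description of how WhatsApp distributes traffic internally; the `shard_for` function and the choice of SHA-256 are assumptions made for the example. Only the list of allowed shard numbers comes from the text above.

```python
import hashlib

# Shard numbers supported by static sharding, as listed above.
ALLOWED_SHARD_COUNTS = (1, 2, 4, 8, 16, 32)


def shard_for(recipient: str, shard_count: int) -> int:
    """Return a stable shard index in [0, shard_count) for a recipient."""
    if shard_count not in ALLOWED_SHARD_COUNTS:
        raise ValueError(f"shard_count must be one of {ALLOWED_SHARD_COUNTS}")
    # A cryptographic hash gives the same result on every run and machine,
    # unlike Python's built-in hash(), which is randomised per process.
    digest = hashlib.sha256(recipient.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

With a shard number of 1 every key maps to shard 0, which matches the high availability case where a single container handles all traffic.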

Groups & Scalability

If you are using WhatsApp Groups in your official WhatsApp account, please be aware of the following limitations:

  1. WhatsApp Groups is in beta

  2. Delivery / Read reports cannot be disabled in Docker

If you have a large number of groups with highly engaged users, this can badly affect your TPS. Here is how:

Assume you have 10 groups with 256 members each, i.e. 2,560 members in total. Each time a member reads a group message, an event webhook is triggered to your server, so a message posted to all 10 groups can lead to up to 2,560 read event triggers (256 per group). This consumes your TPS.

Please consider the above limitations when you create and manage groups. Currently it is not possible to disable delivery / read report events in Docker.
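The arithmetic in this example can be checked with a short sketch. It assumes one message is posted to every group and each member triggers one read event; the function name and the `events_per_member` parameter are illustrative only.

```python
from typing import Tuple


def group_event_load(groups: int, members_per_group: int, tps: float,
                     events_per_member: int = 1) -> Tuple[int, float]:
    """Return (total events, seconds needed to absorb them at the given TPS)."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    events = groups * members_per_group * events_per_member
    return events, events / tps


# 10 groups x 256 members at 10 TPS: 2560 events, 256 seconds of full TPS usage.
print(group_event_load(10, 256, 10))
```

Setting `events_per_member=2` models the case where both a delivery and a read report arrive for every member, which doubles the load.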
