How to change the number of replicas of a Kafka topic?

Apache Kafka

Apache Kafka Problem Overview


After a Kafka topic has been created by a producer or an administrator, how would you change the number of replicas of this topic?

Apache Kafka Solutions


Solution 1 - Apache Kafka

To increase the number of replicas for a given topic you have to:

  1. Specify the extra replicas in a custom reassignment json file

For example, you could create increase-replication-factor.json and put this content in it:

{"version":1,
  "partitions":[
     {"topic":"signals","partition":0,"replicas":[0,1,2]},
     {"topic":"signals","partition":1,"replicas":[0,1,2]},
     {"topic":"signals","partition":2,"replicas":[0,1,2]}
]}

2. Use the file with the --execute option of the kafka-reassign-partitions tool

[or kafka-reassign-partitions.sh - depending on the kafka package]

For example:

$ kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute

3. Verify the replication factor with the kafka-topics tool

[or kafka-topics.sh - depending on the kafka package]

 $ kafka-topics --zookeeper localhost:2181 --topic signals --describe

Topic:signals	PartitionCount:3	ReplicationFactor:3	Configs:retention.ms=1000000000
Topic: signals	Partition: 0	Leader: 2	Replicas: 0,1,2 Isr: 2,0,1
Topic: signals	Partition: 1	Leader: 2	Replicas: 0,1,2 Isr: 2,0,1
Topic: signals	Partition: 2	Leader: 2	Replicas: 0,1,2 Isr: 2,0,1

See also: the part of the official documentation that describes how to increase the replication factor.

Solution 2 - Apache Kafka

Edit: I was proven to be wrong - please check excellent answer from Łukasz Dumiszewski.

I'm leaving my original answer for completness for now.



I don't think you can. Normally it would be something like

> ./kafka-topics.sh --zookeeper localhost:2181 --alter --topic test2 > --replication-factor 3

but it says

> Option "[replication-factor]" can't be used with option"[alter]"

It is funny that you can change number of partitions on the fly (which is often hugely destructive action when done in runtime), but cannot increase replication factor, which should be transparent. But remember, it is 0.10, not 10.0... Please see here for enhancement request https://issues.apache.org/jira/browse/KAFKA-1543

Solution 3 - Apache Kafka

You can also use kafkactl for this:

# first run with --validate-only to see what kafkactl will do
kafkactl alter topic my-topic --replication-factor 2 --validate-only

# then do the replica reassignment
kafkactl alter topic my-topic --replication-factor 2

Note that the Kafka API that kafkactl is using for this is only available for Kafka ≥ 2.4.0.

Disclaimer: I am contributor to this project

Solution 4 - Apache Kafka

Łukasz Dumiszewski's answer is correct but manually generating that file is a bit hard. Luckily there are some easy ways to achieve what @Łukasz Dumiszewski said.

  • If you are using kafka-manager tool, from version 2.0.0.2 you can change the replication factor in Generate Partition Assignment section in a topic view. Then you should click on Reassign Partitions to apply the generated partition assignment (if you select a different replication factor, you will get a warning but you can click on Force Reassign afterward).

  • If you have ruby installed you can use this helper script

  • If you prefer nodejs you can generate the file with this gist too.

Solution 5 - Apache Kafka

This script may help you, if you want change replication factor for all topics:

#!/bin/bash

topics=`kafka-topics --list --zookeeper zookeeper:2181`

while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
  "partitions":[' > tmp.json
for t in $topics; do 
    if [ "${t}" == "${lines[-1]}" ]; then
        echo "    {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}" >> tmp.json
    else
        echo "    {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}," >> tmp.json
    fi
done

echo '  ]
}' >> tmp.json

kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file tmp.json --execute

Solution 6 - Apache Kafka

The scripted answer of @Дмитрий-Шепелев did not include a solution for topics with multiple partitions. This updated version does:

#!/bin/bash

brokerids="1,2,3"
topics=`kafka-topics --list --zookeeper zookeeper:2181`

while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
  "partitions":['
for t in $topics; do
    sep=","
    pcount=$(kafka-topics --describe --zookeeper zookeeper:2181 --topic $t | awk '{print $2}' | uniq -c |awk 'NR==2{print $1}')
    for i in $(seq 0 $[pcount - 1]); do
        if [ "${t}" == "${lines[-1]}" ] && [ "$[pcount - 1]" == "$i" ]; then sep=""; fi
        randombrokers=$(echo "$brokerids" | sed -r 's/,/ /g' | tr " " "\n" | shuf | tr  "\n" "," | head -c -1)
        echo "    {\"topic\":\"${t}\",\"partition\":${i},\"replicas\":[${randombrokers}]}$sep"
    done
done

echo '  ]
}'

Note: it also randomizes the brokers and picks two replicas per partition. So make sure the brokerid's in the script are correctly defined.

Execute as follows:

$ ./reassign.sh > reassign.json
$ kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file reassign.json --execute

Solution 7 - Apache Kafka

If you have a lot of partitions, using kafka-reassign-partitions to generate the json file required by Łukasz Dumiszewski's answer (and the official documentation) can be a timesaver. Here is an example of replicating a 64 partition topic from 1 to 2 servers without having to specify all the partitions:

expand_topic=TestTopic
current_server=111
new_servers=111,222
echo '{"topics": [{"topic":"'${expand_topic}'"}], "version":1}' > /tmp/topics-to-expand.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file /tmp/topics-to-expand.json --broker-list "${current_server}" --generate | tail -1 | sed s/\\[${current_server}\\]/\[${new_servers}\]/g | tee /tmp/topic-expand-plan.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /tmp/topic-expand-plan.json --execute
/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic ${expand_topic}

Outputs:

Topic:TestTopic	PartitionCount:64	ReplicationFactor:2	Configs:retention.ms=6048000
    Topic: TestTopic	Partition: 0	Leader: 111	Replicas: 111,222	Isr: 111,222
    Topic: TestTopic	Partition: 1	Leader: 111	Replicas: 111,222	Isr: 111,222
    ....

Solution 8 - Apache Kafka

1. Copy all topics to json file

#!/bin/bash
topics=`kafka-topics.sh --zookeeper localhost:2181 --list`

while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
 "topics":['
 for t in $topics; do
     echo -e '     { "topic":' \"$t\" '},'
done

echo '  ]
}'

bash alltopics.sh > alltopics.json

2. Run kafka-reassign-partitions.sh to generate rebalanced file

kafka-reassign-partitions.sh --zookeeper localhost:2181 --broker-list "0,1,2" --generate --topics-to-move-json-file alltopics.json > reassign.json

3. Cleanup reassign.json file it contains existing and proposed values

4. Run kafka-reassign-partitions.sh to rebalance topics

kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file reassign.json --execute

Solution 9 - Apache Kafka

In the first step we need to alter topics with replicas

./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2

then in the next step we need to identify brokers list where we need to sync our replicas and it requires topic rebalance to do this create a json file and define all the ISR brokers and topic

    {"version":1,
    "partitions":[
     {"topic":"test2","partition":0,"replicas":[0,10]},
     {"topic":"test2","partition":1,"replicas":[10,20]}
    ]}

In the last we need to rebalance the topics for partitions

./kafka-reassign-partitions.sh --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --reassignment-json-file /tmp/increase-replication-factor.json --execute

To verify

[root@prod-az-p2-kafka02 bin]# ./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2
Topic: test2	TopicId: -LoL36ztSeyC8rzvnp4YMw	PartitionCount: 2	ReplicationFactor: 2	Configs:
	Topic: test2	Partition: 0	Leader: 10	Replicas: 0,10	Isr: 10
	Topic: test2	Partition: 1	Leader: 20	Replicas: 10,20	Isr: 20,10

Solution 10 - Apache Kafka

This script will generate the JSON for kafka-reassign-partitions.sh and feed it into that script to increase the replication factor. The new set of replicas will:

  • Keep the current replicas
  • Add new unique brokers (this will prevent unneeded data migrations)

This script was tested with 2.8.0 Kafka scripts. Only the variables at the top of the file will need modified.

#!/bin/bash

KAFKA_BIN="./bin"
KAFKA_CONNECTION_ARGS="--bootstrap-server localhost:9094"

broker_ids="1,2,3"
topic="topic_foobar"
new_replication_factor=3 # New replication factor


reassignment_file="./reassignment.json"


#~~~~ Don't change anything after this line ~~~~#


# Generate a list of "partition|replicas"
topic_data="$("$KAFKA_BIN/kafka-topics.sh" $KAFKA_CONNECTION_ARGS --describe --topic "$topic" | tail -n +2 | sed -E 's/.*Partition:\s+([0-9]+).*Replicas:\s+([0-9,]+).*/\1|\2/g')"
partition_count=$(echo "$topic_data" | wc -l)

echo '{
	"version": 1,
	"partitions": [' > "$reassignment_file"


log_dirs="$(yes '"any"' | head -n $new_replication_factor | sed -e ':a;N;$!ba;s/\n/,/g')"
obj_sep=","
while read -r partition_data; do
	partition=$(echo "$partition_data" | cut -d '|' -f 1)
	replicas=$(echo "$partition_data" | cut -d '|' -f 2)

	# Randomize the replicas (using this list as a queue)
	random_replicas="$(echo $broker_ids | tr "," "\n" | shuf)"
	
	# Loop until the replicas has desired RF - 1 commas
	while [ "$(echo "$replicas" | tr -dc , | wc -c)" != $((new_replication_factor-1)) ]; do
		# Pick the next replica, add it to the list if it isn't already there, otherwise advance the queue
		next_replica="$(echo "$random_replicas" | head -1)"
		if [[ $replicas != *$next_replica* ]]; then
			replicas="$replicas,$next_replica"
		else
			random_replicas="$(echo "$random_replicas" | tail -n +2)"
		fi
	done
	
	# Don't add a comma on the last object
	if [ "$((partition_count-1))" == "$partition" ]; then obj_sep=""; fi
	
	echo '		{
			"topic": "'"$topic"'",
			"partition": '"$partition"',
			"replicas": ['"$replicas"'],
			"log_dirs": ['"$log_dirs"']
		}'$obj_sep >> "$reassignment_file"
done < <(echo "$topic_data")

echo '	]
}' >> "$reassignment_file"


cat "$reassignment_file"
read -p "Apply the above reassignment? (Ctrl-C to exit): "


"$KAFKA_BIN/kafka-reassign-partitions.sh" $KAFKA_CONNECTION_ARGS --execute --reassignment-json-file "$reassignment_file"

Solution 11 - Apache Kafka

To increase the number of replicas for a given topic you have to:

1. Specify the extra partitions to the existing topic with below command(let us say increase from 2 to 3)

bin/kafktopics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3

2. Specify the extra replicas in a custom reassignment json file

For example, you could create increase-replication-factor.json and put this content in it:

{"version":1,
  "partitions":[
     {"topic":"topic-to-increase","partition":0,"replicas":[0,1,2]},
     {"topic":"topic-to-increase","partition":1,"replicas":[0,1,2]},
     {"topic":"topic-to-increase","partition":2,"replicas":[0,1,2]}
]}

3. Use the file with the --execute option of the kafka-reassign-partitions tool

bin/kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute

4. Verify the replication factor with the kafka-topics tool

bin/kafka-topics --zookeeper localhost:2181 --topic topic-to-increase --describe

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionGuruPoView Question on Stackoverflow
Solution 1 - Apache KafkaŁukasz DumiszewskiView Answer on Stackoverflow
Solution 2 - Apache KafkaArtur BiesiadowskiView Answer on Stackoverflow
Solution 3 - Apache KafkaD-rkView Answer on Stackoverflow
Solution 4 - Apache KafkaAmir Masud Zare BidakiView Answer on Stackoverflow
Solution 5 - Apache KafkaДмитрий ШепелевView Answer on Stackoverflow
Solution 6 - Apache KafkaFilidor WieseView Answer on Stackoverflow
Solution 7 - Apache KafkaMilesHampsonView Answer on Stackoverflow
Solution 8 - Apache Kafkabhargav joshiView Answer on Stackoverflow
Solution 9 - Apache KafkaMansur Ul HasanView Answer on Stackoverflow
Solution 10 - Apache KafkaGarrett PooreView Answer on Stackoverflow
Solution 11 - Apache KafkaSivaPhaniView Answer on Stackoverflow