static Map<Integer,List<Integer>> |
assignReplicasToBrokers(Collection<BrokerMetadata> brokerMetadatas,
int nPartitions,
int replicationFactor,
int fixedStartIndex,
int startPartitionId)
There are 3 goals of replica assignment:
- Spread the replicas evenly among brokers.
- For partitions assigned to a particular broker, their other replicas are spread over the other brokers.
- If all brokers have rack information, assign the replicas for each partition to different racks if possible.
To achieve the first two goals, that is, to assign replicas without considering racks, we:
- Assign the first replica of each partition by round-robin, starting from a random position in the broker list.
- Assign the remaining replicas of each partition with an increasing shift.
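Here is an example of assigning 10 partitions (p0 to p9) with a replication factor of 3 to 5 brokers:
broker-0 | broker-1 | broker-2 | broker-3 | broker-4 | |
p0 | p1 | p2 | p3 | p4 | (1st replica) |
p5 | p6 | p7 | p8 | p9 | (1st replica) |
p4 | p0 | p1 | p2 | p3 | (2nd replica) |
p8 | p9 | p5 | p6 | p7 | (2nd replica) |
p3 | p4 | p0 | p1 | p2 | (3rd replica) |
p7 | p8 | p9 | p5 | p6 | (3rd replica) |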
static List<Integer> |
getRackAlternatedBrokerList(Map<Integer,String> brokerRackMap)
Given broker and rack information, returns a list of brokers alternated by the rack.
Constructor Detail
-
AdminUtils
public AdminUtils()
-
Method Detail
-
assignReplicasToBrokers
public static Map<Integer,List<Integer>> assignReplicasToBrokers(Collection<BrokerMetadata> brokerMetadatas,
int nPartitions,
int replicationFactor,
int fixedStartIndex,
int startPartitionId)
There are 3 goals of replica assignment:
- Spread the replicas evenly among brokers.
- For partitions assigned to a particular broker, their other replicas are spread over the other brokers.
- If all brokers have rack information, assign the replicas for each partition to different racks if possible.
To achieve the first two goals, that is, to assign replicas without considering racks, we:
- Assign the first replica of each partition by round-robin, starting from a random position in the broker list.
- Assign the remaining replicas of each partition with an increasing shift.
Here is an example of assigning 10 partitions (p0 to p9) with a replication factor of 3 to 5 brokers:
broker-0 | broker-1 | broker-2 | broker-3 | broker-4 | |
p0 | p1 | p2 | p3 | p4 | (1st replica) |
p5 | p6 | p7 | p8 | p9 | (1st replica) |
p4 | p0 | p1 | p2 | p3 | (2nd replica) |
p8 | p9 | p5 | p6 | p7 | (2nd replica) |
p3 | p4 | p0 | p1 | p2 | (3rd replica) |
p7 | p8 | p9 | p5 | p6 | (3rd replica) |
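The two steps above can be sketched in a few lines of Java. This is a minimal illustration that reproduces the table, not the AdminUtils source: the class name, the `followerIndex` helper, and the choice to fix both the start position and the initial shift to `startIndex` (instead of picking them at random) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and structure are hypothetical.
final class ReplicaAssignmentSketch {

    // Position of a follower replica in the broker list. The offset is always in
    // [1, nBrokers - 1], so a follower never lands on the first replica's broker.
    static int followerIndex(int firstIndex, int shift, int follower, int nBrokers) {
        int offset = 1 + (shift + follower) % (nBrokers - 1);
        return (firstIndex + offset) % nBrokers;
    }

    // Assumption: startIndex is used both as the round-robin start position and
    // as the initial shift. A value of 0 reproduces the table above.
    static Map<Integer, List<Integer>> assign(List<Integer> brokers, int nPartitions,
                                              int replicationFactor, int startIndex) {
        int n = brokers.size();
        if (replicationFactor < 1 || replicationFactor > n) {
            throw new IllegalArgumentException("replication factor must be in [1, " + n + "]");
        }
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        int shift = startIndex;
        for (int p = 0; p < nPartitions; p++) {
            // After each full pass over the broker list, increase the shift so the
            // followers of the next pass land on a different set of brokers.
            if (p > 0 && p % n == 0) {
                shift++;
            }
            int first = (p + startIndex) % n; // first replica: plain round-robin
            List<Integer> replicas = new ArrayList<>();
            replicas.add(brokers.get(first));
            for (int j = 0; j < replicationFactor - 1; j++) {
                replicas.add(brokers.get(followerIndex(first, shift, j, n)));
            }
            result.put(p, replicas);
        }
        return result;
    }
}
```

With brokers 0 to 4, 10 partitions, a replication factor of 3 and a start index of 0, this sketch places p0 on brokers 0, 1, 2 and p5 on brokers 0, 2, 3, matching the table columns.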
To create a rack-aware assignment, this API first creates a rack-alternated broker list. For example,
from this broker ID -> rack mapping:
0 -> "rack1", 1 -> "rack3", 2 -> "rack3", 3 -> "rack2", 4 -> "rack2", 5 -> "rack1"
The rack alternated list will be:
0, 3, 1, 5, 4, 2
Then a simple round-robin assignment can be applied. Assuming 6 partitions with a replication factor of 3, the
assignment will be:
0 -> 0,3,1
1 -> 3,1,5
2 -> 1,5,4
3 -> 5,4,2
4 -> 4,2,0
5 -> 2,0,3
Once it has completed the first round-robin, if there are more partitions to assign, the algorithm will start
shifting the followers. This is to ensure we will not always get the same set of sequences.
In this case, if there is another partition to assign (partition #6), the assignment will be:
6 -> 0,4,2 (instead of repeating 0,3,1 as partition 0)
The rack-aware assignment always chooses the 1st replica of the partition using round-robin on the rack-alternated
broker list. For the rest of the replicas, it is biased towards brokers on racks that do not yet have
a replica of the partition, until every rack has a replica. Then the assignment goes back to round-robin on
the broker list.
As a result, if the number of replicas is equal to or greater than the number of racks, each rack is guaranteed
to get at least one replica. Otherwise, each rack gets at most one replica. In the ideal
situation where the number of replicas is the same as the number of racks and each rack has the same number of
brokers, the replica distribution is guaranteed to be even across brokers and racks.
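The rack-aware behaviour described above can be sketched as follows. This is an illustration written to be consistent with the example assignments (partitions 0 to 6), not the AdminUtils source: the class and method names are hypothetical, the rack-alternated list is passed in rather than computed, and the start position and initial shift are both fixed to `startIndex`. Scaling the shift by the number of racks is likewise an assumption chosen so that partition 6 comes out as 0,4,2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and structure are hypothetical.
final class RackAwareAssignmentSketch {

    // Same helper as in the non-rack-aware case: offset in [1, nBrokers - 1].
    static int followerIndex(int firstIndex, int shift, int follower, int nBrokers) {
        int offset = 1 + (shift + follower) % (nBrokers - 1);
        return (firstIndex + offset) % nBrokers;
    }

    static Map<Integer, List<Integer>> assign(List<Integer> rackAlternatedBrokers,
                                              Map<Integer, String> rackOf,
                                              int nPartitions, int replicationFactor,
                                              int startIndex) {
        int n = rackAlternatedBrokers.size();
        if (replicationFactor < 1 || replicationFactor > n) {
            throw new IllegalArgumentException("replication factor must be in [1, " + n + "]");
        }
        int numRacks = new HashSet<>(rackOf.values()).size();
        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        int shift = startIndex;
        for (int p = 0; p < nPartitions; p++) {
            if (p > 0 && p % n == 0) {
                shift++; // start shifting followers after each full round-robin pass
            }
            int first = (p + startIndex) % n;
            int leader = rackAlternatedBrokers.get(first);
            List<Integer> replicas = new ArrayList<>();
            replicas.add(leader);
            Set<String> racksUsed = new HashSet<>();
            racksUsed.add(rackOf.get(leader));
            Set<Integer> brokersUsed = new HashSet<>();
            brokersUsed.add(leader);
            int k = 0; // candidate counter; walks the list past the first replica
            while (replicas.size() < replicationFactor) {
                int broker = rackAlternatedBrokers.get(
                        followerIndex(first, shift * numRacks, k, n));
                String rack = rackOf.get(broker);
                // Prefer racks without a replica until every rack has one.
                boolean rackOk = !racksUsed.contains(rack) || racksUsed.size() == numRacks;
                if (rackOk && !brokersUsed.contains(broker)) {
                    replicas.add(broker);
                    racksUsed.add(rack);
                    brokersUsed.add(broker);
                }
                k++;
            }
            result.put(p, replicas);
        }
        return result;
    }
}
```

Called with the rack-alternated list 0, 3, 1, 5, 4, 2, the broker ID -> rack mapping from the example, 7 partitions, a replication factor of 3 and a start index of 0, the sketch yields 0,3,1 for partition 0 and 0,4,2 for partition 6.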
- Returns:
- a Map from partition id to replica ids
- Throws:
AdminOperationException - If rack information is supplied but it is incomplete, or if it is not possible to
assign each replica to a unique rack.
-
getRackAlternatedBrokerList
public static List<Integer> getRackAlternatedBrokerList(Map<Integer,String> brokerRackMap)
Given broker and rack information, returns a list of brokers alternated by the rack. Assume
these are the racks and their brokers:
rack1: 0, 1, 2
rack2: 3, 4, 5
rack3: 6, 7, 8
This API would return the list 0, 3, 6, 1, 4, 7, 2, 5, 8.
This is essential to make sure that the assignReplicasToBrokers API can use such a list and
assign replicas to brokers in a simple round-robin fashion, while ensuring an even
distribution of leader and replica counts on each broker and that replicas are
distributed to all racks.
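The alternation can be illustrated with a short Java sketch. It is not the AdminUtils source: the class and method names are hypothetical, and visiting racks in sorted name order with brokers in ascending ID order within each rack is an assumption that happens to reproduce both examples in this documentation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names and ordering rules are assumptions.
final class RackAlternationSketch {

    static List<Integer> alternateByRack(Map<Integer, String> brokerRackMap) {
        // Group broker ids by rack. Iterating a TreeMap copy of the input visits
        // brokers in ascending id order; the outer TreeMap sorts racks by name.
        Map<String, List<Integer>> brokersByRack = new TreeMap<>();
        for (Map.Entry<Integer, String> e : new TreeMap<>(brokerRackMap).entrySet()) {
            brokersByRack.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());
        }
        List<Iterator<Integer>> perRack = new ArrayList<>();
        for (List<Integer> ids : brokersByRack.values()) {
            perRack.add(ids.iterator());
        }
        // Take one broker from each rack in turn, skipping racks that are exhausted.
        List<Integer> result = new ArrayList<>();
        int rack = 0;
        while (result.size() < brokerRackMap.size()) {
            Iterator<Integer> it = perRack.get(rack);
            if (it.hasNext()) {
                result.add(it.next());
            }
            rack = (rack + 1) % perRack.size();
        }
        return result;
    }
}
```

For the mapping 0 -> "rack1", 1 -> "rack3", 2 -> "rack3", 3 -> "rack2", 4 -> "rack2", 5 -> "rack1", this sketch returns 0, 3, 1, 5, 4, 2.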