Add distributor which schedules tasks fairly #333

morningman merged 4 commits into apache:master from
Conversation
RoutineLoadJob routineLoadJob = routineLoad.getJobByTaskId(routineLoadTaskInfo.getSignature());
RoutineLoadTask routineLoadTask = null;
if (routineLoadTaskInfo instanceof KafkaTaskInfo) {
    routineLoadTask = new KafkaRoutineLoadTask(routineLoadJob.getResourceInfo(),
Wouldn't it be better to add an interface method for creating the LoadTask, like RoutineLoadJob.createTask(TaskInfo)?
Is this better? @imay

I think so, because it encapsulates the creation of KafkaRoutineLoadTask inside KafkaRoutineLoadJob. If we add a new type of job, we don't need to change the code here; we just add another RoutineLoadJob and RoutineLoadTask.
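The factory-method approach being discussed could look roughly like this. This is a minimal sketch with stubbed, assumed class shapes, not the actual Doris classes:

```java
// Minimal sketch (stubbed, assumed class shapes) of the suggested factory
// method: each RoutineLoadJob subclass creates its own task type, so the
// scheduler no longer needs instanceof checks.
abstract class RoutineLoadJob {
    abstract RoutineLoadTask createTask(RoutineLoadTaskInfo taskInfo);
}

class RoutineLoadTaskInfo {
    final long signature;
    RoutineLoadTaskInfo(long signature) { this.signature = signature; }
}

abstract class RoutineLoadTask { }

class KafkaRoutineLoadTask extends RoutineLoadTask {
    final long signature;
    KafkaRoutineLoadTask(long signature) { this.signature = signature; }
}

class KafkaRoutineLoadJob extends RoutineLoadJob {
    @Override
    RoutineLoadTask createTask(RoutineLoadTaskInfo taskInfo) {
        // Kafka-specific construction is encapsulated here
        return new KafkaRoutineLoadTask(taskInfo.signature);
    }
}
```

The scheduler would then just call routineLoadJob.createTask(taskInfo); adding a new job type means adding one subclass pair, with no change at the call site.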
routineLoad.processTimeOutTasks();

// get idle be task num
int totalIdleTaskNum = routineLoad.getTotalIdleTaskNum();
Idle task? Or a task waiting to be scheduled? Could you pick a clearer name?
// get idle be task num
int totalIdleTaskNum = routineLoad.getTotalIdleTaskNum();
int allocatedTaskNum = 0;
runningTask? I think 'allocate' is usually used for resources, and a task is scheduled, not allocated?
private RoutineLoad routineLoad = Catalog.getInstance().getRoutineLoadInstance();

@Override
protected void runOneCycle() {
I suggest you move this logic into a separate function, e.g. process(), and have this function call process() and catch all Throwables, so that this function cannot throw a RuntimeException such as NullPointerException.
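The suggested pattern could be sketched like this (class and method names are assumed for illustration):

```java
// Sketch of the suggested pattern (names assumed): runOneCycle() only
// delegates to process() and catches Throwable, so an unexpected
// RuntimeException (e.g. NullPointerException) cannot kill the daemon thread.
class RoutineLoadSchedulerSketch {
    private int cyclesSurvived = 0;

    protected void runOneCycle() {
        try {
            process();
        } catch (Throwable e) {
            // log and continue; the next cycle will retry
            System.err.println("Failed to process one round: " + e);
        }
        cyclesSurvived++;
    }

    private void process() {
        // scheduling work; may fail unexpectedly at runtime
        throw new NullPointerException("simulated failure");
    }

    int getCyclesSurvived() { return cyclesSurvived; }
}
```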
public RoutineLoadTaskInfo(long signature) {
    this.signature = signature;
    this.lock = new ReentrantReadWriteLock(true);
Does this simple class need a lock?

After discussing with morningman, I will share one lock among the different RoutineLoadTaskInfo instances, because the time spent holding the lock is negligible.
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.stream.Collectors;

public class RoutineLoad {
I think RoutineLoadManager is better.

Why Load and not LoadManager?

I think RoutineLoadManager is better too.
// diff beIds and beIdToMaxConcurrentTasks.keys()
List<Long> newBeIds = beIds.parallelStream().filter(entity -> beIdToMaxConcurrentTasks.get(entity) == null)
        .collect(Collectors.toList());
List<Long> decommissionBeIds = beIdToMaxConcurrentTasks.keySet().parallelStream()
Do not name it 'decommissionBeIds', because 'decommission' means the backend is being decommissioned. Just name it 'unavailableBeIds'.
    beIdToConcurrentTasks.remove(beId);
}
LOG.info("There are {} backends which participate in routine load scheduler. "
        + "There are {} new bes and {} decommission bes for routine load",

private Map<Long, RoutineLoadTaskInfo> idToRoutineLoadTask;
// KafkaPartitions means that partitions belong to one task
// kafka partitions == routine load task (logical)
private Queue<RoutineLoadTaskInfo> needSchedulerRoutineLoadTask;
needSchedulerRoutineLoadTask -> needSchedulerRoutineLoadTasks
runningTasks.removeAll(needSchedulerRoutineLoadTask);

for (RoutineLoadTaskInfo routineLoadTaskInfo : runningTasks) {
    routineLoadTaskInfo.writeLock();
It's highly recommended NOT to expose a lock outside a class. It will cause a lot of trouble.

And I can't see why you need a lock here. Nothing needs to be protected in RoutineLoadTaskInfo?

'idToRoutineLoadTask' is a member of RoutineLoad; it should not be protected by a lock inside RoutineLoadTaskInfo.

Logically, each task can only be in one operation at a time, either processTimeOutTasks or a task commit, so idToRoutineLoadTask really needs a segment lock. Since committing a task takes a long time, I will use one shared lock instead of one lock per task.
    idToRoutineLoadTask.put(kafkaTaskInfo.getSignature(), kafkaTaskInfo);
    needSchedulerRoutineLoadTask.add(kafkaTaskInfo);
}
LOG.debug("Task {} was ran more then {} minutes. It was removed and rescheduled");

for (Map.Entry<Long, Integer> entry : beIdToMaxConcurrentTasks.entrySet()) {
    if (beIdToConcurrentTasks.get(entry.getKey()) == null) {
        result = maxIdelTaskNum < entry.getValue() ? entry.getKey() : result;
        maxIdelTaskNum = Math.max(maxIdelTaskNum, entry.getValue());
maxIdelTaskNum: misspelling.

Please check all of the 'idel' misspellings...

I will pay attention next time =_=
if (routineLoadTaskInfo != null) {
    // when routine load task is not abandoned
    if (routineLoad.getIdToRoutineLoadTask().get(routineLoadTaskInfo.getSignature()) != null) {
        long beId = routineLoad.getMinTaskBeId();
What if routineLoad.getMinTaskBeId() returns 0, which means no backend has available slots to work?

It will not return 0, because clusterIdleSlotNum is more than 0.
        routineLoad.addNumOfConcurrentTasksByBeId(beId);
    }
} else {
    LOG.debug("Task {} for job already has been discarded", routineLoadTaskInfo.getSignature());
The correct grammar is 'has already been', not 'already has been'.
// TODO(ml): init load task
kafkaRoutineLoadTaskList.add(new KafkaRoutineLoadTask(getResourceInfo(), 0L, TTaskType.PUSH,
        dbId, tableId, 0L, 0L, 0L, SystemIdGenerator.getNextId()));
kafkaRoutineLoadTaskList.add(new KafkaTaskInfo(SystemIdGenerator.getNextId()));
SystemIdGenerator will produce duplicate ids if the Frontend restarts or the Master FE changes. Just using a UUID or a random Long is better, if you don't want to persist this info.

I think maybe I can reuse CatalogIdGenerator, but CatalogIdGenerator would need an additional name field.
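The reviewer's UUID suggestion could be sketched like this (the class name is an assumption, purely for illustration):

```java
import java.util.UUID;

// Sketch of the reviewer's suggestion: if the id does not need to be
// persisted, derive it from a UUID so that a Frontend restart or a Master
// FE change cannot hand out a duplicate id.
class TransientIdGenerator {
    static long nextId() {
        // mask to a non-negative long; 63 random bits make collisions
        // negligible for the number of tasks a scheduler creates
        return UUID.randomUUID().getLeastSignificantBits() & Long.MAX_VALUE;
    }
}
```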
List<RoutineLoadTask> routineLoadTaskList = null;
List<RoutineLoadTaskInfo> routineLoadTaskList = null;
try {
    routineLoadJob.writeLock();
Using routineLoadJob.writeLock() for protection here is weird. Also, a lock should be used like this (lock() sits outside the try{}):
lock();
try {
} finally {
    unlock();
}

Using routineLoadJob.writeLock() means that processing of different RoutineLoadJobs will not block each other.
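The idiom the reviewer asks for could be sketched as a complete example (class and field names assumed):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the requested idiom: lock() sits OUTSIDE the try, so the
// unlock() in finally only runs when the lock was actually acquired.
class LockIdiomSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private int value = 0;

    void increment() {
        lock.writeLock().lock();   // outside the try block
        try {
            value++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int get() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

If lock() were inside the try and acquisition (or anything before it) threw, the finally block would call unlock() on a lock this thread never acquired, raising IllegalMonitorStateException on top of the original failure.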
382f4df to 38f90fd
Step 1: updateBeIdTaskMaps, remove dead BEs and add newly alive BEs.
Step 2: process timeout tasks; if a task has already been allocated to a BE but has not finished within DEFAULT_TASK_TIMEOUT_MINUTES, it will be discarded. At the same time, the partitions belonging to the old task will be allocated to a new task. The new task, with a new signature, will be added to the needSchedulerRoutineLoadTask queue.
Step 3: process all of needSchedulerRoutineLoadTask, allocating each task to a BE. The task will be executed by the backend.
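The three steps above can be sketched as follows. All names, types, and the timeout constant here are simplified assumptions, not the actual implementation:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Simplified sketch (all names and types assumed) of the three steps in
// the description: refresh BE maps, requeue timed-out tasks, then assign
// every queued task to a backend.
class SchedulerCycleSketch {
    static final long TASK_TIMEOUT_MS = 5 * 60 * 1000;

    final Queue<Long> needSchedulerTasks = new ArrayDeque<>();
    int scheduledCount = 0;

    void runOneCycle(List<Long> aliveBeIds, List<long[]> runningTasks /* {signature, startMs} */) {
        // Step 1: updateBeIdTaskMaps -- drop dead BEs, add newly alive ones
        updateBeIdTaskMaps(aliveBeIds);

        // Step 2: discard timed-out tasks; their partitions are requeued
        // under a fresh signature
        long now = System.currentTimeMillis();
        for (long[] task : runningTasks) {
            if (now - task[1] > TASK_TIMEOUT_MS) {
                needSchedulerTasks.add(task[0] + 1 /* stand-in for a new signature */);
            }
        }

        // Step 3: assign every queued task to a backend
        while (!needSchedulerTasks.isEmpty()) {
            needSchedulerTasks.poll();
            scheduledCount++;
        }
    }

    void updateBeIdTaskMaps(List<Long> aliveBeIds) { /* omitted in this sketch */ }
}
```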
public Map<Long, RoutineLoadTask> getIdToRoutineLoadTask() {
    return idToRoutineLoadTask;
public Map<Long, RoutineLoadTaskInfo> getIdToRoutineLoadTask() {
    readLock();
This lock protects nothing...
After the caller gets 'idToRoutineLoadTask', it can do anything with it without lock protection.
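One way to address this comment is to return a snapshot made while holding the read lock, instead of handing out the live map. A sketch with assumed names and simplified types:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch (names assumed): the getter copies the map under the read lock,
// so callers can never mutate or race on the internal state.
class RoutineLoadMapSketch {
    private final Map<Long, String> idToRoutineLoadTask = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    void putTask(long signature, String task) {
        lock.writeLock().lock();
        try {
            idToRoutineLoadTask.put(signature, task);
        } finally {
            lock.writeLock().unlock();
        }
    }

    Map<Long, String> getIdToRoutineLoadTask() {
        lock.readLock().lock();
        try {
            // snapshot: mutations by the caller cannot corrupt internal state
            return new HashMap<>(idToRoutineLoadTask);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```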
    }
}

public long getMinTaskBeId() {
It DOES return 0L in some cases.
}

public Queue<RoutineLoadTaskInfo> getNeedSchedulerRoutineLoadTasks() {
    readLock();
Still, this lock protects nothing.
    }
}

public void processTimeOutTasks() {

for (RoutineLoadTaskInfo routineLoadTaskInfo : runningTasks) {
    if ((System.currentTimeMillis() - routineLoadTaskInfo.getLoadStartTimeMs())
            > DEFAULT_TASK_TIMEOUT_MINUTES * 60 * 1000) {
Is 5 min too long? 10 sec, I think?
// judge nums of tasks more then max concurrent tasks of cluster
List<RoutineLoadTask> routineLoadTaskList = null;
List<RoutineLoadTaskInfo> routineLoadTaskList = null;
routineLoadJob.writeLock();

try {
    process();
} catch (Throwable e) {
    LOG.error("Failed to process one round of RoutineLoadTaskScheduler with error message {}",
The warn level is appropriate here.
for (RoutineLoadTaskInfo routineLoadTaskInfo : runningTasks) {
    if ((System.currentTimeMillis() - routineLoadTaskInfo.getLoadStartTimeMs())
            > DEFAULT_TASK_TIMEOUT_SECONDS * 60 * 1000) {

} finally {
    readUnlock();
    switch (jobState) {
        case NEED_SCHEDULER:
Why not just:
stateJobs = idToRoutineLoadJob.values().stream()
        .filter(entity -> entity.getState() == jobState)
        .collect(Collectors.toList());

The switch-case is unnecessary.
}

private void process() {
private void process() throws LoadException {
The default interval of the Daemon thread is 30 seconds, which means you have to wait at least 30 seconds to schedule the next batch of tasks? Maybe you need a trigger mechanism?
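A trigger mechanism like the one suggested could be sketched with a Condition (all names assumed): the daemon still wakes on its interval, but a producer can wake it immediately after enqueuing work.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch (names assumed) of a trigger: awaitWork() sleeps for at most the
// interval, but trigger() can end the wait early.
class TriggerableDaemonSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition wakeup = lock.newCondition();

    /** Called by producers right after enqueuing a task. */
    void trigger() {
        lock.lock();
        try {
            wakeup.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Returns true if woken by a trigger, false on interval timeout. */
    boolean awaitWork(long intervalMs) throws InterruptedException {
        lock.lock();
        try {
            return wakeup.await(intervalMs, TimeUnit.MILLISECONDS);
        } finally {
            lock.unlock();
        }
    }
}
```

A real implementation should pair the condition with a "work pending" flag checked before waiting, because a signal sent while no thread is waiting is otherwise lost, and Condition.await is allowed to wake up spuriously.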
    default:
        break;
}
idToRoutineLoadJob.values().stream()
You missed assigning the 'stateJobs' variable....