16. Unified Iteration for Streaming / Batch: API
public static class ModelCacheFunction extends ProcessFunction<double[], double[]>
        implements IterationListener<double[]> {

    private final double[] parameters = new double[N_DIM];

    @Override
    public void processElement(double[] update, Context ctx, Collector<double[]> output) {
        // Suppose we have a util to add the second array to the first.
        ArrayUtils.addWith(parameters, update);
    }

    @Override
    public void onEpochWatermarkIncremented(int epochWatermark, Context context, Collector<double[]> collector) {
        // Feed the updated model back for another round until the epoch budget is exhausted.
        if (epochWatermark < N_EPOCH * N_BATCH_PER_EPOCH) {
            collector.collect(parameters);
        }
    }

    public void onIterationEnd(int[] round, Context context) {
        // Emit the final model to a side output when the iteration terminates.
        context.output(FINAL_MODEL_OUTPUT_TAG, parameters);
    }
}
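The listener above relies on Flink's iteration runtime to drive the callbacks. As a minimal, Flink-free sketch of the same accumulate-then-feed-back pattern (class and field names here are illustrative stand-ins, not the Flink ML API), the logic can be exercised directly:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the operator above: accumulate parameter updates, and each
// time the epoch watermark advances, emit a snapshot of the current model
// (which the iteration would feed back into the next round).
class ModelCacheSketch {
    static final int N_DIM = 3;
    static final int MAX_EPOCHS = 2; // plays the role of N_EPOCH * N_BATCH_PER_EPOCH

    private final double[] parameters = new double[N_DIM];
    final List<double[]> emitted = new ArrayList<>();

    // Mirrors processElement: add the update into the cached parameters.
    void processElement(double[] update) {
        for (int i = 0; i < N_DIM; i++) {
            parameters[i] += update[i];
        }
    }

    // Mirrors onEpochWatermarkIncremented: keep feeding the model back
    // until the epoch budget is used up.
    void onEpochWatermarkIncremented(int epochWatermark) {
        if (epochWatermark < MAX_EPOCHS) {
            emitted.add(parameters.clone());
        }
    }
}
```

The clone on emission matters: the cached array keeps mutating across rounds, so each snapshot must be a copy.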
[Figure: iteration topology. An Initial Model seeds a Model Cache; a bounded Training Data Source feeds n Training Nodes, whose Model Updates flow back into the Model Cache; the final model is emitted as output. The cache publishes a new model only after collecting updates from all training nodes, achieving synchronous computation.]
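The synchronous step described above, publishing a new model only once updates from all n training nodes have arrived, can be sketched outside Flink as a simple counting barrier (class and method names here are hypothetical, not the Flink ML API):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the model cache's synchronization: accumulate per-node
// updates, and broadcast a new model only when the round is complete.
class SyncModelCache {
    private final int numWorkers;
    private final double[] model;
    private int updatesReceived = 0;
    final List<double[]> published = new ArrayList<>();

    SyncModelCache(int numWorkers, int dim) {
        this.numWorkers = numWorkers;
        this.model = new double[dim];
    }

    // Called once per training-node update. Publishing is deferred until
    // all workers have reported, which is what makes the round synchronous.
    void onUpdate(double[] update) {
        for (int i = 0; i < update.length; i++) {
            model[i] += update[i];
        }
        if (++updatesReceived == numWorkers) {
            published.add(model.clone()); // broadcast the new model
            updatesReceived = 0;          // begin the next round
        }
    }
}
```

In the real topology this barrier is realized by the epoch watermark: `onEpochWatermarkIncremented` fires only after every input channel's records for the round have been processed.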
24. References
• FLIP-173: Support DAG of algorithms
• FLIP-174: Improve the WithParam interface
• FLIP-175: Compose Estimator/Model/AlgoOperator from DAG of Estimator/Model/AlgoOperator
• FLIP-176: Unified Iteration to Support Algorithms
• Flink ML repo: github.com/apache/flink-ml
• Deep Learning on Flink: github.com/flink-extended/flink-ai-extended/tree/master/deep-learning-on-flink