Hi,
Before diving into the weight of an implementation, I will mention that we already have de facto standards in this field:
- rx
- microprofile-reactive (even if this one is neither adopted nor integrable)
- Spring Reactor (Mono)
- at some point Apache Camel (even if this one is more geared toward long-running instances, its design matches and it is regularly used for batches)
What is important to see is that all these APIs - including Java SE's CompletionStage - make it possible to define at least:
1. a flow thanks to a fluent API
2. a reactive model thanks to a push/event-like API
To answer your questions:
> 1. static, XML job definition
We could indeed imagine it, since in the end it is about having a flow DSL, reactive or not. However I strongly think it is not needed, and it goes a bit against the Jakarta spirit: since 1999, XML descriptors have slowly been dropped from new specs because Java developers abandoned them in practice.
The other big advantage of not using XML is being type safe by construction, instead of relying on a Maven plugin that checks the job.xml and still fails at runtime because the data in the step/job context is not what was expected.
If desired, users can make the flow configurable using JSON-P/JSON-B/JAXB/whatever fits their application configuration, rather than a custom solution that must rely on system properties and does not integrate properly with their environment (think of Kubernetes, where the batch will just be a synchronous main with state persistence, but needs ConfigMap/Secret support, which is not built into the spec).
> 2. programmatic, synchronous job definition
Being reactive does not mean being asynchronous. Typically:
final var jobPromise = completedFuture(...);
jobPromise.get();
is actually synchronous, because the implementation of the code pushing the result is synchronous. What makes it reactive is the capacity to combine it with other steps which "react" to the completion of the previous step. The typical example is a NIO call ("when ready, call the next step"), but if the previous call is immediately ready - or is not done asynchronously/in another thread - then "when ready" means "now".
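To make this concrete, here is a minimal self-contained sketch (plain Java SE; the pipeline and step contents are made up for illustration) of a CompletionStage chain that is composed reactively but executes entirely synchronously on the calling thread:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical illustration: the stages "react" to completion of the previous one,
// but since the root future is already completed, every stage runs immediately,
// on the calling thread - reactive composition, synchronous execution.
public class SyncReactiveDemo {
    static String runPipeline() {
        CompletionStage<String> job = CompletableFuture
                .completedFuture("data")           // already completed, no other thread involved
                .thenApply(d -> d + "->read")      // runs now, on the calling thread
                .thenApply(d -> d + "->processed") // same: "when ready" means "now"
                .thenApply(d -> d + "->done");
        return job.toCompletableFuture().join();   // returns immediately, nothing to wait for
    }

    public static void main(String[] args) {
        System.out.println(runPipeline()); // data->read->processed->done
    }
}
```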
> 3. programmatic, async / reactive definition
The nice thing about reactive support is that it unifies sync and async programming models.
The best option for us would be to rely on the Java SE Flow interfaces (java.util.concurrent.Flow, in the JDK since Java 9), since that would solve chunking too with a standard, Java SE interoperable API, and a batchlet would just be a function returning a CompletionStage.
To rephrase this part: once you have a reactive API you don't need a synchronous API.
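As a sketch of what that could give (assumed shapes, not an existing spec API: the chunk/batchlet mapping below is my own illustration on top of java.util.concurrent.Flow):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical sketch on top of java.util.concurrent.Flow (Java 9+):
// a batchlet is just a function returning a CompletionStage, and chunking
// maps naturally to backpressure (request items chunk by chunk).
public class FlowChunkSketch {
    // "Batchlet" shape: no container contract, just a completion signal.
    static CompletionStage<String> batchlet() {
        return CompletableFuture.completedFuture("COMPLETED");
    }

    // "Chunk" shape: subscribe to an item source and pull `chunkSize` items at a time.
    static CompletionStage<List<Integer>> chunk(Flow.Publisher<Integer> items, int chunkSize) {
        CompletableFuture<List<Integer>> done = new CompletableFuture<>();
        items.subscribe(new Flow.Subscriber<Integer>() {
            final List<Integer> read = new ArrayList<>();
            Flow.Subscription subscription;

            @Override
            public void onSubscribe(Flow.Subscription s) {
                subscription = s;
                s.request(chunkSize); // backpressure: ask for one chunk
            }

            @Override
            public void onNext(Integer item) {
                read.add(item);
                if (read.size() % chunkSize == 0) { // chunk boundary: where a runtime would checkpoint
                    subscription.request(chunkSize);
                }
            }

            @Override
            public void onError(Throwable error) {
                done.completeExceptionally(error);
            }

            @Override
            public void onComplete() {
                done.complete(read);
            }
        });
        return done;
    }

    static List<Integer> runDemo() {
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            CompletionStage<List<Integer>> chunked = chunk(publisher, 2);
            List.of(1, 2, 3, 4, 5).forEach(publisher::submit);
            publisher.close();
            return chunked.toCompletableFuture().join();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // [1, 2, 3, 4, 5]
        System.out.println(batchlet().toCompletableFuture().join()); // COMPLETED
    }
}
```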
A job definition is then a supplier or a function (depending on whether we inject the configuration or let users read it from their own environment, and on how we want to wrap the CompletionStage to instrument it) returning a CompletionStage<BatchResult> which represents the end of the job.
In CDI land it can look like an observer - using the instrumentation from the root instance, which is probably the simplest in this sample, but it is just to illustrate one possible API:
void defineMyBatch(@Observes BatchDefinitionCollector collector) {
    collector.register("my-batch", root -> root
        .thenApply(myCustomBean::readDataSinceLastTime) // will use the app's @Inject EntityManager em;
        .thenApply(myOtherBean::process) // jbatch does not care much what it does, but functionally it is a processor
        .thenApply(this::toBatchResult));
}
If we want to be more explicit - extending the default SE API - it can look like:
// the collector is the aggregator enabling the job repository to be initialized;
// the reverse pattern is to not initialize it and look the instance up at usage time,
// but that means we can't list the available batches, which is a blocker for admins/ops
void defineMyBatch(@Observes BatchDefinitionCollector collector) {
    collector.register("my-batch", root -> root
        .thenApply("step1", myCustomBean::readDataSinceLastTime) // will use the app's @Inject EntityManager em; and the app state (last execution time)
        .thenCompose("step2", myOtherBean::process) // jbatch does not care much what it does, but functionally it is a processor; the signature can look like CompletionStage<X> process(List<Data> list);
        .thenApply("step3", this::toBatchResult));
}
Indeed, which bean does what can be configured. Very concretely, I could register a Camel route as the "process" method, and even split it by component to map components/processors onto steps, reflecting the full Camel execution in jbatch state tracking.
The gain is obvious:
1. works with modern/current technology stacks
2. integrates smoothly and optionally iteratively (in terms of adoption) with any framework, by no longer being a container but a set of extension points (almost a library)
Indeed, the drawback is that it makes us rethink the whole design. But from what I saw, adoption was already very low when 1.0 shipped, and the spec was almost abandoned around the same time microservices took off, due to the new programming style Java adopted in the meantime. So I think it is worth evaluating this proposal to fulfill this scope - even if it becomes a new spec and the jbatch 1.x API style moves to maintenance mode, as CMP did when JPA popped up.
Hope it is a bit clearer.