Spring Batch
This is a reference/guide for software developers to understand/learn Spring Batch.
Jayasree Perilakkalam
Spring Batch Layered Architecture
• Reference: https://docs.spring.io/spring-batch/docs/3.0.x/reference/html/spring-batch-intro.html
• Application: all batch jobs and custom code written by developers using Spring Batch
• Spring Batch Core: the core runtime classes necessary to launch and control a batch job, such as JobLauncher, Job, and Step
• Infrastructure: common readers, writers, and services such as the RetryTemplate
Batch Stereotypes
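• Reference: https://docs.spring.io/spring-batch/docs/current/reference/html/domain.html#job
[Diagram: a JobLauncher launches a Job; a Job has one or more Steps; each Step has one ItemReader, one ItemProcessor, and one ItemWriter; JobLauncher, Job, and Step all use the JobRepository.]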
Spring Batch Domain/Concepts
• Job
• “Job” encapsulates an entire batch process. As is common with other Spring projects, a “Job” is
wired together using either an XML configuration or a Java configuration.
• A “Job” has one to many steps, each of which has exactly one ItemReader, one ItemProcessor, and one
ItemWriter. A “Job” allows for configuration of properties global to all steps, such as
restartability.
• A “Job” needs to be launched using a “JobLauncher” and the metadata about the currently
running process is stored in “JobRepository”.
• A default implementation of “Job” interface is provided in Spring Batch in the form of the
“SimpleJob” class. When using Java based configuration, a collection of builders is available for the
instantiation of “Job”.
• A “JobInstance” refers to the concept of a logical job run, so a “Job” can have many “JobInstance”
objects. A “Job” can be scheduled to run many times, and each of these logical runs is a “JobInstance”.
Each “JobInstance” is tracked separately; if it fails, it needs to be run again, so each “JobInstance”
can have multiple executions (“JobExecution”). Only one “JobInstance” corresponding to a particular
“Job” and identifying “JobParameters” can run at a given time.
• The definition of “JobInstance” has no bearing on the data to be loaded. It’s entirely up to the
“ItemReader” implementation to determine how the data is loaded.
• Whether the same “JobInstance” is used determines whether the state (i.e. the “ExecutionContext”) from the
previous execution is used. Using a new “JobInstance” means starting from the beginning, and using an existing
“JobInstance” generally means starting from where you left off.
• How is one “JobInstance” distinguished from another? The answer is “JobParameters”. A “JobParameters”
object holds a set of parameters used to start a batch job. They can be used for identification or even as
reference data during the run. Thus, JobInstance = Job + identifying JobParameters. Note: not all job
parameters are required to contribute to the identification of a “JobInstance”.
• “JobExecution” is the technical concept of a single attempt to run a job. An execution may end in failure or
success, but the “JobInstance” corresponding to a given execution is not considered complete unless the
execution completes successfully. Consider a “JobInstance” that failed: when it is run again with the same
identifying “JobParameters”, a new “JobExecution” is created. However, there is still only one “JobInstance”
(the same one as before).
• A “Job” defines what a job is and how it is to be executed. A “JobInstance” is a purely organizational object
that groups executions together, primarily to enable correct restart semantics. A “JobExecution”, however, is
the primary storage mechanism for what actually happened during a run and contains properties that must be
controlled and persisted.
• “executionContext” is a property of “JobExecution”. It is the property bag that contains any user data that needs to be
persisted between executions.
• The batch job metadata tables are “batch_job_instance”, “batch_job_execution_params”, and “batch_job_execution”.
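The relationship between a job, its identifying parameters, and its executions can be sketched in plain Java. This is only a model of the rules described above, not Spring Batch code: the class `JobInstanceSketch`, its `Param` type, and the parameter names used in the usage note are hypothetical, and an `IllegalStateException` stands in for the framework's `JobInstanceAlreadyCompleteException`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the concepts above; none of these types are Spring Batch classes.
final class JobInstanceSketch {

    enum Status { COMPLETED, FAILED }

    // One launch parameter; only identifying parameters take part in the instance key.
    static final class Param {
        final String value;
        final boolean identifying;

        Param(String value, boolean identifying) {
            this.value = value;
            this.identifying = identifying;
        }
    }

    // instance key -> status of every execution (attempt) of that instance
    private final Map<String, List<Status>> executionsByInstance = new HashMap<>();

    // JobInstance = job name + identifying parameters (sorted so insertion order does not matter).
    static String instanceKey(String jobName, Map<String, Param> params) {
        Map<String, String> identifying = new TreeMap<>();
        params.forEach((name, p) -> {
            if (p.identifying) {
                identifying.put(name, p.value);
            }
        });
        return jobName + identifying;
    }

    // Records one execution and returns how many executions the instance now has.
    // An instance that already completed successfully cannot be launched again.
    int launch(String jobName, Map<String, Param> params, Status outcome) {
        List<Status> executions = executionsByInstance
                .computeIfAbsent(instanceKey(jobName, params), k -> new ArrayList<>());
        if (executions.contains(Status.COMPLETED)) {
            throw new IllegalStateException("job instance already complete");
        }
        executions.add(outcome);
        return executions.size();
    }

    int instanceCount() {
        return executionsByInstance.size();
    }
}
```

In this model, launching an "endOfDay" job twice with the same identifying "schedule.date" value yields one instance with two executions, even if a non-identifying parameter differs between the launches. Changing the date creates a second instance.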
• Step
• This is a domain object that encapsulates an independent, sequential phase of a batch job. Thus every job is
composed of one or more steps.
• As with “Job”, a “Step” has an individual “StepExecution” that correlates with a unique “JobExecution”.
A “StepExecution” represents a single attempt to execute a “Step”. Like a “JobExecution”, a new “StepExecution”
is created each time a “Step” is run. However, if a “Step” fails to execute because the “Step” before it failed,
no execution is persisted for it: a “StepExecution” is created only when its “Step” is actually started. “Step”
executions are represented by objects of the “StepExecution” class. Each execution contains a reference to its
corresponding step and “JobExecution”, and transaction-related data such as commit and rollback counts and start and end times.
• Additionally, each “StepExecution” has an “executionContext” property, which contains any data a developer needs to
have persisted across batch runs, such as statistics or state information needed to restart.
• An executionContext represents a collection of key/value pairs that are persisted/controlled by the framework in
order to allow developers to store persistent state that is scoped to a “StepExecution” object or a ”JobExecution”
object. The best usage example is to facilitate restart.
• Also, there is at least one executionContext per JobExecution and one for every StepExecution. They are two
different executionContexts. The one scoped to the step is saved at every commit point in the step, whereas the one
scoped to the job is saved in between every step execution.
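How a step-scoped execution context enables restart can be illustrated with a small plain-Java model. In this sketch a `Map` stands in for the context, and the class name `RestartSketch`, the key `items.processed`, and the `failAt` parameter are all assumptions made for illustration; the framework's own persistence is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model; the Map stands in for a step-scoped execution context.
final class RestartSketch {

    static final String POSITION_KEY = "items.processed"; // hypothetical key name

    // Processes items in chunks and saves the position into the context at each commit point.
    // failAt simulates a crash when that item index is reached (use -1 for no failure).
    static void runStep(List<String> items, int chunkSize, int failAt,
                        Map<String, Object> context, List<String> output) {
        int position = (Integer) context.getOrDefault(POSITION_KEY, 0); // resume point
        while (position < items.size()) {
            int end = Math.min(position + chunkSize, items.size());
            List<String> chunk = new ArrayList<>();
            for (int i = position; i < end; i++) {
                if (i == failAt) {
                    // The uncommitted chunk is discarded; the context keeps the last commit.
                    throw new IllegalStateException("simulated failure at item " + i);
                }
                chunk.add(items.get(i).toUpperCase());
            }
            output.addAll(chunk);                // "commit" the chunk
            position = end;
            context.put(POSITION_KEY, position); // saved at every commit point
        }
    }
}
```

Running the step over five items with a chunk size of 2 and a failure at index 3 commits only the first chunk. A second run with the same context map resumes at index 2 instead of reprocessing the committed items.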
• Reference: https://docs.spring.io/spring-batch/docs/current/reference/html/domain.html#job
[Diagram: a Job has many JobInstances; a JobInstance has many JobExecutions; a JobExecution has many StepExecutions. A Job also has many Steps, and a Step has many StepExecutions.]
• JobRepository
• “JobRepository” is the persistence mechanism for all the batch stereotypes.
• It provides CRUD operations for JobLauncher, Job and Step implementations.
• When a “Job” is first launched, a “JobExecution” is obtained from “JobRepository” and during the
course of the execution, “StepExecution” and “JobExecution” implementations are persisted by
passing them to “JobRepository”.
• When using Java configuration, @EnableBatchProcessing annotation provides a “JobRepository”
as one of the components automatically configured.
• JobLauncher
• “JobLauncher” represents a simple interface for launching a “Job” with a given set of
“JobParameters”.
public interface JobLauncher {
    public JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException,
                   JobInstanceAlreadyCompleteException, JobParametersInvalidException;
}
• It is expected that a valid “JobExecution” is obtained from “JobRepository” to execute the “Job”.
• ItemReader
• This is an abstraction that represents the retrieval of input for a “Step”, one item at a
time.
• When “ItemReader” has exhausted the items it can provide, it indicates this by
returning null.
• ItemWriter
• “ItemWriter” is an abstraction that represents the output of a “Step”, one batch or
chunk of items at a time.
• ItemProcessor
• “ItemProcessor” is an abstraction that represents the business processing of an item.
• If while processing the item, it is determined that the item is not valid, returning null
indicates that the item should not be written out.
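The two null conventions above (a reader returns null when its input is exhausted, and a processor returns null to filter an item out) can be shown with a minimal plain-Java sketch. The interfaces `Reader`, `Processor`, and `Writer` below are simplified, hypothetical stand-ins that only mirror the roles described here; they are not the Spring Batch interfaces.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, hypothetical interfaces that mirror the roles described above;
// they are not the Spring Batch interfaces themselves.
final class ItemContractsSketch {

    interface Reader<T> { T read(); }                 // returns null when input is exhausted
    interface Processor<I, O> { O process(I item); }  // returns null to filter an item out
    interface Writer<T> { void write(List<T> items); }

    // Wraps a list as a reader that signals the end of input with null.
    static <T> Reader<T> listReader(List<T> source) {
        Iterator<T> it = source.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Reads until null, drops items the processor rejects, then writes the rest in one call.
    static <I, O> void run(Reader<I> reader, Processor<I, O> processor, Writer<O> writer) {
        List<O> accepted = new ArrayList<>();
        for (I item = reader.read(); item != null; item = reader.read()) {
            O result = processor.process(item);
            if (result != null) {
                accepted.add(result);
            }
        }
        writer.write(accepted);
    }
}
```

For example, with a reader over 3, -1, 4 and a processor that returns null for negative numbers and multiplies the rest by 10, the writer receives 30 and 40.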
Maven Dependency Configuration
• Add the following in the pom.xml file
<!-- https://mvnrepository.com/artifact/org.springframework.batch/spring-batch-core -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>4.2.0.RELEASE</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework.batch/spring-batch-infrastructure -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-infrastructure</artifactId>
    <version>4.2.0.RELEASE</version>
</dependency>
Spring Batch Sample Configuration
Reference: https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#javaConfig
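import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Import;

// Person and PersistExampleConfig are application classes that are not shown here.
@Configuration
@EnableBatchProcessing
@Import(PersistExampleConfig.class)
public class ExampleBatchConfig {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    // The job runs step1 and then step2.
    @Bean
    public Job job(@Qualifier("step1") Step step1, @Qualifier("step2") Step step2) {
        return jobs.get("myJob").start(step1).next(step2).build();
    }

    // Chunk-oriented step with a commit interval of 10 items.
    @Bean
    protected Step step1(ItemReader<Person> reader, ItemProcessor<Person, Person> processor, ItemWriter<Person> writer) {
        return steps.get("step1")
                .<Person, Person> chunk(10)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    // Tasklet step that performs a single task.
    @Bean
    protected Step step2(Tasklet tasklet) {
        return steps.get("step2")
                .tasklet(tasklet)
                .build();
    }
}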
Notes:
1. “Tasklet” is a simple interface with a single method, “execute”, which “TaskletStep” calls repeatedly until it
either returns “RepeatStatus.FINISHED” or throws an exception to signal a failure. A “Tasklet” is supposed
to perform a single task within a step. To create a “TaskletStep”, the bean passed to the tasklet method of the
step builder (as indicated above) must implement the “Tasklet” interface.
2. Spring Batch also supports chunk-oriented processing. Instead of processing all the data at once, it
processes the data in chunks. Items are read one at a time by “ItemReader”, passed to “ItemProcessor”, and aggregated.
Once the number of items read/processed equals the commit interval, the entire chunk is written out by “ItemWriter”
and the transaction is committed.
Reference: https://docs.spring.io/spring-batch/docs/current/reference/html/step.html#chunkOrientedProcessing
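The chunk-oriented flow in note 2 can be sketched as a simple loop. This is a plain-Java model under stated assumptions, not Spring Batch's implementation: `ChunkLoopSketch` is a hypothetical name, an `Iterator` plays the reader, a `Function` plays the processor, a `Consumer` of lists plays the writer, and transactions are represented only by counting write calls.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java model of the chunk loop; not Spring Batch's implementation.
final class ChunkLoopSketch {

    // Reads and processes items one at a time, and hands them to the writer
    // whenever commitInterval items have accumulated (plus a final partial chunk).
    // Returns the number of write calls, which stands in for the number of commits.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int commitInterval) {
        int commits = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) {
                writer.accept(chunk);
                commits++;
                chunk = new ArrayList<>(); // start a fresh chunk after each commit
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(chunk); // leftover items form the last, smaller chunk
            commits++;
        }
        return commits;
    }
}
```

With 25 input items and a commit interval of 10, as in `chunk(10)` above, the writer in this model is called three times, with chunks of 10, 10, and 5 items.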
Intercepting Job Execution
• Reference: https://docs.spring.io/spring-batch/docs/current/reference/html/job.html#interceptingJobExecution
• During the course of a job execution, it may be useful to be notified of
various events in its lifecycle. This can be achieved by registering a
“JobExecutionListener” on the job, through the listener element in XML
configuration or the listener method of the job builder in Java configuration.
e.g.
@Bean
public Job footballJob() {
    return this.jobBuilderFactory.get("footballJob")
            .listener(sampleListener())
            ...
            .build();
}
Intercepting Job Execution
• “JobExecutionListener” is an interface in Spring Batch (shown below):
public interface JobExecutionListener {
    void beforeJob(JobExecution jobExecution);
    void afterJob(JobExecution jobExecution);
}
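The order in which the two callbacks fire can be modeled in plain Java. The sketch below assumes that the "after" callback runs whether the job succeeds or fails, so a listener has to inspect the outcome itself. `ListenerSketch`, `RunListener`, and `RecordingListener` are hypothetical names, and a boolean replaces the status carried by a real “JobExecution”.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of listener callbacks around a job run; hypothetical types,
// not Spring Batch classes.
final class ListenerSketch {

    interface RunListener {
        void beforeJob(String jobName);
        void afterJob(String jobName, boolean succeeded);
    }

    // Invokes beforeJob, runs the work, then always invokes afterJob with the outcome.
    static boolean runWithListener(String jobName, Runnable work, RunListener listener) {
        listener.beforeJob(jobName);
        boolean succeeded = false;
        try {
            work.run();
            succeeded = true;
        } catch (RuntimeException e) {
            // A failed run is reported to the listener rather than propagated.
        } finally {
            listener.afterJob(jobName, succeeded);
        }
        return succeeded;
    }

    // Example listener that records the callbacks it receives, in order.
    static final class RecordingListener implements RunListener {
        final List<String> events = new ArrayList<>();

        @Override
        public void beforeJob(String jobName) {
            events.add("before " + jobName);
        }

        @Override
        public void afterJob(String jobName, boolean succeeded) {
            events.add("after " + jobName + (succeeded ? " ok" : " failed"));
        }
    }
}
```

A listener like `RecordingListener` sees a "before" and an "after" event for every run, including runs whose work throws an exception.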
Conclusion
• This is a reference for developers for understanding/implementing
Spring Batch in a software application.
• There are other frameworks too for batch processing like “Easy
Batch”.
• This reference will help developers to build batch processing
applications faster.
Thank you
