This document provides an overview and introduction to SpringBatch, presented by Slobodan Lohja in 2021. The presentation covers why companies should open up their data and what Spring and SpringBatch are, demonstrates building a sample project that imports data from a CSV file into a MongoDB database using SpringBatch, and ends with closing thoughts and Q/A. The sample project walkthrough covers the configurations, components, and classes needed to run a batch job that reads from a CSV file and writes to MongoDB.
3. • Certified IBM Application Developer for Collaboration (Lotus Domino).
• 20[18,19,20,21] Collabsphere Speaker
• 2019 IBM Think Speaker
• 20[19,20,21] Senior Developer and technical team lead to
modernize mainframe apps to SpringBoot Microservices for
insurance claim processing
• 2021 Developer Specialist; Full stack. REACT/SpringBoot
modernization of JSP monolith Websphere application.
Introduction to SpringBatch
Slobodan Lohja
https://www.linkedin.com/in/slobodanlohja/
4. Introduction to SpringBatch
• Why open up your data
• What is Spring and SpringBatch
• Demo project - Let’s build something
• Closing thoughts
• Q/A
Agenda
5. Introduction to SpringBatch
• No access to data, no business intelligence
• No access to data, no artificial intelligence
• No access to data, no machine learning
• No access to data, the data science department will not like you
• No access to data, you cannot be a digital company
Integrate with devices
Bring the relevant data where the user is
Integrate with other cloud services / applications
Why Open up your data “securely”
8. Introduction to SpringBatch
• It is a Java project built on top of Spring Core Framework
• It is a standalone application that can run anywhere there is a
Java Virtual Machine (JVM).
• It moves data between data sources in a consistent manner
• It has its own database schema to track its work (embedded H2 in this demo)
• It allows manipulating (“processing”) data between source and
destination
https://spring.io/projects/spring-batch
So what is the SpringBatch project?
9. Introduction to SpringBatch
• We put our code in specific places and the framework calls our
code.
How does SpringBatch work?
Flow: App Starts → Application Context Created → Configurations are run → SpringBatch Starts a Job → Step 1 (optional Processor) → Run next step? → Step 2 → Run next step? → Step N → App Ends
• App Starts: the JVM runs the main() method, as in a typical Java application.
• Application Context Created: similar to JSF/XPages managed beans, Spring Core scans all Java classes and loads them into a globally scoped memory space.
• Configurations are run: JDBC, message queue, features, schedulers, HTTP server. Which ones run depends on the config.
• SpringBatch Starts a Job: a job instance is created and, based on how the job is assembled, all the steps are run, optionally with a processor to change data. A processor can be shared / reused by other jobs.
• Every step of the way, the framework logs activity and places breadcrumbs in the SQL db marking where it is in the batch process.
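The flow on this slide can be sketched in plain Java. This is an illustration only, not SpringBatch code: the class `JobFlowSketch`, the method `runJob`, and the status strings are hypothetical names, and the `log` list stands in for the framework's SQL tables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical plain-Java sketch of "a job runs its steps in order".
class JobFlowSketch {

    /**
     * Runs the steps in insertion order. Each step returns true on success.
     * The first failing step stops the job (the "Run next step?" decision).
     * Every state change is appended to 'log', our stand-in for the SQL db.
     */
    static String runJob(String jobName, Map<String, Supplier<Boolean>> steps, List<String> log) {
        log.add(jobName + ":STARTED");
        for (Map.Entry<String, Supplier<Boolean>> step : steps.entrySet()) {
            boolean ok = step.getValue().get();
            log.add(step.getKey() + (ok ? ":COMPLETED" : ":FAILED"));
            if (!ok) {
                log.add(jobName + ":FAILED");
                return "FAILED";
            }
        }
        log.add(jobName + ":COMPLETED");
        return "COMPLETED";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the steps in the order they were added.
        Map<String, Supplier<Boolean>> steps = new LinkedHashMap<>();
        steps.put("step1", () -> true);
        steps.put("step2", () -> true);
        List<String> log = new ArrayList<>();
        System.out.println(runJob("importJob", steps, log));
        log.forEach(System.out::println);
    }
}
```

Because every transition is recorded, a failed run leaves a trail showing which step to resume from, which is the idea behind the breadcrumbs described above.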
10. Introduction to SpringBatch
https://docs.spring.io/spring-batch/docs/current/reference/html/index-single.html#metaDataSchemaOverview
• Restartability: restart a job and continue where it left off.
• Intercepting Job Execution: lifecycle events beforeJob, afterJob.
• JobParameters: send parameters, with parameter validation.
• Chunk Processing: processes data sets in chunks, [a]synchronously.
• Step Flows; skips, retry: chaining steps into flows; conditional forking steps.
• Rollback
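Chunk processing is the feature the demo relies on most, so here is a minimal plain-Java sketch of the idea: read one item at a time, process it, and write a whole chunk at once. It is not the SpringBatch API. `ChunkSketch` and `runChunks` are hypothetical names, and the sketch leaves out what the framework adds on top (transactions, skips, retries, restart data).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical plain-Java sketch of read -> process -> write in chunks.
class ChunkSketch {

    /**
     * Reads items one at a time, processes each, and hands them to the
     * writer in lists of at most chunkSize. Returns the number of chunks written.
     */
    static <I, O> int runChunks(Iterator<I> reader, Function<I, O> processor,
                                Consumer<List<O>> writer, int chunkSize) {
        int chunks = 0;
        List<O> buffer = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            buffer.add(processor.apply(reader.next()));
            if (buffer.size() == chunkSize) {
                writer.accept(new ArrayList<>(buffer)); // one write per full chunk
                buffer.clear();
                chunks++;
            }
        }
        if (!buffer.isEmpty()) { // final, partially filled chunk
            writer.accept(new ArrayList<>(buffer));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        Function<String, String> upper = s -> s.toUpperCase();
        Consumer<List<String>> writer = written::add;
        int chunks = runChunks(List.of("a", "b", "c", "d", "e").iterator(), upper, writer, 2);
        // prints "3 chunks: [[A, B], [C, D], [E]]"
        System.out.println(chunks + " chunks: " + written);
    }
}
```

Writing per chunk rather than per item keeps the number of round trips to the destination low, and it gives a natural unit to roll back if a write fails.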
12. Introduction to SpringBatch
• We will import a Lego Parts list ~37K records from CSV
• We will use last year’s session model objects
• We will use a NoSQL JSON Document repository (storage)
• The only less experimental option is using Domino Data Access Services.
Let’s put together a SpringBatch app that imports from a CSV into a Database
https://youtu.be/JY_KOfHJk8w
See Collabsphere 2020 Presentation…
13. Introduction to SpringBatch
• Install MongoDB
$> brew tap mongodb/brew
$> brew install mongodb/brew/mongodb-community@5.0
• Uninstall MongoDB
$> brew uninstall mongodb-community@5.0
• Starting and stopping MongoDB
$> brew services start mongodb-community@5.0
$> brew services stop mongodb-community@5.0
* Or, in the MongoDB shell client: ‘use admin’, then ‘db.shutdownServer()’
$> brew services list
$> mongotop //one of the database tools (command line: backup/import/export, monitoring; similar to the Admin client)
$> mongosh //test queries and db operations
Let’s use a NoSQL JSON document store
No salesmen
No websites
No registration
It works out of the box
14. Introduction to SpringBatch
• Start with Spring Initializr
Web: https://start.spring.io
IDE: IntelliJ or Eclipse
• Fill out the form, select the
dependencies and finally
‘Generate’ to download a
starter project.
• No account or login required.
Recipe: Start a project 1 of 4
15. Introduction to SpringBatch
• Use your favorite IDE to open the project.
• Add a CSV file to the resources folder (lego parts 43,695 rows)
• Add H2 and Mongo connection info to application.properties
• Add packages config, controller, launcher, model
• Add Model objects or import them as a dependency
Recipe: preparation 2 of 4
16. Introduction to SpringBatch
• Add LegoBatchConfig class
• Annotate it with @Configuration and @EnableBatchProcessing
• Add an ItemReader<YourPojo>
• Add an ItemWriter; ‘new MongoItemWriter<>()’ is built in.
• Add one or more steps
• Add one or more jobs
Recipe: Job Configuration 3 of 4
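What the ItemReader does for each CSV row can be sketched in plain Java. In SpringBatch this work is typically delegated to a built-in reader such as FlatFileItemReader; the code below only illustrates the line-to-object mapping. `CsvMappingSketch`, `LegoPart`, the `name` field, the two-column layout, and the sample rows are all assumptions for illustration. Only the `partNum` field name comes from the demo's queries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plain-Java sketch of mapping CSV lines to model objects.
class CsvMappingSketch {

    /** Hypothetical model object; the demo's real model has its own fields. */
    static final class LegoPart {
        final String partNum;
        final String name;

        LegoPart(String partNum, String name) {
            this.partNum = partNum;
            this.name = name;
        }
    }

    /** Maps one CSV line to a LegoPart. Assumes no quoted commas in the data. */
    static LegoPart mapLine(String line) {
        String[] cols = line.split(",", -1);
        if (cols.length < 2) {
            throw new IllegalArgumentException("Expected at least 2 columns: " + line);
        }
        return new LegoPart(cols[0].trim(), cols[1].trim());
    }

    /** Skips the header row, then maps every non-blank line. */
    static List<LegoPart> readAll(List<String> lines) {
        List<LegoPart> parts = new ArrayList<>();
        for (String line : lines.subList(Math.min(1, lines.size()), lines.size())) {
            if (!line.isBlank()) {
                parts.add(mapLine(line));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the header names are an assumption.
        List<String> csv = List.of("part_num,name", "10178pr0004,Example part");
        for (LegoPart p : readAll(csv)) {
            System.out.println(p.partNum + " -> " + p.name);
        }
    }
}
```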
17. Introduction to SpringBatch
• Add a JobLauncher class and implement CommandLineRunner
• Call jobLauncher.run; pass in a job and JobParameters carrying a unique job id.
• Run the application
If the port is already in use by another application:
$> lsof -i :8080 | grep LISTEN
application.properties > server.port={newport}
Open H2 Web Console to see Job status
http://localhost:8080/h2/
select ji.JOB_INSTANCE_ID, ji.JOB_NAME,
je.CREATE_TIME, je.START_TIME, je.END_TIME, je.STATUS, je.END_TIME - je.START_TIME, je.EXIT_CODE,
se.STEP_NAME, se.READ_COUNT, se.WRITE_COUNT, se.END_TIME - se.START_TIME
from BATCH_JOB_INSTANCE ji
join BATCH_JOB_EXECUTION je on je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
join BATCH_STEP_EXECUTION se on se.JOB_EXECUTION_ID = je.JOB_EXECUTION_ID;
MongoDB Shell Commands
show databases; //view all databases. Similar to showing all NSF files.
use bricks; //select a database to work with. Like opening a Notes database.
db.parts.drop(); //remove a collection. Like deleting a Notes View.
Recipe: Create a Job Launcher 4 of 4
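Why the job id must be unique: SpringBatch identifies a job instance by the job name plus its identifying parameters, and it refuses to re-run an instance that has already completed. The plain-Java sketch below mimics that rule with a set of keys. `JobIdentitySketch`, `instanceKey`, and `run` are hypothetical names, and the real framework keeps this state in its SQL tables rather than in memory.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical plain-Java sketch of job instance identity.
class JobIdentitySketch {

    /** Keys of job instances that have already run in this sketch. */
    private final Set<String> completed = new HashSet<>();

    /** Identity = job name plus its parameters, in a stable (sorted) key order. */
    static String instanceKey(String jobName, Map<String, String> params) {
        return jobName + new TreeMap<>(params);
    }

    /** Returns true if the job ran, false if that instance had already completed. */
    boolean run(String jobName, Map<String, String> params) {
        return completed.add(instanceKey(jobName, params));
    }

    public static void main(String[] args) {
        JobIdentitySketch launcher = new JobIdentitySketch();
        Map<String, String> fixed = Map.of("file", "parts.csv");
        System.out.println(launcher.run("importJob", fixed)); // prints "true"
        System.out.println(launcher.run("importJob", fixed)); // prints "false"

        // A fresh id (here, the current time) makes each launch a new instance.
        Map<String, String> unique = Map.of(
                "file", "parts.csv",
                "jobId", String.valueOf(System.currentTimeMillis()));
        System.out.println(launcher.run("importJob", unique)); // prints "true"
    }
}
```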
18. Introduction to SpringBatch
• Similar to a Domino Agent Manager; bonus, can be debugged easily.
• Add an annotation @EnableScheduling to LegoJobLauncher
• Add an annotation @Scheduled(cron = "0 */1 * * * ?") to run the
job launcher every minute.
MongoDB Shell Commands
db.parts.count();
db.parts.find(); //list documents in the collection; type ‘it’ to page through more.
db.parts.find({partNum: "10178pr0004"}); //optional .limit(5)
Recipe: Add Scheduler - Spring TaskScheduler Project
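The cron expression on this slide reads, field by field, as second 0 of every minute of every hour, day, and month. As a rough plain-Java illustration of that schedule (not how Spring evaluates cron expressions internally), the next firing time is simply the start of the next minute. `EveryMinuteSketch` and `nextRun` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical plain-Java sketch of a top-of-every-minute schedule.
class EveryMinuteSketch {

    /** Next firing time strictly after 'now': the start of the following minute. */
    static LocalDateTime nextRun(LocalDateTime now) {
        return now.truncatedTo(ChronoUnit.MINUTES).plusMinutes(1);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2021, 1, 1, 9, 15, 30); // arbitrary sample time
        System.out.println(nextRun(now)); // prints "2021-01-01T09:16"
    }
}
```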
19. Introduction to SpringBatch
• Add a JobController class in the controller package.
• Annotate with @RestController, @RequestMapping("/api/job")
• Autowire the LegoJobLauncher
• Add a method for POST to call legoJobLauncher.run()
Recipe: Add REST Controller - Spring Web Project
21. Introduction to SpringBatch
Technology Review : Challenges with Domino as NoSQL Storage
Domino
• Standing up a server is difficult on a developer’s workstation.
• Developers have to use a MS Windows workstation.
• Domino is not designed for a NoSQL storage service.
• Domino is not integrated into other ecosystems like Java and .NET Core libraries.
• NSF is a key value pair flat document store.
• Once installed, Domino is not configured as a NoSQL storage service (DAS is turned off).
Domino Data Access (Production ready REST API)
• DAS REST API does not handle complete JSON schemas, only array of key value pairs.
• DAS needs a middleware, like SpringBoot to validate and insert data into documents and response documents.
• Hidden features: by default there is a limit on retrieved documents, so we do not get all the data requested (Notes.ini setting).
• Uses Vectors instead of Java 8 streams.
Beta Projects to watch
• KEEP: a middleware, RESTful way to access the Domino backend. The Domino Docker image on a MacBook has install
challenges because there is no Domino Administrator to continue server setup; maybe next year.
22. Introduction to SpringBatch
Technology Review : Designed and built for NoSQL Storage
Source: https://en.wikipedia.org/wiki/NoSQL
Note: CouchDB was inspired by Lotus Notes… Damien Katz, ex-Iris engineer.
23. Intro to Microservices with Domino Use Case
Have fun exploring SpringBatch
Twitter: @XPagesBeast
LinkedIn: https://www.linkedin.com/in/slobodanlohja/
Samples: https://github.com/spring-projects/spring-batch/tree/main/spring-batch-samples
Thank You !