1. SpringBatch overview
1.1 General
Spring Batch is a lightweight, comprehensive batch framework designed to help enterprise systems build robust and efficient batch applications. It is a subproject of Spring, written in Java and built on the Spring Framework, so developers and organizations that already use Spring can easily access and reuse enterprise services.
Spring Batch provides reusable functions that are essential for processing large volumes of data, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced services and features that enable extremely high-volume, high-performance batch jobs through optimization and partitioning techniques. Spring Batch can be used in both simple use cases (such as reading a file into a database or running a stored procedure) and complex, high-volume use cases (such as moving large amounts of data between databases and transforming it along the way). High-volume batch jobs can use the framework to process large amounts of information in a highly scalable manner.
However, Spring Batch is not a scheduling framework. It focuses only on the processing side of a task, such as logging and monitoring, transactions, and concurrency. To run jobs on a schedule, combine it with a scheduler such as Quartz, Tivoli, or Control-M.
1.2 The framework mainly provides the following functions:
- Transaction management
- Chunk-based processing
- Declarative I/O
- Start/Stop/Restart
- Retry/Skip
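To make the Skip function in the list above more concrete, here is a minimal sketch in plain Java. It does not use the Spring Batch API: the class SkipSketch, its Result type, and the run method are hypothetical names chosen for illustration. The idea it models is that a bad item is set aside instead of failing the whole job, until a configured skip limit is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of skip handling. Hypothetical names; NOT the Spring Batch API. */
class SkipSketch {

    /** Outcome of a run: the items that were written and the items that were skipped. */
    static final class Result {
        final List<Integer> written = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    /**
     * Parses each raw item as an integer. An item that cannot be parsed is skipped,
     * unless skipLimit items have already been skipped, in which case the run aborts.
     */
    static Result run(List<String> rawItems, int skipLimit) {
        Result result = new Result();
        for (String raw : rawItems) {
            try {
                result.written.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                if (result.skipped.size() >= skipLimit) {
                    throw new IllegalStateException("skip limit " + skipLimit + " exceeded at: " + raw, e);
                }
                result.skipped.add(raw);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result result = run(Arrays.asList("1", "x", "3"), 1);
        // prints: [1, 3] skipped=[x]
        System.out.println(result.written + " skipped=" + result.skipped);
    }
}
```

With a skip limit of 1, one bad record is tolerated; a second bad record would abort the run.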
1.3 The framework has four roles:
- JobLauncher is the task launcher. It starts tasks and can be regarded as the entry point of the program.
- Job represents a specific task, which the task launcher starts. A task can consist of one Step or several (imagine how many steps it takes to put the elephant in the refrigerator).
- Step represents one specific step of a task and holds what the task actually executes. Running a Step typically consists of reading data (ItemReader), processing data (ItemProcessor), and writing data (ItemWriter).
- JobRepository is where run data is stored. It can be seen as the interface to a database, and it records task status and other information while a task executes.
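The way these roles cooperate can be sketched in plain Java. This is a simplified model under stated assumptions, not the Spring Batch API: the nested types reuse the role names only for readability, and the step helper and the status strings stored in the map are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the four roles. Names mirror Spring Batch, but this is NOT its API. */
class RolesSketch {

    /** Step: one unit of work inside a job. */
    interface Step {
        String name();
        void execute() throws Exception;
    }

    /** Job: a named, ordered list of steps. */
    static final class Job {
        final String name;
        final List<Step> steps;

        Job(String name, List<Step> steps) {
            this.name = name;
            this.steps = steps;
        }
    }

    /** JobRepository: records the status of each step as the job runs. */
    static final class JobRepository {
        final Map<String, String> statusByStep = new LinkedHashMap<>();

        void save(String stepName, String status) {
            statusByStep.put(stepName, status);
        }
    }

    /** JobLauncher: the entry point; runs the steps in order and stops at the first failure. */
    static final class JobLauncher {
        private final JobRepository repository;

        JobLauncher(JobRepository repository) {
            this.repository = repository;
        }

        String run(Job job) {
            for (Step step : job.steps) {
                try {
                    step.execute();
                    repository.save(step.name(), "COMPLETED");
                } catch (Exception e) {
                    repository.save(step.name(), "FAILED");
                    return "FAILED";
                }
            }
            return "COMPLETED";
        }
    }

    /** Hypothetical helper that wraps a Runnable as a named step. */
    static Step step(final String name, final Runnable body) {
        return new Step() {
            @Override public String name() { return name; }
            @Override public void execute() { body.run(); }
        };
    }

    public static void main(String[] args) {
        JobRepository repository = new JobRepository();
        Job job = new Job("helloWorldJob", Arrays.asList(
                step("step1", () -> System.out.println("Hello World"))));
        String status = new JobLauncher(repository).run(job);
        System.out.println(status + " " + repository.statusByStep);
    }
}
```

The launcher never does any work itself: it only drives the steps and reports each outcome to the repository, which is the division of labor the list above describes.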
2. Set up a SpringBatch project
2.1 Build the project with Spring Initializr
2.2 Unzip the project and import it into IDEA
2.3 Add a database driver dependency before starting the project
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>
3. SpringBatch starter
3.1 Create a configuration package and write a class JobConfiguration
- Add the @Configuration and @EnableBatchProcessing annotations to the class
3.2 Inject JobBuilderFactory and StepBuilderFactory
3.3 Create the task object (the code is as follows)
@Bean
public Job helloWorldJob() {
    return jobBuilderFactory.get("helloWorldJob")
            .start(step1())
            .build();
}

// A job is made up of steps; the job runs each step it contains
@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .tasklet(new Tasklet() {
                // The work this step performs
                @Override
                public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                    System.out.println("Hello World");
                    // execute must return a RepeatStatus; FINISHED means the tasklet is done
                    return RepeatStatus.FINISHED;
                }
            }).build();
}
4. Replace the H2 database with MySQL
- Add the dependencies
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
4.1 Configure the yml file
spring:
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://localhost:3306/batch?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf8&autoReconnect=true&allowMultiQueries=true
    username: root
    password: root
    schema: classpath:/org/springframework/batch/core/schema-mysql.sql
  batch:
    initialize-schema: always
- Start the program again, then refresh the database
- The tables that Spring Batch uses to persist job information now appear
5. Core API
The figure above introduces several concepts related to a Job:
- Job: encapsulates the processing entity and defines the process logic.
- JobInstance: a running instance of a Job. Different instances have different parameters, so once a Job is defined it can be run many times with different parameters.
- JobParameters: the parameters associated with a JobInstance.
- JobExecution: represents one actual execution of a Job, which may succeed or fail.
Therefore, what developers need to do is define Jobs.
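The relationship between these concepts can be sketched in plain Java: a JobInstance is identified by the job name together with its JobParameters, and every run of that instance is a separate JobExecution. The class JobInstanceSketch and its methods are hypothetical names, not Spring Batch types. The sketch also models one behavior of the real framework: an instance that has already completed successfully is not run again, which here is represented by an IllegalStateException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of JobInstance and JobExecution. Hypothetical names; NOT the Spring Batch API. */
class JobInstanceSketch {

    /** Key: job name plus sorted parameters (one JobInstance). Value: exit status of each JobExecution. */
    private final Map<String, List<String>> executionsByInstance = new HashMap<>();

    /**
     * Records one execution of the instance identified by jobName and parameters.
     * Returns the number of executions that instance now has.
     */
    int launch(String jobName, Map<String, String> parameters, String exitStatus) {
        // Sorting the parameters makes the key independent of insertion order.
        String instanceKey = jobName + "|" + new TreeMap<>(parameters);
        List<String> executions =
                executionsByInstance.computeIfAbsent(instanceKey, key -> new ArrayList<>());
        if (executions.contains("COMPLETED")) {
            throw new IllegalStateException("instance already completed: " + instanceKey);
        }
        executions.add(exitStatus);
        return executions.size();
    }

    /** Number of distinct JobInstances seen so far. */
    int instanceCount() {
        return executionsByInstance.size();
    }

    public static void main(String[] args) {
        JobInstanceSketch repository = new JobInstanceSketch();
        Map<String, String> monday = Collections.singletonMap("date", "2024-01-01");
        Map<String, String> tuesday = Collections.singletonMap("date", "2024-01-02");

        repository.launch("importJob", monday, "FAILED");     // first execution of the Monday instance
        repository.launch("importJob", monday, "COMPLETED");  // restart: same instance, second execution
        repository.launch("importJob", tuesday, "COMPLETED"); // new parameters: a new instance
        System.out.println(repository.instanceCount() + " instances");
    }
}
```

The parameter name date and the job name importJob are illustrative only.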
The following figure introduces several concepts related to a Step:
A Step encapsulates one phase of a Job. A Job can contain one or more Steps, and the Job is complete once its Steps have executed in the order its flow logic defines.
By defining steps to assemble jobs, you can more flexibly implement complex business logic.
**Input, processing, output**
Therefore, the key to defining a Job is to define one or more Steps and then assemble them. There are many ways to define a Step, but a common model is input, processing, output: an ItemReader, an ItemProcessor, and an ItemWriter. For example, the ItemReader reads data from a file, the ItemProcessor applies business logic and converts the data, and the ItemWriter writes the result to the database.
Spring Batch provides many out-of-the-box readers and writers, which are very convenient.
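The input, processing, output model combined with chunk-based processing can be sketched in plain Java as follows. This is a simplified model, not the Spring Batch API: an Iterator stands in for the ItemReader, a Function for the ItemProcessor, and a Consumer of a list for the ItemWriter. The class name ChunkSketch and the chunkSize parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

/** Plain-Java model of a chunk-oriented step. Hypothetical names; NOT the Spring Batch API. */
class ChunkSketch {

    /**
     * Reads up to chunkSize items, processes each of them, then hands the whole
     * chunk to the writer in one call. Repeats until the reader is exhausted.
     * Returns the number of chunks written.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // Read phase: collect one chunk of input items.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: transform every item in the chunk.
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                outputs.add(processor.apply(item));
            }
            // Write phase: one write per chunk, not one per item.
            writer.accept(outputs);
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        Iterator<String> reader = Arrays.asList("a", "b", "c", "d", "e").iterator();
        Function<String, String> processor = String::toUpperCase;
        Consumer<List<String>> writer = chunk -> System.out.println("write " + chunk);
        // prints: write [A, B] / write [C, D] / write [E]
        run(reader, processor, writer, 2);
    }
}
```

Writing once per chunk is what lets a framework wrap each chunk in a single transaction, so a failure rolls back only the current chunk rather than the whole run.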
6. Job creation and use (the code is as follows)
package com.bosc.springbatch.config;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class JobDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // Create the task object
    @Bean
    public Job jobDemoJob() {
        return jobBuilderFactory.get("jobDemoJob")
                // Simple sequential form: next() names the following step,
                // so step1 runs first, then step2, then step3
                //.start(step1())
                //.next(step2())
                //.next(step3())
                .start(step1())
                // on("COMPLETED") sets the condition: step1 ended with exit status COMPLETED
                .on("COMPLETED")
                // to(step2()) names the step to run when the condition holds
                .to(step2())
                // When step2 completes successfully, continue to step3
                .from(step2()).on("COMPLETED").to(step3())
                // fail() marks the job as failed after step2, so step3 is not executed
                /*.from(step2()).on("COMPLETED").fail()*/
                // stopAndRestart() stops the job and restarts from the given step; generally used for testing
                /*.from(step2()).on("COMPLETED").stopAndRestart(step2())*/
                // end() closes the flow definition
                .from(step3()).end()
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work done by step1
                        System.out.println("step1");
                        // FINISHED lets the flow move on to the next step
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }

    @Bean
    public Step step2() {
        return stepBuilderFactory.get("step2")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work done by step2
                        System.out.println("step2");
                        // FINISHED lets the flow move on to the next step
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }

    @Bean
    public Step step3() {
        return stepBuilderFactory.get("step3")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work done by step3
                        System.out.println("step3");
                        // FINISHED marks the last step as done
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }
}
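The conditional flow defined with on(...) and to(...) above can be read as a transition table: for each step and exit status, which step runs next. The following plain-Java sketch models that idea. It is not the Spring Batch API: the class FlowSketch and its step, on, and run methods are hypothetical, and as a simplification a status with no matching transition simply ends the run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Plain-Java model of a conditional step flow. Hypothetical names; NOT the Spring Batch API. */
class FlowSketch {

    /** Step name mapped to its body; the body returns the step's exit status. */
    private final Map<String, Supplier<String>> steps = new HashMap<>();

    /** Transition table: "stepName:EXIT_STATUS" mapped to the name of the next step. */
    private final Map<String, String> transitions = new HashMap<>();

    FlowSketch step(String name, Supplier<String> body) {
        steps.put(name, body);
        return this;
    }

    /** Declares: when step 'from' ends with 'exitStatus', continue with step 'to'. */
    FlowSketch on(String from, String exitStatus, String to) {
        transitions.put(from + ":" + exitStatus, to);
        return this;
    }

    /** Runs from the start step, following transitions; returns the names of the steps executed. */
    List<String> run(String start) {
        List<String> executed = new ArrayList<>();
        String current = start;
        while (current != null) {
            executed.add(current);
            String exitStatus = steps.get(current).get();
            // Simplification: no matching transition means the flow ends here.
            current = transitions.get(current + ":" + exitStatus);
        }
        return executed;
    }

    public static void main(String[] args) {
        FlowSketch flow = new FlowSketch()
                .step("step1", () -> "COMPLETED")
                .step("step2", () -> "COMPLETED")
                .step("step3", () -> "COMPLETED")
                .on("step1", "COMPLETED", "step2")
                .on("step2", "COMPLETED", "step3");
        // prints: [step1, step2, step3]
        System.out.println(flow.run("step1"));
    }
}
```

If step2 returned a status other than COMPLETED, the table would have no entry for it and step3 would never run, which mirrors why the job above only reaches step3 after step2 completes.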