Spring Batch hands-on tutorial

1. Spring Batch overview

1.1 General

Spring Batch is a lightweight, comprehensive batch framework designed to help enterprise systems build robust and efficient batch applications. It is a subproject of Spring, written in Java and built on the Spring Framework, so developers and organizations that already use Spring can adopt it easily and reuse the enterprise services they know.

Spring Batch provides reusable functions that are essential for processing large volumes of data, including logging and tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced services and features that enable extremely high-volume, high-performance batch jobs through optimization and partitioning techniques. Spring Batch can be used both in simple use cases (such as reading a file into a database or running a stored procedure) and in complex, high-volume use cases (such as moving large amounts of data between databases and transforming it along the way). High-volume batch jobs can use the framework to process large amounts of information in a highly scalable manner.

However, Spring Batch is not a scheduling framework. It focuses only on the processing side of batch work, such as logging and monitoring, transactions, and concurrency. To run jobs on a schedule, combine it with a scheduler such as Quartz, Tivoli, or Control-M.

1.2 The framework mainly provides the following features:

  • Transaction management

  • Chunk-based processing

  • Declarative I/O

  • Start/Stop/Restart

  • Retry/Skip
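
The Retry/Skip feature can be pictured with a small plain-Java sketch. It does not use the Spring Batch API; `SkipSketch`, `processWithSkips`, and `skipLimit` are names made up for this illustration. The idea it shows is that a faulty item is set aside instead of failing the whole run, until a configured limit is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration of skip semantics; not the Spring Batch API.
class SkipSketch {

    // Applies the processor to every item. An item whose processing throws is
    // skipped, but once more than skipLimit items have failed, the run aborts.
    static <I, O> List<O> processWithSkips(List<I> items, Function<I, O> processor, int skipLimit) {
        List<O> results = new ArrayList<>();
        int skipped = 0;
        for (I item : items) {
            try {
                results.add(processor.apply(item));
            } catch (RuntimeException e) {
                skipped++;
                if (skipped > skipLimit) {
                    throw new IllegalStateException("Skip limit of " + skipLimit + " exceeded", e);
                }
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1", "2", "oops", "4");
        // Integer.parseInt throws NumberFormatException for "oops"; that item is skipped.
        Function<String, Integer> parser = Integer::parseInt;
        List<Integer> parsed = processWithSkips(raw, parser, 1);
        System.out.println(parsed); // prints [1, 2, 4]
    }
}
```

With a skip limit of 1, a second bad item in the same list would abort the run with an exception rather than being silently dropped.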

1.3 The framework has four main roles:

  • JobLauncher is the task launcher. It starts jobs and can be regarded as the entry point of the program.
  • Job represents a specific task. A Job is started by the JobLauncher and consists of one or more Steps (imagine how many steps it takes to put an elephant in the refrigerator).
  • Step represents one specific stage of a Job and holds the actual execution content. A typical Step reads data (ItemReader), processes it (ItemProcessor), and writes it (ItemWriter).
  • JobRepository is where information about runs is stored. It can be seen as an interface to the database, and it records job status and other information during execution.
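
How the four roles relate can be sketched in plain Java. The classes below (`MiniStep`, `MiniJob`, `MiniRepository`, `MiniLauncher`) are hypothetical stand-ins written for this illustration, not the Spring Batch types, and the real framework does far more (transactions, restart, parameters). The sketch only shows the division of labor: the launcher runs the job's steps in order and reports each outcome to the repository.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-ins for the four roles; these are not the Spring Batch types.
class RolesSketch {

    // Step: one named unit of work inside a job.
    static class MiniStep {
        final String name;
        final Runnable work;
        MiniStep(String name, Runnable work) { this.name = name; this.work = work; }
    }

    // Job: an ordered list of steps under one name.
    static class MiniJob {
        final String name;
        final List<MiniStep> steps;
        MiniJob(String name, List<MiniStep> steps) { this.name = name; this.steps = steps; }
    }

    // JobRepository: remembers the status of every step and job that ran.
    static class MiniRepository {
        final Map<String, String> statusByName = new LinkedHashMap<>();
        void record(String name, String status) { statusByName.put(name, status); }
    }

    // JobLauncher: the entry point that runs a job and reports to the repository.
    static class MiniLauncher {
        private final MiniRepository repository;
        MiniLauncher(MiniRepository repository) { this.repository = repository; }

        String run(MiniJob job) {
            String jobStatus = "COMPLETED";
            for (MiniStep step : job.steps) {
                try {
                    step.work.run();
                    repository.record(step.name, "COMPLETED");
                } catch (RuntimeException e) {
                    repository.record(step.name, "FAILED");
                    jobStatus = "FAILED";
                    break; // later steps do not run after a failure
                }
            }
            repository.record(job.name, jobStatus);
            return jobStatus;
        }
    }

    public static void main(String[] args) {
        MiniJob job = new MiniJob("elephantJob", Arrays.asList(
                new MiniStep("openDoor", () -> System.out.println("open the refrigerator door")),
                new MiniStep("putElephant", () -> System.out.println("put the elephant in")),
                new MiniStep("closeDoor", () -> System.out.println("close the door"))));
        MiniRepository repository = new MiniRepository();
        String status = new MiniLauncher(repository).run(job);
        System.out.println(status + " " + repository.statusByName);
    }
}
```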

2. Set up a Spring Batch project

2.1 Generate the project with Spring Initializr

2.2 Unzip the archive and import it into IntelliJ IDEA


2.3 Add a database driver dependency before starting the project

		<dependency>
			<groupId>com.h2database</groupId>
			<artifactId>h2</artifactId>
			<scope>runtime</scope>
		</dependency>

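3. Getting started with Spring Batch

3.1 Create a config package and write a class named JobConfiguration

  • Add the @Configuration and @EnableBatchProcessing annotations to the class

3.2 Inject the JobBuilderFactory and StepBuilderFactory used to build Jobs and Steps

3.3 Create the job object (the code is as follows)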

    // A Job is assembled from Steps; this one runs a single step.
    @Bean
    public Job helloWorldJob() {
        return jobBuilderFactory.get("helloWorldJob")
                .start(step1())
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .tasklet(new Tasklet() {
                    // The work performed by the step; execute must return a RepeatStatus.
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        System.out.println("Hello World");
                        // FINISHED tells the framework that this tasklet is done.
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }
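
The return value of execute controls repetition: a tasklet that returns RepeatStatus.CONTINUABLE is called again, while RepeatStatus.FINISHED ends the step. The sketch below illustrates that contract with locally declared stand-ins (`Status`, `SimpleTasklet`, and `runUntilFinished` are hypothetical names) so that it compiles without Spring Batch on the classpath. It leaves out the transaction that the framework wraps around each call.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Stand-in types declared locally so the sketch compiles without Spring Batch.
class RepeatSketch {

    // Mirrors the two values of Spring Batch's RepeatStatus.
    enum Status { CONTINUABLE, FINISHED }

    // Simplified stand-in for Tasklet: one call does one slice of work.
    interface SimpleTasklet {
        Status execute();
    }

    // Calls the tasklet until it reports FINISHED and returns the number of calls.
    static int runUntilFinished(SimpleTasklet tasklet) {
        int calls = 0;
        Status status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == Status.CONTINUABLE);
        return calls;
    }

    public static void main(String[] args) {
        Deque<String> pending = new ArrayDeque<>(Arrays.asList("a.csv", "b.csv", "c.csv"));
        // Each call handles one file and asks to be called again while files remain.
        int calls = runUntilFinished(() -> {
            System.out.println("handling " + pending.poll());
            return pending.isEmpty() ? Status.FINISHED : Status.CONTINUABLE;
        });
        System.out.println("calls = " + calls); // prints calls = 3
    }
}
```

The Hello World tasklet returns FINISHED straight away, so it runs exactly once.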

4. Replace the H2 database with MySQL

  • Add the dependencies
		<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-jdbc</artifactId>
		</dependency>

4.1 Configure the yml file

spring:
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://localhost:3306/batch?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=utf8&autoReconnect=true&allowMultiQueries=true
    username: root
    password: root

    schema: classpath:/org/springframework/batch/core/schema-mysql.sql

  batch:
    initialize-schema: always

  • Start the program again, then refresh the database: the tables that persist job-related information now appear.

5. Core API


The figure above introduces some concepts related to a Job:

  • Job: encapsulates a processing entity and defines its process logic.
  • JobInstance: a running instance of a Job. Different instances have different parameters, so once a Job is defined it can be run multiple times with different parameters.
  • JobParameters: the parameters associated with a JobInstance.
  • JobExecution: represents one actual execution of a Job, which may succeed or fail.

Therefore, what developers need to do is define Jobs.
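
The relationship between these concepts can be sketched in plain Java: a job name plus its parameters identifies one JobInstance, and every attempt to run that instance adds one JobExecution. The class below is a hypothetical illustration (`InstanceSketch`, `instanceKey`, and `recordExecution` are made-up names), not how JobRepository is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of JobInstance versus JobExecution; not the Spring Batch API.
class InstanceSketch {

    // instance key -> statuses of every execution of that instance
    private final Map<String, List<String>> executionsByInstance = new HashMap<>();

    // Job name plus parameters (sorted, so insertion order does not matter) identify one instance.
    static String instanceKey(String jobName, Map<String, String> parameters) {
        return jobName + new TreeMap<>(parameters);
    }

    // Records one execution and returns how many executions the instance now has.
    int recordExecution(String jobName, Map<String, String> parameters, String status) {
        List<String> executions = executionsByInstance
                .computeIfAbsent(instanceKey(jobName, parameters), key -> new ArrayList<>());
        executions.add(status);
        return executions.size();
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    public static void main(String[] args) {
        InstanceSketch repository = new InstanceSketch();
        Map<String, String> june1 = new HashMap<>();
        june1.put("date", "2022-06-01");
        Map<String, String> june2 = new HashMap<>();
        june2.put("date", "2022-06-02");

        repository.recordExecution("importJob", june1, "FAILED");
        repository.recordExecution("importJob", june1, "COMPLETED"); // second attempt, same instance
        repository.recordExecution("importJob", june2, "COMPLETED"); // new parameters, new instance
        System.out.println(repository.instanceCount()); // prints 2
    }
}
```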

The following figure introduces some concepts related to a Step:

A Step encapsulates one stage of a Job. A Job can contain one or more Steps, which are executed one after another according to the logic you define; the Job is complete when its Steps have run.

Assembling Jobs out of Steps lets you implement complex business logic more flexibly.

**Input, processing, output**

Therefore, the key to defining a Job is to define one or more Steps and then assemble them. There are many ways to define a Step, but a common model is input, processing, output: an ItemReader, an ItemProcessor, and an ItemWriter. For example, the ItemReader reads data from a file, the ItemProcessor applies business logic and converts the data, and the ItemWriter writes the result to the database.

Spring Batch ships with many ready-to-use readers and writers, which is very convenient.
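
The input, processing, output model combined with chunk-based processing can be sketched as a plain-Java loop. This is a simplified illustration, not the framework's implementation: `ChunkSketch` and `run` are hypothetical names, an `Iterator` stands in for the reader, a `Function` for the processor, and a `Consumer` of lists for the writer. In Spring Batch each chunk is also written inside its own transaction, which the sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-Java illustration of a chunk-oriented step; not the Spring Batch API.
class ChunkSketch {

    // Reads items one at a time, processes each, and hands them to the writer
    // in groups of chunkSize. Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(chunk);
                chunks++;
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) { // final, possibly smaller chunk
            writer.accept(chunk);
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alice", "bob", "carol", "dave", "erin");
        List<List<String>> written = new ArrayList<>();
        Function<String, String> processor = String::toUpperCase; // the "business logic"
        Consumer<List<String>> writer = written::add;             // stands in for a database write
        int chunks = run(lines.iterator(), processor, writer, 2);
        System.out.println(chunks + " " + written);
        // prints 3 [[ALICE, BOB], [CAROL, DAVE], [ERIN]]
    }
}
```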

6. Creating and using a Job (the code is as follows)

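package com.bosc.springbatch.config;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;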

@Configuration
@EnableBatchProcessing
public class JobDemo {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // Create the job object
    @Bean
    public Job jobDemoJob() {
        return jobBuilderFactory.get("jobDemoJob")
                // Simple sequential alternative: next() names the step that follows,
                // so step1 runs first, then step2, then step3.
                //.start(step1())
                //.next(step2())
                //.next(step3())
                .start(step1())
                // on("COMPLETED") specifies a condition: step1 ended with status COMPLETED
                .on("COMPLETED")
                // to(step2()) names the step to run when the condition is met
                .to(step2())
                // step3 runs only when step2 has completed successfully
                .from(step2()).on("COMPLETED").to(step3())
                // fail() would end the job as failed after step2, so step3 would not run
                /*.from(step2()).on("COMPLETED").fail()*/
                // stopAndRestart() would stop the job after step2 and continue from the given step on restart; generally used for testing
                /*.from(step2()).on("COMPLETED").stopAndRestart(step2())*/
                // end() closes the flow after step3
                .from(step3()).end()
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work of step1
                        System.out.println("step1");
                        // After normal completion the flow moves on to the next step
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }

    @Bean
    public Step step2() {
        return stepBuilderFactory.get("step2")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work of step2
                        System.out.println("step2");
                        // After normal completion the flow moves on to the next step
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }

    @Bean
    public Step step3() {
        return stepBuilderFactory.get("step3")
                // The work this step performs
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
                        // Work of step3
                        System.out.println("step3");
                        // The flow ends after this step completes normally
                        return RepeatStatus.FINISHED;
                    }
                }).build();
    }
}
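
The conditional flow built with on() and to() can be thought of as a transition table: for each step and exit status, which step comes next. The plain-Java sketch below models only that idea. `FlowSketch` and its methods are hypothetical names, it does not use Spring Batch, and it simply stops when no transition matches, whereas the real framework also decides the job's final status.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Plain-Java illustration of a conditional step flow; not the Spring Batch API.
class FlowSketch {

    // step name -> work that returns the step's exit status
    private final Map<String, Supplier<String>> steps = new HashMap<>();
    // "step:exitStatus" -> name of the next step
    private final Map<String, String> transitions = new HashMap<>();

    FlowSketch step(String name, Supplier<String> work) {
        steps.put(name, work);
        return this;
    }

    FlowSketch on(String from, String exitStatus, String to) {
        transitions.put(from + ":" + exitStatus, to);
        return this;
    }

    // Runs from the given step and follows transitions until none matches.
    List<String> run(String first) {
        List<String> executed = new ArrayList<>();
        String current = first;
        while (current != null) {
            executed.add(current);
            String exitStatus = steps.get(current).get();
            current = transitions.get(current + ":" + exitStatus);
        }
        return executed;
    }

    public static void main(String[] args) {
        FlowSketch flow = new FlowSketch()
                .step("step1", () -> "COMPLETED")
                .step("step2", () -> "COMPLETED")
                .step("step3", () -> "COMPLETED")
                .on("step1", "COMPLETED", "step2")
                .on("step2", "COMPLETED", "step3");
        System.out.println(flow.run("step1")); // prints [step1, step2, step3]
    }
}
```

If step2 returned "FAILED" in this sketch, no transition would match and step3 would never run, which mirrors the condition placed on step2 in jobDemoJob.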

Tags: Java Spring batch

Posted by etherboo on Wed, 01 Jun 2022 11:18:33 +0530