Tutorial

Spring Batch Example

Published on August 3, 2022
author

Pankaj

Spring Batch Example

Welcome to Spring Batch Example. Spring Batch is a spring framework module for execution of batch job. We can use spring batch to process a series of jobs.

Spring Batch Example

Before going through spring batch example program, let’s get some idea about spring batch terminologies.

  • A job can consist of ‘n’ number of steps. Each step contains Read-Process-Write task or it can have single operation, which is called tasklet.
  • Read-Process-Write is basically read from a source like Database, CSV etc. then process the data and write it to a source like Database, CSV, XML etc.
  • Tasklet means doing a single task or operation like cleaning of connections, freeing up resources after processing is done.
  • Read-Process-Write and tasklets can be chained together to run a job.

Spring Batch Example

Let us consider a working example for implementation of spring batch. We will consider the following scenario for implementation purpose. A CSV file containing data needs to be converted as XML along with the data and tags will be named after the column name. Below are the important tools and libraries used for spring batch example.

  1. Apache Maven 3.5.0 - for project build and dependencies management.
  2. Eclipse Oxygen Release 4.7.0 - IDE for creating spring batch maven application.
  3. Java 1.8
  4. Spring Core 4.3.12.RELEASE
  5. Spring OXM 4.3.12.RELEASE
  6. Spring JDBC 4.3.12.RELEASE
  7. Spring Batch 3.0.8.RELEASE
  8. MySQL Java Driver 5.1.25 - use based on your MySQL installation. This is required for Spring Batch metadata tables.

Spring Batch Example Directory Structure

Below image illustrates all the components in our Spring Batch example project. spring batch example

Spring Batch Maven Dependencies

Below is the content of pom.xml file with all the required dependencies for our spring batch example project.

<project xmlns="https://maven.apache.org/POM/4.0.0" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="https://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.journaldev.spring</groupId>
	<artifactId>SpringBatchExample</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>

	<name>SpringBatchDemo</name>
	<url>https://maven.apache.org</url>

	<properties>
		<jdk.version>1.8</jdk.version>
		<spring.version>4.3.12.RELEASE</spring.version>
		<spring.batch.version>3.0.8.RELEASE</spring.batch.version>
		<mysql.driver.version>5.1.25</mysql.driver.version>
		<junit.version>4.11</junit.version>
	</properties>

	<dependencies>

		<!-- Spring Core -->
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-core</artifactId>
			<version>${spring.version}</version>
		</dependency>

		<!-- Spring jdbc, for database -->
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-jdbc</artifactId>
			<version>${spring.version}</version>
		</dependency>

		<!-- Spring XML to/back object -->
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-oxm</artifactId>
			<version>${spring.version}</version>
		</dependency>

		<!-- MySQL database driver -->
		<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
			<version>${mysql.driver.version}</version>
		</dependency>

		<!-- Spring Batch dependencies -->
		<dependency>
			<groupId>org.springframework.batch</groupId>
			<artifactId>spring-batch-core</artifactId>
			<version>${spring.batch.version}</version>
		</dependency>
		<dependency>
			<groupId>org.springframework.batch</groupId>
			<artifactId>spring-batch-infrastructure</artifactId>
			<version>${spring.batch.version}</version>
		</dependency>

		<!-- Spring Batch unit test -->
		<dependency>
			<groupId>org.springframework.batch</groupId>
			<artifactId>spring-batch-test</artifactId>
			<version>${spring.batch.version}</version>
		</dependency>

		<!-- Junit -->
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<version>${junit.version}</version>
			<scope>test</scope>
		</dependency>

		<dependency>
			<groupId>com.thoughtworks.xstream</groupId>
			<artifactId>xstream</artifactId>
			<version>1.4.10</version>
		</dependency>

	</dependencies>
	<build>
		<finalName>spring-batch</finalName>
		<plugins>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-eclipse-plugin</artifactId>
				<version>2.9</version>
				<configuration>
					<downloadSources>true</downloadSources>
					<downloadJavadocs>false</downloadJavadocs>
				</configuration>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>2.3.2</version>
				<configuration>
					<source>${jdk.version}</source>
					<target>${jdk.version}</target>
				</configuration>
			</plugin>
		</plugins>
	</build>
</project>

Spring Batch Processing CSV Input File

Here is the content of our sample CSV file for spring batch processing.

1001,Tom,Moody, 29/7/2013
1002,John,Parker, 30/7/2013
1003,Henry,Williams, 31/7/2013

Spring Batch Job Configuration

We have to define spring bean and spring batch job in a configuration file. Below is the content of job-batch-demo.xml file, it’s the most important part of spring batch project.

<beans xmlns="https://www.springframework.org/schema/beans"
	xmlns:batch="https://www.springframework.org/schema/batch" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="https://www.springframework.org/schema/batch
		https://www.springframework.org/schema/batch/spring-batch-3.0.xsd
		https://www.springframework.org/schema/beans 
		https://www.springframework.org/schema/beans/spring-beans-4.3.xsd
	">

	<import resource="../config/context.xml" />
	<import resource="../config/database.xml" />

	<bean id="report" class="com.journaldev.spring.model.Report"
		scope="prototype" />
	<bean id="itemProcessor" class="com.journaldev.spring.CustomItemProcessor" />

	<batch:job id="DemoJobXMLWriter">
		<batch:step id="step1">
			<batch:tasklet>
				<batch:chunk reader="csvFileItemReader" writer="xmlItemWriter"
					processor="itemProcessor" commit-interval="10">
				</batch:chunk>
			</batch:tasklet>
		</batch:step>
	</batch:job>

	<bean id="csvFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">

		<property name="resource" value="classpath:csv/input/report.csv" />

		<property name="lineMapper">
			<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
				<property name="lineTokenizer">
					<bean
						class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
						<property name="names" value="id,firstname,lastname,dob" />
					</bean>
				</property>
				<property name="fieldSetMapper">
					<bean class="com.journaldev.spring.ReportFieldSetMapper" />

					<!-- if no data type conversion, use BeanWrapperFieldSetMapper to map 
						by name <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper"> 
						<property name="prototypeBeanName" value="report" /> </bean> -->
				</property>
			</bean>
		</property>

	</bean>

	<bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
		<property name="resource" value="file:xml/outputs/report.xml" />
		<property name="marshaller" ref="reportMarshaller" />
		<property name="rootTagName" value="report" />
	</bean>

	<bean id="reportMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
		<property name="classesToBeBound">
			<list>
				<value>com.journaldev.spring.model.Report</value>
			</list>
		</property>
	</bean>

</beans>
  1. We are using FlatFileItemReader to read CSV file, CustomItemProcessor to process the data and write to XML file using StaxEventItemWriter.
  2. batch:job – This tag defines the job that we want to create. Id property specifies the ID of the job. We can define multiple jobs in a single xml file.
  3. batch:step – This tag is used to define different steps of a spring batch job.
  4. Two different types of processing style is offered by Spring Batch Framework, which are “TaskletStep Oriented” and “Chunk Oriented”. Chunk Oriented style is used in this example refers to reading the data one by one and creating ‘chunks’ that will be written out, within a transaction boundary.
  5. reader: spring bean used for reading the data. We have used csvFileItemReader bean in this example that is instance of FlatFileItemReader.
  6. processor: this is the class which is used for processing the data. We have used CustomItemProcessor in this example.
  7. writer: bean used to write data into xml file.
  8. commit-interval: This property defines the size of the chunk which will be committed once processing is done. Basically it means that ItemReader will read the data one by one and ItemProcessor will also process it the same way but ItemWriter will write the data only when it equals the size of commit-interval.
  9. Three important interface that are used as part of this project are ItemReader, ItemProcessor and ItemWriter from org.springframework.batch.item package.

Spring Batch Model Class

First of all we are reading CSV file into java object and then using JAXB to write it to xml file. Below is our model class with required JAXB annotations.

package com.journaldev.spring.model;

import java.util.Date;

import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "record")
public class Report {

	private int id;
	private String firstName;
	private String lastName;
	private Date dob;

	@XmlAttribute(name = "id")
	public int getId() {
		return id;
	}

	public void setId(int id) {
		this.id = id;
	}

	@XmlElement(name = "firstname")
	public String getFirstName() {
		return firstName;
	}

	public void setFirstName(String firstName) {
		this.firstName = firstName;
	}

	@XmlElement(name = "lastname")
	public String getLastName() {
		return lastName;
	}

	public void setLastName(String lastName) {
		this.lastName = lastName;
	}

	@XmlElement(name = "dob")
	public Date getDob() {
		return dob;
	}

	public void setDob(Date dob) {
		this.dob = dob;
	}

	@Override
	public String toString() {
		return "Report [id=" + id + ", firstname=" + firstName + ", lastName=" + lastName + ", DateOfBirth=" + dob
				+ "]";
	}

}

Note that the model class fields should be same as defined in the spring batch mapper configuration i.e. property name="names" value="id,firstname,lastname,dob" in our case.

Spring Batch FieldSetMapper

A custom FieldSetMapper is needed to convert a Date. If no data type conversion is required, then only BeanWrapperFieldSetMapper should be used to map the values by name automatically. The java class which extends FieldSetMapper is ReportFieldSetMapper.

package com.journaldev.spring;

import java.text.ParseException;
import java.text.SimpleDateFormat;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

import com.journaldev.spring.model.Report;

public class ReportFieldSetMapper implements FieldSetMapper<Report> {

	private SimpleDateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");

	public Report mapFieldSet(FieldSet fieldSet) throws BindException {

		Report report = new Report();
		report.setId(fieldSet.readInt(0));
		report.setFirstName(fieldSet.readString(1));
		report.setLastName(fieldSet.readString(2));

		// default format yyyy-MM-dd
		// fieldSet.readDate(4);
		String date = fieldSet.readString(3);
		try {
			report.setDob(dateFormat.parse(date));
		} catch (ParseException e) {
			e.printStackTrace();
		}

		return report;

	}

}

Spring Batch Item Processor

Now as defined in the job configuration an itemProcessor will be fired before itemWriter. We have created a CustomItemProcessor.java class for the same.

package com.journaldev.spring;

import org.springframework.batch.item.ItemProcessor;

import com.journaldev.spring.model.Report;

public class CustomItemProcessor implements ItemProcessor<Report, Report> {

	public Report process(Report item) throws Exception {
		
		System.out.println("Processing..." + item);
		String fname = item.getFirstName();
		String lname = item.getLastName();
		
		item.setFirstName(fname.toUpperCase());
		item.setLastName(lname.toUpperCase());
		return item;
	}

}

We can manipulate data in ItemProcessor implementation, as you can see that I am converting first name and last name values to upper case.

Spring Configuration Files

In our spring batch configuration file, we have imported two additional configuration files - context.xml and database.xml.

<beans xmlns="https://www.springframework.org/schema/beans"
	xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="
		https://www.springframework.org/schema/beans 
		https://www.springframework.org/schema/beans/spring-beans-4.3.xsd">

	<!-- stored job-meta in memory -->
	<!--  
	<bean id="jobRepository"
		class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
		<property name="transactionManager" ref="transactionManager" />
	</bean>
 	 -->
 	 
 	 <!-- stored job-meta in database -->
	<bean id="jobRepository"
		class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
		<property name="dataSource" ref="dataSource" />
		<property name="transactionManager" ref="transactionManager" />
		<property name="databaseType" value="mysql" />
	</bean>
	
	<bean id="transactionManager"
		class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
	 
	<bean id="jobLauncher"
		class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
		<property name="jobRepository" ref="jobRepository" />
	</bean>

</beans>
  • jobRepository - The JobRepository is responsible for storing each Java object into its correct meta-data table for spring batch.
  • transactionManager- this is responsible for committing the transaction once size of commit-interval and the processed data is equal.
  • jobLauncher – This is the heart of spring batch. This interface contains the run method which is used to trigger the job.
<beans xmlns="https://www.springframework.org/schema/beans"
	xmlns:jdbc="https://www.springframework.org/schema/jdbc" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="https://www.springframework.org/schema/beans 
		https://www.springframework.org/schema/beans/spring-beans-4.3.xsd
		https://www.springframework.org/schema/jdbc 
		https://www.springframework.org/schema/jdbc/spring-jdbc-4.3.xsd">

	<!-- connect to database -->
	<bean id="dataSource"
		class="org.springframework.jdbc.datasource.DriverManagerDataSource">
		<property name="driverClassName" value="com.mysql.jdbc.Driver" />
		<property name="url" value="jdbc:mysql://localhost:3306/Test" />
		<property name="username" value="test" />
		<property name="password" value="test123" />
	</bean>

	<bean id="transactionManager"
		class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

	<!-- create job-meta tables automatically -->
	<!-- <jdbc:initialize-database data-source="dataSource"> <jdbc:script location="org/springframework/batch/core/schema-drop-mysql.sql" 
		/> <jdbc:script location="org/springframework/batch/core/schema-mysql.sql" 
		/> </jdbc:initialize-database> -->
</beans>

Spring Batch uses some metadata tables to store batch jobs information. We can get them created from spring batch configurations but it’s advisable to do it manually by executing the SQL files, as you can see in commented code above. From security point of view, it’s better to not give DDL execution access to spring batch database user.

Spring Batch Tables

Spring Batch tables very closely match the Domain objects that represent them in Java. For example - JobInstance, JobExecution, JobParameters and StepExecution map to BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS and BATCH_STEP_EXECUTION respectively. ExecutionContext maps to both BATCH_JOB_EXECUTION_CONTEXT and BATCH_STEP_EXECUTION_CONTEXT. The JobRepository is responsible for saving and storing each java object into its correct table. spring batch tables Below are the details of each meta-data table.

  1. Batch_job_instance: The BATCH_JOB_INSTANCE table holds all information relevant to a JobInstance.
  2. Batch_job_execution_params: The BATCH_JOB_EXECUTION_PARAMS table holds all information relevant to the JobParameters object.
  3. Batch_job_execution: The BATCH_JOB_EXECUTION table holds data relevant to the JobExecution object. A new row gets added every time a Job is run.
  4. Batch_step_execution: The BATCH_STEP_EXECUTION table holds all information relevant to the StepExecution object.
  5. Batch_job_execution_context: The BATCH_JOB_EXECUTION_CONTEXT table holds data relevant to an Job’s ExecutionContext. There is exactly one Job ExecutionContext for every JobExecution, and it contains all of the job-level data that is needed for that particular job execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it had failed.
  6. Batch_step_execution_context: The BATCH_STEP_EXECUTION_CONTEXT table holds data relevant to an Step’s ExecutionContext. There is exactly one ExecutionContext for every StepExecution, and it contains all of the data that needs to persisted for a particular step execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it failed.
  7. Batch_job_execution_seq: This table holds the data execution sequence of job.
  8. Batch_step_execution_seq: This table holds the data for sequence for step execution.
  9. Batch_job_seq: This table holds the data for sequence of job in case we have multiple jobs we will get multiple rows.

Spring Batch Test Program

Our Spring Batch example project is ready, final step is to write a test class to execute it as a java program.

package com.journaldev.spring;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class App {
	public static void main(String[] args) {

		String[] springConfig = { "spring/batch/jobs/job-batch-demo.xml" };

		ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext(springConfig);

		JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
		Job job = (Job) context.getBean("DemoJobXMLWriter");

		JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis())
				.toJobParameters();

		try {

			JobExecution execution = jobLauncher.run(job, jobParameters);
			System.out.println("Exit Status : " + execution.getStatus());

		} catch (Exception e) {
			e.printStackTrace();
		}

		System.out.println("Done");
		context.close();
	}
}

Just run above program and you will get output xml like below.

<?xml version="1.0" encoding="UTF-8"?><report><record id="1001"><dob>2013-07-29T00:00:00+05:30</dob><firstname>TOM</firstname><lastname>MOODY</lastname></record><record id="1002"><dob>2013-07-30T00:00:00+05:30</dob><firstname>JOHN</firstname><lastname>PARKER</lastname></record><record id="1003"><dob>2013-07-31T00:00:00+05:30</dob><firstname>HENRY</firstname><lastname>WILLIAMS</lastname></record></report>

That’s all for Spring Batch example, you can download final project from below link.

Download Spring Batch Example Project

Reference: Official Guide

Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.

Learn more about our products

About the authors
Default avatar
Pankaj

author

While we believe that this content benefits our community, we have not yet thoroughly reviewed it. If you have any suggestions for improvements, please let us know by clicking the “report an issue“ button at the bottom of the tutorial.

Still looking for an answer?

Ask a questionSearch for more help

Was this helpful?
 
JournalDev
DigitalOcean Employee
DigitalOcean Employee badge
February 15, 2021

Can we convert the same spring batch application to spring boot batch application with out making major changes?

- Sri

    JournalDev
    DigitalOcean Employee
    DigitalOcean Employee badge
    June 20, 2019

    Hello Thank you for this Tutorial, but i have a request when i want to run the project i need the script who can create batch Database Table BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS and BATCH_STEP_EXECUTION if you can give it to me please !

    - HAJARI Youssef

      JournalDev
      DigitalOcean Employee
      DigitalOcean Employee badge
      July 24, 2018

      Hi , i followed your articles with clean examples.thanks for sharing your knowledge and clarifying doubts. i have one doubt spring batch, Question : suppose a batch running 100 records but job is failed 50th some other i need to restart the job automatically where it stopped or need skip the exception record …its like cyclic process failed record moved to list r file… can u help me …

      - veera

        JournalDev
        DigitalOcean Employee
        DigitalOcean Employee badge
        May 16, 2018

        The java class which implements** FieldSetMapper is ReportFieldSetMapper. Also that, please highlight the class/interface names which are only available from Spring Framework. Otherwise, as the readers of articles , we get wrong perception, that even, for instance, ReportFieldSetMapper is provided by Spring. On the whole, , article is very interesting. Thanks.

        - Chandra

          JournalDev
          DigitalOcean Employee
          DigitalOcean Employee badge
          December 11, 2017

          - malla

            JournalDev
            DigitalOcean Employee
            DigitalOcean Employee badge
            December 11, 2017

            Hi Pankaj Could you provide the following scripts for database.xml

            - malla

              Try DigitalOcean for free

              Click below to sign up and get $200 of credit to try our products over 60 days!

              Sign up

              Join the Tech Talk
              Success! Thank you! Please check your email for further details.

              Please complete your information!

              Featured on Community

              Get our biweekly newsletter

              Sign up for Infrastructure as a Newsletter.

              Hollie's Hub for Good

              Working on improving health and education, reducing inequality, and spurring economic growth? We'd like to help.

              Become a contributor

              Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.

              Welcome to the developer cloud

              DigitalOcean makes it simple to launch in the cloud and scale up as you grow — whether you're running one virtual machine or ten thousand.

              Learn more
              Animation showing a Droplet being created in the DigitalOcean Cloud console