Reader, Processor, Writer
The heart of a chunk-oriented step is the read → process → write trio. An ItemReader supplies items one at a time, an ItemProcessor transforms or filters each, and an ItemWriter persists them in batches. Spring Batch ships ready-made readers and writers for the common sources — flat files, JDBC, JPA — so most jobs are configuration, not code. This page wires them into a complete CSV→database import. For how the surrounding step and job are built, see Jobs & Steps.
ItemReader — where data comes from
A reader’s read() returns the next item or null to signal the end of input. You rarely implement it yourself; the built-ins cover most sources.
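The read()/null contract is small enough to sketch in plain Java. The example below is only an illustration: SimpleReader is a hypothetical stand-in for ItemReader (it is not a Spring Batch type), and IteratorReader and ReaderContractDemo are made-up names.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for ItemReader<T>: one method, null means "no more input".
interface SimpleReader<T> {
    T read();
}

// Wraps any Iterator and follows the read()/null contract.
class IteratorReader<T> implements SimpleReader<T> {
    private final Iterator<T> source;

    IteratorReader(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public T read() {
        // Return the next item, or null once the source is exhausted.
        return source.hasNext() ? source.next() : null;
    }
}

class ReaderContractDemo {
    // Drains a reader the way a caller would: call read() until it returns null.
    static <T> int drain(SimpleReader<T> reader) {
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        SimpleReader<String> reader = new IteratorReader<>(List.of("Ada", "Alan").iterator());
        System.out.println(drain(reader)); // prints 2
    }
}
```

Because null is the end-of-input signal, a source that can legitimately contain null items has to wrap or skip them before they reach the caller.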
FlatFileItemReader (CSV)
FlatFileItemReader parses delimited or fixed-width files into objects. Build it with FlatFileItemReaderBuilder.
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.core.io.ClassPathResource;
@Bean
FlatFileItemReader<CustomerCsv> customerReader() {
return new FlatFileItemReaderBuilder<CustomerCsv>()
.name("customerReader")
.resource(new ClassPathResource("customers.csv"))
.linesToSkip(1) // skip header row
.delimited()
.names("firstName", "lastName", "email")
.targetType(CustomerCsv.class) // maps columns -> record/bean
.build();
}
firstName,lastName,email
Ada,Lovelace,[email protected]
Alan,Turing,[email protected]
A simple record models a CSV row well — see Java records:
public record CustomerCsv(String firstName, String lastName, String email) {}
JdbcCursorItemReader (database)
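To read from a database, stream rows with JdbcCursorItemReader. It holds a single cursor open over the result set and maps one row per read() call, so memory stays flat even for huge tables, provided the JDBC driver streams rows instead of buffering the whole result (some drivers need a fetch size configured for that).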
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
import org.springframework.jdbc.core.DataClassRowMapper;
@Bean
JdbcCursorItemReader<CustomerCsv> jdbcReader(DataSource dataSource) {
return new JdbcCursorItemReaderBuilder<CustomerCsv>()
.name("jdbcReader")
.dataSource(dataSource)
.sql("SELECT first_name AS firstName, last_name AS lastName, email FROM legacy_customer")
.rowMapper(new DataClassRowMapper<>(CustomerCsv.class))
.build();
}
| Reader | Source | Notes |
|---|---|---|
| FlatFileItemReader | CSV / fixed-width files | header skipping, delimited or fixed-length |
| JdbcCursorItemReader | SQL via a single cursor | low memory; on restart it re-runs the query and skips ahead to the saved row |
| JdbcPagingItemReader | SQL paged by key | restartable, safer for very large tables |
| JpaPagingItemReader | JPQL query | uses the EntityManager; paged |
| StaxEventItemReader | XML | streams XML fragments |
Tip: Cursor readers are simplest but tie up one connection for the whole step. For multi-million-row jobs, or anything that must restart mid-step, prefer the paging readers (JdbcPagingItemReader / JpaPagingItemReader), which sort by a key and fetch a page at a time.
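To make "sort by a key and fetch a page at a time" concrete, here is a minimal plain-Java sketch of key-based paging over an in-memory table. It illustrates the general idea only and is not how the Spring Batch paging readers are implemented; the class KeysetPagingDemo and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class KeysetPagingDemo {
    // In-memory stand-in for a table with a unique, sortable key (id -> name).
    private final TreeMap<Long, String> table = new TreeMap<>();

    KeysetPagingDemo(int rows) {
        for (long id = 1; id <= rows; id++) {
            table.put(id, "customer-" + id);
        }
    }

    // Similar in spirit to: SELECT id ... WHERE id > :lastKey ORDER BY id LIMIT :pageSize
    List<Long> fetchPage(long lastKey, int pageSize) {
        List<Long> page = new ArrayList<>();
        for (Long id : table.tailMap(lastKey, false).keySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    // Reads every row after startKey, one page at a time, and returns the rows seen.
    int readAllFrom(long startKey, int pageSize) {
        long lastKey = startKey;
        int seen = 0;
        List<Long> page;
        while (!(page = fetchPage(lastKey, pageSize)).isEmpty()) {
            seen += page.size();
            lastKey = page.get(page.size() - 1); // the position a restart would resume from
        }
        return seen;
    }

    public static void main(String[] args) {
        KeysetPagingDemo demo = new KeysetPagingDemo(25);
        System.out.println(demo.readAllFrom(0, 10));  // prints 25
        System.out.println(demo.readAllFrom(10, 10)); // prints 15 (resumes after key 10)
    }
}
```

Each page is a short, independent query, so no connection or cursor has to stay open between pages, and resuming only needs the last key that was read.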
ItemProcessor — transform and filter
The processor is the only optional member of the trio. It receives one input item and returns one output item — possibly a different type. Returning null filters the item out: it is silently dropped and never reaches the writer.
import org.springframework.batch.item.ItemProcessor;
import org.springframework.lang.Nullable;
public class CustomerProcessor implements ItemProcessor<CustomerCsv, CustomerEntity> {
@Override
@Nullable
public CustomerEntity process(CustomerCsv in) {
if (in.email() == null || !in.email().contains("@")) {
return null; // filter out rows with bad emails
}
CustomerEntity e = new CustomerEntity();
e.setFullName((in.firstName() + " " + in.lastName()).trim());
e.setEmail(in.email().toLowerCase());
return e;
}
}
To run several transformations in order, compose them with a CompositeItemProcessor. Keep validation in a processor (for example a ValidatingItemProcessor) rather than in the reader, so that validation failures are thrown during processing, where a fault-tolerant step's skip policy can handle them.
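The chaining idea can be sketched without any framework. The example below is a plain-Java illustration, not the CompositeItemProcessor API: it runs stages in order and stops as soon as one returns null, which is how a filtered item drops out of a chain. ProcessorChainDemo, STAGES, and runChain are hypothetical names.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

class ProcessorChainDemo {
    // Three illustrative stages: trim, reject values without '@', lower-case.
    static final List<UnaryOperator<String>> STAGES = List.of(
            String::trim,
            s -> s.contains("@") ? s : null, // null filters the item out
            s -> s.toLowerCase(Locale.ROOT));

    // Applies each stage in order; a null result ends the chain early.
    static String runChain(String item, List<UnaryOperator<String>> stages) {
        String current = item;
        for (UnaryOperator<String> stage : stages) {
            current = stage.apply(current);
            if (current == null) {
                return null; // later stages never see a filtered item
            }
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(runChain("  Ada@Example.COM ", STAGES)); // prints ada@example.com
        System.out.println(runChain("not-an-email", STAGES));       // prints null
    }
}
```

Ordering matters: placing the cheap filter early means later, more expensive stages only run on items that will actually be written.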
ItemWriter — where data lands
A writer receives a Chunk<T> — the whole batch of processed items — and persists it in one go, then the transaction commits.
JdbcBatchItemWriter
For plain JDBC, JdbcBatchItemWriter binds each item to a parameterized statement and sends the whole chunk as one JDBC batch, which is usually the fastest way to bulk-insert through JDBC.
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
@Bean
JdbcBatchItemWriter<CustomerEntity> jdbcWriter(DataSource dataSource) {
return new JdbcBatchItemWriterBuilder<CustomerEntity>()
.dataSource(dataSource)
.sql("INSERT INTO customer (full_name, email) VALUES (:fullName, :email)")
.beanMapped() // bind :fullName/:email from bean properties
.build();
}
JpaItemWriter
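If your target is a JPA entity, JpaItemWriter writes each item through an EntityManager obtained from the EntityManagerFactory. It calls merge by default; the builder's usePersist(true) switches to persist, which suits entities that are always new.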
import jakarta.persistence.EntityManagerFactory;
import org.springframework.batch.item.database.JpaItemWriter;
import org.springframework.batch.item.database.builder.JpaItemWriterBuilder;
@Bean
JpaItemWriter<CustomerEntity> jpaWriter(EntityManagerFactory emf) {
return new JpaItemWriterBuilder<CustomerEntity>()
.entityManagerFactory(emf)
.build();
}
Tip: JdbcBatchItemWriter is generally faster and gives you direct control over the SQL; JpaItemWriter is convenient when you already have mapped JPA entities. For upserts, write database-specific SQL (INSERT ... ON CONFLICT / ON DUPLICATE KEY UPDATE) with the JDBC writer to keep restarts idempotent.
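Why an upsert keeps restarts idempotent can be shown with a small in-memory model. This is a sketch under stated assumptions, not database code: a plain INSERT is modelled as appending to a list, an upsert as writing into a map keyed by email (assumed here to be the unique key), and Row and UpsertReplayDemo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row shape for the sketch.
record Row(String fullName, String email) {}

class UpsertReplayDemo {
    // Plain INSERT modelled as an append: writing the same chunk again duplicates rows.
    static int insertOnly(List<Row> chunk, int timesWritten) {
        List<Row> table = new ArrayList<>();
        for (int i = 0; i < timesWritten; i++) {
            table.addAll(chunk);
        }
        return table.size();
    }

    // Upsert modelled as a map keyed by email: writing the same chunk again overwrites.
    static int upsert(List<Row> chunk, int timesWritten) {
        Map<String, Row> table = new LinkedHashMap<>();
        for (int i = 0; i < timesWritten; i++) {
            for (Row row : chunk) {
                table.put(row.email(), row);
            }
        }
        return table.size();
    }

    public static void main(String[] args) {
        List<Row> chunk = List.of(
                new Row("Ada Lovelace", "ada@example.com"),
                new Row("Alan Turing", "alan@example.com"));
        System.out.println(insertOnly(chunk, 2)); // prints 4
        System.out.println(upsert(chunk, 2));     // prints 2
    }
}
```

The same reasoning applies to a real table only if the conflict target (email in this sketch) has a unique constraint for the upsert to key on.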
A full CSV → DB job
Putting the trio together: read customers.csv, validate and reshape each row, and bulk-insert. The entity:
import jakarta.persistence.*;
@Entity
@Table(name = "customer")
public class CustomerEntity {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
private String fullName;
private String email;
// getters and setters
}
The step and job — note the <CustomerCsv, CustomerEntity> chunk types because the processor changes type:
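import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.step.builder.StepBuilder;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;
@Configuration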
public class CustomerImportConfig {
@Bean
ItemProcessor<CustomerCsv, CustomerEntity> processor() {
return new CustomerProcessor();
}
@Bean
Step importStep(JobRepository jobRepository,
PlatformTransactionManager txManager,
FlatFileItemReader<CustomerCsv> customerReader,
ItemProcessor<CustomerCsv, CustomerEntity> processor,
JdbcBatchItemWriter<CustomerEntity> jdbcWriter) {
return new StepBuilder("importStep", jobRepository)
.<CustomerCsv, CustomerEntity>chunk(500, txManager)
.reader(customerReader)
.processor(processor)
.writer(jdbcWriter)
.build();
}
@Bean
Job customerImportJob(JobRepository jobRepository, Step importStep) {
return new JobBuilder("customerImportJob", jobRepository)
.start(importStep)
.build();
}
}
Output (console):
INFO o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=customerImportJob]] launched
INFO o.s.batch.core.job.SimpleStepHandler : Executing step: [importStep]
INFO o.s.batch.core.step.AbstractStep : Step: [importStep] executed in 412ms
INFO o.s.b.c.l.support.SimpleJobLauncher : Job: [customerImportJob] completed with status: [COMPLETED]
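With a chunk size of 500 and 2,000 rows that all pass validation, the writer runs four batch inserts of 500 items, each committed in its own transaction. The chunk size counts items read, not items written: any row the processor returned null for is filtered out and never reaches the writer, so a chunk that contains filtered rows writes fewer than 500 items.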
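The arithmetic above can be checked with a simplified plain-Java model of the chunk loop. It is a sketch, not Spring Batch's implementation: rows are just the numbers 1..totalRows, the hypothetical parameter invalidEvery marks every n-th row as one the processor would filter, and reading and processing are interleaved for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkLoopDemo {
    // Returns the number of items handed to the writer for each chunk.
    static List<Integer> run(int totalRows, int chunkSize, int invalidEvery) {
        List<Integer> writes = new ArrayList<>();
        int next = 1; // next row number to read
        while (next <= totalRows) {
            List<Integer> chunk = new ArrayList<>();
            // Read up to chunkSize rows; the limit applies to rows read, not rows kept.
            for (int read = 0; read < chunkSize && next <= totalRows; read++, next++) {
                boolean filtered = invalidEvery > 0 && next % invalidEvery == 0;
                if (!filtered) {
                    chunk.add(next); // the processor returned a non-null item
                }
            }
            if (!chunk.isEmpty()) {
                writes.add(chunk.size()); // one write call per non-empty chunk in this sketch
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        System.out.println(run(2000, 500, 0));  // prints [500, 500, 500, 500]
        System.out.println(run(2000, 500, 10)); // prints [450, 450, 450, 450]
    }
}
```

In the second call every tenth row is filtered, so each chunk still covers 500 rows of input but only 450 items reach the writer.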