Optimizing Output File Testing in Spring Batch
source link: https://keyholesoftware.com/optimizing-output-file-testing-in-spring-batch/
By Jonny Hackett, December 12, 2023 (Java, Spring, Spring Batch, Testing)
It’s quite common to build Spring Batch jobs whose output is a file for distribution to another team or to another business. These text files come in various formats: delimited, fixed-length, XML, or some other structure such as an MT950-formatted file (common in financial institutions).
In a previous article, I discussed testing practices using Mockito, but they were primarily focused on testing the business logic that’s usually found in the ItemProcessor or in services called by the ItemProcessor.
However, this time, we’ll shift our focus to the testing of the output component of our batch jobs. To illustrate this, we will walk through an example, examining how to efficiently test the results generated by our batch job. Our example centers on a batch job that extracts a catalog of books from a database and produces a file as its output. We will then provide insights into the testing methodology, making the process more robust and less error-prone.
The Example
The batch job I’m going to use as an example in this article will read a catalog of books from a database and generate a file as its output. For the first part of this example code, we’re going to create an ItemReader that calls a service to obtain a list of books.
Disclaimer: I’m not going to list the code for the service, because we’re going to mock the values it returns in our unit test. This service could query a database or obtain the results from an API call. In this example, the implementation isn’t important.
    package com.keyhole.blog.batch;

    import java.util.List;

    import org.springframework.batch.core.annotation.BeforeStep;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.NonTransientResourceException;
    import org.springframework.batch.item.ParseException;
    import org.springframework.batch.item.UnexpectedInputException;
    import org.springframework.batch.item.support.IteratorItemReader;
    import org.springframework.beans.factory.annotation.Autowired;

    public class BookReader implements ItemReader<Book> {

        @Autowired
        private BookService bookService;

        private IteratorItemReader<Book> delegateReader;

        @BeforeStep
        public void beforeStep() {
            List<Book> bookResults = this.bookService.findAllBooksByAuthorLastName("King");
            this.delegateReader = new IteratorItemReader<>(bookResults);
        }

        @Override
        public Book read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
            return this.delegateReader.read();
        }
    }
For the Book object, I kept it simple for the sake of our example.
    package com.keyhole.blog.batch;

    public class Book {

        private String title;
        private String authorFirstName;
        private String authorLastName;
        private String publicationDate;

        public String getTitle() {
            return title;
        }

        public void setTitle(String title) {
            this.title = title;
        }

        public String getAuthorFirstName() {
            return authorFirstName;
        }

        public void setAuthorFirstName(String authorFirstName) {
            this.authorFirstName = authorFirstName;
        }

        public String getAuthorLastName() {
            return authorLastName;
        }

        public void setAuthorLastName(String authorLastName) {
            this.authorLastName = authorLastName;
        }

        public String getPublicationDate() {
            return publicationDate;
        }

        public void setPublicationDate(String publicationDate) {
            this.publicationDate = publicationDate;
        }

        @Override
        public String toString() {
            return "Book [title=" + title + ", authorFirstName=" + authorFirstName
                    + ", authorLastName=" + authorLastName + ", publicationDate=" + publicationDate + "]";
        }

        public Book(String title, String authorFirstName, String authorLastName, String publicationDate) {
            super();
            this.title = title;
            this.authorFirstName = authorFirstName;
            this.authorLastName = authorLastName;
            this.publicationDate = publicationDate;
        }
    }
Since we’re not implementing any business logic, we don’t need to create an ItemProcessor. We are going to output the results to a pipe-delimited file using a standard FlatFileItemWriter, so we can build the writer in the job configuration like this:
    @Bean
    public FlatFileItemWriter<Book> bookExtractWriter() {
        Resource outputResource = new FileSystemResource(
                Path.of(System.getProperty("java.io.tmpdir"), "/BookListing.txt"));
        return new FlatFileItemWriterBuilder<Book>().delimited()
                .delimiter("|")
                .names("title", "authorFirstName", "authorLastName", "publicationDate")
                .saveState(false)
                .name("bookExtractWriter")
                .resource(outputResource)
                .build();
    }
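With this configuration, each Book should end up as one line of four pipe-separated fields, in the order given to names(...). The sketch below mimics that with plain JDK calls, which can be handy when writing the expected file by hand. It is not Spring Batch's own line aggregator: DelimitedLineSketch and toLine are names invented for this illustration, and it assumes that no quoting or escaping is applied to field values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Spring Batch.
final class DelimitedLineSketch {

    // Joins the field values with a pipe, in the order they are passed in.
    static String toLine(String... fields) {
        return String.join("|", fields);
    }

    // Builds one delimited line per record (each record is an array of field values).
    static List<String> toLines(List<String[]> records) {
        List<String> lines = new ArrayList<>();
        for (String[] record : records) {
            lines.add(toLine(record));
        }
        return lines;
    }

    public static void main(String[] args) {
        // prints "The Shining|Stephen|King|1977-01-08"
        System.out.println(toLine("The Shining", "Stephen", "King", "1977-01-08"));
    }
}
```

Comparing a hand-built line like this against the first line of the generated file is a quick sanity check before committing an expected file to the test resources.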
And here’s the full job configuration.
    package com.keyhole.blog.batch;

    import java.nio.file.Path;

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepScope;
    import org.springframework.batch.item.file.FlatFileItemWriter;
    import org.springframework.batch.item.file.builder.FlatFileItemWriterBuilder;
    import org.springframework.batch.support.transaction.ResourcelessTransactionManager;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.core.io.FileSystemResource;
    import org.springframework.core.io.Resource;

    @Configuration
    public class BookExtractJobConfig {

        @Bean
        public BookService bookService() {
            return new BookServiceImpl();
        }

        @Bean
        @StepScope
        public BookReader bookReader() {
            return new BookReader();
        }

        @Bean
        public Step processBooksStep(StepBuilderFactory stepFactory) {
            return stepFactory.get("processBooksStep")
                    .transactionManager(new ResourcelessTransactionManager())
                    .<Book, Book>chunk(500)
                    .reader(bookReader())
                    .writer(bookExtractWriter())
                    .build();
        }

        @Bean
        public Job bookExtractJob(JobBuilderFactory jobBuilderFactory) {
            return jobBuilderFactory.get("BookExtractJob")
                    .start(processBooksStep(null))
                    .build();
        }

        @Bean
        public FlatFileItemWriter<Book> bookExtractWriter() {
            Resource outputResource = new FileSystemResource(
                    Path.of(System.getProperty("java.io.tmpdir"), "/BookListing.txt"));
            return new FlatFileItemWriterBuilder<Book>().delimited()
                    .delimiter("|")
                    .names("title", "authorFirstName", "authorLastName", "publicationDate")
                    .saveState(false)
                    .name("bookExtractWriter")
                    .resource(outputResource)
                    .build();
        }
    }
Testing
As you can see, it’s a pretty simple job that calls a service to receive data and then outputs the results to a pipe-delimited file. To test the output file, I want to introduce a couple of options.
The first is AssertFile, a class provided in Spring Batch Test that allows you to assert that two files are the same. Note that the expected file is the first argument. The code looks like this:
    AssertFile.assertFileEquals(expectedFile, actualFile);
The other option, which I think is better, is the IOUtils class from Apache Commons IO. It offers two methods for verifying that the contents of two files are identical, one of which ignores end-of-line characters. The code looks something like this:
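    // Option 1: verify that the contents of the two files match exactly.
    try (BufferedReader outputReader = Files.newBufferedReader(actualFile.toPath());
            BufferedReader acceptanceReader = Files.newBufferedReader(expectedFile.toPath())) {
        Assertions.assertThat(IOUtils.contentEquals(outputReader, acceptanceReader)).isTrue();
    }

    // Option 2: verify that the contents match but ignore EOL characters.
    // Open fresh readers here: readers already consumed by a previous comparison
    // are at end of stream, so comparing them again would always succeed.
    try (BufferedReader outputReader = Files.newBufferedReader(actualFile.toPath());
            BufferedReader acceptanceReader = Files.newBufferedReader(expectedFile.toPath())) {
        Assertions.assertThat(IOUtils.contentEqualsIgnoreEOL(outputReader, acceptanceReader)).isTrue();
    }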
This has made comparing text file output much easier than my previous approach, which was to read each file and compare them line by line or, if the output was JSON or XML, to parse the files into objects and compare the results.
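One drawback of a boolean comparison is that a failure does not say where the two files diverge. A small helper along the following lines can report the first differing line for use in a failure message. It is only a sketch that uses the JDK (Java 11 or later): FirstDifferenceSketch and firstDifferingLine are hypothetical names, not part of Spring Batch or Commons IO, and the sample file contents are made up for the demonstration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration only.
final class FirstDifferenceSketch {

    // Returns the 1-based number of the first line that differs, or -1 if the lists match.
    static int firstDifferingLine(List<String> expected, List<String> actual) {
        int shared = Math.min(expected.size(), actual.size());
        for (int i = 0; i < shared; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // Same shared prefix: the files differ only if one has extra lines.
        return expected.size() == actual.size() ? -1 : shared + 1;
    }

    // Files.readAllLines treats \n, \r\n and \r as line terminators,
    // so differences in EOL style alone are not reported.
    static int firstDifferingLine(Path expected, Path actual) throws IOException {
        return firstDifferingLine(Files.readAllLines(expected), Files.readAllLines(actual));
    }

    public static void main(String[] args) throws IOException {
        Path expected = Files.createTempFile("expected", ".txt");
        Path actual = Files.createTempFile("actual", ".txt");
        try {
            Files.writeString(expected,
                    "The Shining|Stephen|King|1977-01-08\nJoyland|Stephen|King|2013-06-04\n");
            Files.writeString(actual,
                    "The Shining|Stephen|King|1977-01-08\r\nJoyland|Stephen|King|2013-06-05\r\n");
            System.out.println(firstDifferingLine(expected, actual)); // prints 2
        } finally {
            Files.deleteIfExists(expected);
            Files.deleteIfExists(actual);
        }
    }
}
```

In a test, the returned line number could feed an AssertJ description so that a failing run points straight at the offending record.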
If we did have business logic that was being applied to the data before sending it to the FlatFileItemWriter, we could create multiple scenarios in the mocked data that would exercise all of the variations that would generate different file output results. You still may find that you could benefit from unit tests covering the business logic in services that transform the data, but this is a good place to start.
This strategy works great if the files are text-based, such as delimited, fixed-length, XML, JSON, or any other plain text data file. But what if you want to compare MS Excel files? Unfortunately, I haven’t found as simple a solution for Excel. Because Excel files are not plain text, you’ll need to read them using Apache POI (or another library suitable for reading MS Excel files) and compare the sheets, rows, and cells for equality.
For your convenience, the full code for the unit test is found at the end of this post.
Conclusion
Testing the output of Spring Batch jobs offers a significant advantage in ensuring the quality and reliability of your data processing workflows. In your Spring Batch journey, effective testing practices are pivotal, and the strategies outlined in this article can serve as a valuable starting point for building robust and reliable batch-processing applications.
I hope you found this post useful! Please let me know if you have questions in the comments section, and check out my other posts surrounding Spring and Spring Batch on the Keyhole Dev Blog.
Unit Test Full Code
    package com.keyhole.blog.batch;

    import static org.mockito.Mockito.when;

    import java.io.BufferedReader;
    import java.io.File;
    import java.nio.file.Files;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.commons.io.IOUtils;
    import org.assertj.core.api.Assertions;
    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;
    import org.springframework.batch.core.ExitStatus;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.test.AssertFile;
    import org.springframework.batch.test.JobLauncherTestUtils;
    import org.springframework.batch.test.JobRepositoryTestUtils;
    import org.springframework.batch.test.context.SpringBatchTest;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.beans.factory.annotation.Qualifier;
    import org.springframework.boot.test.mock.mockito.MockBean;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.test.context.junit.jupiter.SpringJUnitConfig;

    import com.keyhole.blog.DemoApplication;

    @SpringBatchTest
    @SpringJUnitConfig(classes = DemoApplication.class)
    class BookExtractJobTest {

        @Autowired
        @Qualifier(value = "bookExtractJob")
        private Job job;

        @Autowired
        private JobLauncherTestUtils jobLauncherTestUtils;

        @Autowired
        private JobRepositoryTestUtils jobRepositoryTestUtils;

        @MockBean
        private BookService bookService;

        @BeforeEach
        void setUp() throws Exception {
            // set up the test execution
            this.jobLauncherTestUtils.setJob(this.job); // this is optional if the job is unique
            this.jobRepositoryTestUtils.removeJobExecutions();

            // define the mocking for the book service
            when(this.bookService.findAllBooksByAuthorLastName("King")).thenReturn(mockedBookList());
        }

        private List<Book> mockedBookList() {
            // Create the objects used in the results returned by the mocked service
            List<Book> books = new ArrayList<>();
            books.add(new Book("The Shining", "Stephen", "King", "1977-01-08"));
            books.add(new Book("Dreamcatcher", "Stephen", "King", "2001-02-20"));
            books.add(new Book("Pet Sematary", "Stephen", "King", "1983-11-14"));
            books.add(new Book("The Stand", "Stephen", "King", "1978-10-03"));
            books.add(new Book("Salem's Lot", "Stephen", "King", "1975-10-17"));
            books.add(new Book("Joyland", "Stephen", "King", "2013-06-04"));
            return books;
        }

        @Test
        void test() throws Exception {
            // given
            JobParameters jobParameters = this.jobLauncherTestUtils.getUniqueJobParameters();

            // when
            JobExecution jobExecution = this.jobLauncherTestUtils.launchJob(jobParameters);

            // then
            // Verify that the job completed successfully.
            Assertions.assertThat(jobExecution.getExitStatus()).isEqualTo(ExitStatus.COMPLETED);

            // Define the files that we'll be comparing for verification.
            // Building the path from parent and child works whether or not
            // java.io.tmpdir ends with a file separator.
            File actualFile = new File(System.getProperty("java.io.tmpdir"), "BookListing.txt");
            File expectedFile = new ClassPathResource("BookListing-approval.txt").getFile();

            // Verify using the Spring Batch assertion (expected file first).
            // This is deprecated in later versions due to better options.
            AssertFile.assertFileEquals(expectedFile, actualFile);

            // Verify that the contents of the two files match exactly.
            try (BufferedReader outputReader = Files.newBufferedReader(actualFile.toPath());
                    BufferedReader acceptanceReader = Files.newBufferedReader(expectedFile.toPath())) {
                Assertions.assertThat(IOUtils.contentEquals(outputReader, acceptanceReader)).isTrue();
            }

            // If you want to verify that the contents match but ignore EOL characters,
            // open fresh readers: the ones above have already been read to the end.
            try (BufferedReader outputReader = Files.newBufferedReader(actualFile.toPath());
                    BufferedReader acceptanceReader = Files.newBufferedReader(expectedFile.toPath())) {
                Assertions.assertThat(IOUtils.contentEqualsIgnoreEOL(outputReader, acceptanceReader)).isTrue();
            }
        }
    }