Using JAXB And StaxEventItemReader To Read XML Data
source link: https://keyholesoftware.com/2021/10/05/using-jaxb-and-staxeventitemreader-to-read-xml-data/
In one of my previous Spring Batch blog articles, I wrote about the need to read a set of data, process the data, and export the transformed data into XML for consumption by another system.
In this blog, I’ll be doing the opposite. I’ll show you how to read data from an XML format instead.
Process Overview
For this particular example, we’re going to be reading an XML file that represents some basic employee contact info, parsing the XML, and logging it out to the console. Since this example is focused on the reading aspect, I won’t show any specific processing or a specific output method.
However, if you needed to transform or process the data further, you could implement a custom ItemProcessor. For the output, you might store the data in a database or simply export the records to a pipe-delimited flat file.
Step 1: Creating the XML Mapped Bean and Batch ItemReader
Let’s say we’re given an XML file named contact-data.xml. The file contains some simple data such as first name, last name, email, cell phone, and some info regarding the role of the employee in the company.
Here’s a snippet of what the XML will look like:
<EmployeeContact team="IT Operations" role="Developer" status="Full Time Employee">
    <FirstName>John</FirstName>
    <LastName>Doe</LastName>
    <EmailAddress>[email protected]</EmailAddress>
    <CellPhone>111-543-1234</CellPhone>
</EmployeeContact>
As you can see, it’s similar to the format we saw in my previous blog article for writing the XML output. The biggest difference is that I’ve added some XML attributes on the EmployeeContact element to describe the person’s role within the company. I could have achieved similar results using additional elements, but I wanted a simple example that would show how XML attributes are parsed.
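To make the attribute-versus-element distinction concrete, here is a minimal sketch that reads a similar fragment with the JDK’s plain StAX API (javax.xml.stream) rather than Spring Batch and JAXB. The class ContactStaxSketch and the method readContact are hypothetical names, and the email element is left out of the sample; in the actual example, JAXB handles this mapping for us.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the article's job: reads one contact with plain StAX.
class ContactStaxSketch {

    // Reads one EmployeeContact fragment into a flat map of name -> value.
    static Map<String, String> readContact(String xml) throws XMLStreamException {
        Map<String, String> values = new LinkedHashMap<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue; // skip text, end tags, and other events
                }
                String name = reader.getLocalName();
                if ("EmployeeContact".equals(name)) {
                    // Attributes live on the start tag itself and are read by index.
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        values.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    }
                } else {
                    // Child elements carry their value as text content.
                    values.put(name, reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<EmployeeContact team=\"IT Operations\" role=\"Developer\" status=\"Full Time Employee\">"
                + "<FirstName>John</FirstName><LastName>Doe</LastName>"
                + "<CellPhone>111-543-1234</CellPhone></EmployeeContact>";
        System.out.println(readContact(xml));
    }
}
```

Attributes are read from the start tag by index, while each child element’s value comes from its text content. The JAXB annotations in the next section express the same distinction declaratively.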
Now that we know the format of the XML file, we’ll need to create a class that uses JAXB 2.0 binding annotations. These annotations tell the marshalling engine how to map the XML onto the class.
Here’s the code for this new EmployeeContactXml class that will be used to map the XML file. Getters and setters have been removed for brevity but can be found in the full code listing at the end of the article.
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(propOrder = { "firstName", "lastName", "emailAddress", "cellPhone" })
@XmlRootElement(name = "EmployeeContact")
public class EmployeeContactXml {

    @XmlAttribute(required = true)
    protected String team;

    @XmlAttribute(required = true)
    protected String role;

    @XmlAttribute(required = true)
    protected String status;

    @XmlElement(name = "FirstName", required = true)
    protected String firstName;

    @XmlElement(name = "LastName", required = true)
    protected String lastName;

    @XmlElement(name = "EmailAddress", required = true)
    protected String emailAddress;

    @XmlElement(name = "CellPhone", required = true)
    protected String cellPhone;
}
For reference, here’s an explanation of the different annotations used in this example.
@XmlAccessorType: This defines whether the fields and properties of a class will be serialized. In this example, I’ve set the value to XmlAccessType.FIELD, which means that every non-static, non-transient field will be automatically bound to XML unless annotated by @XmlTransient. The names for the XML elements will be derived by default from the field names.
@XmlType: This allows us to define additional properties such as mapping the class to a specific schema, namespace, and specific order of children. In this specific case, we’re only using it to define the particular order of elements.
@XmlRootElement: This is used to map the class to a specific root element. By default, it derives the root element tag from the class name. For this example, we’re specifying a different name.
@XmlElement: This maps a JavaBean property to an XML element derived from the property name by default.
@XmlAttribute: This maps a JavaBean property to an XML attribute derived from the property name by default.
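As a rough illustration of the XmlAccessType.FIELD rule, the sketch below uses plain reflection to list the fields that would qualify: those that are non-static and non-transient. It is only an approximation under stated assumptions. SampleContact and FieldAccessSketch are hypothetical names, the sketch ignores @XmlTransient and inherited fields, and it is not how JAXB itself is implemented.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a mapped bean; not the article's class.
class SampleContact {
    static final String SCHEMA_VERSION = "1"; // static: skipped
    transient String cachedDisplayName;       // transient: skipped
    String team;                               // qualifies
    String firstName;                          // qualifies
}

class FieldAccessSketch {

    // Returns the names of declared fields that are neither static nor transient.
    static Set<String> boundFieldNames(Class<?> type) {
        Set<String> names = new TreeSet<>(); // sorted for a stable result
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (field.isSynthetic() || Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers)) {
                continue;
            }
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(boundFieldNames(SampleContact.class)); // prints [firstName, team]
    }
}
```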
At this point, the class carries everything the marshaller needs to unmarshal the XML and map it onto objects. With that in place, we can define the ItemReader bean in the job configuration.
For this, we’ll use the StaxEventItemReader that is provided with Spring Batch. Here’s how we define that in the Spring Batch job configuration. I’ve added some comments to note what’s going on.
@Bean
@StepScope
public StaxEventItemReader<EmployeeContactXml> employeeContactsReader() {
    // define the resource that the reader will be consuming
    Resource resource = new FileSystemResource("/c:/dev/data/contact-data.xml");

    // instantiate a new StaxEventItemReader bound to the EmployeeContactXml class
    StaxEventItemReader<EmployeeContactXml> xmlFileReader = new StaxEventItemReader<>();

    // set the resource on the xmlFileReader
    xmlFileReader.setResource(resource);

    // define the root element of the XML fragment
    xmlFileReader.setFragmentRootElementName("EmployeeContact");

    // instantiate a new Jaxb2Marshaller
    Jaxb2Marshaller xmlMarshaller = new Jaxb2Marshaller();

    // define the JAXB-annotated classes to be recognized in the JAXBContext
    xmlMarshaller.setClassesToBeBound(EmployeeContactXml.class);

    // set the unmarshaller that maps XML fragments to objects
    xmlFileReader.setUnmarshaller(xmlMarshaller);

    return xmlFileReader;
}
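Conceptually, setting the fragment root element name tells the reader where each record starts and ends inside the larger document. The following sketch, using only the JDK’s XMLEventReader, shows that idea by collecting the text of each EmployeeContact fragment. FragmentScanSketch and fragmentTexts are hypothetical names, and the real StaxEventItemReader hands each fragment to the unmarshaller instead of concatenating text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration of fragment detection; not Spring Batch code.
class FragmentScanSketch {

    // Returns the space-separated text of every fragment rooted at fragmentRootName.
    static List<String> fragmentTexts(String xml, String fragmentRootName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text events so each element's text arrives in one piece.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader events = factory.createXMLEventReader(new StringReader(xml));

        List<String> fragments = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside a fragment
        try {
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(fragmentRootName)) {
                    current = new StringBuilder(); // a new record begins
                } else if (current != null && event.isEndElement()
                        && event.asEndElement().getName().getLocalPart().equals(fragmentRootName)) {
                    fragments.add(current.toString().trim()); // the record is complete
                    current = null;
                } else if (current != null && event.isCharacters()
                        && !event.asCharacters().isWhiteSpace()) {
                    current.append(event.asCharacters().getData()).append(' ');
                }
            }
        } finally {
            events.close();
        }
        return fragments;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<EmployeeContacts>"
                + "<EmployeeContact><FirstName>John</FirstName><LastName>Smith</LastName></EmployeeContact>"
                + "<EmployeeContact><FirstName>Jane</FirstName><LastName>Doe</LastName></EmployeeContact>"
                + "</EmployeeContacts>";
        fragmentTexts(xml, "EmployeeContact").forEach(System.out::println);
    }
}
```

Anything outside a fragment, such as the enclosing EmployeeContacts element, is simply skipped, which is why only the fragment root name needs to be configured on the reader.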
Step 2: Creating an ItemWriter To Display the Results
Since the focus of this article is reading XML files, there isn’t an ItemProcessor to plug into the job’s step. To display the resulting output, I’ve created a simple ItemWriter that logs each object sent to the writer, which implicitly calls its toString method.
Here’s the code for that:
public class EmployeeContactWriter implements ItemWriter<EmployeeContactXml> {

    private static final Logger LOGGER = LoggerFactory.getLogger(EmployeeContactWriter.class);

    @Override
    public void write(List<? extends EmployeeContactXml> items) throws Exception {
        for (EmployeeContactXml contact : items) {
            LOGGER.info("Writing contact: {}", contact);
        }
    }
}
Here’s the resulting output from the writer:
[SimpleJob: [name=Exmployee-Contact-Processing-Job]] launched with the following parameters: [{}]
Executing step: [processEmployeeContactsFile]
Writing contact: ContactXml [team=IT Operations, role=Developer, status=Contractor, firstName=John, lastName=Smith, [email protected], cellPhone=111-333-4444]
Writing contact: ContactXml [team=IT Operations, role=Developer, status=Full Time Employee, firstName=John, lastName=Doe, [email protected], cellPhone=111-543-1234]
Writing contact: ContactXml [team=Human Resources, role=Manager, status=Full Time Employee, firstName=Jane, lastName=Doe, [email protected], cellPhone=111-463-8583]
Writing contact: ContactXml [team=Finance, role=Director, status=Full Time Employee, firstName=Jimmy, lastName=Lovine, [email protected], cellPhone=111-234-9367]
Step: [processEmployeeContactsFile] executed in 75ms
Job: [SimpleJob: [name=Exmployee-Contact-Processing-Job]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 83ms
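The writer receives a list rather than a single item because the step in the complete code listing is configured with chunk(100): items are read one at a time and handed to the writer in batches. Here is a simplified sketch of that loop in plain Java. ChunkLoopSketch is a hypothetical name, an Iterator stands in for the ItemReader, and the real framework also handles transactions, restart state, and error handling that are omitted here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, simplified model of chunk-oriented processing; not Spring Batch code.
class ChunkLoopSketch {

    // Pulls items one at a time and passes them to the writer in lists of at most chunkSize.
    // Returns the number of chunks written.
    static <T> int processInChunks(Iterator<T> reader, int chunkSize, Consumer<List<T>> writer) {
        int chunksWritten = 0;
        List<T> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == chunkSize) {
                writer.accept(chunk); // a full chunk goes to the writer
                chunk = new ArrayList<>(chunkSize);
                chunksWritten++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(chunk); // the last, partially filled chunk
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> contacts = List.of("John Smith", "John Doe", "Jane Doe", "Jimmy Lovine");
        // A chunk size of 3 is used here only to make the split visible.
        processInChunks(contacts.iterator(), 3, chunk -> System.out.println("Writing chunk: " + chunk));
    }
}
```

With a chunk size of 100 and only four contacts in the file, the job above delivers all four records to the writer in a single call.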
In Summary
As you can see, it’s fairly straightforward to consume XML-based data using the StaxEventItemReader provided with Spring Batch. Although this was a pretty simple XML document, it’s entirely possible to use the same pattern to read much more complex XML data.
In a real-world example, this job would likely have implemented a custom ItemProcessor to further transform or enrich the data that was consumed from the XML. Spring Batch also provides several out-of-the-box ItemWriter implementations that could be plugged into this job configuration in place of the logging writer.
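For a sense of what such a processor might look like, here is a small sketch that keeps only full-time employees and normalizes the email address. To keep it runnable without Spring on the classpath, it declares a local Processor interface with the same shape as Spring Batch’s ItemProcessor<I, O>. Contact, FullTimeOnlyProcessor, the filtering rule, and the example.com addresses are all assumptions made for illustration.

```java
import java.util.Locale;

// Local stand-in with the same shape as Spring Batch's ItemProcessor<I, O>;
// declared here only so the sketch compiles without Spring on the classpath.
interface Processor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical, trimmed-down contact; the article's bean has more fields.
class Contact {
    final String status;
    final String emailAddress;

    Contact(String status, String emailAddress) {
        this.status = status;
        this.emailAddress = emailAddress;
    }
}

// Hypothetical processor: filters out non-full-time contacts and tidies the email.
class FullTimeOnlyProcessor implements Processor<Contact, Contact> {

    @Override
    public Contact process(Contact item) {
        if (!"Full Time Employee".equals(item.status)) {
            // In Spring Batch, returning null from a processor filters the item out.
            return null;
        }
        String normalizedEmail = item.emailAddress.trim().toLowerCase(Locale.ROOT);
        return new Contact(item.status, normalizedEmail);
    }

    public static void main(String[] args) {
        FullTimeOnlyProcessor processor = new FullTimeOnlyProcessor();
        Contact kept = processor.process(new Contact("Full Time Employee", " Jane.Doe@Example.com "));
        System.out.println(kept.emailAddress);
        System.out.println(processor.process(new Contact("Contractor", "j.smith@example.com")));
    }
}
```

In the actual job, a class like this would implement org.springframework.batch.item.ItemProcessor and be registered on the step between the reader and the writer.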
Thank you for reading, and please let me know if you have any questions in the comments below!
Complete Code Listing
Just so you have it all in one place for convenient access, here is the complete code listing for the example. I hope you find it useful!
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<EmployeeContacts>
    <EmployeeContact team="IT Operations" role="Developer" status="Contractor">
        <FirstName>John</FirstName>
        <LastName>Smith</LastName>
        <EmailAddress>[email protected]</EmailAddress>
        <CellPhone>111-333-4444</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="IT Operations" role="Developer" status="Full Time Employee">
        <FirstName>John</FirstName>
        <LastName>Doe</LastName>
        <EmailAddress>[email protected]</EmailAddress>
        <CellPhone>111-543-1234</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="Human Resources" role="Manager" status="Full Time Employee">
        <FirstName>Jane</FirstName>
        <LastName>Doe</LastName>
        <EmailAddress>[email protected]</EmailAddress>
        <CellPhone>111-463-8583</CellPhone>
    </EmployeeContact>
    <EmployeeContact team="Finance" role="Director" status="Full Time Employee">
        <FirstName>Jimmy</FirstName>
        <LastName>Lovine</LastName>
        <EmailAddress>[email protected]</EmailAddress>
        <CellPhone>111-234-9367</CellPhone>
    </EmployeeContact>
</EmployeeContacts>
Job Configuration
package com.example.demo.batch.xml.read;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

@Configuration
@EnableBatchProcessing
public class EmployeeContactProcessingJobConfig {

    public static final String JOB_NAME = "Exmployee-Contact-Processing-Job";

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Step processEmployeeContactsFile() {
        return this.stepBuilderFactory.get("processEmployeeContactsFile")
                .<EmployeeContactXml, EmployeeContactXml>chunk(100)
                .reader(employeeContactsReader())
                .writer(employeeContactsWriter())
                .build();
    }

    @Bean
    public Job processEmployeeContactsFileJob() {
        return this.jobBuilderFactory.get(JOB_NAME)
                .start(processEmployeeContactsFile())
                .build();
    }

    @Bean
    @StepScope
    public StaxEventItemReader<EmployeeContactXml> employeeContactsReader() {
        // define the resource that the reader will be consuming
        Resource resource = new FileSystemResource("/c:/dev/data/contact-data.xml");

        // instantiate a new StaxEventItemReader bound to the EmployeeContactXml class
        StaxEventItemReader<EmployeeContactXml> xmlFileReader = new StaxEventItemReader<>();

        // set the resource on the xmlFileReader
        xmlFileReader.setResource(resource);

        // define the root element of the XML fragment
        xmlFileReader.setFragmentRootElementName("EmployeeContact");

        // instantiate a new Jaxb2Marshaller
        Jaxb2Marshaller xmlMarshaller = new Jaxb2Marshaller();

        // define the JAXB-annotated classes to be recognized in the JAXBContext
        xmlMarshaller.setClassesToBeBound(EmployeeContactXml.class);

        // set the unmarshaller that maps XML fragments to objects
        xmlFileReader.setUnmarshaller(xmlMarshaller);

        return xmlFileReader;
    }

    @Bean
    @StepScope
    public EmployeeContactWriter employeeContactsWriter() {
        return new EmployeeContactWriter();
    }
}
EmployeeContactXml
package com.example.demo.batch.xml.read;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(propOrder = { "firstName", "lastName", "emailAddress", "cellPhone" })
@XmlRootElement(name = "EmployeeContact")
public class EmployeeContactXml {

    @XmlAttribute(required = true)
    protected String team;

    @XmlAttribute(required = true)
    protected String role;

    @XmlAttribute(required = true)
    protected String status;

    @XmlElement(name = "FirstName", required = true)
    protected String firstName;

    @XmlElement(name = "LastName", required = true)
    protected String lastName;

    @XmlElement(name = "EmailAddress", required = true)
    protected String emailAddress;

    @XmlElement(name = "CellPhone", required = true)
    protected String cellPhone;

    public String getCellPhone() {
        return cellPhone;
    }

    public String getEmailAddress() {
        return emailAddress;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public String getRole() {
        return role;
    }

    public String getStatus() {
        return status;
    }

    public String getTeam() {
        return team;
    }

    public void setCellPhone(String cellPhone) {
        this.cellPhone = cellPhone;
    }

    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public void setRole(String role) {
        this.role = role;
    }

    public void setStatus(String status) {
        this.status = status;
    }

    public void setTeam(String team) {
        this.team = team;
    }

    @Override
    public String toString() {
        return "ContactXml [team=" + team + ", role=" + role + ", status=" + status
                + ", firstName=" + firstName + ", lastName=" + lastName
                + ", emailAddress=" + emailAddress + ", cellPhone=" + cellPhone + "]";
    }
}
EmployeeContactWriter
package com.example.demo.batch.xml.read;

import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.item.ItemWriter;

public class EmployeeContactWriter implements ItemWriter<EmployeeContactXml> {

    private static final Logger LOGGER = LoggerFactory.getLogger(EmployeeContactWriter.class);

    @Override
    public void write(List<? extends EmployeeContactXml> items) throws Exception {
        for (EmployeeContactXml contact : items) {
            LOGGER.info("Writing contact: {}", contact);
        }
    }
}