Binary JSON with bson4jackson

source link: https://michelkraemer.com/binary-json-with-bson4jackson/

Recently, JSON has become an excellent alternative to XML. But most JSON parsers written in Java are still rather slow. While searching for faster libraries, I found two things: BSON and Jackson.

BSON is binary-encoded JSON. The format has been designed with fast machine readability in mind. BSON has gained prominence as the main data exchange format for the document-oriented database management system MongoDB. According to the JVM serializers benchmark, Jackson is one of the fastest JSON processors available. Apart from that, Jackson allows writing custom extensions. This feature can be used to add further data exchange formats.

bson4jackson

This is where bson4jackson steps in. The library extends Jackson with the capability to read and write BSON documents. Since bson4jackson is fully integrated, you can use Jackson’s very nice API to serialize simple POJOs. Consider the following class:

public class Person {
  private String _name;

  public void setName(String name) {
    _name = name;
  }

  public String getName() {
    return _name;
  }
}

You may use the ObjectMapper to quickly serialize objects:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.databind.ObjectMapper;
import de.undercouch.bson4jackson.BsonFactory;

public class ObjectMapperSample {
  public static void main(String[] args) throws Exception {
    //create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");

    //serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectMapper mapper = new ObjectMapper(new BsonFactory());
    mapper.writeValue(baos, bob);

    //deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    Person clone_of_bob = mapper.readValue(bais, Person.class);

    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Or you may use Jackson’s streaming API and serialize the object manually:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import de.undercouch.bson4jackson.BsonFactory;

public class ManualSample {
  public static void main(String[] args) throws Exception {
    //create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");

    //create factory
    BsonFactory factory = new BsonFactory();

    //serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
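    //createGenerator replaces createJsonGenerator, deprecated since Jackson 2.2
    JsonGenerator gen = factory.createGenerator(baos);
    gen.writeStartObject();
    gen.writeFieldName("name");
    gen.writeString(bob.getName());
    //explicitly close the object opened by writeStartObject()
    gen.writeEndObject();
    gen.close();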

    //deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    //createParser replaces createJsonParser, deprecated since Jackson 2.2
    JsonParser parser = factory.createParser(bais);
    Person clone_of_bob = new Person();
    parser.nextToken();
    while (parser.nextToken() != JsonToken.END_OBJECT) {
      String fieldname = parser.getCurrentName();
      parser.nextToken();
      if ("name".equals(fieldname)) {
        clone_of_bob.setName(parser.getText());
      }
    }

    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Optimized streaming

One disadvantage of BSON is the fact that each document begins with a number denoting the document’s length. Because this length has to be known before the first byte is written, bson4jackson is forced to buffer the whole document before it can be written to the OutputStream. However, bson4jackson’s parser ignores the length field, so you may also leave it empty. To do so, create the BsonFactory as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonGenerator.Feature.ENABLE_STREAMING);

This trick can increase the serialization performance for large documents and reduce the memory footprint a lot. The official MongoDB Java driver also ignores the length field. So, you may also use this optimization if your bson4jackson-created documents are to be read by the MongoDB driver.
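To make the role of the length field more tangible, here is a minimal sketch that encodes the document {"name": "Bob"} by hand using only the JDK. It does not use bson4jackson and is not how the library is implemented; the class and method names are made up for illustration. The layout follows the BSON specification: a little-endian int32 holding the total size, the elements, and a terminating zero byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of bson4jackson)
class BsonLengthPrefixSketch {
  // Encodes {"name": value} as a single BSON document
  static byte[] encodeNameDocument(String value) {
    byte[] key = "name".getBytes(StandardCharsets.UTF_8);
    byte[] text = value.getBytes(StandardCharsets.UTF_8);

    // The body has to be assembled first because its size is unknown until then
    ByteArrayOutputStream body = new ByteArrayOutputStream();
    body.write(0x02);                          // element type: UTF-8 string
    body.write(key, 0, key.length);            // field name ...
    body.write(0x00);                          // ... as a zero-terminated cstring
    body.write(int32(text.length + 1), 0, 4);  // string size including its terminator
    body.write(text, 0, text.length);
    body.write(0x00);                          // string terminator
    body.write(0x00);                          // document terminator

    // Only now can the length prefix be written in front of the body
    byte[] bodyBytes = body.toByteArray();
    ByteBuffer doc = ByteBuffer.allocate(4 + bodyBytes.length)
      .order(ByteOrder.LITTLE_ENDIAN);
    doc.putInt(4 + bodyBytes.length);          // total size, including these 4 bytes
    doc.put(bodyBytes);
    return doc.array();
  }

  // Converts an int to its 4-byte little-endian representation
  static byte[] int32(int value) {
    return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN)
      .putInt(value).array();
  }

  // Reads the length prefix back from the first four bytes
  static int readDocumentLength(byte[] doc) {
    return ByteBuffer.wrap(doc, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }

  public static void main(String[] args) {
    byte[] doc = encodeNameDocument("Bob");
    // prints "19 bytes, length prefix = 19"
    System.out.println(doc.length + " bytes, length prefix = "
      + readDocumentLength(doc));
  }
}
```

The intermediate ByteArrayOutputStream is exactly the kind of buffering that a writer can skip when the reader does not rely on the length field.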

Performance

Version 1.1.0 of bson4jackson introduced support for Jackson 1.7 as well as a lot of performance improvements. As of January 2011, bson4jackson is much faster than the official MongoDB driver for Java. For serialization, this is only true when using the streaming API, since Jackson’s ObjectMapper adds a little bit of overhead (the MongoDB driver also uses a kind of streaming API). Deserialization is always faster. The latest benchmark results can be reviewed on the following website:

https://github.com/eishay/jvm-serializers/wiki

Compatibility with MongoDB

In version 1.2.0, bson4jackson’s compatibility with MongoDB was improved a lot. Thanks to a contribution by James Roper, the BsonParser class now supports the new HONOR_DOCUMENT_LENGTH feature, which makes the parser honor the first 4 bytes of a document, which usually contain the document’s size. Of course, this only works if BsonGenerator.Feature.ENABLE_STREAMING has not been enabled during document generation.

This feature can be useful for reading consecutive documents from an input stream produced by MongoDB. You can enable it as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonParser.Feature.HONOR_DOCUMENT_LENGTH);
//createParser replaces createJsonParser, deprecated since Jackson 2.2
BsonParser parser = (BsonParser)fac.createParser(...);
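Why the length prefix matters for consecutive documents can be shown with a small JDK-only sketch. It is not bson4jackson’s implementation, and the class and method names are hypothetical. It merely splits a stream of back-to-back BSON documents at the boundaries given by each document’s first 4 bytes, which only works if the writer filled in the length field.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of bson4jackson)
class ConsecutiveDocumentsSketch {
  // Splits a stream of back-to-back BSON documents into one array per document
  static List<byte[]> splitDocuments(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    List<byte[]> documents = new ArrayList<>();
    int first;
    while ((first = data.read()) >= 0) {
      // The length is a little-endian int32 and includes its own four bytes
      int length = first
        | (data.readUnsignedByte() << 8)
        | (data.readUnsignedByte() << 16)
        | (data.readUnsignedByte() << 24);
      if (length < 5) {
        // The smallest valid document is the empty one: 4 length bytes + terminator
        throw new IOException("Invalid BSON document length: " + length);
      }
      byte[] document = new byte[length];
      // Put the length prefix back so the array is a complete document
      document[0] = (byte) length;
      document[1] = (byte) (length >>> 8);
      document[2] = (byte) (length >>> 16);
      document[3] = (byte) (length >>> 24);
      // Read exactly the remaining bytes of this document
      data.readFully(document, 4, length - 4);
      documents.add(document);
    }
    return documents;
  }

  public static void main(String[] args) throws IOException {
    // {"name": "Bob"} (19 bytes) followed by an empty document (5 bytes)
    byte[] stream = {
      0x13, 0, 0, 0, 0x02, 'n', 'a', 'm', 'e', 0, 4, 0, 0, 0, 'B', 'o', 'b', 0, 0,
      0x05, 0, 0, 0, 0
    };
    List<byte[]> documents = splitDocuments(new ByteArrayInputStream(stream));
    // prints "2 documents"
    System.out.println(documents.size() + " documents");
  }
}
```

If the first document had been written with an empty length field, the loop above could not tell where it ends without parsing its contents.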

Compatibility with Jackson

bson4jackson 2.x is compatible with Jackson 2.x and higher. Due to some compatibility issues, both libraries’ major and minor version numbers have to match. That means you have to use at least bson4jackson 2.1 if you use Jackson 2.1, bson4jackson 2.2 if you use Jackson 2.2, etc. I will try to keep bson4jackson up to date. If there is a compatibility issue I will update bson4jackson, usually within a couple of days after the new Jackson version has been released.

Here’s the compatibility matrix for the current library versions:

                     Jackson 2.7.x   Jackson 2.6.x   Jackson 2.5.x
bson4jackson 2.7.x   Yes             Yes             Yes
bson4jackson 2.6.x   No              Yes             Yes
bson4jackson 2.5.x   No              No              Yes

If you’re looking for a version compatible with Jackson 1.x, please use bson4jackson 1.3. It’s the last version for the 1.x branch. bson4jackson 1.3 is compatible with Jackson 1.7 up to 1.9.
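The rule for the 2.x line can be written down as a small check. This is only a sketch under an assumption read off the "at least" wording and the matrix: a pairing works if both major versions are 2 and the bson4jackson minor version is not lower than the Jackson minor version. The helper is hypothetical, is not part of either library, and does not cover the 1.x branch.

```java
// Hypothetical helper class, for illustration only (not part of bson4jackson)
class CompatibilitySketch {
  // Assumed rule for the 2.x line: same major version (2) and a bson4jackson
  // minor version that is at least the Jackson minor version
  static boolean isCompatible(String bson4jacksonVersion, String jacksonVersion) {
    int[] b = majorMinor(bson4jacksonVersion);
    int[] j = majorMinor(jacksonVersion);
    return b[0] == 2 && j[0] == 2 && b[1] >= j[1];
  }

  // Extracts the major and minor numbers from a version such as "2.7.1"
  static int[] majorMinor(String version) {
    String[] parts = version.split("\\.");
    return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
  }

  public static void main(String[] args) {
    System.out.println(isCompatible("2.7.0", "2.5.0")); // prints "true"
    System.out.println(isCompatible("2.5.0", "2.7.0")); // prints "false"
  }
}
```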

Download

Pre-compiled binaries

Pre-compiled binary files of bson4jackson can be downloaded from Maven Central. Additionally, you will need a copy of Jackson to start right away.

Maven/Gradle/buildr/sbt

Alternatively, you may also use Maven to download bson4jackson:

<dependencies>
  <dependency>
    <groupId>de.undercouch</groupId>
    <artifactId>bson4jackson</artifactId>
    <version>2.9.2</version>
  </dependency>
</dependencies>

For Gradle you may use the following snippet (the implementation configuration replaces compile, which was removed in Gradle 7):

implementation 'de.undercouch:bson4jackson:2.9.2'

For buildr use the following snippet:

compile.with 'de.undercouch:bson4jackson:jar:2.9.2'

If you’re using sbt, you may add the following line to your project:

val bson4jackson = "de.undercouch" % "bson4jackson" % "2.9.2"

License

bson4jackson is licensed under the Apache License, Version 2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

