jplaintext library

Copyright 2010 softwarewerke

Javadoc • Changes • Project Home • Last update: 2010-04-09

jplaintext is a Java library to deal with data from plain text input.

Plain text files are still around, and sometimes a Java program needs to process data from a plain text source. Reading this data into POJOs can lead to a lot of boilerplate code for parsing and mapping the values.

jplaintext not only parses the plain text but also injects the values into your domain objects.

If you have to deal with plain text files from your Java code, this library may help you.

Status

Currently this library is used in internal projects. If these projects prove it to be stable, we are going to release version 1.0 to the public. Estimated date for version 1.0: June 2010.

License

This library is published under the Apache License, Version 2.0.

Requirements

Java 1.5 or later is required.

Version

Design goals

Main Features

Next Version

Complete Example

Here is a complete usage example. It is included in the source distribution.

Data

Consider the following input file with some of the greatest music albums ever:

#         1         2         3         4         5         6
#12345678901234567890123456789012345678901234567890123456789012
RThe Beatles              Abbey Road                 1969-09-26
RThe Rolling Stones       Exile On Main Street       1972-05-26
CNeil Young               Harvest                    1972-02-01
            

The first two lines serve as a ruler that helps us get the positions right.

Control

Now we process this data with a simple example Java main program.

public class Main {

    public static void main(String[] args) throws Exception {

        PlainText p = new PlainText();

        p.registerPlainRecord(Album.class);
        p.registerPlainConverter(Date.class, new DateConverter("yyyy-mm-dd"));
        p.setCommentChar('#');

        p.addListener(new PlainListener() {

            public void read(Object obj) {
                System.err.println(obj);
            }


            public void write(Object obj) {
            }
        });

        Reader r = new InputStreamReader(Main.class.getResourceAsStream("albums.txt"));
        p.read(r);

    }
}
            

This is the output of the program:

Read band: The Beatles
R/The Beatles/Abbey Road/69
Read band: The Rolling Stones
R/The Rolling Stones/Exile On Main Street/72
            

Neil Young is not in the output. We will see why later.

So let's go step by step through this code (line numbers count from the first line of the listing):

Line 05 : Create a PlainText object, the main class of this library.

Line 07 : Register a class that we expect to read later. This example shows only one class, but you may register more if you expect different record types in the input file. The given class should have some PlainField annotations. Details follow later.

Line 08 : Register a converter. The library converts strings into Java data types or custom classes. This is done with classes that implement the PlainConverter interface. Converters for all native data types (int, Integer, float, Boolean, etc.) are already included in jplaintext. For any other type you must register a PlainConverter or the import fails. In this example we register a DateConverter (already included in jplaintext) because we expect the date in a special format (e.g. 2010-04-09).

Line 09 : Set a comment character. If the library sees this character at the very first position of a line, the line is treated as a comment and is not processed.

Line 11-20 : Register a PlainListener. Objects of classes that implement this interface are called when a record has been read. In this example we print the object to stderr whenever a record is read. You may register more than one listener.

Line 22 : Create a Reader on a file. This is a straightforward way to get some input.

Line 23 : OK, let's go! Read input from the reader, map it to objects and call the listener.
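
A note on the pattern passed to DateConverter on line 08: this page does not say how DateConverter interprets it. If it hands the pattern to java.text.SimpleDateFormat (an assumption), then lower-case mm means minute and upper-case MM means month, so "yyyy-mm-dd" would parse year and day correctly but leave the month at January. The sketch below uses only JDK classes; DatePatternCheck and parsedMonth are made-up names for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.TimeZone;

// Sketch only: plain JDK classes, no jplaintext involved.
class DatePatternCheck {

    // Parses text with the given pattern and returns the zero-based
    // Calendar month (0 = January, 8 = September).
    static int parsedMonth(String pattern, String text) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(utc);
        Calendar calendar = Calendar.getInstance(utc);
        try {
            calendar.setTime(format.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
        return calendar.get(Calendar.MONTH);
    }

    public static void main(String[] args) {
        // "mm" is minute-of-hour, so the month silently stays at January.
        System.out.println(parsedMonth("yyyy-mm-dd", "1969-09-26")); // 0
        // "MM" is month-of-year, so September is recognised.
        System.out.println(parsedMonth("yyyy-MM-dd", "1969-09-26")); // 8
    }
}
```

If that assumption holds, "yyyy-MM-dd" is probably the intended pattern. The example output above would not reveal the difference, because it prints only the year.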

Mapping

The PlainText class generates objects of the classes we register with #registerPlainRecord. These classes are annotated with information about the position and length of the fields.
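
To make "position and length" concrete, here is a minimal sketch of cutting fields out of one fixed-width record. It is not the jplaintext implementation; FixedWidthSlice and field are hypothetical names, and it assumes positions are zero-based (the leading # of the ruler sits at position 0).

```java
// Sketch only: plain JDK code, not the jplaintext implementation.
class FixedWidthSlice {

    // Cuts the field that starts at the zero-based index and spans
    // length characters, then strips the padding blanks.
    static String field(String line, int index, int length) {
        int end = Math.min(index + length, line.length());
        return line.substring(index, end).trim();
    }

    public static void main(String[] args) {
        // Rebuild the first data record with the column widths from the ruler.
        String line = String.format("%-1s%-25s%-27s%-10s",
                "R", "The Beatles", "Abbey Road", "1969-09-26");

        System.out.println(field(line, 0, 1));   // R
        System.out.println(field(line, 1, 25));  // The Beatles
        System.out.println(field(line, 26, 20)); // Abbey Road
        System.out.println(field(line, 53, 10)); // 1969-09-26
    }
}
```

The index and length pairs used here are the same ones that appear in the PlainField annotations of the domain classes.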

Continuing with the example, let's have a look at our domain classes. We declare two classes, Music and Album (is-a Music), to show that the library also works with inheritance.

Let's look at Music:

public class Music {

    public enum Genre {

        R, // Rock
        C // Country
    }

    Genre genre = Genre.R;

    String band;

    @PlainField(index = 1, length = 25)
    public void setBand(String band) {
        this.band = band;
    }

    @PlainField(index = 0, length = 1)
    public void setGenre(Genre genre) {
        this.genre = genre;
    }

    public Genre getGenre() {
        return genre;
    }

    @Override
    public String toString() {
        return genre+"/"+band.trim();
    }
}
            

Now the inherited class, Album:

@PlainFilter(value = "R")
public class Album extends Music implements PlainRecord {

    String name;

    Date pubDate;

    @PlainField(index = 26, length = 20)
    public void setName(String name) {
        this.name = name;
    }

    @PlainField(index = 53, length = 10)
    public void setPubDate(Date pubDate) {
        this.pubDate = pubDate;
    }

    public void read() {
        System.err.println("Read band: " + band);
    }

    public void write() {
    }

    @Override
    public String toString() {
        return super.toString() + "/" + name.trim() + "/" + pubDate.getYear();
    }

}
            

Album uses the same annotations as its parent class. But Album also implements PlainRecord, so we must provide implementations of #read() and #write(). PlainText invokes these methods as callbacks when it has read or written a record of this type. Note also the @PlainFilter(value = "R") annotation: the Neil Young record starts with C, not R, which is why it is missing from the program output.

Album also overrides #toString(). You can see the effect in the example program output.
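
The two-digit years in the output (69, 72) come from java.util.Date#getYear(), which is deprecated and returns the year minus 1900. The following sketch, using only JDK classes and the made-up helper names fullYear and legacyYear, contrasts it with reading the year through a Calendar.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Sketch only: shows why Date#getYear() prints 69 for 1969.
class YearOfDate {

    // Full four-digit year of a Date, read through a Calendar.
    static int fullYear(Date date) {
        Calendar calendar = new GregorianCalendar();
        calendar.setTime(date);
        return calendar.get(Calendar.YEAR);
    }

    // The deprecated accessor used in Album#toString(): year minus 1900.
    @SuppressWarnings("deprecation")
    static int legacyYear(Date date) {
        return date.getYear();
    }

    public static void main(String[] args) {
        Date abbeyRoad = new GregorianCalendar(1969, Calendar.SEPTEMBER, 26).getTime();
        System.out.println(legacyYear(abbeyRoad)); // 69
        System.out.println(fullYear(abbeyRoad));   // 1969
    }
}
```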

More Examples

jplaintext can also read CSV files like this one:

Rock;The Beatles;Abbey Road;1969-09-26
Rock;The Rolling Stones;Exile On Main Street;1972-05-26
Country;Neil Young;Harvest;1972-02-01
            

It can even use field names in the data source to map values to member variables:

pubDate;longGenre;artist;name
1969-09-26;Rock;The Beatles;Abbey Road
1972-05-26;Rock;The Rolling Stones;Exile On Main Street
1972-02-01;Country;Neil Young;Harvest
            

You can find more examples for these cases in the source distribution under src/com/softwarewerke/jplaintext/examples.
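
As a rough illustration of mapping by field name, the sketch below resolves a column through the header line instead of a fixed position. It uses only the JDK and is not taken from jplaintext; HeaderMappedCsv, columns and value are hypothetical names, and the semicolon separator is simply the one used in the sample data.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: header-driven lookup in a semicolon-separated line.
class HeaderMappedCsv {

    // Maps each column name in the header line to its zero-based position.
    static Map<String, Integer> columns(String headerLine) {
        Map<String, Integer> positions = new HashMap<String, Integer>();
        String[] names = headerLine.split(";");
        for (int i = 0; i < names.length; i++) {
            positions.put(names[i].trim(), i);
        }
        return positions;
    }

    // Returns the value of the named column in one data line.
    static String value(Map<String, Integer> header, String dataLine, String name) {
        Integer position = header.get(name);
        if (position == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        // Limit -1 keeps trailing empty fields.
        return dataLine.split(";", -1)[position];
    }

    public static void main(String[] args) {
        Map<String, Integer> header = columns("pubDate;longGenre;artist;name");
        String line = "1972-02-01;Country;Neil Young;Harvest";
        System.out.println(value(header, line, "artist"));  // Neil Young
        System.out.println(value(header, line, "pubDate")); // 1972-02-01
    }
}
```

Because the lookup goes through the header, the column order in the file can change without touching the mapping code.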

Process Reading And Writing

You may have noticed that when you work with POJOs that provide access through public methods, you need a setter to read plain text and a getter to write plain text. When you annotate the methods, there is no need to follow the setter/getter naming conventions. You may use any name for your method.

But if you want to read and write with the same domain classes, you may have to maintain duplicate annotations.

public class Album extends Music  {

    String name;

    @PlainField(index = 26, length = 20) // same as in albumName(), keep in sync !
    public void nameOfAlbum(String name) {
        this.name = name;
    }

    @PlainField(index = 26, length = 20) // same as in nameOfAlbum(), keep in sync !
    public String albumName() {
        return name;
    }
}
            

This is fine from the point of view of encapsulation, but it may cause errors because you have to keep the annotations in sync. A simple typo may break your domain model.

public class Album extends Music  {

    String name;

    @PlainField(index = 26, length = 20)
    public void nameOfAlbum(String name) {
        this.name = name;
    }

    @PlainField(index = 26, length = 2) // length!=20 !!!! TYPO !!!!
    public String albumName() {
        return name;
    }
}
            

For this reason you may declare a SetterGetter (the best name I have found so far).

A SetterGetter is a (public) method that acts as both setter and getter: it returns a value and accepts exactly one variable-length (varargs) parameter.

That said, the last example should be written as:

public class Album extends Music  {

    String name;

    // SetterGetter: no argument reads the name, one argument sets it.
    @PlainField(index = 26, length = 20)
    public String nameOfAlbum(String ... name) {
        if (name.length > 1) throw new IllegalArgumentException();
        if (name.length == 0) return this.name;
        this.name = name[0];
        return name[0];
    }
}
            

Throwing the IllegalArgumentException is optional but good practice. Don't forget to document this!
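
To show how such a method behaves from the caller's side, here is a stand-alone sketch. SetterGetterDemo is a hypothetical class that copies the method body without the @PlainField annotation, so it compiles with the JDK alone.

```java
// Sketch only: the SetterGetter idea without any jplaintext dependency.
class SetterGetterDemo {

    private String name;

    // No argument: acts as getter. One argument: acts as setter.
    // More than one argument is rejected.
    public String nameOfAlbum(String... name) {
        if (name.length > 1) {
            throw new IllegalArgumentException("expected at most one value");
        }
        if (name.length == 0) {
            return this.name;
        }
        this.name = name[0];
        return name[0];
    }

    public static void main(String[] args) {
        SetterGetterDemo album = new SetterGetterDemo();
        album.nameOfAlbum("Harvest");            // write
        System.out.println(album.nameOfAlbum()); // read, prints Harvest
        try {
            album.nameOfAlbum("Harvest", "Harvest Moon");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

The single annotated method replaces the setter/getter pair, so there is only one index and length to maintain.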