narwhal-processor is a processing library that normalizes data of a known type. Current and proposed data types that can be normalized include date, country name, continent, state and province, coordinates, numeric range (altitude, depth) and scientific name.
Comments, contributions, reviews and help are welcome.
This library is still under active development. Some parts may change based on reviews, comments and usage. Do not hesitate to open an issue if you have any problems or questions.
The goal of this library is to provide a set of processing functions through a common Java interface that supports JavaBeans. This will ease the integration of the library in various biodiversity projects by providing a uniform way to access processing functions.
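As a rough illustration of what a common, bean-oriented processing interface can look like, here is a minimal sketch. Every name in it (`BeanProcessor`, `RawOccurrence`, `ProcessedOccurrence`, `AltitudeRangeProcessor`) is hypothetical and chosen for this example only; it is not the narwhal-processor API, so see the wiki for the real interfaces. The sketch assumes a raw altitude string such as `100 - 200` and normalizes it into a minimum and a maximum.

```java
// Hypothetical names throughout; this is NOT the narwhal-processor API.
interface BeanProcessor<I, O> {
    // Reads raw values from `in` and writes normalized values to `out`.
    // Returns false when no result could be produced.
    boolean process(I in, O out);
}

// Raw record exposed as a JavaBean (hypothetical).
class RawOccurrence {
    private String altitude;
    public String getAltitude() { return altitude; }
    public void setAltitude(String altitude) { this.altitude = altitude; }
}

// Normalized record exposed as a JavaBean (hypothetical).
class ProcessedOccurrence {
    private Double minAltitude;
    private Double maxAltitude;
    public Double getMinAltitude() { return minAltitude; }
    public void setMinAltitude(Double minAltitude) { this.minAltitude = minAltitude; }
    public Double getMaxAltitude() { return maxAltitude; }
    public void setMaxAltitude(Double maxAltitude) { this.maxAltitude = maxAltitude; }
}

// Normalizes a "min-max" string into two numbers (hypothetical).
// Negative bounds are not handled in this sketch.
class AltitudeRangeProcessor implements BeanProcessor<RawOccurrence, ProcessedOccurrence> {
    @Override
    public boolean process(RawOccurrence in, ProcessedOccurrence out) {
        String raw = in.getAltitude();
        if (raw == null) {
            return false;
        }
        // Split on a single dash, tolerating spaces around it.
        String[] parts = raw.trim().split("\\s*-\\s*");
        if (parts.length != 2) {
            return false;
        }
        try {
            double min = Double.parseDouble(parts[0]);
            double max = Double.parseDouble(parts[1]);
            if (min > max) {
                return false; // reversed range: leave the output bean untouched
            }
            out.setMinAltitude(min);
            out.setMaxAltitude(max);
            return true;
        } catch (NumberFormatException e) {
            return false; // non-numeric bound: no result
        }
    }
}
```

Because every processor in such a design shares one method signature, calling code can treat dates, coordinates and ranges the same way and only swap the processor instance.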
The narwhal-processor is meant to be used as a low-level processing library with few secondary or contextual validations. For example, given a date such as 1999-01-16, the output (if successful) will be parsed into day (16), month (01), and year (1999). However, if this date represents the date of collection, determining whether Jan 16, 1999 is biologically valid is out of scope. The narwhal-processor only produces results when the input can be interpreted without uncertainty.
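The date example above can be sketched as follows. This is only an illustration of the idea, not the library's own code: it uses the standard `java.time` API (Java 8 and later) rather than the JSR-310 0.6.3 pre-release, and the class names `DateParts` and `IsoDateSketch` are hypothetical. It accepts only complete ISO dates and returns nothing for any input it cannot parse.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical value holder for this sketch; not part of narwhal-processor.
final class DateParts {
    final int year;
    final int month;
    final int day;

    DateParts(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }
}

// Hypothetical helper for this sketch; not part of narwhal-processor.
final class IsoDateSketch {
    // Returns year, month and day for a complete ISO date (yyyy-MM-dd),
    // or an empty Optional when the input cannot be parsed.
    static Optional<DateParts> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            // LocalDate.parse is strict: impossible dates such as 1999-02-30 are rejected.
            LocalDate date = LocalDate.parse(raw.trim());
            return Optional.of(new DateParts(
                    date.getYear(), date.getMonthValue(), date.getDayOfMonth()));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

Note that the sketch checks only that the date exists in the calendar. Whether it is a plausible collection date is left to the caller, in line with the scope described above.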
See our wiki for the full documentation.
- GBIF Parsers 0.2 (included by Maven)
- Apache Commons BeanUtils 1.8.3 (included by Maven)
- canadensys-core 1.5 (included by Maven)
- JSR-310 0.6.3
To install JSR-310 into your local Maven repository:
mvn install:install-file -DgroupId=threeten -DartifactId=threeten -Dversion=0.6.3 -Dpackaging=jar -Dfile=lib/threeten/threeten/0.6.3/threeten-0.6.3.jar
mvn package
Unit tests
mvn test