Apache OpenNLP 2.5.0 has been released. This version contains new thread-safe implementations of TokenNameFinder et al. Moreover, models for many new languages (32, as of Nov 2024) are now available. Those models are also available as Maven artifacts.
Apache OpenNLP 2.5.0 requires Java 17 and should be fully compatible with Java 21.
This task is to update the OpenNLP dependency version to 2.5.x (x >= 0). Note: Release 2.5.1 is expected in December 2024.
I was looking into this (trying to upgrade to 2.5.1) and initially ran into some failing test cases.
It looks like they were all related to the switch of the default POSTagFormat from Penn to UD. I was able to get all the tests passing by changing this line:
@msfroh Thx for checking. The option (PENN format) you chose is the quick option for updating to 2.5.x (hint: x=2 released today).
The UD format would let the Lucene project rely on a wider range of models, covering 32 languages, that we recently trained and published (see: OpenNLP models page). It might be an option for 2025 and onwards: just switch to the UD model files and the corresponding format.