morban/bilbo2

 
 

Bilbo2 : Automatic reference labeling

Bilbo2 is open source software for the automatic annotation of bibliographic references. It segments and tags the references found in an input XML document. Rewritten from scratch in Python 3, it is the successor to BILBO. Compared with its predecessor, particular attention has been paid to making it easy to add new machine learning algorithms and to test parameters. It can be used in live systems as well as for research.
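
To make "segmentation and tagging" concrete, here is a minimal sketch of the kind of transformation involved: a reference is split into tokens, each token receives a label, and the labels are written back as XML elements. The `bibl` container, the label names, and the `annotate` helper are illustrative assumptions, not Bilbo2's actual schema or API. In Bilbo2 the labels would come from a trained model rather than being supplied by hand.

```python
import xml.etree.ElementTree as ET


def annotate(tokens, labels):
    """Wrap each token in an element named after its label (hypothetical schema)."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    bibl = ET.Element("bibl")  # assumed container element for one reference
    for token, label in zip(tokens, labels):
        child = ET.SubElement(bibl, label)
        child.text = token
        child.tail = " "  # keep tokens separated in the serialized text
    return ET.tostring(bibl, encoding="unicode")


# Hand-written labels stand in for the output of a trained tagger.
tokens = ["Doe", "J.", "2019", "Title"]
labels = ["surname", "forename", "date", "title"]
print(annotate(tokens, labels))
```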

Installation

Dependencies

Bilbo2 requires some dependencies:

  • Python 3.5
  • git >= 1.7.10 (required by GitHub)
  • pip and setuptools, needed to run the Python installation
  • libxml2-dev
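
Before installing, it can help to confirm that the interpreter and the Python-level dependencies are in place. The following is a small sketch using only the standard library. The helper names are hypothetical, and the import names `pip`, `setuptools`, and `lxml` are assumed to correspond to the dependencies named in this README. System packages such as git and libxml2-dev are not checked here.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_recent_enough(minimum=(3, 5)):
    """Check the running interpreter against the minimum version listed above."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_recent_enough())
    # Import names assumed for the Python-level dependencies in this README.
    print("Missing modules:", missing_modules(["pip", "setuptools", "lxml"]))
```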

User installation

python3 setup.py install --user

The documentation includes more detailed installation instructions.

Usage

Getting started

See docs for complete CLI usage, and examples for Python interface usage.

Author and contributors

(C) Copyright 2019 OpenEdition, by Mathieu Orban. Main contributors are Yann Weber and Jérémy Trione. Special acknowledgements to Yoann Dupont (https://github.com/YoannDupont).

License

Bilbo2 is free and open source. This project is licensed under the GNU Affero General Public License; see the LICENSE.txt file for details.

External resources used by Bilbo2

Bilbo2 is currently based on Conditional Random Fields (CRFs), a machine learning technique for segmenting and labeling sequence data, and on support-vector machines (SVMs), a machine learning technique for classifying data.

As external software, it uses python-crfsuite for CRF learning and inference, and LibSVM for sequence classification.

  1. Python-crfsuite: machine learning tools to segment and label sequence data with linear-chain CRFs.
  2. LibSVM: a library for support vector machines, by Chih-Chung Chang and Chih-Jen Lin.
  3. Lxml: library for processing XML and HTML in Python.
  4. setuptools: to install Bilbo2.
  5. langdetect: language detection.
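
A linear-chain CRF labels each token from features of that token and its neighbors. The sketch below shows what such per-token features could look like for a tokenized reference. The feature names and the `token_features` and `sequence_features` helpers are illustrative assumptions, not the features Bilbo2 actually uses.

```python
def token_features(tokens, i):
    """Build a feature dict for tokens[i] (illustrative features, not Bilbo2's)."""
    token = tokens[i]
    features = {
        "lower": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "is_digit": token.isdigit(),
        "looks_like_year": token.isdigit() and len(token) == 4,
        "has_period": "." in token,
    }
    # Context features let a linear-chain model use neighboring tokens.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sequence
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True  # end of sequence
    return features


def sequence_features(tokens):
    """Feature dicts for a whole reference, one per token."""
    return [token_features(tokens, i) for i in range(len(tokens))]


reference = ["Doe", "J.", "2019", "A", "Title"]
for token, feats in zip(reference, sequence_features(reference)):
    print(token, feats)
```

With python-crfsuite, a sequence of feature dicts like this would typically be passed, together with a parallel list of labels, to `pycrfsuite.Trainer.append` for training, and to `pycrfsuite.Tagger.tag` for inference.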

Contributing

Feel free to submit ideas, bug reports, pull requests, or regular patches.

Tests

To run the tests, launch:

cd bilbo2
python3 -m bilbo.tests.tests
