Waikato/acceleratedWEKA

acceleratedWEKA - easy GPU support using WEKA

GPLv3 license

contributions welcome

Accelerated WEKA unifies WEKA, a well-known open-source Java software package, with new technologies that leverage the GPU to shorten the execution time of ML algorithms. It offers two benefits aimed at users without expertise in system configuration and coding: an easy installation and a GUI that guides the configuration and execution of ML tasks. Accelerated WEKA is a collection of packages available for WEKA (e.g., WDL4J, wekaPython, and wekaRAPIDS). It can be easily installed, and anyone can extend it to support new tools and algorithms.

Installing acceleratedWEKA

Accelerated WEKA was designed to be easy to install. It uses a conda environment to simplify the process, which makes it straightforward to use from the beginning. Once you have conda installed, Accelerated WEKA can be installed by issuing the following two commands:

Creating the conda environment

$ conda create --solver=libmamba -n accelweka -c rapidsai -c conda-forge -c nvidia -c waikato weka

Activating the conda environment

$ conda activate accelweka

Conda takes care of the dependencies: the required libraries are installed and configured automatically, so you do not need to go through any manual setup.
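
If `weka` does not start as expected, a quick pre-flight check can help narrow down whether the environment is active. The following is a minimal sketch, not part of Accelerated WEKA: the function names `accelweka_env_active` and `weka_on_path` are hypothetical, and the check assumes the environment was created under the name `accelweka` as in the command above. It relies on the `CONDA_DEFAULT_ENV` variable, which conda sets to the active environment's name on activation.

```shell
#!/usr/bin/env bash
# Pre-flight check before launching `weka` (illustrative sketch).
# Assumption: the environment is named "accelweka", as in the create command.

# Succeeds only when the given conda environment is the active one.
# conda sets CONDA_DEFAULT_ENV to the active environment's name.
accelweka_env_active() {
    local expected="${1:-accelweka}"
    [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]
}

# Succeeds only when a `weka` launcher can be found on PATH.
weka_on_path() {
    command -v weka >/dev/null 2>&1
}

if accelweka_env_active; then
    echo "conda environment 'accelweka' is active"
else
    echo "conda environment 'accelweka' is not active; run: conda activate accelweka"
fi

if weka_on_path; then
    echo "weka launcher found at: $(command -v weka)"
else
    echo "weka launcher not found on PATH"
fi
```

If the environment was created under a different name, pass that name as the first argument to `accelweka_env_active`.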

After finishing the installation and activation steps, you can start using Accelerated WEKA immediately by launching the WEKA GUI:

$ weka

The WEKA package is located in:

/path/to/conda/env/pkgs/weka

Using acceleratedWEKA

As with most of WEKA, Accelerated WEKA's functionality is accessible in two ways:

  • Using the WEKA workbench GUI
  • Via the command-line interface

Both ways are explained in the getting-started documentation.

A simple example that generates a dataset and trains a Support Vector Machine on it looks like the following:

$ weka -main weka.Run .RandomRBF -n 10000 -a 5000 > RBFa5kn10k.arff

$ weka -memory 12g -main weka.Run weka.classifiers.rapids.CuMLClassifier -split-percentage 80 -learner SVC -t $(pwd)/RBFa5kn10k.arff -py-command python

which results in:

Options: -learner SVC -py-command python 

=== Classifier model (full training set) ===

SVC()

Time taken to build model: 24.93 seconds

Time taken to test model on training data: 13.3 seconds

=== Error on training data ===

Correctly Classified Instances       10000              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1     
Mean absolute error                      0     
Root mean squared error                  0     
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances            10000     


=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c0
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c1
Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     


=== Confusion Matrix ===

    a    b   <-- classified as
 5185    0 |    a = c0
    0 4815 |    b = c1

Time taken to test model on test split: 1.13 seconds

=== Error on test split ===

Correctly Classified Instances        2000              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1     
Mean absolute error                      0     
Root mean squared error                  0     
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances             2000     


=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c0
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     c1
Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     


=== Confusion Matrix ===

    a    b   <-- classified as
 1041    0 |    a = c0
    0  959 |    b = c1
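
When comparing several runs, it can be convenient to pull only the headline numbers out of a report like the one above. The following is a minimal sketch, assuming the report keeps the layout shown here (an `=== Error on ... ===` header followed by a `Correctly Classified Instances` line with the count and the percentage). The function name `weka_accuracy_by_section` is hypothetical and not part of WEKA.

```shell
#!/usr/bin/env bash
# Summarise a WEKA evaluation report read from standard input (sketch).
# Assumption: the report uses the layout shown above, where the fields of
# the relevant line are: Correctly Classified Instances <count> <percent> %
weka_accuracy_by_section() {
    awk '
        # Remember which evaluation section we are in, e.g. "test split".
        /^=== Error on / {
            section = $0
            gsub(/^=== Error on | ===[ \t]*$/, "", section)
        }
        # Field 4 is the instance count, field 5 the percentage.
        /^Correctly Classified Instances/ {
            printf "%s: %s correct (%s%%)\n", section, $4, $5
        }
    '
}
```

For example, if the output of the second command was redirected to a file (say, a hypothetical `report.txt`), running `weka_accuracy_by_section < report.txt` prints one summary line per evaluation section.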

Documentation

The full documentation, including installation instructions and getting-started guides, is available at https://waikato.github.io/acceleratedWEKA/.

Related projects

Misc.

Original code by Justin Liu