Boyd Duffee

Posted on DEV Community • Originally published at perldatascience.wordpress.com

Neural Networks and Perl

Perceptron

Q: What is the State of the Art for creating Artificial Neural Networks with Perl?

Why would I want to use an ANN in the first place? Well, maybe I have some crime/unusual incident data that I want to correlate with the Phases of the Moon to test the Lunar Effect, but the data is noisy, and the effect is non-linear or confounded by weather. If, for whatever reason, you want to “learn” a general pattern going from input to output, neural networks are one more method in your data science toolbox.

A search of CPAN for Neural Networks yields one page of results for you to sift through. The back propagation algorithm is a nice exercise in programming and it attracted a few attempts at the beginning of the century, starting with Statistics::LTU in 1997 before there was an AI namespace in CPAN. Neural networks then get their own namespace, leading to AI::NeuralNet::BackProp, AI::NeuralNet::Mesh, AI::NeuralNet::Simple (for those wanting a gentle introduction to AI). Perl isn’t one for naming rigidity, so there’s also AI::Perceptron, AI::NNFlex, AI::NNEasy and AI::Nerl::Network (love the speeling). AI::LibNeural is the first module in this list to wrap an external C++ library for use with Perl.

Most of these have been given the thumbs up (look for ++ icons near the name) by interested Perl users to indicate that the module has been of some use to them. It means the documentation is there, and it installs and works for them. Is it right for you? NeilB puts a lot of work into his reviews, but hasn’t scratched the AI itch yet, so I’ll have to give one a try.

Sometimes trawling the CPAN dredges up interesting results you weren’t thinking about. I had no idea we had AI::PSO for running Particle Swarm Optimizations, AI::DecisionTree or AI::Categorizer to help with categorization tasks and AI::PredictionClient for TensorFlow Serving. Maybe I’ll come back to these one day. Searching specifically for [Py]Torch gets you almost nothing, but I did find AI::TensorFlow::Libtensorflow which provides bindings for the libtensorflow deep learning library.

MXNet

A flexible and efficient library for Deep Learning

AI::MXNet gets lots of love from users (not surprising given the popularity of convolutional neural networks). With a recent update for recurrent neural networks (RNN) in June 2023 and the weight of an Apache project behind the underlying library, it should be the obvious choice. But check out the project page and decision-making disaster strikes!

MXNet had a lot of work on it, but then was retired in Sep 2023 because the Project Management Committee were unresponsive over several months, having uploaded their consciousnesses to a datacube in Iceland or maybe they just went on to other things because of … reasons.

It should still be perfectly fine to use. That Apache project had 87 contributors, so I expect it to be feature-rich and generally bug-free. Any bugs in the Perl module could be reported/fixed and you always have the source code for the library to hack on to suit your needs. I’ll skip it this time because I’m really only after a simple ANN, not the whole Deep Learning ecosystem, and I couldn’t find the package in the Fedora repository (adding the extra friction of building it myself).

FANN

A Fast Artificial Neural Network

FANN has been around for over 15 years and is generally faster to train and run than either TensorFlow or PyTorch. The speed and lightweight nature make it ideal for embedded systems. Its smaller community may have an impact on your choice. From my 10 minute inspection, AI::FANN seemed to be the easier to get up to speed with. It had a short, simple example at the top of the docs that I could understand and run without much fuss.

In contrast, AI::MXNet leads with a Convolutional Neural Net (CNN) for recognizing hand-written digits in the MNIST dataset. It gives you a feel for the depth of the feature set, at the risk of intimidating the casual reader. Mind you, if I was looking for image classification (where CNNs shine) or treating history as an input (using RNNs as mentioned above), I’d put the time in going through AI::MXNet.

The downside to the original FANN site is that the documentation consists of a series of blog posts that tell you all the things you can do, but not how to do them. Your best bet is to read the examples’ source code, like all the other C programmers out there.

Getting Started

Installation was easy. You just need the FANN development libraries (header files, etc.) and the Perl module that interfaces with them. You could build from source or get libfann-dev on Ubuntu. For me on Fedora, it was just a matter of

dnf install fann-devel
cpanm AI::FANN

(See Tools for using cpanm)

To get started, I tried out the XOR example in the docs. XOR is a classic example of how a multi-layered perceptron (MLP) can tackle problems that are not linearly separable. The hidden layers of the MLP can solve problems inaccessible to single layer perceptrons. It gave me confidence in using a data structure to initialize the network and importing data from a file. An hour later, I was already scratching the itch that drew me to neural networks in the first place.
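For flavour, here is a sketch along the lines of the example at the top of the AI::FANN docs (the layer sizes, training parameters and filenames here are illustrative, so check the current docs before copying):

```perl
use strict;
use warnings;
use AI::FANN qw(:all);

# 2 inputs, one hidden layer of 3 neurons, 1 output
my $ann = AI::FANN->new_standard(2, 3, 1);

# the symmetric sigmoid keeps values in [-1, 1], matching the XOR data below
$ann->hidden_activation_function(FANN_SIGMOID_SYMMETRIC);
$ann->output_activation_function(FANN_SIGMOID_SYMMETRIC);

# train until the error drops below 0.001 or 500_000 epochs pass,
# reporting progress every 1000 epochs
$ann->train_on_file('xor.data', 500_000, 1000, 0.001);
$ann->save('xor.ann');

# a trained network should return something close to 1 for (-1, 1)
my $out = $ann->run([-1, 1]);
printf "xor(-1, 1) ~ %.3f\n", $out->[0];
```

The `xor.data` file it trains on is the datafile format covered further down.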

Network design and evaluation

A nice introduction is FANN’s step-by-step guide which will take you through a bit about learning rates and activation functions as you consider how to build and tweak your first neural network. There are few heuristics to go by, so just start playing around until you get a result.

Be careful: too many neurons in the hidden layers will lead to overfitting of your data. You’ll end up with a network that can reproduce the training data perfectly, but fails to learn the underlying signal you wanted to discover. You might start with something between the number of input and output neurons. And be aware that machine learning algorithms are data-hungry.

Activation functions can affect how long it takes to train your network. Previous experience with other neural network tools way back in 2005 taught us the importance of normalizing the input, ideally to a range of [-1, 1], because most of the training time was spent just adjusting the weights to the point where the real learning could begin. Use your own judgement.
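As an illustration, here is a minimal pure-Perl sketch of min-max scaling each input column to [-1, 1] (the helper name and data layout are my own):

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Scale each column of the samples to [-1, 1] with min-max normalization.
# Assumes every row has the same number of columns and each column varies.
sub normalize_columns {
    my @rows = @_;
    my @scaled;
    for my $col (0 .. $#{ $rows[0] }) {
        my @values = map { $_->[$col] } @rows;
        my ($lo, $hi) = (min(@values), max(@values));
        for my $i (0 .. $#rows) {
            $scaled[$i][$col] = 2 * ($rows[$i][$col] - $lo) / ($hi - $lo) - 1;
        }
    }
    return @scaled;
}

my @scaled = normalize_columns([0, 10], [5, 20], [10, 30]);
# the first column (0, 5, 10) becomes (-1, 0, 1)
```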

While we see the train_on_data and run methods in the example, you have to look further down in the docs for the test method, which you’ll need to evaluate the trained network. The MSE method will tell you the Mean Squared Error for your model, and lower values are better. There’s no documentation for it yet, but it should do what it says on the tin.
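Pulling those methods together, an evaluation loop might look something like this (the TrainData accessors here are my reading of the AI::FANN docs, so double-check the method names against the current documentation):

```perl
use strict;
use warnings;
use AI::FANN qw(:all);

my $ann  = AI::FANN->new_from_file('xor.ann');
my $data = AI::FANN::TrainData->new_from_file('xor.data');

# test() runs each pair through the network and accumulates the error;
# MSE() then reports the mean squared error over everything tested so far
$ann->reset_MSE;
for my $i (0 .. $data->length - 1) {
    my ($input, $output) = $data->data($i);
    $ann->test($input, $output);
}
printf "MSE on the test set: %.4f\n", $ann->MSE;
```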

A network that gives you rubbish is no good, so we need to evaluate how well it has learned from the training data. The usual process is to split the dataset into training and testing sets, reserving 20-30% of the data for testing. Once the network has finished training, its weights are fixed and it is run on the testing data, with the network’s output compared against the expected output given in the dataset.
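That split takes only a few lines of plain Perl; this sketch (helper name my own) shuffles the data and holds back a fraction for testing:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Shuffle the dataset and reserve a fraction (here 25%) for testing.
sub train_test_split {
    my ($fraction, @data) = @_;
    my @shuffled = shuffle(@data);
    my $n_test   = int($fraction * @shuffled);
    my @test     = splice @shuffled, 0, $n_test;
    return (\@shuffled, \@test);    # (training set, testing set)
}

my @dataset = map { [ [$_], [$_ % 2] ] } 1 .. 100;
my ($train, $test) = train_test_split(0.25, @dataset);
printf "%d training pairs, %d testing pairs\n", scalar @$train, scalar @$test;
# prints "75 training pairs, 25 testing pairs"
```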

Cross-validation is another popular method of evaluation, splitting the dataset into 10 subsets, training on 9 and testing on the 10th, then rotating which subset is held out so that every data point gets used for testing and you get a more robust estimate of the network’s performance. Once you are satisfied with the performance of your network, you are ready to run it on live data. Just remember to sanity check the results while you build trust in the responses.
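Here is one way to generate the rotating splits in plain Perl (a sketch, with names of my own choosing):

```perl
use strict;
use warnings;

# Split @data into $k folds; for each fold, return [training set, testing set]
# with that fold held out for testing and the other folds used for training.
sub k_fold_splits {
    my ($k, @data) = @_;
    my @folds;
    push @{ $folds[ $_ % $k ] }, $data[$_] for 0 .. $#data;

    my @splits;
    for my $held_out (0 .. $k - 1) {
        my @train = map { @{ $folds[$_] } } grep { $_ != $held_out } 0 .. $k - 1;
        push @splits, [ \@train, $folds[$held_out] ];
    }
    return @splits;
}

my @splits = k_fold_splits(10, 1 .. 100);
# each of the 10 splits trains on 90 items and tests on the other 10
```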

Going back every time and manually creating networks with different sizes of layers sounds tedious. Ideally, I’d have a script that takes the network layers and sizes as arguments and returns the evaluation score. Couple this with the Minion job queue from Mojolicious (it’s nice!) and you’d have a great tool for finding the best available neural network for the given data while you’re doing other things.
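A bare-bones version of such a script might look like this, taking the layer sizes from the command line (the training parameters are placeholders, and note that the MSE reported here is on the training data, so you would still want to score it against a held-out test set):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use AI::FANN qw(:all);

# Usage:  perl train_eval.pl datafile in_size hidden_sizes... out_size
# e.g.    perl train_eval.pl xor.data 2 3 1
my ($datafile, @layers) = @ARGV;
die "usage: $0 datafile layer_sizes...\n" unless $datafile and @layers >= 2;

my $ann = AI::FANN->new_standard(@layers);
$ann->hidden_activation_function(FANN_SIGMOID_SYMMETRIC);
$ann->output_activation_function(FANN_SIGMOID_SYMMETRIC);

# train quietly (0 epochs between reports) to a target error of 0.001
$ann->train_on_file($datafile, 100_000, 0, 0.001);

printf "layers (%s): MSE %.5f\n", join(',', @layers), $ann->MSE;
```

Each Minion job would then be one invocation of this script with a different layer specification.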

The Missing Datafile Format

The one thing that’s not easy to find on the website is the file format specification for the datafiles, so this is what I worked out. They are space-separated files of integers or floats, like this:

Number_of_pairs Number_of_inputs Number_of_outputs
Input row 1
Output row 1
Input row 2
Output row 2
...

This is a script that will turn an array of arrayrefs from the XOR example into the file format used by libfann.

use v5.24;    # for postfix dereferencing (and say)
use warnings;

my @xor_data = ( [ [-1, -1], [-1] ],
                 [ [-1,  1], [ 1] ],
                 [ [ 1, -1], [ 1] ],
                 [ [ 1,  1], [-1] ] );
write_datafile('xor.data', @xor_data);

sub write_datafile {
    my ($filename, @data) = @_;

    open my $fh, '>', $filename
        or die "Cannot open $filename: $!";

    # header line: number of pairs, inputs per pair, outputs per pair
    my ($in, $out) = $data[0]->@*;
    say $fh join q{ }, scalar @data, scalar @$in, scalar @$out;

    # then alternating input and output rows, one pair at a time
    for my $pair (@data) {
        say $fh join q{ }, $pair->[0]->@*;
        say $fh join q{ }, $pair->[1]->@*;
    }
    close $fh;
}


Your turn ...

Have you used any of these modules? Share your experience to help the next person choose. Have I missed anything or got something wrong? Let us know in the comments below.

Thank you for your time!

Image credit: “Perceptron” by fdecomite is licensed under CC BY 2.0
