Building and Training a Neural Network for Image Classification Using Java and DL4J

Classifying images as either a dog or a cat is a binary image classification task, and we’ll solve it with a deep learning model. This guide walks through building such an application in Java with the Deeplearning4j (DL4J) library: preparing the dataset, defining and training the model, and evaluating it.

Dataset Preparation

Dataset Format

For this binary classification task, you need a dataset with images of dogs and cats. The dataset should be organized into training and testing sets, with subdirectories for each class (dog and cat).

  • Training Dataset: Contains labeled images for training the model.
    • train/dog/: Contains images of dogs.
    • train/cat/: Contains images of cats.
  • Testing Dataset: Contains labeled images for evaluating the model.
    • test/dog/: Contains images of dogs.
    • test/cat/: Contains images of cats.
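Before training, it can help to confirm that the directories are laid out as described and that the classes are roughly balanced. The following is a minimal sketch using only the Java standard library; the class name DatasetLayoutCheck and the method countImagesPerClass are hypothetical helpers for illustration, not part of DL4J, and the sketch counts every regular file without checking that it is an image.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, not part of DL4J: counts files per class directory.
final class DatasetLayoutCheck {

    /** Counts the regular files in each immediate subdirectory of root (one subdirectory per class). */
    static Map<String, Long> countImagesPerClass(Path root) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (Stream<Path> classDirs = Files.list(root)) {
            for (Path dir : (Iterable<Path>) classDirs::iterator) {
                if (!Files.isDirectory(dir)) {
                    continue; // ignore stray files next to the class directories
                }
                try (Stream<Path> files = Files.list(dir)) {
                    long n = files.filter(Files::isRegularFile).count();
                    counts.put(dir.getFileName().toString(), n);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // "path/to/train" is the same placeholder path used elsewhere in this guide
        Path trainDir = Paths.get(args.length > 0 ? args[0] : "path/to/train");
        System.out.println(countImagesPerClass(trainDir));
    }
}
```

Running it on the training directory should print one entry for cat and one for dog; a missing or unexpected entry points to a layout problem.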

Example of Loading the Dataset

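We’ll use the ImageRecordReader class (from DataVec, DL4J’s data-loading library) to read and resize the images, RecordReaderDataSetIterator to group them into batches, and ImagePreProcessingScaler to scale the pixel values to the range [0, 1].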

import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

import java.io.File;
import java.io.IOException;
import java.util.Random;

public class DataLoader {

    public static DataSetIterator loadData(String dataPath, int batchSize, int height, int width, int channels, int numClasses) throws IOException {
        File dataDir = new File(dataPath);
        // Keep only supported image formats and shuffle the file order; otherwise the
        // iterator serves one class directory after the other. The seed 42 is arbitrary.
        FileSplit fileSplit = new FileSplit(dataDir, NativeImageLoader.ALLOWED_FORMATS, new Random(42));
        // The label of each image is the name of its parent directory ("cat" or "dog")
        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();

        // Every image is resized to height x width with the given number of channels
        ImageRecordReader recordReader = new ImageRecordReader(height, width, channels, labelMaker);
        recordReader.initialize(fileSplit);

        // Each record is [image, label], so the label sits at index 1
        DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, batchSize, 1, numClasses);

        // Scale pixel values from [0, 255] to [0, 1]
        DataNormalization scaler = new ImagePreProcessingScaler(0, 1);
        dataSetIterator.setPreProcessor(scaler);

        return dataSetIterator;
    }
}
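To make the normalization step concrete, here is a small standalone sketch of the arithmetic behind scaling 8-bit pixel values into a target range, as ImagePreProcessingScaler(0, 1) is used for above. The class PixelScalingSketch and its scale method are hypothetical names for illustration and do not call DL4J; the sketch assumes 8-bit input values between 0 and 255.

```java
import java.util.Arrays;

// Hypothetical illustration, not DL4J code: linear scaling of 8-bit pixel values.
final class PixelScalingSketch {

    /** Maps pixel values in 0..255 linearly into the range [min, max]. */
    static double[] scale(int[] pixels, double min, double max) {
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] / 255.0 * (max - min) + min;
        }
        return out;
    }

    public static void main(String[] args) {
        // A black, a dark, and a white pixel value scaled into [0, 1]
        System.out.println(Arrays.toString(scale(new int[] {0, 51, 255}, 0.0, 1.0)));
    }
}
```

Keeping the inputs in a small, fixed range such as [0, 1] tends to make gradient-based training more stable than feeding raw values up to 255.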

Model Definition

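Define a simple fully connected network for the task: one hidden layer with ReLU activation followed by a Softmax output layer. The network expects each image as a flat vector of height × width × channels values.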

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.nd4j.linalg.learning.config.Adam;

public class ModelBuilder {

    // inputSize is the length of one flattened image (height * width * channels);
    // the network expects features of shape [batch, inputSize].
    public static MultiLayerNetwork buildModel(int inputSize, int hiddenLayerSize, int outputSize, double learningRate) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                // Adam optimizer with the given learning rate
                .updater(new Adam(learningRate))
                .list()
                // Hidden layer: fully connected, ReLU activation
                .layer(new DenseLayer.Builder().nIn(inputSize).nOut(hiddenLayerSize)
                        .activation(Activation.RELU).build())
                // Output layer: one Softmax probability per class, trained with negative log-likelihood
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX)
                        .nIn(hiddenLayerSize).nOut(outputSize).build())
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init(); // allocate and initialize the parameters

        return model;
    }
}
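The output layer above pairs a Softmax activation with a negative log-likelihood loss. As a rough illustration of what those two pieces compute for a single example, here is a plain-Java sketch; SoftmaxSketch and its methods are hypothetical names and this is not how DL4J implements them internally.

```java
// Hypothetical illustration, not DL4J code: Softmax and negative log-likelihood for one example.
final class SoftmaxSketch {

    /** Turns raw output scores into probabilities that sum to 1. */
    static double[] softmax(double[] scores) {
        // Subtracting the maximum score avoids overflow in Math.exp without changing the result
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double[] out = new double[scores.length];
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /** Loss for one example: the negative log of the probability assigned to the true class. */
    static double negativeLogLikelihood(double[] probabilities, int trueClass) {
        return -Math.log(probabilities[trueClass]);
    }

    public static void main(String[] args) {
        double[] probs = softmax(new double[] {2.0, 0.5}); // scores for two classes
        System.out.println("P(class 0) = " + probs[0] + ", P(class 1) = " + probs[1]);
        System.out.println("loss if true class is 0: " + negativeLogLikelihood(probs, 0));
    }
}
```

The loss is small when the network assigns a high probability to the correct class and grows quickly as that probability approaches zero.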

Training and Evaluation

Train the model using the training dataset and evaluate it using the testing dataset.

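import org.deeplearning4j.eval.Evaluation; // in newer DL4J releases: org.nd4j.evaluation.classification.Evaluation
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class DogCatClassifier {

    public static void main(String[] args) throws Exception {
        int height = 64;
        int width = 64;
        int channels = 3; // RGB
        int numClasses = 2; // Dog and Cat
        int batchSize = 32;
        int numEpochs = 10;
        double learningRate = 0.001;

        String trainDataPath = "path/to/train";
        String testDataPath = "path/to/test";

        // Load data
        DataSetIterator trainData = DataLoader.loadData(trainDataPath, batchSize, height, width, channels, numClasses);
        DataSetIterator testData = DataLoader.loadData(testDataPath, batchSize, height, width, channels, numClasses);

        // Calculate input size for fully connected layer
        int inputSize = height * width * channels;
        int hiddenLayerSize = 128;

        // Build model
        MultiLayerNetwork model = ModelBuilder.buildModel(inputSize, hiddenLayerSize, numClasses, learningRate);

        // Train model. The iterator yields images as [batch, channels, height, width],
        // so each batch is flattened to [batch, inputSize] before it reaches the dense layer.
        for (int epoch = 0; epoch < numEpochs; epoch++) {
            trainData.reset();
            while (trainData.hasNext()) {
                DataSet ds = trainData.next();
                ds.setFeatures(ds.getFeatures().reshape(ds.numExamples(), inputSize));
                model.fit(ds);
            }
            System.out.println("Epoch " + epoch + " complete.");
        }

        // Evaluate model on the test set, flattening each batch the same way
        Evaluation eval = new Evaluation(numClasses);
        testData.reset();
        while (testData.hasNext()) {
            DataSet ds = testData.next();
            INDArray features = ds.getFeatures().reshape(ds.numExamples(), inputSize);
            INDArray output = model.output(features);
            eval.eval(ds.getLabels(), output);
        }
        System.out.println(eval.stats());
    }
}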
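The report printed by eval.stats() includes accuracy, precision, and recall, among other metrics. To show what those numbers mean, here is a small plain-Java sketch that computes them from arrays of true and predicted class indices. BinaryMetricsSketch is a hypothetical helper for illustration only; it is not the DL4J Evaluation class and makes no claim about how that class is implemented.

```java
// Hypothetical illustration, not DL4J code: basic classification metrics.
final class BinaryMetricsSketch {

    /** Fraction of examples whose predicted class equals the true class. */
    static double accuracy(int[] actual, int[] predicted) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                correct++;
            }
        }
        return actual.length == 0 ? 0.0 : (double) correct / actual.length;
    }

    /** Of all examples predicted as the positive class, the fraction that really belong to it. */
    static double precision(int[] actual, int[] predicted, int positive) {
        int tp = 0;
        int fp = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i] == positive) {
                if (actual[i] == positive) tp++; else fp++;
            }
        }
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Of all examples that really belong to the positive class, the fraction that was found. */
    static double recall(int[] actual, int[] predicted, int positive) {
        int tp = 0;
        int fn = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == positive) {
                if (predicted[i] == positive) tp++; else fn++;
            }
        }
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    public static void main(String[] args) {
        int[] actual = {0, 0, 1, 1, 1};    // made-up labels for illustration
        int[] predicted = {0, 1, 1, 1, 0}; // made-up predictions for illustration
        System.out.println("accuracy  = " + accuracy(actual, predicted));
        System.out.println("precision = " + precision(actual, predicted, 1));
        System.out.println("recall    = " + recall(actual, predicted, 1));
    }
}
```

Accuracy alone can be misleading when one class has far more images than the other, which is why the per-class precision and recall are worth reading as well.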

Explanation

  1. Dataset Preparation:
    • DataLoader: A class to load and preprocess images using ImageRecordReader and RecordReaderDataSetIterator.
    • Directory Structure: Organize images in subdirectories (dog, cat) under training and testing directories.
  2. Model Definition:
    • ModelBuilder: A class to define the neural network architecture.
    • Neural Network: A fully connected network with one hidden layer (ReLU activation) and a two-unit output layer with Softmax activation, which yields one probability per class (dog and cat).
  3. Training and Evaluation:
    • DogCatClassifier: The main class to load data, build the model, train it, and evaluate its performance.
    • Training Loop: Iterates through epochs, fitting the model to the training data.
    • Evaluation: Assesses the model’s performance on the testing data, calculating accuracy and other metrics.
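To get a feel for the size of the network described above, the following sketch works out the input size and the number of trainable parameters from the values used in the main class (64 × 64 × 3 inputs, 128 hidden units, 2 outputs). NetworkSizeSketch is a hypothetical helper; the count assumes each fully connected layer has one weight per input-output pair plus one bias per output.

```java
// Hypothetical illustration: input size and parameter count of the two-layer network.
final class NetworkSizeSketch {

    /** Number of values in one flattened image. */
    static long inputSize(int height, int width, int channels) {
        return (long) height * width * channels;
    }

    /** Weights plus biases of one fully connected layer. */
    static long denseParams(long nIn, long nOut) {
        return nIn * nOut + nOut;
    }

    public static void main(String[] args) {
        long in = inputSize(64, 64, 3);
        long hidden = denseParams(in, 128); // hidden layer
        long output = denseParams(128, 2);  // output layer
        System.out.println("input size       = " + in);
        System.out.println("hidden params    = " + hidden);
        System.out.println("output params    = " + output);
        System.out.println("total parameters = " + (hidden + output));
    }
}
```

Almost all of the roughly 1.57 million parameters sit in the first layer, because every one of the 12,288 input values connects to every hidden unit.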

The labels for the images (e.g., “cat” and “dog”) are defined by the directory structure of the dataset and the ParentPathLabelGenerator used in the ImageRecordReader. Let’s go through the process step by step, showing how the labels are defined and how the model associates them with the images.

Defining Labels in the Dataset

  1. Directory Structure:
    • Organize your dataset such that each class has its own directory. The name of the directory is used as the label for the images contained within it.

dataset/
├── train/
│   ├── cat/
│   │   ├── cat001.jpg
│   │   ├── cat002.jpg
│   │   └── ...
│   └── dog/
│       ├── dog001.jpg
│       ├── dog002.jpg
│       └── ...
└── test/
    ├── cat/
    │   ├── cat101.jpg
    │   ├── cat102.jpg
    │   └── ...
    └── dog/
        ├── dog101.jpg
        ├── dog102.jpg
        └── ...

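  2. Label Generation:
    • The ParentPathLabelGenerator uses the name of the parent directory (i.e., “cat” or “dog”) as the label for each image. In current DL4J versions, ImageRecordReader then sorts the label names and uses each name’s position as its class index, so “cat” becomes class 0 and “dog” becomes class 1.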
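The idea of taking the label from the parent directory can be sketched with the standard library alone. ParentPathLabelSketch below is a hypothetical stand-in, not the DL4J class; turning the sorted label names into class indices is an assumption made here for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration, not DL4J code: labels derived from parent directory names.
final class ParentPathLabelSketch {

    /** Returns the name of the directory that directly contains the image. */
    static String labelFor(Path imagePath) {
        return imagePath.getParent().getFileName().toString();
    }

    /** Collects the distinct labels in sorted order; a label's position serves as its class index. */
    static List<String> sortedLabels(List<Path> imagePaths) {
        TreeSet<String> labels = new TreeSet<>();
        for (Path p : imagePaths) {
            labels.add(labelFor(p));
        }
        return new ArrayList<>(labels);
    }

    public static void main(String[] args) {
        List<Path> images = new ArrayList<>();
        images.add(Paths.get("dataset", "train", "dog", "dog001.jpg"));
        images.add(Paths.get("dataset", "train", "cat", "cat001.jpg"));

        List<String> labels = sortedLabels(images);
        for (Path image : images) {
            String label = labelFor(image);
            System.out.println(image.getFileName() + " -> " + label + " (class " + labels.indexOf(label) + ")");
        }
    }
}
```

Because the label comes purely from the directory name, a misplaced file (a dog photo inside cat/) is silently trained with the wrong label, so the directory contents are worth checking.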
