Building and Training a Neural Network for Image Classification Using Java and DL4J
Creating an application to classify images as either a dog or a cat involves several steps. We’ll use a deep learning model for this task. Below is a detailed guide to building such an application using Java and the DL4J library, including dataset preparation, model training, and evaluation.
Dataset Preparation
Dataset Format
For this binary classification task, you need a dataset with images of dogs and cats. The dataset should be organized into training and testing sets, with subdirectories for each class (dog and cat).
- Training Dataset: Contains labeled images for training the model.
  - train/dog/: Contains images of dogs.
  - train/cat/: Contains images of cats.
- Testing Dataset: Contains labeled images for evaluating the model.
  - test/dog/: Contains images of dogs.
  - test/cat/: Contains images of cats.
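Before training, it can help to confirm that the layout described above is actually in place. The following is a minimal plain-Java sketch, not part of DL4J: the class name DatasetLayoutCheck and the list of accepted file extensions are assumptions made for illustration.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of DL4J): counts image files per class directory. */
final class DatasetLayoutCheck {

    /** Returns a map from class directory name (e.g. "cat") to the number of image files in it. */
    static Map<String, Integer> countImagesPerClass(File splitDir) {
        File[] classDirs = splitDir.listFiles(File::isDirectory);
        if (classDirs == null) {
            throw new IllegalArgumentException("Not a readable directory: " + splitDir);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (File classDir : classDirs) {
            File[] images = classDir.listFiles(f -> f.isFile() && isImage(f.getName()));
            counts.put(classDir.getName(), images == null ? 0 : images.length);
        }
        return counts;
    }

    // Assumed set of extensions; adjust to the formats in your dataset.
    private static boolean isImage(String fileName) {
        String lower = fileName.toLowerCase();
        return lower.endsWith(".jpg") || lower.endsWith(".jpeg") || lower.endsWith(".png");
    }

    public static void main(String[] args) {
        // "path/to/train" is a placeholder; pass the real directory as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "path/to/train");
        if (dir.isDirectory()) {
            System.out.println(countImagesPerClass(dir));
        } else {
            System.out.println("Directory not found: " + dir);
        }
    }
}
```

A strongly unbalanced count (for example, many more dogs than cats) is worth knowing about before interpreting accuracy numbers.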
Example of Loading the Dataset
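We’ll use the ImageRecordReader class from DL4J to load the images and resize them to a fixed height and width. An ImagePreProcessingScaler then rescales the pixel values to the range [0, 1].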
import org.datavec.api.io.labels.ParentPathLabelGenerator;
import org.datavec.api.split.FileSplit;
import org.datavec.image.loader.NativeImageLoader;
import org.datavec.image.recordreader.ImageRecordReader;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.dataset.api.preprocessor.DataNormalization;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

import java.io.File;
import java.io.IOException;
import java.util.Random;

public class DataLoader {

    public static DataSetIterator loadData(String dataPath, int batchSize, int height, int width,
                                           int channels, int numClasses) throws IOException {
        File dataDir = new File(dataPath);
        // Keep only supported image formats and shuffle the files; without shuffling,
        // batches would be dominated by a single class. The seed (42) is an arbitrary choice.
        FileSplit fileSplit = new FileSplit(dataDir, NativeImageLoader.ALLOWED_FORMATS, new Random(42));

        // Label each image with the name of its parent directory ("cat" or "dog").
        ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator();
        ImageRecordReader recordReader = new ImageRecordReader(height, width, channels, labelMaker);
        recordReader.initialize(fileSplit);

        // Label index 1: each record holds the image at index 0 and its label at index 1.
        DataSetIterator dataSetIterator = new RecordReaderDataSetIterator(recordReader, batchSize, 1, numClasses);

        // Scale pixel values from [0, 255] to [0, 1].
        DataNormalization scaler = new ImagePreProcessingScaler(0, 1);
        dataSetIterator.setPreProcessor(scaler);
        return dataSetIterator;
    }
}
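To make the effect of ImagePreProcessingScaler(0, 1) concrete, here is a small plain-Java sketch of the same idea: linear min-max scaling of 8-bit pixel values. It is an illustration only, not DL4J's implementation, and the class name PixelScaling is hypothetical.

```java
/** Plain-Java illustration of min-max pixel scaling; not DL4J's implementation. */
final class PixelScaling {

    /** Maps an 8-bit pixel value in [0, 255] linearly onto [min, max]. */
    static double scale(int pixel, double min, double max) {
        return (pixel / 255.0) * (max - min) + min;
    }

    public static void main(String[] args) {
        int[] samples = {0, 51, 255};
        for (int pixel : samples) {
            System.out.println(pixel + " -> " + scale(pixel, 0, 1));
        }
    }
}
```

Keeping inputs in a small, fixed range like this generally makes gradient-based training better behaved than feeding raw 0 to 255 values.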
Model Definition
Define a simple fully connected network for the classification task: one hidden layer followed by a softmax output layer.
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;
import org.nd4j.linalg.learning.config.Adam;

public class ModelBuilder {

    public static MultiLayerNetwork buildModel(int inputSize, int hiddenLayerSize, int outputSize, double learningRate) {
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .updater(new Adam(learningRate))
                .list()
                // Hidden layer: fully connected, expects flattened input of shape [batchSize, inputSize].
                .layer(new DenseLayer.Builder().nIn(inputSize).nOut(hiddenLayerSize)
                        .activation(Activation.RELU).build())
                // Output layer: one softmax probability per class.
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX)
                        .nIn(hiddenLayerSize).nOut(outputSize).build())
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(conf);
        model.init();
        return model;
    }
}
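To get a feel for the size of this network, the sketch below counts its trainable parameters in plain Java, using the values this guide passes in from the main class (64×64 RGB images, 128 hidden units, 2 classes). The class name ParameterCount is hypothetical, and the count assumes each layer has one weight per input-output pair plus one bias per output.

```java
/** Plain-Java arithmetic sketch; the class name is hypothetical. */
final class ParameterCount {

    /** Weights (nIn * nOut) plus biases (nOut) of one fully connected layer. */
    static long denseParams(long nIn, long nOut) {
        return nIn * nOut + nOut;
    }

    public static void main(String[] args) {
        long inputSize = 64L * 64 * 3; // height * width * channels
        long hidden = denseParams(inputSize, 128);
        long output = denseParams(128, 2);
        System.out.println("Input size:          " + inputSize);
        System.out.println("Hidden layer params: " + hidden);
        System.out.println("Output layer params: " + output);
        System.out.println("Total params:        " + (hidden + output));
    }
}
```

Almost all of the parameters sit in the first layer, because every pixel value is connected to every hidden unit. If the layer layout assumed here is right, the total should agree with what model.numParams() reports.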
Training and Evaluation
Train the model using the training dataset and evaluate it using the testing dataset.
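import org.deeplearning4j.eval.Evaluation;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class DogCatClassifier {

    public static void main(String[] args) throws Exception {
        int height = 64;
        int width = 64;
        int channels = 3;   // RGB
        int numClasses = 2; // Dog and Cat
        int batchSize = 32;
        int numEpochs = 10;
        double learningRate = 0.001;
        String trainDataPath = "path/to/train";
        String testDataPath = "path/to/test";

        // Load data
        DataSetIterator trainData = DataLoader.loadData(trainDataPath, batchSize, height, width, channels, numClasses);
        DataSetIterator testData = DataLoader.loadData(testDataPath, batchSize, height, width, channels, numClasses);

        // Calculate input size for fully connected layer
        int inputSize = height * width * channels;
        int hiddenLayerSize = 128;

        // Build model
        MultiLayerNetwork model = ModelBuilder.buildModel(inputSize, hiddenLayerSize, numClasses, learningRate);

        // Train model. ImageRecordReader produces 4D batches [batchSize, channels, height, width],
        // while the DenseLayer expects 2D input, so each batch is flattened to [batchSize, inputSize].
        for (int epoch = 0; epoch < numEpochs; epoch++) {
            while (trainData.hasNext()) {
                DataSet batch = trainData.next();
                batch.setFeatures(flatten(batch.getFeatures(), inputSize));
                model.fit(batch);
            }
            trainData.reset(); // rewind the iterator for the next epoch
            System.out.println("Epoch " + epoch + " complete.");
        }

        // Evaluate model
        Evaluation eval = new Evaluation(numClasses);
        while (testData.hasNext()) {
            DataSet ds = testData.next();
            INDArray output = model.output(flatten(ds.getFeatures(), inputSize));
            eval.eval(ds.getLabels(), output);
        }
        System.out.println(eval.stats());
    }

    // Reshapes an image batch of shape [batchSize, channels, height, width] to [batchSize, inputSize].
    private static INDArray flatten(INDArray features, int inputSize) {
        return features.reshape(features.size(0), inputSize);
    }
}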
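The statistics printed by eval.stats() are derived from a confusion matrix. As a rough illustration of what those numbers mean, the sketch below computes accuracy, precision, and recall from arrays of actual and predicted class indices. It is plain Java rather than DL4J's Evaluation class, and the class name BinaryMetrics and its method names are hypothetical.

```java
/** Plain-Java illustration of confusion-matrix metrics; not DL4J's Evaluation class. */
final class BinaryMetrics {

    /** Builds a matrix where cm[actual][predicted] counts the examples with that combination. */
    static int[][] confusionMatrix(int[] actual, int[] predicted, int numClasses) {
        int[][] cm = new int[numClasses][numClasses];
        for (int i = 0; i < actual.length; i++) {
            cm[actual[i]][predicted[i]]++;
        }
        return cm;
    }

    /** Fraction of all examples that were classified correctly. */
    static double accuracy(int[][] cm) {
        int correct = 0;
        int total = 0;
        for (int a = 0; a < cm.length; a++) {
            for (int p = 0; p < cm.length; p++) {
                total += cm[a][p];
                if (a == p) {
                    correct += cm[a][p];
                }
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    /** Of the examples predicted as cls, the fraction that really are cls. */
    static double precision(int[][] cm, int cls) {
        int predictedAsCls = 0;
        for (int a = 0; a < cm.length; a++) {
            predictedAsCls += cm[a][cls];
        }
        return predictedAsCls == 0 ? 0.0 : (double) cm[cls][cls] / predictedAsCls;
    }

    /** Of the examples that really are cls, the fraction predicted as cls. */
    static double recall(int[][] cm, int cls) {
        int actuallyCls = 0;
        for (int p = 0; p < cm.length; p++) {
            actuallyCls += cm[cls][p];
        }
        return actuallyCls == 0 ? 0.0 : (double) cm[cls][cls] / actuallyCls;
    }

    public static void main(String[] args) {
        // Made-up example labels: 0 and 1 stand for the two classes.
        int[] actual = {0, 0, 1, 1, 1};
        int[] predicted = {0, 1, 1, 1, 0};
        int[][] cm = confusionMatrix(actual, predicted, 2);
        System.out.println("Accuracy:            " + accuracy(cm));
        System.out.println("Precision (class 1): " + precision(cm, 1));
        System.out.println("Recall (class 1):    " + recall(cm, 1));
    }
}
```

Accuracy alone can be misleading when one class has far more test images than the other, which is why per-class precision and recall are worth reading as well.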
Explanation
- Dataset Preparation:
  - DataLoader: A class to load and preprocess images using ImageRecordReader and RecordReaderDataSetIterator.
  - Directory Structure: Organize images in subdirectories (dog, cat) under training and testing directories.
- Model Definition:
  - ModelBuilder: A class to define the neural network architecture.
  - Neural Network: Consists of an input layer, one hidden layer with ReLU activation, and an output layer with Softmax activation for binary classification.
- Training and Evaluation:
  - DogCatClassifier: The main class to load data, build the model, train it, and evaluate its performance.
  - Training Loop: Iterates through epochs, fitting the model to the training data.
  - Evaluation: Assesses the model’s performance on the testing data, calculating accuracy and other metrics.
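The Softmax output layer and the negative log-likelihood loss mentioned above can be illustrated with a few lines of plain Java. This is a sketch of the underlying math, not DL4J's code. The class name SoftmaxExample and the example scores are made up for illustration.

```java
/** Plain-Java illustration of softmax and negative log-likelihood; not DL4J's implementation. */
final class SoftmaxExample {

    /** Converts raw scores into probabilities that are positive and sum to 1. */
    static double[] softmax(double[] scores) {
        // Subtracting the maximum avoids overflow in exp() without changing the result.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /** Loss for one example: small when the true class gets high probability. */
    static double negativeLogLikelihood(double[] probs, int trueClass) {
        return -Math.log(probs[trueClass]);
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 0.5}; // made-up raw outputs for the two classes
        double[] probs = softmax(scores);
        System.out.println("P(class 0) = " + probs[0] + ", P(class 1) = " + probs[1]);
        System.out.println("Loss if class 0 is correct: " + negativeLogLikelihood(probs, 0));
        System.out.println("Loss if class 1 is correct: " + negativeLogLikelihood(probs, 1));
    }
}
```

The class with the highest probability is taken as the prediction, and training lowers the loss by pushing probability toward the correct class.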
The labels for the images (e.g., “cat” and “dog”) are defined by the directory structure of the dataset and the ParentPathLabelGenerator used in the ImageRecordReader. Let’s go through the process step by step, showing how the labels are defined and how the model associates them with the images.
Defining Labels in the Dataset
- Directory Structure:
  - Organize your dataset such that each class has its own directory. The name of the directory is used as the label for the images contained within it.

dataset/
├── train/
│   ├── cat/
│   │   ├── cat001.jpg
│   │   ├── cat002.jpg
│   │   └── ...
│   └── dog/
│       ├── dog001.jpg
│       ├── dog002.jpg
│       └── ...
└── test/
    ├── cat/
    │   ├── cat101.jpg
    │   ├── cat102.jpg
    │   └── ...
    └── dog/
        ├── dog101.jpg
        ├── dog102.jpg
        └── ...
- Label Generation:
  - The ParentPathLabelGenerator uses the name of the parent directory (i.e., “cat” or “dog”) as the label for each image.
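The idea behind parent-directory labeling can be sketched in plain Java as follows. This is an illustration, not the ParentPathLabelGenerator source: the class name ParentDirLabels is hypothetical, and the alphabetical ordering of class indices is an assumption about how DL4J numbers the classes. You can check the actual order by printing recordReader.getLabels() after initializing the reader.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java illustration of parent-directory labeling; not DL4J's implementation. */
final class ParentDirLabels {

    /** Returns the name of the directory that directly contains the image. */
    static String labelOf(String imagePath) {
        return Paths.get(imagePath).getParent().getFileName().toString();
    }

    /** Distinct labels in alphabetical order; the list position serves as the class index (assumed ordering). */
    static List<String> labelIndex(List<String> imagePaths) {
        TreeSet<String> labels = new TreeSet<>();
        for (String path : imagePaths) {
            labels.add(labelOf(path));
        }
        return new ArrayList<>(labels);
    }

    /** One-hot vector for a label: 1.0 at the label's class index, 0.0 elsewhere. */
    static double[] oneHot(String label, List<String> labels) {
        double[] vector = new double[labels.size()];
        vector[labels.indexOf(label)] = 1.0;
        return vector;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "dataset/train/dog/dog001.jpg",
                "dataset/train/cat/cat001.jpg",
                "dataset/train/cat/cat002.jpg");
        List<String> labels = labelIndex(paths);
        System.out.println("Labels: " + labels);
        for (String path : paths) {
            String label = labelOf(path);
            System.out.println(path + " -> " + label + " -> " + Arrays.toString(oneHot(label, labels)));
        }
    }
}
```

The one-hot vector is what the network is trained against: the softmax output for an image is compared with the vector for its directory name.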