These three Problem Based Learning projects explore complementary neural network architectures across two domains. Two studies address the same tabular regression challenge in healthcare analytics: predicting medical insurance charges from demographic and health-related factors such as age, BMI, smoking status, and region. One study uses a Multi-Layer Perceptron (MLP); the other uses a Radial Basis Function (RBF) model, which captures complex non-linear patterns in the data. Training both on identical data allows a direct architectural comparison.
The third study shifts to computer vision: a custom Convolutional Neural Network (CNN) trained from scratch to classify coastal objects as shells or pebbles, using a dataset of over 4,000 labelled images. The model performs binary image classification, and its training and validation accuracies show that it learns meaningful patterns from image data and generalises to unseen samples, with a train-to-validation gap of roughly 11%. Together, the three projects cover the breadth of supervised ML, from tabular regression to image classification.
All models were trained, evaluated, and documented end-to-end: dataset preprocessing, architecture design, training runs, and performance analysis against standard metrics (MSE, R², accuracy, precision/recall). The common thread is understanding how different inductive biases (dense layers, kernel transformations, and spatial convolutions) shape what a model learns and how well it generalises.
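As a reminder of what the two regression metrics measure, here is a minimal pure-Python sketch of MSE and R². The function names and the sample values are hypothetical and chosen only for illustration; they are not results from the projects, which may well have used a library implementation instead.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

An R² of 1.0 means the predictions explain all of the variance in the targets, while 0.0 means the model does no better than always predicting the mean.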
A Multi-Layer Perceptron with three hidden layers (200→100→50 neurons, ReLU activation), trained to predict insurance charges from age, BMI, smoking status, children count, sex, and region. Categorical inputs encoded via label encoding; features normalised with a Min-Max scaler. Trained for 2,000 iterations with backpropagation. Combining RBF feature mapping with the MLP helped capture complex non-linear patterns in the tabular data.
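The preprocessing described above can be sketched in plain Python. This is an illustrative approximation, not the project's code: the helper names and the sample rows are hypothetical, and the sorted-category ordering is an assumption about how the label encoder assigned its codes.

```python
def label_encode(values):
    """Map each category to an integer code, using sorted category order."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


def min_max_scale(column):
    """Rescale a numeric column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical sample rows, for illustration only.
smoker = ["yes", "no", "no", "yes"]
age = [19, 45, 64, 32]

smoker_codes, smoker_classes = label_encode(smoker)
age_scaled = min_max_scale(age)
```

Scaling matters for an MLP because features on very different ranges (for example age versus an encoded 0/1 flag) would otherwise contribute unevenly to the gradient updates.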
A Radial Basis Function network implemented via Kernel Ridge Regression (α=0.001, γ=0.08), enhanced with K-Means feature engineering (45 clusters, Elbow Method). The RBF kernel K(x,x') = exp(−γ‖x−x'‖²) implicitly maps inputs to a higher-dimensional space in which non-linear relationships are easier to fit. StandardScaler standardisation and an 80/20 train-test split. Smoking status emerged as the dominant predictor (correlation 0.79 with charges).
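To make the kernel concrete, the sketch below evaluates K(x,x') = exp(−γ‖x−x'‖²) with the quoted γ=0.08 on a few hypothetical points. The function names and sample points are illustrative assumptions; the project itself reports using Kernel Ridge Regression rather than hand-written kernel code.

```python
import math

GAMMA = 0.08  # kernel width quoted in the text


def rbf_kernel(x, y, gamma=GAMMA):
    """RBF similarity: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(points, gamma=GAMMA):
    """Pairwise kernel values between all points (symmetric, ones on the diagonal)."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical, already-standardised feature vectors, for illustration only.
points = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = kernel_matrix(points)
```

Identical points have similarity 1, and the value decays towards 0 as points move apart. A larger γ makes that decay faster, so each training point influences a smaller neighbourhood.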
A custom three-layer CNN (Conv2D 32→64→128, 3×3 kernels, ReLU + MaxPooling) built from scratch in TensorFlow/Keras for binary image classification. Trained on 4,284 images (150×150px) with an 80/20 split. Dropout(0.5) for regularisation, Adam optimiser, binary cross-entropy loss. Trained for 10 epochs. Pebble class achieved F1=0.82; Shell class F1=0.62, reflecting class imbalance (548 pebbles vs 308 shells in validation).
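The effect of the three Conv2D + MaxPooling stages on a 150×150 input can be traced with simple arithmetic. The sketch below assumes unpadded ("valid") convolutions with stride 1 and 2×2 pooling after each convolution; the text does not state these settings, so treat them as assumptions, and the helper names as hypothetical.

```python
def conv_output(size, kernel=3, stride=1, padding=0):
    """Spatial size after a convolution (assumes 'valid' padding by default)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output(size, pool=2):
    """Spatial size after non-overlapping max pooling (assumed 2x2)."""
    return size // pool


def trace_shapes(input_size=150, filters=(32, 64, 128)):
    """Return (height, width, channels) after each conv + pool stage."""
    size = input_size
    shapes = []
    for f in filters:
        size = conv_output(size)  # 3x3 convolution shrinks each side by 2
        size = pool_output(size)  # pooling halves each side (rounding down)
        shapes.append((size, size, f))
    return shapes


shapes = trace_shapes()
height, width, channels = shapes[-1]
flattened = height * width * channels  # features fed to the dense head
```

Under these assumptions the feature map shrinks from 150×150 to 17×17 with 128 channels, so the classifier head sees a flattened vector of 36,992 values, which is one reason Dropout is useful on a dataset of this size.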
The two regression models reach near-identical R² scores (roughly 86.7-86.8% on the training set) on the same insurance dataset, but through very different learning mechanisms. The RBF model generalises slightly better (test R² 85.78% vs 84.27% for the MLP), suggesting kernel methods may be more robust to the non-linear smoker-charge relationship that dominates this dataset. The CNN's ~11% train-to-validation gap signals room for improvement via data augmentation or transfer learning.