This work addresses the problem of visual aesthetic analysis without relying on the prevalent state-of-the-art approach of training deep convolutional neural networks from weights pretrained on the ImageNet dataset. We propose a new neural network architecture that models the data effectively by taking both low-level and high-level features into account. It is a variant of DenseNets with a skip connection at the end of every dense block. In addition, we propose two training techniques that improve accuracy: training on the LAB color space, and grouping similar images into the same minibatch, which we call coherent learning. Using these, we achieve an accuracy of 78.7% on the AVA2 dataset. The state-of-the-art accuracy on AVA2 is 85.6%, obtained with a deep convolutional neural network using weights pretrained on ImageNet, while the best accuracy on AVA2 using hand-crafted features is 68.55%. We also show that adding more training data (from the AVA dataset, not included in AVA2) raises our accuracy to 81.48% on the AVA2 test set, demonstrating that the model improves with more data.
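The abstract does not specify how coherent minibatches are formed; the following is a minimal sketch of one plausible implementation, assuming similarity between images is summarized by a scalar key (e.g. mean luminance of the L channel in LAB space). The function name and the sort-then-chunk strategy are illustrative assumptions, not the paper's actual procedure.

```python
import random

def coherent_minibatches(samples, key, batch_size, seed=0):
    """Group similar samples into the same minibatch ("coherent learning").

    samples: a list of training items (e.g. image tensors or file paths).
    key: a function mapping an item to a scalar similarity score
         (hypothetically, the image's mean L-channel value in LAB space).
    Items are sorted by that key, so neighbours in the sorted order, which
    are similar under the key, end up in the same minibatch; the batches
    themselves are then shuffled so the batch *order* stays random.
    """
    ordered = sorted(samples, key=key)
    batches = [ordered[i:i + batch_size]
               for i in range(0, len(ordered), batch_size)]
    rng = random.Random(seed)
    rng.shuffle(batches)  # randomize batch order, keep within-batch coherence
    return batches
```

A scalar key is the simplest choice; richer similarity measures (e.g. clustering on feature vectors) would follow the same sort-or-group-then-chunk pattern.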
The broader goal of this work is a high-capacity network that can model image virality from the large dataset of images and their like counts that can be crawled from the open web. We train it on our in-house dataset and make it available as a Virality Detection API.