Jakub Zak, Michal K Grzeszczyk, Antonina Pater, Lukasz Roszkowiak, Krzysztof Siemion, Anna Korzynska

One solution to the problem of insufficiently large training datasets in image processing is data augmentation, which artificially extends the size of a training dataset to reduce overfitting. Generative Adversarial Networks yield images that become increasingly difficult to differentiate from real images as training progresses, until differentiation is no longer possible. Thus, artificial images closely resembling the original ones can be generated, and including them can improve the training process. The medical domain is one of the areas where data acquisition is burdened by many procedures, laws, and prohibitions; as a result, the potential size of collected datasets is reduced. This article presents the results of training Convolutional Neural Networks on artificially extended image datasets. On a cell classification task, the classification accuracy of models trained with images generated using the proposed method increased by up to 12.9% in comparison to that of the model trained only with the original images from the Herlev Pap smear dataset.
