Anime character recognition/classification using PyTorch
arkel23/animesion
“Our best model, ViT L-16 with image size 128×128 and batch size 64, achieves 85.95% and 94.23% top-1 and top-5 classification accuracies among 3263 characters, compared to the best CNN model (ResNet-18), which achieved only 69.09% and 84.64%, respectively.
We hope that this work inspires other researchers to follow and build upon this path. ViT models have interesting properties for domain transfer that haven’t been studied, and their big jump in performance compared to CNNs suggests that they may be more suitable for recognizing drawn or sketched characters. This is because CNNs are biased towards texture rather than shape…”
Source: github.com/arkel23/animesion/tree/main/classification
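For readers unfamiliar with the top-1 and top-5 figures quoted above, here is a minimal sketch of how top-k accuracy is usually computed: a prediction counts as correct if the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are hypothetical and are not taken from the animesion repository, which may compute the metric differently. It is written in plain Python for clarity; in PyTorch the same ranking step would typically be done with `torch.topk`.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes.

    Hypothetical helper for illustration only, not the repository's code.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data: 4 samples, 5 classes (made-up scores, not real model output).
scores = [
    [0.10, 0.60, 0.12, 0.08, 0.10],  # true class 1 is ranked 1st
    [0.50, 0.20, 0.30, 0.00, 0.00],  # true class 2 is ranked 2nd
    [0.05, 0.10, 0.15, 0.30, 0.40],  # true class 0 is ranked 5th
    [0.40, 0.30, 0.20, 0.07, 0.03],  # true class 3 is ranked 4th
]
labels = [1, 2, 0, 3]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
print(top_k_accuracy(scores, labels, k=5))
```

With 3263 candidate characters, top-5 accuracy is the more forgiving measure, which is why it sits well above top-1 for both models in the quote.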
May 3, 2021