Danbooru

DeepDanbooru: a prototype NN tagger

I've been playing with this for the past week and so far, the results are promising. Here are some performance numbers.

Overall Stats
Min_post_id  Max_post_id  Total_posts  Total_tags  Predictable_tags
3368754      3402356      33228        1280268     1088879

To start with, I ran all posts in the range id:3368754..3402356 through the model. This corresponds to all uploads in the month of January 2019 (date:2019-01-01..2019-02-01), which is outside the training set. This amounts to 33k posts with 1.28m total tags. 1.08m of these tags are predictable - the model was trained on them and can potentially recognize them (I spoke with the author and he shared some of the code, including the list of trained tags).

Overall Performance
Min_confidence  Predicted_tags  Correct_predictions  Precision  Recall  F_score
0.5             557121          399297               0.72       0.37    0.49
0.6             448869          344494               0.77       0.32    0.45
0.7             355846          290220               0.82       0.27    0.4
0.8             269486          232216               0.86       0.21    0.34
0.9             179089          162722               0.91       0.15    0.26
0.95            120686          112876               0.94       0.1     0.19
0.99            47594           46005                0.97       0.04    0.08

This is a breakdown of the overall performance at different confidence levels.

So for example, at >=50% confidence, the model predicted 557k tags. 399k of these tags were correct, giving a precision of 72%. However, there were 1.08m tags in total it could have found, so the recall is only 37%.

(See [1] and [2] for background on precision and recall. In short, precision tells you how accurate it was (how many guesses it got right) and recall tells you how many tags it actually found.)
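
For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the 0.5 row from the raw counts. The function name is just for illustration, and it assumes the F score is the usual balanced F1 (harmonic mean of precision and recall), which is consistent with the values in the table.

```python
def precision_recall_f(predicted, correct, actual):
    """Return (precision, recall, F1) from raw tag counts."""
    precision = correct / predicted if predicted else 0.0
    recall = correct / actual if actual else 0.0
    total = precision + recall
    # Assumption: balanced F1, i.e. the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Counts taken from the 0.5 row above and the Predictable_tags column.
p, r, f = precision_recall_f(predicted=557121, correct=399297, actual=1088879)
print(f"precision={p:.2f} recall={r:.2f} f_score={f:.2f}")
# prints: precision=0.72 recall=0.37 f_score=0.49
```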

As you would expect, as you increase the confidence level, the precision goes up but the recall goes down (it gets more accurate but finds fewer tags). The F score combines precision and recall into a single number (their harmonic mean), so it is highest where the two are best balanced. Of the thresholds tested, the F score is highest at >=50% confidence, and it might be higher still if the threshold were lowered further.
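
The trade-off can be reproduced from the table itself. This is a rough sketch, not the script used to generate the numbers: it recomputes the F score for each row from the counts above (again assuming balanced F1) and picks the threshold with the highest score. The variable names are made up for the example.

```python
PREDICTABLE_TAGS = 1088879  # from the Overall Stats table

# (min_confidence, predicted_tags, correct_predictions) from the table above.
ROWS = [
    (0.5, 557121, 399297),
    (0.6, 448869, 344494),
    (0.7, 355846, 290220),
    (0.8, 269486, 232216),
    (0.9, 179089, 162722),
    (0.95, 120686, 112876),
    (0.99, 47594, 46005),
]


def f1(predicted, correct, actual):
    # 2 * TP / (predicted + actual) is algebraically the same as the
    # harmonic mean of precision (TP / predicted) and recall (TP / actual).
    return 2 * correct / (predicted + actual)


scores = {conf: f1(pred, corr, PREDICTABLE_TAGS) for conf, pred, corr in ROWS}
best = max(scores, key=scores.get)

for conf, score in scores.items():
    print(f"min_confidence={conf}: f_score={score:.2f}")
print("best threshold among those tested:", best)
```

This only compares the thresholds that were actually measured; it says nothing about thresholds below 0.5.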

Performance By Tag (Most Common Tags)
Tag Actual_posts Predicted_posts Correct_predictions Predicted_frequency Actual_frequency Precision Recall F_score Min_confidence
1girl 24551 25143 23305 0.75 0.73 0.93 0.95 0.94 0.5
solo 20281 21571 19202 0.64 0.6 0.89 0.95 0.92 0.5
long_hair 20084 20259 16898 0.6 0.6 0.83 0.84 0.84 0.5
looking_at_viewer 16656 15974 11383 0.48 0.5 0.71 0.68 0.7 0.5
breasts 15349 13374 11606 0.4 0.46 0.87 0.76 0.81 0.5
blush 15128 16484 11274 0.49 0.45 0.68 0.75 0.71 0.5
smile 13283 12572 8463 0.37 0.4 0.67 0.64 0.65 0.5
bangs 13275 5718 3853 0.17 0.4 0.67 0.29 0.41 0.5
eyebrows_visible_through_hair 11961 3827 2531 0.11 0.36 0.66 0.21 0.32 0.5
open_mouth 10856 10477 7641 0.31 0.32 0.73 0.7 0.72 0.5
long_sleeves 8674 3610 2498 0.11 0.26 0.69 0.29 0.41 0.5
simple_background 8386 947 672 0.03 0.25 0.71 0.08 0.14 0.5
short_hair 8383 7045 4696 0.21 0.25 0.67 0.56 0.61 0.5
hair_between_eyes 7671 3455 2270 0.1 0.23 0.66 0.3 0.41 0.5
shirt 7515 2974 2090 0.09 0.22 0.7 0.28 0.4 0.5
blue_eyes 7442 7560 6074 0.23 0.22 0.8 0.82 0.81 0.5
skirt 7412 5523 4376 0.16 0.22 0.79 0.59 0.68 0.5
hair_ornament 7247 4580 3525 0.14 0.22 0.77 0.49 0.6 0.5
white_background 7191 565 446 0.02 0.21 0.79 0.06 0.12 0.5
brown_hair 6628 5763 4438 0.17 0.2 0.77 0.67 0.72 0.5
large_breasts 6356 5487 3897 0.16 0.19 0.71 0.61 0.66 0.5
multiple_girls 6337 5653 4995 0.17 0.19 0.88 0.79 0.83 0.5
gloves 6322 5786 3979 0.17 0.19 0.69 0.63 0.66 0.5
blonde_hair 6238 5975 5131 0.18 0.19 0.86 0.82 0.84 0.5
black_hair 6136 5722 4322 0.17 0.18 0.76 0.7 0.73 0.5
holding 5973 1473 958 0.04 0.18 0.65 0.16 0.26 0.5
ribbon 5811 2777 1793 0.08 0.17 0.65 0.31 0.42 0.5
very_long_hair 5596 2711 1866 0.08 0.17 0.69 0.33 0.45 0.5
bow 5592 2266 1743 0.07 0.17 0.77 0.31 0.44 0.5
thighhighs 5555 4785 4097 0.14 0.17 0.86 0.74 0.79 0.5
closed_mouth 5521 1775 937 0.05 0.16 0.53 0.17 0.26 0.5
red_eyes 5322 4422 3585 0.13 0.16 0.81 0.67 0.74 0.5
dress 5170 3355 2310 0.1 0.15 0.69 0.45 0.54 0.5
navel 5075 4240 3466 0.13 0.15 0.82 0.68 0.74 0.5
collarbone 5068 2117 1421 0.06 0.15 0.67 0.28 0.4 0.5
cleavage 4934 3717 2811 0.11 0.15 0.76 0.57 0.65 0.5
standing 4928 1255 644 0.04 0.15 0.51 0.13 0.21 0.5
bare_shoulders 4916 2040 1265 0.06 0.15 0.62 0.26 0.36 0.5
hat 4748 3792 3118 0.11 0.14 0.82 0.66 0.73 0.5
brown_eyes 4426 2839 2061 0.08 0.13 0.73 0.47 0.57 0.5
medium_breasts 4330 1994 1079 0.06 0.13 0.54 0.25 0.34 0.5
sitting 4280 2631 1897 0.08 0.13 0.72 0.44 0.55 0.5
twintails 4280 2824 2084 0.08 0.13 0.74 0.49 0.59 0.5
2girls 4003 3250 2596 0.1 0.12 0.8 0.65 0.72 0.5
sidelocks 3984 559 262 0.02 0.12 0.47 0.07 0.12 0.5
jewelry 3980 2394 1441 0.07 0.12 0.6 0.36 0.45 0.5
jacket 3904 1879 1252 0.06 0.12 0.67 0.32 0.43 0.5
white_shirt 3871 1009 706 0.03 0.12 0.7 0.18 0.29 0.5
underwear 3859 2587 2092 0.08 0.11 0.81 0.54 0.65 0.5
1boy 3748 3252 2247 0.1 0.11 0.69 0.6 0.64 0.5
black_legwear 3731 3037 2097 0.09 0.11 0.69 0.56 0.62 0.5
school_uniform 3637 3653 2390 0.11 0.11 0.65 0.66 0.66 0.5
animal_ears 3599 2828 2529 0.08 0.11 0.89 0.7 0.79 0.5
full_body 3539 2220 1493 0.07 0.11 0.67 0.42 0.52 0.5
hair_ribbon 3474 1593 967 0.05 0.1 0.61 0.28 0.38 0.5
:d 3447 1617 952 0.05 0.1 0.59 0.28 0.38 0.5
green_eyes 3339 2537 2159 0.08 0.1 0.85 0.65 0.73 0.5
upper_body 3329 2196 1287 0.07 0.1 0.59 0.39 0.47 0.5
pleated_skirt 3309 2010 1396 0.06 0.1 0.69 0.42 0.52 0.5
purple_eyes 3195 2662 1941 0.08 0.1 0.73 0.61 0.66 0.5
japanese_clothes 3184 2458 2149 0.07 0.09 0.87 0.67 0.76 0.5
comic 3152 3122 2892 0.09 0.09 0.93 0.92 0.92 0.5
panties 3151 2221 1815 0.07 0.09 0.82 0.58 0.68 0.5
short_sleeves 3124 1207 819 0.04 0.09 0.68 0.26 0.38 0.5
flower 3077 1903 1510 0.06 0.09 0.79 0.49 0.61 0.5
ahoge 3008 1924 1377 0.06 0.09 0.72 0.46 0.56 0.5
monochrome 2976 3145 2866 0.09 0.09 0.91 0.96 0.94 0.5
ponytail 2947 1707 1086 0.05 0.09 0.64 0.37 0.47 0.5
parted_lips 2933 321 158 0.01 0.09 0.49 0.05 0.1 0.5
hair_bow 2826 959 727 0.03 0.08 0.76 0.26 0.38 0.5
nipples 2826 2337 2140 0.07 0.08 0.92 0.76 0.83 0.5
yellow_eyes 2797 2164 1505 0.06 0.08 0.7 0.54 0.61 0.5
cowboy_shot 2757 1277 621 0.04 0.08 0.49 0.23 0.31 0.5
closed_eyes 2711 2362 1492 0.07 0.08 0.63 0.55 0.59 0.5
greyscale 2692 2634 2509 0.08 0.08 0.95 0.93 0.94 0.5
braid 2684 1552 1070 0.05 0.08 0.69 0.4 0.51 0.5
pink_hair 2649 2154 1798 0.06 0.08 0.83 0.68 0.75 0.5
ass 2579 1857 1478 0.06 0.08 0.8 0.57 0.67 0.5
silver_hair 2568 1809 1175 0.05 0.08 0.65 0.46 0.54 0.5
small_breasts 2501 624 406 0.02 0.07 0.65 0.16 0.26 0.5
weapon 2490 1718 1211 0.05 0.07 0.7 0.49 0.58 0.5
tail 2472 2119 1472 0.06 0.07 0.69 0.6 0.64 0.5
blue_hair 2453 2012 1599 0.06 0.07 0.79 0.65 0.72 0.5
kimono 2426 1591 1439 0.05 0.07 0.9 0.59 0.72 0.5
sweat 2415 710 400 0.02 0.07 0.56 0.17 0.26 0.5
purple_hair 2407 1950 1489 0.06 0.07 0.76 0.62 0.68 0.5
thighs 2350 160 67 0.0 0.07 0.42 0.03 0.05 0.5
boots 2265 2095 1190 0.06 0.07 0.57 0.53 0.55 0.5
heart 2243 879 572 0.03 0.07 0.65 0.26 0.37 0.5
open_clothes 2220 740 464 0.02 0.07 0.63 0.21 0.31 0.5
pantyhose 2191 1984 1442 0.06 0.07 0.73 0.66 0.69 0.5
swimsuit 2185 1927 1554 0.06 0.07 0.81 0.71 0.76 0.5
outdoors 2121 1058 675 0.03 0.06 0.64 0.32 0.42 0.5
wide_sleeves 2115 885 586 0.03 0.06 0.66 0.28 0.39 0.5
shiny 2101 61 24 0.0 0.06 0.39 0.01 0.02 0.5
frills 2096 484 319 0.01 0.06 0.66 0.15 0.25 0.5
white_legwear 2091 1178 904 0.04 0.06 0.77 0.43 0.55 0.5
earrings 2068 1106 664 0.03 0.06 0.6 0.32 0.42 0.5
lying 1992 1030 752 0.03 0.06 0.73 0.38 0.5 0.5
sleeveless 1953 412 262 0.01 0.06 0.64 0.13 0.22 0.5

Here's a breakdown of how well it's able to recognize individual tags. This is at >=50% confidence.

  • "Actual posts" is how many posts actually had the given tag.
  • "Predicted posts" is how many posts it predicted should have the tag.
  • "Correct predictions" is how many of those predictions it actually got right (true positives).
  • "Actual frequency" is how often the tag actually appeared on uploads, while "Predicted frequency" is how often it predicted the tag should appear. If things are working correctly, a tag's predicted frequency should be close to its actual frequency.

So for example, 1girl was actually tagged on 24.5k posts (73% of uploads), while we predicted it should be tagged on 25.1k posts (75% of uploads). 23.3k of those predictions were right, for a precision of 93% and a recall of 95%.
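
To make the column definitions concrete, here is a small sketch of how per-tag numbers like these could be computed from two mappings of post id to tag set. The data below is a made-up toy example, and the function and key names are hypothetical; this is not the code that produced the table.

```python
from collections import Counter


def per_tag_stats(actual_by_post, predicted_by_post):
    """Compute per-tag counts, frequencies, precision, recall and F1."""
    actual, predicted, correct = Counter(), Counter(), Counter()
    for post_id, actual_tags in actual_by_post.items():
        predicted_tags = predicted_by_post.get(post_id, set())
        actual.update(actual_tags)
        predicted.update(predicted_tags)
        correct.update(actual_tags & predicted_tags)  # true positives

    total_posts = len(actual_by_post)
    stats = {}
    for tag in actual.keys() | predicted.keys():
        precision = correct[tag] / predicted[tag] if predicted[tag] else 0.0
        recall = correct[tag] / actual[tag] if actual[tag] else 0.0
        total = precision + recall
        stats[tag] = {
            "actual_posts": actual[tag],
            "predicted_posts": predicted[tag],
            "correct_predictions": correct[tag],
            "predicted_frequency": predicted[tag] / total_posts,
            "actual_frequency": actual[tag] / total_posts,
            "precision": precision,
            "recall": recall,
            "f_score": 2 * precision * recall / total if total else 0.0,
        }
    return stats


# Toy data: four posts with their real tags and the model's predictions.
actual_by_post = {
    1: {"1girl", "solo"},
    2: {"1girl"},
    3: {"solo", "monochrome"},
    4: {"1girl", "monochrome"},
}
predicted_by_post = {
    1: {"1girl", "solo"},
    2: {"1girl", "solo"},
    3: {"monochrome"},
    4: {"1girl"},
}

stats = per_tag_stats(actual_by_post, predicted_by_post)
for tag in sorted(stats, key=lambda t: stats[t]["f_score"], reverse=True):
    print(tag, stats[tag])
```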

Performance Per Tag (Most Accurate Tags)
Tag Actual_posts Predicted_posts Correct_predictions Predicted_frequency Actual_frequency Precision Recall F_score Min_confidence
1girl 24551 25143 23305 0.75 0.73 0.93 0.95 0.94 0.5
monochrome 2976 3145 2866 0.09 0.09 0.91 0.96 0.94 0.5
greyscale 2692 2634 2509 0.08 0.08 0.95 0.93 0.94 0.5
abigail_williams_(fate/grand_order) 219 201 197 0.01 0.01 0.98 0.9 0.94 0.5
solo 20281 21571 19202 0.64 0.6 0.89 0.95 0.92 0.5
comic 3152 3122 2892 0.09 0.09 0.93 0.92 0.92 0.5
bunnysuit 229 222 195 0.01 0.01 0.88 0.85 0.86 0.5
jeanne_d'arc_(fate)_(all) 290 232 221 0.01 0.01 0.95 0.76 0.85 0.5
long_hair 20084 20259 16898 0.6 0.6 0.83 0.84 0.84 0.5
blonde_hair 6238 5975 5131 0.18 0.19 0.86 0.82 0.84 0.5
multiple_girls 6337 5653 4995 0.17 0.19 0.88 0.79 0.83 0.5
nipples 2826 2337 2140 0.07 0.08 0.92 0.76 0.83 0.5
hatsune_miku 229 198 178 0.01 0.01 0.9 0.78 0.83 0.5
breasts 15349 13374 11606 0.4 0.46 0.87 0.76 0.81 0.5
blue_eyes 7442 7560 6074 0.23 0.22 0.8 0.82 0.81 0.5
artoria_pendragon_(all) 289 275 229 0.01 0.01 0.83 0.79 0.81 0.5
4koma 613 554 465 0.02 0.02 0.84 0.76 0.8 0.5
orange_bow 246 189 173 0.01 0.01 0.92 0.7 0.8 0.5
thighhighs 5555 4785 4097 0.14 0.17 0.86 0.74 0.79 0.5
animal_ears 3599 2828 2529 0.08 0.11 0.89 0.7 0.79 0.5
hakurei_reimu 233 163 156 0.0 0.01 0.96 0.67 0.79 0.5
sex 775 821 619 0.02 0.02 0.75 0.8 0.78 0.5
kaga_(kantai_collection) 198 189 151 0.01 0.01 0.8 0.76 0.78 0.5
penis 1067 1046 818 0.03 0.03 0.78 0.77 0.77 0.5
japanese_clothes 3184 2458 2149 0.07 0.09 0.87 0.67 0.76 0.5
swimsuit 2185 1927 1554 0.06 0.07 0.81 0.71 0.76 0.5
one_eye_closed 1797 1575 1276 0.05 0.05 0.81 0.71 0.76 0.5
pink_hair 2649 2154 1798 0.06 0.08 0.83 0.68 0.75 0.5
santa_hat 258 194 169 0.01 0.01 0.87 0.66 0.75 0.5
red_eyes 5322 4422 3585 0.13 0.16 0.81 0.67 0.74 0.5
navel 5075 4240 3466 0.13 0.15 0.82 0.68 0.74 0.5
one-piece_swimsuit 367 327 258 0.01 0.01 0.79 0.7 0.74 0.5
black_hair 6136 5722 4322 0.17 0.18 0.76 0.7 0.73 0.5
hat 4748 3792 3118 0.11 0.14 0.82 0.66 0.73 0.5
green_eyes 3339 2537 2159 0.08 0.1 0.85 0.65 0.73 0.5
bunny_ears 570 421 361 0.01 0.02 0.86 0.63 0.73 0.5
open_mouth 10856 10477 7641 0.31 0.32 0.73 0.7 0.72 0.5
brown_hair 6628 5763 4438 0.17 0.2 0.77 0.67 0.72 0.5
2girls 4003 3250 2596 0.1 0.12 0.8 0.65 0.72 0.5
blue_hair 2453 2012 1599 0.06 0.07 0.79 0.65 0.72 0.5
kimono 2426 1591 1439 0.05 0.07 0.9 0.59 0.72 0.5
bikini 1660 1438 1112 0.04 0.05 0.77 0.67 0.72 0.5
nude 1705 1584 1179 0.05 0.05 0.74 0.69 0.72 0.5
green_hair 1336 1084 874 0.03 0.04 0.81 0.65 0.72 0.5
witch_hat 324 236 202 0.01 0.01 0.86 0.62 0.72 0.5
blush 15128 16484 11274 0.49 0.45 0.68 0.75 0.71 0.5
fox_ears 700 515 429 0.02 0.02 0.83 0.61 0.71 0.5
maid_headdress 531 422 338 0.01 0.02 0.8 0.64 0.71 0.5
looking_at_viewer 16656 15974 11383 0.48 0.5 0.71 0.68 0.7 0.5
pantyhose 2191 1984 1442 0.06 0.07 0.73 0.66 0.69 0.5
hetero 1192 1351 880 0.04 0.04 0.65 0.74 0.69 0.5
mob_cap 298 201 172 0.01 0.01 0.86 0.58 0.69 0.5
skirt 7412 5523 4376 0.16 0.22 0.79 0.59 0.68 0.5
panties 3151 2221 1815 0.07 0.09 0.82 0.58 0.68 0.5
purple_hair 2407 1950 1489 0.06 0.07 0.76 0.62 0.68 0.5
kirisame_marisa 189 118 105 0.0 0.01 0.89 0.56 0.68 0.5
ass 2579 1857 1478 0.06 0.08 0.8 0.57 0.67 0.5
vaginal 567 602 394 0.02 0.02 0.65 0.69 0.67 0.5
large_breasts 6356 5487 3897 0.16 0.19 0.71 0.61 0.66 0.5
gloves 6322 5786 3979 0.17 0.19 0.69 0.63 0.66 0.5
school_uniform 3637 3653 2390 0.11 0.11 0.65 0.66 0.66 0.5
purple_eyes 3195 2662 1941 0.08 0.1 0.73 0.61 0.66 0.5
fox_tail 522 421 309 0.01 0.02 0.73 0.59 0.66 0.5
smile 13283 12572 8463 0.37 0.4 0.67 0.64 0.65 0.5
cleavage 4934 3717 2811 0.11 0.15 0.76 0.57 0.65 0.5
underwear 3859 2587 2092 0.08 0.11 0.81 0.54 0.65 0.5
serafuku 1864 1827 1202 0.05 0.06 0.66 0.64 0.65 0.5
red_hair 1841 1391 1055 0.04 0.05 0.76 0.57 0.65 0.5
cum 797 881 549 0.03 0.02 0.62 0.69 0.65 0.5
1boy 3748 3252 2247 0.1 0.11 0.69 0.6 0.64 0.5
tail 2472 2119 1472 0.06 0.07 0.69 0.6 0.64 0.5
censored 1184 976 695 0.03 0.04 0.71 0.59 0.64 0.5
scenery 206 219 137 0.01 0.01 0.63 0.67 0.64 0.5
oral 217 197 133 0.01 0.01 0.68 0.61 0.64 0.5
cat_ears 1002 700 539 0.02 0.03 0.77 0.54 0.63 0.5
maid 386 334 225 0.01 0.01 0.67 0.58 0.63 0.5
black_legwear 3731 3037 2097 0.09 0.11 0.69 0.56 0.62 0.5
beach 237 224 142 0.01 0.01 0.63 0.6 0.62 0.5
hair_tubes 311 205 159 0.01 0.01 0.78 0.51 0.62 0.5
short_hair 8383 7045 4696 0.21 0.25 0.67 0.56 0.61 0.5
flower 3077 1903 1510 0.06 0.09 0.79 0.49 0.61 0.5
yellow_eyes 2797 2164 1505 0.06 0.08 0.7 0.54 0.61 0.5
male_focus 1181 1215 735 0.04 0.04 0.6 0.62 0.61 0.5
pokemon_(creature) 304 205 155 0.01 0.01 0.76 0.51 0.61 0.5
umbrella 428 271 213 0.01 0.01 0.79 0.5 0.61 0.5
hair_flaps 396 238 194 0.01 0.01 0.82 0.49 0.61 0.5
hair_ornament 7247 4580 3525 0.14 0.22 0.77 0.49 0.6 0.5
glasses 1452 788 673 0.02 0.04 0.85 0.46 0.6 0.5
cat_tail 534 353 266 0.01 0.02 0.75 0.5 0.6 0.5
striped_legwear 342 256 178 0.01 0.01 0.7 0.52 0.6 0.5
headpiece 169 99 80 0.0 0.01 0.81 0.47 0.6 0.5
paws 196 102 90 0.0 0.01 0.88 0.46 0.6 0.5
twintails 4280 2824 2084 0.08 0.13 0.74 0.49 0.59 0.5
closed_eyes 2711 2362 1492 0.07 0.08 0.63 0.55 0.59 0.5
wings 1197 740 572 0.02 0.04 0.77 0.48 0.59 0.5
pussy 1123 694 540 0.02 0.03 0.78 0.48 0.59 0.5
cum_in_pussy 365 330 205 0.01 0.01 0.62 0.56 0.59 0.5
china_dress 192 129 95 0.0 0.01 0.74 0.49 0.59 0.5
teddy_bear 211 192 119 0.01 0.01 0.62 0.56 0.59 0.5
weapon 2490 1718 1211 0.05 0.07 0.7 0.49 0.58 0.5

Performance Per Tag (Least Accurate Tags)
Tag Actual_posts Predicted_posts Correct_predictions Predicted_frequency Actual_frequency Precision Recall F_score Min_confidence
hand_up 1822 8 3 0.0 0.05 0.38 0.0 0.0 0.5
shiny 2101 61 24 0.0 0.06 0.39 0.01 0.02 0.5
head_tilt 1949 62 18 0.0 0.06 0.29 0.01 0.02 0.5
thighs 2350 160 67 0.0 0.07 0.42 0.03 0.05 0.5
parted_lips 2933 321 158 0.01 0.09 0.49 0.05 0.1 0.5
artist_name 1532 352 93 0.01 0.05 0.26 0.06 0.1 0.5
white_background 7191 565 446 0.02 0.21 0.79 0.06 0.12 0.5
sidelocks 3984 559 262 0.02 0.12 0.47 0.07 0.12 0.5
teeth 1663 201 114 0.01 0.05 0.57 0.07 0.12 0.5
miniskirt 1773 221 128 0.01 0.05 0.58 0.07 0.13 0.5
simple_background 8386 947 672 0.03 0.25 0.71 0.08 0.14 0.5
grey_background 1637 184 143 0.01 0.05 0.78 0.09 0.16 0.5
signature 1643 418 183 0.01 0.05 0.44 0.11 0.18 0.5
collared_shirt 1642 332 186 0.01 0.05 0.56 0.11 0.19 0.5
alternate_costume 1587 452 204 0.01 0.05 0.45 0.13 0.2 0.5
standing 4928 1255 644 0.04 0.15 0.51 0.13 0.21 0.5
sleeveless 1953 412 262 0.01 0.06 0.64 0.13 0.22 0.5
frills 2096 484 319 0.01 0.06 0.66 0.15 0.25 0.5
indoors 1567 471 256 0.01 0.05 0.54 0.16 0.25 0.5
holding 5973 1473 958 0.04 0.18 0.65 0.16 0.26 0.5
closed_mouth 5521 1775 937 0.05 0.16 0.53 0.17 0.26 0.5
small_breasts 2501 624 406 0.02 0.07 0.65 0.16 0.26 0.5
sweat 2415 710 400 0.02 0.07 0.56 0.17 0.26 0.5
white_shirt 3871 1009 706 0.03 0.12 0.7 0.18 0.29 0.5
sailor_collar 1527 410 284 0.01 0.05 0.69 0.19 0.29 0.5
cowboy_shot 2757 1277 621 0.04 0.08 0.49 0.23 0.31 0.5
open_clothes 2220 740 464 0.02 0.07 0.63 0.21 0.31 0.5
black_skirt 1781 474 345 0.01 0.05 0.73 0.19 0.31 0.5
eyebrows_visible_through_hair 11961 3827 2531 0.11 0.36 0.66 0.21 0.32 0.5
black_gloves 1866 690 422 0.02 0.06 0.61 0.23 0.33 0.5
medium_breasts 4330 1994 1079 0.06 0.13 0.54 0.25 0.34 0.5
white_hair 1788 787 453 0.02 0.05 0.58 0.25 0.35 0.5
bare_shoulders 4916 2040 1265 0.06 0.15 0.62 0.26 0.36 0.5
heart 2243 879 572 0.03 0.07 0.65 0.26 0.37 0.5
hair_ribbon 3474 1593 967 0.05 0.1 0.61 0.28 0.38 0.5
:d 3447 1617 952 0.05 0.1 0.59 0.28 0.38 0.5
short_sleeves 3124 1207 819 0.04 0.09 0.68 0.26 0.38 0.5
hair_bow 2826 959 727 0.03 0.08 0.76 0.26 0.38 0.5
multicolored_hair 1876 664 478 0.02 0.06 0.72 0.25 0.38 0.5
striped 1687 1419 589 0.04 0.05 0.42 0.35 0.38 0.5
wide_sleeves 2115 885 586 0.03 0.06 0.66 0.28 0.39 0.5
shirt 7515 2974 2090 0.09 0.22 0.7 0.28 0.4 0.5
collarbone 5068 2117 1421 0.06 0.15 0.67 0.28 0.4 0.5
choker 1784 877 538 0.03 0.05 0.61 0.3 0.4 0.5
bangs 13275 5718 3853 0.17 0.4 0.67 0.29 0.41 0.5
long_sleeves 8674 3610 2498 0.11 0.26 0.69 0.29 0.41 0.5
hair_between_eyes 7671 3455 2270 0.1 0.23 0.66 0.3 0.41 0.5
food 1600 825 492 0.02 0.05 0.6 0.31 0.41 0.5
shoes 1800 781 533 0.02 0.05 0.68 0.3 0.41 0.5
ribbon 5811 2777 1793 0.08 0.17 0.65 0.31 0.42 0.5
earrings 2068 1106 664 0.03 0.06 0.6 0.32 0.42 0.5
outdoors 2121 1058 675 0.03 0.06 0.64 0.32 0.42 0.5
jacket 3904 1879 1252 0.06 0.12 0.67 0.32 0.43 0.5
bow 5592 2266 1743 0.07 0.17 0.77 0.31 0.44 0.5
very_long_hair 5596 2711 1866 0.08 0.17 0.69 0.33 0.45 0.5
jewelry 3980 2394 1441 0.07 0.12 0.6 0.36 0.45 0.5
shorts 1643 794 566 0.02 0.05 0.71 0.34 0.46 0.5
upper_body 3329 2196 1287 0.07 0.1 0.59 0.39 0.47 0.5
ponytail 2947 1707 1086 0.05 0.09 0.64 0.37 0.47 0.5
elbow_gloves 1532 1033 598 0.03 0.05 0.58 0.39 0.47 0.5
hairclip 1664 892 604 0.03 0.05 0.68 0.36 0.47 0.5
detached_sleeves 1624 940 634 0.03 0.05 0.67 0.39 0.49 0.5
lying 1992 1030 752 0.03 0.06 0.73 0.38 0.5 0.5
braid 2684 1552 1070 0.05 0.08 0.69 0.4 0.51 0.5
necktie 1671 1045 688 0.03 0.05 0.66 0.41 0.51 0.5
full_body 3539 2220 1493 0.07 0.11 0.67 0.42 0.52 0.5
pleated_skirt 3309 2010 1396 0.06 0.1 0.69 0.42 0.52 0.5
dress 5170 3355 2310 0.1 0.15 0.69 0.45 0.54 0.5
silver_hair 2568 1809 1175 0.05 0.08 0.65 0.46 0.54 0.5
sitting 4280 2631 1897 0.08 0.13 0.72 0.44 0.55 0.5
boots 2265 2095 1190 0.06 0.07 0.57 0.53 0.55 0.5
white_legwear 2091 1178 904 0.04 0.06 0.77 0.43 0.55 0.5
ahoge 3008 1924 1377 0.06 0.09 0.72 0.46 0.56 0.5
hairband 1873 1371 908 0.04 0.06 0.66 0.48 0.56 0.5
hair_flower 1793 1035 795 0.03 0.05 0.77 0.44 0.56 0.5
brown_eyes 4426 2839 2061 0.08 0.13 0.73 0.47 0.57 0.5
sky 1853 1274 885 0.04 0.06 0.69 0.48 0.57 0.5
weapon 2490 1718 1211 0.05 0.07 0.7 0.49 0.58 0.5
twintails 4280 2824 2084 0.08 0.13 0.74 0.49 0.59 0.5
closed_eyes 2711 2362 1492 0.07 0.08 0.63 0.55 0.59 0.5
hair_ornament 7247 4580 3525 0.14 0.22 0.77 0.49 0.6 0.5
short_hair 8383 7045 4696 0.21 0.25 0.67 0.56 0.61 0.5
flower 3077 1903 1510 0.06 0.09 0.79 0.49 0.61 0.5
yellow_eyes 2797 2164 1505 0.06 0.08 0.7 0.54 0.61 0.5
black_legwear 3731 3037 2097 0.09 0.11 0.69 0.56 0.62 0.5
1boy 3748 3252 2247 0.1 0.11 0.69 0.6 0.64 0.5
tail 2472 2119 1472 0.06 0.07 0.69 0.6 0.64 0.5
smile 13283 12572 8463 0.37 0.4 0.67 0.64 0.65 0.5
cleavage 4934 3717 2811 0.11 0.15 0.76 0.57 0.65 0.5
underwear 3859 2587 2092 0.08 0.11 0.81 0.54 0.65 0.5
serafuku 1864 1827 1202 0.05 0.06 0.66 0.64 0.65 0.5
red_hair 1841 1391 1055 0.04 0.05 0.76 0.57 0.65 0.5
large_breasts 6356 5487 3897 0.16 0.19 0.71 0.61 0.66 0.5
gloves 6322 5786 3979 0.17 0.19 0.69 0.63 0.66 0.5
school_uniform 3637 3653 2390 0.11 0.11 0.65 0.66 0.66 0.5
purple_eyes 3195 2662 1941 0.08 0.1 0.73 0.61 0.66 0.5
ass 2579 1857 1478 0.06 0.08 0.8 0.57 0.67 0.5
skirt 7412 5523 4376 0.16 0.22 0.79 0.59 0.68 0.5
panties 3151 2221 1815 0.07 0.09 0.82 0.58 0.68 0.5
purple_hair 2407 1950 1489 0.06 0.07 0.76 0.62 0.68 0.5

Here are the top 100 best and worst tags. Some observations:

  • It does very well on 1girl and solo, which is surprising considering the number of difficult corner cases these tags can have.
  • The worst tags tend to be very common features that are frequently tagged on new uploads, but not on old uploads. This can happen if a tag is only used by certain power uploaders (but not by other users in general), or if it only recently came into widespread use. For example, eyebrows visible through hair is used on 36% of all uploads from this year, but on nearly 0% of uploads before 2016. Training tags like this is probably very difficult given the huge number of false negatives they will have.

It should be emphasized that all precision and recall values here are underestimates. Many predictions are currently counted as wrong due to missing tags, but a tag not being present on a post doesn't necessarily mean the post shouldn't have the tag. So as missing tags are added, these numbers will improve.
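
A quick way to see why: if some of the "wrong" predictions turn out to be valid once the missing tag is added to the post, they move from false positives to true positives, and the tag's actual count grows by the same amount. The sketch below uses the blush row from the table and a purely hypothetical figure of 1000 recovered tags; the real number is unknown.

```python
def precision_recall(correct, predicted, actual):
    return correct / predicted, correct / actual


def after_gardening(correct, predicted, actual, recovered):
    """Metrics after `recovered` false positives are found to be valid tags.

    Each recovered prediction becomes a true positive, and the post gains
    the tag, so the actual count rises by the same amount.
    """
    if not 0 <= recovered <= predicted - correct:
        raise ValueError("recovered cannot exceed the number of false positives")
    return precision_recall(correct + recovered, predicted, actual + recovered)


# blush row: 15128 actual, 16484 predicted, 11274 correct.
before = precision_recall(correct=11274, predicted=16484, actual=15128)
# Hypothetical: 1000 of the 5210 "wrong" predictions were really missing tags.
after = after_gardening(correct=11274, predicted=16484, actual=15128, recovered=1000)

print(f"before: precision={before[0]:.2f} recall={before[1]:.2f}")
print(f"after:  precision={after[0]:.2f} recall={after[1]:.2f}")
```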

Overall, while there's still ample room for improvement, this is already good enough for many purposes, including suggesting tags during editing and finding missing tags.
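
As a rough illustration of the missing-tag use case, here is a minimal sketch that takes a post's existing tags plus a mapping of predicted tag to confidence and returns the confident predictions the post doesn't have yet. The function name, the input format, and the sample confidences are all assumptions for the example, not the model's actual output format.

```python
def suggest_missing_tags(confidences, existing_tags, min_confidence=0.5):
    """Return (tag, confidence) pairs predicted but not yet on the post,
    most confident first."""
    candidates = [
        (tag, conf)
        for tag, conf in confidences.items()
        if conf >= min_confidence and tag not in existing_tags
    ]
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Hypothetical model output for one post.
confidences = {
    "1girl": 0.99,
    "solo": 0.97,
    "blue_eyes": 0.81,
    "thighhighs": 0.64,
    "hat": 0.12,
}
existing_tags = {"1girl", "solo"}

for tag, conf in suggest_missing_tags(confidences, existing_tags):
    print(f"{tag}: {conf:.2f}")
```

Raising min_confidence trades fewer suggestions for fewer wrong ones, in line with the precision numbers above.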

I implemented something similar previously, but in practice I didn't find the tags it returned to be useful. They were incredibly obvious, common tags like 1girl or monochrome. It could, for example, identify popular FGO characters, but for frequent uploaders those are some of the least important things that need to be identified. It's kind of a catch-22: an autotagger would be most useful for lesser known tags, but an autotagger would never learn them because there aren't enough examples of them.

The way users apply tags on Danbooru makes it not suitable for ML training applications like this because tags will be used even if they only describe a small percentage of the image, meaning the data set is very noisy. There's also a heavy bias towards new shows, games and characters which will have small training sets and will probably not get identified unless they get so popular that everyone knows about them.

It's a neat idea and I'm glad projects like this exist, but it just doesn't match the needs of most uploaders.

I am hoping better neural nets come out that can deal with things like rotation and flipping better.

albert said:

It's a neat idea and I'm glad projects like this exist, but it just doesn't match the needs of most uploaders.

I disagree. This is usually able to correctly identify 10-15 tags per upload. That's not bad at all. Even when you know everything it gives you, having half your tags handed to you is very convenient. It saves you the trouble of manually typing every tag out, then running everything through related tags to make sure you didn't forget anything.

It's not the case that it only finds very common tags either. It's often able to find surprisingly specific tags:

Tag suggestions aren't the only use case for this. It's extremely useful for tag gardening. It makes finding missing tags a lot easier:

albert said:

The way users apply tags on Danbooru makes it not suitable for ML training applications like this because tags will be used even if they only describe a small percentage of the image, meaning the data set is very noisy.

I'm not sure how true this is. Some tags are noisier than others, but the difficult tags aren't necessarily the ones you would expect. This is pretty good at tagging eye colors, for example, even though eyes are a very small feature and color tagging is noisy. On the other hand, it's bad at white background, even though this is a big, easy to recognize feature with fairly consistent tagging.

albert said:

I am hoping better neural nets come out that can deal with things like rotation and flipping better.

I'd think rotation and flipping shouldn't be an issue if you rotate and flip images during training, which is normal practice anyway to augment the training set.
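
To be clear about what I mean by augmentation, here is a minimal pure-Python sketch, with images as nested lists of pixel values. It only does mirroring and 90-degree rotations; a real training pipeline would normally use an image library and could rotate by arbitrary angles. I don't know whether the OP's model does any of this, and the function names are made up.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image, rng):
    """Return a randomly flipped and rotated copy of the image."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    for _ in range(rng.randrange(4)):  # 0, 90, 180 or 270 degrees
        image = rotate_90_clockwise(image)
    return [list(row) for row in image]


image = [
    [1, 2],
    [3, 4],
]
rng = random.Random(0)  # seeded so runs are repeatable
for _ in range(3):
    print(augment(image, rng))
```

Each training pass would then see a differently oriented copy of the same image with the same tags. One caveat is that mirroring can make direction-sensitive tags wrong, so those may need special handling.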

I'd be very interested to know more about the approach you used when you tried this, to get a better idea of how it compares with the OP.

evazion said:

I personally feel like it's slower that way. It's the same as with tag gardening: you not only have to look for which tags are missing, you also have to scan for incorrect tags. I can tag an image from scratch faster than I can correct a list of tags given to me.
