Upscaled waifu2x Images

Posted under General

Took the liberty of upscaling 200 images for myself and figured I'd share; I tried a variety of art styles (colors, complexity, etc.) and sources (Pixiv, Twitter, scans) to get a feel for the upscaler's abilities. I threw in a bunch of the low-res Sword Girls cards and upscaled them as much as 3.2x (2x first, then 1.6x), and noticed that some of the simpler cards had a negligible loss in quality.

Gallery can be found here on imgur. I limited uploads to those under 5 MB so that imgur doesn't downsample them. The images aren't in any particular order, though most of the rating:q and rating:e stuff is grouped together.

I guess it's possible to have a "create upscale" link somewhere. The downside is that waifu2x operates with POST parameters, so it's hard to just pass it a link. Also, forking the project seems problematic: first, it really is a neural network, so it needs proper training -- not that we have a shortage of high-quality images -- but second, it's written in Lua and requires an NVIDIA card/CUDA to operate, so I don't know whether it's even possible to install on the Danbooru server, which most probably doesn't even have a video card.
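
For illustration, a "create upscale" hook would have to submit a form POST rather than a plain link. A minimal sketch, assuming the endpoint and field names below (they're modeled on the public demo form, not any documented API):

```python
# Hypothetical sketch of scripting the waifu2x web demo.
# The endpoint URL and field names are assumptions and may not
# match the real demo; treat this as pseudocode for a form POST.
import requests

resp = requests.post(
    "http://waifu2x.udp.jp/api",                 # assumed endpoint
    data={
        "url": "https://example.com/image.png",  # source image URL
        "scale": "2",                            # 2x upscale
        "noise": "1",                            # light denoising
        "style": "art",                          # anime/art model
    },
    timeout=60,
)
resp.raise_for_status()
with open("upscaled.png", "wb") as f:
    f.write(resp.content)
```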

Type-kun said:

first, it really is a neural network, so it needs proper training

Is this waifu2x's training? If so, we could use it without having to retrain it ourselves.
https://github.com/nagadomi/waifu2x/tree/master/models

Type-kun said:

it's written in Lua and requires an NVIDIA card/CUDA to operate, so I don't know whether it's even possible to install on the Danbooru server, which most probably doesn't even have a video card.

There's a re-implementation written in C++; I believe it uses the CPU instead of the GPU:
https://github.com/WL-Amigo/waifu2x-converter-cpp (GUI version)
It's very slow though, taking several minutes to do a single image for me even though my CPU is pretty good.

There's also this re-implementation that gives the option to choose between CPU and GPU:
https://github.com/lltcggie/waifu2x-caffe

Toks said:

Is this waifu2x's training? If so, we could use it without having to retrain it ourselves.
https://github.com/nagadomi/waifu2x/tree/master/models

Not sure -- I have almost no experience with neural networks, and I'm not sure where and how this particular one stores its data.
However, there's an issue complaining that after retraining (albeit using the caffe version) the results are inferior.

Toks said:

It's very slow though, taking several minutes to do a single image for me even though my CPU is pretty good.

Well, requests to the waifu2x demo take several seconds each as well. It's probably possible to contact the author and convince them to collaborate and allow GET requests to the API, but I'm not sure their server would sustain the load if we directed our userbase there.

Just to toss in my 2¢, I'm against allowing waifu2x upscales.

If at some point in the future we can install it on Danbooru, great. Or if someone is willing to host it for us (like how iqdb used to provide our similar-image search functionality), also great. Otherwise, users who want upscales can go do it themselves on the waifu2x site.

I don't think we should get into trying to form criteria about some upscales being better and thus permissible whereas others are not, or about using it for noise reduction, or whatever -- it's a slippery slope once you start going there, and it's going to be tricky to form consistent criteria. Better not to permit them in the first place than have to deal with the redundancy/bloat later.

For approval purposes, approvers will unfortunately have to bear the burden. I guess they'd have to be more wary of unsourced posts in particular. Perhaps integrate md5 checks into Danbooru, firing on upload or on a change to the source field; an md5 mismatch flagged by the system would be grounds for closer scrutiny. Posts that slipped past should get tagged with waifu2x and upscaled, and should remain frowned upon.

Edit: nvm re: md5 checks -- if a post gets uploaded by source URL, there's no way it'd be an md5 mismatch. Maybe still of some worth if the source field gets edited, though.
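
For what it's worth, the check I had in mind would be roughly the sketch below -- re-hash the file behind the source URL whenever the source field changes and compare it against the stored post hash. All names here are hypothetical, not actual Danbooru code:

```python
# Hypothetical md5-mismatch check, fired when the source field is edited.
import hashlib
import requests

def md5_of_source(source_url: str) -> str:
    """Fetch the file behind the source URL and return its md5 hex digest."""
    resp = requests.get(source_url, timeout=30)
    resp.raise_for_status()
    return hashlib.md5(resp.content).hexdigest()

def needs_scrutiny(post_md5: str, source_url: str) -> bool:
    """True if the stored post hash differs from the source file's hash."""
    return md5_of_source(source_url) != post_md5
```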


One additional problem with waifu2x uploads: they interfere with training future upscaler/inpainter/etc. NNs, because you're implicitly presenting the earlier waifu2x upscales as true human-created artwork to be learned from, rather than as the best guesses of an earlier, cruder upscaler. You don't want to feed NN outputs back into NNs as inputs -- GIGO, essentially. It's not a problem as long as they remain a vanishingly small part of the Danbooru dataset, though, and one can at least filter them out using the tag.
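
Filtering by tag is cheap, at least. A minimal sketch, assuming per-line JSON post metadata with a tag_string field (the field name is an assumption):

```python
# Sketch: drop waifu2x/upscaled posts before training on the corpus.
import json

EXCLUDE = {"waifu2x", "upscaled"}

def clean_posts(metadata_path: str):
    """Yield posts whose tag list contains none of the excluded tags."""
    with open(metadata_path, encoding="utf-8") as f:
        for line in f:
            post = json.loads(line)
            tags = set(post.get("tag_string", "").split())
            if not tags & EXCLUDE:
                yield post
```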

gwern-bot said:

One additional problem with waifu2x uploads: they interfere with training future upscaler/inpainter/etc. NNs, because you're implicitly presenting the earlier waifu2x upscales as true human-created artwork to be learned from, rather than as the best guesses of an earlier, cruder upscaler. You don't want to feed NN outputs back into NNs as inputs -- GIGO, essentially. It's not a problem as long as they remain a vanishingly small part of the Danbooru dataset, though, and one can at least filter them out using the tag.

Agreed, though I do wonder what the purpose of bumping a two-year-old thread was.

It's currently written out in help:third-party edit that waifu2x uploads are not to be accepted (although I know a few uploaders who still use waifu2x for "fixing" their scans without explicitly stating it in the tags). Thing is, though, some approvers don't check an image's source before approving a post, which is a little problematic.

Agreed, though I do wonder what the purpose of bumping a two-year-old thread was.

I've made a Danbooru torrent, primarily for supporting ML stuff like waifu2x, and I was curious if anyone had done anything with neural networks & the Danbooru corpus before, so I was reading this & just wanted to mention that.

gwern-bot said:

I've made a Danbooru torrent, primarily for supporting ML stuff like waifu2x, and I was curious if anyone had done anything with neural networks & the Danbooru corpus before, so I was reading this & just wanted to mention that.

Hm, interesting! I've thought about it before, but there does already exist illust2vec (an ML algorithm that basically tries to estimate character, copyright, and general tags by learning from the input here). Currently it exists as a beta feature for Builder+ users (it guesses the characters it sees on the upload page).

But you may want to make a new thread for it, if that's your question.

Yeah, I've seen illustration2vec. Not to insult the creators, but I think it should be possible to do much better: they use a relative handful of possible tags, a small image dataset, minimal hyperparameter tuning or data augmentation, and a very obsolete CNN architecture.


gwern-bot said:

Yeah, I've seen illustration2vec. Not to insult the creators, but I think it should be possible to do much better: they use a relative handful of possible tags, a small image dataset, and a very obsolete CNN architecture.

I have no doubt you're right. I'd suggest opening a new topic if you want to make a push for that sort of feature to be adopted.

If someone makes a good tagger, maybe. I imagine it would be difficult to integrate into the Danbooru codebase/server: you'd have to install a lot of libraries to get it running even on the CPU, and then it'd probably take several seconds per image for a full-strength resnet tagger, so it would have trouble keeping up with just the new images... (I assume the server doesn't have a GPU!)
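
To be concrete about what "full-strength resnet tagger" means here: roughly a ResNet with a multi-label sigmoid head over the tag vocabulary. A minimal sketch -- the model, tag count, and threshold are placeholders, not any real Danbooru tagger:

```python
# Sketch of a multi-label ResNet tagger; sigmoid (not softmax) because a
# post can carry many tags at once. All sizes are placeholders.
import torch
import torchvision

NUM_TAGS = 1000  # placeholder vocabulary size

model = torchvision.models.resnet50(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, NUM_TAGS)
model.eval()

@torch.no_grad()
def predict_tags(image: torch.Tensor, threshold: float = 0.5) -> list[int]:
    """image: (1, 3, H, W) float tensor. Returns indices of predicted tags."""
    probs = torch.sigmoid(model(image))
    return (probs[0] > threshold).nonzero(as_tuple=True)[0].tolist()
```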

I was thinking more of taggers being useful for semi-automated editing of Danbooru images: take an 'active learning' approach and focus on adding the tags that the CNN identifies as missing, both to the local dataset metadata and to the live Danbooru website, which would improve Danbooru's tagging and also fix the errors that damage the CNN training the most. Heavy uploaders could then figure out how to install it locally to pre-generate tags.
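
The loop I have in mind is roughly the sketch below, reusing the hypothetical predict_tags from the previous sketch: surface high-confidence predictions that aren't already on the post, for a human to confirm:

```python
# Sketch of the active-learning step: propose tags the model is confident
# about but which are missing from the post's current tag list.
def propose_missing_tags(image, current_tags, vocab, threshold=0.9):
    """Return high-confidence predicted tags not already on the post."""
    predicted = {vocab[i] for i in predict_tags(image, threshold)}
    return sorted(predicted - set(current_tags))
```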

Even if no one does that, the tagger would be good for creating embeddings for GANs and style transfer -- right now the style2paints guy thinks that one reason anime style transfer doesn't work too well (e.g. the 'watercolor effect') is that the CNNs are trained on photographs rather than anime illustrations, so the features/embedding just don't encode what's necessary. Definitely lots of possibilities for fun applications beyond just slightly better waifu2x-style upscales.
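
And for embeddings, the same hypothetical tagger can double as a feature extractor by dropping its classification head -- again just a sketch:

```python
# Sketch: reuse the tagger's penultimate activations as an
# illustration-specific embedding (2048-d for ResNet-50).
encoder = torch.nn.Sequential(*list(model.children())[:-1])

@torch.no_grad()
def embed(image: torch.Tensor) -> torch.Tensor:
    """image: (1, 3, H, W) float tensor -> (1, 2048) embedding."""
    return encoder(image).flatten(1)
```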
