Danbooru

New Meta tag: Datamine

Posted under General

Perhaps not datamine, but now that I think of it, I don't think there is a way to locate all posts that have text (i.e. not source:none) but don't have a link in the source field. Below are a couple of spitballed names, though if someone has got something better, we could go with that.

  • nonlink_source
  • nonweb_source
  • no_source_link

It would be relatively easy to tag all such posts just by scanning them and processing them through the API, so there aren't any limiting factors on implementing it.
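As a rough illustration of the scan described above, here is a minimal sketch. It assumes posts have already been fetched as dictionaries with `id` and `source` keys, and that "a link" means anything starting with `http://` or `https://`. The function names and the sample data are made up for the example, and this is not necessarily how the actual script works.

```python
import re

# Assumption: a source counts as a web link if it starts with http:// or https://.
_WEB_LINK = re.compile(r"^https?://", re.IGNORECASE)


def has_non_web_source(source):
    """Return True when the source field has text that is not a web link."""
    text = (source or "").strip()
    return bool(text) and not _WEB_LINK.match(text)


def posts_to_tag(posts):
    """Collect the ids of posts whose source is non-empty but not a link."""
    return [post["id"] for post in posts if has_non_web_source(post.get("source"))]


# Hypothetical sample data standing in for posts returned by the API.
sample = [
    {"id": 1, "source": "https://example.com/art/1"},
    {"id": 2, "source": "Comic Magazine Vol. 3"},
    {"id": 3, "source": ""},
]
print(posts_to_tag(sample))  # [2]
```

Only the second post qualifies: the first has a link and the third has no source at all, which matches the "has text but no link" condition.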

This would be a useful tag, but can we call it something like game_asset instead? I've probably lost this battle before I even start, but data mining has an established definition and this isn't it.

BrokenEagle98 said:

Perhaps not datamine, but now that I think of it, I don't think there is a way to locate all posts that have text (i.e. not source:none) but don't have a link in the source field. Below are a couple of spitballed names, though if someone has got something better, we could go with that.

That could be useful too, but not as a replacement for OP's suggestion. We shouldn't lump together digital assets extracted from a game with scans of analog media.

Of your names, nonweb_source is my favourite, though I'd have spelt it "non-web_source".

☆♪ said:

That could be useful too, but not as a replacement for OP's suggestion. We shouldn't lump together digital assets extracted from a game with scans of analog media.

Of your names, nonweb_source is my favourite, though I'd have spelt it "non-web_source".

Yeah, I wasn't thinking of it as a replacement, but more like a kind of umbrella term (for all of the non-web sources).

If there are no objections, I'll go ahead and start working on how to scan and update all posts.

Alright, well that was easier than expected (the coding I mean). I've started going through the posts, and should be done by sometime tomorrow.

Also, I created a wiki for Non-Web Source. In it I mention both scans (for analog) and game assets (for digital). Take a look and change as necessary.

Also, although I added game asset to the wiki, it can still be changed if desired. Now that I look at it, maybe digital asset should be used instead, to also cover those digital image packages that can be downloaded from various publishing sites.

Yeah, I forgot to leave my computer on last night, so it lost all of those hours of processing, but it's on again now and it should probably take the rest of today to finish.

Mysterious_Uploader said:

I just checked Non-Web Source. Amazing work, but some gardening is needed. I'm gonna start and try to clean it up a bit, but help would be appreciated.

Could you give an example or two? If it's something my script can account for, it would make the gardening a lot faster.

Mysterious_Uploader said:

On that note, someone changed the source of some scans I uploaded to exhentai/nhentai links. Since it isn't a web source, could you write a script that detects images "sourced" with nhentai/exhentai links?

You mean like the tag search source:https://exhentai.org/* ...?
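If a local check is preferred over the wildcard search, something like the sketch below could flag those sources. The host list is an assumption based on the two sites named in the request, and `is_gallery_source` is a hypothetical helper, not part of any existing script.

```python
from urllib.parse import urlparse

# Assumption: these are the hosts to flag; adjust the set as needed.
GALLERY_HOSTS = {"exhentai.org", "nhentai.net"}


def is_gallery_source(source):
    """Return True when the source is a link to one of the flagged hosts."""
    host = urlparse((source or "").strip()).hostname or ""
    # Match the bare host as well as any subdomain of it.
    return any(host == h or host.endswith("." + h) for h in GALLERY_HOSTS)


print(is_gallery_source("https://exhentai.org/g/1/abc/"))  # True
print(is_gallery_source("Comic Magazine Vol. 3"))          # False
```

Comparing on the parsed hostname rather than on a substring avoids flagging unrelated links that merely mention one of the site names in their path.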

Mysterious_Uploader said:

post #3294015

If it's truly the same image as claimed on that post, then why not just use that image link for the source and mark the post with bad tumblr id? I do the same all the time when I'm uploading Twitter or other posts that no longer exist but whose files I still have on my hard drive.

But yeah, that post needs to be fixed up and have the non-web source removed. Anyways, it's my opinion that anything that isn't a web link, whether dead or not, should be considered a non-web source.
