Danbooru

Danbooru Archives

葉月 said:
Thanks. Have you thought about how to tackle running updates? As I understand, you have enough data/infrastructure to determine "files newer than X", can you use that to generate periodic update torrents?

Yes, I've already written a script to do that. How frequently should I create update torrents?
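For anyone curious what the "files newer than X" selection could look like, here is a minimal sketch. It is not the actual script mentioned above. It assumes the archive is a plain directory tree and that file modification times are a good enough proxy for "newer"; the function name `files_newer_than` and the sample paths are made up for illustration.

```python
import os
import tempfile
from pathlib import Path


def files_newer_than(root, cutoff):
    """Return sorted relative paths of files under root modified after cutoff.

    cutoff is a Unix timestamp in seconds, e.g. the time of the last torrent.
    """
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime > cutoff
    )


# Tiny self-contained demo on a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    for name, mtime in [("c0/old.jpg", 1000), ("c1/new.jpg", 2000)]:
        path = Path(tmp, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(b"")
        os.utime(path, (mtime, mtime))  # force a known modification time
    print(files_newer_than(tmp, 1500))  # only c1/new.jpg is newer than 1500
```

The resulting list could then be handed to whatever tool builds the update torrent.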

Last torrent:
http://dedicated.yosome.org/c-f.torrent

Also, I've discovered that Transmission seems to be able to handle the 100k+ file torrents without any issues. With all 4 torrents loaded, it is consuming around 300MB of RAM.

yosome said:
Yes, I've already written a script to do that. How frequently should I create update torrents?

Good question, I guess two weeks would be a reasonable frequency? If you made an RSS feed to publish that, I could set it up to download automatically. Oh, and another question, do you also include deleted posts' data?

Also, I've discovered that Transmission seems to be able to handle the 100k+ file torrents without any issues. With all 4 torrents loaded, it is consuming around 300MB of RAM.

Ah, that sounds useful. With µT the biggest problem is not RAM, but the time it takes to start (somewhere between 30 minutes and 1h for two torrents) and that it runs into "too many open files" if I don't bump ulimits to several hundred k files, which is a PITA.

That's strange. Are you using the Mac version? The Windows version 1.8.2 of µT seems to work fine for me with these torrents aside from the fact that it takes a couple minutes to load them, and I've seen weird things happen if I try to load a second while loading the first. According to the logger, I don't get any errors related to the number of open files, and I haven't changed any configurations related to it as far as I know.

-----------------------------------------------------------
would it be possible to categorize all this into folders sorted by tags?
-----------------------------------------------------------
If you have access to the database, it should be pretty easy. It should only take a couple of minutes to half an hour to code the snippet that will categorize all this for you.

This is how: you take the filename and send a query to the database. It will return the tag of the file. Then you simply move the file like this: folder-tag/tag - filename.ext
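A rough sketch of that lookup step, with Python's built-in sqlite3 standing in for the real database. The `posts` table and its `tags` column are mentioned later in this thread; the `md5` column, the space separator between tags, and the sample row are assumptions for illustration only.

```python
import sqlite3


def tags_for_file(conn, md5):
    """Return the list of tags stored for one file, or [] if it is unknown."""
    row = conn.execute(
        "SELECT tags FROM posts WHERE md5 = ?", (md5,)
    ).fetchone()
    return row[0].split() if row else []


# In-memory stand-in for the real dump (schema and row are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (md5 TEXT PRIMARY KEY, tags TEXT)")
conn.execute("INSERT INTO posts VALUES (?, ?)", ("0123abcd", "cloud sky sunset"))

print(tags_for_file(conn, "0123abcd"))  # ['cloud', 'sky', 'sunset']
print(tags_for_file(conn, "ffffffff"))  # []
```

Note that a file usually comes back with several tags, so "the tag of the file" still needs a rule for picking one.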

And don't forget to rename Japanese characters and other odd symbols like / and : because Windows won't let you create the folders otherwise.
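A small sketch of the renaming step. As far as I know, the characters Windows actually rejects in names are `< > : " / \ | ? *` plus control characters, and names may not end in a space or a dot; Japanese characters are generally fine on NTFS, so this sketch leaves them alone. The helper names and the sample tag are hypothetical, and reserved device names such as CON or NUL are not handled here.

```python
import re

# Characters Windows rejects in file and folder names, plus control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_name(name, replacement="_"):
    """Replace forbidden characters and strip trailing spaces and dots."""
    cleaned = _FORBIDDEN.sub(replacement, name).rstrip(" .")
    return cleaned or replacement  # never return an empty name


def tagged_destination(tag, filename):
    """Build the 'folder-tag/tag - filename.ext' layout described above."""
    folder = safe_name(tag)
    return f"{folder}/{folder} - {safe_name(filename)}"


print(safe_name("fate/stay_night"))  # fate_stay_night
print(tagged_destination("fate/stay_night", "3215456466.jpg"))
```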

I can help you out if you want to.

Well, theoretically this should be very simple.

Here is some food for thought.

1 image can have several tags like:
loli,wet,unzipped
How do you sort them by folders if you do not own a database?

Option 1 - create a duplicate of each image for each tag. But this is not very smart, because it will require loads of disk space.

Option 2 - convert the image into *.tif format and automatically embed the tagging information. Unlikely to happen.

Option 3 - the database most likely stores the info something like this:

------------------------------------
Tags------------- Image------------
------------------------------------
loli.wet.unzipped c0\3215456466.jpg

we move each file into this location:

loli.wet.unzipped\3215456466.jpg

Since we do not know whether the tags are stored in order in the database, we can put them through a filter that automatically sorts them in ascending order.
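The sorting filter is a one-liner. A minimal sketch, assuming tags arrive as a list of strings and are joined with a dot as in the example above; `folder_key` is a made-up name and the tags are placeholders.

```python
def folder_key(tags, sep="."):
    """Build one canonical folder name for a set of tags, whatever their order."""
    return sep.join(sorted(set(tags)))


# The same tags in a different order give the same folder name.
print(folder_key(["sunset", "cloud", "sky"]))  # cloud.sky.sunset
print(folder_key(["sky", "sunset", "cloud"]))  # cloud.sky.sunset
```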

See, it is all logical, and if you have the info attached to a file you can later manipulate the files the way you want once they are on your hard drive.

My last post was getting too big, so let me make a new one.

Another idea just hit me. This is probably the best idea so far.

You retrieve the information from the database and write it into a database.txt file, then share it with the torrent or upload it, because it's hard to find one file in this huge torrent.

The info needed is:

Tags and file location.

So the txt file will have values like this:

loli.wet.unzipped - c0\3215456466.jpg
shota.boobs.shemale - c0\4564879.jpg
etc...

This way we can download all the files and do the sorting on our own.
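Reading such a listing back is straightforward. A sketch under the assumption that every line looks like `tags - location`, with tags joined by dots exactly as proposed above; `parse_listing` and the sample lines are hypothetical. One caveat: a tag that itself contains a dot would be split in two, so a different separator may be safer in practice.

```python
from collections import defaultdict


def parse_listing(lines):
    """Map each tag to the file locations listed for it."""
    index = defaultdict(list)
    for line in lines:
        tags, sep, location = line.strip().partition(" - ")
        if not sep:
            continue  # skip lines that do not match 'tags - location'
        for tag in tags.split("."):
            index[tag].append(location)
    return dict(index)


sample = [
    r"cloud.sky - c0\1001.jpg",
    r"sky.tree - c0\1002.jpg",
]
for tag, files in sorted(parse_listing(sample).items()):
    print(tag, files)
```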

What do you think? Can you do this?

Er... this would work terribly. You want a separate folder for each instantiated combination of tags? There will still be way too many to manually walk through, and it will fail to capture similarities across single tags (since they will be in multiple folders). Moreover, for every post that's not poorly tagged, you are likely to only have one or two posts per folder, which is mostly worthless. Tagged indexing does not map onto a hierarchical file system very well. That's part of the reason for Danbooru in the first place.

Shinjidude said:
Er... this would work terribly. You want a separate folder for each instantiated combination of tags?

Yes, this is why I asked for a copy of the relations between images and tags.

The main point of this is that I can have images sorted by tags, not all random images in one folder, because I will still be sorting out the good images manually.

See, I made a macro (and I will make more) that moves the opened picture to a desired folder with one button click, so this is not really an issue. The problem is that I need to specify where to move the image. If I'm only interested in hentai content, then I can narrow the tags down to 20 and use my numpad and number keys to sort the images easily.

If I have this info I can make another snippet that will move, lets say all images wit loli tag into 1 folder & then I can sort the folder etc.
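A sketch of what such a snippet might look like, assuming the image-to-tags relations are already loaded into a dict keyed by relative path. The name `move_tagged`, the folder layout, and the tags are all made up for illustration, and it assumes file names are unique across the archive.

```python
import shutil
import tempfile
from pathlib import Path


def move_tagged(root, file_tags, tag, dest):
    """Move every file under root that carries `tag` into the dest folder."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for rel_path, tags in file_tags.items():
        src = Path(root) / rel_path
        target = dest / src.name
        if tag not in tags or not src.is_file() or target.exists():
            continue  # wrong tag, missing file, or name already taken
        shutil.move(str(src), str(target))
        moved.append(src.name)
    return sorted(moved)


# Demo on a throwaway directory with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp, "archive", "c0")
    archive.mkdir(parents=True)
    (archive / "1001.jpg").write_bytes(b"")
    (archive / "1002.jpg").write_bytes(b"")
    relations = {"c0/1001.jpg": ["cloud", "sky"], "c0/1002.jpg": ["tree"]}
    print(move_tagged(Path(tmp, "archive"), relations, "sky", Path(tmp, "sorted", "sky")))
```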

See, the main point is that I will actually sort all this. But it will be a lot simpler and take far less time if I have the relations between images and tags.

So I ask, can you retrieve this info for me? Or if you need help with this, feel free to ask; maybe I can be of some use.

I've already provided a MySQL database dump which contains the information that you're looking for, as well as the scripts used to generate the database. Google should provide more than enough help with what I've already provided.

What you're planning is just going to make a huge mess:

SELECT COUNT(DISTINCT `tags`) FROM `posts`;

COUNT(DISTINCT `tags`)
403685

The `tags` field in the posts table contains the full set of tags associated with the post, sorted alphabetically. Doing what you want to do would create over 400,000 folders, nearly one for every post. Even if you limit the folder creation to a few tags, you're still going to be stuck with thousands of folders, each containing 1 or 2 images. Writing a virtual tagging file system in FUSE would be the only usable solution.
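To illustrate the difference, here is a toy sketch (not FUSE, and not the real schema) of why a per-tag index behaves better than one folder per tag combination. It assumes the `tags` string is space-separated; the post IDs, tags, and function names are invented.

```python
from collections import defaultdict


def build_index(posts):
    """Map each single tag to the set of post IDs that carry it."""
    index = defaultdict(set)
    for post_id, tags in posts.items():
        for tag in tags.split():
            index[tag].add(post_id)
    return index


def search(index, *tags):
    """Return the post IDs that carry all of the given tags."""
    sets = [index.get(tag, set()) for tag in tags]
    return sorted(set.intersection(*sets)) if sets else []


posts = {1: "cloud sky", 2: "cloud sky sunset", 3: "sky tree"}
index = build_index(posts)

# One folder per full tag string: as many folders as distinct strings.
print(len(set(posts.values())))      # 3 folders for 3 posts
# One index entry per tag: any combination is answered by intersection.
print(search(index, "cloud", "sky"))  # [1, 2]
```

A virtual tag file system would expose essentially this kind of lookup as directories, without copying or moving any file.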

葉月 said:
If you made an RSS feed to publish that, I could set it up to download automatically. Oh, and another question, do you also include deleted posts' data?

Yes, I preserve the deleted/unapproved posts' data. I'll work on getting an RSS feed up this weekend.

I always figured Gelbooru was the dumping ground for Danbooru posts---deleted and otherwise.

If anything, I think an archive of deleted posts would be fun to look through. A bonus bag of fun, lackluster, disgusting, horrifying, laugh-inducing, love-inducing, and vomit-inducing things to gain access to and gape upon. Privileged users can already do this, I see, but the very old deleted posts from years ago end up deleted

gaastra said:
ok I finished sorting the first 0-3 torrent. Hopefully torrent 2 will have more good images :(

Here are the images I kept, in case someone is interested: 114 files / 41 MB

Hahahaha, what? Are you implying that you downloaded 32GB of data, then went through 100k+ untagged images and selected 114 of them to keep? That's... uh, awful, artificial and amusing I guess?

葉月 said:
Hahahaha, what? Are you implying that you downloaded 32GB of data, then went through 100k+ untagged images and selected 114 of them to keep? That's... uh, awful, artificial and amusing I guess?

I don't see what's so amusing. I have a good sorting system that enables me to sort images really fast. I also have a big monitor! But of course it puts a lot of stress on my eyes, so I do not do it nonstop. After my auto script moves aside images matching a certain pattern, I sort 5k images in a row, then I rest.

I'm looking for new, good images that I do not have and am interested in. You can download the archive and see what images I'm interested in if you like. Then you will see that I do not keep images that do not meet my criteria.

Well, anyway, I'll be downloading the second torrent now, and since there will be a lot of images I will delete, I don't need the tags. I will just sort the crap out and then run an MD5 scan on the leftover images in the search engine. This will be simpler than I imagined.

I really must say thank you for the MD5 search feature. It will save a lot of the time I would have had to spend on manual tagging :)
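Computing the MD5 of each leftover file can be done with Python's standard hashlib. A minimal sketch; `md5_of_file` is a made-up helper name, and how the resulting hash is submitted to the site's search is left out because it is not described in this thread.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file with known content.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp, "sample.bin")
    sample.write_bytes(b"hello")
    print(md5_of_file(sample))  # 5d41402abc4b2a76b9719d911017c592
```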

Question: I'm using µTorrent, but people are downloading from me at a pretty slow speed, even though my upload speed is set to unlimited. They should be able to download at least 20 times faster. I can upload at 70 kB/s at least, but my upload speed in µTorrent is around 5 kB/s. This isn't normal.

Is there anything I can do or configure to let people download faster?

My current µTorrent config is the default.
