Danbooru

Feature suggestion: Multiple source fields per post

Posted under Bugs & Features

I suggest allowing an arbitrary number of "source" fields per post, for cases where the same image is found in multiple places (you know... Pixiv, DeviantArt, Tumblr, Twitter, etc.)

Notes:
- Sometimes, the same image has a different artist commentary on each website. (post #2601820, post #3033573)
- Sometimes, the artist deletes one and keeps the other. (post #3033573 again, the DeviantArt version got deleted and the Pixiv one was kept)

Obviously, in the latter case, if we are unlucky enough to pick the one source that later gets deleted, the post will end up as a bad_id with no link to the remaining source. With multiple sources available per image, this could be avoided.

Plus in my opinion it's generally nice to see the multiple places where the same image can be found, even if a given image does not fit in any of the circumstances given above.

Any thoughts?

One matter to take into consideration is that often, different sources will have the "same" image, that isn't technically the same at all, due to differing resolution, compression, or file format. If the multiple source fields would only be for files that were exactly the same, then that would completely rule out Twitter, which does its own automatic recompression. And many of the sites that aren't Twitter allow in-place revisions, meaning a source could start out matching the others and then change.

Another matter to consider is how this would complicate the bad_id tag. Currently, there is a bot (or more?) that automatically checks uploaded images against their sources to detect md5 mismatches and bad_id's. Under a multi-source system, a bad_id wouldn't necessarily mean all of the sources on the post are deleted; steps would need to be taken to show which of the sources were still good. Either extra tags, or, since this would probably already involve modifying the database, some kind of flag for each source field that could be flipped if it were deleted.
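To make the per-source flag idea concrete, here is a minimal sketch in Python. It is not how Danbooru stores anything; `SourceEntry`, `SourceStatus`, `Post`, and the example URLs are all hypothetical names made up for illustration. The point is only that a post would count as a bad_id when every one of its sources is flagged as deleted, and the still-good sources stay visible otherwise.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SourceStatus(Enum):
    # Hypothetical per-source flag; not an existing Danbooru field.
    ACTIVE = "active"
    DELETED = "deleted"


@dataclass
class SourceEntry:
    url: str
    status: SourceStatus = SourceStatus.ACTIVE


@dataclass
class Post:
    post_id: int
    sources: List[SourceEntry] = field(default_factory=list)

    def live_sources(self) -> List[str]:
        """Return the URLs of sources that are not flagged as deleted."""
        return [s.url for s in self.sources if s.status is SourceStatus.ACTIVE]

    def is_bad_id(self) -> bool:
        """True only if the post has sources and all of them are deleted."""
        return bool(self.sources) and not self.live_sources()


# Usage: one source gets deleted, the other survives.
post = Post(post_id=1, sources=[
    SourceEntry("https://example.com/deviantart-copy"),
    SourceEntry("https://example.org/pixiv-copy"),
])
post.sources[0].status = SourceStatus.DELETED
print(post.is_bad_id(), post.live_sources())
```

Under this sketch, a bot that finds a dead link would flip the flag on that one entry instead of tagging the whole post.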

The big matter is what kinds of modifications to the database this would involve. I'm not a coder for the site (hopefully some of them weigh in), but I remember the database modification requirements of feature requests being big deal breakers in the past.

Having more than one source per post would require a major rewrite of the code and a restructuring of the database. Currently the source is stored as a string, and everything in the code acts as if it were a string. Switching to arrays or some other storage scheme would be no easy task.

Instead of the above proposal, though, I offer one of my own. Rather than mucking with the primary source field, which many users depend on being a string and the primary source of the image, I propose something akin to the artist commentary field, which would display any alternative sources for the post if they exist. I liken the original idea to the earlier situation of wanting to include the artist commentary on a post but having no way to do so other than leaving a comment, which hopefully was the first comment, but if not, then too bad.
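As a rough sketch of that shape (Python, purely illustrative): the primary `source` stays a plain string, so anything that treats it as a string keeps working, and alternates live in a separate list. The `Post` class, the `alt_sources` name, and the URLs are assumptions for the example, not existing fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Post:
    # Primary source is untouched: still a single string.
    source: str
    # Hypothetical separate store for alternates, like a commentary field.
    alt_sources: List[str] = field(default_factory=list)

    def add_alt_source(self, url: str) -> bool:
        """Add an alternate source; skip blanks, duplicates, and the primary."""
        url = url.strip()
        if not url or url == self.source or url in self.alt_sources:
            return False
        self.alt_sources.append(url)
        return True


post = Post(source="https://example.com/primary")
post.add_alt_source("https://example.org/alternate")
print(post.source, post.alt_sources)
```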

Some examples of how it has been done in the past:

Ideas

1. Present all information about the image source
Source | Link | MD5 Hash | Type | Dimensions | Filesize | Similarity
Nicoseiga | http://seiga.nicovideo.jp/seiga/im8110888 | 985a9467364161e20083ed9add8d1fca | jpg | (1226, 2005) | 966591 | 100%
Tinami | http://www.tinami.com/view/543403 | 3326ed5e28ca7bd943f3167a68a5b96a | jpg | (1226, 2005) | 977682 | 98%
Twitter | https://pbs.twimg.com/media/DcVkvxvUwAEmBgs.jpg:orig | b6d04a175f2aa729c87c07da0f000510 | jpg | (1226, 2005) | 138253 | 91%

The above would make it possible to include links whose files do not match the post's MD5, such as Twitter.

Note: For some of the sources above, the post link is given instead of the image link, since the direct image links are only good for a short time.

2. Editing would supply the user with a textbox to add more image links

Danbooru would pull the image and fill in all of the fields to reduce the chance of transposition errors.

If a site's image links are not supported by the above process (such as Tinami), the user would have the option to do an override, in which case only the link itself would be presented.

3. When editing, each link entry would have a delete link to remove it
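The auto-fill step from idea 2 could look something like the Python sketch below. It assumes the image bytes have already been downloaded (no fetching is shown), and it fills in only the MD5, file type, and filesize; dimensions and similarity would need an image library and are left out. `describe_image`, `make_row`, and the magic-byte table are hypothetical names for illustration. Passing no data models the override case, where only the link is kept.

```python
import hashlib
from typing import Dict, Optional, Union

# Leading bytes used to guess the file type (a small, incomplete list).
MAGIC_BYTES = {
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF8": "gif",
}

Row = Dict[str, Union[str, int]]


def describe_image(data: bytes) -> Row:
    """Compute the fields that can be derived from the raw bytes alone."""
    file_type = "unknown"
    for magic, name in MAGIC_BYTES.items():
        if data.startswith(magic):
            file_type = name
            break
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "type": file_type,
        "filesize": len(data),
    }


def make_row(link: str, data: Optional[bytes] = None) -> Row:
    """Build one table row; with no data (override), keep only the link."""
    row: Row = {"link": link}
    if data is not None:
        row.update(describe_image(data))
    return row


# Override case: unsupported site, so only the link is stored.
print(make_row("http://www.tinami.com/view/543403"))
```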

Aren't sources stored in the database as tags, together with the other tags?

For example, isn't it right that post #642652 has "tags" stored like these: ":> bangs blunt_bangs bob_cut child highres nagian original red_hair short_hair sitting stuffed_animal stuffed_toy teddy_bear wariza rating:s source:http://img08.pixiv.net/img/nagian/9610678.jpg"

Because if that's the case, you could add an arbitrary number of sources by adding more tags like ":> bangs blunt_bangs bob_cut child highres nagian original red_hair short_hair sitting stuffed_animal stuffed_toy teddy_bear wariza rating:s source:http://somethingfirst/ source:http://somethingsecond/ source:http://somethingthird/"

Please ignore this if my assumption was incorrect.
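If that assumption did hold, pulling several sources out of one tag string would be simple. This Python sketch only illustrates the idea under that assumption; `split_sources` is a made-up helper, not something from the site's code, and it would not cope with URLs that contain spaces.

```python
from typing import List, Tuple

SOURCE_PREFIX = "source:"


def split_sources(tag_string: str) -> Tuple[List[str], List[str]]:
    """Split a whitespace-separated tag string into (tags, source URLs)."""
    tags: List[str] = []
    sources: List[str] = []
    for token in tag_string.split():
        if token.startswith(SOURCE_PREFIX):
            sources.append(token[len(SOURCE_PREFIX):])
        else:
            tags.append(token)
    return tags, sources


tags, sources = split_sources(
    "nagian original rating:s "
    "source:http://somethingfirst/ source:http://somethingsecond/"
)
print(tags)
print(sources)
```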

fossilnix said:

One matter to take into consideration is that often, different sources will have the "same" image, that isn't technically the same at all, due to differing resolution, compression, or file format.

BrokenEagle98 said:

Ideas

1. Present all information about the image source

I like BrokenEagle98's idea number 1. To be specific, I like the possibility of keeping a list of links to the "same" image even if they have different resolutions, compression or file format. But I know this means basically all posts with inconsistent sources would have to be tagged md5_mismatch and thus this tag may become all but useless.

This could be partially fixed by having separate tags: deviantart_md5_mismatch, pixiv_md5_mismatch, twitter_md5_mismatch, etc., so we could tell which source has the mismatch.
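For what it's worth, picking the per-site tag from a source URL could be as simple as the Python sketch below. The domain-to-site mapping and the `mismatch_tag` helper are assumptions for illustration, and the tag names are just the ones proposed in this post, not existing tags.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from source domains to site names.
SITE_BY_DOMAIN = {
    "deviantart.com": "deviantart",
    "pixiv.net": "pixiv",
    "twitter.com": "twitter",
    "twimg.com": "twitter",
}


def mismatch_tag(source_url: str) -> Optional[str]:
    """Return the proposed <site>_md5_mismatch tag, or None if the site is unknown."""
    host = (urlparse(source_url).hostname or "").lower()
    for domain, site in SITE_BY_DOMAIN.items():
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return f"{site}_md5_mismatch"
    return None


print(mismatch_tag("https://pbs.twimg.com/media/DcVkvxvUwAEmBgs.jpg:orig"))
```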

It would also be nice if the entirety of BrokenEagle98's idea number 1 were implemented -- the table with dimensions, filesize, similarity, etc.

fossilnix said:

One matter to take into consideration is that often, different sources will have the "same" image, that isn't technically the same at all, due to differing resolution, compression, or file format. If the multiple source fields would only be for files that were exactly the same, then that would completely rule out Twitter, which does its own automatic recompression. And many of the sites that aren't Twitter allow in-place revisions, meaning a source could start out matching the others and then change.

Another matter to consider is how this would complicate the bad_id tag. Currently, there is a bot (or more?) that automatically checks uploaded images against their sources to detect md5 mismatches and bad_id's. Under a multi-source system, a bad_id wouldn't necessarily mean all of the sources on the post are deleted; steps would need to be taken to show which of the sources were still good. Either extra tags, or, since this would probably already involve modifying the database, some kind of flag for each source field that could be flipped if it were deleted.

The big matter is what kinds of modifications to the database this would involve. I'm not a coder for the site (hopefully some of them weigh in), but I remember the database modification requirements of feature requests being big deal breakers in the past.

How about adding another field for the image, like source_2 or sources_alt, so that the primary source field would not be affected by this?