Danbooru

Baffling...keeping dupes?

Posted under General

This topic has been locked.

iridescent_slime said:
You keep maintaining that there is a problem, but you have yet to explain exactly why the existence of duplicate images is a problem...because it adds a bunch of identical copies to your search results? Good thing you can add duplicate to your blacklist.

Is the post marked with the "duplicate" tag always the inferior version? Or is the duplicate tag applied to any post posted after the first one? If it is the former, then I'd say you guys are already evaluating quality, and thus to not use that knowledge to remove the inferior post is dumb. If it's the latter (I'm guessing it is), then I'd say as a quality snob myself I don't want to look at the smaller version...I want to look at THE BEST version available...so adding the "duplicate" tag to my blacklist would be a no from me, dog.

It's a problem because it does clog up the post list, and it just makes the posts look ugly when you have a single post with 2 children...I don't care to see that extraneous information as a user, and it trolls me all the time because I think that those other posts are minor variations, when in fact they're just the same image.

This is all very annoying to me as a user...ultimately the business model of this site is to serve users. Users click on posts and see ads, that's how it works...if it's annoying to navigate the site and do that well...they're gunna go to another image board site that doesn't annoy them.

iridescent_slime said:
Stop and think for just a second about the consequences of what you're suggesting. What happens when the artist accidentally uploads an absurdres or uncensored image to Twitter and then replaces it with a downscaled or censored revision, or uploads something to Twitter but shortly thereafter deletes their entire online presence without warning? The image in question never gets posted to Pixiv as it originally appeared. The window for archiving it closes and it's lost to us forever.

If an artist accidentally uploads an uncensored or absurdres version then immediately removes it then obviously they didn't want it up there in the first place. It's probably a Patreon/Fanbox exclusive, or timed exclusive so thus stealing it would be unethical anyways.

I'll admit there are downsides, but have you ever thought of the reverse? What if someone uploads a twitter image...then no one bothers to upload the Pixiv version because they see the twitter one and figure "well there it is...I won't upload it again."...then sometime later the artist decides to delete their online presence? Well now we're stuck with the damn twitter version for all eternity. I think that is much more likely than any of your extremely edge-case scenarios. I know because I've seen it all the time, after Tumblr went down we are left with many many inferior versions of posts with no way to get the superior ones at this point...because people see the shitty version and assume it's the best available.

iridescent_slime said:
Whatever your opinion of duplicate uploads or the quality of Twitter's image compression, having uploaders who race to grab art from Twitter without delay is a good thing.

No. It ultimately does more harm than good.

All this logic really falls flat coming from someone who literally does not upload anything. Your proposed system puts a much higher burden on people who do a job that you have never done. You are taking the burden of discovery from yourself (the one who allegedly cares a LOT about the quality) and on to literally hundreds of others who might not.
Duplicates of an image aren't great, but honestly, asking a ton of others to sift through and remove them, or to wait for the artist to do something that may never come, is a much worse idea on a site focused on archival first. Who is wasting more time: someone uploading an image and then, in hindsight, tagging it as an inferior dupe, or someone seeing two basically identical images already connected by parent/child relationships? Whose time is worth more? Hard to answer.


Dyrone said:

This is all very annoying to me as a user...ultimately the business model of this site is to serve users. Users click on posts and see ads, that's how it works...if it's annoying to navigate the site and do that well...they're gunna go to another image board site that doesn't annoy them.

Are you sure you're on Danbooru? Danbooru doesn't have ads.

Dyrone said:

I'll admit there are downsides, but have you ever thought of the reverse? What if someone uploads a twitter image...then no one bothers to upload the Pixiv version because they see the twitter one and figure "well there it is...I won't upload it again."...then sometime later the artist decides to delete their online presence? Well now we're stuck with the damn twitter version for all eternity. I think that is much more likely than any of your extremely edge-case scenarios. I know because I've seen it all the time, after Tumblr went down we are left with many many inferior versions of posts with no way to get the superior ones at this point...because people see the shitty version and assume it's the best available.

Nobody has ever said "well there it is...I won't upload it again." for the reasons you're proposing. Use of the upload bookmarklet completely eliminates that argument.
As for what you said about Tumblr, your statement is just flat-out wrong. From what I can tell, every image uploaded from Tumblr was automatically replaced with a fullsize (raw) version soon after the discovery of raws. Therefore, no fullsize images were lost, unless they were deleted before raws were discovered.

Dyrone said:
It's a problem because it does clog up the post list, and it just makes the posts look ugly when you have a single post with 2 children...I don't care to see that extraneous information as a user, and it trolls me all the time because I think that those other posts are minor variations, when in fact they're just the same image.

This is all very annoying to me as a user...ultimately the business model of this site is to serve users. Users click on posts and see ads, that's how it works...if it's annoying to navigate the site and do that well...they're gunna go to another image board site that doesn't annoy them.

Shouldn't you, in this case, first ask whether other users are really annoyed by this? Even though I personally upload stuff, I'm a user myself and don't really care about knowing whether a post has any duplicates at all.
This argument is completely subjective, just like the question of whether someone wants to see the uploader_name or not.

Don't assume that everyone else is as annoyed as you are just because you're a user. It's not worth the effort. And in any case, Danbooru is an imageboard for collecting good-quality art, and most of the time duplicates aren't that bad just because a better version exists.

I don't see the point?
Don't want duplicates?
Add -duplicate to your search or "duplicate" to your blacklist.

Problem solved. Nobody else gets harmed and can still use Danbooru as they want to.

Also, being stuck with only the Twitter version is better than being stuck with no version. Also @Dyrone, none of the current uploaders think "There's a Twitter version, no point in uploading the Pixiv version". That only applies if the Twitter version is better, like in most yomu_(sgt_epper) cases. However, in cases where the Pixiv version is better, the Pixiv version always gets uploaded. Your point is moot.

Nobody has ever said "well there it is...I won't upload it again." for the reasons you're proposing. Use of the upload bookmarklet completely eliminates that argument.

Uh, what? Does this bookmarklet literally tell you when there is a Pixiv version available? No? Ok then I guarantee you things have slipped past.

seika0 said:
From what I can tell, every image uploaded from Tumblr was automatically replaced with a fullsize (raw) version soon after the discovery of raws.

Oh, "from what you can tell"...well unless you literally had some sort of bot that trawled Tumblr for the raws and uploaded every single one then I guarantee images were missed.

Guaro1238 said:
Shouldn't you, in this case, first ask whether other users are really annoyed by this? Even though I personally upload stuff, I'm a user myself and don't really care about knowing whether a post has any duplicates at all.

Well shouldn't you like...ask first, if some other users are really NOT annoyed by this? Play by your own rules. But see if you actually read the thread you would know a few others have already said dupes are "a problem" so no I don't have to go around asking people...they've already said as much.

Dupes are visual clutter that provides NOTHING extra for a common viewer...maybe you tolerate them, but at the end of the day they add nothing to the user experience, they can only exist as either a neutral entity (in your case) or a negative one. Your user experience is in no way enriched by having a slider on the top of the post that contains nothing but dupes.

Dyrone said:

well unless you literally had some sort of bot that trawled Tumblr for the raws and uploaded every single one then I guarantee images were missed.

Didn’t @RaisingK have a bot to do exactly that?

This discussion is leading nowhere. :<

Dyrone said:

Uh, what? Does this bookmarklet literally tell you when there is a Pixiv version available? No? Ok then I guarantee you things have slipped past.

Well I guess someone apparently does not know that many uploaders here actually follow artists on Pixiv, etc. to ensure that they will know when their favorite works are available. As Lacrimosa said, the current mindset of uploaders will often want to upload the superior version of Twitter posts when possible. So, I don't see how "things have slipped past".

Dyrone said:

No. It ultimately does more harm than good.

So, are you suggesting that uploaders shouldn't upload from Twitter? Because it "does more harm than good" as you said? Enlighten me. Not every artist has a completely predictable schedule between when they upload to Twitter and when they upload to Pixiv. Sometimes they upload to Twitter but then never upload to Pixiv. Sometimes the Twitter version turns out to be better than the Pixiv version (such as with one favorite artist I follow). Even if in some cases I prefer waiting for Pixiv, I honestly do not see any reason why uploading from Twitter first (or at all) is considered "bad". Pointless banter is pointless.


Dyrone said:

Uh, what? Does this bookmarklet literally tell you when there is a Pixiv version available? No? Ok then I guarantee you things have slipped past.

Actually, it does if it’s already been uploaded. There’s also saucenao and IQDB.

Dyrone said:

Oh, "from what you can tell"...well unless you literally had some sort of bot that trawled Tumblr for the raws and uploaded every single one then I guarantee images were missed.

topic #14154

Having duplicate tagged properly on everything would be nice.
@Dyrone feel free to add that tag to duplicates.
Here is a little help: pairs of Twitter/Pixiv parents and children. Feel free to add duplicate to them if they are duplicates.
1,2,3,etc_all

If anyone looks over these posts, could you add the parent/child pairs that are not duplicates to a favgroup so I can filter those out? ty

Dyrone said:

If an artist accidentally uploads an uncensored or absurdres version then immediately removes it then obviously they didn't want it up there in the first place. It's probably a Patreon/Fanbox exclusive, or timed exclusive so thus stealing it would be unethical anyways.

If you want to look at Danbooru's ethical/moral problems, you can take a look at the banned artist tag for all of the uploads the site preserves after a request to remove it. It's silly to argue that we should now attempt to judge morality and ethics when we permit post #3453481 and host material that is grotesque.

The primary issue I think you're driving at is that it's annoying to favorite a Twitter post and discover a Pixiv master later. This is not something that should be solved by adding arbitrary rules to the collection. Rather, it should be solved by changing the system behavior. Change how favorites work. Create a new "canonical" relationship. Introduce some system to migrate favorites. We don't need to draw the line at determining the purity of a source or if an image is a dupe after the fact.

Danbooru is open source, and if you make a spirited campaign to change the behavior or change it yourself, you will likely see traction. The reason why this hasn't been solved yet is because it is complicated, time consuming, and ultimately hard to judge where to draw the line.

Well shouldn't you like...ask first, if some other users are really NOT annoyed by this?

Let me tell you: I'm certainly annoyed by this. I favorite twitter uploads all the time and discover higher quality works later. I've even uploaded Twitter posts without finding a Pixiv source, and then had my uploads parented to a Pixiv source. I hate it. I wish there was a better solution. But I haven't had time to personally invest in trying to fix it. If I go back to an old favorited post and find a new Pixiv parent, I'm filled with joy and favorite it. I really appreciate finding out that someone got a better quality version. But what depresses me is when an image I want to favorite isn't on Danbooru because it's hard to source. I don't think we should make this worse.

inkuJerr said:
Well I guess someone apparently does not know that many uploaders here actually follow artists on Pixiv, etc. to ensure that they will know when their favorite works are available. As Lacrimosa said, the current mindset of uploaders will often want to upload the superior version of Twitter posts when possible. So, I don't see how "things have slipped past".

So you just can't imagine that, out of millions of posts, someone could post a Twitter version and this site never gets the better version as a result? That is simply ridiculous on its face.

seika0 said:
Actually, it does (the Danbooru bookmarklet) if it’s already been uploaded.

Which means it's worthless as an indicator of whether or not a better version exists off-site...so it has no relevance to the discussion.

There’s also saucenao and IQDB.

Which require a user to check manually, meaning it may or may not be checked. Neither of these things ensures that the superior versions are guaranteed to be found and uploaded.

inkuJerr said:
So, are you suggesting that uploaders shouldn't upload from Twitter? Because it "does more harm than good" as you said? Enlighten me. Not every artist has a completely predictable schedule between when they upload to Twitter and when they upload to Pixiv. Sometimes they upload to Twitter but then never upload to Pixiv. Sometimes the Twitter version turns out to be better than the Pixiv version (such as with one favorite artist I follow). Even if in some cases I prefer waiting for Pixiv, I honestly do not see any reason why uploading from Twitter first (or at all) is considered "bad". Pointless banter is pointless.

Maybe you should read some of my past replies as to why it's bad. 1. It creates dupes like 95% of the time 2. Leads to extra work for you or others...someone has to discover that an image is a dupe...tag it and parent/child it...it's basically like littering unless you are making sure you do this yourself. 3. It can lead to better images never being uploaded. None of these arguments have convinced me that #3 doesn't happen. It's basically "da uploaders are real good at da uploading so dat could NEEEVAH happeen." Yeah, ok, sure. I don't believe that at all.

I mean it's not like it's hard to check if they have a pixiv, and if they tend to upload images from their twitter to their Pixiv. Yes? Well then wait a bit for the Pixiv version. Not rocket science.

Also, I'm not saying Twitter is inherently bad...it's just a general rule of thumb that the Twitter version will be horrifically compressed...although I have seen a few accounts that have managed to post HD versions on Twitter...no idea how they did it, but they did.


Having an active Twitter post increases the likelihood that someone will upload the Pixiv version. That's because you can know with near-certainty that it'll be approved, and the post has already been tagged for you. It's practically a free +1 to your upload count.

You clearly care a lot about this issue. Why not upload some missing Pixiv posts right now?

Dyrone said:

So you just can't imagine that, out of millions of posts, someone could post a Twitter version and this site never gets the better version as a result? That is simply ridiculous on its face.

Which means it's worthless as an indicator of whether or not a better version exists off-site...so it has no relevance to the discussion.

Which require a user to check manually, meaning it may or may not be checked. Neither of these things ensures that the superior versions are guaranteed to be found and uploaded.

Maybe you should read some of my past replies as to why it's bad. 1. It creates dupes like 95% of the time 2. Leads to extra work for you or others...someone has to discover that an image is a dupe...tag it and parent/child it...it's basically like littering unless you are making sure you do this yourself. 3. It can lead to better images never being uploaded. None of these arguments have convinced me that #2 doesn't happen. It's basically "da uploaders are real good at da uploading so dat could NEEEVAH happeen." Yeah, ok, sure. I don't believe that at all.

Also, I'm not saying Twitter is inherently bad...it's just a general rule of thumb that the Twitter version will be horrifically compressed...although I have seen a few accounts that have managed to post HD versions on Twitter...no idea how they did it, but they did.

Your entire argument is based on the possibility of human error and "I don’t like this, someone else should change it" instead of actual facts with evidence backing them up. Also, you’ve moved the goalposts so much throughout this thread that I doubt most people can even tell what you want anymore.
Honestly, this ridiculous farce of a thread should be locked. It’s not going to result in anything getting done, and has only resulted in you insulting other people, and those other people wasting their time to make counter-arguments about an issue that seems to only bother you, as nobody else has spoken up in DIRECT support of your ideas throughout this thread.

feline_lump said:
You clearly care a lot about this issue. Why not upload some missing Pixiv posts right now?

Yeah, sure, point me to the twitter posts...is there a way to search specifically for posts from Twitter?

seika0 said:
an issue that seems to only bother you, as nobody else has spoken up in DIRECT support of your ideas throughout this thread.

I guess EvanSoulEater doesn't exist? He's a nobody? Cause he said "Let me tell you: I'm certainly annoyed by this." Or are you gunna tell me how that support isn't direct enough?

Dyrone said:

Yeah, sure, point me to the twitter posts...is there a way to search specifically for posts from Twitter?

Man, you have to do at least this much work yourself.

seika0 said:

Man, you have to do at least this much work yourself.

That doesn't answer my question...is there a way to search specifically for posts with a twitter source? Or AGAIN are you guys just relying on sheer luck that people are trawling through all the posts and uploading superior versions?

Dyrone said:

That doesn't answer my question...is there a way to search specifically for posts with a twitter source? Or AGAIN are you guys just relying on sheer luck that people are trawling through all the posts and uploading superior versions?

source:*pbs.twimg or source:*twitter combined with another tag of your choosing so that you don’t time out the search.
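
For anyone who wants to script this, here is a rough sketch of how the search above could be turned into a URL. It only builds the link and does not contact the site. The base URL and the `tags` query parameter are assumptions about how the post listing is addressed, and `build_search_url` is a made-up helper name, not part of any Danbooru tool. The `-duplicate` tag is the exclusion suggested earlier in the thread.

```python
from urllib.parse import urlencode

# Assumed base URL for the post listing; adjust if this is wrong for you.
BASE_URL = "https://danbooru.donmai.us/posts"


def build_search_url(source_pattern, extra_tags=()):
    """Return a search URL for a source: wildcard plus optional extra tags.

    Hypothetical helper for illustration only.
    """
    # Tags are space-separated in a single search string.
    tags = [f"source:{source_pattern}", *extra_tags]
    query = urlencode({"tags": " ".join(tags)})
    return f"{BASE_URL}?{query}"


# Twitter-sourced posts, combined with a second tag as suggested above.
twitter_url = build_search_url("*pbs.twimg", ["-duplicate"])
print(twitter_url)
```

Swapping `"*pbs.twimg"` for `"*twitter"` gives the other pattern mentioned above, and any other tag can be passed in place of `-duplicate` to narrow the search.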
