Danbooru


Topic: [Prototype] User Report Ver 6.0

Posted under General

BrokenEagle98

Moderator Dashboard

Danbooru Report Implementation: http://isshiki.donmai.us/user-reports/
Danbooru Report Code Repository: https://github.com/r888888888/reportbooru

Latest Update

  • (2017-11-12) Updated data for previous 30 days
  • (2017-11-13) Finally pushed updated code to GitHub repository

Version History

Data

Post Data
Upload Data
Versioned Data
Non-versioned Data
Analytics Data
Comment Rankings

Raw Data

Code

https://github.com/BrokenEagle/PythonScripts

Updated by BrokenEagle98

  • ID: 118632
  • Permalink
  • tapnek

    Toks mentioned something about a similar tool being available to mods and above. Can we have that compared to your implementation?

  • ID: 118633
  • Permalink
  • BrokenEagle98

    tapnek said:

    Toks mentioned something about a similar tool being available to mods and above. Can we have that compared to your implementation?

    I planned on submitting a GitHub issue this weekend. I waited because I wanted to have something solid to show first. My idea would be that they could make available what they already have, just updated once a day instead of on demand like Mods/Admins have. Then, they could make any incremental changes as needed.

    Nitrogen09 said:

    @BrokenEagle98
    It seems that you forgot to include the "Note changes" category.

    Thanks... things tend to get missed when you're reviewing your own work :p

    I updated it just now.

  • ID: 118635
  • Permalink
  • Jarlath

    Apparently, I'm a chatty bastard. And one who sticks with safe content.

    This only goes back 60-90 days, right?

  • ID: 118636
  • Permalink
  • Sacriven

    As expected, it's hard to catch up to @Provence when it comes to tagging and uploading, hahaha.

  • ID: 118637
  • Permalink
  • BrokenEagle98

    Jarlath said:

    Apparently, I'm a chatty bastard. And one who sticks with safe content.

    This only goes back 60-90 days, right?

    30 days, actually. I guess it might be good to include the number of days the report covers somewhere. I'll save that for tomorrow though. Danbooru practically dies whenever I try to upload or update that first post (it's about 55KB), and I'm tired of fighting with it... :P

    Updated by BrokenEagle98

  • ID: 118638
  • Permalink
  • Wypatroszony

    tapnek said:

    Toks mentioned something about a similar tool being available to mods and above. Can we have that compared to your implementation?

    Taken on 25th August at 4:22 CEST and I think the data on these reaches back to midnight of 25th of July also in CEST. (Underline means no upload limit, ✓ means approval rights, which have no meaning in presented data)
    Part 1
    Part 2
    The dashboard also shows comments under the score 0 threshold with at least 3 votes to them below these statistics. Then in a column next to these, there are also appeals (but only the ones on deleted posts are listed), most recent user records (10) and mod actions (also 10).
    I'll also add that it defaults to minimum date from 2 days ago and to only show base users, but it's flexible enough. I'll also note that the dashboard also takes longer to load than all other pages, but that's likely due to it loading up data from several different routes, given the variety.

  • ID: 118639
  • Permalink
  • user 460797

    Question about the error category by uploads:
    Does that also cover tags that are empty due to an alias request?
    For example, say I like the pearl_bracelet tag and use it, but there is currently a BUR that requests "alias pearl_bracelet -> beads_bracelet". Does that also cover this?

  • ID: 118642
  • Permalink
  • yunashiku

    Jarlath said:

    Apparently, I'm a chatty bastard. And one who sticks with safe content.

    Nah. You're not the only one. Me too. It's not like I hate pervy stuff, it's just that I'm not so fond of that kind of stuff.

  • ID: 118647
  • Permalink
  • user 460797

    And what does this make me? Well, lol. Someone needs to upload those things anyway :< :P.

    But I find it very interesting that almost no Platinum- user appears at the top of any of the lists (except Claverhouse in the Pools and multiple users in the Appeal list). So I guess that nearly all users who deserve a promotion are promoted.

  • ID: 118648
  • Permalink
  • chinatsu

    I'd like to point out editing the body text for an artist entry counts as a wiki edit. I think this is the only reason I made the ranking on wiki edits since most of mine are related to artists.

  • ID: 118651
  • Permalink
  • Type-kun

    Hm. Wasn't the original idea that those reports exist to show users ready for promotion - that is, below Builder? Contributors, builders and admins are going to dominate the ratings simply because they have more tools available to do so. If reports are to be integrated, they'll have to have some parameters after all - namely, max user level displayed.

    Also, thinking about it further, would this report make more sense if updated daily and discarded (as currently suggested), or monthly and stored? At a glance, the second way seems to be more useful and fun to follow. Not only could the change history be seen this way (think of all the graphs that could be built with that), but data could also be aggregated over a few months if necessary, while placing relatively little load on the server. Maybe even go further and store weekly data, automatically aggregating the last 4 weeks by default.
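The weekly-storage idea can be illustrated with a minimal sketch. It assumes each stored week is a plain mapping of user name to activity count, ordered oldest first; the function name, the data layout, and the sample values are all hypothetical and not taken from reportbooru.

```python
from collections import Counter


def aggregate_last_weeks(weekly_counts, weeks=4):
    """Sum per-user counts over the most recent `weeks` stored weeks."""
    total = Counter()
    for week in weekly_counts[-weeks:]:
        # Counter.update adds counts instead of overwriting them.
        total.update(week)
    return total


# Hypothetical stored data: one dict per week, oldest week first.
weekly_counts = [
    {"user_a": 5},
    {"user_a": 3, "user_b": 1},
    {"user_b": 4},
    {"user_a": 2},
    {"user_a": 1, "user_b": 2},
]

last_four = aggregate_last_weeks(weekly_counts)
print(dict(last_four))
```

Because each week is kept separately, the same stored rows could serve both a per-week history and any longer aggregate.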

    Before ideas skyrocket further though, we'd better ask @albert - would it be possible to store weekly aggregated data for user activity on reportbooru, and then use it for reports mentioned here? I'm pretty sure reportbooru was created exactly for this kind of stuff, but I'm still not sure what it's capable of. If we're going to join data over a few weeks, data should be stored entirely without cutoff counts, so it might end up as quite a lot of rows - but probably not that much, since most users are lurkers anyway, so I doubt it ever goes over 10MB total. @BrokenEagle98, you have some source data to work with, correct? Can you estimate how many rows those reports would have if done weekly and with no cutoff? Last 4 weeks, if you can, it should be illustrative enough.

  • ID: 118657
  • Permalink
  • user 460797

    I thought that this whole thing works on a monthly basis. I guess it fetches data over a month (i.e. 30 days).
    And of course this should be stored, since one can then see where a user was active and understand better why they were promoted. At least that is how I wanted it in the first place.
    About the first paragraph: Yeah, there are very few Platinum- users there. The idea was to 1. give users an easy way to create feedback for Builder/Platinum- users and 2. help Platinum- users get promoted.
    So I guess two separate tables would make more sense, but if we put everything in one big list, I don't see much difference either.

  • ID: 118658
  • Permalink
  • kuuderes shadow

    The main things that make a big difference are:
    - admins for things where admin privileges make a difference (tag implication/aliasing, post changes)
    - whether or not they have unlimited upload permission for the uploads - where the top 3 places are all but impossible for those who don't have unlimited uploads, and I'd honestly say Rignak's upload tally is more impressive than even Provence's.

    Incidentally, I checked Provence's record for the month before gaining unlimited uploads, and the figures were:
    365 uploads
    6 deleted

    Impressive deletion ratio, but would be at the lower end of those rankings. And that's for the most prolific uploader on the site.

    While whether or not someone has builder status is useful information for promotion purposes, it doesn't make much of a difference to the tallies. Or at least I haven't come across ways in which it does.

  • ID: 118660
  • Permalink
  • user 460797

    kuuderes_shadow said:

    The main things that make a big difference are:
    - admins for things where admin privileges make a difference (tag implication/aliasing, post changes)
    - whether or not they have unlimited upload permission for the uploads - where the top 3 places are all but impossible for those who don't have unlimited uploads, and I'd honestly say Rignak's upload tally is more impressive than even Provence's.

    Incidentally, I checked Provence's record for the month before gaining unlimited uploads, and the figures were:
    365 uploads
    6 deleted

    Impressive deletion ratio, but would be at the lower end of those rankings. And that's for the most prolific uploader on the site.

    That is true, yes. I might report them again to a moderator and discuss whether they could get promoted to Builder level. The big deletion rate is a big obstacle, though. So the odds of success are there, but it is also very uncertain...

    But I get what you mean: Lower the rate for Platinum- users in every category. Well, for uploads only, there is already a separate list for Platinum- users. But for every other category....why not^^.

    Updated by user 460797

  • ID: 118661
  • Permalink
  • BrokenEagle98

    Type-kun said:

    @BrokenEagle98, you have some source data to work with, correct? Can you estimate how many rows those reports would have if done weekly and with no cutoff? Last 4 weeks, if you can, it should be illustrative enough.

    I gathered 30 days' worth of data for each of the above reports. The following is the number of rows for each category:

    posts/uploads: 2213 (it was easier handling these two together for the data gathering stage; they are separated in the data parsing stage)
    notes: 524
    artist commentary: 320
    pools: 558
    artist: 184
    wiki page: 223
    forum topic: 54
    forum post: 151
    tag implication: 15
    tag alias: 13
    bulk update request: 22
    post appeal: 69
    comment: 1821
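For reference, the per-category figures above can be totalled with a short script. The numbers are copied from the list; the variable names are only illustrative.

```python
# Row counts per category over 30 days, copied from the list above.
rows_30_days = {
    "posts/uploads": 2213,
    "notes": 524,
    "artist commentary": 320,
    "pools": 558,
    "artist": 184,
    "wiki page": 223,
    "forum topic": 54,
    "forum post": 151,
    "tag implication": 15,
    "tag alias": 13,
    "bulk update request": 22,
    "post appeal": 69,
    "comment": 1821,
}

total_rows = sum(rows_30_days.values())
print(total_rows)  # 6167
```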

    Also, for issue #2640, there is still no way to tell from the API whether someone has unlimited uploads/approval permissions. I made a comment of it on that issue, but there has still been no response. I'd rather not have to write or use an HTML parser to find that information out if at all possible.

    Provence said:

    Question about the error category by uploads:
    Does that also cover tags that are empty due to an alias request?
    For example, say I like the pearl_bracelet tag and use it, but there is currently a BUR that requests "alias pearl_bracelet -> beads_bracelet". Does that also cover this?

    If the alias was already in effect at the time you tagged the image, then no. If an alias went into effect after you tagged the image, then yes.

    I hadn't thought of that case, but I can work on removing those. I just need to gather all of the aliases for that 30 day period and compare and contrast them to any "Tag Errors" that pop up...
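The comparison described here could look roughly like the sketch below. It assumes the tag errors are available as a mapping of user to a set of tag names, and that the aliases created during the 30 day period have already been gathered into a set of their old (aliased-away) tag names. The function name, user name, and the misspelled sample tag are hypothetical; this is not the code from the repository.

```python
def drop_aliased_tags(tag_errors, aliased_tags):
    """Remove tags that were emptied by an alias from each user's tag errors."""
    return {user: tags - aliased_tags for user, tags in tag_errors.items()}


# Hypothetical inputs: one real misspelling and one tag emptied by an alias.
tag_errors = {"user_a": {"pearl_bracelet", "serius_face"}}
aliased_tags = {"pearl_bracelet"}  # old names of aliases created in the period

filtered = drop_aliased_tags(tag_errors, aliased_tags)
print(filtered)
```

A plain set difference is enough under these assumptions, because a tag added after its alias took effect is converted at tagging time and never shows up as an empty tag in the first place.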

    chodorov said:

    I'd like to point out editing the body text for an artist entry counts as a wiki edit. I think this is the only reason I made the ranking on wiki edits since most of mine are related to artists.

    They're technically wiki edits, but should those be attributed as artist edits then...?

    Updated by BrokenEagle98

  • ID: 118670
  • Permalink
  • kuuderes shadow

    I've noticed that removal of copyright/translation/character/commentary requests is not included in the tag error column. This is obviously as it should be, but it might be helpful to have a list of things that are left out?

    I also noted that it doesn't seem to have picked out an incorrect character tag here: http://danbooru.donmai.us/post_versions?search%5Bpost_id%5D=2462723
    Or is a day or so before the update too late for it to count?

    Updated by kuuderes shadow

  • ID: 118685
  • Permalink
  • BrokenEagle98

    kuuderes_shadow said:

    I've noticed that removal of copyright/translation/character/commentary requests is not included in the tag error column. This is obviously as it should be...

    I also noted that it doesn't seem to have picked out an incorrect character tag here: http://danbooru.donmai.us/post_versions?search%5Bpost_id%5D=2462723
    Or is a day or so before the update too late for it to count?

    Tag errors as displayed above aren't all the tag errors per se, but all of the tags that were added but now have a count of zero, meaning they are no longer populated. This can happen with misspellings, or if the wrong tag is used, e.g. when I was testing this out, I saw the addition of "serious_face" being replaced by "serious".
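As a minimal sketch of that definition: given the tags a user added and a lookup of current post counts per tag, the tag errors are the added tags whose count is now zero. The function name and the sample counts are hypothetical, and treating a tag missing from the lookup as having a count of zero is an assumption of this sketch.

```python
def find_tag_errors(added_tags, tag_counts):
    """Return the added tags whose current post count is zero."""
    # Assumption: a tag absent from tag_counts is treated as count zero.
    return {tag for tag in added_tags if tag_counts.get(tag, 0) == 0}


# Hypothetical counts; "serious_face" was replaced by "serious".
tag_counts = {"serious_face": 0, "serious": 1200, "1girl": 2000000}
added_tags = {"serious_face", "1girl"}

errors = find_tag_errors(added_tags, tag_counts)
print(errors)  # {'serious_face'}
```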

    It doesn't currently count the tagging of the wrong character, or copyright, or artist or anything else like that. I suppose the number of tags removed between the first version of the post and the current version of the post could be counted as a tag error. I'll need to go through the first hundred or so post versions and see if that holds true.
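The alternative measure suggested here, tags present in the first version of a post but gone from the current version, amounts to a set difference. The sketch below assumes each version's tags are available as a space-separated string; the function name and the sample tag strings are hypothetical.

```python
def removed_since_first_version(first_version_tags, current_tags):
    """Return tags present in the first post version but not the current one."""
    return set(first_version_tags.split()) - set(current_tags.split())


# Hypothetical tag strings for the first and current versions of one post.
first_version = "1girl serious_face solo wrong_character"
current = "1girl serious solo right_character"

removed = removed_since_first_version(first_version, current)
print(sorted(removed))  # ['serious_face', 'wrong_character']
```

Note that this would also count removals that are not mistakes, such as a fulfilled translation_request, so those would need to be excluded separately.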

    kuuderes_shadow said:

    ...but it might be helpful to have a list of things that are left out?

    Do you mean posts that don't have the translation_request, commentary_request, artist_request, and so forth added by the original uploader?

    Edit:

    Forgot to mention, but Tag Errors also include unicode tags. Those shouldn't exist, since they're next to impossible for most people to type, and the ones I did find I replaced with a non-unicode version.
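One simple way to flag such tags, assuming "unicode tag" means any tag containing a non-ASCII character, is sketched below. The function name and sample tags are hypothetical; `str.isascii` requires Python 3.7 or later.

```python
def is_unicode_tag(tag):
    """Return True if the tag contains any non-ASCII character."""
    return not tag.isascii()


# Hypothetical sample; the second tag contains an accented character.
sample_tags = ["serious_face", "caf\u00e9", "1girl"]
unicode_tags = [tag for tag in sample_tags if is_unicode_tag(tag)]
print(unicode_tags)
```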

    Updated by BrokenEagle98

  • ID: 118687
  • Permalink
  • Type-kun

    BrokenEagle98 said:

    I gathered 30 days' worth of data for each of the above reports. The following is the number of rows for each category:

    I actually wanted to see a weekly count distribution - a table for week 1, week 2 etc. That said, we can safely assume that there'll be about 6000 rows per week. That's 310k per year, not a small amount, but not that large either. Given its nature, it would be a well-balanced tree if indexed by date+user and should work fast.
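The estimate and the date+user index can be sketched as follows. SQLite is used here purely as a stand-in, since the thread does not say which database reportbooru uses; the table name, column names, dates, and sample rows are all hypothetical.

```python
import sqlite3

ROWS_PER_WEEK = 6000  # estimate quoted in the post above
rows_per_year = ROWS_PER_WEEK * 52
print(rows_per_year)  # 312000, i.e. roughly the 310k mentioned

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weekly_activity ("
    " week_start TEXT NOT NULL,"
    " user_id INTEGER NOT NULL,"
    " category TEXT NOT NULL,"
    " row_count INTEGER NOT NULL)"
)
# Composite index on date + user, as suggested in the post.
conn.execute(
    "CREATE INDEX idx_week_user ON weekly_activity (week_start, user_id)"
)

# Hypothetical sample rows: (week_start, user_id, category, row_count).
conn.executemany(
    "INSERT INTO weekly_activity VALUES (?, ?, ?, ?)",
    [
        ("2015-07-27", 1, "comment", 4),
        ("2015-08-03", 1, "comment", 2),
        ("2015-08-03", 2, "notes", 7),
        ("2015-08-10", 1, "pools", 3),
    ],
)

# Aggregate per user over all weeks starting on or after a given date.
totals = conn.execute(
    "SELECT user_id, SUM(row_count) FROM weekly_activity "
    "WHERE week_start >= ? GROUP BY user_id ORDER BY user_id",
    ("2015-08-03",),
).fetchall()
print(totals)  # [(1, 5), (2, 7)]
```

Storing ISO-formatted week start dates as text keeps string comparison consistent with chronological order, which is what lets the range filter use the index.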

    BrokenEagle98 said:

    Also, for issue #2640, there is still no way to tell from the API whether someone has unlimited uploads/approval permissions. I made a comment of it on that issue, but there has still been no response. I'd rather not have to write or use an HTML parser to find that information out if at all possible.

    Albert's busy ironing out saved searches. I'm currently catching up on RoR and Git to be able to contribute directly rather than with pseudocode and ideas/issues, but it'll take some time 'till I'm ready. The remainder of issue #2640 is quite simple to fix, but it's something that needs testing, so I'm not going to fix it blindly. Once my dev environment is set up properly, small bugs will be squashed faster, hopefully.

  • ID: 118694
  • Permalink