
Reposting

Oct 31, 2024 - permalink

This looks like someone tried to play tic-tac-toe.

Oct 31, 2024 - permalink

Yeah, that's because the most popular women (Ruby Rivera is currently around the 30th most-subscribed on the site, out of over 78,000) attract a rush of uploads: whenever they make a new Instagram post, a bunch of people will hurry to upload it, and the various resolutions etc. available mean the copies won't always be blocked by the automatic duplicate detection.

Chainer
Oct 31, 2024 - permalink

I've been trying to think of a way to disincentivize this but haven't come up with anything simple and likely to work. If anyone has ideas I'm all ears.

Nov 01, 2024 - permalink

Can you please tell me how the similar image detection works? By metadata?

If so, it will be difficult to disincentivize.

2 days ago - edited 2 days ago - permalink

Bingo!

Yes, I report them, but it's like playing whack-a-mole.

2 days ago - edited 2 days ago - permalink

I've been trying to think of a way to disincentivize this but haven't come up with anything simple and likely to work. If anyone has ideas I'm all ears.

I've previously also asked what criteria are used to detect dups.

This is an area where AI might be useful for scanning images and flagging possible dups, but I don't know how practical or useful it would be for you here, even if it were possible to implement. Otherwise, could uploads to, say, the most popular models (if named as they are uploaded) be held for manual mod approval?

tamarok
2 days ago - permalink

Can you please tell me how the similar image detection works? By metadata?

If so, it will be difficult to disincentivize.

It’s done by a library that creates a signature based on the image content. The challenge is that it is CPU intensive. Anything with a high CPU hit will impact site performance and hence degrade the experience for people visiting the site.

AI based solutions would likely impact the CPU even more.

Finding the right solution involves balancing many considerations and factors.
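
For anyone wondering what a content-based signature can look like, here is a minimal sketch of one common approach, an "average hash". This is an assumption for illustration only: the library the site actually uses is not named in this thread and may work quite differently. The names `shrink`, `average_hash` and `hamming` are hypothetical, and the input is taken to be an already-decoded grayscale pixel grid so the example needs no imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(pixels: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging each block of source pixels.

    Assumes the image is at least `size` pixels wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


# A 16x16 horizontal gradient and the same picture scaled up to 32x32.
small = [[x * 16 for x in range(16)] for _ in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]
print(hamming(average_hash(small), average_hash(large)))  # small distance = likely the same image
```

Decoding and shrinking every upload, then comparing its signature against the existing ones, is where the CPU cost comes from; the comparison itself is cheap per pair but has to be done against a lot of images.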

Chainer
2 days ago - permalink

There are two ways we detect duplicates:

  • Exact matching based on the contents of the file (aka, its hash). This means the file you are uploading is a literal, byte-for-byte duplicate of something already on the site. These are rejected outright at upload time.
  • The similarity-based method tamarok mentions above. This is able to detect images which are the same, even if one is a resize of the other. It's even tuned right now so that it basically never has a false positive. The problem is that this can't auto-reject images because we don't know which one is higher quality. In order to decide, we still need a mod to look at it. As a result, this just auto-generates an image report for mods to look at rather than outright rejecting the upload.
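
The two checks above could be wired together roughly as follows. This is only a sketch under stated assumptions, not the site's code: `DuplicateIndex`, `upload` and `max_distance` are hypothetical names, SHA-256 stands in for whatever file hash is really used, and the similarity signature is assumed to be an integer computed elsewhere and compared by counting differing bits.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DuplicateIndex:
    by_sha256: Dict[str, int] = field(default_factory=dict)          # file digest -> image id
    signatures: List[Tuple[int, int]] = field(default_factory=list)  # (image id, signature)
    reports: List[Tuple[int, int]] = field(default_factory=list)     # (new id, similar existing id)
    next_id: int = 1

    def upload(self, data: bytes, signature: int, max_distance: int = 2) -> str:
        # Check 1: byte-for-byte duplicate, rejected outright at upload time.
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.by_sha256:
            return "rejected"

        # Check 2: similar content. The upload is still accepted, because the
        # code cannot tell which copy is higher quality; it only files a
        # report for a mod to review.
        new_id = self.next_id
        self.next_id += 1
        similar = [
            image_id
            for image_id, sig in self.signatures
            if bin(sig ^ signature).count("1") <= max_distance
        ]
        self.by_sha256[digest] = new_id
        self.signatures.append((new_id, signature))
        self.reports.extend((new_id, image_id) for image_id in similar)
        return "reported" if similar else "accepted"


index = DuplicateIndex()
print(index.upload(b"original file", 0b1111))  # accepted
print(index.upload(b"original file", 0b1111))  # rejected
print(index.upload(b"resized copy", 0b1110))   # reported
```

A strict `max_distance` is what keeps false positives rare, at the cost of missing copies that have been cropped or heavily edited. The linear scan over all signatures is just for brevity here.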

It's possible that many of those Leyvina uploads were in fact caught by the second one of these, but a mod hadn't yet had a chance to look at the image reports.

2 days ago - permalink

Is there any auto detection of video dups?

tamarok
1 day ago - permalink

Is there any auto detection of video dups?

Beyond a basic file hash, there isn’t. I suppose a check based on video length and model name could be used, but it may be a source of too many false positives, and it wouldn’t deal with videos which are partial copies of another.

1 day ago - permalink

Beyond a basic file hash, there isn’t. I suppose a check based on video length and model name could be used, but it may be a source of too many false positives, and it wouldn’t deal with videos which are partial copies of another.

You could probably match on exact length down to the millisecond, but anything falling on exact second marks would have to be ignored as too common (possibly also multiples of ten milliseconds?).
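
A rough sketch of that idea, assuming each video record carries a model name and a duration in milliseconds. The names `candidate_duplicates`, `is_distinctive` and `round_to_ms` are made up for illustration, and nothing like this exists on the site as far as this thread says.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Video = Tuple[int, str, int]  # (video id, model name, duration in milliseconds)


def is_distinctive(duration_ms: int, round_to_ms: int = 1000) -> bool:
    """Durations on an exact multiple of round_to_ms are treated as too common."""
    return duration_ms % round_to_ms != 0


def candidate_duplicates(videos: Iterable[Video], round_to_ms: int = 1000) -> List[List[int]]:
    """Group videos sharing a model name and an exact, distinctive duration."""
    groups: Dict[Tuple[str, int], List[int]] = defaultdict(list)
    for video_id, model, duration_ms in videos:
        if is_distinctive(duration_ms, round_to_ms):
            groups[(model, duration_ms)].append(video_id)
    # Only groups with more than one video are worth a mod's attention.
    return [ids for ids in groups.values() if len(ids) > 1]


videos = [
    (1, "A", 61234),
    (2, "A", 61234),  # same model, same odd length: flagged together with 1
    (3, "A", 60000),
    (4, "A", 60000),  # exact second mark: ignored as too common
    (5, "B", 61234),  # different model: not grouped with 1 and 2
]
print(candidate_duplicates(videos))  # [[1, 2]]
```

Passing `round_to_ms=10` would also skip lengths that land on multiples of ten milliseconds. As noted above, this would still miss partial videos and re-encodes whose length shifts slightly, so the output would be candidates for a report, not grounds for rejection.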
