Idée Inc brings image de-duplication to Digg

Idée’s image recognition technology is at the heart of Digg’s dedicated images section, where it identifies duplicate image submissions.

In preparing to launch a dedicated images section, Digg recognized that it needed to deal with duplicate image submissions. When you have over a million registered users, that’s bound to happen! Digg already had controls in place to keep duplicate stories from being submitted, and the same needed to happen for image submissions.

Now, recognizing an exact duplicate image file is easy. Simply run a good hash function over every previously submitted image, compare the hash of each newly submitted image against those stored values and voilà, problem solved!
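As a rough illustration of that exact-match approach, here is a minimal Python sketch. It assumes SHA-256 as the "good hash function" and an in-memory dictionary as the store of previous submissions; the class name, method names and image IDs are made up for the example and say nothing about how Digg actually implemented its checks.

```python
import hashlib
from typing import Dict, Optional


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of an image file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


class ExactDuplicateIndex:
    """Hypothetical in-memory index of previously submitted image files."""

    def __init__(self) -> None:
        # Maps digest -> ID of the first submission that had those bytes.
        self._seen: Dict[str, str] = {}

    def submit(self, image_id: str, data: bytes) -> Optional[str]:
        """Record a submission.

        Returns the ID of the earlier identical file if there is one,
        otherwise stores this file's digest and returns None.
        """
        digest = file_digest(data)
        earlier = self._seen.get(digest)
        if earlier is not None:
            return earlier
        self._seen[digest] = image_id
        return None


index = ExactDuplicateIndex()
first = index.submit("img-1", b"raw bytes of some image file")
second = index.submit("img-2", b"raw bytes of some image file")
print(first, second)
```

The second submission is flagged because its bytes are identical to the first. Lookup cost stays constant no matter how many images are already stored, which is why this approach is so attractive when it applies.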

However, in real life an image picks up many small edits as it is copied and exchanged. Everything from resizing, cropping and file format changes to color shifts, rotation adjustments and text overlays commonly occurs. With any of these edits, the file’s bytes change, its hash changes with them, and the hash function approach breaks down completely.
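To see how fragile a file hash is, consider this small Python sketch. It assumes SHA-256 and uses made-up stand-in bytes rather than a real image file; flipping a single bit plays the role of the tiniest possible edit.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def flip_one_bit(data: bytes, index: int = 0) -> bytes:
    """Return a copy of data with the lowest bit of one byte flipped."""
    edited = bytearray(data)
    edited[index] ^= 0x01
    return bytes(edited)


# Stand-in for an image file's contents (not a real image).
original = bytes(range(256)) * 16
edited = flip_one_bit(original, index=100)

print(sha256_hex(original))
print(sha256_hex(edited))
print("digests match:", sha256_hex(original) == sha256_hex(edited))
```

The two digests share nothing useful in common, so a hash comparison cannot tell "almost the same picture" apart from "a completely different picture". A resize or a format change alters far more than one bit.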

That’s where Idée’s image recognition technology comes into play: it compares each newly submitted image to potentially hundreds of thousands of previously submitted ones, and does so in a fraction of a second.

How does Idée’s image recognition work?
Idée’s proprietary algorithms look at the patterns within an image to identify it. Each image submitted to the image recognition server is analyzed to build an image identifier composed of hundreds of image fingerprints. As an analogy, think of every image as having its own unique digital identifier made up of those fingerprints. This allows our image recognition technology to identify an image even if it has been cropped, rotated or color adjusted, or has had a border applied.
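Idée’s algorithms are proprietary and are not described here, so the following Python sketch is emphatically not them. It shows a much simpler, publicly known technique, the average hash, purely to illustrate the general idea of identifying an image by its content rather than by its file bytes. It assumes a grayscale image given as a grid of 0-255 values whose dimensions are multiples of 8; every function name is made up for the example.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), row-major


def downsample(pixels: Grid, size: int = 8) -> Grid:
    """Shrink to size x size by averaging equal blocks.

    Assumes height and width are multiples of size.
    """
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [pixels[y][x]
                     for y in range(by * block_h, (by + 1) * block_h)
                     for x in range(bx * block_w, (bx + 1) * block_w)]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Pack one bit per cell: 1 if the cell is brighter than the mean."""
    flat = [v for row in downsample(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# A synthetic 16x16 gradient plus three variations of it.
original = [[x * 8 + y * 4 for x in range(16)] for y in range(16)]
resized = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
brighter = [[v + 10 for v in row] for row in original]
mirrored = [row[::-1] for row in original]

for name, img in [("resized", resized), ("brighter", brighter), ("mirrored", mirrored)]:
    print(name, hamming(average_hash(original), average_hash(img)))
```

Two images are treated as likely duplicates when the Hamming distance between their hashes is small. This toy version shrugs off resizing and uniform brightness shifts, but it does not cope with cropping or rotation, which is exactly the kind of edit that a set of many localized fingerprints is meant to handle.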

To see great image recognition examples, visit the Idée image gallery.

We delivered this image recognition technology to Digg as a web service. It is designed to handle numerous image variations and the high volume of images expected to be submitted.

Let me take you through a Digg image submission to see how the image recognition works:

Step 1: Submit a page containing an image.

Step 2: Choose the image.

Step 3: If the image has been submitted previously, thanks to the good folks at Idée, you will see a page telling you so.

Happy image digging! We love feedback, so please comment below or drop us an email.

If you are curious about what else we are up to, visit the Idée labs.




[…] Idée has a follow-up post with a lot of information here. […]

Idée providing duplicate image detection to | StartupNorth Dec 04 07 at 9:22 am

[…] adds image submissions to one of the world's most visited websites, has a Canadian connection. Technology developed by Idée Inc. is used in Digg's newly unveiled dedicated image section to identify if any submitted pictures […]

Canadian software company aids Digg's new image feature - FP Posted Dec 04 07 at 6:58 pm

[…] or for a variety of other purposes. Digg, the social news powerhouse, launched Digg Images with a partnership with Idée. Idée helps the company to identify duplicate images when they’re submitted. Just as with […]

The Rev2 Cabinet: Idée - Feb 29 08 at 2:46 am

[…] one you will never win (unless you have a lot of money like the Digg guys and can borrow some fancy image recognition technology to deal with it :P). So the most usual approach is “why bother”, which, I should say […]

visualizeus » Fighting against duplicated images Jun 27 08 at 4:34 pm

Do you have the ability with your system to filter out certain images that could contain offensive content?

Noah Everett Dec 05 07 at 11:53 am

Stealing an artistic creation, violating a copyright, misappropriating a work without permission: on the net, it’s easy. You only have to browse a few sites to understand that attitudes need to change. And yet, there is no longer any excuse[…]

Brian Jan 10 08 at 5:47 am

This is awesome!
The technology is amazing and the products like multicolor search are really well done!

I have a visual bookmark site. it’s
Any chance to do the same thing there?

I’m totally in love with idée
(and waiting for an invitation to tineye private beta)

great work!

Fabio Giolito Jul 18 08 at 10:22 am
