[NetBehaviour] Search term as text

Jim Andrews jim at vispo.com
Fri Feb 19 05:05:39 CET 2010


Hi Curt,

Thanks for the 'safe search' paper. I use that feature in dbCinema. In the 
Shockwave version of dbCinema, 'safe search' is turned 'on' and can't be 
turned 'off'. Which is odd phrasing for a porn filter, being permanently 
'turned on', I mean; it sounds like something other than what it means. And 
in the desktop version of dbCinema, it's 'off' by default and can be turned 
'on'.

of course, the problem of determining whether an image is significantly 
'red' (or any given colour) is quite a bit easier than determining whether 
it's 'adult content'. deciding whether a picture is significantly 'red', for 
instance, is just a matter of checking the palette or colour set used in the 
image to see whether a significant number of pixels fall within a certain 
range of red. that's simple math, and it's quick for gifs because they carry 
a palette of at most 256 colours. but jpgs and other continuous-tone formats 
don't carry a colour table at all, so you'd have to check the image pixel by 
pixel, which is slower, though it's still just a single pass through the 
pixels.
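here's a rough sketch of that kind of check, in Python with Pillow, purely 
for illustration (not the Shockwave environment dbCinema runs in); the 'red' 
range and the 20% cutoff are arbitrary assumptions:

from PIL import Image

def is_reddish(r, g, b):
    # crude notion of 'red': strong red channel, clearly above green and blue
    return r > 150 and r > g + 50 and r > b + 50

def fraction_red(path):
    img = Image.open(path)

    if img.mode == "P":
        # paletted image (e.g. a gif): at most 256 entries, so we can work
        # from the colour table instead of visiting every pixel individually
        palette = img.getpalette()          # flat [r, g, b, r, g, b, ...]
        counts = img.getcolors(256)         # [(pixel_count, palette_index), ...]
        total = sum(c for c, _ in counts)
        red = sum(c for c, idx in counts
                  if is_reddish(*palette[idx * 3 : idx * 3 + 3]))
        return red / total

    # continuous-tone image (e.g. a jpg): no colour table, so one pass
    # over the pixels
    pixels = list(img.convert("RGB").getdata())
    red = sum(1 for r, g, b in pixels if is_reddish(r, g, b))
    return red / len(pixels)

# e.g. fraction_red("some_image.gif") > 0.2  ->  'significantly red'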

whereas the problem of determining whether an image is 'adult content' is 
considerably more complex. skin tones vary, bikinis show lots of skin 
without being adult content, and so do plenty of other photos. it's a much 
harder task than simply detecting a colour.
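to see why colour alone can't settle it, here's one commonly cited RGB rule 
of thumb for skin tones (the thresholds are one published heuristic, not 
anything google has confirmed using); a high skin fraction fires on faces, 
beaches and bikini shots just as readily as on anything explicit:

def looks_like_skin(r, g, b):
    # crude skin heuristic; real systems layer shape, texture and context
    # on top of colour rules like this one
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_fraction(pixels):
    # pixels: an iterable of (r, g, b) tuples, e.g. from Pillow's getdata()
    pixels = list(pixels)
    return sum(looks_like_skin(*p) for p in pixels) / len(pixels)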

the paper you referenced starts its conclusion by saying:

"The results above indicate that the system is able to
detect roughly 50% of the adult-content images in a
small test set, with roughly 10% of the safe images
being incorrectly marked as adult-content; or at a different
threshold detecting 90% of adult-content images
with a false alarm rate of 35%."

50% or even 90% is not good enough for their purposes. and when you try to 
pull up 'adult content' with safe search on, it's clear they're doing much, 
much better than 50% or even 90%. it seems to be more like 99.999%.

i would expect that this is the result of combining the sort of analysis 
you've referred to with other methods, such as excluding images from sites 
known to host adult content, and excluding images from pages or narrower 
contexts that use certain language.
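a back-of-the-envelope illustration of why stacking filters like that can 
get you from 90% to something like 99.99% -- every number here is invented, 
and it assumes the filters fail independently, which real filters don't 
quite do:

miss_rates = {
    "image analysis (skin/shape model)": 0.10,   # the paper's ~90% detection
    "domain blocklist":                  0.05,   # hypothetical
    "page-text / query-language filter": 0.02,   # hypothetical
}

combined_miss = 1.0
for name, miss in miss_rates.items():
    combined_miss *= miss
    print(f"after {name:<38} miss rate ~ {combined_miss:.4%}")

# ends at a combined miss rate of 0.0100%, i.e. roughly 99.99% of
# adult-content images caught, if (and only if) the filters' failures
# were independent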

concerning the imgcolor parameter: if it were ready for prime time, it 
would be on http://images.google.ca/advanced_image_search , i expect. also, 
since the results are so good with plain queries like 
http://images.google.ca/images?q=turquoise , or even 
http://images.google.ca/images?q=turquoise+car , i don't see the value in 
investing in a parameter that detects colour by analyzing the colour in the 
image itself. any colour analysis behind an imgcolor parameter would overlap 
with work they've already done in, say, the safe-search analysis for the 
presence of skin. safe search is important to the business viability of 
image search, and it requires a bit of colour analysis; i expect that's the 
end of the colour analysis unless a revenue case can be made for more.
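for reference, the kind of URL being discussed, sketched in Python -- the q 
parameter is straight from the examples above, while the 'safe' and 
'imgcolor' parameter names (and imgcolor's accepted values) should be 
treated as assumptions about google's interface rather than anything 
documented:

from urllib.parse import urlencode

def image_search_url(query, safe="active", imgcolor=None):
    params = {"q": query, "safe": safe}       # safe=active or safe=off
    if imgcolor:
        params["imgcolor"] = imgcolor         # the undocumented colour filter
    return "http://images.google.ca/images?" + urlencode(params)

# image_search_url("turquoise car")
#   -> http://images.google.ca/images?q=turquoise+car&safe=active
# image_search_url("turquoise", imgcolor="teal")
#   -> http://images.google.ca/images?q=turquoise&safe=active&imgcolor=teal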

but i think it's pretty impressive that queries like 
http://images.google.ca/images?q=turquoise work as well as they do without 
any colour analysis at all.

ja 



