[NetBehaviour] Search term as text
Jim Andrews
jim at vispo.com
Fri Feb 19 05:05:39 CET 2010
Hi Curt,
Thanks for the 'safe search' paper. I use that feature in dbCinema. In the
Shockwave version of dbCinema, 'safe search' is turned 'on' and can't be
turned 'off', which is an odd thing to say of a porn filter: permanently
'turned on' sounds like the opposite of what it means. In the desktop version
of dbCinema, it's 'off' by default and can be turned 'on'.
Of course, the problem of determining whether an image is significantly
'red' (or any given colour) is quite a bit easier than determining whether it's
'adult content'. Deciding whether a picture is significantly 'red' is just a
matter of checking the palette or colour set used in the image to see whether a
significant number of pixels fall within a certain range of red. That's simple
math, and it's quick with GIFs because they carry at most a 256-colour palette.
With JPEGs and so forth, I don't think there's a colour table in the header of
the image, which means you'd have to check it pixel by pixel. That would be
slower, but still just one pass through the pixels. Something like the sketch
below.
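Here's a rough sketch of that pixel-by-pixel check in Python, using the Pillow
library. Just to illustrate: the hue range and the 'significant' threshold
below are guesses of mine, not anything dbCinema or Google actually uses.

from PIL import Image

def is_significantly_red(path, hue_tolerance=20, min_saturation=80,
                         min_value=60, min_fraction=0.25):
    """Return True if at least min_fraction of the pixels fall in a rough 'red' range."""
    img = Image.open(path).convert("RGB").convert("HSV")
    pixels = list(img.getdata())
    red_count = 0
    for h, s, v in pixels:
        # Pillow scales hue to 0-255; red sits near 0 and wraps around near 255.
        near_red = h <= hue_tolerance or h >= 255 - hue_tolerance
        if near_red and s >= min_saturation and v >= min_value:
            red_count += 1
    return red_count / len(pixels) >= min_fraction

print(is_significantly_red("example.jpg"))

One pass through the pixels, as above; with a GIF you could get the same answer
from the 256-entry palette (weighted by how often each entry is used) without
touching every pixel.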
The problem of determining whether an image is 'adult content', by contrast, is
considerably more complex. Skin tones vary, bikinis show lots of skin without
being adult content, and many other photos do the same. It's a much harder task
than simply detecting a colour, as the naive skin check below illustrates.
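For what it's worth, here's the same kind of check aimed at skin tones, in
Python with Pillow. The RGB skin range is a common rule of thumb, not anything
from the paper, and the point is exactly the weakness above: a beach photo and
an adult image can both score high, so colour alone can't decide.

from PIL import Image

def skin_fraction(path):
    """Fraction of pixels falling in a crude RGB 'skin tone' range."""
    img = Image.open(path).convert("RGB")
    pixels = list(img.getdata())
    skin = sum(1 for r, g, b in pixels
               if r > 95 and g > 40 and b > 20        # bright enough
               and r > g and r > b                    # reddish cast
               and max(r, g, b) - min(r, g, b) > 15)  # not grey
    return skin / len(pixels)

print(skin_fraction("beach.jpg"))  # can score as high as actual adult content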
The paper you referenced starts its conclusion by saying:
"The results above indicate that the system is able to
detect roughly 50% of the adult-content images in a
small test set, with roughly 10% of the safe images
being incorrectly marked as adult-content; or at a different
threshold detecting 90% of adult-content images
with a false alarm rate of 35%."
50% or even 90% is not good enough for their purposes. And when we try to get
some 'adult content' while safe search is on, we see results much, much better
than 50% or even 90%; it seems to be more like 99.999%. A quick bit of
arithmetic shows why the quoted rates wouldn't do.
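To put numbers on it (the 1% base rate of adult images is a made-up figure of
mine, just for the arithmetic):

def leakage(total_images, adult_fraction, detection_rate, false_alarm_rate):
    adult = total_images * adult_fraction
    safe = total_images - adult
    missed_adult = adult * (1 - detection_rate)   # adult images that slip through
    blocked_safe = safe * false_alarm_rate        # safe images wrongly hidden
    return missed_adult, blocked_safe

missed, blocked = leakage(1000000, 0.01, 0.90, 0.35)
print(missed, blocked)  # 1,000 adult images shown, 346,500 safe images hidden, per million

Even at the paper's better threshold, that's a thousand adult images getting
past the filter for every million served, nothing like the roughly 99.999%
behaviour we seem to see.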
I would expect that this is the result of combining the sort of image analysis
you've referred to with other methods, such as excluding images from sites
known to show adult content, and excluding images from pages or narrower
contexts that use certain language. Something like the combination sketched
below.
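A sketch of how those signals might be combined, in Python; the blocklist, the
flagged words and the threshold here are purely hypothetical, since I have no
idea how Google actually weighs these things.

BLOCKED_DOMAINS = {"example-adult-site.com"}
FLAGGED_WORDS = {"xxx", "porn"}

def looks_safe(image_score, source_domain, page_text):
    """image_score: 0..1 from an image classifier like the one in the paper."""
    if source_domain in BLOCKED_DOMAINS:
        return False                      # site known to show adult content
    if set(page_text.lower().split()) & FLAGGED_WORDS:
        return False                      # surrounding language is a giveaway
    return image_score < 0.5              # otherwise fall back on the image analysis

print(looks_safe(0.2, "vispo.com", "dbCinema screenshots and notes"))

Each extra signal is cheap and independent of the image analysis, which is
presumably how the combined system gets from 90% to something like 99.999%.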
Concerning the imgcolor parameter: if it were ready for prime time, it would
be on http://images.google.ca/advanced_image_search , I expect. Also, since
the results are so good with plain text queries like
http://images.google.ca/images?q=turquoise , or even
http://images.google.ca/images?q=turquoise+car , I don't see what the value
would be in investing in a parameter that detects colour by analyzing the
colour in the image itself. Any colour analysis behind the imgcolor parameter
would be work they've already done elsewhere, say in the safe search analysis
for the presence of skin. Safe search is important to the business viability
of image search, and it requires a bit of colour analysis; I expect that's the
end of the colour analysis unless a case can be made for more of it in terms
of revenue. But I think it's pretty impressive that queries like
http://images.google.ca/images?q=turquoise work so well without any colour
analysis at all.
ja