I don’t specify a license for the content I publish here at guyzero.com, other than asserting copyright. Someday I’ll get around to specifying some form of CC-SA (Creative Commons – Share Alike), which would let folks easily reuse my content. Maybe I’m just waiting to have some content that is actually worth using!
Anyhow, I noticed some additional traffic to one of my screenshot images. The owner of an Italian website must’ve found my image through an image search (the image’s file name matched the topic of his own content). He could’ve simply downloaded the image and placed it on his own site, and I probably would never have known; since the image was just a screenshot of a program, I would not have cared. Instead, he linked directly to the image on my webserver, so whenever a visitor loads his page, the image is pulled from my server, using my bandwidth each time.
Spotting a hijacker
One way to spot this activity is to parse the raw logs that your web server writes for every access. These logs record every request, whether for a page, an image, or any other file, so requests for hijacked resources will appear here. This is how we did web metrics in the old days (i.e. 2004), and there is useful software that can parse these logs and generate nice reports for you. Check the “Top Image” or “Top Resource” reports to see if any item is getting out-of-the-ordinary usage, and if you spot something, check your “Top Referrers” report, which may identify the hijacker. Please comment if you want an article about the setup and use of log-based web metrics software.
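If you would rather check the raw logs by hand, here is a minimal sketch of the idea. It assumes the log uses the common Apache/nginx "combined" format, where the referrer is the second-to-last quoted field. The function name, the list of image suffixes, and the `own_hosts` set are all made up for illustration; adjust them to your own setup.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed layout (combined log format):
# host ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")


def external_image_hits(lines, own_hosts):
    """Count (image path, referring host) pairs where the referrer is another site."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        path = match.group("path").split("?", 1)[0]  # ignore any query string
        if not path.lower().endswith(IMAGE_SUFFIXES):
            continue
        ref_host = urlparse(match.group("referer")).hostname
        if ref_host is None or ref_host in own_hosts:
            continue  # no referrer ("-"), or a request from your own pages
        hits[(path, ref_host)] += 1
    return hits
```

Feeding it the lines of an access log, for example `external_image_hits(open("access.log"), {"guyzero.com", "www.guyzero.com"}).most_common(10)`, lists the images most often requested from other people's pages, along with the sites doing the requesting. The file name `access.log` is a placeholder.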
Another way to spot this activity is to use Yahoo Site Explorer or Google Webmaster Tools, which can sometimes show when an external site has linked to a resource on your site, and may have the added benefit of identifying the site that is hijacking your image.
So now that you’ve identified that somebody is linking directly to a resource on your site, what do you do? Well, you’ve got options:
- Contact the website that is linking to your resource and ask them to comply with your license. You may want them to simply stop linking to your work, or you may want to give them permission to continue linking as long as they also provide a visible link back to your website or some other attribution.
- Rename the image and update your own content to point to the new name. This has the effect of displaying a broken image box on the hijacker’s page. You will continue to get requests to your webserver for the missing image.
- Rename the image and put a different image in place under the old name. With a bit of imagination, this can have hilarious results.
For both of the last two options, you should consider adding “Disallow: /path/to-your/image.jpg” to your robots.txt file. This tells well-behaved search engine crawlers to stop fetching your old image.
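A Disallow line only takes effect inside a User-agent group, so it is worth checking that the rule does what you expect before relying on it. The sketch below uses Python's standard `urllib.robotparser` to test a robots.txt body against a URL. The `is_crawlable` helper and the `User-agent: *` group are assumptions for illustration, and the image path is the same placeholder used above.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt body; the path is a placeholder, not a real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /path/to-your/image.jpg
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the rules above, `is_crawlable(ROBOTS_TXT, "http://guyzero.com/path/to-your/image.jpg")` should come back false, while other paths on the site stay crawlable. This only reflects how a compliant crawler reads the file; it does nothing about browsers loading the image from the hijacker's page.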
In this case, I chose the last option, as I wanted to see how long the replaced image might continue to live on the hijacker’s website. Rather than replace the old image with an image of b00bies, or a message to not hijack my images, which was my first instinct, I instead placed a highly visible watermark across the image, giving my little low-traffic blog some free advertising to Italian computer enthusiasts. Buongiorno my friends!
I believe that it is possible to automatically show a “Do Not Hijack My Images” image in place of any image on your website by adjusting your webserver configuration to detect visitors who request an image directly rather than arriving through one of your own pages. This issue has not yet become painful enough for me to look into how to do this, but if you can point me in the right direction, please share in the comments below.
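One common way to approximate this, offered here as an untested assumption rather than something I have running, is to look at the HTTP Referer header that browsers usually send with image requests. If the header names somebody else's site, serve the placeholder instead. The sketch below shows only the decision logic; the function name, the placeholder path, and the `own_hosts` set are hypothetical, and wiring it into a real webserver is left out.

```python
from urllib.parse import urlparse

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")


def choose_image(path, referer, own_hosts, placeholder="/images/do-not-hijack.png"):
    """Return the path to serve: the requested one, or the placeholder for hotlinks."""
    if not path.lower().endswith(IMAGE_SUFFIXES) or path == placeholder:
        return path  # only images are protected; never redirect the placeholder itself
    if not referer:
        return path  # no Referer header: direct visit or a privacy tool, so allow it
    host = urlparse(referer).hostname
    if host is None or host in own_hosts:
        return path  # unparseable referrer, or a request from one of your own pages
    return placeholder  # the image is embedded in somebody else's page
```

Allowing requests with an empty Referer is a deliberate choice in this sketch: some browsers and proxies strip the header, and blocking them would break images for legitimate visitors. The trade-off is that a hijacker whose visitors send no Referer will still get the real image.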