Jul 15 2010

Created by the same team that also brought us the indispensable YSlow tests, Boomerang allows us to collect the page load times from actual visitors.

While page load testing on your development machines with Firebug, YSlow, and other tools is important, nothing beats getting real-world “perceived” page load times from your actual visitors, along with other useful metrics.

The software works by adding some JavaScript to the page(s) that you’d like to benchmark and then deploying some PHP or other server-side code that saves the incoming results into a database. The website below offers example code:

Try Boomerang.
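To make the "save the incoming results" half a bit more concrete, here is a minimal sketch in Python rather than PHP. It assumes the results arrive as a query string on a beacon URL, with the load time in a parameter named `t_done` and the page URL in a parameter named `u`. Those names, the `page_loads` table, and both helper functions are assumptions for illustration, so check the Boomerang documentation for what your version actually sends.

```python
import sqlite3
from urllib.parse import urlsplit, parse_qs


def open_store(path=":memory:"):
    """Open (or create) a small SQLite store for page load measurements."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS page_loads (page TEXT, load_ms INTEGER)")
    return conn


def save_beacon(conn, beacon_url):
    """Parse one beacon request URL and store its load time. Returns True if stored."""
    params = parse_qs(urlsplit(beacon_url).query)
    try:
        load_ms = int(params["t_done"][0])  # assumed name: total load time in ms
        page = params["u"][0]               # assumed name: URL of the measured page
    except (KeyError, ValueError):
        return False  # ignore malformed or incomplete beacons
    conn.execute(
        "INSERT INTO page_loads (page, load_ms) VALUES (?, ?)", (page, load_ms)
    )
    conn.commit()
    return True
```

In a real deployment the function would be called from whatever handler receives the beacon request, and you would likely store a timestamp and the visitor's browser alongside the load time.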

May 17 2010

blockOptions.cgi, afterworkOptions.cgi and others are requests from visitors who have tried to access your website from behind Websense content filters and were likely blocked (or only allowed to see your website during certain sanctioned hours). I discovered this after seeing these types of requests in my own referrer reports for guyzero.com.

If you see these types of accesses, check for other accesses that occurred at the same time. They may reveal which IP address or domain was trying to reach your website.

It’s not clear what factors can cause a website to be blocked. They might have some hand-created blacklist and there is likely also some automation; perhaps having certain words or phrases on any of your pages will get your entire site blocked. A site might also be blocked because another website at the same IP (most webservers host multiple sites) had content that tripped the filter. Websense also mentions some kind of mysterious “reputation” criteria in their documentation.

I believe, but am not certain, that requests referred by blockOptions.cgi come from the IT administrator at the company that is running a Websense filter, probably checking up on your content to see what their employees are trying to browse during work hours.

It’s also not clear what you can do as a webmaster to have your site unblocked. A quick perusal of the Websense company website did not turn up an easy link to appeal my suspected block. There is a way to navigate to open a Service Request, which I suspect is the right venue, but you are required to do the standard website registration rigmarole in order to start the request, so I lost interest and gave up. Besides, it feels kind of cool to be on a list of banned websites somewhere, especially given all my dirty Android tips and subversive web developer links.

May 14 2010

I don’t specify a license for the content that I publish here at guyzero.com other than asserting copyright. Someday I’ll get around to specifying some form of CC-SA (Creative Commons – Share Alike), which would let folks easily use my content. Maybe I’m just waiting to have some content that is actually worth using!

Anyhow, I noticed some additional traffic to one of my screenshot images. The owner of an Italian website must’ve found my image using some form of image search (the name of the image was the topic of his own content). He could’ve simply downloaded the image and placed it on his site and I probably would’ve never known, and in this case, since the image was a program’s screenshot, I would not have cared. But instead, he linked to the image on my webserver, so that whenever a visitor loads his page, the image is pulled from me, using my bandwidth each time.

Spotting a hijacker

Resource hijacking, where a website links to your image, movie or JavaScript file, will not show up in Google Analytics or most other common web metrics programs. These programs only record a hit when a visitor loads the HTML or PHP page that contains the call to the metrics program, and a direct request for a single resource never loads that page.

One way to spot this activity is to parse the raw logs that your web server generates for every access. These logs include the hits to every page, every image, every file on the file system, so requests for resources that have been hijacked will appear here. This is how we did web metrics in the old days (i.e. 2004), and there is useful software that can parse these logs and generate nice reports for you. Check “Top Image” or “Top Resource” reports to see if any item is getting out-of-the-ordinary usage, and if you spot something, check your “Top Referrers” reports, which may identify the hijacker. Please comment if you want an article about setup and use of log-based web metrics software.
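If you would rather not install a full log analyzer, the same check can be sketched in a few lines of Python. This is a minimal sketch, assuming your server writes the common "combined" log format (the one with the referrer and user agent in quotes at the end of each line). The function name, the list of image extensions, and the `own_hosts` parameter are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the start of a "combined" format log line, up to the quoted referrer.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def external_image_hits(lines, own_hosts):
    """Count image requests whose referrer is a host other than your own."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines in some other format
        path = urlsplit(match["path"]).path
        if not path.lower().endswith(IMAGE_EXTENSIONS):
            continue
        host = urlsplit(match["referrer"]).hostname
        if host is None or host in own_hosts:
            continue  # no referrer ("-") or a page on your own site
        hits[(path, host)] += 1
    return hits
```

Feeding it an open log file, for example `external_image_hits(open("access.log"), {"www.example.com"}).most_common(10)`, would list the image and referring host pairs that pull the most requests from outside your site.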

Another way to spot this activity is to use Yahoo Site Explorer or Google Webmaster Tools, which can sometimes show when an external site has linked to a resource within your site, and may also identify the site that is hijacking your image.

Possible responses

So now that you’ve identified that somebody is linking to your work on the site, what do you do? Well, you’ve got options:

  • Contact the website that is linking your resource and ask them to comply with your license. You may want them to simply stop linking to your work, or you may want to give them permission to continue to link to your work as long as they also provide a visible link back to your website or some other attribution.
  • Rename the image to another name, and fix your content to point to the new image. This has the effect of displaying a broken image box on the hijacker’s page. You will continue to get requests to your webserver for the missing image.
  • Rename the image to another name and substitute a new image for the old one. With a bit of imagination, this can have hilarious results.

For both of the last two options, you should consider adding “Disallow: /path/to-your/image.jpg” to your robots.txt file. This asks well-behaved search engine crawlers to stop fetching your old image, so it should eventually drop out of their image search results.
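Before relying on a new robots.txt rule, it can be worth checking that it blocks what you expect. Here is a small sketch using the robots.txt parser in Python's standard library. The `is_crawlable` helper and the `example.com` URLs are hypothetical, and the path is the same placeholder used above.

```python
from urllib.robotparser import RobotFileParser

# The rule from above, wrapped in a record that applies to every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /path/to-your/image.jpg
"""


def is_crawlable(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Keep in mind that a Disallow line is a prefix match, so the rule above also covers any other path that starts with the same characters.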

In this case, I chose the last option as I wanted to see how long the replaced image might continue to live on the hijacker’s website. Rather than replace the old image with an image of b00bies, or a message to not hijack my images, which was my first instinct, I instead placed a highly visible watermark across the image, giving my little low-traffic blog some free advertising to Italian computer enthusiasts. Buongiorno my friends!

I believe that it is possible to automatically show a “Do Not Hijack My Images” image in place of any single resource on your website by adjusting your website configuration to look for visitors with initial accesses to an image rather than a page. This issue has not yet become painful enough for me to look into how to do this, but if you can point me in the right direction, please share in the comments below.
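One common way to approximate this, rather than tracking each visitor's first access, is to look at the referrer that the browser sends along with the image request. The decision logic might look something like the Python sketch below. It is only an illustration of the rule, not a web server configuration, and the function name, the placeholder path, and the `own_hosts` parameter are all hypothetical.

```python
from urllib.parse import urlsplit

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def choose_image(path, referrer, own_hosts, placeholder="/images/do-not-hijack.png"):
    """Return the path to serve: the requested one, or the placeholder image."""
    if not path.lower().endswith(IMAGE_EXTENSIONS) or path == placeholder:
        return path  # only images are protected, and never swap the placeholder
    host = urlsplit(referrer or "").hostname
    if host is None or host in own_hosts:
        return path  # no referrer at all, or a page on your own site
    return placeholder  # the image is embedded in somebody else's page
```

Requests with no referrer are let through on purpose, since direct visits and some privacy tools do not send one. The trade-off is that a hijacker who strips the referrer would slip past this check.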

© 2010 guyzero.com Hosted by BartoliTech.com