Redirects have been coming up a lot lately, and while I can walk someone through the basics of what one is, we have a friendly Webwizard (as he was referred to earlier today) on hand. Nakchak was gracious enough to really delve into what a redirect is, what it means, and how to check for one.
I asked Nakchak if I could post some of his explanations today, and he took it one step further with an email and tutorial. I am hoping that this will help readers, both longtime and just visiting.
He starts off in response to a question from someone who posted "http://www.games-workshop.com/en-US/insert random words here" and came up with the standard error page on the GW site that we all know and love.
A huge thanks to Nakchak for taking the time to do this.
That is a friendly error page. For it to be a redirect, you need to look at the HTTP status and/or the presence of capitalized words in the URL.
In short, assuming you have the ability to examine the HTTP response (everyone does if they are using a PC-based browser and have the skills to press F12 and understand what they see): if you see a 301 status followed by a 404, then you have found a URL that exists in the URL rewriting engine of GW's site but either doesn't have a published page or is unused. If you just see a 404 status, such as for http://www.games-workshop.com/en-GB/i-tried-to-be-clever-but-just-made-myself-look-like-a-twat, then all you have done is get the page-not-found error displayed: no 301 status, no redirect.
Most redirects found so far have been capitalized, but that is only because the default settings of the CMS auto-create a URL redirect based on the page title, even if GW override the URL to include the SKU in it (i.e. the suffix of -EN on English codices as opposed to -DE on the German versions).
The redirects prove that the URLs exist in the database that powers GW's website. They don't prove that the products exist other than as part of an SEO (Search Engine Optimization) strategy. That said, when the redirects are tied against a leaked release schedule, they do give us some insight into the internal workings of GW's web product team.
The presence of redirects seems to suggest that, although not published to the public, the pages have been created within the CMS and are probably going through an internal drafting process. Due to the design of Oracle's ATG CMS (which GW are using), even though the page hasn't been published, the URL redirect engine doesn't honor the page's publication date and redirects requests to the friendly URL (the lower-case variant responds with a 301 permanent redirect HTTP status and redirects to the capitalized variant). That URL attempts to serve the unpublished page, which is then detected, and the friendly 404 (page not found) error page is served instead.
Now, some of the redirects also seem to indicate that retired URLs are still in the redirect engine (they will have been spidered by search engines, so they are still valuable as an entry point to the site); Logan Grimnar springs to mind.
That said, the one thing that can be divined from redirects is that the URL exists in the DB and the keywords trigger a rewrite, so there is some intent on GW's part to utilize that URL. Combined with educated guesses, the presence of a 301 redirect does lend some weight to the redirect's validity, but only if you observe a 301 HTTP status when you first load the page using some HTTP analysis tool; otherwise it's just a friendly error page.
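To make the rule of thumb above concrete, here is a minimal Python sketch that takes the list of HTTP status codes you saw for one page load (in order) and says what it most likely means. The function name and the wording of the labels are my own for illustration; only the 301-then-404 versus plain-404 distinction comes from the explanation above.

```python
def classify(statuses):
    """Classify the HTTP status codes seen for one page load, in order.

    `statuses` is e.g. [301, 404] (redirect, then not found) or [404].
    The returned labels are illustrative wording, not anything official.
    """
    if not statuses:
        return "no response"
    first, last = statuses[0], statuses[-1]
    if first == 301:
        # The URL is known to the rewriting engine.
        if last == 404:
            return "redirect to unpublished or unused URL"
        if last == 200:
            return "redirect to live page"
        return "redirect, other outcome"
    if first == 404:
        # No 301 first: just the friendly error page.
        return "plain page not found"
    if first == 200:
        return "live page"
    return "other"


if __name__ == "__main__":
    print(classify([301, 404]))
    print(classify([404]))
```

The only case that counts as "found a redirect" in the sense used here is the first one: a 301 as the very first entry, with the 404 coming afterwards.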
Tutorial via Nakchak
I'm using Firebug for Firefox, but really any of the browsers' built-in developer tools can do this; you just need to examine the network response from the initial page request.
1. Install Firefox (you can do this in any of the major browsers, as they all have developer tools built in now; I just like Firebug for Firefox the best)
2. Install Firebug (http://www.getfirebug.com)
3. Browse to http://www.games-workshop.com
4. Press “F12” to open Firebug
5. Click on the “Net” tab
6. Enter the URL you want to check for a redirect
7. Watch the responses displayed in the Firebug console window
8. Bonus points: you can delve deeper by expanding a request and looking at its HTTP headers. This is where you get told that GW are using Oracle's ATG Web Commerce platform to run their site, so further information on how the platform works can be found by digging into Oracle's documentation
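If you would rather not click around in a browser, roughly the same check can be sketched in Python using only the standard library. This is a rough equivalent of watching the Net tab, not what Firebug does internally: it requests a URL without following redirects, notes the status, and then follows the Location header by hand. The function names `fetch_once` and `trace` and the `max_hops` limit are my own choices for illustration.

```python
import http.client
from urllib.parse import urlsplit, urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url):
    """Request `url` without following redirects.

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace(url, fetch=fetch_once, max_hops=5):
    """Follow redirects by hand; return [(url, status), ...] in order."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative path
    return hops
```

Calling `trace()` on a URL and printing each pair gives you the same top-to-bottom list of statuses as the request list in the browser. Unlike a browser, this script keeps no cache, so a 301 will show up on every run rather than only the first time.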
It’s the top two lines that are of interest. The 301 status tells your browser to remember the fact that the URL has changed and to redirect the request to the URL on the second line, which is currently not published, so the CMS serves the requester a 404 page-not-found error page.
This screenshot illustrates the difference between a redirect and a simple page not found:
The key is the lack of a 301 redirect as the first entry on the request list.
A few things to note:
· Because it is a 301 redirect, you will only see the redirect once, unless you clear your browser's cache or are in private/pr0n browsing mode, as the 301 response instructs your browser to remember the fact that the page has been redirected and to always go directly to the redirected content on subsequent requests.
· Although capitalised words in a URL are a good indication of a redirect, that is almost certainly default behaviour of the CMS that GW are using; the real test is the presence of a 301 response status.
· Just because a 301 exists, it doesn’t mean that a page will. Consider this:
This (and the Dark Angel URL, I fear) shows that the default auto URL is the product's title, but the correct URL for the BA codex includes an SKU (for codices it’s the publication language, -EN for English, -DE for German, -ES for Spanish etc., or the publication year, i.e. Codex-Necrons-2015). These are placeholder URLs, and GW could very well decide in the future to have multiple URLs pointing to single products; i.e. chaptername-Assault-Marines wouldn’t indicate that Chaptername are getting a dedicated box, but would exist as search engine food, so that the URL is there to snare people searching Google for Chaptername Assault Marines.
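As a rough illustration of the two URL hints discussed above (capitalised words, and an SKU-style suffix), here is a small Python sketch. It only looks at the text of the URL, so it can never replace checking for the 301 status. The helper names are mine, the language list contains just the three suffixes mentioned above, and the "Codex-Blood-Angels-EN" slug in the usage note is a made-up example, not a confirmed URL.

```python
import re
from urllib.parse import urlsplit

# Only the suffixes named in the text; others presumably exist.
LANGUAGE_SUFFIXES = {"EN", "DE", "ES"}


def last_segment(url):
    """Return the final path segment (the slug) of a URL."""
    path = urlsplit(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def has_capitalised_word(slug):
    """True if any hyphen-separated word starts with a capital letter."""
    return any(word[:1].isupper() for word in slug.split("-"))


def sku_suffix(slug):
    """Return ('language', code), ('year', yyyy) or None for the slug's tail."""
    tail = slug.rsplit("-", 1)[-1]
    if tail in LANGUAGE_SUFFIXES:
        return ("language", tail)
    if re.fullmatch(r"(19|20)\d{2}", tail):
        return ("year", tail)
    return None
```

For example, `sku_suffix("Codex-Necrons-2015")` gives `("year", "2015")`, the hypothetical `sku_suffix("Codex-Blood-Angels-EN")` gives `("language", "EN")`, and a title-only slug with no suffix gives `None`, which by the reasoning above would suggest the default auto-generated URL rather than the final one.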
Let me know if you would like any further clarification or information on this sort of thing; ecommerce has been my day job since ’97….