Safari incorporating URL # fragment into its browser caching - javascript

I'm working on a solution to speed our website up. I'm having the client first ajax load the expected next page of the application:
$.ajax({url: '/some/real/path', ...});
The server responds to this and includes in the header:
Cache-Control => 'max-age=20'
which marks the response as cacheable.
The clientside application then waits to see if its prediction was correct, and upon finding that it was, transitions the browser to that same page, but adds a few bits of information into the URL as a # fragment, where this info is available to us only when the user has actually committed their action (i.e. not predictable):
location.href = '/some/real/path#additionalInfoInFragement';
When the browser transitions to the page, the additional info in the fragment is picked up by that page's JavaScript and used to achieve some effect there.
For all browsers, including Safari, the response to the initial ajax request IS properly inserted into the browser cache.
And then, for all browsers except Safari, the browser pulls that content out of the cache when we effect the location.href transition to that page. This avoids the server hit and is the basis for our speed-up.
Safari though is not using the cache to re-serve the content. It seems to get tripped up by the '#additionalInfoInFragment' part of the transition. It is including the fragment in its construction of the cache key it uses to check for existing cached content. Here are the entries from Safari's cache.db file, which I dumped via sqlite:
* ajax request: INSERT INTO "cfurl_cache_response" VALUES(3260,0,-1982644086,0,'http://localhost:8080/TomcatScratchPad/EmptyPage','2012-05-14 07:01:10');
* location.href transition: INSERT INTO "cfurl_cache_response" VALUES(3276,0,-230554366,0,'http://localhost:8080/TomcatScratchPad/EmptyPage#wtf','2012-05-14 07:01:20');
Also notable is the fact that Chrome is behaving correctly, even though both share a tremendous amount of WebKit code.
I would really appreciate any ideas the community has. Thanks!

I see only a couple of options:
File a bug report with Apple and don't worry about it. :-) Your caching stuff will still work for other browsers. Overall, Safari has a very small market share, although of course if your site is targeted at (say) iPad or iPhone users, that rather changes the nature of the stats for your specific site. :-) (You presumably know from your logs how big your Safari audience is.)
Sub-category: If Safari is a big part of your target market and this really bothers you, see if it's a bug in any of the open source parts of it and, if so, offer a patch.
Don't use the fragment identifier to pass the information, use something else (a cookie perhaps) instead.
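For the second option, here is a minimal sketch of passing the data via a short-lived cookie instead of the fragment (the cookie name and the readCookie helper are hypothetical, and this assumes the extra info is small enough to fit comfortably in a cookie):
// on the committing page: stash the info in a cookie instead of the fragment
var additionalInfo = 'whatWouldHaveGoneInTheFragment';
document.cookie = 'commitInfo=' + encodeURIComponent(additionalInfo) + '; path=/; max-age=60';
location.href = '/some/real/path'; // no fragment, so Safari's cache key matches the prefetched entry

// on the destination page: read it back
function readCookie(name) {
    var match = document.cookie.match(new RegExp('(?:^|; )' + name + '=([^;]*)'));
    return match ? decodeURIComponent(match[1]) : null;
}
var info = readCookie('commitInfo');
Because the destination URL is now identical to the prefetched one, Safari's cache key should match.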


Can I use the browser Navigation Timing API for Ajax events in single page apps? If not, what's a good tool?

We've got a single page app built with Knockout and Backbone which makes Ajax calls to the server and does some complex data caching and DOM rendering. We'd really like to measure the performance (and log it back to the server) as seen by the user. I can't seem to get my head wrapped around whether the browser Navigation Timing API is going to be useful for this or not. From what I see in examples, the Navigation Timing API is tied to window.performance and is limited to the page load, making it unsuitable for monitoring Ajax behavior. True or false? If false, what else can I use?
I'd love to set custom instrumentation points between which to measure time, e.g. for an Ajax call that does some DOM rendering with a server result.
1 - True, window.performance is tied to page load. See example below which shows this:
<button id='searchButton'>Look up Cities</button>
<br>
Timing info is same? <span id='results'></span>
<script type="text/javascript" src="//code.jquery.com/jquery-1.9.1.min.js"></script>
<script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/underscore.js/1.4.4/underscore-min.js"></script>
<script type="text/javascript">
jQuery('#searchButton').on('click', function(e){
// deep copy the timing info
var perf1 = jQuery.extend(true, {}, performance.timing);
// do something async
jQuery.getJSON('http://ws.geonames.org/searchJSON?featureClass=P&style=full&maxRows=10&name_startsWith=Denv', function() {
// get another copy of timing info
var perf2 = jQuery.extend(true, {}, performance.timing);
// show if timing information has changed
jQuery('#results').text( _.isEqual( perf1, perf2 ) );
});
return false;
});
</script>
Also, even if you did get it working you'd have missing data from old browsers that don't support this object.
2 - The Boomerang project seems to go beyond the web timing API and also supports older browsers. There is a talk with slides and sample code by the current maintainer listed in this conference. Sorry no direct link.
You can now use the User Timing API (W3C Recommendation 12 December 2013), which provides a way to insert API calls at different points in your JavaScript and then extract detailed timing data.
You do that using mark(), which lets you work out how much time it took to hit that 'mark' in your web application, and then measure() to calculate the time elapsed between your marks.
For your specific case you can have something like this:
app.render = function(content){
    myEl.innerHTML = content;
    window.performance.mark('end_render');
    window.performance.measure('measure_render', 'start_xhr', 'end_render');
};
var req = new XMLHttpRequest();
req.open('GET', url, true);
req.onload = function(e) {
    window.performance.mark('end_xhr');
    window.performance.measure('measure_xhr', 'start_xhr', 'end_xhr');
    app.render(req.responseText);
};
window.performance.mark('start_xhr');
req.send();
There seems to be patchy support for window.performance.getEntries(), which will give you details of all resources loaded into a page along with their URLs. I use this API for jsonp (not XMLHttpRequest) requests in AzurePing.info for the browsers that support it, falling back to new Date().getTime() for those that don't.
At the time of writing, IE 10 and Chrome support getEntries, but Firefox does not. Unfortunately, not all the timing properties are set, even in Chrome and IE. All I could rely on were fetchStart, responseEnd, and duration.
Sample source is on GitHub.
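As a rough sketch of that fallback pattern (the matching of request URLs against resource entry names is illustrative; real code would need to normalize relative URLs):
function timeRequest(url, onDone) {
    var start = new Date().getTime(); // fallback timer that works everywhere
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, true);
    xhr.onload = function () {
        var duration = new Date().getTime() - start;
        if (window.performance && performance.getEntries) {
            // prefer the Resource Timing entry when the browser provides one
            var entries = performance.getEntries();
            for (var i = 0; i < entries.length; i++) {
                if (entries[i].name === url && entries[i].duration) {
                    duration = entries[i].duration;
                }
            }
        }
        onDone(duration);
    };
    xhr.send();
}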
The Navigation Timing API is in my opinion not really helpful when it comes to measuring single page application performance.
Along with the already mentioned User Timing API, the Resource Timing API is actually much more helpful. This API provides functionality to retrieve the timings for all requests made in a user session (actually all you see in the network tabs of the developer tools in most browsers). These timings include Round-Trip times as well as DNS-lookup times etc.
Unfortunately, this is a relatively new specification and is not yet implemented across all browsers. Chrome and IE > 10 provide implementations (although not yet complete). Surprisingly, IE seems to have implemented the most until now...
There are two ways to do it:
Resource Timing API
Wrapping XMLHttpRequest
Let's see the differences between them.
1. Resource Timing API
Browsers have recently added support for the Resource Timing API. The Resource Timing API basically holds timing information about each and every resource loaded by the app, whether CSS, JavaScript or AJAX requests. You can get the list of resource details with:
performance.getEntriesByType('resource');
It returns an array of objects in which you can find the AJAX requests by their initiatorType, which is equal to xmlhttprequest. But there are some limitations:
By default, the resource buffer size is 150, so the above array will hold at most 150 entries. If you want more, you can increase the buffer size with performance.setResourceTimingBufferSize(500).
You will not get information about whether the AJAX request succeeded or failed.
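A short sketch of filtering those entries down to AJAX requests (this only reports timings, not status codes, per the limitation above):
// grow the buffer first if you expect more than 150 entries
if (performance.setResourceTimingBufferSize) {
    performance.setResourceTimingBufferSize(500);
}
var ajaxEntries = performance.getEntriesByType('resource').filter(function (entry) {
    return entry.initiatorType === 'xmlhttprequest';
});
ajaxEntries.forEach(function (entry) {
    console.log(entry.name, Math.round(entry.duration) + 'ms');
});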
2. Wrapping XMLHttpRequest
If you can wrap your XMLHttpRequest API, you will get all the information you need: timing, status code and byte size. But you have to write a lot of code and, of course, test, test and test.
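A minimal sketch of such a wrapper (it monkey-patches the global XMLHttpRequest prototype, so test it carefully against the libraries you use; the reporting line is just a placeholder):
(function () {
    var origOpen = XMLHttpRequest.prototype.open;
    var origSend = XMLHttpRequest.prototype.send;
    XMLHttpRequest.prototype.open = function (method, url) {
        this._timedUrl = url; // remember what we're calling
        return origOpen.apply(this, arguments);
    };
    XMLHttpRequest.prototype.send = function () {
        var xhr = this;
        var start = Date.now();
        xhr.addEventListener('loadend', function () {
            var duration = Date.now() - start;
            // report however you like: console, image beacon, batched POST, ...
            console.log(xhr._timedUrl, xhr.status, duration + 'ms');
        });
        return origSend.apply(this, arguments);
    };
}());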
[Disclaimer] I work for atatus.com where we help you to measure page load time, AJAX timing and custom transaction. Also you can see session traces about how each and every resources perform.

"undefined" randomly appended in 1% of requested urls on my website since 12 june 2012

Since 12 June 2012 at 11:20 UTC, I have been seeing very weird errors in my varnish/apache logs.
Sometimes, when a user has requested one page, several seconds later I see a similar request, but with the whole string after the last / in the URL replaced by "undefined".
Example:
http://example.com/foo/bar triggers a http://example.com/foo/undefined request.
Of course these "undefined" pages do not exist and my 404 page is returned instead (which is a custom page with a standard layout, not a plain Apache 404).
This happens with any page (from the homepage to the deepest),
with various browsers (mostly Chrome 19, but also Firefox 3.5 to 12, IE 8/9...), but only for 1% of the traffic.
The headers sent by these requests are normal headers (and there are no AJAX headers).
For a given IP, this seems to occur randomly: sometimes on the first page visited, sometimes on a random page during the visit, sometimes on several pages during the visit...
Of course it looks like a JavaScript problem (I'm using jQuery 1.7.2 hosted by Google), but I have changed absolutely nothing in the JS/HTML or the server configuration for several days, and I never saw this kind of error before. And of course, there are no such links in the HTML.
I also noticed some interesting facts:
the undefined requests are never found as the referer of other pages; instead the "real" pages are used as the referer for the following requests from the same IP (the user has the ability to use the normal menu on the 404 page)
I did not see any trace of these pages in Google Analytics, so I assume no JavaScript was executed (the tracker exists on all pages, including the 404)
nobody has contacted us about this, even when I mentioned the problem on the website's social networks
most of the users continue their visit after that
All these facts make me think the problem occurs silently in the browsers, probably triggered by a buggy add-on, an antivirus, a browser toolbar or some crappy manufacturer software integrated into browsers and updated yesterday (but I didn't find any add-on released yesterday for Chrome, Firefox or IE).
Has anyone here noticed the same issue, or does anyone have a more complete explanation?
There is no simple straight answer.
You are going to have to debug this, and it is probably JavaScript, given the 'undefined' word in the URL. However it doesn't have to be AJAX; it could be JavaScript creating any URL that is automatically resolved by the browser (e.g. JavaScript that sets the src attribute on an image tag, sets a CSS background-image, etc.). I use Firefox with Firebug installed most of the time, so my directions will be with that in mind.
Firebug Initial Setup
Skip this if you already know how to use Firebug.
After the installs and restarting Firefox for Firebug, you are going to have to enable most of Firebug's 'panels'. To open Firebug there will be a little fire bug/insect looking thing in the top right corner of your browser or you can press F12. Click through the Firebug tabs 'Console', 'Script', 'Net' and enable them by opening them up and reading the panel's information. You might have to refresh the page to get them working properly.
Debugging User Interaction
Navigate to one of the pages that has the issue with Firebug open and the Net panel active. In the Net panel there will be a few options: 'Clear', 'Persist', 'All', 'Html', etc. Make sure ALL is selected. Don't do anything on the page and try not to mouse over anything on it. Look through the requests. The request for the invalid URL will be red and probably have a status of 404 Not Found (or similar).
See it on load? Skip to the next part.
Don't see it on initial load? Start using your page and continue here.
Start clicking on every feature, mouse over everything, etc. Keep your eyes on the Net panel and watch for requests that fail. You might have to be creative, but continue using your application till you see your browser make an invalid request. If the page makes many requests, feel free to hit the 'Clear' button on the top left of the Net panel to clear it up a bit.
If you submit the page and see a failed request go out really quick but then lose it because the next page loads, enable persistence by clicking 'Persist' in the top left of the Net panel.
Once it does, and it should, consider what you did to make that happen. See if you can make it happen again. After you figure out what user interaction is making it happen, dive into that code and start looking for things that are making invalid requests.
You can use the Script tab to set up breakpoints in your JavaScript and step through them. Investigate event handlers attached via $(element).bind/click/focus/etc. or via old-school event attributes like onclick=""/onfocus="" etc.
If the request is happening as soon as the page loads
This is going to be a little harder to peg down. You will need to go to the Script tab and start adding break points to every script that runs on load. You do this by clicking on the left side of the line of JavaScript.
Reload your page and your break points should stop the browser from loading the page. Press the 'Continue' button on the script panel. Go to your net panel and see if your request was made, continue till it is found. You can use this to narrow down where the request is being made from by slowly adding more and more break points and then stepping into and out of functions.
What you are looking for in your code
Something that is similar to the following:
var url = workingUrl + someObject['someProperty'];
var url = workingUrl + someObject.someProperty;
Keep in mind that someObject might be an object {}, an array [], or any of the internal browser types. The point is that a property will be accessed that doesn't exist.
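For example (hypothetical names, but this is the general shape of the bug):
var workingUrl = '/foo/';
var someObject = { page: 'bar' };            // note: no 'pageName' property
var url = workingUrl + someObject.pageName;  // "/foo/undefined"
new Image().src = url;                       // the browser happily requests /foo/undefined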
I don't see any 404/red requests
Then whatever is causing it isn't being triggered by your tests. Try using more things. The point is you should be able to make the request happen somehow. You just don't know yet. It has to show up in the Net panel. The only time it won't is when you aren't doing whatever triggers it.
Conclusion
There is no super easy way to peg down what exactly is going on. However using the methods I outlined you should be at least be able to get close. It is probably something you aren't even considering.
Based on this post, I reverse-engineered the "Complitly" Chrome plugin/malware, and found that this extension injects an "improved autocomplete" feature that throws "undefined" requests at every site that has an input text field with a NAME or ID of "search", "q" and many others.
I also found that the enable.js file (one of Complitly's files) checks a global variable called "suggestmeyes_loaded" to see if it's already loaded (like a singleton). So, setting this variable to true disables the plugin.
To disable the malware and stop "undefined" requests, apply this to every page with a search field on your site:
<script type="text/javascript">
window.suggestmeyes_loaded = true;
</script>
This malware also redirects your users to a "searchcompletion.com" site, sometimes showing competitors' ads. So, it should be taken seriously.
You have correctly established that the undefined relates to a JavaScript problem and if your site users haven't complained about seeing error pages, you could check the following.
If JavaScript is used to set or change image locations, it sometimes happens that an undefined makes its way into the URI.
When that happens, the browser will happily try to load the image (no AJAX headers), but it will leave hints: it sets a particular Accept: header; instead of text/html, text/xml, ... it will use image/jpeg, image/png, ....
Once such a header is confirmed, you have narrowed down the problem to images only. Finding the root cause will possibly take some time though :)
Update
To help debugging you could override $.fn.attr() and invoke the debugger when something is being assigned to undefined. Something like this:
(function($, undefined) {
    var $attr = $.fn.attr;
    $.fn.attr = function(attributeName, value) {
        // handle both attr('src', value) and attr({src: value}) forms
        var v = attributeName === 'src' ? value : attributeName.src;
        if (v === undefined || v === 'undefined') {
            alert("Setting src to undefined");
        }
        return $attr.call(this, attributeName, value);
    };
}(jQuery));
Some facts that have been established, especially in this thread: http://productforums.google.com/forum/#!msg/chrome/G1snYHaHSOc/p8RLCohxz2kJ
it happens on pages that have no javascript at all.
this proves that it is not an on-page programming error
the user is unaware of the issue and continues to browse quite happily.
it happens a few seconds after the person visits the page.
it doesn't happen to everybody.
happens on multiple browsers (Chrome, IE, Firefox, Mobile Safari, Opera)
happens on multiple operating systems (Linux, Android, NT)
happens on multiple web servers (IIS, Nginx, Apache)
I have one case of Googlebot following the link and claiming the same referrer. It may just be trying to be clever, or the browser communicated it to the mothership, which then sent out a bot to investigate.
I am fairly convinced by the proposal that it is caused by plugins. Complitly is one, but that one doesn't support Opera. There may be others.
The mobile browsers weigh against the plugin theory, though.
Sysadmins have reported a major drop-off after adding some JavaScript to the page to trick Complitly into thinking it is already initialized.
Here's my solution for nginx:
location ~ undefined/?$ {
return 204;
}
This returns "yeah okay, but no content for you".
If you are on website.com/some/page and you (somehow) navigate to website.com/some/page/undefined the browser will show the URL as changed but will not even do a page reload. The previous page will stay as it was in the window.
If for some reason this is something experienced by users then they will have a clean noop experience and it will not disturb whatever they were doing.
This sounds like a race condition where a variable is not getting properly initialized before getting used. Considering this is not an AJAX issue according to your comments, there will be a couple of ways of figuring this out, listed below.
Hook up a JavaScript exception logger: this will help you catch just about all random JavaScript exceptions in your log. Most of the time programmatic errors will bubble up here. Put it before any other scripts. You will need to catch these on the server and print them to your logs for analysis later. This is your first line of defense. Here is an example:
window.onerror = function(m, f, l) {
    var e = window.encodeURIComponent;
    // report the error back to the server via an image beacon
    new Image().src = "/jslog?msg=" + e(m) + "&filename=" + e(f) + "&line=" + e(l) + "&url=" + e(window.location.href);
};
Search for window.location: for each of these instances you should add logging or check for undefined concats/appenders to your window.location. For example:
function myCode(loc) {
// window.location.href = loc; // old
typeof loc === 'undefined' && window.onerror(...); //new
window.location.href = loc; //new
}
or the slightly cleaner:
window.setLocation = function(url) {
/undefined/.test(url) ?
window.onerror(...) : window.location.href = url;
}
function myCode(loc) {
//window.location.href = loc; //old
window.setLocation(loc); //new
}
If you are interested in getting stacktraces at this stage take a look at: https://github.com/eriwen/javascript-stacktrace
Grab all unhandled undefined links: besides window.location, the only things left are the DOM links themselves. The third step is to check all unhandled DOM links for your invalid URL pattern (you can attach this right after jQuery finishes loading; the earlier the better):
$("body").on("click", "a[href$='undefined']", function() {
window.onerror('Bad link: ' + $(this).html()); //alert home base
});
Hope this is helpful. Happy debugging.
I'm wondering if this might be an adblocker issue. When I search through the logs by IP address it appears that every request by a particular user to /folder/page.html is followed by a request to /folder/undefined
I don't know if this helps, but my website is replacing one particular *.webp image file with undefined after it's loaded in multiple browsers. Is your site hosting webp images?
I had a similar problem (but with /null 404 errors in the console) that #andrew-martinez's answer helped me to resolve.
Turns out that I was using img tags with an empty src field:
<img src="" alt="My image" data-src="/images/my-image.jpg">
My idea was to prevent the browser from loading the image at page load and to load it manually later by setting the src attribute from the data-src attribute with JavaScript (lazy loading). But when combined with iDangerous Swiper, that method caused the error.

Stop mobile network proxy from injecting JavaScript

I am using a mobile network based internet connection and the source code is being rewritten when they present the site to the end user.
On localhost my website looks fine, but when I browse the site from the remote server via the mobile network connection, the site looks bad.
Checking the source code, I found that a piece of JavaScript code is being injected into my pages which disables some of the CSS, making the site look bad.
I don't want image compression or bandwidth compression at the expense of my well-designed CSS.
How can I prevent or stop the mobile network provider (Vodafone in this case) from proxy injecting their JavaScript into my source code?
You can use this on your pages. It still compresses and puts everything inline, but it won't break scripts like jQuery because it will escape everything based on W3C standards:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
On your server you can set the cache control:
"Cache-Control: no-transform"
This will stop ALL modifications and present your site as it is!
Reference docs here
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.5
http://stuartroebuck.blogspot.com/2010/08/official-way-to-bypassing-data.html
You're certainly not the first (see "Web site exhibits JavaScript error on iPad / iPhone under 3G but not under WiFi"). Unfortunately many wireless ISPs have been using this crass and unwelcome approach to compression. It comes from Bytemobile.
What it does is have a proxy recompress all the images you fetch, making them smaller by default (and making image quality significantly worse). Then it crudely injects a script into your document that adds an option to load the proper image for each recompressed image. Unfortunately, since the script is a horribly-written 1990s-style JS, it craps all over your namespace, hijacks your event handlers and stands a high chance of messing up your own scripts.
I don't know of a way to stop the injection itself, short of using HTTPS. But what you could do is detect or sabotage the script. For example, if you add a script near the end of the document (between the 1.2.3.4 script inclusion and the inline script trigger) to neuter the onload hook it uses:
<script type="text/javascript">
bmi_SafeAddOnload= function() {};
</script>
then the script wouldn't run, so your events and DOM would be left alone. On the other hand the initial script would still have littered your namespace with junk, and any markup problems it causes will still be there. Also, the user will be stuck with the recompressed images, unable to get the originals.
You could try just letting the user know:
<script type="text/javascript">
if ('bmi_SafeAddOnload' in window) {
var el= document.createElement('div');
el.style.border= 'dashed red 2px';
el.appendChild(document.createTextNode(
'Warning. Your wireless ISP is using an image recompression system '+
'that will make pictures look worse and which may stop this site '+
'from working. There may be a way for you to disable this feature. '+
'Please see your internet provider account settings, or try '+
'using the HTTPS version of this site.'
));
document.body.insertBefore(el, document.body.firstChild);
}
</script>
I'm surprised no one has put this as an answer yet. The real solution is:
USE HTTPS!
This is the only way to stop ISPs (or anyone else) from inspecting all your traffic, snooping on your visitors, and modifying your website in flight.
With the advent of Let's Encrypt, getting a certificate is now free and easy. There's really no reason not to use HTTPS in this day and age.
You should also use a combination of redirects and HSTS to keep all of your users on HTTPS.
Your provider might have enabled a Bytemobile Unison feature called "clientless personalization". Try accessing the fixed URL http://1.2.3.50/ups/ - if it's configured, you will end up on a page which will offer to let you disable all the features you don't like, including JavaScript injection.
Good luck!
Alex.
If you're writing your own websites, adding a header worked for me:
PHP:
Header("Cache-Control: no-transform");
C#:
Response.Cache.SetNoTransforms();
VB.Net:
Response.Cache.SetNoTransforms()
Be sure to use it before any data has been sent to the browser.
I found a trick. Just add:
<!--<![-->
After:
<html>
More information (in German):
http://www.programmierer-forum.de/bmi-speedmanager-und-co-deaktivieren-als-webmaster-t292182.htm#3889392
The BMI JS isn't only on Vodafone. Virgin Media UK and T-Mobile UK also give you this extra feature, enabled by default and for free. ;-)
On T-Mobile it's called "Mobile Broadband Accelerator".
You can Visit:
http://accelerator.t-mobile.co.uk
or
http://1.2.3.50/
to configure it.
In case the above doesn't apply to you, or for some reason it's not an option,
you could potentially set up your own local proxy (Polipo, with or without Tor).
There is also a Firefox add-on called "BlockSite",
or, as a more drastic approach, reset TCP connections to 1.2.3.0/24:80 on your firewall.
But unfortunately that wouldn't fix the damage.
Funnily enough, T-Mobile and Virgin Media mobile/broadband support are not aware of this feature! (2011-10-11)
PHP: Header("Cache-Control: no-transform"); Thanks!
I'm glad I found this page.
That injector script was messing up my PHP page source code, making me think I had made an error in my PHP coding when viewing the page source. Even though the script was blocked with the Firefox NoScript add-on, it was still messing up my code.
Well, after that irritating dilemma, I wanted to get rid of it completely, and not just block it with the Adblock or NoScript Firefox add-ons or on my PHP pages alone.
To stop http://1.2.3.4 completely in Firefox, get the add-on: Modify Headers.
Go to the Modify Headers add-on options, then to the Headers tab.
Select Action: choose ADD.
For Header Name type in: cache-control
For Header Value type in: no-transform
For Comment type in: Block 1.2.3.4
Click Add... then click Start.
The 1.2.3.4 script will no longer be injected into any pages! Yeah!
I no longer see 1.2.3.4 being blocked by NoScript, because it's not there any more.
But I will still add: PHP: Header("Cache-Control: no-transform"); to my php pages.
If you are getting it on a site that you own or are developing, then you can simply override the function by setting it to null. This worked just fine for me.
bmi_SafeAddOnload = null;
As for getting it on other sites you visit, you could probably open the devtools console and enter that there to wipe it out if a page is taking a long time to load. I haven't tested that yet, though.
OK, nothing was working for me, so I now replace the image URLs every second, because when my DOM updates the problem appears again. Another solution is to only use background styles automatically included in the pages. Neither is clean.
setInterval(function(){ imageUpdate(); }, 1000);
function imageUpdate() {
console.log('######imageUpdate');
var image = document.querySelectorAll("img");
for (var num = 0; num < image.length; num++) {
if (stringBeginWith(image[num].src, "http://1.1.1.1/bmi/***yourfoldershere***")) {
var str=image[num].src;
var res=str.replace("http://1.1.1.1/bmi/***yourfoldershere***", "");
image[num].src = res;
console.log("replace"+str+" by "+res);
/*
other solution is to push img src in data-src and push after dom loading all your data-src in your img src
var data-str=image[num].data-src;
image[num].src = data-str;
*/
}
}
}
function stringEndsWith(string, suffix) {
return string.indexOf(suffix, string.length - suffix.length) !== -1;
}
function stringBeginWith(string, prefix) {
// true only when the string actually starts with the prefix
return string.lastIndexOf(prefix, 0) === 0;
}
An effective solution that I found was to edit your hosts file (/etc/hosts on Unix/Linux-type systems, C:\Windows\System32\drivers\etc\hosts on Windows) to have:
null 1.2.3.4
Which effectively maps all requests to 1.2.3.4 to null. Tested with my Crazy John's (owned by Vodafone) mobile broadband. If your provider uses a different IP address for the injected script, just change it to that IP.
Header("Cache-Control: no-transform");
Use the above PHP code in each of your PHP files and you will get rid of the 1.2.3.4 code injection.
That's all.
I too was suffering from the same problem; now it is fixed. Give it a try.
I added to /etc/hosts
1.2.3.4 localhost
Seems to have fixed it.

Full Ajax site, redirecting from "normal" URL to Ajax (fragment) URL automatically?

Ok, so all the rage these days is having a site like this:
mysite.com/
mysite.com/about
mysite.com/contact
But then if the user has Javascript enabled, then to have them browse those pages with Ajax:
mysite.com/#/
mysite.com/#/about
mysite.com/#/contact
That's all well and good. I have that all working perfectly well.
My question is, if the user arrives at "mysite.com/about", I want to automatically redirect them to "mysite.com/#/about" immediately if they have Javascript.
I have it working so that if they arrive at "mysite.com/about", that page will load fine on its own (no redirects) and then all clicks after that load via Ajax, but the pre-fragment URL doesn't change. E.g. if they arrive on "mysite.com/about" and then click "contact", the new URL will be "mysite.com/about#/contact". I really don't like that though; it's very ugly.
The only way I can think of to automatically redirect a user arriving at "mysite.com/about" to "mysite.com/#/about" is to have some javascript in the header that is ONLY run if the page is NOT being loaded via ajax. That code looks like this ($ = jQuery):
$(function(){
if( !location.hash || location.hash.substr(1,1) != '/' ) {
location.replace( location.protocol+'//'+location.hostname+'/#'+location.pathname+location.search );
}
});
That technically works, but it causes some very strange behavior. For example, normally when you "view source" for a page that has some ajax content, that ajax content will not be in the source because you're viewing the original page's source. Well, when I view source after redirecting like this, then the code I see is ONLY the code that was loaded via Ajax - I've never seen anything like that before. This happens in both Firefox 3.6 and Chrome 6.0. I haven't verified it with other browsers yet but the fact that two browsers using completely different engines exhibit the same behavior indicates I am doing something bad (e.g. not just a bug with FF or Chrome).
So somehow the browser thinks the page I'm on "is" the Ajax page. I can continue to browse around and it works fine, but if I e.g. close Firefox and re-open it (and it re-opens the pages I was on), it only reloads the Ajax fragment of the page, and not the whole wrapper, until I do a manual refresh. (Chrome doesn't do this though, only Firefox). I've never seen anything like that.
I've tried using setTimeout so it does not do the redirect until after the page has fully loaded, but the same thing happens. Basically, as far as I can tell, this only works if the fragment is put there as the result of a user action (click), and not automatically.
So my question is - what's the best way to automatically redirect a Javascript capable browser from a "normal" URL to an Ajax URL? Anyone have experience doing this? I know there are sites that do this - e.g., http://rdio.com (a music site). No weirdness happens there, but I can't figure out how they're doing it.
Thanks for any advice.
This behavior is like the new Twitter. If you type the URL:
http://twitter.com/dinizz
You will be redirected to:
http://twitter.com/#!/dinizz
I realize that this is done not with JavaScript but on the server side. I am looking for a solution to implement this using Ruby on Rails.
I also suggest you take a look at this article: Making AJAX Applications Crawlable.
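The redirect itself is tiny whatever the stack; as a purely illustrative sketch in Express-style JavaScript (the question asks for Rails, and the catch-all route pattern here is hypothetical):
var express = require('express');
var app = express();
// send anyone hitting a "clean" URL to the hash-bang equivalent;
// the app shell at / then reads the fragment and renders the page via Ajax
app.get('/:page', function (req, res) {
    res.redirect('/#!/' + req.params.page);
});
app.listen(3000);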

Is there a way to mitigate downloading of resources (images/css and js files) with Javascript?

I have a html page on my localhost - get_description.html.
The snippet below is part of the code:
<input type="text" id="url"/>
<button id="get_description_button">Get description</button>
<iframe id="description_container" src="#"/>
When the button is clicked, the src of the iframe is set to the URL entered in the textbox. The pages fetched this way are very big, with lots of linked files. What I am interested in on the page is a block of text contained in a <div id="description"> element.
Is there a way to mitigate downloading of resources linked in the page that loads into the iframe?
I don't want to use curl because the data is only available to logged-in users and the steps needed with curl to get the content are too complicated. The iframe is simple, as I use this on a box which sends the right cookies to identify the request as coming from a logged-in user, but the problem is that it is very wasteful to fetch nearly 1 MB of data to keep 1 KB of it and throw out the rest.
Edit
If the proposed method just works in Firefox it is fine, so I added Firefox tag. Also, it is possible that the answer actually is from the realm of Firefox add-on techniques, so I added that tag as well.
The problem is not that I cannot get at what I'm looking for, rather, the problem is the easy iframe method is wasteful.
I know that Firefox allows loading only the text of a page. If you open a page and press Ctrl+U you are taken to the 'view page source' window. There, links behave as normal and are clickable; if you click on a link in source view, the source of the new page is loaded into the view-source window without the linked resources being downloaded, which is exactly what I'm trying to get. But I don't know how to access this behaviour.
Another example is the Adblock add-on. It somehow kills elements before they get loaded. With plain JavaScript this is not possible, because it is only triggered too late to intervene in good time.
The Same Origin Policy forbids any web page from accessing the contents of any other web page in a different domain, so basically you cannot do that.
However, it seems that some browsers allow access to web page content if you are trying to access it from a local web page, which seems to be your case.
Safari, IE 6/7/8 are browser that allow a local web page to do so via XMLHttpRequest (source: Google Browser Security Handbook) so you may want to choose to use one of those browsers to do what you need (note that future versions of those browsers may not allow to do so anymore).
Apart from this solution I only see two possibilities:
If the web pages you need to fetch content from are somehow controlled by you, you can create a simpler interface to let other web pages get the content you need (for example by allowing JSONP requests).
If the web pages you need to fetch content from are not controlled by you, the only solution I see is to fetch the content server-side, logging in from the server directly (I know that you don't want to do so, but I don't see any other possibility if the previous ones are not practicable).
Hope it helps.
Actually I've seen Cross Domain jQuery .load request before, here: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The author claims that codes like these found on that page
$('#container').load('http://google.com'); // SERIOUSLY!
$.ajax({
url: 'http://news.bbc.co.uk',
type: 'GET',
success: function(res) {
var headline = $(res.responseText).find('a.tsh').text();
alert(headline);
}
});
// Works with $.get too!
would work. (The BBC code might not work because of the recent redesign, but you get the idea)
Apparently it is using YQL wrapped into a jQuery plugin to do the trick. Now I cannot say I fully understand what he is doing there but it appears to work, and fits the bill. Once you load the data I suppose it is a simple matter of filtering out the data that you need.
If you prefer something that works at the browser level, may I suggest Mozilla's Jetpack framework for lightweight extensions. I've not yet read the documentations in its entirety but it should contain the APIs needed for this to work.
There are various ways to go about this in AJAX, I'm going to show the jQuery way for brevity as one option, though you could do this in vanilla JavaScript as well.
Instead of an <iframe> you can just use a container, let's say a <div> like this:
<div id="description_container"></div>
Then to load it:
$(function() {
$("#get_description_button").click(function() {
$("#description_container").load($("input").val() + " #description");
});
});
This uses the .load() method, which takes a string in the format .load("url selector"), then takes that element from the fetched page and places its content inside the container you're loading into, in this case #description_container.
This is just the jQuery route, mainly to illustrate that yes, you can do what you want, but you don't have to do it exactly like this, just showing the concept is getting what you want from an AJAX request, rather than in an <iframe>.
Your description sounds like you are fetching pages from the same domain (you said that you need to be logged in and have session credentials), so have you tried using an async request via XMLHttpRequest? It might complain if the HTML on a page is particularly messed up, but you should still be able to get the raw text via .responseText and extract what you need with a regex.
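A rough sketch of that approach (the id "description" comes from the question; the regex assumes the div contains no nested divs, which may not hold for real markup, and it assumes the iframe has been replaced with a plain <div id="description_container"> as suggested in the earlier answer):
var url = document.getElementById('url').value;
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true); // same-origin, so the session cookies are sent automatically
xhr.onload = function () {
    // crude extraction: grab everything between the opening tag and the next </div>
    var match = xhr.responseText.match(/<div id="description"[^>]*>([\s\S]*?)<\/div>/);
    if (match) {
        document.getElementById('description_container').innerHTML = match[1];
    }
};
xhr.send();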
