Using Selenium or JavaScript, how could you get the transferred size (bytes over the network) of the loaded page, including all the content: images, CSS, JS, etc.?
The preferred measure is what actually goes over the network: compressed, and counting only the requests that are made.
This is what you can usually see in dev tools, on the right of the network status bar.
If that's not possible, could one just get a total size of all the loaded resources (without compression, etc)? That would be an acceptable alternative.
The browser is Firefox, but if it could be done with some other Selenium compatible browser that would be acceptable also.
I guess this could be done using a proxy, but is there any JS or Selenium way to get such information?
If a proxy is the only way, which one would one use (or implement) to keep things simple for such a task? Just implement something in Java before setting up the driver?
(The solution should work at least on Linux, but preferably on Windows also. I'm using Selenium WebDriver via Java.)
For future reference, it is possible to request this information from the browser via JavaScript. However, at the time of writing no browser supports this feature for this specific data yet. More information can be found here.
In the meantime, for Chrome you can parse this information from the performance log.
//Enable performance logging
DesiredCapabilities capa = DesiredCapabilities.chrome();
LoggingPreferences logPrefs = new LoggingPreferences();
logPrefs.enable(LogType.PERFORMANCE, Level.ALL);
capa.setCapability(CapabilityType.LOGGING_PREFS, logPrefs);

//Start driver
WebDriver driver = new ChromeDriver(capa);
You can then get this data like this:
for (LogEntry entry : driver.manage().logs().get(LogType.PERFORMANCE)) {
    if (entry.getMessage().contains("Network.dataReceived")) {
        Matcher dataLengthMatcher = Pattern.compile("encodedDataLength\":(.*?),").matcher(entry.getMessage());
        dataLengthMatcher.find();
        //Do whatever you want with the data here, e.g. sum dataLengthMatcher.group(1).
    }
}
If, like in your case, you want to know the specifics of a single page load, you could use a pre- and postload timestamp and only get entries within that timeframe.
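For example, a minimal sketch of that filtering, assuming the same Java setup as above (LogEntry.getTimestamp() returns epoch milliseconds):

long preLoad = System.currentTimeMillis();
driver.get("http://example.com");
long postLoad = System.currentTimeMillis();

for (LogEntry entry : driver.manage().logs().get(LogType.PERFORMANCE)) {
    if (entry.getTimestamp() >= preLoad && entry.getTimestamp() <= postLoad
            && entry.getMessage().contains("Network.dataReceived")) {
        // parse encodedDataLength as above and add it to a running total
    }
}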
The performance API mentioned in Hakello's answer is now well supported (on everything except IE & Safari), and is simple to use:
return performance
.getEntriesByType("resource")
.map((x) => x.transferSize)
.reduce((a, b) => (a + b), 0);
You can run that script using executeScript to get the number of bytes downloaded since the last navigation event. No setup or configuration is required.
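With the Java bindings, for instance, that could look something like this sketch:

long transferred = (Long) ((JavascriptExecutor) driver).executeScript(
    "return performance.getEntriesByType('resource')" +
    ".map(function(x) { return x.transferSize; })" +
    ".reduce(function(a, b) { return a + b; }, 0);");
System.out.println("Bytes transferred since last navigation: " + transferred);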
Yes, you can do it using BrowserMob Proxy. This is a Java jar that uses a Selenium proxy to track network traffic from the client side,
such as page load duration and the query strings sent to different services.
You can get it at bmp.lightbody.net. The API creates .har files containing all this information in JSON format, which you can read using
the online tool http://www.softwareishard.com/har/viewer/
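A rough sketch of the wiring in Java (class and package names as in the BrowserMob Proxy 2.x docs; treat details like the port and HAR label as placeholders):

import java.io.File;
import net.lightbody.bmp.BrowserMobProxy;
import net.lightbody.bmp.BrowserMobProxyServer;
import net.lightbody.bmp.client.ClientUtil;
import net.lightbody.bmp.core.har.Har;
import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;

BrowserMobProxy proxy = new BrowserMobProxyServer();
proxy.start(0); // 0 = pick any free port

Proxy seleniumProxy = ClientUtil.createSeleniumProxy(proxy);
DesiredCapabilities caps = DesiredCapabilities.firefox();
caps.setCapability(CapabilityType.PROXY, seleniumProxy);
WebDriver driver = new FirefoxDriver(caps);

proxy.newHar("mypage");           // start recording
driver.get("http://example.com");
Har har = proxy.getHar();         // contains per-request sizes and timings
har.writeTo(new File("mypage.har"));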
I have achieved this in Python, which might save people some time. To setup the logging:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

logging_prefs = {'performance': 'INFO'}
caps = DesiredCapabilities.CHROME.copy()
caps['loggingPrefs'] = logging_prefs
driver = webdriver.Chrome(desired_capabilities=caps)
To calculate the total:
import re

total_bytes = []
for entry in driver.get_log('performance'):
    if "Network.dataReceived" in str(entry):
        r = re.search(r'encodedDataLength\":(.*?),', str(entry))
        total_bytes.append(int(r.group(1)))
mb = round((float(sum(total_bytes) / 1000) / 1000), 2)
I have an old html page that creates a script file and executes it using:
fsoObject = new ActiveXObject("Scripting.FileSystemObject")
wshObject = new ActiveXObject("WScript.Shell")
I am trying to modify it and make it usable also from other browsers. If you know the answer stop reading and please answer. If there is no quick answer, here is the description of my attempts. I was successful in doing the job, but only when the script is shorter than 2000 characters. I need help for scripts longer than 2000 characters.
The webpage is for internal use only, so it is easy for me to create a custom URL protocol on each computer that runs a VBScript file from a network drive.
I created my custom URL Protocol that starts a VBScript file like this:
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\MyUrlProtocol]
"URL Protocol"=""
@="Url:MyUrlProtocol"
"UseOriginalUrlEncoding"=dword:00000001

[HKEY_CLASSES_ROOT\MyUrlProtocol\DefaultIcon]
@="C:\\Windows\\System32\\WScript.exe"

[HKEY_CLASSES_ROOT\MyUrlProtocol\shell]
[HKEY_CLASSES_ROOT\MyUrlProtocol\shell\open]
[HKEY_CLASSES_ROOT\MyUrlProtocol\shell\open\command]
@="C:\\Windows\\System32\\WScript.exe \"X:\\MyUrlProtocol.vbs\" \"%1\""
In MyUrlProtocol.vbs I have this:
MsgBox "The length of the link is " & Len(WScript.Arguments(0)) & " characters"
MsgBox "The content of the link is: " & WScript.Arguments(0)
When I click on the "click me" link I see two messages, so everything works well (tested with Chrome and IE on Windows 7).
It also works when I execute document.getElementById("test").click().
I thought this could be the solution: pass the text of the script to the static VBS script, which would create the dynamic script and run it. But with this system I can't pass more than ~2000 characters.
So I tried to split the text of the script into chunks smaller than 2000 characters and simulate several clicks on the link, but only the first one works.
So I tried with xmlhttp.open("GET","MyUrlProtocol:test",false);, but Chrome says Cross origin requests are only supported for HTTP.
Is it possible to pass more than 2000 characters to a VBScript script via a custom URL protocol?
If not, is it possible to call several custom URL protocols in sequence?
If not, is there another way to create a script file and run it from Javascript?
EDIT 1
I found a solution, but in Chrome it only works when it feels like it, so I'm back to square one.
The code below in IE executes the script 4 times (correct), but in Chrome only the first execution runs.
If I change it to delay += 2000, then Chrome usually runs the script 2 times, but sometimes 1 and sometimes 3 or even 4 times.
If I change it to delay += 10000, then it usually runs the script 4 times, but sometimes misses one.
The function is always executed 4 times, both in Chrome and IE. What is weird is that the sr.click() sometimes does nothing and the function execution continues.
<HTML>
<HEAD>
<script>
var delay;

function runScript(text) {
    setTimeout(function() { runScript2(text); }, delay);
    delay += 100;
}

function runScript2(text) {
    var sr = document.getElementById('scriptRunner');
    sr.href = 'intelliclad:' + text;
    sr.click();
}

function test() {
    delay = 0;
    runScript("uno");
    runScript("due");
    runScript("tre");
    runScript("quattro");
}
</script>
</HEAD>
<BODY>
<input type="button" value="Run test" onclick="test()">
<a id="scriptRunner" href="#">scriptRunner</a>
</BODY>
</HTML>
EDIT 2
I tried Luke's suggestion of setting the next timeout from inside the callback, but nothing changed (IE always works, Chrome whenever it likes).
Here is the new code:
var scripts;
var delay = 2000;

function runScript() {
    var sr = document.getElementById('scriptRunner');
    sr.href = 'intelliclad:' + scripts.shift();
    sr.click();
    if (scripts.length)
        setTimeout(function() { runScript(); }, delay);
}

function test() {
    scripts = ["uno", "due", "tre", "quattro"];
    runScript();
}
Some background: The page asks for the shape of a panel, which can be just a few parameters [nfaces=1, shape1='square', width1=100] or hundreds of parameters for panels with many faces, many slots, many fasteners, etc. After asking for all the parameters a script for our internal 3D CAD (which can be larger than 20KB) is generated and the CAD is started and asked to execute the script.
I would like to do it all on the client side, because the page is served by a Domino web server, which can't even dream of managing such a complex script.
I didn't read your whole post... but I have an answer:
I too wish that custom URL protocols could handle long URLs. They simply do not. IE is even worse, as some OSs only accept 800 characters.
So, here's the solution:
For long URLs, only pass a single-use token. The vbscript uses the token and does a URL get to your web server to fetch all of the data.
This is the only way I've been able to successfully pass lots of data around. If you ever find a clearer solution, please remember to post it here.
Update:
Note that this is the best way I have found to deal with the url protocol limitations. I too wish this was not necessary. This does work and works well.
You mentioned Dominos, so possibly you need something in a POS environment... I created a web-based POS system, so we could face a lot of the same issues.
Suppose you want a custom url to print a pdf to the default printer without the annoying popup window. We need to do this thousands of times a day...
When building the web page, add a print button which, when pressed, calls the custom URL: myproto://printpdf?id=12345&token=onetimetoken
This will execute your vbscript on the local desktop.
In your vbscript, parse the arguments and react. In this case, the command is printpdf, the id is 12345, and you have a one-time token key.
Have the vbscript do an https get to: https://mydomain.com/APIs/printpdf.whatever?id=12345&key=onetimetoken
Check the credentials based on the IP address and token; if everything aligns, return the contents of the pdf (you may want to convert the pdf to a byte-array string).
Now the vbscript has the pdf; assemble it and write it to a temp folder, then execute a silent pdf print command (I use Sumatra PDF http://blog.kowalczyk.info/software/sumatrapdf/free-pdf-reader.html).
Mission accomplished.
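A hedged VBScript sketch of that fetch step (MSXML2.XMLHTTP is the standard scripting HTTP object; the URL and parameter names are assumptions taken from the workflow above):

url = WScript.Arguments(0)                ' e.g. myproto://printpdf?id=12345&token=...
token = Mid(url, InStr(url, "token=") + Len("token="))
Set http = CreateObject("MSXML2.XMLHTTP")
http.Open "GET", "https://mydomain.com/APIs/printpdf.whatever?id=12345&key=" & token, False
http.Send
payload = http.responseText              ' or http.responseBody for the raw bytes
' write payload to a temp file here, then invoke the silent print command on it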
Since I don't know what you want to do in your custom URL or your general workflow, I can only describe how I've solved the short-URL issue.
Using this technique, the possibilities are limitless. You have full control over the local computer running the web browser, and you have a one-time-use token which grants access to a web API which can return any sort of information you program.
You could write a custom url protocol to turn on the pizza oven if you wanted :)
If you are not able to create the server-side code that listens for the vbscript's get request, then this will not work.
You might be able to pass the data from the browser to the vbscript using the clipboard.
Update 2:
Since in this case the data is on the client (one single form can define hundreds of parameters), the server API doesn't know what to answer to the vbscript request. So the workflow described above must be preceded by these two steps:
The onkeypress event executes a submit to send the current parameters to the server.
The server replies with the refreshed form, adding to the body onload a call to a function which uses another submit to invoke the custom URL, as described in point 1 above.
Update 3:
stenci, what you've added (in Update 2) will work. I would do it like this:
user presses a button saying I'm done editing the form
ajax post the form to the server
the server saves the data and attaches unique key to the datastore
the server returns the key to ajax callback function
now the client has a single use key and invokes the url schema passing the key
vbscript does an https get to the server and passes the key
server returns the data to the vbscript
It is a bit long-winded, but once coded it will work like a charm (a rough sketch of the client side follows below).
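A hedged sketch of the client side of those steps (the element ids and the /saveForm endpoint are invented for illustration):

document.getElementById('doneEditing').onclick = function () {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/saveForm', true); // server stores the data and returns a single-use key
    xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
            var key = xhr.responseText;           // the single-use key
            location.href = 'intelliclad:' + key; // invoke the custom URL protocol with just the key
        }
    };
    xhr.send(new FormData(document.getElementById('panelForm')));
};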
The only other alternative I can see is to copy the form data to the clipboard using something like http://zeroclipboard.org/
and then see if you can read the clipboard from the vbscript (see "Use clipboard from VBScript").
How about creating an iFrame for each instance?
Something like this:
function runScript(text) {
    var iframe = document.createElement('iframe');
    iframe.src = 'intelliclad:' + text;
    document.body.appendChild(iframe);
}

function test() {
    runScript("uno");
    runScript("due");
    runScript("tre");
    runScript("quattro");
}
You can then use css styling to make these iframes transparent / hidden.
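For instance, a hedged variant of runScript that hides each frame and cleans it up after the protocol handler has had time to fire (the 10-second grace period is a guess):

function runScript(text) {
    var iframe = document.createElement('iframe');
    iframe.src = 'intelliclad:' + text;
    iframe.style.display = 'none'; // invisible; a CSS class works too
    document.body.appendChild(iframe);
    setTimeout(function () { document.body.removeChild(iframe); }, 10000);
}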
You might not like this answer, but I've used this method in the past and it works.
Instead of relying on ActiveX, consider using a Java Applet, and JNI.
Basically, you have to make sure the native scripts you want to run are available on your client machine, along with a JNI wrapper.
The applet will have to be at least self-signed for the browser to allow it to load and access a native library. Once the JNI libraries are loaded, you can easily call methods from the page / applet.
As a consequence of using Java, you could possibly use the same applet for windows as well as linux clients, provided of course you have native libraries present on the respective clients.
This series of articles talks about precisely your problem : http://www.javaworld.com/article/2076775/java-security/escape-the-sandbox--access-native-methods-from-an-applet.html
P.S. The article is really old, but the concept remains unchanged.
I have seen a number of questions that don't answer this: is it possible to check someone's bandwidth using JavaScript and load specific content based on it?
The BBC seems to give me low-quality images when I'm on my mobile in the middle of nowhere.
By the looks of it, this cool service does this, and it's a CDN, so it could be server-side:
http://www.resrc.it/docs/
Does anyone know how they do it, or how I could do it using ASP.NET or JavaScript, or a community open-source plugin?
I think it may be possible with https://github.com/yahoo/boomerang/ but I'm not sure that is its true purpose.
Basically you do it like this:
Start a timer
Load a fixed-size file, e.g. an image, through an ajax call
Stop the timer
Take some samples and compute the average bandwidth
Something like this could work:
//http://upload.wikimedia.org/wikipedia/commons/5/51/Google.png
//Size = 238 KB
function measureBW(cnt, cb) {
    var start = new Date().getTime();
    var bandwidth;
    var i = 0;
    (function rec() {
        var xmlHttp = new XMLHttpRequest();
        //cache-buster so every sample actually hits the network
        xmlHttp.open('GET', 'http://upload.wikimedia.org/wikipedia/commons/5/51/Google.png?t=' + new Date().getTime(), true);
        xmlHttp.onreadystatechange = function () {
            if (xmlHttp.readyState == 4) {
                var x = new Date().getTime() - start;
                var bw = Number(238 / (x / 1000)); //KB per second, based on the 238 KB image
                bandwidth = ((bandwidth || bw) + bw) / 2; //running average
                i++;
                if (i < cnt) {
                    start = new Date().getTime();
                    rec();
                }
                else cb(bandwidth.toFixed(0));
            }
        };
        xmlHttp.send(null);
    })();
}
measureBW(10, function (e) {
    console.log(e);
});
Note that var xmlHttp = new XMLHttpRequest(); won't work in all browsers; you should check the user agent and use the right object accordingly.
And of course it's just an estimated value.
Here's a JSBin example.
Start a timer.
Send an AJAX request to your server, requesting a file of known size.
When the AJAX request's done loading, stop the timer, and calculate the bandwidth from the passed time and file size.
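A minimal sketch of those three steps, assuming your server exposes a /test-file.bin of known size with caching disabled:

var FILE_SIZE_BYTES = 102400; // the known size of /test-file.bin (100 KB)
var start = Date.now();
var xhr = new XMLHttpRequest();
xhr.open('GET', '/test-file.bin?nocache=' + Date.now(), true); // cache-buster
xhr.responseType = 'arraybuffer';
xhr.onload = function () {
    var seconds = (Date.now() - start) / 1000;
    var mbps = (FILE_SIZE_BYTES * 8) / seconds / 1e6;
    console.log('~' + mbps.toFixed(2) + ' Mbps');
};
xhr.send();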
The problem with JavaScript is that users can disable it. (Which is more common on phones, which happen to be the devices better off with smaller images.)
I've knocked this up based on timing image downloads (ref: http://www.ehow.com/how_5804819_detect-connection-speed-javascript.html)
A word of warning though:
It says my speed is 1.81 Mbps,
but according to SpeedTest.Net my actual speeds are rather different.
The logic of timing the download seems right, but I'm not sure if it's accurate.
Well, like I said in my comments, you can choose between 2 approaches:
1) If you are in the context of a mobile app, you can query the technology used by the device directly, so you can tell the server up front what type (and size) of content you are able to render. I think PhoneGap can help you access some of the native mobile APIs from JavaScript.
2) The server-timer thing. You can "serve" some files yourself. Let's say you have a magic file in your landing page; as soon as the client requests that file, you grab the HTTP request with a custom handler. You "manually" serve the file by writing to the output stream, and you measure the bytes sent and the time it took to reach EOF; from that you can estimate the bandwidth. Combine this with the session cookie and you have this information per connected browser.
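A hedged sketch of that handler as a Java servlet (the path and payload size are invented; note this really times how fast the server can push bytes into the connection, which only approximates the client's bandwidth):

import java.io.IOException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/bandwidth-probe")
public class BandwidthProbeServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        byte[] payload = new byte[512 * 1024]; // 512 KB of filler bytes
        resp.setContentType("application/octet-stream");
        resp.setContentLength(payload.length);

        long start = System.nanoTime();
        resp.getOutputStream().write(payload);
        resp.getOutputStream().flush();
        long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1000000);

        // keep the rough estimate on the session (cookie) for later requests
        long kbPerSec = (payload.length / 1024L) * 1000 / elapsedMs;
        req.getSession().setAttribute("kbPerSec", kbPerSec);
    }
}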
While this isn't an answer, it may be important to note that measuring bandwidth isn't always reliable.
http://www.smashingmagazine.com/2013/01/09/bandwidth-media-queries-we-dont-need-em/
To paraphrase the above:
...the number of bits downloaded divided by the time it took to download them...is true when you download a large file over a single warmed-up TCP connection. That is rarely the case.
Typical page load scenario:
The initial HTML page is downloaded during the TCP slow-start phase, so measurement will significantly underestimate the available bandwidth
CSS and JavaScript external resources are loaded -- a collection of new TCP connections, all in their slow-start phase, and they are not all necessarily to the same destination server
Images are loaded -- multiple connections, each one downloading a resource. The problem is that these connections are not always in the same phase of their life cycle. Some might be in the slow-start phase; some may have suffered a packet loss and, thus, reduced their window and the bandwidth they are trying to fill; and some might be warmed-up TCP connections, ready to fill the bandwidth. These TCP connections are not necessarily all to the same destination server, and the bandwidth towards the various destination servers might be different between one another.
So, estimating bandwidth is possible, but it is far from simple, and it is possible only for certain phases of the page-loading process. And because having several TCP connections to various destination servers is common (for example, a CDN could host the image resources of a Web page), we cannot really tell what is the bandwidth we want to measure.
Since this is an older question, the alternative suggestion at the end of the article is to consider the more recent srcset attribute for responsive imagery, which lets the browser decide which asset to load based on whatever it knows (which should be more than us). It sounds like it's weighted more towards just determining resolution, but maybe it'll get smarter as support goes up.
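For example, a minimal srcset sketch (the file names are placeholders):

<img src="photo-small.jpg"
     srcset="photo-small.jpg 480w, photo-medium.jpg 960w, photo-large.jpg 1920w"
     sizes="100vw"
     alt="A responsive photo">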
I have released BwCh, an open-source JavaScript API to detect bandwidth for web-based environments.
It is built with ES2015. It uses some of the latest JavaScript innovations (window.navigator.connection, currently supported in Chrome 48+ for Android as of April 2016) in order to provide a flexible method to detect bandwidth for both mobile and desktop devices. It falls back to (or complements) image pre-loading to detect bandwidth where those newest APIs are not available.
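For the API-based path, something like this sketch (the property names have varied across revisions of the Network Information spec, so treat them as assumptions):

var connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;
if (connection && connection.downlink) {
    console.log('Estimated downlink: ' + connection.downlink + ' Mbps');
} else {
    // fall back to timing a fixed-size download, as in the answers above
}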
Since 12 June 2012 at 11:20 UTC, I have been seeing very weird errors in my varnish/apache logs.
Sometimes, when a user has requested one page, several seconds later I see a similar request, but with the whole string after the last / in the URL replaced by "undefined".
Example:
http://example.com/foo/bar triggers a http://example.com/foo/undefined request.
Of course these "undefined" pages do not exist, and my 404 page is returned instead (which is a custom page with a standard layout, not a classic apache 404).
This happens with any page (from the homepage to the deepest),
with various browsers (mostly Chrome 19, but also Firefox 3.5 to 12, IE 8/9...), but only for about 1% of the traffic.
The headers sent with these requests are ordinary (and there are no ajax headers).
For a given IP this seems to occur randomly: sometimes on the first page visited, sometimes on a random page during the visit, sometimes on several pages during the visit...
Of course it looks like a JavaScript problem (I'm using jQuery 1.7.2 hosted by Google), but I have changed absolutely nothing in the JS/HTML or the server configuration for several days, and I never saw this kind of error before. And of course there are no such links in the HTML.
I also noticed some interesting facts:
the undefined requests are never found as the referer of other pages; instead, the "real" pages appear as the referer for the following requests from the same IP (the user can still use the normal menu on the 404 page)
I did not see any trace of these pages in Google Analytics, so I assume no javascript has been executed (tracker exists on all pages including 404)
nobody has contacted us about this, even when I raised the problem on the website's social networks
most of the users continue their visit after that
All these facts make me think the problem occurs silently in the browser, probably triggered by a buggy add-on, antivirus, browser toolbar, or some crappy manufacturer software integrated into the browser and updated yesterday (but I didn't find any add-on released yesterday for Chrome, Firefox or IE).
Has anyone here noticed the same issue, or does anyone have a more complete explanation?
There is no simple straight answer.
You are going to have to debug this and it is probably JavaScript due to the 'undefined' word in the URL. However it doesn't have to be AJAX, it could be JavaScript creating any URL that is automatically resolved by the browser (e.g. JavaScript that sets the src attribute on an image tag, setting a css-image attribute, etc). I use Firefox with Firebug installed most of the time, so my directions will be with that in mind.
Firebug Initial Setup
Skip this if you already know how to use Firebug.
After installing Firebug and restarting Firefox, you are going to have to enable most of Firebug's 'panels'. To open Firebug, click the little bug/insect-looking icon in the top right corner of your browser, or press F12. Click through the Firebug tabs 'Console', 'Script', 'Net' and enable them by opening them up and reading each panel's information. You might have to refresh the page to get them working properly.
Debugging User Interaction
Navigate to one of the pages that has the issue with Firebug open and the Net panel active. In the Net panel there will be a few options: 'Clear', 'Persist', 'All', 'Html', etc. Make sure ALL is selected. Don't do anything on the page and try not to mouse over anything on it. Look through the requests. The request for the invalid URL will be red and probably have a status of 404 Not Found (or similar).
See it on load? Skip to the next part.
Don't see it on initial load? Start using your page and continue here.
Start clicking on every feature, mouse over everything, etc. Keep your eyes on the Net panel and watch for requests that fail. You might have to be creative, but continue using your application till you see your browser make an invalid request. If the page makes many requests, feel free to hit the 'Clear' button in the top left of the Net panel to clear it up a bit.
If you submit the page and see a failed request go out really quick but then lose it because the next page loads, enable persistence by clicking 'Persist' in the top left of the Net panel.
Once it does, and it should, consider what you did to make that happen. See if you can make it happen again. After you figure out what user interaction is making it happen, dive into that code and start looking for things that are making invalid requests.
You can use the Script tab to set up breakpoints in your JavaScript and step through them. Investigate event handlers attached via $(element).bind/click/focus/etc or from old-school event attributes like onclick=""/onfocus="" etc.
If the request is happening as soon as the page loads
This is going to be a little harder to peg down. You will need to go to the Script tab and start adding break points to every script that runs on load. You do this by clicking on the left side of the line of JavaScript.
Reload your page and your break points should stop the browser from loading the page. Press the 'Continue' button on the script panel. Go to your net panel and see if your request was made, continue till it is found. You can use this to narrow down where the request is being made from by slowly adding more and more break points and then stepping into and out of functions.
What you are looking for in your code
Something that is similar to the following:
var url = workingUrl + someObject['someProperty'];
var url = workingUrl + someObject.someProperty;
Keep in mind that someObject might be an object {}, an array [], or any of the internal browser types. The point is that a property will be accessed that doesn't exist.
I don't see any 404/red requests
Then whatever is causing it isn't being triggered by your tests. Try using more things. The point is you should be able to make the request happen somehow. You just don't know yet. It has to show up in the Net panel. The only time it won't is when you aren't doing whatever triggers it.
Conclusion
There is no super easy way to peg down what exactly is going on. However, using the methods I outlined you should at least be able to get close. It is probably something you aren't even considering.
Based on this post, I reverse-engineered the "Complitly" Chrome plugin/malware and found that this extension injects an "improved autocomplete" feature that was throwing "undefined" requests at every site that has an input text field with a NAME or ID of "search", "q" and many others.
I also found that the enable.js file (one of Complitly's files) checks a global variable called "suggestmeyes_loaded" to see if it's already loaded (like a singleton). So, setting this variable to true beforehand disables the plugin.
To disable the malware and stop "undefined" requests, apply this to every page with a search field on your site:
<script type="text/javascript">
window.suggestmeyes_loaded = true;
</script>
This malware also redirects your users to a "searchcompletion.com" site, sometimes showing competitors' ads. So, it should be taken seriously.
You have correctly established that the undefined relates to a JavaScript problem and if your site users haven't complained about seeing error pages, you could check the following.
If JavaScript is used to set or change image locations, it sometimes happens that an undefined makes its way into the URI.
When that happens, the browser will happily try to load the image (no AJAX headers), but it will leave hints: it sets a particular Accept: header; instead of text/html, text/xml, ... it will use image/jpeg, image/png, ....
Once such a header is confirmed, you have narrowed down the problem to images only. Finding the root cause will possibly take some time though :)
Update
To help debugging you could override $.fn.attr() and invoke the debugger when something is being assigned to undefined. Something like this:
(function($, undefined) {
    var $attr = $.fn.attr;
    $.fn.attr = function(attributeName, value) {
        var v = attributeName === 'src' ? value : attributeName.src;
        if (v === 'undefined') {
            alert("Setting src to undefined");
        }
        return $attr.apply(this, arguments); // preserve `this` so the original attr still works
    };
}(jQuery));
Some facts that have been established, especially in this thread: http://productforums.google.com/forum/#!msg/chrome/G1snYHaHSOc/p8RLCohxz2kJ
it happens on pages that have no javascript at all.
this proves that it is not an on-page programming error
the user is unaware of the issue and continues to browse quite happily.
it happens a few seconds after the person visits the page.
it doesn't happen to everybody.
happens on multiple browsers (Chrome, IE, Firefox, Mobile Safari, Opera)
happens on multiple operating systems (Linux, Android, NT)
happens on multiple web servers (IIS, Nginx, Apache)
I have one case of googlebot following the link and claiming the same referrer. They may just be trying to be clever and the browser communicated it to the mothership who then set out a bot to investigate.
I am fairly convinced by the proposal that it is caused by plugins. Complitly is one, but that doesn't support Opera. There may be others.
Though the mobile browsers weigh against the plugin theory.
Sysadmins have reported a major drop-off after adding some javascript to the page to trick Complitly into thinking it is already initialized.
Here's my solution for nginx:
location ~ undefined/?$ {
return 204;
}
This returns "yeah okay, but no content for you".
If you are on website.com/some/page and you (somehow) navigate to website.com/some/page/undefined the browser will show the URL as changed but will not even do a page reload. The previous page will stay as it was in the window.
If for some reason this is something experienced by users then they will have a clean noop experience and it will not disturb whatever they were doing.
This sounds like a race condition where a variable is not getting properly initialized before it is used. Since this is not an AJAX issue according to your comments, there are a couple of ways of figuring this out, listed below.
Hook up a JavaScript exception logger: this will help you catch just about all random JavaScript exceptions in your log. Most of the time programmatic errors will bubble up here. Put it before any other scripts. You will need to catch these on the server and print them to your logs for analysis later. This is your first line of defense. Here is an example:
window.onerror = function(m, f, l) {
    var e = window.encodeURIComponent;
    new Image().src = "/jslog?msg=" + e(m) + "&filename=" + e(f) + "&line=" + e(l) + "&url=" + e(window.location.href);
};
Search for window.location: for each of these instances you should add logging or check for undefined concats/appenders to your window.location. For example:
function myCode(loc) {
    // window.location.href = loc; // old
    typeof loc === 'undefined' && window.onerror(...); // new
    window.location.href = loc; // new
}
or the slightly cleaner:
window.setLocation = function(url) {
    /undefined/.test(url) ?
        window.onerror(...) : window.location.href = url;
}

function myCode(loc) {
    // window.location.href = loc; // old
    window.setLocation(loc); // new
}
If you are interested in getting stacktraces at this stage take a look at: https://github.com/eriwen/javascript-stacktrace
Grab all unhandled undefined links: besides window.location, the only thing left is the DOM links themselves. The third step is to check all unhandled DOM links for your invalid URL pattern (you can attach this right after jQuery finishes loading; the earlier the better):
$("body").on("click", "a[href$='undefined']", function() {
window.onerror('Bad link: ' + $(this).html()); //alert home base
});
Hope this is helpful. Happy debugging.
I'm wondering if this might be an adblocker issue. When I search through the logs by IP address it appears that every request by a particular user to /folder/page.html is followed by a request to /folder/undefined
I don't know if this helps, but my website is replacing one particular *.webp image file with undefined after it's loaded in multiple browsers. Is your site hosting webp images?
I had a similar problem (but with /null 404 errors in the console) that @andrew-martinez's answer helped me to resolve.
Turns out that I was using img tags with an empty src field:
<img src="" alt="My image" data-src="/images/my-image.jpg">
My idea was to prevent the browser from loading the image at page load and to load it manually later by setting the src attribute from the data-src attribute with JavaScript (lazy loading). But when combined with iDangerous Swiper, that method caused the error.
I'm working on a solution to speed our website up. I'm having the client first ajax load the expected next page of the application:
$.ajax({url: '/some/real/path', ...});
The server responds to this and includes in the header:
Cache-Control => 'max-age=20'
which marks the response as being cachable.
The clientside application then waits to see if its prediction was correct, and upon finding that it was, transitions the browser to that same page, but adds a few bits of information into the URL as a # fragment, where this info is available to us only when the user has actually committed their action (i.e. not predictable):
location.href = '/some/real/path#additionalInfoInFragement';
When the browser transitions to the page, the additional info in the fragment is picked up by that page's JavaScript and used to achieve some effect there.
For all browsers, including Safari, the response to the initial ajax request IS properly inserted into the browser cache.
And then, for all browsers except Safari, the browser pulls that content out of the cache when we effect the location.href transition to that page. This avoids the server hit and is the basis for our speed-up.
Safari though is not using the cache to re-serve the content. It seems to get tripped up by the '#additionalInfoInFragment' part of the transition. It is including the fragment in its construction of the cache key it uses to check for existing cached content. Here are the entries from Safari's cache.db file, which I dumped via sqlite:
* ajax request: INSERT INTO "cfurl_cache_response" VALUES(3260,0,-1982644086,0,'http://localhost:8080/TomcatScratchPad/EmptyPage','2012-05-14 07:01:10');
* location.href transition: INSERT INTO "cfurl_cache_response" VALUES(3276,0,-230554366,0,'http://localhost:8080/TomcatScratchPad/EmptyPage#wtf','2012-05-14 07:01:20');
Also notable is the fact that Chrome is behaving correctly, even though both share a tremendous amount of WebKit code.
I would really appreciate any ideas the community has. Thanks!
I see only a couple of options:
File a bug report with Apple and don't worry about it. :-) Your caching stuff will still work for other browsers. Overall, Safari has a very small market share, although of course if your site is targeted at (say) iPad or iPhone users, that rather changes the nature of the stats for your specific site. :-) (You presumably know from your logs how big your Safari audience is.)
Sub-category: If Safari is a big part of your target market and this really bothers you, see if it's a bug in any of the open source parts of it and, if so, offer a patch.
Don't use the fragment identifier to pass the information, use something else (a cookie perhaps) instead.
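For the cookie route, a minimal sketch (the cookie name is invented; the point is that the URL, and hence Safari's cache key, stays identical to the one the ajax request warmed up):

// Before the transition: stash the extra info in a short-lived cookie.
document.cookie = 'additionalInfo=' + encodeURIComponent(extraInfo) + '; path=/; max-age=60';
location.href = '/some/real/path'; // no fragment, so the cached copy is reused

// On the target page: read the cookie, then clear it.
var match = document.cookie.match(/(?:^|;\s*)additionalInfo=([^;]*)/);
var info = match ? decodeURIComponent(match[1]) : null;
document.cookie = 'additionalInfo=; path=/; max-age=0';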