I've created a test web server that I'm using to act as a 'web filter' of sorts. I'm trying to create an extension that uses the webRequest API to check every requested URL against my web server, which decides whether to allow or block it.
To do this, I'm making an AJAX call from within webRequest to my web server, and I'd like to use the response to determine whether to block or allow the specified URL. The problem is that the webRequest listener and AJAX calls are both async, so I can't reliably wait for a response from my server.
I also can't store all blocked/allowed URLs in localStorage, because there could potentially be hundreds of thousands. I've tried using jQuery's async: false property in its AJAX implementation, but that makes the browser almost completely unusable when hundreds of requests are happening at the same time. Does anyone have any ideas as to how I might work around this?
EDIT: I know similar questions to this have been asked before, but there haven't been any viable solutions to this problem that I've seen.
I see only two good choices:
make that site a web proxy
use the unlimitedStorage permission and store the URLs in a WebSQL database (it's also the fastest option). Despite the general concern that it may be deprecated in Chrome after the W3C stopped developing the specification in favor of IndexedDB, I don't think that will happen any time soon, because all the other available storage options are either [much] slower or less functional.
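A minimal sketch of what that could look like in the extension's background page (the table layout, column names, and helper functions here are just placeholders, and error handling is omitted):

var db = openDatabase('filter', '1.0', 'URL filter cache', 50 * 1024 * 1024);

// create the lookup table once
db.transaction(function (tx) {
  tx.executeSql('CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY, blocked INTEGER)');
});

// cache a decision received from the filtering server
function storeDecision(url, blocked) {
  db.transaction(function (tx) {
    tx.executeSql('INSERT OR REPLACE INTO urls (url, blocked) VALUES (?, ?)',
                  [url, blocked ? 1 : 0]);
  });
}

// look a URL up; the callback receives true, false, or undefined (unknown)
function lookup(url, callback) {
  db.readTransaction(function (tx) {
    tx.executeSql('SELECT blocked FROM urls WHERE url = ?', [url], function (tx, result) {
      callback(result.rows.length ? !!result.rows.item(0).blocked : undefined);
    });
  });
}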
I'm trying to wrap my head around how to really secure AJAX calls of any kind that are publicly available.
Let's say the JavaScript on a public page (so no user authentication of any kind) contains an AJAX call to a PHP script (REST API or just a script, it doesn't matter) that does a lot of heavy lifting. Any user can look at the source code, find the AJAX call, rebuild it, and execute it a million times a second to DDoS your site that way - not so great. At first I thought an HTTP_REFERER check could be helpful, but like any header field it can be manipulated (just use a curl request), so the security gain wouldn't be very high.
The next approach was a combination of session IDs, cookies, etc. to build some kind of access key for every page viewer, and when someone exceeds the limit the AJAX call would run into an error. Sounds great so far, but just by clearing the cookies, etc., everything gets reset. So no real solution either. But, of course! Use the IP! Great idea! Users on public networks that share a single IP for internet access will be totally happy when one miscreant blocks the service for all of them by abusing the call... not. So, also not a great solution.
So, I’m really stuck here and can’t think of any great answer for my problem.
I also thought about API keys or something similar. But that is information that is also extractable from the JavaScript source. So how do you prevent other servers from using your service in a proxy-like manner, serving your data to their users? (E.g. you implemented the GMaps API in your website (or any other API) and someone uses your script to access the API with your key.)
tl;dr
Is there any good way to really secure your publicly viewable AJAX calls against being abused to DDoS your site, present your data on other sites, etc.?
I think you're overthinking what AJAX is. When your site makes an AJAX request, server-side it's the same as any other page request (even if some scripts are more process-intensive). You need to protect your entire site, not just specific scripts. If your server does not have any DDoS protection, it can be attacked through any page. Look into services like CloudFlare.
As @Sage mentioned, it is similar to a normal HTTP request. You can use normal authentication, as the HTTP headers/cookie information is passed to the server every time you make an AJAX call. For a clear view, look at the developer console in your browser. It's the same as exposing your website's root URL. Just make sure you have authentication checks for AJAX calls too.
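As a rough illustration of that last point, here is how an AJAX endpoint could enforce the same session checks plus a naive per-session rate limit. This is only a sketch (the question mentions PHP; the idea is shown with Node/Express, and the route name and limit are made up):

const express = require('express');
const session = require('express-session');

const app = express();
app.use(session({ secret: 'change-me', resave: false, saveUninitialized: true }));

// the "heavy lifting" endpoint that the page calls via AJAX
app.get('/api/heavy', (req, res) => {
  // the endpoint sees the same cookies/headers as any page request,
  // so run your normal authentication check here, and keep a simple
  // per-session counter as a crude rate limit
  req.session.calls = (req.session.calls || 0) + 1;
  if (req.session.calls > 100) {
    return res.status(429).send('Too many requests');
  }
  // ... do the expensive work here ...
  res.json({ ok: true });
});

app.listen(3000);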
If I'm loading arbitrary external JavaScript code in a browser setting, is it possible to ensure it can't make the browser issue any AJAX calls or network requests?
Can you prevent any resource calls? - No. (haven't explored the 'extension' route though)
Since even an <img src='any valid url'> creates a resource request which your code cannot prevent.
Can you prevent ajax calls? - Yes, to an extent.
Assuming you want to ensure that third-party libraries can't make arbitrary cross-domain AJAX calls, simply make sure you don't enable CORS on your web server.
Your own application code can still make AJAX calls, since those stay within your own domain. You can additionally filter those calls on the server to check for specific properties like purpose, credentials, etc.
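Concretely, "not enabling CORS" just means your responses never carry an Access-Control-Allow-Origin header, so browsers refuse to hand the response of a cross-origin XMLHttpRequest/fetch back to scripts on other domains. A hedged sketch (assuming a Node/Express server; the route is arbitrary):

const express = require('express');
const app = express();

app.get('/api/data', (req, res) => {
  // note: no Access-Control-Allow-Origin header is set here, so a script on
  // another origin that calls fetch('https://your-site.example/api/data')
  // will have the response blocked by the browser
  res.json({ message: 'only readable by pages on this origin' });
});

app.listen(3000);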
It may be worth exploring Google Caja (haven't tried that myself).
I was wondering how GA collects data and sends it to their servers, and then I found this answer on SO. Now I'm wondering why GA uses this method rather than making an AJAX request - is it cheaper?
It's not cheaper per se; it is reliable. Unlike AJAX, you can include an image from any domain without running into cross-domain browser restrictions, which is why tracking pixels are used instead of AJAX requests.
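In other words, the tracker just sets the src of an Image object, and the browser happily fetches it cross-domain; every query-string parameter is data smuggled out in a GET request. A sketch (the endpoint and parameters are only illustrative, not GA's actual ones):

var img = new Image();
// the server answers with a 1x1 transparent GIF; the interesting part
// is the data carried in the query string
img.src = 'https://tracking.example.com/collect' +
          '?page=' + encodeURIComponent(location.pathname) +
          '&ref=' + encodeURIComponent(document.referrer);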
As Rob said, it's primarily to get around cross-domain restrictions that aren't handled in older browsers. However, GA has recently added support for the navigator.sendBeacon() method, which actually is cheaper, allows for retries on error, and doesn't have the problem of failing when the page is being unloaded (like when trying to send an event as a user clicks an outbound link). As browser support increases, this will likely become the default method for sending hits to GA.
Here's the documentation on how to use sendBeacon with analytics.js:
https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference#useBeacon
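For comparison, a minimal use of navigator.sendBeacon looks like this (the /collect URL, payload, and fallback are placeholders; analytics.js wires this up for you when you opt into the beacon transport):

function sendHit(data) {
  // sendBeacon queues the hit and the browser delivers it even while
  // the page is being unloaded (e.g. right after an outbound click)
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/collect', data);
  } else {
    // fall back to the tracking-pixel approach on older browsers
    new Image().src = '/collect?' + data;
  }
}

sendHit('event=outbound-click&url=' + encodeURIComponent('https://example.com'));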
I'm going to develop a framework for Comet programming, and I can't use WebSockets or Server-Sent Events (because browser support really sucks). So I need to keep the HTTP connection alive and send chunked data back to the client.
However, problems show themselves as you get into the work:
Using XMLHttpRequest is not possible, because IE doesn't give you xhr.responseText while xhr.readyState is 3.
A hidden iframe isn't useful either, because the browser shows the loading indicator while I'm sending data back to the client.
I tried sending a JavaScript file back to the client, streaming function-call commands one at a time, but browsers won't execute the JavaScript until it's completely loaded.
However, when I look at Lightstreamer demo page, I see that it sends a JavaScript file back to the client little by little and in each step, it sends a call to the function and that function simply gets executed (I can't do this part). It seems that Lightstreamer uses AJAX, since the request simply shows up in Firebug's console tab, but it works like a charm in IE too.
I tried to use every HTTP header field they've set on their request, and no result. I also tried to use HTTP Post instead of HTTP Get, but still got no result.
I've read over 20 articles on how to implement Comet, but none of them appear to solve the problems I have:
How to make it cross-browser?
How to get notified when new data is arrived from server (what event should I hook into)?
How to make my page appear as completely loaded to the user (how to implement it, so that browser doesn't show loading activity)?
Can anyone please help? I think there must be a very small tip or trick that I'm missing to glue all the concepts together. Does anyone know what Lightstreamer does to overcome these problems?
SockJS author here.
How to make it cross-browser?
This is hard; expect to spend a few months getting streaming transports working on Opera and IE.
How to get notified when new data is arrived from server (what event should I hook into)?
There are various techniques, depending on a particular browser. For a good intro take a look at different fallback protocols supported by Socket.IO and SockJS.
How to make my page appear as completely loaded to the user (how to implement it, so that browser doesn't show loading activity)?
Again, there are browser-specific tricks. One is to delay starting the AJAX connection until after the onload event. Another is to bind and unbind an iframe from the DOM, etc. If you're still interested, read the SockJS or Socket.IO code.
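A minimal sketch of the first trick (assuming a startStreaming() function of your own that opens the long-lived request):

// wait until the page has fully loaded, then defer one more tick so the
// browser has already stopped its loading indicator before we connect
window.onload = function () {
  setTimeout(startStreaming, 0);
};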
Can anyone please help? I think there should be a very little tip or trick that I don't know here to glue all the concepts together. Does anyone know what lightstreamer do to overcome these problems?
Basically, unless you have a very strong reason to, don't reinvent the wheel. Use SockJS, Socket.IO, Faye, or any of the dozens of projects that already solve this problem.
The technique you want is streaming.
How to make it cross-browser?
Across most browsers there is no single consistent way; you have to choose a proper transport according to the browser. Even worse, you have to rely on browser sniffing to recognize which browser is being used - feature detection is of no help here. You can use XDomainRequest for IE 8+, XMLHttpRequest for non-IE browsers, and an iframe for IE 6+. Avoid the iframe transport if possible.
How to get notified when new data is arrived from server (what event should I hook into)?
This varies according to the transport being used. For example, XDomainRequest fires the progress event, and XMLHttpRequest fires the readystatechange event when a chunk arrives, except in Opera and IE.
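A hedged sketch of the XMLHttpRequest case (the /stream URL and the handleChunk function are placeholders): responseText keeps growing while readyState is 3, so keep an offset and hand each new slice to your handler.

var xhr = new XMLHttpRequest();
var seen = 0; // how much of responseText has already been processed

xhr.onreadystatechange = function () {
  if (xhr.readyState === 3 || xhr.readyState === 4) {
    var chunk = xhr.responseText.substring(seen);
    seen = xhr.responseText.length;
    if (chunk) {
      handleChunk(chunk); // your own handler for newly arrived data
    }
  }
};

xhr.open('GET', '/stream', true);
xhr.send();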
How to make my page appear as completely loaded to the user (how to implement it, so that browser doesn't show loading activity)?
I'm not sure about this issue with the iframe transport, but it still occurs in WebKit-based browsers such as Chrome and Safari when using XMLHttpRequest. The only way to avoid it is to connect after the window's onload event, but in the case of Safari even that does not work.
There are some issues you have to consider besides the above questions.
Event-driven server - The server should be able to process requests asynchronously.
Transport requirements - The server has to behave differently depending on which transport is in use.
Stream format - If the server sends a big message or multiple messages in a single chunk, one chunk does not correspond to one message; it could be a fragment of a single message or a concatenation of several. To recognize message boundaries, the response has to be formatted (see the sketch after this list).
Error handling - The iframe transport does not provide any indication of disconnection.
...
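For the stream-format point above, one common (purely illustrative) choice is to delimit messages with newlines and buffer incomplete fragments on the client; handleChunk and onMessage are placeholder names:

var buffer = '';

// feed each newly received chunk in here (e.g. from a readystatechange handler)
function handleChunk(chunk) {
  buffer += chunk;
  var lines = buffer.split('\n');
  buffer = lines.pop(); // the last element may be an incomplete message
  for (var i = 0; i < lines.length; i++) {
    if (lines[i]) {
      onMessage(JSON.parse(lines[i])); // one complete, newline-terminated message
    }
  }
}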
Last but not least, implementing streaming is far more tiresome than it looks, unlike long polling. I recommend you use a solid framework for it, such as Socket.IO, SockJS, or jQuery Socket, which I've created and maintain.
Good luck.
but browsers won't execute JavaScript till it's completely loaded.
Have you tried sending back code wrapped in <script> tags? For example, instead of:
<script type="text/javascript">
f(...data1...);
f(...data2...);
try
<script type="text/javascript">f(...data1...);</script>
<script type="text/javascript">f(...data2...);</script>
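On the server side that means flushing each complete <script> element as its own chunk. A rough sketch with Node's http module (the port and payload are made up; f() stands for the global function the page defines, as in the example above):

const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  // send one self-contained <script> element per message; the browser
  // executes each one as soon as its closing tag has arrived
  let n = 0;
  const timer = setInterval(() => {
    res.write('<script type="text/javascript">f(' +
              JSON.stringify({ tick: ++n }) + ');</script>\n');
  }, 1000);
  req.on('close', () => clearInterval(timer));
}).listen(8080);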
The best option in your case would be to use JSONP + long polling on the server side. You just have to remember to reconnect any time the connection drops (times out) or you receive a response.
Example code in jQuery:
function myJSONP() {
  // $.getScript's second argument is a success callback, not an options
  // object, so use the returned promise's .done()/.fail() for both cases
  $.getScript(url)
    .done(function () {
      myJSONP(); // re-connect after a successful response
    })
    .fail(function () {
      myJSONP(); // re-connect after an error or timeout
    });
}
Obviously your response from the server has to be JavaScript code that calls one of your global functions.
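So the body the server sends back might look like this, assuming the page defines a global handleMessage function (both the function name and the payload are placeholders):

handleMessage({ "text": "hello", "time": 1394104654 });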
Alternatively, you can use a jQuery JSONP plugin.
Or take a look at this project: http://www.meteor.com/ (really cool, but I haven't tried it).
I use jQuery for AJAX. My question is simple - why turn off AJAX caching? At work and in every tutorial I read, they always say to set caching to false. What happens if you don't? Will the server "store" such requests and get "clogged up"? I can find no good answer anywhere - just links telling you how to set caching to false!
It's not that the server stores requests (though they may do some caching, especially higher volume sites, like SO does for anonymous users).
The issue is that the browser will store the response it gets if instructed to (or, in IE's case, even when it's not instructed to). Basically, you set cache: false if you don't want the user's browser to show stale data it fetched, say, X minutes ago.
If it helps, look at what cache: false does: it appends _=190237921749817243 as a query-string pair (not really a random number - the actual value is the current timestamp, so it's always... current). This forces the browser to request the data from the server again: it doesn't know what that query string means, it may be a different page, and since it can't know or be sure, it has to fetch again.
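Roughly, the effect is equivalent to doing this by hand (a sketch; /api/data is a placeholder URL):

var url = '/api/data';
// append a timestamp so the URL is different every time,
// which is what jQuery's cache: false does for you
var busted = url + (url.indexOf('?') === -1 ? '?' : '&') + '_=' + new Date().getTime();

$.getJSON(busted, function (data) {
  console.log(data);
});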
The server won't cache the requests, the browser will. Remember that browsers are built to display pages quickly, so they have a cache that maps URLs to the results last returned by those URLs. Ajax requests are URLs returning results, so they could also be cached.
But usually Ajax requests are meant to do something, so you don't ever want to skip them, even if they look like the same URL as a previous request.
If the browser cached Ajax requests, you'd have stale responses, and server actions being skipped.
If you don't turn it off, you'll have issues trying to figure out why your AJAX works but your functions aren't responding as you'd like them to. Forced re-validation at the header level is probably the best way to get a cache-free view of the data being AJAX'd in.
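For the header-level approach, a hedged sketch of what the server could send on its AJAX endpoints (shown with Node/Express; any server stack can emit the same headers):

const express = require('express');
const app = express();

app.get('/api/data', (req, res) => {
  // tell the browser (and any proxies) not to reuse a stored copy
  res.set('Cache-Control', 'no-cache, no-store, must-revalidate');
  res.set('Pragma', 'no-cache');
  res.set('Expires', '0');
  res.json({ now: Date.now() });
});

app.listen(3000);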
Here's a hypothetical scenario. Say you want the user to be able to click any word on your page and see a tooltip with the definition for that word. The definition is not going to change, so it's fine to cache it.
The main problem with caching requests in any kind of dynamic environment is that you'll get stale data back some of the time. And it can be unpredictable when you'll get a 'fresh' pull vs. a cached pull.
If you're pulling static content via AJAX, you could maybe leave caching on, but how sure are you that you'll never want to change that fetched content?
The problem is, as always, Internet Explorer. IE will usually cache the response to a GET request. So if you repeatedly fire the same AJAX request, IE will only make it once and always show the first result (even though subsequent requests could return different results).
The browser caches the information, not the server. The point of using Ajax is usually that you're going to be getting information that changes. If there's a part of a website you know isn't going to change, you don't bother fetching it more than once (in which case caching is fine); that's the beauty of Ajax. Since you should mostly be dealing with information that may change, you want the fresh information, and therefore you don't want the browser to cache.
For example, Gmail uses Ajax. If caching were simply left on, you wouldn't see your new e-mail for quite a while, which would be bad.