How many bots have JS "enabled"? - javascript

We're talking your average everyday spamming bots -- those which we try to protect against using captcha.
How many of them are capable of running JS in some kind of embedded-browser?
If it's a very tiny amount, then how on earth can solutions like this be useful: http://wcaptcha.wozia.pt/sample.php
Apart from the obvious usability/accessibility issues, these drag-n-drop solutions require the client to have JS. There's not even a fallback. So, assuming it is intended to protect against bots (non-humans) isn't it entirely redundant, or at least redundant to the extent of how many bots would be technically capable of attempting such a thing?
If the client has JS (which is a pre-requisite for this solution to work) then isn't it safe (within reasonable measure) to assume the client is not a bot?

It isn't that redundant. If you just detect JavaScript, people can still boot up instances of Selenium and pretend to comment. The spam bots doing that now are in the minority, but as the spam wars evolve, you can bet spam bots will move on to other methods such as using a browser. If you detect JavaScript AND make them drag and drop something, it's much stronger evidence that you're a human.
But I think this implementation is just not practical because there is still a % of people that have JS off for whatever reason. I hear this % is 2 or 3%, which is still a good amount when you're talking about hundreds of thousands of visitors.
An alternative is to have a noscript option that asks the user to activate Javascript if he/she wants to comment on the blog.

Yes, very few spambots will have JavaScript enabled.
Spam is a percentages game. Only a very small percentage of spam messages will trigger any revenue for the spammer. If you can increase the cost of spam, you make it economically infeasible. Spamming in a JavaScript-enabled browser is way more expensive than spamming on the command line, so you can send out more spam at a time if you stick to curl.
Yes, it is redundant.
Rather than making users do this pointless task, you might as well automatically perform a javascript check. It could be as simple as a script that grabs the domain name of the site and inserts it into each form as a hidden field. This will stop all drive-by spammers. If your site is high-profile enough to attract custom spammers, this solution won't be enough anyway.
For those without JavaScript, just show them a regular old image CAPTCHA after their post fails.
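A minimal sketch of the automatic check described above. The field name `js_check` and both function names are made up for illustration; the server-side half is shown as a plain function and would have to be wired into whatever backend handles the form.

```javascript
// Hypothetical field name; the server must look for the same one.
const CHECK_FIELD = "js_check";

// Client side: add a hidden field holding the site's host name to every form.
// A bot that posts the form without running scripts never sends this field.
function addJsCheck(doc, hostname) {
  for (const form of doc.querySelectorAll("form")) {
    const input = doc.createElement("input");
    input.type = "hidden";
    input.name = CHECK_FIELD;
    input.value = hostname;
    form.appendChild(input);
  }
}

// Server side: accept the post only if the field carries the expected host.
function passedJsCheck(postedFields, expectedHost) {
  return postedFields[CHECK_FIELD] === expectedHost;
}

// Run in the browser only; guarded so the file also loads outside a browser.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () =>
    addJsCheck(document, location.hostname));
}
```

When `passedJsCheck` returns false, the post is the one that "fails" and gets the image CAPTCHA fallback.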

A bigger issue is usability IMHO. Captcha is always going to decrease conversion rates, and often significantly. If your goal is to use JS as a means of deterring bots, I can tell you that it has significantly reduced bot traffic for me by more than 90%.
Just incorporate a hidden field that gets populated by JS. If it isn't filled in, they're either a bot, or one of those idiots with JS turned off, who you don't really want to cater to anyway.
Also incorporate a hidden field that is visible in the DOM. Make it fly off the screen with CSS like "position:absolute; left:9999px; top: -9999px". Don't use "display:none;" If this field is filled in, they're a bot.
I cut down our spam more than 90% with this, so you should use it over Captcha types, unless you're a big business. If you're a big business, your only real solution is a back-end server side solution. Good luck finding that on StackOverflow. They'll close your comment quicker than people can answer it. (and it will have better Google rank than anything out there)
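The two fields from this answer could be combined roughly as follows. This is only a sketch: the field names `website` and `js_token`, the token value, and the function name are assumptions, and the check is meant to run wherever the form post is processed.

```javascript
// Markup for the two extra fields (names are hypothetical).
// "website" is the trap: moved off-screen with the CSS from the answer,
// so people never see it, but form-filling bots usually do.
// "js_token" starts empty and is filled in by script.
const EXTRA_FIELDS_HTML = `
  <input type="text" name="website" tabindex="-1" autocomplete="off"
         style="position:absolute; left:9999px; top:-9999px">
  <input type="hidden" name="js_token" value="">`;

const JS_TOKEN_VALUE = "filled-by-script"; // assumed marker value

// Server side: flag the post if the trap was filled or the token is missing.
function isProbablyBot(fields) {
  const trapFilled = (fields.website || "").trim() !== "";
  const tokenMissing = fields.js_token !== JS_TOKEN_VALUE;
  return trapFilled || tokenMissing;
}

// Client side: fill in the token once the page has loaded.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    const token = document.querySelector('input[name="js_token"]');
    if (token) token.value = JS_TOKEN_VALUE;
  });
}
```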

Related

Allow subscriber list registration but prevent scripts

Users on my site have a publicly-visible profile where they accept subscriptions via a simple HTML form. These subscriptions are merged into this user's email list.
Someone could write a script that registers emails constantly to destroy/flood a user's list. This could be mitigated by using IP-based rate-limiting, but this solution does not work if the script runs in a distributed environment.
The only strategy I can think of is using a CAPTCHA, but I'd really like to avoid doing this. What else can I try?
Your question essentially boils down to "How can I tell humans and computers apart without using a CAPTCHA?"
This is indeed quite a complex question with a lot of different answers and approaches. In the following I'll try to name a few. Some of the ideas were taken from this article (German).
Personally I think some kind of CAPTCHA would be a perfect solution. It doesn't necessarily have to be warped text in an image; you could also use logic puzzles or simple calculations. But with the following methods you could try to avoid CAPTCHAs; keep in mind that these methods will always be easier to bypass than CAPTCHAs which require user interaction.
1. Use a hidden field as a honeypot in your form (either type=hidden or use CSS). If this field is filled out (or has another value than you'd expect), you have detected a bot (spam bots usually don't perform semantic analyses, so they fill out everything they find). However this won't work correctly if the bot is specifically targeted at you or simply learns the name of the field and avoids it.
2. Use JavaScript to check how fast the form is submitted. Of course humans need some time (at least a few seconds) to fill in a form whereas bots are a lot faster. You should also check if the form is submitted more than once in a short time. This could be done via JavaScript if you use AJAX forms and/or server-side. The drawback is (as you mentioned yourself), it won't work in distributed systems.
3. Use JavaScript to detect focus events, clicks or other mouse events that indicate you're dealing with a human. This method is described in this blog article (including some source code examples).
4. Check if the user works with a standard web browser; spammers sometimes use self-written programs. You could check the user agent string, but this can be manipulated easily. Feature detection would be another possibility.
Of course methods 2-4 won't work if a user has JavaScript disabled. In this case you could display a regular CAPTCHA in <noscript> tags for example. In any case you should always combine several methods to get an effective and user friendly test.
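The timing and repeat-submission checks could look roughly like the sketch below. It does the comparison on the server rather than in client-side JavaScript, on the assumption that the server recorded when it rendered the form (for example in the session); the thresholds, the `clientKey` idea and the function name are all illustrative.

```javascript
const MIN_FILL_MS = 3000;        // assumed: humans need at least a few seconds
const REPEAT_WINDOW_MS = 60000;  // assumed meaning of "a short time"

// Maps a client key (e.g. a session id) to the time of its last submission.
const lastSeen = new Map();

// Returns a list of reasons the submission looks automated (empty = looks fine).
function checkSubmission(clientKey, renderedAtMs, nowMs) {
  const reasons = [];
  if (nowMs - renderedAtMs < MIN_FILL_MS) {
    reasons.push("too fast");
  }
  const previous = lastSeen.get(clientKey);
  if (previous !== undefined && nowMs - previous < REPEAT_WINDOW_MS) {
    reasons.push("repeat");
  }
  lastSeen.set(clientKey, nowMs);
  return reasons;
}
```

As the answer notes, keying on a single client does nothing against a distributed script, so this is one signal to combine with the others rather than a complete defence.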
What finally comes to my mind (in your specific case) is checking the validity of the email addresses entered (not only syntactically but also check if the addresses really exist). This can be done in several ways (see this question on SO) - none of them is really reliable, though. So, again, you will have to combine different methods in order to reliably tell humans and bots apart.
Assuming that whoever starts spamming your website specifically targets your website (not a random spam-bot) and will actively try to work around all countermeasures, then the only option is some kind of captcha, as anything else can be automatically avoided.
All non-captcha methods of preventing fake/spam submissions work either by exploiting flaws in the script doing the automated submissions or by analyzing the content submitted. With this type of submission, content analysis isn't really an option. So what is left is the wide variety of automated submission prevention used in fighting, for example, spam comments:
CSS based solutions ( such as this one: http://wordpress.org/extend/plugins/spam-honeypot/ )
JS based solutions: a hidden field is filled with data computed by JavaScript - if the content is submitted by something as simple as a spam script that doesn't support JavaScript, it's easily detectable
It's possible to work around those two if the attacker knows they are there - for example when your website is a selected, not random, target.
To summarize: there are plenty of solutions that will quite successfully stop random spam submissions, but if someone is specifically targeting your website the only real thing that will work is something that computers are bad at - CAPTCHA.

Computationally difficult [JavaScript] problem?

I need a problem that is computationally difficult (in any language), that I can easily implement in JavaScript. I'm trying to do a CAPTCHA-like test to make it unlikely that a hacker is accessing my page mechanically.
Yes, I know that he could use Rhino or some other JS engine and do it -- that's why I want it to be computationally expensive, so it takes him a few hours to set up and his machine a few seconds to fake each access.
I'm thinking of getting a bunch of large primes on the back end, sending over the product of two of them, and demanding that the web page factor it, but if anybody has a better idea, I'm all ears. Also, does anybody have a good library for doing that factoring thing?
You can use the same method as Bitcoin, i.e. partially reversing a secure hash: finding, by brute force, an input whose hash meets a given target.
Explained here:
http://www.tomshardware.com/reviews/bitcoin-mining-make-money,3514-3.html
Bitcoin source
https://github.com/bitcoin/bitcoin
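A small hashcash-style sketch of that idea, not Bitcoin's actual scheme: the server hands out a random challenge and the client must find a nonce so that the SHA-256 hash of challenge plus nonce starts with a given number of zero hex digits. It uses Node's built-in `crypto` module for both halves to stay self-contained; in a browser the solving loop would have to use the asynchronous `crypto.subtle.digest` instead. All function names and the `challenge:nonce` format are assumptions.

```javascript
const { createHash, randomBytes } = require("crypto");

function sha256Hex(text) {
  return createHash("sha256").update(text).digest("hex");
}

// Server: issue a fresh random challenge for each page view.
function makeChallenge() {
  return randomBytes(16).toString("hex");
}

// Client: brute-force a nonce. Each extra zero digit in `difficulty`
// multiplies the expected number of hashes by 16.
function solve(challenge, difficulty) {
  const prefix = "0".repeat(difficulty);
  for (let nonce = 0; ; nonce++) {
    if (sha256Hex(challenge + ":" + nonce).startsWith(prefix)) {
      return nonce;
    }
  }
}

// Server: checking a submitted nonce costs a single hash.
function verify(challenge, nonce, difficulty) {
  return sha256Hex(challenge + ":" + nonce).startsWith("0".repeat(difficulty));
}
```

The server would also need to remember which challenges it issued and reject reused ones, otherwise one solved challenge can be replayed indefinitely.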
You can implement a standard captcha and do some more checking on the client side. For example, add an event listener on the captcha text input that listens for keydown/keyup events, XOR the key codes, and send the result along with the captcha. Add a hidden text input to the form named email, or something else you find on every form; robots fill those in automatically, so if you get a value for post['email'], it's a robot, because a user won't see that field. You can also have a piece of code in a totally unrelated JavaScript file that automatically adds a field to the form that is required for validation. So, captcha or no captcha, you can still enhance the robot protection on the client side without computationally difficult processes.
The problem with this is that if it is known to be NP-Hard, it's going to be a pain in the rear for human beings to solve, as well, on non-trivial instances. Visual/auditory captchas are kind of cool in that they give people a leg up... we have very sophisticated sensory organs for processing these kinds of things, and computers are not too good at it (though they are getting better all the time!).
As such, you're probably better off coming up with a unique thing that people can do very easily, but that machines are not too good at. For instance, give some simple black and white pictures and ask the user which one doesn't belong, or show some pictures of foods and ask what kind of recipe you could make with them.
Clever approach. Whenever one-way complexity is needed it makes me think of a hash. Simply hash some aspect of their user account (not anything sensitive) and send the hash to the client. You would want to truncate/pad the string to get your desired complexity level. This isn't to secure an account so md5 or any other hashing algorithm would be fine.
Here is some sample code that you might be able to leverage for the client side.

Best practice for "hidden" JavaScript HTTP request?

I'm not exactly sure how to formulate the question, but I think it's more of a suggestions request, instead of a question per se.
We are building an HTML5 service on which users get credited (rewarded, in social gaming lingo) for completing a series of offers. Most of these offers involve watching video ads. We already have an implementation of this built in Flash, but for HTML5 I'm running into more issues with how to make the request calls that validate legitimately watched video ads. In the Flash interface, the SWF makes a series of HTTP requests: some when video playback starts, some in the middle and some at the end. The requests are chained, meaning the response of one is needed for the next. Most of the logic to "hide" this "algorithm" is lightly hidden in the SWF binary, and it pretty much serves its purpose.
However, for HTML5 we have to rely on world-visible JavaScript, and that "hidden" logic is wide open. So I guess this is a call for suggestions on how these cases are usually handled, so that a skilled person could not (so easily) get access to it and exploit the service to get credited programmatically. Obfuscating the JavaScript seems like something that could help, but it in no way protects fully.
There's of course some extra security on the backend (like frequency capping, per-user capping, etc.), but since our capping clears every day, a skilled person could still find a way to get credit for all available offers without completing them.
It sounds like you want to ensure that your server can distinguish requests that happened as the result of the user interacting with your UI in ways you approve of from requests that did not happen that way.
There are a number of points of attack on such a system.
1. Inspect the JavaScript to find the event handlers and invoke them via Firebug or another tool.
2. Inspect any keys from your code, and generate the HTTP requests without involving the browser.
3. Run code in the browser to programmatically generate events.
4. Use a 3rd-party tool that instruments the browser to generate clicks.
If you've got reasonable solutions to instrumentation attacks (3 and 4), then you can look at Is there any way to hide javascript functions from end user? for ways to get secrets into the client to allow you to sign your requests. Beyond that, obfuscation is the only (and imperfect) way to stop a not-too-determined attacker from any exploitation, and rate-limiting and UI event logging are probably your best bets for stopping determined attackers from benefiting from wide-scale fraud.
You will not be able to prevent a determined attacker (even with SWF, though it's more obfuscated). Your best bet is to make sure that:
Circumventing your measures is expensive in terms of effort, perhaps by using a computationally expensive crypto algorithm so they can't just set up a bunch of scripts to do it.
The payoff is minimal (user-capping is an example of how to reduce payoff; if you're giving out points, it's fine; if you're mailing out twenty dollar bills, you're out of luck)
Cost-benefit.
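One way to keep the chained-request idea from the question without relying on hidden client logic is to have the server sign each checkpoint and refuse the next one until enough time has passed. The sketch below assumes Node's built-in `crypto` module; the secret, checkpoint names, minimum gap and function names are all hypothetical, and it does not stop an attacker who simply waits out the timers, so the capping mentioned above is still needed.

```javascript
const { createHmac, timingSafeEqual } = require("crypto");

const SECRET = "server-side-secret";        // assumption: never sent to the client
const STEPS = ["start", "middle", "end"];   // hypothetical checkpoint names
const MIN_GAP_MS = 10000;                   // assumed minimum playback time between checkpoints

function sign(userId, offerId, step, issuedAtMs) {
  return createHmac("sha256", SECRET)
    .update([userId, offerId, step, issuedAtMs].join("|"))
    .digest("hex");
}

// Accept checkpoint `step` only if the client presents a genuine, old-enough
// token for the previous checkpoint. Returns the next token, or null.
function advance(userId, offerId, step, prevToken, nowMs) {
  const index = STEPS.indexOf(step);
  if (index < 0) return null;
  if (index > 0) {
    if (!prevToken || prevToken.step !== STEPS[index - 1]) return null;
    const expected = Buffer.from(
      sign(userId, offerId, prevToken.step, prevToken.issuedAtMs));
    const given = Buffer.from(String(prevToken.mac));
    if (expected.length !== given.length || !timingSafeEqual(expected, given)) {
      return null; // forged or altered token
    }
    if (nowMs - prevToken.issuedAtMs < MIN_GAP_MS) return null; // too early
  }
  return { step, issuedAtMs: nowMs, mac: sign(userId, offerId, step, nowMs) };
}
```

The user would be credited only when `advance` returns a token for the final checkpoint; recording used tokens server-side would additionally prevent replaying the same chain.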

how to restrict user to copy web content

I am creating a website, and my client demands that users be prevented from copying the TEXT displayed on the web page. How can I do that? I am using PHP and HTML in my application.
Not trying to be rude, but why do people keep asking this? If you want people to be able to see the information, then you cannot prevent them from copying it. Any kind of javascript nonsense to prevent right-clicking or selection or whatever else will not stop determined thieves and will annoy legitimate users.
As mentioned by every answer previously, there's no way to prevent someone from being able to use the copy from your site. Even if you use methods to restrict direct copy and paste, there are always screenshots, OCR or good old writing by hand.
Looking at it from a different perspective...if the content is sensitive and your client doesn't want it distributed, you COULD add it to a section of your site that requires registration and authentication to access. By doing this you could require that users agree to terms and conditions on registration which explicitly deny permission to reproduce any of the content from the site.
Just a thought.
As every other answer has said, there is nothing technically you can do to prevent people from copying the text of your page. For the text to be displayed to the user, you must send it to the user's computer, which means they can copy it.
However, you can detect copies of your text, and then pursue them legally, with a service like Copyscape:
Copyscape is dedicated to protecting your valuable content online. We provide the world's most powerful and most popular online plagiarism detection solutions, ranked #1 by independent tests. Copyscape's products are trusted by millions of website owners worldwide to check the originality of their new content, prevent duplicate content, and search for copies of existing content online.
Copyscape provides a free service for finding copies of your web pages online, as well as two more powerful professional solutions for preventing content theft and content fraud:
Copyscape Premium provides more powerful plagiarism detection than the free service, plus a host of other features, including copy-paste originality checks, batch search, case tracking and an API
Copysentry provides comprehensive protection for your website by automatically scanning the web daily or weekly and emailing you when new copies of your content are found.
Read more on their site.
You can force people to call a phone number to hear the text of your website, a great solution if you do not want people to copy/paste the text of your webpage.
Basically, you cannot. Even if there were a way to stop users from copying and pasting the text, they could always just grab the screen and translate it somehow into text.
I'd recommend not trying to restrict users in any way. It's not really friendly and people usually hate it. If you want to create some private content, just make people log in, do some ACL check and hope that they won't copy it somewhere else. You could also consider using some kind of license to prevent people from "stealing" your content.
Even if he were to build the system in Flash, the user could still write out the content by hand if they desperately wanted it. Like everyone else said, it's impossible to stop a determined person from getting your content, unless of course you don't display it.
No, AFAIK, there is no way you can achieve that. Unless you're building the whole thing in Flash or other non-HTML plugin contents.
The short answer is that you can't (easily) do this - if it's visible in the browser then it is obtainable somehow. This is particularly the case if you are just displaying text.
And it all gets back to "Why"? If the information is secret, don't show it to someone in the first place. If you're concerned about copyright violation, as others have said, once someone sees the text, even if you somehow came up with a brilliant technical solution that prevented them from copying the text in any way (which I doubt is possible), they could always write it down by hand, or take a picture of the screen with a digital camera and then OCR it. In the digital age, your protection against copyright violation is more legal than technical: if somebody steals your material and resells it, sue them.
Depending on the nature of your material, you may be able to make it awkward for people to get it all on one screen. Like, if you were running an on-line phone book and you were afraid of people stealing your listings, instead of displaying some large number of listings on one giant page -- all the "A"s or whatever -- you could require people to enter search terms and only show two or three possible hits at a time. Then if someone wanted to steal your listings, they would have to spend thousands of hours entering every imaginable search term. Now that I think of it, I was using some phone book site the other day that gave me a listing of names and addresses that were possible matches, but then I had to click on each one to get the phone number. At the time I thought "dumb nuisance", but now it hits me: they probably had the same idea that I briefly thought was original. Anyway, if your material is a database of individual factoids, this could be practical. If it's an article on the economic history of Lithuania or some such, making the user search for it in tiny pieces is just going to make people abandon you and look elsewhere.
Personally, I've taken the philosophy that I just don't care. I've had many occasions when I've done Google searches on subjects that interest me and turned up articles that I've written, on sites that never asked my permission. I once even found an article that I wrote on one of those pre-written student papers web sites. (Not that any student would just paste his name on it, print it off, and hand it in, of course. They are "for research purposes only". I'm sure if they knew of students claiming this as their own work they would take down the site immediately.) So an article that I published on the web, available to anyone for free, these people were now charging dishonest students $25 to download! My reaction was, Way cool! It's one thing when others quote you, but you've really reached the big time when others plagiarize you!
This is not possible.
You cannot prevent someone from getting the information if you're sending it to them so they can see it. A user can simply view the source of the HTML and see what the text is and copy it from there and there's nothing you can do to stop them.
Implementing anything in JavaScript is completely ineffective since anyone can just disable JavaScript in their browser and get around it, and you'll only end up annoying your users.
The only way to prevent someone copying the text from a web page is to not put it on the web page in the first place.
If you presented content via images, or flash, and prevented the ability to save as that might be a solution. I found some resources you might find useful in protecting images here and some information on "preventing" print screen here.
Unfortunately, there is no easy solution for your question, as once the content is delivered to the user, they have ultimate control over the information (who's preventing them from taking an actual picture of the site?).
Well, the PHP has nothing to do with it, as that's server-side. You might be able to cook up something in javascript (it's fairly easy to disable right-click; it may also be possible to disable text highlighting), but it's fairly easy to get around this. Failing all else, the user might view source, though that can be obfuscated too (Base64 is an encoding, not encryption, so anyone can decode it):
document.write(atob('encoded string containing entire HTML document'));
This is, frankly, both annoying and pointless. Anything that's available to the user can be taken somehow. Even flash isn't immune. (There are browser plugins available to take videos out of flash.)
You may want to look at your target audience as well to help determine how you want to make it harder (since you can't realistically prevent it)..
For the simple user just disabling the right click may be good enough to prevent it. Slightly more work would be to do as others had suggested and create an image. With the image you'd probably want to set a background-image on a DIV or something since you can easily drag images, using the IMG tag, straight from the page onto your desktop, or wherever. From there you could use Flash, or some other RIA, or maybe even SVG/VML..
Anyone who knows how to do a screen capture really narrows down what you can feasibly implement :(
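<script type="text/javascript">
// Script to discourage copying of website contents.
// Easily bypassed by disabling JavaScript or viewing the page source.
function killCopy(e) {
  return false;
}
function reEnable() {
  return true;
}
// Browsers that fire selectstart: cancel the start of a text selection.
document.onselectstart = killCopy;
// Old Gecko browsers (detected via window.sidebar): cancel mousedown instead,
// and let clicks through so links and buttons keep working.
if (window.sidebar) {
  document.onmousedown = killCopy;
  document.onclick = reEnable;
}
</script>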

Web Development: Do we still need to support non-javascript users?

Background: I'm working on an e-commerce website. It was my original intention to add JavaScript on top of regular html pages, so that users with JS support got the added benefits, but users without it could still use the basic html forms to add things to their cart, to search, etc.
I've run into a few situations though, where there simply isn't a sane way to implement certain functionality in a non JavaScript way.
One example is chained attribute selects on product pages (where the color choices change based on the size chosen, because not all sizes come in every color). Even if I didn't use AJAX, it would still require JavaScript to dynamically change the options.
The only alternatives to JavaScript that I can think of would be:
A. Have an add to cart "wizard" where you have to step through each attribute choice on a separate page (yuck!)
B. Put each size/color variation out as a separate product (and force the customer to wade through the category page to find the color size combo that they want)
...And while both of the above would work regardless of whether the user has JavaScript on or not, they both punish the JavaScript users by restructuring the page and forcing them to use an interface designed for the lowest common denominator.
So the question is, since JavaScript has taken a much larger role in web development than it did a few years back, and the design pattern for an AJAX/JS application/site is so much different now than a 'classic' web design pattern, do we still go out of our way to support non JS users? Or do we say, "To hell with you! Update your browser, turn on JavaScript or go shop elsewhere"?
I'd be interested to see other developers' takes on this.
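For reference, the chained size/color selects described in the question need only a few lines of script. The sketch below is illustrative: the availability table, the element ids `size` and `color`, and the function names are all made up, and the script is assumed to be included after the two select elements in the page.

```javascript
// Hypothetical availability data: which colors exist for each size.
const COLORS_BY_SIZE = {
  S: ["red", "blue"],
  M: ["red", "blue", "green"],
  L: ["green"],
};

function colorsFor(size) {
  return COLORS_BY_SIZE[size] || [];
}

// Rebuild the color <select> whenever the chosen size changes.
function wireSelects(doc, sizeSelect, colorSelect) {
  function refresh() {
    colorSelect.innerHTML = ""; // drop the old options
    for (const color of colorsFor(sizeSelect.value)) {
      const option = doc.createElement("option");
      option.value = color;
      option.textContent = color;
      colorSelect.appendChild(option);
    }
  }
  sizeSelect.addEventListener("change", refresh);
  refresh(); // populate for the initially selected size
}

// Browser only; guarded so the data and helper can be reused elsewhere.
if (typeof document !== "undefined") {
  const size = document.querySelector("#size");   // assumed element ids
  const color = document.querySelector("#color");
  if (size && color) wireSelects(document, size, color);
}
```

If the page is served with every color already in the second select, users without JavaScript still get a working (if unfiltered) form, and the server can reject unavailable combinations.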
I think it really depends on your target audience. I work for a company that has several types of websites, some are focused on your avg guy or gal who's into gaming. And our stats show us that the vast majority of these people have javascript enabled.
We also have a site that's focused at developers, and many of these developers won't allow javascript to run on a site unless they trust it. I've seen as many as 20-30% of the browsers on that site not running javascript.
So it's very subjective.
IMHO, it's very reasonable to use vast amounts of tasteful javascript to enhance an otherwise mundane experience. However, I also think that when possible it should gracefully degrade. This form of degradation isn't too hard to achieve (in most cases) as long as you consider it when you're designing things.
The most important non-javascript user is Google. Do not forget that.
When it comes to things like Ajax, or any javascript for that matter, I think it's best to:
Plan for Ajax from the start; Implement Ajax at the end -Jeremy Keith
This means intercepting (hijacking) links and forms using unobtrusive JavaScript, so that your code degrades well for those that don't have javascript enabled. If you want to show a fancy slider, make it a link that tells the server to show the div when you reload the page, and tell your javascript to do something different when the link is clicked.
These ideas are your safest bet for a functional, but fun website.
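A rough sketch of that hijacking pattern for a form: the plain HTML form works on its own, and script, when it runs, intercepts the submit and sends the data in the background. The class `add-to-cart`, the id `cart` and the function name are placeholders.

```javascript
// Enhance an ordinary form: submit in the background when possible,
// and fall back to the normal full-page post if the request fails.
function hijackForm(form, send, onDone) {
  form.addEventListener("submit", (event) => {
    event.preventDefault();            // stop the full-page post
    send(form)
      .then(onDone)
      .catch(() => form.submit());     // form.submit() does not re-fire this handler
  });
}

// Browser only: wire up the real form with fetch.
if (typeof document !== "undefined") {
  const form = document.querySelector("form.add-to-cart"); // placeholder selector
  if (form) {
    hijackForm(
      form,
      (f) => fetch(f.action, { method: "POST", body: new FormData(f) })
        .then((response) => response.text()),
      (html) => { document.querySelector("#cart").innerHTML = html; } // placeholder id
    );
  }
}
```

Without JavaScript the handler is never attached, so the form simply posts to its `action` URL as usual.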
I think that supporting non-Javascript users is absolutely necessary for any site aiming at some kind of "normal" target group (i.e. not gamers or techies).
There is an increasing number of mobile devices accessing, and trying to parse, normal content.
Many corporate networks still block scripting for security reasons - you don't want to lose the occasional employee shopping from work, either.
Javascript tends to screw any attempt at accessibility. In my mind, creating sites that are as accessible as possible is a service to our fellow human beings.
I'm not saying I'm lily white on this. I hate replicating functionality that I just managed to achieve in my JS framework in static HTML, or making it degrade gracefully. But I think it still really, really is a must, and not merely a question of profitability. This is something worth investing a bit extra, or putting a bit of unpaid work into.
You can exclude users that don't use javascript; that will be some mobile devices, the truly paranoid, lynx users, and any users not running the version of javascript you write for.
If you are willing to go with that then I would suggest you have a static html page that has some message that javascript is required.
When your javascript is loaded, and the DOM tree is ready then you can replace this message, so it is never seen, with the rest of the webpage.
But, you may want to see how you can get the functionality, even if limited, for non-javascript browsers.
For example, for colors you can use a horizontal dropdown that can work on all but older IE browsers: http://www.alistapart.com/articles/horizdropdowns/
If you want to use javascript to make your life simpler then that may be a poor reason, but, if you are doing a photoshop webapp then you will need javascript.
NOTE I would suggest having it work with and without javascript, as an e-commerce site will want to not exclude any customers, I expect.
Much of it depends on your audience. As you said years ago JavaScript was used in a different way, and a bit of an annoyance even. Now with AJAX and increased functionality it's a must, and most people have it turned on.
You could say that someone with Javascript disabled is used to stuff not working pretty regularly, and is in a small minority.
However, if you're building a site that will be frequented by older computers or people on limited bandwidth (such as foreign traffic) you might want to consider working around them. Also a site that is heavily visited by mobile browsers might be another one to focus on.
Take a look at your analytics and see what the current usage is, and profile your audience to really find out.
The best answer will depend on a simple comparison: Estimate the extra money you will spend creating the non javascript alternative site. Estimate the money you will make selling products to your non javascript enabled customers. Compare. If you are a huge shop, then getting sales from that last 2% or 10% of users might be worth it. If you are just one guy, maybe you have more profitable ways to spend your time.
I found this interesting and fact driven post, it might help. Why we should support users without Javascript. Following is a summary:
Some people choose to turn JS off.
JS fails occasionally, HTML/CSS does not.
JS is designed to and should be the icing on the cake, not a patch-job for bad HTML/CSS.
In theory, yes; but in practice, no.
In theory it's in the spirit of the web to support hardware and software with a range of capabilities and configurations, scaling site features appropriately.
In practice, even mobile browsers are converging on the sweet spot occupied by the current major desktop browsers. Users on the outside can typically switch to an alternate browser or device in a pinch.
While it seems logical, progressive, simpler, and more efficient to require Javascript for an e-commerce site, you should ask yourself if you are willing to forego x percent of the business that would be generated by non-Javascript users, and weigh that against lower development costs.
The percent of business lost is likely not the percent of non-Javascript users, because a smaller percentage of non-Javascript visitors are likely to purchase goods or services. The percentage of lost business will probably be somewhat less than the percentage of non-Javascript users.
Focusing on building advanced features for the large audience is better than spending time finding workarounds for non-javascript users and those on obsolete platforms.
In my opinion there are 2 things you have to consider when thinking about using JavaScript on a website:
Is Google still able to crawl all the content
If some parts of the website are not usable without JavaScript, then show the non-JavaScript users a very clear message explaining why your site is not working for them
I think we need to support non-javascript users again.
There are users, including me, who disable JS not out of any kind of paranoia, but because some sites are so JS-heavy nowadays that my browser slows to a grinding halt. And this is getting worse recently, so I expect more users will do this eventually. I wanted my performance back, so I installed browser plugins that let me selectively enable only the absolutely necessary scripts on the site. So I can have a dozen tabs open again without performance problems.
