Users on my site have a publicly-visible profile where they accept subscriptions via a simple HTML form. These subscriptions are merged into this user's email list.
Someone could write a script that constantly registers email addresses to flood a user's list. This could be mitigated with IP-based rate limiting, but that does not work if the script runs in a distributed environment.
The only strategy I can think of is using a CAPTCHA, but I'd really like to avoid doing this. What else can I try?
Your question essentially boils down to "How can I tell humans and computers apart without using a CAPTCHA?"
This is indeed quite a complex question with a lot of different answers and approaches. In the following I'll try to name a few. Some of the ideas were taken from this article (German).
Personally I think some kind of CAPTCHA would be a perfect solution. It doesn't necessarily have to be warped text in an image; you could also use logic puzzles or simple calculations. But with the following methods you could try to avoid CAPTCHAs; keep in mind that these methods will always be easier to bypass than CAPTCHAs that require user interaction.
Use a hidden field as a honeypot in your form (either type=hidden or hidden via CSS). If this field is filled out (or has a value other than what you expect), you have detected a bot (spam bots usually don't perform semantic analysis, so they fill out every field they find). However, this won't work if the bot specifically targets you or simply learns the name of the field and avoids it.
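For illustration, here is a minimal honeypot sketch; the field name, the /subscribe route and the use of Express are my own assumptions, not something from the question:

// Honeypot sketch (all names are illustrative). The subscription form carries an
// extra field that real visitors never see or fill in, e.g.:
//
//   <input type="text" name="website" tabindex="-1" autocomplete="off"
//          style="position:absolute; left:-9999px">
//
// A hypothetical Express handler then silently drops any submission that fills it in:
const express = require('express');
const app = express();
app.use(express.urlencoded({ extended: false }));

app.post('/subscribe', (req, res) => {
  if (req.body.website) {
    // Honeypot was filled in: almost certainly a bot. Pretend everything went fine.
    return res.redirect('/thanks');
  }
  // ...otherwise add req.body.email to the profile owner's subscription list...
  res.redirect('/thanks');
});

app.listen(3000);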
Use JavaScript to check how fast the form is submitted. Of course humans need some time (at least a few seconds) to fill in a form whereas bots are a lot faster.
You should also check if the form is submitted more than once in a short time. This could be done via JavaScript if you use AJAX forms and/or server-side.
The drawback is (as you mentioned yourself) that it won't work in distributed systems.
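A minimal client-side sketch of such a timing check; the selector and the three-second threshold are my own choices:

// Record when the form was shown and refuse to submit if it was completed
// implausibly fast. This runs client-side only, so a bot that never executes the
// script bypasses it; combine it with a server-side check for repeated submissions.
const form = document.querySelector('#subscribe-form');   // illustrative selector
const shownAt = Date.now();

form.addEventListener('submit', (event) => {
  const elapsedSeconds = (Date.now() - shownAt) / 1000;
  if (elapsedSeconds < 3) {          // a human needs at least a few seconds
    event.preventDefault();          // likely a bot: drop or flag the submission
  }
});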
Use JavaScript to detect focus events, clicks or other mouse events that indicate you're dealing with a human. This method is described in this blog article (including some source code examples).
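A sketch of that idea; the field name and the chosen events are assumptions on my part:

// Mark the submission as "human" only after events that headless scripts usually
// never fire. A hidden field is assumed to exist in the form:
//   <input type="hidden" name="human" value="">
const humanField = document.querySelector('input[name="human"]');
const events = ['mousemove', 'keydown', 'touchstart', 'focusin'];

function markHuman() {
  humanField.value = '1';
  events.forEach((type) => document.removeEventListener(type, markHuman));
}

events.forEach((type) => document.addEventListener(type, markHuman));
// Server side: treat submissions where the "human" field is empty with suspicion.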
Check if the user works with a standard web browser; spammers sometimes use self-written programs. You could check the user agent string, but this can be manipulated easily. Feature detection would be another possibility.
Of course methods 2-4 won't work if a user has JavaScript disabled. In that case you could display a regular CAPTCHA inside <noscript> tags, for example. In any case, you should always combine several methods to get an effective and user-friendly test.
What finally comes to mind (in your specific case) is checking the validity of the email addresses entered: not only syntactically, but also whether the addresses really exist. This can be done in several ways (see this question on SO), though none of them is really reliable. So, again, you will have to combine different methods to reliably tell humans and bots apart.
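As a sketch of the existence part, one common approach is a DNS lookup for the address's mail exchangers; this is only an illustration and, as noted above, not a reliable test:

// Node.js sketch: rough syntax check plus an MX lookup for the address's domain.
// A passing result only means the domain can receive mail, not that the mailbox exists.
const dns = require('dns').promises;

async function looksDeliverable(email) {
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) return false;  // crude syntax check
  const domain = email.split('@')[1];
  try {
    const records = await dns.resolveMx(domain);
    return records.length > 0;       // at least one mail exchanger is published
  } catch {
    return false;                    // unknown domain, DNS failure, etc.
  }
}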
Assuming that whoever spams your website is targeting it specifically (not a random spam bot) and will actively try to work around any countermeasures, the only option is some kind of CAPTCHA, as anything else can be bypassed automatically.
All non-CAPTCHA methods of preventing fake/spam submissions work either by exploiting flaws in the script doing the automated submissions or by analyzing the submitted content. With this type of submission, content analysis isn't really an option. So what is left is the wide variety of automated-submission countermeasures used to fight, for example, comment spam:
CSS-based solutions (such as this one: http://wordpress.org/extend/plugins/spam-honeypot/)
JS-based solutions: a hidden field is filled with data computed by JavaScript; if the content is submitted by something as simple as a spam script that doesn't support JavaScript, it's easily detectable
It's possible to work around both of these if the attacker knows they are there - for example, when your website is a deliberately chosen target rather than a random one.
To summarize: there are plenty of solutions that will quite successfully stop random spam submissions, but if someone is specifically targeting your website, the only thing that will really work is something that computers are bad at: a CAPTCHA.
Related
When you pay through online payment systems (with or without 3-D Secure), you fill in the form and validate it, and from a strictly visual point of view things seem pretty straightforward. But behind the scenes there are often multiple redirections, which are handled through JavaScript.
Basically, your data is submitted and you land on a page with a pre-filled form, which is immediately submitted through JavaScript, sometimes several times in a row (with a fast enough connection, you don't even see those steps in the browser).
I was wondering why they do it that way (instead of proper back-end redirections), and I can't find an answer to it.
My guess is that it's just to make it harder for scripts to follow, but it's still possible to do (so why bother), and in my opinion the "dirty aspect" of it (from a coder's point of view) is not worth the obstacles it puts in front of scripts attempting automated validation.
Do you have any insights on this?
In my view, using JavaScript can tell bots and humans apart quite effectively.
You can already see how Google validates bots with reCAPTCHA.
It's just a simple checkbox, but it's quite complicated to write a bot that passes the check. (I still don't know how to get past it.)
Question: What precautions should I take when I let clients add custom JS scripts to their pages?
If you want more details:
I am working on a custom CMS-like project for a company. The CMS has a number of "groups" that each subscriber "owns", where they do their own thing.
The new requirement is that some groups want to add Google Analytics to see how they are doing. So I naturally added a column to the table and made code adjustments so that, if there is some data in that column, I use the following line in the master page to output the script:
ScriptManager.RegisterClientScriptBlock(Page, typeof(Page), "CustomJs", CustomJs, true);
It works just fine. Only, it got me thinking...
It's really easy for someone with good knowledge of JavaScript to access cookies and so on. Sure, each group is moderated and only a super admin can add this JavaScript; sure, they wouldn't be silly enough to hack their own group; and each group has its own code, so it's not possible to hack other groups. But still...
I am not really comfortable letting users add their own JavaScript code.
I could monitor each group myself, but the groups are growing really quickly and I will reach a point where I can no longer do that.
So, to sum it up: what precautions should I take to avoid any mishaps?
PS: I did try Google; no convincing answers anywhere.
Instead of allowing the users to add their own JavaScript, and given that the only requirement here is Google Analytics, why not just let them put their Analytics ID into the CMS and, if it's present, output the relevant Google Analytics code?
This way you fulfill the user's requirement and also avoid the need to protect against malicious scripting.
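A sketch of that approach; the validation regex and the use of the current gtag.js snippet are my own assumptions, so take the exact snippet from Google's documentation:

// Only the tracking ID is stored per group, never raw script. The server checks
// the ID's shape and emits a fixed, known-good block with the ID interpolated.
function analyticsSnippet(trackingId) {
  if (!/^(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{4,})$/.test(trackingId)) {
    return '';                                   // unexpected format: emit nothing
  }
  return [
    '<script async src="https://www.googletagmanager.com/gtag/js?id=' + trackingId + '"></script>',
    '<script>',
    '  window.dataLayer = window.dataLayer || [];',
    '  function gtag(){dataLayer.push(arguments);}',
    "  gtag('js', new Date());",
    "  gtag('config', '" + trackingId + "');",
    '</script>'
  ].join('\n');
}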
Letting users use JavaScript is, in general, a very bad idea. Don't do it unless you have to.
I once had a problem where I needed to let clients use JavaScript, but the clients weren't necessarily trusted, so I modified CoffeeScript so that only a small subset was compilable to JavaScript, and it worked pretty well. This may be way too much overkill for you.
You should not let your users access cookies; that's always a pain. Also, no localStorage or Web SQL if you're one of the HTML5 people, and no document.write(), because that's another form of eval, as JSLint tells you.
And the problem with letting people have JavaScript is that even if you believe you have trusted users, someone may get hold of a password, and you don't want that person to get access to all the other accounts in the group.
Automatically recognizing whether some JavaScript code is malicious, or sandboxing it, is close to impossible. If you don't want your site to be open to hacking, you are left with only a few options:
Don't allow users to add JavaScript at all.
Only allow predefined JavaScript code, e.g. for Google Analytics.
Have all custom JavaScript inspected by a human before it is allowed to display on the site. Never trust scripts loaded from third party sites - these can change from one day to another and turn malicious.
If you have no other choice, you may consider separating the path/domain of user JavaScript (and of your cookies).
For example, each user's pages could live on their own subdomain:
user1.server.com, user2.server.com, ...
If you then set session cookies for user1.server.com only, they will be unobtainable by user scripts served from other domains (e.g. user2.server.com).
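A small sketch of the cookie side of this; the Express handler and createSessionId are placeholders of mine. Issuing the session cookie without a Domain attribute keeps it host-only, so it never becomes visible on sibling subdomains:

// Hypothetical login handler served from user1.server.com. Omitting the `domain`
// option makes the cookie host-only, so scripts running on user2.server.com
// cannot ride on a session established here.
app.get('/login', (req, res) => {
  res.cookie('session', createSessionId(), {   // createSessionId() is a placeholder
    httpOnly: true,   // additionally hides the cookie from client-side scripts
    secure: true,
    sameSite: 'lax'
    // deliberately no `domain: '.server.com'` here
  });
  res.redirect('/');
});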
Another option may be executing all user JavaScript in a server-side JS engine (thus controlling all of its I/O and limiting access to browser resources).
There is no simple and easy solution in any case, so it is better to consider the options from the other answers (e.g. a predefined script API, or human inspection).
We're talking your average everyday spamming bots -- those which we try to protect against using captcha.
How many of them are capable of running JS in some kind of embedded-browser?
If it's a very tiny amount, then how on earth can solutions like this be useful: http://wcaptcha.wozia.pt/sample.php
Apart from the obvious usability/accessibility issues, these drag-and-drop solutions require the client to have JS. There's not even a fallback. So, assuming it is intended to protect against bots (non-humans), isn't it entirely redundant, or at least redundant to the extent of however many bots would be technically capable of attempting such a thing?
If the client has JS (which is a prerequisite for this solution to work), then isn't it safe (within reason) to assume the client is not a bot?
It isn't that redundant. If you just detect JavaScript, people can still boot up instances of Selenium and pretend to comment. The number of spam bots doing that today is small, but as the spam wars evolve, you can bet spam bots will move on to other methods, such as driving a real browser. If you detect JavaScript AND make them drag and drop something, it goes a long way toward proving you're dealing with a human.
But I think this particular implementation is just not practical, because there is still a percentage of people who have JS off for whatever reason. I hear it's 2 or 3%, which is still a substantial number when you're talking about hundreds of thousands of visitors.
An alternative is to have a noscript option that asks the user to activate Javascript if he/she wants to comment on the blog.
Yes, very few spambots will have JavaScript enabled.
Spam is a percentages game. Only a very small fraction of spam messages will generate any revenue for the spammer, so if you can increase the cost of spamming, you make it economically infeasible. Spamming from a JavaScript-enabled browser is far more expensive than spamming from the command line: a spammer can send out much more spam in the same time by sticking to curl.
Yes, it is redundant.
Rather than making users do this pointless task, you might as well perform a JavaScript check automatically. It could be as simple as a script that grabs the domain name of the site and inserts it into each form as a hidden field. This will stop all drive-by spammers. If your site is high-profile enough to attract custom spammers, this solution won't be enough anyway.
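A minimal sketch of that check; the field name is illustrative:

// Fill a hidden field with the site's hostname at load time. Drive-by bots that
// never execute JavaScript will submit the form with the field missing or empty.
document.querySelectorAll('form').forEach((form) => {
  const token = document.createElement('input');
  token.type = 'hidden';
  token.name = 'js_check';
  token.value = window.location.hostname;
  form.appendChild(token);
});
// Server side: reject any submission where js_check is absent or does not match
// the expected hostname.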
For those without JavaScript, just show them a regular old image CAPTCHA after their post fails.
A bigger issue is usability, IMHO. A CAPTCHA is always going to decrease conversion rates, often significantly. If your goal is to use JS as a means of deterring bots, I can tell you that it has reduced bot traffic for me by more than 90%.
Just incorporate a hidden field that gets populated by JS. If it isn't filled in, they're either a bot or one of those idiots with JS turned off, whom you don't really want to cater to anyway.
Also incorporate a honeypot field that is present in the DOM but flown off the screen with CSS like "position:absolute; left:9999px; top:-9999px". Don't use "display:none;". If this field is filled in, they're a bot.
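A compact sketch of those two fields together; the names are illustrative:

// 1) A field that only JavaScript fills in; if it is still empty on the server,
//    the client most likely never ran the script:
//      <input type="hidden" name="js_filled" value="">
// 2) A honeypot that is present in the DOM but flown off screen with CSS;
//    if it comes back non-empty, a bot filled it in:
//      <input type="text" name="contact_me" autocomplete="off" tabindex="-1"
//             style="position:absolute; left:9999px; top:-9999px">
document.querySelector('input[name="js_filled"]').value = 'ok';

// Server-side decision, written here in JavaScript for illustration:
function isProbablyBot(fields) {
  return fields.js_filled !== 'ok' || Boolean(fields.contact_me);
}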
I cut our spam by more than 90% with this, so I'd use it over CAPTCHA-type solutions unless you're a big business. If you're a big business, your only real solution is a back-end, server-side one. Good luck finding that on Stack Overflow; they'll close your question quicker than people can answer it (and it will have a better Google rank than anything out there).
I am creating a web site and my client demands that I restrict users from copying the TEXT displayed on the web page. How can I do that? I am using PHP and HTML in my application.
Not trying to be rude, but why do people keep asking this? If you want people to be able to see the information, then you cannot prevent them from copying it. Any kind of javascript nonsense to prevent right-clicking or selection or whatever else will not stop determined thieves and will annoy legitimate users.
As mentioned in every previous answer, there's no way to prevent someone from copying the content of your site. Even if you use methods to restrict direct copy and paste, there are always screenshots, OCR, or good old writing it out by hand.
Looking at it from a different perspective...if the content is sensitive and your client doesn't want it distributed, you COULD add it to a section of your site that requires registration and authentication to access. By doing this you could require that users agree to terms and conditions on registration which explicitly deny permission to reproduce any of the content from the site.
Just a thought.
As every other answer has said, there is nothing you can do technically to prevent people from copying the text of your page. For the text to be displayed to the user, you must send it to the user's computer, which means they can copy it.
However, you can deter copying through legal means, with the help of a service like Copyscape:
Copyscape is dedicated to protecting your valuable content online. We provide the world's most powerful and most popular online plagiarism detection solutions, ranked #1 by independent tests. Copyscape's products are trusted by millions of website owners worldwide to check the originality of their new content, prevent duplicate content, and search for copies of existing content online.
Copyscape provides a free service for finding copies of your web pages online, as well as two more powerful professional solutions for preventing content theft and content fraud:
Copyscape Premium provides more powerful plagiarism detection than the free service, plus a host of other features, including copy-paste originality checks, batch search, case tracking and an API.
Copysentry provides comprehensive protection for your website by automatically scanning the web daily or weekly and emailing you when new copies of your content are found.
Read more on their site.
You could force people to call a phone number to hear the text of your website: a great solution if you do not want anyone to copy/paste the text of your web page.
Basically, you cannot. Even if there were a way to stop users from copying and pasting the text, they could always just grab the screen and somehow turn it back into text.
I'd recommend not trying to restrict users in any way. It's not very friendly, and people usually hate it. If you want to create some private content, just make people log in, do an ACL check, and hope that they won't copy it somewhere else. You could also consider using some kind of license to deter people from "stealing" your content.
Even if you were to build the system in Flash, the user could still write the content out by hand if they desperately wanted it. Like everyone else said, it's impossible to stop a determined person from getting your content, unless of course you don't display it.
No, AFAIK there is no way you can achieve that, unless you're building the whole thing in Flash or some other non-HTML plugin content.
The short answer is that you can't (easily) do this - if it's visible in the browser then it is obtainable somehow. This is particularly the case if you are just displaying text.
And it all gets back to "Why"? If the information is secret, don't show it to someone in the first place. If you're concerned about copyright violation, as others have said, once someone sees the text, even if you somehow came up with a brilliant technical solution that prevented them from copying the text in any way (which I doubt is possible), they could always write it down by hand, or take a picture of the screen with a digital camera and then OCR it. In the digital age, your protection against copyright violation is more legal than technical: if somebody steals your material and resells it, sue them.
Depending on the nature of your material, you may be able to make it awkward for people to get it all on one screen. For instance, if you were running an online phone book and you were afraid of people stealing your listings, then instead of displaying some large number of listings on one giant page (all the "A"s or whatever), you could require people to enter search terms and only show two or three possible hits at a time. Then if someone wanted to steal your listings, they would have to spend thousands of hours entering every imaginable search term.
Now that I think of it, I was using some phone book site the other day that gave me a listing of names and addresses that were possible matches, but then I had to click on each one to get the phone number. At the time I thought "dumb nuisance", but now it hits me: they probably had the same idea that I briefly thought was original.
Anyway, if your material is a database of individual factoids, this could be practical. If it's an article on the economic history of Lithuania or some such, making the user search for it in tiny pieces is just going to make people abandon you and look elsewhere.
Personally, I've taken the philosophy that I just don't care. I've had many occasions when I've done Google searches on subjects that interest me and turned up articles that I've written, on sites that never asked my permission. I once even found an article that I wrote on one of those pre-written student papers web sites. (Not that any student would just paste his name on it, print it off, and hand it in, of course. They are "for research purposes only". I'm sure if they knew of students claiming this as their own work they would take down the site immediately.) So an article that I published on the web, available to anyone for free, these people were now charging dishonest students $25 to download! My reaction was: way cool! It's one thing when others quote you, but you've really reached the big time when others plagiarize you!
This is not possible.
You cannot prevent someone from getting the information if you're sending it to them so they can see it. A user can simply view the HTML source, see what the text is, and copy it from there; there's nothing you can do to stop them.
Implementing anything in JavaScript is completely ineffective since anyone can just disable JavaScript in their browser and get around it, and you'll only end up annoying your users.
The only way to prevent someone copying the text from a web page is to not put it on the web page in the first place.
If you presented the content via images or Flash and prevented the ability to save it, that might be a partial solution. I found some resources you might find useful on protecting images here, and some information on "preventing" Print Screen here.
Unfortunately, there is no easy solution for your question, as once the content is delivered to the user, they have ultimate control over the information (who's preventing them from taking an actual picture of the site?).
Well, the PHP has nothing to do with it, as that's server-side. You might be able to cook up something in JavaScript (it's fairly easy to disable right-click; it may also be possible to disable text highlighting), but it's fairly easy to get around this. Failing all else, the user might view source, though that can be obfuscated too:
document.write(atob('encoded string containing entire HTML document'));
This is, frankly, both annoying and pointless. Anything that's available to the user can be taken somehow. Even Flash isn't immune (there are browser plugins available to rip videos out of Flash).
You may want to look at your target audience as well, to help determine how you want to make it harder (since you can't realistically prevent it).
For the casual user, just disabling right-click may be deterrent enough. Slightly more work would be to do as others have suggested and render the text as an image. With the image you'd probably want to set it as a background-image on a DIV or similar, since images placed with an IMG tag can easily be dragged straight from the page onto the desktop, or wherever. Beyond that you could use Flash, some other RIA, or maybe even SVG/VML.
Anyone who knows how to do a screen capture really narrows down what you can feasibly implement :(
<script type="text/JavaScript">
//script to bar copying of website contents
function killCopy(e){
return false
}
function reEnable(){
return true
}
document.onselectstart=new Function("return false"){
if (window.sidebar){
document.onmousedown=killcopy
document.onclick=reEnable
}
};
</script>
I'm working on an ajax application that makes extensive use of jQuery. I'm not worried about whether or not the application degrades gracefully.
So far I have been using Malsup's excellent jQuery form plugin to create forms that submit ajax requests. (For example, to submit updated record information.)
However I am considering dispensing with form tags altogether, and instead manually constructing $.post() statements when needed.
I'm wondering: what are people's thoughts on the best way to submit a large amount of information to the server, considering that graceful degradation is not a requirement? Are there perils with just using $.post()?
Thanks in advance
Nope, not at all. That's all the plugin is doing anyway, under the hood.
The form tag does at least give you a nice structural grouping of your input elements, though, so you can query for them more easily.
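For reference, a minimal sketch of a hand-rolled submission without a form tag; the selectors and URL are illustrative:

// Collect values from a plain container instead of a <form>, then send them with $.post().
$('#save-record').on('click', function () {
  var payload = {
    id:    $('#record-id').val(),
    name:  $('#record-name').val(),
    email: $('#record-email').val()
  };
  $.post('/records/update', payload, function (response) {
    // update the UI with the server's response
  }, 'json');
});

With a form tag you could instead pass $('#my-form').serialize() as the data, which is essentially what the form plugin does for you.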
You've said it yourself: the peril is that it won't degrade gracefully!
Have jQuery add an extra field called UsingjQuery, then output your results based on whether this field is set or not.
This way users with javascript turned off (mobile clients, etc) will still be able to submit.
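A sketch of that; the form selector is illustrative. The flag is appended only on the Ajax path, so a plain non-JS submission simply won't include it:

// Ajax path: jQuery appends the flag, so it is present only when JS actually ran.
$('#comment-form').on('submit', function (event) {
  event.preventDefault();
  $.post('/comments', $(this).serialize() + '&UsingjQuery=1');
});
// Non-JS path: the same form submits normally; seeing no UsingjQuery field,
// the server renders a full page instead of an Ajax fragment.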
Edit: I saw you mentioned 'degrades gracefully' but somehow missed that it said 'not worried about' first!
Having a form tag does allow one JavaScript trick that you lose without it: resetting every field at once via the native DOM method, e.g. $('form')[0].reset() ...
I stopped using FORM tags some time ago, but I also have a captive set of users, so I know exactly what platform they are using.
I agree with David Pfeffer; however, I would also make the point that, on occasion, form tags can get in your way. I've specifically had problems where I wanted multiple forms inside a table, but that caused really ugly positioning problems. So I wound up dropping just the input elements in, copying their values into a form elsewhere on the page, and then submitting that form. It was a bit of a pain in the butt.
If you can do away with forms, and aren't worried about degradation, then I would seriously consider it.