Ignore commas in a text input field when submitting - javascript

So I've made this search that does what its supposed to do front-end wise. However, when submitting I'd like the query to ignore commas.
Right now I'm using commas to make a comma separated search. The whole thing is, when I submit; the comma's are included and thus messes up my search values.
Is there any way to ignore comma's upon submit?
Example: Searching [Example][Test] will actually return Example,Test.
I've made a fiddle here
Any suggestions and help is greatly appreciated.

var firster = true;
//capture form submit
$('form.nice').submit(function(e){
if(firster){
// if its the first submit prevent default
e.preventDefault();
// update input value to have no commas
var val = $('input').val();
val = val.replace(/,/g, ' ');
$('input').val(val);
// let submit go through and submit
firster = false;
$(this).submit();
}
});
DEMO

Looking at your profile, I'm guessing you're using python as a server-side language. The issue you're trying to solve is best dealt with server-side: never rely on front-end code to escape or format data that is being used in a query... check Bobby Tables for more info
Anyhow, in python, you could try this:
ajaxString.replace(",","\", \"")
Thiis will replace all commas witIh " OR ", so a string like some, keywords is translated into some", "keywords, just add some_field IN (" and the closing ") to form a valid query.
Alternatively, you can split the keywords, and deal with them separately (which could come in handy when sorting the results depending on how relevant the results might be.
searchTerms = ajaxString.split(",")
>>>['some','keywords']
That should help you on your way, I hope.
Lastly, I'd suggest just not bothering with developing your own search function at all. Just add a google search to your site, they're the experts. There is just no way you, by yourself, can do better. Or even if you could, just imagine how long it'd take you!
Yes, sometimes a company will create their own search-engine, but only if they have a good reason to do so, and have the resources such an endevour requires. Programming is often all about being "cleverly lazy": Don't reinvent the wheel.

Related

Prevent user from pasting line break

I'm currently having an issue with a code. In my code, I've got a textarea where the user can enter the title of an article and I would like this article to be only in one row. That's why I wrote a script to prevent users to press the return key. But they could bypass this security, indeed if they copy/past the line break they could enter a line break. So, is there a way to detect line break ? I suppose we can do this with regular expressions and with \n or \n. However I tried this:
var enteredText = $('textarea[name="titleIdea"]').val();
var match = /\r|\n/.exec(enteredText);
if (match) {
alert('working');
}
and it doesn't work for an unknown reason. I think the var enteredText = $('textarea[name="titleIdea"]').val(); doesn't work because when I try to alert() it, it shows nothing. But something strange is that when I do an alert on $('textarea[name="titleIdea"]').val(); and not on the enteredText variable it shows the content.
Have a great day. (sorry for mistakes, I'm french)
if they copy/past the line break they could enter a line break
That's why you shouldn't even worry about preventing them from entering it - just don't save it. Remove it on the blur and input events if you really want to, but the only time it actually matters is before you save it to the database (or whatever you are using).
$('textarea[name="titleIdea"]').on('blur input', function() {
$(this).val($(this).val().replace(/(\r\n|\n|\r)/gm,""));
});
And, as other people have already mentioned, if they can't do line breaks, you shouldn't be using a textarea.
I assume your problem is with the paste event.
If i guessed this is my snippet:
$(function () {
$('textarea[name="titleIdea"]').on('paste', function(e) {
var data;
if (window.clipboardData) { // for IE
data = window.clipboardData.getData('Text');
} else {
data = e.originalEvent.clipboardData.getData('Text');
}
var match = /\r|\n/.exec(data);
if (match) {
alert('working');
console.log(data);
}
})
});
<script src="https://code.jquery.com/jquery-1.12.4.min.js"></script>
<textarea name="titleIdea">
</textarea>
This needs to be handled in the backend. Even if you use the recommended appropriate HTML input type of text (instead of textarea), you still do not remove the possibility of return chars getting saved.
The two other answers use Javascript - which technically is the domain of this question. However, this can not be solved with Javascript! This assumes that the input will always come from the form you created with the JS function working perfectly.
The only way to avoid specific characters being inserted into your database is to parse and clean the data in the backend language prior to inserting into your database.
For example, if you are using PHP, you could run a similar regex that stripped out the \n\r chars before it went into processing.
Javascript only helps the UX in this case (the user sees what they will be saving). But the only way to ensure you have data integrity is to validate it on the server side.

What is a good way to change fields in livecycle-documents without triggering validations?

I got a form with lots of fields and many scripts so I broke it down to my very fundamental problem with this example:
Before printing I check if everything is filled out with:
Seite1.execValidate();
The validate XML source of the field:
<validate nullTest="error" scriptTest="error"/>
When clicking on the top-button I want different things to happen.
For example:
field.rawValue = "";
or (if it's a decimal field)
//isPauschal was set earlier to either true or false
field.value.decimal.leadDigits = (isPauschal)?"4":"2";
But then this happens:
Field goes blue (=it is empty) when setting its value to "" - this I want only to happen when I'm validating with the print-button.
Now I found a workaround:
field.mandatory = "";
field.rawValue = "";
field.mandatory = "error";
But if I were to write this every time I changed something that would trigger this my code would look pretty bad and much more confusing.
Can someone help me?
What could I do to easily validate my fields before printing and still being able to change them around with js at runtime without them making strange colors. ;)
I don't want to validate them individually - I wish to keep something like the execValidate() command so it automatically checks all the fields in a subform.
Let me know if you need any more information!

How would I use javascript to validate an e-mail with output on the DOM?

Basically I created a form in html and when things are input properly it simply goes to google.com. Right now I have completed the first few fields but I am unsure of how I would make it recognize if the input that was put in and did not include an # sign or a . some point after it.
I created a fiddle as I was having trouble getting some of the longer of my lines of code to be in-line.
Click here for the fiddle Example
You have a few options depending on how thorougly you want to validate.
email.indexOf('#') >= 0
checks that there is an # at all in the email. See http://jsfiddle.net/27f1h6ws/ for a version of your fiddle with it added.
A more thorough way would be to check it with regex. You can do it extremely simple just checking the general structure of the email input, or extremely thorough check for all valid characters, depending on how crucial the validation is. See this link or the answers in this question for more information.
You can use HTML5 properties like :
pattern attribute which contains a regexp
set your input type to email
If you want to do it with JavaScript, use a regexp also and use the test() method to verify it.
Add this
var re = /^([\w-]+(?:\.[\w-]+)*)#((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$/i;
return re.test(email);

Asymmetrical search for special characters in JavaScript

I'm trying to implement an asymmetrical search for a dictionary web app, so searching for ü, for example, will return only tokens that actually contain ü, but searching for u will return both u and ü. (This is so users who don't know how to type special characters can still search for them, but users who do know how to type them won't be inundated with the plain character forms unnecessarily.)
It has to all be client-side JavaScript without any external libraries.
I've managed to make the second search type work by running both the search term and the text I'm searching through the following function, effectively merging special characters with their plain counterparts:
function cleanUp(dirty) {
cleaned = dirty.replace(/[áàâãäāă]/ig,"a");
cleaned = cleaned.replace(/đ/ig,"d");
cleaned = cleaned.replace(/[éèêẽëēĕ]/ig,"e");
cleaned = cleaned.replace(/[íìîĩïīĭ]/ig,"i");
cleaned = cleaned.replace(/ñ/ig,"n");
cleaned = cleaned.replace(/[óòôõöōŏ]/ig,"o");
cleaned = cleaned.replace(/[úùûũüūŭ]/ig,"u");
return cleaned;
}
I then compare the strings to get my results with something like:
var search_term = cleanup(search_input.value);
var text_to_search = cleanup(main_text);
if (text_to_search.indexOf(search_term) > -1) ... //do something
It's not elegant, but it works. After cleaning up both strings the user can search for i.e. uber and get über even if they don't know how to type ü. But if they do know how, searching for über directly also returns things like uber, which is what I don't want.
I've already thought of things like checking for each special character separately for each search term or duplicating every dictionary entry that has a special character to produce a special-character and a plain-character version, but all of my ideas would seriously slow down the processing time for the search.
Any ideas are greatly appreciated.
The answer you posted sounds quite reasonable.
I would just like to suggest a cleaner way (pun intended) to code your cleanup() function and similar functions that do a series of string operations:
function cleanUp(dirty) {
return dirty
.replace(/[áàâãäāă]/ig,"a")
.replace(/đ/ig,"d")
.replace(/[éèêẽëēĕ]/ig,"e")
.replace(/[íìîĩïīĭ]/ig,"i")
.replace(/ñ/ig,"n")
.replace(/[óòôõöōŏ]/ig,"o")
.replace(/[úùûũüūŭ]/ig,"u");
}
I ended up checking to see if the search term contained any special characters, and if it did, I didn't run it through cleanup(), and compared it to the original dictionary entry instead of the cleaned one. Thanks for the comments everyone.

Is it safe to validate a URL with a regexp?

In my web app I've got a form field where the user can enter an URL. I'm already doing some preliminary client-side validation and I was wondering if I could use a regexp to validate if the entered string is a valid URL. So, two questions:
Is it safe to do this with a regexp? A URL is a complex beast, and just like you shouldn't use a regexp for parsing HTML, I'm worried that it might be unsuitable for a URL as well.
If it can be done, what would be a good regexp for the task? (I know that Google turns up countless regexps, but I'm worried about their quality).
My goal is to prevent a situation where the URL appears in the web page and is unusable by the browser.
Well... maybe. People often ask a similar question about email addresses, and with those you would need a horrendously complicated regular expression (i.e. a couple pages long, at least) to correctly validate them. I don't think URLs are quite as complicated (the W3C has a document describing their format) but still, any reasonably short regexp you come up with will probably block some valid URLs.
I would suggest thinking about what kinds of URLs you need to be accepting. Maybe for your purposes, blocking the occasional valid-but-weird submission is fine, and in that case you can use a simple regex that matches most URLs, like the one in Dobiatowski's answer. Or you could use a regex that accepts all valid URLs and a few invalid ones, if that works for you. But I'd be wary of trying to find a regular expression that accepts exactly all valid URLs and no invalid ones. If you want to have 100% foolproof verification in that way, I'd suggest using a client-side validation of the second type I mentioned (that accepts a few invalid URLs) and doing a more comprehensive check on the server side, using some library in whatever language you are using to process the form data.
As stated by Crescent Fresh, in the comments, there are similar questions to this one which I didn't find. One of them also supplies the full standards-compliant regex for validating an URL:
(?:http://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.
)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)
){3}))(?::(?:\d+))?)(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
\d]{2}))|[;:#&=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
2}))|[;:#&=])*))*)(?:\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
2}))|[;:#&=])*))?)?)|(?:ftp://(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?
:%[a-fA-F\d]{2}))|[;?&=])*)(?::(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-
fA-F\d]{2}))|[;?&=])*))?#)?(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-
)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?
:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?))(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!
*'(),]|(?:%[a-fA-F\d]{2}))|[?:#&=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[?:#&=])*))*)(?:;type=[AIDaid])?)?)|(?:news:(?:
(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;/?:&=])+#(?:(?:(
?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[
a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3})))|(?:[a-zA-Z](
?:[a-zA-Z\d]|[_.+-])*)|\*))|(?:nntp://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[
a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d
])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?)/(?:[a-zA-Z](?:[a-zA-Z
\d]|[_.+-])*)(?:/(?:\d+))?)|(?:telnet://(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+
!*'(),]|(?:%[a-fA-F\d]{2}))|[;?&=])*)(?::(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[;?&=])*))?#)?(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a
-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d]
)?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?))/?)|(?:gopher://(?:(?:
(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:
(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+
))?)(?:/(?:[a-zA-Z\d$\-_.+!*'(),;/?:#&=]|(?:%[a-fA-F\d]{2}))(?:(?:(?:[
a-zA-Z\d$\-_.+!*'(),;/?:#&=]|(?:%[a-fA-F\d]{2}))*)(?:%09(?:(?:(?:[a-zA
-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;:#&=])*)(?:%09(?:(?:[a-zA-Z\d$
\-_.+!*'(),;/?:#&=]|(?:%[a-fA-F\d]{2}))*))?)?)?)?)|(?:wais://(?:(?:(?:
(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:
[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?
)/(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)(?:(?:/(?:(?:[a-zA
-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)/(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(
?:%[a-fA-F\d]{2}))*))|\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]
{2}))|[;:#&=])*))?)|(?:mailto:(?:(?:[a-zA-Z\d$\-_.+!*'(),;/?:#&=]|(?:%
[a-fA-F\d]{2}))+))|(?:file://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]
|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:
(?:\d+)(?:\.(?:\d+)){3}))|localhost)?/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[?:#&=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(
?:%[a-fA-F\d]{2}))|[?:#&=])*))*))|(?:prospero://(?:(?:(?:(?:(?:[a-zA-Z
\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)
*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?)/(?:(?:(?:(?
:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:#&=])*)(?:/(?:(?:(?:[a-
zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:#&=])*))*)(?:(?:;(?:(?:(?:[
a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:#&])*)=(?:(?:(?:[a-zA-Z\d
$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:#&])*)))*)|(?:ldap://(?:(?:(?:(?:
(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:
[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?
))?/(?:(?:(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d])
)|(?:%20))+|(?:OID|oid)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%2
0)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
\d]{2}))*))(?:(?:(?:%0[Aa])?(?:%20)*)\+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?
:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID
|oid)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])
?(?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)))*)(?:(
?:(?:(?:%0[Aa])?(?:%20)*)(?:[;,])(?:(?:%0[Aa])?(?:%20)*))(?:(?:(?:(?:(
?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|o
id)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(
?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*))(?:(?:(?:
%0[Aa])?(?:%20)*)\+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?:(?:(?:[a-zA-Z\d]|%(
?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid)\.(?:(?:\d+)(?:
\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a
-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)))*))*(?:(?:(?:%0[Aa])?(?:%2
0)*)(?:[;,])(?:(?:%0[Aa])?(?:%20)*))?)(?:\?(?:(?:(?:(?:[a-zA-Z\d$\-_.+
!*'(),]|(?:%[a-fA-F\d]{2}))+)(?:,(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-f
A-F\d]{2}))+))*)?)(?:\?(?:base|one|sub)(?:\?(?:((?:[a-zA-Z\d$\-_.+!*'(
),;/?:#&=]|(?:%[a-fA-F\d]{2}))+)))?)?)?)|(?:(?:z39\.50[rs])://(?:(?:(?
:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?
:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))
?)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))+)(?:\+(?:(?:
[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))+))*(?:\?(?:(?:[a-zA-Z\d$\-_
.+!*'(),]|(?:%[a-fA-F\d]{2}))+))?)?(?:;esn=(?:(?:[a-zA-Z\d$\-_.+!*'(),
]|(?:%[a-fA-F\d]{2}))+))?(?:;rs=(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA
-F\d]{2}))+)(?:\+(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))+))*)
?))|(?:cid:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;?:#&=
])*))|(?:mid:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;?:#
&=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;?:#&=]
)*))?)|(?:vemmi://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z
\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\
.(?:\d+)){3}))(?::(?:\d+))?)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a
-fA-F\d]{2}))|[/?:#&=])*)(?:(?:;(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a
-fA-F\d]{2}))|[/?:#&])*)=(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d
]{2}))|[/?:#&])*))*))?)|(?:imap://(?:(?:(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+
!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+)(?:(?:;[Aa][Uu][Tt][Hh]=(?:\*|(?:(
?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+))))?)|(?:(?:;[
Aa][Uu][Tt][Hh]=(?:\*|(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2
}))|[&=~])+)))(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[
&=~])+))?))#)?(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])
?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:
\d+)){3}))(?::(?:\d+))?))/(?:(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:
%[a-fA-F\d]{2}))|[&=~:#/])+)?;[Tt][Yy][Pp][Ee]=(?:[Ll](?:[Ii][Ss][Tt]|
[Ss][Uu][Bb])))|(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))
|[&=~:#/])+)(?:\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[
&=~:#/])+))?(?:(?:;[Uu][Ii][Dd][Vv][Aa][Ll][Ii][Dd][Ii][Tt][Yy]=(?:[1-
9]\d*)))?)|(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~
:#/])+)(?:(?:;[Uu][Ii][Dd][Vv][Aa][Ll][Ii][Dd][Ii][Tt][Yy]=(?:[1-9]\d*
)))?(?:/;[Uu][Ii][Dd]=(?:[1-9]\d*))(?:(?:/;[Ss][Ee][Cc][Tt][Ii][Oo][Nn
]=(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~:#/])+)))?))
)?)|(?:nfs:(?:(?://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-
Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:
\.(?:\d+)){3}))(?::(?:\d+))?)(?:(?:/(?:(?:(?:(?:(?:[a-zA-Z\d\$\-_.!~*'
(),])|(?:%[a-fA-F\d]{2})|[:#&=+])*)(?:/(?:(?:(?:[a-zA-Z\d\$\-_.!~*'(),
])|(?:%[a-fA-F\d]{2})|[:#&=+])*))*)?)))?)|(?:/(?:(?:(?:(?:(?:[a-zA-Z\d
\$\-_.!~*'(),])|(?:%[a-fA-F\d]{2})|[:#&=+])*)(?:/(?:(?:(?:[a-zA-Z\d\$\
-_.!~*'(),])|(?:%[a-fA-F\d]{2})|[:#&=+])*))*)?))|(?:(?:(?:(?:(?:[a-zA-
Z\d\$\-_.!~*'(),])|(?:%[a-fA-F\d]{2})|[:#&=+])*)(?:/(?:(?:(?:[a-zA-Z\d
\$\-_.!~*'(),])|(?:%[a-fA-F\d]{2})|[:#&=+])*))*)?)))
Obviously this is as insane as I feared it would be, so I'll be rethinking the whole thing.
So, the answer is - yes, it can be done, but you should REALLY think twice whether you want to do it this way. Or accept that the regex will be imperfect.
Regex's are safe for lexical validity, but it doesn't mean the site is going to be there. You will actually have to test the connection to see if it's a valid URL by checking the response returned. In all, it depends on your user requirements to say what is valid and what is not - how safe/secure it is depends on you. If you have something like http://foo.com/?referral=http://bar.com/, it'll break some scripts because users don't expect another protocol/path combo as a parameter. Also, some other special characters and null byte hacks have been known to do fishy things with parameters, but I don't think there have been any successful Regex hacks - perhaps memory overflows?
If this is something that needs to be secure, it should probably be handled server side. Although you can perform server side Javascript, I would probably recommend Perl, since it was designed/developed as a text-based parser.
The Regex is only as advanced, or as limited, as you make it. Humans are logical, therefore the problems they solve can be solved by a logic engine (computers), provided that accurate instructions (in this case the regular expression pattern) are given to follow.
#Kerry:
In Javascript you don't "have to" put '/'s around the regular expression. There are also conditions where you can put it in quotes: var re = new Regexp("\w+");
Web Examples:
I think Kerry's was pretty nice, though I only gave it a glance w/o checking it, but here are some simpler examples found from the web
http://www.javascriptkit.com/script/script2/acheck.shtml:
// Email Check
var filter=/^([\w-]+(?:\.[\w-]+)*)#((?:[\w-]+\.)*\w[\w-]{0,66})\.([a-z]{2,6}(?:\.[a-z]{2})?)$/i
if (filter.test("email address or variable here")){...}
http://snippets.dzone.com/posts/show/452:
// URL Validation
function isUrl(s) {
var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/
return regexp.test(s);
}
Finally, this looks promising.
http://www.weberdev.com/get_example-4569.html:
// URL Validation
function isValidURL(url){
var RegExp = /^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?#)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/;
if(RegExp.test(url)){
return true;
}else{
return false;
}
}
// Email Validation
function isValidEmail(email){
var RegExp = /^((([a-z]|[0-9]|!|#|$|%|&|'|\*|\+|\-|\/|=|\?|\^|_|`|\{|\||\}|~)+(\.([a-z]|[0-9]|!|#|$|%|&|'|\*|\+|\-|\/|=|\?|\^|_|`|\{|\||\}|~)+)*)#((((([a-z]|[0-9])([a-z]|[0-9]|\-){0,61}([a-z]|[0-9])\.))*([a-z]|[0-9])([a-z]|[0-9]|\-){0,61}([a-z]|[0-9])\.)[\w]{2,4}|(((([0-9]){1,3}\.){3}([0-9]){1,3}))|(\[((([0-9]){1,3}\.){3}([0-9]){1,3})\])))$/
if(RegExp.test(email)){
return true;
}else{
return false;
}
}
i think that its safe
here is one sample
http://snippets.dzone.com/posts/show/452
I do believe it is safe, and on #1, that isn't a steadfast rule but more a guideline, speaking from personal experience.
This is the one I use to validate a URL:
(([\w]+:)?\/\/)(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?#)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?
Realize that in javascript you have to put '/'s around the regular expression.

Categories