Registration: check if an email exists without querying the database - javascript

Consider a multi-step registration form written in JavaScript (e.g. Angular), where I don't want to make any Ajax call until all steps are complete.
I thought about running a JavaScript function that takes the user's email as an argument and determines whether that email already exists in the database,
but I don't want that function to make an Ajax call.
Each time a user is registered in the database, this algorithm would have to be updated.
It's a little hard to explain, but I mean an algorithm that determines whether an item exists without having the items themselves. (I know it sounds a little silly, but it actually isn't.)
You read all items from the database once, and from them you derive an algorithm that can tell whether a given item exists in the database without knowing the items themselves.
Consider the login process: the system can determine whether a user's password is correct without knowing the password. It only knows something derived from it (a hash such as MD5, or ...).
So here we could run a function over the existing user table to produce some values or strings, and with those values detect whether a user's email already exists in the database without knowing all the items.
One reason I'm asking is performance (consider a user table with very many records), and the second reason is just to be fancy :)

Generate a hash from the email address. Suppose the hash is a 20-bit value (just take the bottom 20 bits of md5 hash, for example). That means you need a 128K byte table where each bit is either 0 or 1 depending on whether there is an email which hashes to that value. You can easily check for an email present by generating the hash and looking it up in the table. A 1 means either the email is used or there is a hash collision. A 0 guarantees the email was not used at the time the table was generated. To reduce the chance of a collision, make sure the number of bits in the table is much larger than the number of users. 20 bits gives 1 million hash buckets.
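A minimal sketch of the bit table described above, so it can be shipped to the client and checked without an Ajax call. It assumes a 32-bit FNV-1a hash as a stand-in for "the bottom 20 bits of md5" (any reasonably uniform hash would do), and the function names `hash20`, `buildTable` and `mightExist` are hypothetical:

```javascript
// FNV-1a over UTF-16 code units; a stand-in for md5, not the answer's exact hash.
function hash20(email) {
  const s = email.trim().toLowerCase(); // assumption: emails compare case-insensitively
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) & 0xfffff; // keep the bottom 20 bits
}

const TABLE_BITS = 1 << 20; // about 1 million buckets

// Built once on the server from the user table, then sent to the client.
function buildTable(emails) {
  const table = new Uint8Array(TABLE_BITS / 8); // 131072 bytes = 128 KiB
  for (const email of emails) {
    const h = hash20(email);
    table[h >> 3] |= 1 << (h & 7);
  }
  return table;
}

// true  -> email is taken OR there is a hash collision
// false -> email was definitely not in the list when the table was built
function mightExist(table, email) {
  const h = hash20(email);
  return (table[h >> 3] & (1 << (h & 7))) !== 0;
}
```

Because a 1 can be a collision and the table can be stale, the server would still need to check the email for real when the form is finally submitted.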

So for every request you are going to build a custom response script containing your entire dataset and send it back to the client for checking? Leaving aside the integrity problem if you don't re-check the email address sent back from the client, and the need to hash the data to prevent disclosure, this is not going to scale well. It will work really quickly with 10s or even 100s of rows, but it won't work for 10s of thousands of rows.

Related

XSS attack in JavaScript, how to unsanitize HTML [duplicate]

User equals untrustworthy. Never trust untrustworthy user's input. I get that. However, I am wondering when the best time to sanitize input is. For example, do you blindly store user input and then sanitize it whenever it is accessed/used, or do you sanitize the input immediately and then store this "cleaned" version? Maybe there are also some other approaches I haven't thought of in addition to these. I am leaning more towards the first method, because any data that came from user input must still be approached cautiously, where the "cleaned" data might still unknowingly or accidentally be dangerous. Either way, what method do people think is best, and for what reasons?
Unfortunately, almost none of the participants clearly understands what they are talking about. Literally. Only Kibbee managed to get it straight.
This topic is all about sanitization. But the truth is that the broad, "general purpose sanitization" everyone is so eager to talk about just doesn't exist.
There are a zillion different mediums, and each requires its own distinct data formatting. Moreover, even a single medium requires different formatting for its different parts. Say, HTML formatting is useless for JavaScript embedded in an HTML page. Or, string formatting is useless for numbers in an SQL query.
As a matter of fact, "sanitization as early as possible", as suggested in the most upvoted answers, is simply impossible, because one cannot tell in advance in which medium or medium part the data will be used. Say we are preparing to defend against "SQL injection", escaping everything that moves. But whoops! Some required fields weren't filled in, and we have to put the data back into the form instead of the database... with all the slashes added.
On the other hand, we diligently escaped all the "user input"... but in the SQL query we have no quotes around it, as it is a number or an identifier. And no "sanitization" ever helped us.
On the third hand: okay, we did our best in sanitizing the terrible, untrustworthy and disdained "user input"... but in some inner process we used this very data without any formatting (as we had done our best already!) and whoops! We have got second-order injection in all its glory.
So, from the real-life usage point of view, the only proper way would be:
formatting, not whatever "sanitization"
right before use
according to the rules of the particular medium
and even following the sub-rules required for this medium's different parts.
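To illustrate "formatting right before use, according to the medium", here is a small sketch that takes one raw value and formats it three different ways. The helper name `escapeHtml` and the sample value are assumptions made for illustration; `encodeURIComponent` and `JSON.stringify` are standard JavaScript built-ins:

```javascript
// HTML text/attribute context: replace the characters that HTML treats specially.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// The stored value stays raw; each medium gets its own formatting at output time.
const raw = 'Tom & "Jerry" <b>';

const asHtmlText = escapeHtml(raw);          // for <p>...</p>
const asUrlParam = encodeURIComponent(raw);  // for ?name=...
// For a string literal inside an inline <script>: JSON quoting, plus "<" escaped
// so the value can never close the script element.
const asJsString = JSON.stringify(raw).replace(/</g, "\\u003c");

console.log(asHtmlText); // Tom &amp; &quot;Jerry&quot; &lt;b&gt;
console.log(asUrlParam); // Tom%20%26%20%22Jerry%22%20%3Cb%3E
console.log(asJsString); // "Tom & \"Jerry\" \u003cb>"
```

None of the three outputs is usable in the other two contexts, which is why a single up-front "sanitized" copy cannot serve all of them.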
It depends on what kind of sanitizing you are doing.
For protecting against SQL injection, don't do anything to the data itself. Just use prepared statements, and that way, you don't have to worry about messing with the data that the user entered, and having it negatively affect your logic. You have to sanitize a little bit, to ensure that numbers are numbers, and dates are dates, since everything is a string as it comes from the request, but don't try to do any checking to do things like block keywords or anything.
For protecting against XSS attacks, it would probably be easier to fix the data before it's stored. However, as others mentioned, sometimes it's nice to have a pristine copy of exactly what the user entered, because once you change it, it's lost forever. It's almost too bad there's not a foolproof way to ensure your application only puts out sanitized HTML, the way you can ensure you don't get caught by SQL injection by using prepared queries.
I sanitize my user data much like Radu...
First client-side using both regex's and taking control over allowable characters
input into given form fields using javascript or jQuery tied to events, such as
onChange or OnBlur, which removes any disallowed input before it can even be
submitted. Realize however, that this really only has the effect of letting those
users in the know, that the data is going to be checked server-side as well. It's
more a warning than any actual protection.
Second, and I rarely see this done these days anymore, that the first check being
done server-side is to check the location of where the form is being submitted from.
By only allowing form submission from a page that you have designated as a valid
location, you can kill the script BEFORE you have even read in any data. Granted,
that in itself is insufficient, as a good hacker with their own server can 'spoof'
both the domain and the IP address to make it appear to your script that it is coming
from a valid form location.
Next, and I shouldn't even have to say this, but always, and I mean ALWAYS, run
your scripts in taint mode. This forces you to not get lazy, and to be diligent about
step number 4.
Sanitize the user data as soon as possible using well-formed regexes appropriate to
the data that is expected from any given field on the form. Don't take shortcuts like
the infamous 'magic horn of the unicorn' to blow through your taint checks...
or you may as well just turn off taint checking in the first place for all the good
it will do for your security. That's like giving a psychopath a sharp knife, baring
your throat, and saying 'You really won't hurt me with that, will you?'
And here is where I differ from most others in this fourth step, as I only sanitize
the user data that I am going to actually USE in a way that may present a security
risk, such as any system calls, assignments to other variables, or any writing to
store data. If I am only using the data input by a user to make a comparison to data
I have stored on the system myself (therefore knowing that data of my own is safe),
then I don't bother to sanitize the user data, as I am never going to use it in a way
that presents itself as a security problem. For instance, take a username input as
an example. I use the username input by the user only to check it against a match in
my database, and if true, after that I use the data from the database to perform
all other functions I might call for it in the script, knowing it is safe, and never
use the users data again after that.
Last, is to filter out all the attempted auto-submits by robots these days, with a
'human authentication' system, such as Captcha. This is important enough these days
that I took the time to write my own 'human authentication' schema that uses photos
and an input for the 'human' to enter what they see in the picture. I did this because
I've found that Captcha type systems really annoy users (you can tell by their
squinted-up eyes from trying to decipher the distorted letters... usually over and
over again). This is especially important for scripts that use either SendMail or SMTP
for email, as these are favorites for your hungry spam-bots.
To wrap it up in a nutshell, I'll explain it as I do to my wife... your server is like a popular nightclub, and the more bouncers you have, the less trouble you are likely to have
in the nightclub. I have two bouncers outside the door (client-side validation and human authentication), one bouncer right inside the door (checking for valid form submission location... 'Is that really you on this ID'), and several more bouncers in
close proximity to the door (running taint mode and using good regexes to check the
user data).
I know this is an older post, but I felt it important enough for anyone that may read it after my visit here to realize there is no 'magic bullet' when it comes to security, and it takes all of these working in conjunction with one another to make your user-provided data secure. Just using one or two of these methods alone is practically worthless, as their power only exists when they all team together.
Or in summary, as my Mum would often say... 'Better safe than sorry'.
UPDATE:
One more thing I am doing these days, is Base64 encoding all my data, and then encrypting the Base64 data that will reside on my SQL Databases. It takes about a third more total bytes to store it this way, but the security benefits outweigh the extra size of the data in my opinion.
I like to sanitize it as early as possible, which means the sanitizing happens when the user tries to enter invalid data. If there's a TextBox for their age, and they type in anything other than a number, I don't let the keypress for the letter go through.
Then, whatever is reading the data (often a server) I do a sanity check when I read in the data, just to make sure that nothing slips in due to a more determined user (such as hand-editing files, or even modifying packets!)
Edit: Overall, sanitize early and sanitize any time you've lost sight of the data for even a second (e.g. File Save -> File Open)
The most important thing is to always be consistent in when you escape. Accidental double sanitizing is lame and not sanitizing is dangerous.
For SQL, just make sure your database access library supports bind variables which automatically escapes values. Anyone who manually concatenates user input onto SQL strings should know better.
For HTML, I prefer to escape at the last possible moment. If you destroy user input, you can never get it back, and if they make a mistake they can edit and fix later. If you destroy their original input, it's gone forever.
Early is good, definitely before you try to parse it. Anything you're going to output later, or especially pass to other components (i.e., shell, SQL, etc) must be sanitized.
But don't go overboard - for instance, passwords are hashed before you store them (right?). Hash functions can accept arbitrary binary data. And you'll never print out a password (right?). So don't parse passwords - and don't sanitize them.
Also, make sure that you're doing the sanitizing from a trusted process - JavaScript/anything client-side is worse than useless security/integrity-wise. (It might provide a better user experience to fail early, though - just do it both places.)
My opinion is to sanitize user input as soon as possible, both client side and server side. I do it like this:
(client side) Allow the user to enter only specific keys in the field.
(client side) When the user moves to the next field (onblur), test the input against a regexp and notify the user if something is wrong.
(server side) Test the input again: if the field should be an INTEGER, check for that (in PHP you can use is_numeric()); if the field has a well-known format, check it against a regexp; for everything else (like text comments), just escape it. If anything is suspicious, stop script execution and return a notice to the user that the data they entered is invalid.
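The server-side step above could look roughly like the following sketch. The field names (`age`, `zip`, `comment`), the five-digit pattern and the function name `validateForm` are all hypothetical; the point is only the shape: check integers as integers, check known formats with a regexp, and reject the whole submission if anything fails.

```javascript
const ZIP_PATTERN = /^\d{5}$/; // assumed "well-known format" for illustration

function validateForm(input) {
  const errors = [];

  // Everything arrives as a string, so check the text before converting it.
  const age = String(input.age ?? "");
  if (!/^\d{1,3}$/.test(age)) errors.push("age must be a whole number");

  const zip = String(input.zip ?? "");
  if (!ZIP_PATTERN.test(zip)) errors.push("zip must be five digits");

  if (errors.length > 0) return { ok: false, errors };

  return {
    ok: true,
    value: {
      age: Number(age),
      zip,
      // Free text is passed through unchanged here; how it must be escaped
      // depends on where it is eventually used (SQL, HTML, ...).
      comment: String(input.comment ?? ""),
    },
  };
}
```

A caller would stop processing and show `errors` to the user whenever `ok` is false.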
If something really looks like a possible attack, the script sends an email and an SMS to me, so I can check and maybe prevent it as soon as possible. I just need to check the log, where I record all user inputs and the steps the script took before accepting or rejecting the input.
Perl has a taint option which considers all user input "tainted" until it's been checked with a regular expression. Tainted data can be used and passed around, but it taints any data that it comes in contact with until untainted. For instance, if user input is appended to another string, the new string is also tainted. Basically, any expression that contains tainted values will output a tainted result.
Tainted data can be thrown around at will (tainting data as it goes), but as soon as it is used by a command that has effect on the outside world, the perl script fails. So if I use tainted data to create a file, construct a shell command, change working directory, etc, Perl will fail with a security error.
I'm not aware of another language that has something like "taint", but using it has been very eye opening. It's amazing how quickly tainted data gets spread around if you don't untaint it right away. Things that are natural and normal for a programmer, like setting a variable based on user data or opening a file, seem dangerous and risky with tainting turned on. So the best strategy for getting things done is to untaint as soon as you get some data from the outside.
And I suspect that's the best way in other languages as well: validate user data right away so that bugs and security holes can't propagate too far. Also, it ought to be easier to audit code for security holes if the potential holes are in one place. And you can never predict which data will be used for what purpose later.
Clean the data before you store it. Generally you shouldn't be performing ANY SQL actions without first cleaning up input. You don't want to subject yourself to a SQL injection attack.
I sort of follow these basic rules.
Only do modifying SQL actions, such as, INSERT, UPDATE, DELETE through POST. Never GET.
Escape everything.
If you are expecting user input to be something, make sure you check that it is that something. For example, if you are requesting a number, then make sure it is a number. Use validations.
Use filters. Clean up unwanted characters.
Users are evil!
Well, perhaps not always, but my approach is to always sanitize immediately to ensure nothing risky goes anywhere near my backend.
The added benefit is that you can provide feedback to the user if you sanitize at the point of input.
Assume all users are malicious.
Sanitize all input as soon as possible.
Full stop.
I sanitize my data right before I do any processing on it. I may need to take the First and Last name fields and concatenate them into a third field that gets inserted to the database. I'm going to sanitize the input before I even do the concatenation so I don't get any kind of processing or insertion errors. The sooner the better. Even using Javascript on the front end (in a web setup) is ideal because that will occur without any data going to the server to begin with.
The scary part is that you might even want to start sanitizing data coming out of your database as well. The recent surge of ASPRox SQL Injection attacks that have been going around are doubly lethal because it will infect all database tables in a given database. If your database is hosted somewhere where there are multiple accounts being hosted in the same database, your data becomes corrupted because of somebody else's mistake, but now you've joined the ranks of hosting malware to your visitors due to no initial fault of your own.
Sure this makes for a whole lot of work up front, but if the data is critical, then it is a worthy investment.
User input should always be treated as malicious before it makes its way down into the lower layers of your application. Always sanitize input as soon as possible; it should not, for any reason, be stored in your database before it has been checked for malicious intent.
I find that cleaning it immediately has two advantages. One, you can validate against it and provide feedback to the user. Two, you do not have to worry about consuming the data in other places.

How to write data into a specific location Firebase Web

I want to write data into a specific location in the database. Let's say, I have a couple of users in the database. Each of them has their own personal information, including their e-mails. I want to find the user based on the e-mail, that's to say by using his e-mail (but I don't know exactly whose e-mail it is, but whoever it is do something with that user's information). To be more visible, here is my database sample.
Now, while working on one of my javascript files, when the user let's say name1 changes his name, I update my object in javascript and want to replace the whole object under ID "-LEp2F2fSDUt94SRU0cx". To cut short, I want to write this updated object in the path ("Users/-LEp2F2fSDUt94SRU0cx") without doing it by hand and just "knowing" the e-mail. So the logic is "Go find the user with the e-mail "name1#yahoo.com" and replace the whole object with his new updated object". I tried to use orderByChild("Email").equalTo("name1#yahoo.com").set(updated_object), but this syntax does not work I guess. Hopefully I could explain myself.
The first part is the query, that is separate from the post to update. This part is the query to get the value:
ref.child('users').orderByChild("Email").equalTo("name1#yahoo.com")
To update, you need to do something like this once you have the user id from the query result:
ref.child('users').child(userId).update({ Email: newValue });
firebase.database.Query
A Query sorts and filters the data at a Database location so only a
subset of the child data is included. This can be used to order a
collection of data by some attribute (for example, height of
dinosaurs) as well as to restrict a large list of items (for example,
chat messages) down to a number suitable for synchronizing to the
client. Queries are created by chaining together one or more of the
filter methods defined here.
// Find all dinosaurs whose height is exactly 25 meters.
var ref = firebase.database().ref("dinosaurs");
ref.orderByChild("height").equalTo(25).on("child_added", function(snapshot) {
  console.log(snapshot.key);
});
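Putting the two parts of the answer together (query first, then write), a sketch for the original question might look like this. It assumes the namespaced Firebase Realtime Database web API used in the snippets above, that `usersRef` is something like `firebase.database().ref("Users")`, and that the function name `replaceUserByEmail` is made up for illustration:

```javascript
// Finds every child whose "Email" equals `email` and overwrites the whole
// node with `updatedUser`. Resolves to the number of nodes replaced.
async function replaceUserByEmail(usersRef, email, updatedUser) {
  const snapshot = await usersRef
    .orderByChild("Email")
    .equalTo(email)
    .once("value");

  const writes = [];
  snapshot.forEach(function (childSnapshot) {
    // childSnapshot.ref points at Users/<pushId>; set() replaces that object.
    writes.push(childSnapshot.ref.set(updatedUser));
    // No return value here: returning true would stop the iteration early.
  });

  await Promise.all(writes);
  return writes.length;
}
```

The query itself cannot be written to, which is why `set()` is called on the reference of each matching child instead of on the query.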

Is it bad practice to pass javascript calculated values to database?

Lets assume I have a Form which has 3 input fields:
gross amount
value added tax rate
net amount
The user can fill out the gross amount and the value added tax rate field. The net amount field is set readonly.
Now I want to save time and increase performance by letting JavaScript calculate the net amount, fill the required field and pass this value to the database.
The only thing I would check is if the net amount field is empty e.g. with Symfony NotBlank constraint.
Is it bad practice to use the JavaScript-calculated values?
Never trust the user. Do everything you can on the server side, even if you've already done it on the client side, and especially a trivially simple calculation like this one, where skipping it saves almost no time on the server.
EDIT: Unless, as Sharky says, you don't actually care about the two other values, and are providing the calculation only as a courtesy to the user. In this case, net value is the real input field - even if it is technically readonly. Validate it and store it as such, and don't even bother transmitting the other two.
There are three dangers here:
Malicious code injection
This occurs by displaying something a user entered directly back onto the screen or another screen where the user has tampered with the data. For example, a user is asked for their name and types <script>alert('You got hacked!');</script> instead of their name. Every user that views their profile will see this code, and if you don't check, it is possible it will execute it on every person's machine that looks.
This would happen if you had a page in the system that displayed the user entered data back without checking it first, in your case perhaps a page that displays the current values from the database.
You can avoid this situation by sanitising what the user types in, or by checking it on the way out of the database.
For your situation though, if you store the values in the database as numbers instead of strings, then you won't have a problem.
Manipulating values in javascript
You have a much bigger problem though, and that is that you should perform calculations on the server side. It is possible for the user to manipulate the value of the net amount and send anything they want to the database.
If a user is buying something from an online store, for example, and they add 5 items to their basket costing £1.00 each, then if you calculate the total in javascript (£5.00) and submit and store that, it is possible for the user to edit it and change the total to £0 and get the items for free.
For security you should of course calculate the cost on the server, and use values that you personally retrieve from the database - don't re-use any data sent from the user in the backend because they could also edit the individual item cost to £0 as well.
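As a rough sketch of that advice, the server can accept only item identifiers and quantities from the client and look the prices up itself. The `PRICE_PENCE` map stands in for a database lookup, and the names `computeTotalPence`, `sku` and `quantity` are assumptions made for this example:

```javascript
// Stand-in for prices read from the database on the server.
const PRICE_PENCE = new Map([
  ["pen", 100],
  ["notebook", 250],
]);

// Any price or total the client sends along is simply ignored.
function computeTotalPence(basket) {
  let total = 0;
  for (const { sku, quantity } of basket) {
    const price = PRICE_PENCE.get(sku);
    if (price === undefined) throw new Error("unknown item: " + sku);
    if (!Number.isInteger(quantity) || quantity < 1) {
      throw new Error("bad quantity for " + sku);
    }
    total += price * quantity;
  }
  return total;
}

// A tampered request claiming a price of 0 still costs 500 pence.
console.log(computeTotalPence([{ sku: "pen", quantity: 5, price: 0 }])); // 500
```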
Javascript number accuracy
Just as a bonus, you should also be aware that your front end and back end may calculate values differently when you're adding and multiplying numbers together. Consider this code:
var total = 0.3 - 0.2;
If you expect total to be 0.1, you're wrong: the result is 0.09999999999999998.
JavaScript numbers are IEEE 754 double-precision binary floating point, so most decimal fractions cannot be represented exactly.
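One common way around this, shown here only as a sketch, is to do money arithmetic in integer minor units (cents) and round once at the end. The function name `netFromGrossCents` and the assumption that gross = net × (1 + VAT rate) are mine, not the question's:

```javascript
// The subtraction from the example above, for comparison.
const diff = 0.3 - 0.2;
console.log(diff === 0.1); // false

// The same subtraction done in integer cents gives the expected value.
console.log((30 - 20) / 100 === 0.1); // true

// Net amount from a gross amount in cents and a VAT rate in whole percent.
// Assumes gross = net * (1 + rate); the result is rounded to the nearest cent.
function netFromGrossCents(grossCents, vatPercent) {
  return Math.round((grossCents * 100) / (100 + vatPercent));
}

console.log(netFromGrossCents(11900, 19)); // 10000
```

The same function can run on the server to recompute the net amount from the submitted gross amount and rate, instead of trusting the read-only field.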

How to generate unique number from string in node.js

I would like to generate a unique number from string. The string is a combination of username and password. I would like to generate a unique number id (not string) from this combination. I first md5 the combination and then convert it to number. The number length needs to be 10. Any suggestions?
It would be best if you can provide more details about the third-party you're trying to interface with, because this is a very odd request and it contains a fundamental flaw. You ask for the number to be unique, but you are allowing for only 10 decimal ("number id") digits, or ~10 billion possible values.
This sounds like an awful lot but it's really not. This gives you a hash of just over 33 bits. The simple hash collision probability calculator at http://davidjohnstone.net/pages/hash-collision-probability puts this at a 44% chance of a collision at just 100,000 entries. But that assumes full usage of all the available input characters. Since username and password combinations are almost always limited to alphabetic and numeric characters, the real collision chance is much worse at far fewer entries (can't be calculated without knowing the characters you allow for these fields - but it's bad).
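The 44% figure can be reproduced approximately with the standard birthday-problem approximation, p ≈ 1 − exp(−n(n−1) / (2N)), where n is the number of entries and N the number of possible hash values. The function name below is hypothetical, and the approximation assumes uniformly distributed hashes:

```javascript
// Approximate probability that at least two of `entries` uniformly random
// values collide when there are `possibleValues` distinct values.
function collisionProbability(entries, possibleValues) {
  const exponent = -(entries * (entries - 1)) / (2 * possibleValues);
  return 1 - Math.exp(exponent);
}

// 100,000 entries in a 33-bit space, as in the calculator result quoted above.
console.log(collisionProbability(100000, 2 ** 33).toFixed(2)); // 0.44

// The same number of entries over exactly 10^10 ten-digit values.
console.log(collisionProbability(100000, 1e10).toFixed(2)); // 0.39
```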
NodeJS provides numerous crypto functions in the crypto module. A whole set of hashing functions is available, including the ideal-case SHA* options. These can be used to provide safe, irreversible hashes with astronomically low collision probabilities.
If these options are not usable for you, I would suggest you have a fundamental design flaw. You're almost certainly mapping a user/pass combination to a userID in a remote system in a way that an attacker would find easy to compromise with a simple brute-force attack, given the high collision risk in your model.
If you are doing what I think you are doing, the "right" way to do this would be to have a simple database on a server somewhere. The user/pass would be assigned a unique ID in there, and it doesn't matter what this is - it could be an auto-increment ID field in a single MySQL table. The server would then contact this remote service with the ID value for any API calls necessary, and return the results to the user. This eliminates the security risk because the username/password are not actually hashed, just stored, and can be checked 100% on every call.
Never use a hash as a primary data value. It's a simplification, not a real value on its own.

Captcha always show the same number in my first login

I am using the JavaScript below to load the captcha on my site. This works fine, but on my first login it always shows the value "5AbD" by default. How can I change it?
http://wiki.asp.net/page.aspx/1369/simple-captcha-code-in-javascript/
I used the JavaScript from the link above. I am not able to post the script here.
You don't provide many details, but often something like this is related to a random number being generated every time with the same seed. If you rolled your own captcha, I would look into how you are generating the string. Commonly one passes the system tick count as the random number seed.
Otherwise, you don't really provide enough information for anyone to give a helpful answer.
Edit:
1) After seeing your code, first I want to say that as a captcha this is extremely flawed. The whole point is that a bot can't determine the code and automatically enter it. This is why they are usually images generated on the server. It is difficult to extract the value from an image.
2) It is showing the same value every time because you have not coded it otherwise. You are literally starting with the same hard-coded value and modifying that. Look into the Math.random() function for generating a random number for the initial value instead of hard-coding it. But, referencing point #1, I would scrap this whole JavaScript approach altogether, because as a captcha it's useless: a bot could just grab the value of that control and fill out the form with it.
3) The steps for implementing captcha are usually something like: generate the random string on the server, save that string to session, generate an image with that string (with some noise/font funkiness to prevent an image processor from easily being able to read the text), then display the image on the page. The actual string value never leaves the server. When the form is submitted, you just compare the user value with the value you previously stored in the session. But rather than go through all of that (unless the whole point is as a learning exercise), you might think about using any of the pre-made captcha controls such as recaptcha, etc. Either way, look into the random number function, because as long as you are always starting with the same hard coded values then you will always get the same result.
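The server-side flow in point 3 can be sketched as follows, with the image rendering left out. The `session` argument is assumed to be whatever per-user session object the server framework provides, and the function names and alphabet are hypothetical. `Math.random()` is used because the answer mentions it; a cryptographically secure random source would be a better fit for anything that needs to resist guessing.

```javascript
// Characters that are hard to confuse with each other once distorted.
const ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

function newCaptchaCode(length) {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return code;
}

// Stores the code in the session and returns it for the image renderer only.
// The string itself should never be written into the page.
function issueCaptcha(session) {
  session.captcha = newCaptchaCode(4);
  return session.captcha;
}

// Compares the submitted value with the stored one; each code is single use.
function checkCaptcha(session, userInput) {
  const expected = session.captcha;
  delete session.captcha;
  return typeof expected === "string" &&
    String(userInput).toUpperCase() === expected;
}
```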
