Regex to pull out querystring variables and values - javascript

I am using JavaScript and trying to separate query string variables from their values. I made a regex that works just fine if the only ampersands are the ones separating variables; otherwise the data is cut off at the ampersand.
example: ajax=10&a=test&b=cats & dogs returns a = "test", b = "cats "
I cannot encode the ampersands before the string is made due to the nature of this project and the inefficiency with encoding/replacing characters in hundreds of locations upon entry.
What this piece of code should ultimately do is turn the querystring ajax=10&a=cats & dogs into ajax=10&a=cats%20%26%20dogs
list = [ 'ajax','&obj','&a','&b','&c','&d','&e','&f','&g','&h','&m' ];
ajax_string = '';
for (var i=0, li=list.length; i<li; i++) {
variables = new RegExp(list[i] +"=([^&(.+)=]*)");
query_string = variables.exec(str);
if (query_string != null) {
alert(query_string);
}
}

The query string should be split on ampersands. Any ampersands in the values of actual arguments should be converted to %26.
This is what the query string you posted should look like:
ajax=10&a=test&b=cats+%26+dogs
The query string you posted should give you this:
'ajax': '10'
'a': 'test'
'b': 'cats '
' dogs': ''
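One way to check that parse yourself is a short sketch using the standard URLSearchParams class (available in modern browsers and Node.js; it is not part of the original post, and the variable names are only illustrative):

```javascript
// Parse the unescaped query string exactly as posted.
const params = new URLSearchParams('ajax=10&a=test&b=cats & dogs');

// Collect the [name, value] pairs in the order they were read.
const entries = Array.from(params.entries());
console.log(entries);
```

The stray " dogs" segment shows up as its own parameter with an empty value, which is why the ampersand has to be escaped as %26 before parsing.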
Edit
It looks like you actually want to sanitize a query string that other developers have built lazily. If we assume that: a) every argument name matches /[a-zA-Z0-9]+/; and b) it is always followed by an equals sign, then this code will work:
var queryString = 'ajax=10&a=test&b=cats & dogs';
// Split only on ampersands that are followed by "name=".
var parts = queryString.split(/&(?=[a-zA-Z0-9]+\=)/);
for (var i = 0; i < parts.length; i++) {
    var index = parts[i].indexOf('=') + 1;
    if (index > 0) {
        // encodeURIComponent replaces the deprecated escape()
        parts[i] = parts[i].substring(0, index) + encodeURIComponent(parts[i].substring(index));
    }
    //else: error?
}
queryString = parts.join("&");
alert("queryString: " + queryString);

> I cannot encode the ampersands before the string is made due to the nature of this project
Then you won't have a foolproof answer.
Ampersands ("&") separate query parameters in url query strings. You can't have it both ways where some of your query parameter values contain un-escaped "&" and expect a parser based on this simple rule to know the difference.
If you can't escape "&" as "%26" in each value component beforehand, then you can never know that the values you get are correct. The best you could do is: If the value to the right of an "&" and before the next "&" does not contain an equal sign "=", you append the value to the previous value read, or the empty string if this is the first value read.
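This calls for a small hand-written parser rather than a single regular expression; JavaScript does support lookahead in regular expressions, but a lookahead only helps if you can assume what a parameter name looks like.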
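A minimal sketch of that merging rule, assuming a segment without "=" always belongs to the previous value. The function name repairQueryString is hypothetical, and the encoding step with encodeURIComponent is an addition for illustration:

```javascript
// Hypothetical helper: glue segments that lack "=" back onto the previous
// parameter, then percent-encode each value.
function repairQueryString(raw) {
  const merged = [];
  for (const segment of raw.split('&')) {
    if (segment.includes('=') || merged.length === 0) {
      // Looks like "name=value" (or it is the very first segment).
      merged.push(segment);
    } else {
      // No "=": treat it as the rest of the previous value.
      merged[merged.length - 1] += '&' + segment;
    }
  }
  return merged
    .map((pair) => {
      const eq = pair.indexOf('=');
      if (eq === -1) return encodeURIComponent(pair);
      return pair.slice(0, eq + 1) + encodeURIComponent(pair.slice(eq + 1));
    })
    .join('&');
}

const repaired = repairQueryString('a=test&b=cats & dogs&c=test');
console.log(repaired);
```

The guess breaks down as soon as a value legitimately contains "&name=", which is the reason escaping at the source remains the only reliable fix.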
Note however that an algorithm like that completely bypasses the spec. Presuming for a moment that the query string:
a=test&b=cats & dogs&c=test
is valid, technically that string contains 4 parameters: "a" (with a value of "test"), "b" (with a value of "cats "), " dogs" (with no value), and "c" (with a value of "test").
If you don't change the query string at the source (and properly escape the value component), you're just hacking in the wrong solution.
Good luck.

Related

Best practice for converting string to object in JavaScript

I am working on a small UI for JSON editing which includes some object and string manipulation. I was able to make it work, but one of the fields is bit tricky and I would be grateful for an advice.
Initial string:
'localhost=3000,password=12345,ssl=True,isAdmin=False'
Should be converted to this:
{ app_server: 'localhost:3000', app_password:'12345', app_ssl: 'True', app_isAdmin: 'False' }
I was able to do that by first splitting the string with the ',' which returns an array. And then I would loop through the second array and split by '='. In the last step I would simply use forEach to loop through the array and create an object:
const obj = {}
arr2.forEach((item) => (obj[`app_${item[0]}`] = item[1]));
This approach works, but in case some of the fields, i.e password contains ',' or '=', my code will break. Any idea on how to approach this? Would some advanced regex be a good idea?
Edit: In order to make things simple, it seems that I have caused an opposite effect, so I apologize for that.
The mentioned string is a part of larger JSON file, it is the one of the values. On the high level, I am changing the shape of the object, every value that has the structure I described 'server='something, password=1234, ssl=True', has to be transformed into separate values which will populate the input fields. After that, user modify them or simply download the file (I have separate logic for joining the input fields into the initial shape again)
Observation/Limitation with the design that you have :
As per your comment, none of the special characters are escaped in any way, so how should we read the string password=12345,ssl=True? Is it app_password: '12345,ssl=True' or app_password: '12345'?
Why is localhost=3000 converted into app_server: 'localhost:3000' instead of app_localhost: '3000' like the other keys? Is there a special requirement for this?
You have to design your password field so that it does not accept at least the , character, which is used to split the string.
Here you go, If we can correct the above mentioned design observations :
const str = 'localhost=3000,password=123=45,ssl=True,isAdmin=False';
const splittedStr = str.split(',');
const result = {};
splittedStr.forEach(s => {
const [key, ...values] = s.split('=')
const value = values.join('=');
result[`app_${key}`] = value
});
console.log(result);
As you can see in above code snippet, I added password value as 123=45 and it is working properly as per the requirement.
You can use a regular expression that matches key and value in the key=value format, and will capture anything between single quotes when the value happens to start with a single quote:
(\w+)=(?:'((?:\\.|[^'])*)'|([^,]+))
This assumes that:
The key consists of alphanumerical characters and underscores only
There is no white space around the = (any space that follows it, is considered part of the value)
If the value starts with a single quote, it is considered a delimiter for the whole value, which will be terminated by another quote that must be followed by a comma, or must be the last character in the string.
If the value is not quoted, all characters up to the next comma or end of the string will be part of the value.
As you've explained that the first part does not follow the key=value pattern, but is just a value, we need to deal with this exception. I suggest prefixing the string with server=, so that now also that first part has the key=value pattern.
Furthermore, as this input is part of a value that occurs in JSON, it should be parsed as a JSON string (double quoted), in order to decode any escaped characters that might occur in it, like for instance \n (backslash followed by "n").
Since it was not clarified how quotes would be escaped when they occur in a quoted string, it remains undecided how for instance a password (or any text field) can include a quote. The above regex will require that if there is a character after a quote that is not a comma, the quote will be considered part of the value, as opposed to terminating the string. But this is just shifting the problem, as now it is impossible to encode the sequence ', in a quoted field. If ever this point is clarified, the regex can be adapted accordingly.
Implementation in JavaScript:
// The "s" flag lets "." match the decoded newline inside a quoted value.
const regex = /(\w+)=(?:'(.*?)'(?![^,])|([^,]+))/gs;
function parse(s) {
return Object.fromEntries(Array.from(JSON.parse('"server=' + s + '"').matchAll(regex),
([_, key, quoted, value]) => ["app_" + key, quoted ?? (isNaN(value) ? value : +value)]
));
}
// demo:
// Password includes here a single quote and a JSON encoded newline character
const s = "localhost:3000, password='12'\\n345', ssl='True', isAdmin='False'";
console.log(parse(s));

Javascript Regular expression not working as expected

I have a string which is in the form of JSON but is not a valid JSON string. The string is shown below (it's a single-line string, but I have added new lines for clarity).
"{
clientId :\"abc\",
note:\"ATTN:Please take care of item x\"
}"
I am trying to fix it (reformatting it into valid JSON) using a JavaScript regular expression. I am currently using the following regular expression, but it's not working for the second property, i.e. note, as it has a colon (:) in its value.
retObject.replace(/(['"])?([a-zA-Z0-9_]+)(['"])?:/g, '"$2": ');
What I am trying to do here is using regular expression to reformat above string to
"{
"clientId" :"abc",
"note":"ATTN:Please take care of item x"
}"
I tried many ways but couldn't get it just right, as I am still a beginner in RegEx.
Try using .split() with RegExp /[^\w\s\:]/ , .test() with RegExp /\:$/ , .match() with RegExp /\w+/
var str = "{clientId :\"abc\",note:\"ATTN:Please take care of item x\"}";
var res = {};
var arr = str.split(/[^\w\s\:]/).filter(Boolean);
for (var i = 0; i < arr.length; i++) {
if ( /\:$/.test(arr[i]) ) {
res[ arr[i].match(/\w+/) ] = arr[i + 1]
}
}
console.log(res)
Trying to fix broken JSON with a regexp is a fool's errand. Just when you think you have the regexp working, you will be presented with additional gobbledygook such as
"{ clientId :\"abc\", note:\"ATTN:Please take \"care\" of item x\" }"
where one of the strings has double quotes inside of it, and now your regexp will fail.
For your own sanity and that of your entire team, both present and future, have the upstream component that is producing this broken JSON fixed. All languages in the world have perfectly competent JSON serializers which will create conformant JSON. Tell the upstream folks to use them.
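As a sketch of what the upstream component could do instead (assuming it is written in JavaScript; the payload object is made up for illustration), the built-in serializer handles embedded quotes and colons on its own:

```javascript
// Hypothetical payload with a colon and double quotes inside a value.
const payload = { clientId: 'abc', note: 'ATTN:Please take "care" of item x' };

// JSON.stringify quotes the keys and escapes the inner double quotes.
const json = JSON.stringify(payload);
console.log(json);

// The result parses back to the same data without any regex repair.
const roundTripped = JSON.parse(json);
```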
If you have absolutely no choice, use the much-reviled eval. Meet evil with evil:
eval('(' + json.replace(/\\"/g, '"') + ')')

Capitalize the first letter of each word

var name = "AlbERt EINstEiN";
function nameChanger(oldName) {
var finalName = oldName;
// Your code goes here!
finalName = oldName.toLowerCase();
finalName = finalName.replace(finalName.charAt(0), finalName.charAt(0).toUpperCase());
for(i = 0; i < finalName.length; i++) {
if (finalName.charAt(i) === " ")
finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
}
// Don't delete this line!
return finalName;
};
// Did your code work? The line below will tell you!
console.log(nameChanger(name));
My code as is, returns 'Albert einstein'. I'm wondering where I've gone wrong?
If I add in
console.log(finalName.charAt(i+1));
AFTER the if statement, and comment out the rest, it prints 'e', so it recognizes charAt(i+1) like it should... I just cannot get it to capitalize that first letter of the 2nd word.
There are two problems with your code sample. I'll go through them one-by-one.
Strings are immutable
This doesn't work the way you think it does:
finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
You need to change it to:
finalName = finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
In JavaScript, strings are immutable. This means that once a string is created, it can't be changed. That might sound strange since in your code, it seems like you are changing the string finalName throughout the loop with methods like replace().
But in reality, you aren't actually changing it! The replace() function takes an input string, does the replacement, and produces a new output string, since it isn't actually allowed to change the input string (immutability). So, tl;dr, if you don't capture the output of replace() by assigning it to a variable, the replaced string is lost.
Incidentally, it's okay to assign it back to the original variable name, which is why you can do finalName = finalName.replace(...).
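A small sketch of the difference, using made-up variable names:

```javascript
const original = 'albert';

// The return value is thrown away, so nothing observable happens.
original.replace('a', 'A');

// Capturing the return value is what gives you the changed string.
const capitalized = original.replace('a', 'A');

console.log(original);    // still the lower-case input
console.log(capitalized); // the new string returned by replace()
```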
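Replace only hits the first match
The other problem you'll run into is that replace() with a string pattern doesn't know anything about the position you are examining. It replaces the first matching character it finds anywhere in the string, so if you tell it to replace 'e' with 'E', it changes the first 'e' in the string, which may not be the one at i + 1. (With a global regex such as /e/g it would replace all of them, which isn't what you want either.)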
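To check how replace() behaves with a string pattern versus a global regex, here is a small sketch (the variable names are illustrative only). With a string pattern only the first match in the whole string is swapped, which is still not the letter at i + 1:

```javascript
const text = 'albert einstein';
const i = 6; // index of the space

// String pattern: only the FIRST "e" in the whole string is replaced.
const firstOnly = text.replace(text.charAt(i + 1), text.charAt(i + 1).toUpperCase());

// Global regex: EVERY "e" is replaced.
const everyMatch = text.replace(/e/g, 'E');

console.log(firstOnly);
console.log(everyMatch);
```

Neither call targets the character at a given index, which is why slicing the string around that index is the safer route.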
What you need to do, essentially, is:
Find a space character (you've already done this)
Grab all of the string up to and including the space; this "side" of the string is good.
Convert the very next letter to uppercase, but only that letter.
Grab the rest of the string, past the letter you converted.
Put all three pieces together (beginning of string, capitalized letter, end of string).
The slice() method will do what you want:
if (finalName.charAt(i) === " ") {
// Get ONLY the letter after the space
var startLetter = finalName.slice(i+1, i+2);
// Concatenate the string up to the letter + the letter uppercased + the rest of the string
finalName = finalName.slice(0, i+1) + startLetter.toUpperCase() + finalName.slice(i+2);
}
Another option is regular expression (regex), which the other answers mentioned. This is probably a better option, since it's a lot cleaner. But, if you're learning programming for the first time, it's easier to understand this manual string work by writing the raw loops. Later you can mess with the efficient way to do it.
Working jsfiddle: http://jsfiddle.net/9dLw1Lfx/
Further reading:
Are JavaScript strings immutable? Do I need a "string builder" in JavaScript?
slice() method
You can simplify this down a lot if you pass a RegExp /pattern/flags and a function into str.replace instead of using substrings
function nameChanger(oldName) {
var lowerCase = oldName.toLowerCase(),
titleCase = lowerCase.replace(/\b./g, function ($0) {return $0.toUpperCase()});
return titleCase;
};
In this example I've applied the change to any character . after a word boundary \b, but you may want the more specific /(^| )./g
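A quick comparison of the two patterns on a name containing an apostrophe; capitalizeMatches is a hypothetical helper written only for this sketch:

```javascript
// Lower-case the input, then upper-case whatever the pattern matches.
function capitalizeMatches(text, pattern) {
  return text.toLowerCase().replace(pattern, (match) => match.toUpperCase());
}

// \b also sits between the apostrophe and the "s", so that "s" gets capitalized.
const withBoundary = capitalizeMatches("harper's ferry", /\b./g);

// (^| ) only fires at the start of the string or after a space.
const withSpace = capitalizeMatches("harper's ferry", /(^| )./g);

console.log(withBoundary);
console.log(withSpace);
```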
Another good answer to this question is to use RegEx to do this for you.
var re = /(\b[a-z](?!\s))/g;
var s = "fort collins, croton-on-hudson, harper's ferry, coeur d'alene, o'fallon";
s = s.replace(re, function(x){return x.toUpperCase();});
console.log(s); // "Fort Collins, Croton-On-Hudson, Harper's Ferry, Coeur D'Alene, O'Fallon"
The regular expression being used may need to be changed up slightly, but this should give you an idea of what you can do with regular expressions
Capitalize Letters with JavaScript
The problem is twofold:
1) You need to assign the return value of finalName.replace, as the method returns a new string but doesn't alter the one on which it's called.
2) You're not handling each word separately, so you're only changing the first word. Don't you want every word to be lower case with a capitalized first letter?
This code would serve you better:
var name = "AlbERt EINstEiN";
function nameChanger(oldName) {
// Your code goes here!
var finalName = [];
oldName.toLowerCase().split(" ").forEach(function(word) {
var newWord = word.replace(word.charAt(0), word.charAt(0).toUpperCase());
finalName.push(newWord);
});
// Don't delete this line!
return finalName.join(" ");
};
// Did your code work? The line below will tell you!
console.log(nameChanger(name));
if (finalName.charAt(i) === " ")
Shouldn't it be
if (finalName.charAt(i) == " ")
Doesn't === also check that the types are equal? They shouldn't be, since one is a char and the other a string.
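For what it's worth, JavaScript has no separate char type, so a quick check (a sketch with illustrative variable names) suggests the strict comparison in the question is not the problem:

```javascript
// charAt() returns a one-character string, not a distinct char type.
const ch = 'a b'.charAt(1);
const typeOfChar = typeof ch;

// Both operands are strings, so === compares them by value.
const strictlyEqual = ch === ' ';

console.log(typeOfChar, strictlyEqual);
```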

How to make JSON.stringify encode non-ascii characters in ascii-safe escaped form (\uXXXX) without "post-processing"?

I have to send characters like ü to the server as unicode character but as an ASCII-safe string. So it must be \u00fc (6 characters) not the character itself. But after JSON.stringify it always gets ü regardless of what I've done with it.
If I use 2 backslashes like \\u00fc then I get 2 in the JSON string as well and that's not good either.
Important constraint: I can't modify the string after JSON.stringify, it's part of the framework without workaround and we don't want to fork the whole package.
Can this be done? If so, how?
If, for some reason, you want your JSON to be ASCII-safe, replace non-ascii characters after json encoding:
var obj = {"key":"füßchen", "some": [1,2,3]}
var json = JSON.stringify(obj)
json = json.replace(/[\u007F-\uFFFF]/g, function(chr) {
return "\\u" + ("0000" + chr.charCodeAt(0).toString(16)).slice(-4)
})
document.write(json);
document.write("<br>");
document.write(JSON.parse(json));
This should get you to where you want. I heavily based this on this question: Javascript, convert unicode string to Javascript escape?
var obj = {"key":"ü"};
var str1 = JSON.stringify(obj);
var str2 = "";
var chr = "";
for(var i = 0; i < str1.length; i++){
if (str1[i].match(/[^\x00-\x7F]/)){
chr = "\\u" + ("000" + str1[i].charCodeAt(0).toString(16)).substr(-4);
}else{
chr = str1[i];
}
str2 = str2 + chr;
}
console.log(str2)
I would recommend, though, that you look into @t.niese's comment about parsing this server side.
Depending on the exact scenario, you can affect the behavior of JSON.stringify by providing a toJSON method as detailed here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify#tojson_behavior
If an object has a toJSON method that is a function, then calling JSON.stringify on that will use the result of that method rather than the normal serialization. You could combine this with the approaches mentioned in other answers to get the result you want, even if a library doesn't naturally provide any hooks for customization.
(Of course, it's possible that a third-party library is itself doing something that overrides this behavior.)
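A minimal sketch of the toJSON hook, with a made-up wrapper object. Note the limit it exposes: whatever string toJSON returns is serialized again, so a backslash in it is itself escaped and the receiver would see a doubled backslash rather than a bare \u00fc escape:

```javascript
// Hypothetical wrapper whose toJSON returns an ASCII-only string.
const name = {
  text: 'ü',
  toJSON() {
    // Replace each non-ASCII UTF-16 code unit with a literal \uXXXX sequence.
    return this.text.replace(/[\u0080-\uFFFF]/g, (c) =>
      '\\u' + c.charCodeAt(0).toString(16).padStart(4, '0'));
  },
};

// JSON.stringify calls toJSON, then escapes the backslash in its result.
const json = JSON.stringify({ name });
console.log(json);

// After parsing, the value is the six-character text, not the letter itself.
const decoded = JSON.parse(json).name;
```

So the hook keeps the output ASCII-only, but whether the extra level of escaping is acceptable depends on what the server expects.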

How can I extract a URL from url("http://www.example.com")?

I need to get the URL of an element's background image with jQuery:
var foo = $('#id').css('background-image');
This results in something like url("http://www.example.com/image.gif"). How can I get just the "http://www.example.com/image.gif" part from that? typeof foo says it's a string, but the url() part makes me think that JavaScript and/or jQuery has a special URL type and that I should be able to get the location with foo.toString(). That doesn't work though.
Note that different browser implementations may return the string in a different format. For instance, one browser may return double-quotes while another browser may return the value without quotes. This makes it awkward to parse, especially when you consider that quotes are valid as URL characters.
I would say the best approach is a good old check and slice():
var imageUrlString = $('#id').css('background-image'),
quote = imageUrlString.charAt(4),
result;
if (quote == "'" || quote == '"')
result = imageUrlString.slice(5, -2);
else
result = imageUrlString.slice(4, -1);
Assuming the browser returns a valid string, this wouldn't fail. Even if an empty string were returned (ie, there is no background image), the result is an empty string.
You might want to consider regular expressions in this case:
var urlStr = 'url("http://www.foo.com/")';
var url = urlStr.replace(/^url\(['"]?([^'"]*)['"]?\);?$/, '$1');
This particular regex allows you to use formats like url(http://foo.bar/) or url("http://foo.bar/"), with single quotes instead of double quotes, or possibly with a semicolon at the end.
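A short sketch that runs the same regex against those variants; the wrapper name extractCssUrl is hypothetical:

```javascript
// Hypothetical wrapper around the regex from the answer above.
function extractCssUrl(cssValue) {
  return cssValue.replace(/^url\(['"]?([^'"]*)['"]?\);?$/, '$1');
}

const unquoted = extractCssUrl('url(http://foo.bar/)');
const doubleQuoted = extractCssUrl('url("http://foo.bar/")');
const singleQuotedWithSemicolon = extractCssUrl("url('http://foo.bar/');");

console.log(unquoted, doubleQuoted, singleQuotedWithSemicolon);
```

If the input does not match the pattern at all (for example the value none), replace() returns it unchanged, so callers may want to check for that case.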
You could split the string at each " and get the second element:
var foo = $('#id').css('background-image').split('"')[1];
Note: This doesn't work if your URL contains quotation marks.
If it's always the same, I'd just take the substring of the URL without the prefix.
For instance, if it's always:
url("<URL>")
url("<otherURL>")
The URL always runs from index 5 of the string to length - 2.
Not the best by any means, but probably faster than a regex if you're not worried about other string formats.
There is no special URL type - it's a string representing a CSS url value. You can get the URL back out with a regex:
var foo = $('#id').css('background-image');
var url = foo.match(/url\(['"](.*)['"]\)/)[1];
(that regex isn't foolproof, but it should work against whatever jQuery returns)
