selector concatenation peculiarity - javascript

I came across behavior that I cannot explain. I want to get an element (id='address.zipCode') with a simple selector:
$('#' + prefix + 'zipCode')
and it doesn't work. In this case, prefix == 'address\\.'. Chrome console debugging results in:
> prefix
"address\\."
> $('#' + prefix + 'zipCode')
[]
The most interesting part is that:
$('#' + "address\\." + 'zipCode')
[<input id=​"address.zipCode" name=​"address.zipCode" class=​"zipCodeMask" type=​"text" value>​]
Any ideas what's wrong with that?

Working backwards from the behavior of the Chrome REPL (which displays the final value of the string, i.e. sans escaping characters), you actually have two backslashes in your final string. In other words, you have probably assigned prefix like so:
var prefix = "address\\\\.";
What you actually need is only one backslash, which means you should type in two backslashes in the string literal (one for escaping):
var prefix = "address\\.";
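To see the difference between what is typed in the source and what the string actually holds, here is a minimal sketch. The variable names are only illustrative, and no jQuery is needed to check the values:

```javascript
// Each pair of backslashes in a string literal produces ONE backslash in the value.
var oneBackslash = "address\\.";     // value: address\.   (what the selector needs)
var twoBackslashes = "address\\\\."; // value: address\\.  (one backslash too many)

console.log(oneBackslash.length);   // 9
console.log(twoBackslashes.length); // 10

// The selector string that jQuery would receive in the working case.
var selector = '#' + oneBackslash + 'zipCode';
console.log(selector); // #address\.zipCode
```

Checking the length of the string is a quick way to find out how many backslashes really ended up in it, whatever the console chooses to display.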

Related

How to convert string with concatenation to real string in javascript?

I have a string which looks like this on a page response (saved as autoResponse):
... hexMD5('\262' + '****' + '\155\135\053\325\374\315\264\062\232\354\242\205\217\034\154\005'); ...
In order to capture this, I use:
var hex = autoResponse.split('hexMD5(')[1].split(')')[0];
This now gives me this string:
'\262' + '****' + '\155\135\053\325\374\315\264\062\232\354\242\205\217\034\154\005'
If I put this directly into the hexMD5() method, it thinks that the ' and + symbols and the white space are a part of the secret.
I tried to use replace() to remove them like so:
while(hex.split("'").length !== 1) hex = hex.replace("'", "");
while(hex.split("+").length !== 1) hex = hex.replace("+", "");
while(hex.split(" ").length !== 1) hex = hex.replace(" ", "");
However, when I then do hexMD5(hex) it gives me an incorrect hex. Is there any way I can convert the hex to a string where it combines the strings together as if I was hardcoding it like
hexMD5('\262' + '****' + '\155\135\053\325\374\315\264\062\232\354\242\205\217\034\154\005');
any help would be appreciated.
You can use a single, much simpler RegExp for this:
hex = hex.replace(/' ?\+ ?'/g, '');
That says "replace all single-quotes, followed by possibly a space, then a plus, then possibly another space, followed by another single quote" and replaces those matches with nothing, thus removing them. (You need the \ before the + because + is a special character in RegExes that needs to be escaped.)
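Here is a rough sketch of that replacement applied to a shortened sample. Note that the regex only removes the joins: the outer quotes and the backslash sequences are still literal text afterwards. Steps 2 and 3 below are my own assumptions about what would still be needed (that the captured text contains only octal escapes, and that the secret itself contains no quotes), not part of the answer above:

```javascript
// Shortened stand-in for the captured text; the doubled backslashes make the
// string hold literal backslashes, as it would when read from a page response.
var captured = "'\\262' + '****' + '\\155\\135'";

// Step 1 (from the answer): remove the "' + '" joins between the pieces.
var joined = captured.replace(/' ?\+ ?'/g, '');

// Step 2 (assumption): drop the single quotes left at both ends.
var unquoted = joined.replace(/^'|'$/g, '');

// Step 3 (assumption): turn each octal escape such as \262 into its character.
var decoded = unquoted.replace(/\\([0-7]{1,3})/g, function (match, octal) {
  return String.fromCharCode(parseInt(octal, 8));
});

console.log(joined);         // '\262****\155\135'
console.log(decoded.length); // 7
```

The decoded value is what a hardcoded string literal would have contained, since the JavaScript parser performs the same octal conversion when it reads a literal.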

Can someone explain why a "+" is needed before closing quotes in an HTML tag?

I am pretty new to Javascript and this is the first time I have come across this type of concatenation while taking a JQuery course. My question is regarding the + sign and why it is needed with all the quotes before the ">" to close the HTML tag. In other words the + "'> " part of the code. Why can't you just end the quotes and close the tag without a plus sign? Can someone break that down step by step? I left a comment in the script where I am confused. There are 2 instances. Thanks.
<script type="text/javascript">
$("document").ready(function() {
buildBookmarks('h3', 'header');
});
function buildBookmarks(strWhichTag, sBookMarkNode) {
var cAnchorCount = 0;
var oList = $("<ul id='bookmarksList'>");
$("div:not([id=header]) " + strWhichTag).each(function() { //This is the part
$(this).html("<a name='bookmark" + cAnchorCount + "'></a>" + $(this).html());
oList.append($("<li><a href='#bookmark" + cAnchorCount++ + "'> " + $(this).text() + "</a></li>"));
});
$("#" + sBookMarkNode).append(oList);
}
</script>
You already ended the string literal, back here:
"<li><a href='#bookmark"
// ^ here
Now you’re concatenating an expression to the string before it, and concatenating the result of that concatenation with another string literal, and concatenation is done with the + operator.
If we replace the literals with placeholders representing expressions (which they are), and add parentheses, it might become clearer:
oList.append($(A + (cAnchorCount++) + B + ($(this).text()) + C));
Note that the HTML here isn’t special in any way; the strings you have here are strings like any other. They just happen to represent HTML. (I’d recommend learning the W3C DOM before jQuery, by the way; it makes this separation clearer.)
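To make the placeholder version concrete, here is a small sketch without jQuery. The names A, B and C follow the placeholders above; `text` is a hypothetical stand-in for `$(this).text()`:

```javascript
var cAnchorCount = 3;
var text = "Chapter title"; // hypothetical stand-in for $(this).text()

// The three string literals from the original line, named as in the placeholders.
var A = "<li><a href='#bookmark";
var B = "'> ";
var C = "</a></li>";

// Each + joins the value on its left to the value on its right.
// cAnchorCount++ yields the current value (3) and then increments the counter.
var html = A + (cAnchorCount++) + B + text + C;

console.log(html);         // <li><a href='#bookmark3'> Chapter title</a></li>
console.log(cAnchorCount); // 4
```

The `'>` in B is ordinary text inside a string: it closes the href attribute value and the opening tag in the generated HTML, and the + in front of it only glues it onto the number.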
The script is building an HTML element in Javascript by concatenating strings together. You can think of it this way:
tag_opening + anchor_count + tag_closing
The + is necessary because it’s adding strings together. Without it, you would just be following one string with another, which would result in a syntax error.
The plus here is used for concatenation. What you're really doing is adding a string which you've written (a name='bookmark) plus a variable (cAnchorCount), plus another string ('>) to make one long string. Without the plus, the syntax makes no sense.
Your example is unclear, but you are combining a variable and an HTML string:
"<a href='#bookmark" + cAnchorCount++ + "'> "
Will evaluate to something like:
<a href='#bookmark1'>
The code you listed uses external double quotes (") to wrap the string and internal single quotes (') to wrap the attribute values. Ordinary quoted JavaScript strings do not interpolate variables (only template literals, added in ES2015, do), so the value is spliced in with string concatenation.
They are trying to build a string so that, once the code runs, it is valid HTML.
For example:
oList.append($("<li><a href='#bookmark" + cAnchorCount++ + "'> " + $(this).text() + "</a></li>")); is trying to ultimately get
<li><a href='#bookmark3'>TEXT</a></li>
You need the '> to close the tag properly from <a href=' after adding in text.
In this case it has nothing to do with HTML or the content of the strings being concatenated, it's just that JavaScript uses the + operator to concatenate strings. I imagine every language uses some operator to do this. For example:
var result = 'first string' + 'second string';
This creates the concatenated string:
'first stringsecond string'
Now if one of those strings happens to be a variable, the same thing happens:
var someVariable = 'second string';
var result = 'first string' + someVariable;
This still results in the same concatenated string:
'first stringsecond string'
That's all the code in the question is doing in these cases. Concatenating one string to another. For example:
$("div:not([id=header]) " + strWhichTag)
The variable strWhichTag contains some string value, likely the name of an HTML tag judging by its name and usage. So if it contains something like 'div' then the result is:
$("div:not([id=header]) div")
This overall resulting string is then being evaluated by jQuery as a selector to identify a set of HTML elements on which to operate. But the string concatenation doesn't matter, it could just as easily result in this:
$("div:not([id=header]) some nonsense")
if the strWhichTag variable contained the value 'some nonsense'.
This is called concatenation. The "#" is the prefix for an ID selector and sBookMarkNode is a variable, so "#" + sBookMarkNode builds a selector for the element with that ID. When the page loads, the list is appended to the end of that element. sBookMarkNode is not a global: it is the second parameter of buildBookmarks, which is called with 'header' in the ready handler.

Most efficient way to grab XML tag from file with JavaScript and Regex

I'm doing some more advanced automation on iOS devices and simulators for an enterprise application. The automation is written in browserless Javascript. One of the methods works on the device but not on the simulator, so I need to code a workaround. For the curious, it's UIATarget.localTarget().frontMostApp().preferencesValueForKey(key).
What we need to do is read a path to a server (which varies) from a plist file on disk. As a workaround on the simulator, I've used the following lines to locate the plist file containing the preferences:
// Get the alias of the user who's logged in
var result = UIATarget.localTarget().host().performTaskWithPathArgumentsTimeout("/usr/bin/whoami", [], 5).stdout;
// Remove the extra newline at the end of the alias we got
result = result.replace('\n',"");
// Find the location of the plist containing the server info
result = UIATarget.localTarget().host().performTaskWithPathArgumentsTimeout("/usr/bin/find", ["/Users/"+result+"/Library/Application Support/iPhone Simulator", "-name", "redacted.plist"], 100);
// For some reason we need a delay here
UIATarget.localTarget().delay(.5);
// Results are returned in a single string separated by newline characters, so we can split it into an array
// This array contains all of the folders which have the plist file under the Simulator directory
var plistLocations = result.stdout.split("\n");
...
// For this example, let's just assume we want slot 0 here to save time
var plistBinaryLocation = plistLocations[0];
var plistXMLLocation = plistBinaryLocation + ".xml";
result = UIATarget.localTarget().host().performTaskWithPathArgumentsTimeout("/usr/bin/plutil", ["-convert","xml1", plistBinaryLocation,"-o", plistXMLLocation], 100);
From here, I think the best way to get the contents is to cat or grep the file, since we can't read the file directly from disk. However, I'm having trouble getting the syntax down. Here's an edited snippet of the plist file I'm reading:
<key>server_url</key>
<string>http://pathToServer</string>
There are a bunch of key/string pairs in the file, where the server_url key is unique. Ideally I'd do something like a lookbehind, but because JavaScript doesn't appear to support it, I figured I'd just get the pair from the file and whittle it down a bit later.
I can search for the key with this:
// This line works
var expression = new RegExp(escapeRegExp("<key>server_url</key>"));
if(result.stdout.match(expression))
{
UIALogger.logMessage("FOUND IT!!!");
}
else
{
UIALogger.logMessage("NOPE :(");
}
Where the escapeRegExp method looks like this:
function escapeRegExp(str)
{
var result = str.replace(/([()[{*+.$^\\|?])/g, '\\$1');
UIALogger.logMessage("NEW STRING: " + result);
return result;
}
Also, this line returns a value (but gets the wrong line):
var expression = new RegExp(escapeRegExp("<string>(.*?)</string>"));
However, when you put the two together, it (the Regex syntax) works on the terminal but doesn't work in code:
var expression = new RegExp(escapeRegExp("<key>server_url</key>[\s]*<string>(.*?)</string>"));
What am I missing? I also tried grep and egrep without any luck.
Two problems are keeping the regex from working in your JavaScript code.
First, you are escaping the whole regex string, which means that your capturing group (.*?) and your whitespace-skipping [\s]* are also escaped and are matched as literal text instead of being evaluated as regex syntax. You need to escape the XML parts and add in the regex parts without escaping them.
Second, the whitespace-skipping part, [\s]*, is falling prey to JavaScript's normal string escaping rules: the "\s" in a string literal turns into "s". You need to escape that backslash by writing "\\s" so that it stays as "\s" in the string that you pass to construct the regular expression.
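Both problems can be seen in isolation with a few lines of plain JavaScript. This is only a sketch: it reuses the escapeRegExp function from the question without the logging call, and the other variable names are illustrative:

```javascript
// escapeRegExp as posted in the question, minus the UIALogger call.
function escapeRegExp(str) {
  return str.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}

// Problem 1: escaping the regex syntax turns it into literal text.
var escapedGroup = escapeRegExp("(.*?)");
console.log(escapedGroup); // \(\.\*\?\)

// Problem 2: in a string literal, "\s" is not a known escape, so the backslash is dropped.
var collapsed = "[\s]*";
var preserved = "[\\s]*";
console.log(collapsed); // [s]*
console.log(preserved); // [\s]*
```

So by the time the pattern reaches new RegExp(), the capture group only matches the literal characters "(.*?)" and the whitespace class only matches the letter "s".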
I've built a working script that I've verified in the UI Automation engine itself. It should extract and print out the expected URL:
var testString = "" +
"<plistExample>\n" +
" <key>dont-find-me</key>\n" +
" <string>bad value</string>\n" +
" <key>server_url</key>\n" +
" <string>http://server_url</string>\n" +
"</plistExample>";
function escapeRegExp(str)
{
var result = str.replace(/([()[{*+.$^\\|?])/g, '\\$1');
UIALogger.logMessage("NEW STRING: " + result);
return result;
}
var strExp = escapeRegExp("<key>server_url</key>") + "[\\s]*" + escapeRegExp("<string>") + "(.*)" + escapeRegExp("</string>");
UIALogger.logMessage("Expression escaping only the xml parts:" + strExp);
var exp = new RegExp(strExp);
var match = testString.match(exp);
UIALogger.logMessage("Match: " + match[1]);
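I should point out, though, that none of the XML parts contains a regex metacharacter, so nothing in them really needs escaping. Forward slashes only have to be escaped inside a regex literal; in a string passed to new RegExp() the \/ below is harmless but optional. That means that you don't need your escapeRegExp() function and can write the expression you want like this: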
var exp = new RegExp("<key>server_url<\/key>[\\s]*<string>(.*)<\/string>");

Regex for url validation (unterminated parentheticals error)

I have the following expression to validate a URL but it gives me a syntax error on the browser. I am no expert in regex expressions so I am not sure what I am looking for. I would also like it to test for http:// and https:// urls.
"url":{
"regex":"/^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?$/",
"alertText":"URL must start with http://"}
Edit:
To clarify, I am looking for help with both the regex and the syntax issues please. I have tried about 20 different variations based on all the answers but still no luck. Just to clarify, I do not need to validate the entire URL. I just need to validate that it starts with http:// or https://, but it must not fail validation if left empty. I can get the http part working with this
/^https?:///
no need to escape the / even. But it fails if the input field is empty, when I try:
/^(https?://)?/
I get an error saying "unterminated parenthetical /^(https?://)/".
Just to confuse matters more, here is one that I added yesterday to validate a date or no entry, and it looks like the same sort of format to me.
/^([0-9]{1,2}\-\[0-9]{1,2}\-\[0-9]{4})?$/
For what it's worth, the syntax error is the unescaped forward slash here: /\S*
Edit: oh wow, I'm tired. All of the forward slashes are unescaped. You can escape them with a backslash: \/
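A short sketch of the two ways to write the same pattern may help. In a regex literal the slashes must be escaped; in a pattern passed as a string they need no escape. The variable names and sample URLs here are made up for illustration:

```javascript
// In a regex literal, an unescaped "/" ends the pattern, so each slash
// that belongs to the pattern is written as "\/".
var fromLiteral = /^https?:\/\//;

// A pattern passed as a string has no "/" delimiters, so slashes need no escape.
var fromString = new RegExp("^https?://");

var sampleUrls = ["http://example.com", "https://example.com", "ftp://example.com"];
var literalResults = sampleUrls.map(function (url) { return fromLiteral.test(url); });
var stringResults = sampleUrls.map(function (url) { return fromString.test(url); });

console.log(literalResults); // [ true, true, false ]
console.log(stringResults);  // [ true, true, false ]
```

If the validation plugin takes the pattern as a quoted string that still includes the surrounding slashes, as in the config in the question, the slashes inside it most likely need the same escaping as in a literal, but that depends on how the plugin evaluates the string.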
Here's the spec on URIs, of which URLs are a subset, or here's the spec on URLs if you're sure that's all you care about. A full implementation of either would be nearly impossible with only a single regular expression.
If you truly want to validate a URL, one that you know will be HTTP or HTTPS, send it an HTTP HEAD request and check the response code.
Alternatively, if you're going to play loose with the spec, decide how loose you're willing to be with the input, and if it's better to exclude valid URLs or permit false ones.
If you want to test for a URL or empty input, you might want to do two passes:
1. Test for an empty string.
2. Test for a valid URL.
I would do something like the following (assuming urlString is my input).
// get rid of whitespace, in case user hit spacebar/tab
// also removes leading/trailing spaces.
urlString = urlString.replace(/[\s]*/g,'');
// test if zero length string, if not, test the url.
if( urlString.length > 0 ){ // test the URL
var re = new RegExp( your_expression_goes_here );
var result = re.exec(urlString);
if( result != null ) {
// we have a hit!!! this is a URL.
} else {
// this is a bad string.
}
} else {
// user entered no text, let's move on.
}
So, the preceding should work and allow you to test for either empty string or a url. As to the regular expression you're using "/(http|https):///", I believe it's a bit flawed. Yes, it will catch "http://" or "https://", but it will also key in on a string like "htthttp://" which is clearly not what you want.
Your other sample "/^(http|https):///" is better in that it will match from the beginning of the string and will tell you if the string begins like a URL.
Now, I think jrob above was on the right track with his second string in regards to testing the full URL. I think I found the same sample he used at this page. I've modified the expression as per below and tested it using an online regex tester, can't post the link as I'm a new user :D.
It seems to catch all manner of valid URLs and produces an error if the input string is in any way an invalid URL, at least for the invalid URLs I can think of. It also catches http/https protocols only, which I think is your base requirement.
^(?:http(?:s?)\:\/\/|~/|/)?(?:\w+:\w+#)?(?:(?:[-\w]+\.)+([a-zA-Z]{2,9}))(?::[\d]{1,5})?(?:(?:(?:/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|/)+|\?|#)?(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$
Hope this helps.
Updated code (twice).
I still strongly suggest you test for empty string first as per my earlier example, and you only test for the valid values if the string is non zero. I have tried to combine the two tests into one, but have been unable to do so so far (maybe someone else can still figure it out).
The following tests work for me, here's a URL sample as you required:
//var re = /^(?:http(?:s?)\:\/\/)/;
// the following expression will test for http(s):// and empty string
var re = /^(?:http(?:s?)\:\/\/)*$/;
// use the precompiled expression above, or the following
// two lines:
//var reTxt = "^(?:http(?:s?)\:\/\/)";
//var re = new RegExp(reTxt);
alert(
"result:" + re.test("http://") +
"\nresult:" + re.test("https://") +
"\nresult:" + re.test("") +
"\nresult:" + re.test("https:") +
"\nresult:" + re.test("xhttp://") +
"\nresult:" + re.test("ftp://") +
"\nresult:" + re.test("http:/") +
"\nresult:" + re.test("http://somepage.com") +
"\nresult:" + re.test("httphttp://") +
"\nresult:" + re.test(" http://") +
"\nresult:" + re.test("Random text")
);
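For what it's worth, one possible way to fold the two checks into a single expression is to make the whole "http(s):// plus the rest" part optional. This is only a sketch of the requirement as stated in the question (empty input passes, anything else must start with http:// or https://); the name emptyOrHttp is made up, and it assumes single-line input:

```javascript
// Either the whole string is empty, or it starts with http:// or https://
// and is followed by anything (on a single line).
var emptyOrHttp = /^(?:https?:\/\/.*)?$/;

var samples = ["", "http://", "https://somepage.com", "ftp://",
               "xhttp://", " http://", "http:/", "httphttp://", "Random text"];

var outcomes = samples.map(function (s) { return emptyOrHttp.test(s); });

samples.forEach(function (s, i) {
  console.log(JSON.stringify(s) + " -> " + outcomes[i]);
});
```

Only the first three samples pass. It does not validate the rest of the URL at all, so the two-pass approach is still the better choice if you want a stricter check later.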
And here's a test for dates:
var re2 = /^[0-9]{1,2}\-[0-9]{1,2}\-[0-9]{4}$/;
// use the precompiled expression above, or the following
// two lines:
//var reDateTxt = /^[0-9]{1,2}\-[0-9]{1,2}\-[0-9]{4}$/;
//var re2 = new RegExp(reDateTxt);
alert(
"result:" + re2.test("02-02-2009") +
"\nresult:" + re2.test("022-02-2009") +
"\nresult:" + re2.test("02-032-2009") +
"\nresult:" + re2.test("02-02-23009") +
"\nresult:" + re2.test(" 02-02-2009") +
"\nresult:" + re2.test("02-0a2-2009") +
"\nresult:" + re2.test("02-02-2009") +
"\nresult:" + re2.test("Random text")
);

Regexp for matching numbers and units in an HTML fragment?

I'm trying to make a regexp that will match numbers, excluding numbers that are part of other words or numbers inside certain html tags. The part for matching numbers works well but I can't figure out how to find the numbers inside the html.
Current code:
//number regexp part
var prefix = '\\b()';//for future use
var baseNumber = '((\\+|-)?([\\d,]+)(?:(\\.)(\\d+))?)';
var SIBaseUnit = 'm|kg|s|A|K|mol|cd';
var SIPrefix = 'Y|Z|E|P|T|G|M|k|h|da|d|c|m|µ|n|p|f|a|z|y';
var SIUnit = '(?:('+SIPrefix+')?('+SIBaseUnit+'))';
var generalSuffix = '(PM|AM|pm|am|in|ft)';
var suffix = '('+SIUnit+'|'+generalSuffix+')?\\b';
var number = '(' + prefix + baseNumber + suffix + ')';
//trying to make it match only when not within tags or inside excluded tags
var htmlBlackList = 'script|style|head'
var htmlStartTag = '<[^(' + htmlBlackList + ')]\\b[^>]*?>';
var reDecimal = new RegExp(htmlStartTag + '[^<]*?' + number + '[^>]*?<');
<script>
var htmlFragment = "<script>alert('hi')</script>";
var style = "<style>.foo { font-size: 14pt }</style>";
// ...
</script>
<!-- turn off this style for now
<style> ... </style>
-->
Good luck getting a regular expression to figure that out.
You're using JavaScript, so I'm guessing you're probably running in a browser. Which means you have access to the DOM, giving you access to the browser's very capable HTML parser. Use it.
The [^...] negated character class only works on single characters, not on compound expressions like (script|style|head). What you want is a negative lookahead, (?! ... ):
var htmlStartTag = '<(?!(' + htmlBlackList + ')\\b)[^>]*?>';
(?! ... ) means 'not followed by ...' but [^ ... ] means 'a single character not in ...'.
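A quick comparison of the two patterns on a few single tags shows the difference. This is a sketch that reuses htmlBlackList from the question; the other names and the sample tags are made up:

```javascript
var htmlBlackList = 'script|style|head';

// Character class: excludes any ONE character from the set "(script|yleahd)".
var withCharClass = new RegExp('<[^(' + htmlBlackList + ')]\\b[^>]*?>');

// Negative lookahead: refuses to match when "<" is followed by a blacklisted name.
var withLookahead = new RegExp('<(?!(' + htmlBlackList + ')\\b)[^>]*?>');

console.log(withCharClass.test('<div>'));    // false, because "d" is in the excluded set
console.log(withLookahead.test('<div>'));    // true
console.log(withLookahead.test('<script>')); // false
console.log(withLookahead.test('<style type="text/css">')); // false
console.log(withLookahead.test('<header>')); // true, the \b stops "head" matching "header"
```

This only fixes the start-tag test. It does nothing about numbers that sit inside a blacklisted element's content, which is where the regex approach runs out of road.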
I'm trying to make a regexp that will match numbers, excluding numbers that are part of other words or numbers inside certain html tags.
Regex cannot parse HTML. Do not use regex to parse HTML. Do not pass Go. Do not collect £200.
To ‘only match something not-within something else’ you would need a negative lookbehind assertion (“(?<!”), but JavaScript regexps did not support lookbehind before ES2018, and most other regex implementations don't support the complex variable-length lookbehind you'd need to have any hope of matching a context like being inside a tag. Even if you did have variable-length lookbehind, that'd still not reliably parse HTML, because as previously mentioned many times every day, regex cannot parse HTML.
Use an HTML parser. A browser HTML parser will be able to digest even partial input without complaining.
