Regular Expression: Match Partial or full string - javascript

I have a small script that takes the value from a text input and needs to match an item in an array either partially or fully.
I'm struggling at the moment with the regular expression and the syntax and wondered if I could pick your brains.
for (var i=0; i < liveFilterData.length; i+=1) {
if (liveFilterData[i].match(liveFilter.val())) {
alert();
}
}
I need the liveFilter.val() and Regular Expression to match the current array item liveFilterData[i] so if someone types in H or h in the text box, it checks if there is a matching item in the array. If they type in He or he then it matches Head, Header or heading.
Sorry, I've looked all over the web on how to build regular expressions but I can't work it out.

Simple string comparison should do the trick:
for (var v, i = liveFilterData.length; i--;) {
if (liveFilterData[i].slice (0, (v = liveFilter.val().toLowerCase ()).length) === v) {
alert();
}
}
liveFilterData should contain the words in lower case.

I'm not sure I totally understand the question. Is liveFilter.val() a regular expression, or is it just a string that you are trying to match against any value in an array? I'm guessing that you have an event handler on the textbox's keypress, keydown or keyup, and that the code you wrote above runs in the callback to this event. If so, you can convert the value to an appropriate regex by prefixing it with an anchor: "^"+liveFilter.val(). Since you are using the regex in a loop, you should precompile it with new RegExp, so your loop would look something like:
// the "i" in the second param here is a flag indicating case insensitive
// a hat '^' in regex means 'beginning of the input'
var regex = new RegExp("^" + liveFilter.val(), "i");
for (var i = 0; i < liveFilterData.length; i += 1) {
    // regex.test uses the precompiled regex, and determines if there is a match
    if (regex.test(liveFilterData[i])) {
        alert("matched " + i);
    }
}
Hope this helps!
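One caveat when building a regex from raw input: characters such as . or ( typed into the text box would be read as regex syntax. Below is a minimal sketch of one way around that. It assumes the typed value is passed in as a plain string; escapeRegExp and filterByPrefix are hypothetical helper names, not part of the original code.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical wrapper: return the items that start with `typed`, ignoring case.
function filterByPrefix(items, typed) {
  var regex = new RegExp("^" + escapeRegExp(typed), "i");
  return items.filter(function (item) {
    return regex.test(item);
  });
}

var sample = ["Head", "Header", "heading", "Footer", "a.b", "abc"];
console.log(filterByPrefix(sample, "he")); // the three items starting with "he"
console.log(filterByPrefix(sample, "a.")); // only "a.b", because the dot is literal
```

Without the escaping step, the input "a." would also match "abc", since an unescaped dot matches any character.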

Quickly check if two regular expressions share matches

Say I have a list of regular expressions, which match filepaths:
{
"list":[
"^/foo/bar/baz/x",
"^/foo/bar/baz/y",
"^/foo/mon/choo$",
...
"^/foo/.*"
]
}
Note that at runtime, this will happen:
let regexes = list.map(function(l){
return new RegExp(l);
});
I need to create a routine to quickly check if two or more of the regular expressions match the same input.
Is there a way to quickly check if an imaginary/potential filepath would match more than one regular expression in the list?
For example, the regular expression /foo/.* will match the first 3 items, and therefore that represents an error in my program.
Use case: the user is expected to create a list of regular expressions, but they have to be exclusive regular expressions which do not share any matches.
I could check this with actual input, but I am wondering if there is a way to check this with theoretical input as well. (I am hoping that the latter would be faster.)
The "hard" way: I have a list of files. For each file I check to see if it matches any of the regular expressions in the list. If it matches more than 1 in the list, I throw an error.
The problem with the hard way is that I would like to validate the list before using any real input data.
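For reference, the "hard" way described above could be sketched roughly as follows. findConflicts is a hypothetical helper name, and the sample paths are made up for illustration; only the patterns come from the list in the question.

```javascript
// Hypothetical helper: report every path matched by more than one pattern.
function findConflicts(patterns, paths) {
  var regexes = patterns.map(function (p) { return new RegExp(p); });
  var conflicts = [];
  paths.forEach(function (path) {
    // Collect the source patterns that match this path
    var hits = patterns.filter(function (p, idx) {
      return regexes[idx].test(path);
    });
    if (hits.length > 1) {
      conflicts.push({ path: path, patterns: hits });
    }
  });
  return conflicts;
}

var patterns = ["^/foo/bar/baz/x", "^/foo/bar/baz/y", "^/foo/mon/choo$", "^/foo/.*"];
var samplePaths = ["/foo/bar/baz/x", "/foo/other", "/bar"]; // made-up inputs
console.log(findConflicts(patterns, samplePaths));
```

Here only "/foo/bar/baz/x" is reported, because it is matched both by its own pattern and by "^/foo/.*".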
Since you're using an array, it's possible to have exact duplicates of the same regex, so you may want to use an object with keys, or a Set, instead.
Along with that, you could actually use the regex themselves to test each other. In my example below I'm only checking .* or .+ but if you really want to be comprehensive, you could run each regex against every other regex. Though I didn't do this because it might have a really long run time, but that's up to you.
var list = [
"/foo/bar/baz/x",
"/foo/bar/baz/y",
"/foo/mon/choo$",
"/foo/.*"
];
var error_list = [...list, "/foo/mon/choo$"];
let set = new Set(error_list);
console.log(set.size === error_list.length, "\"if false it means there's duplicates\"");
var regexes = [];
for (var regex of list){
if (regex.match(/\.(\*|\+)/)){
regexes.push(regex);
}
}
loop:
for (var regex of regexes){
var r = new RegExp("^"+regex);
for (let test of list){
if (test.match(r) && regex !== test){
console.log(test, "this matched");
// break loop;
}
}
}

Capitalize the first letter of each word

var name = "AlbERt EINstEiN";
function nameChanger(oldName) {
var finalName = oldName;
// Your code goes here!
finalName = oldName.toLowerCase();
finalName = finalName.replace(finalName.charAt(0), finalName.charAt(0).toUpperCase());
for(i = 0; i < finalName.length; i++) {
if (finalName.charAt(i) === " ")
finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
}
// Don't delete this line!
return finalName;
};
// Did your code work? The line below will tell you!
console.log(nameChanger(name));
My code as is, returns 'Albert einstein'. I'm wondering where I've gone wrong?
If I add in
console.log(finalName.charAt(i+1));
AFTER the if statement, and comment out the rest, it prints 'e', so it recognizes charAt(i+1) like it should... I just cannot get it to capitalize that first letter of the 2nd word.
There are two problems with your code sample. I'll go through them one-by-one.
Strings are immutable
This doesn't work the way you think it does:
finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
You need to change it to:
finalName = finalName.replace(finalName.charAt(i+1), finalName.charAt(i+1).toUpperCase());
In JavaScript, strings are immutable. This means that once a string is created, it can't be changed. That might sound strange since in your code, it seems like you are changing the string finalName throughout the loop with methods like replace().
But in reality, you aren't actually changing it! The replace() function takes an input string, does the replacement, and produces a new output string, since it isn't actually allowed to change the input string (immutability). So, tl;dr, if you don't capture the output of replace() by assigning it to a variable, the replaced string is lost.
Incidentally, it's okay to assign it back to the original variable name, which is why you can do finalName = finalName.replace(...).
Replace targets the first match
The other problem you'll run into is that when you call replace() with a plain string as the search value, it replaces the first occurrence of that string anywhere in finalName, not the character at the position you are examining. If you tell it to replace 'e' with 'E', it changes the first 'e' it finds, which here is the one in "Albert", not the one that starts "einstein".
What you need to do, essentially, is:
Find a space character (you've already done this)
Grab all of the string up to and including the space; this "side" of the string is good.
Convert the very next letter to uppercase, but only that letter.
Grab the rest of the string, past the letter you converted.
Put all three pieces together (beginning of string, capitalized letter, end of string).
The slice() method will do what you want:
if (finalName.charAt(i) === " ") {
// Get ONLY the letter after the space
var startLetter = finalName.slice(i+1, i+2);
// Concatenate the string up to the letter + the letter uppercased + the rest of the string
finalName = finalName.slice(0, i+1) + startLetter.toUpperCase() + finalName.slice(i+2);
}
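Putting the two fixes together, the whole function might look something like this. It is a sketch based on the question's nameChanger, not the only way to write it.

```javascript
function nameChanger(oldName) {
  var finalName = oldName.toLowerCase();
  // Capitalize the very first letter
  finalName = finalName.charAt(0).toUpperCase() + finalName.slice(1);
  for (var i = 0; i < finalName.length; i++) {
    if (finalName.charAt(i) === " ") {
      // Rebuild the string around the single letter that follows the space
      finalName = finalName.slice(0, i + 1) +
        finalName.charAt(i + 1).toUpperCase() +
        finalName.slice(i + 2);
    }
  }
  return finalName;
}

console.log(nameChanger("AlbERt EINstEiN")); // "Albert Einstein"
```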
Another option is a regular expression (regex), which the other answers mentioned. This is probably a better option, since it's a lot cleaner. But, if you're learning programming for the first time, it's easier to understand this manual string work by writing the raw loops. Later you can experiment with the more concise way to do it.
Working jsfiddle: http://jsfiddle.net/9dLw1Lfx/
Further reading:
Are JavaScript strings immutable? Do I need a "string builder" in JavaScript?
slice() method
You can simplify this down a lot if you pass a RegExp /pattern/flags and a function into str.replace instead of using substrings
function nameChanger(oldName) {
var lowerCase = oldName.toLowerCase(),
titleCase = lowerCase.replace(/\b./g, function ($0) {return $0.toUpperCase()});
return titleCase;
};
In this example I've applied the change to any character . after a word boundary \b, but you may want the more specific /(^| )./g
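To illustrate the difference, here is a small sketch using the more specific pattern. nameChangerBySpaces is a hypothetical name chosen for this example.

```javascript
function nameChangerBySpaces(oldName) {
  // Match the first character of the string, or a space plus the character after it
  return oldName.toLowerCase().replace(/(^| )./g, function (match) {
    return match.toUpperCase();
  });
}

console.log(nameChangerBySpaces("AlbERt EINstEiN")); // "Albert Einstein"
console.log(nameChangerBySpaces("o'NEIL smith"));    // "O'neil Smith"
```

With /\b./g the second example would come out as "O'Neil Smith", because the apostrophe also creates a word boundary. Which behavior you want depends on your data.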
Another good answer to this question is to use RegEx to do this for you.
var re = /(\b[a-z](?!\s))/g;
var s = "fort collins, croton-on-hudson, harper's ferry, coeur d'alene, o'fallon";
s = s.replace(re, function(x){return x.toUpperCase();});
console.log(s); // "Fort Collins, Croton-On-Hudson, Harper's Ferry, Coeur D'Alene, O'Fallon"
The regular expression being used may need to be changed up slightly, but this should give you an idea of what you can do with regular expressions
Capitalize Letters with JavaScript
The problem is twofold:
1) You need to use the value returned by finalName.replace, as the method returns a new string but doesn't alter the one on which it's called.
2) You're not iterating over the individual words, so you're only changing the first one. Don't you want to change every word so it's in lower case with a capital first letter?
This code would serve you better:
var name = "AlbERt EINstEiN";
function nameChanger(oldName) {
// Your code goes here!
var finalName = [];
oldName.toLowerCase().split(" ").forEach(function(word) {
var newWord = word.replace(word.charAt(0), word.charAt(0).toUpperCase());
finalName.push(newWord);
});
// Don't delete this line!
return finalName.join(" ");
};
// Did your code work? The line below will tell you!
console.log(nameChanger(name));
if (finalName.charAt(i) === " ")
Shouldn't it be
if (finalName.charAt(i) == " ")
Doesn't === also check whether the types are equal, which they should not be, since one is a char and the other a string?
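A quick check suggests the strict comparison is fine here: JavaScript has no separate character type, and charAt() returns a one-character string. The snippet below is a small sketch using a made-up input.

```javascript
var finalName = "albert einstein";
var ch = finalName.charAt(6); // the character between the two words

console.log(typeof ch);  // "string"
console.log(ch === " "); // true
```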

Javascript pig latin regex not working

Trying to make a function that will convert English words into Pig Latin. The problem I have is when I check to see if the first letter is a vowel. I check using a regular expression: if (str[0] === /[aeiou]/i) but it doesn't work. Something is wrong with my regex but I look at references and it seems like that should work. I don't know what's going on. Can someone explain why the regex I am using does not work and what would be a similar working solution? If you run the code below, it doesn't give the right result, just saying beforehand.
function translate(str) {
var tag = "";
if (str[0] === /[aeiou]/i) {
tag = "way";
return str + tag;
}
else {
var count = 0;
for (var i = 0; i< str.length; i++) {
if (str[i] !== /[aeiou]/i)
tag += str[i];
else
break;
count = i;
}
console.log(count);
return str.slice(count + 1) + tag + "ay";
}
}
So when I run, say, translate("overjoyed") it should return "overjoyedway". And if I run translate("glove") it should return "oveglay".
What you have written is not the way you use regular expressions. The code if (str[0] === /[aeiou]/i) tests whether the first element of the str string array is both equal value and equal type as the regular expression: /[aeiou]/i. Characters are not the same type as regular expressions, so such a comparison will evaluate to false.
Think of the regular expression as a tool that can be used to search an entire string array for a match (all of str, not just str[0]). The web has a bunch of great examples, but to get you started, you might try using str.search(regexp) which will return the index of the first match (if found) or -1 (if no match).
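As a quick illustration of what search() returns, here is a small sketch with a few made-up inputs:

```javascript
var vowel = /[aeiou]/i;

console.log("overjoyed".search(vowel)); // 0  (first letter is a vowel)
console.log("glove".search(vowel));     // 2  (first vowel is the third letter)
console.log("rhythm".search(vowel));    // -1 (no match at all)
```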
Your code then becomes (without too much deviation from the original, and without trying to be clever or optimal):
function translate(str) {
var tag = "";
var pos = str.search(/[aeiou]/i); // This is ONE way to use regular expressions.
if (pos == 0) { // First letter is a vowel.
tag = "way";
return str + tag;
} else if (pos > 0) { // Some letter (not the first) is a vowel.
// Instead of the loop checking each element, we already know where
// the match is found: at position = pos.
console.log(pos); // Log the match position of the first vowel.
tag = str.slice(0, pos); // The string before the first vowel.
return str.slice(pos) + tag + "ay";
}
}
This works
function pig(str) {
if (/^[aeiou]/i.test(str)) {
return str + 'way';
}
else {
return str.replace(/^([^aeiou]+)([aeiou].*)$/i, "$2$1ay");
}
}
console.log(pig('overjoyed'));
console.log(pig('glove'));
I know this is old, but I just recently did something similar in Ruby and thought I'd rewrite it in JavaScript and supply it as an answer.
function translate( words )
{
return words.split(' ').map(function(word){
return word.split(/\b([^a,e,i,o,u]{0,}qu|[^a,e,i,o,u]+)?([a,e,i,o,u][a-z]+)?/i).reverse().join('') + 'ay'
}).join(' ')
}
So in mine I take a string words that can be a single word or a full sentence.
I split it up into an array of words by splitting on spaces.
I then used map to run code on each word in that array, in there I have it split the word using my regex (which I'll explain at the end) which splits it into the first sound if it is anything other than a vowel sound and the rest of the word.
On the word quiet that split actually results in the following array: ["","qu","iet",""], and on the word aqua it results in ["",undefined,"aqua",""].
I can ignore those undefined and empty strings though because when we join it back together they get ignored.
So after it is split up I reverse the array and then join it back together as a word (joining it using an empty string '') and then tack on 'ay' to the end of the resulting string.
Now to explain the regex:
\b says we are looking for the start of a word, it could alternatively be ^ for the start of the string.
([^a,e,i,o,u]{0,}qu|[^a,e,i,o,u]+)? is an optional capture group looking for that first consonant sound, let's further break it up.
So within it we have two alternatives we are looking for, either [^a,e,i,o,u]{0,}qu or [^a,e,i,o,u]+, the first one checks for the first sound containing qu with or without preceding non-vowel characters (so the qu from quiet, or the squ from square get matched instead of stopping before the u), and the second one is checking for all non-vowel letters before the first vowel in the case that there is no qu at the start.
Now that final part of it ([a,e,i,o,u][a-z]+)? is just grabbing from that first vowel on as the rest of that match
I hope someone somewhere finds this useful :)
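To see the intermediate arrays described above, the pattern can be pulled out into a variable and inspected. This is only a sketch: soundSplit, splitWord and toPigLatin are hypothetical names, while the pattern itself is copied unchanged from the answer.

```javascript
// Same pattern as in the answer above, stored in a variable for inspection
var soundSplit = /\b([^a,e,i,o,u]{0,}qu|[^a,e,i,o,u]+)?([a,e,i,o,u][a-z]+)?/i;

function splitWord(word) {
  return word.split(soundSplit);
}

function toPigLatin(word) {
  // join('') treats undefined entries as empty strings
  return splitWord(word).reverse().join('') + 'ay';
}

console.log(splitWord('quiet'));   // ["", "qu", "iet", ""]
console.log(toPigLatin('quiet'));  // "ietquay"
console.log(toPigLatin('square')); // "aresquay"
console.log(toPigLatin('glove'));  // "oveglay"
```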
function translatePigLatin(str) {
if(["a","e","i","o","u"].indexOf(str[0])!==-1){
str+="way";
}else{
while(["a","e","i","o","u"].indexOf(str[0])==-1){
str+=str[0];
str=str.slice(1);
}
str+="ay";
}
return str;
}
translatePigLatin("glove");
This works:
1. Check the string's first letter. If it is a vowel, simply append "way" to the word.
2. If the first letter is not a vowel, move it to the end of the string, and repeat until the first letter is a vowel.
3. Once the string has been rotated this way, append "ay" to the end.

Shared part in RegEx matched string

In following code:
"a sasas b".match(/sas/g) //returns ["sas"]
The string actually includes two sas strings, a [sas]as b and a sa[sas] b.
How can I modify RegEx to match both?
Another example:
"aaaa".match(/aa/g); //actually include [aa]aa,a[aa]a,aa[aa]
Please consider the issue in general not just above instances.
A pure RegEx solution is preferred.
If you want to match at least one such "merged" occurrence, then you could do something like:
"a sasas b".match(/s(as)+/g)
If you want to retrieve the matches as separate results, then you have a bit more work to do; this is not a case that regular expressions are designed to handle. The basic algorithm would be:
Attempt a match. If it was unsuccessful, stop.
Extract the match you are interested in and do whatever you want with it.
Take the substring of the original target string, starting from one character following the first character in your match.
Start over, using this substring as the new input.
(To be more efficient, you could match with an offset instead of using substrings; that technique is discussed in this question.)
For example, you would start with "a sasas b". After the first match, you have "sas". Taking the substring that starts one character after the match starts, we would have "asas b". The next match would find the "sas" here, and you would again repeat the process with "as b". This would fail to match, so you would be done.
This significantly-improved answer owes itself to @EliGassert.
String.prototype.match_overlap = function(re)
{
if (!re.global)
re = new RegExp(re.source,
'g' + (re.ignoreCase ? 'i' : '')
+ (re.multiline ? 'm' : ''));
var matches = [];
var result;
while (result = re.exec(this))
matches.push(result),
re.lastIndex = result.index + 1;
return matches.length ? matches : null;
}
@EliGassert points out that there is no need to walk through the entire string character by character; instead we can find a match anywhere (i.e. do without the anchor), and then continue one character after the index of the found match. While researching how to retrieve said index, I found that the re.lastIndex property, used by exec to keep track of where it should continue its search, is in fact settable! This works rather nicely with what we intend to do.
The only bit needing further explanation might be the beginning. In the absence of the g flag, exec may never return null (always returning its one match, if it exists), thus possibly going into an infinite loop. Since, however, match_overlap by design seeks multiple matches, we can safely recompile any non-global RegExp as a global RegExp, importing the i and m options as well if set.
Here is a new jsFiddle: http://jsfiddle.net/acheong87/h5MR5/.
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
Output:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
var match = "a sasas b".match(/s(?=as)/g);
for(var i =0; i != match.length; ++i)
alert(match[i]);
Going off of the comment by Q. Sheets and the response by cdhowie, I came up with the above solution: it consumes ONE character in the regular expression and does a lookahead for the rest of the match string. With these two pieces, you can construct all the positions and matching strings in your regular expression.
I wish there was an "inspect but don't consume" operator that you could use to actually include the rest of the matching (lookahead) string in the results, but there unfortunately isn't -- at least not in JS.
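One possible workaround, sketched here under the assumption that String.prototype.matchAll (ES2020) is available: a capture group placed inside the lookahead records the text without consuming it, so each zero-width match still carries the full overlapping string. overlappingMatches is a hypothetical helper name.

```javascript
// Hypothetical helper: wrap the pattern source in a lookahead with a capture group
function overlappingMatches(text, source) {
  var re = new RegExp("(?=(" + source + "))", "g");
  // Each match is zero-width; the text of interest is in capture group 1
  return Array.from(text.matchAll(re), function (m) {
    return m[1];
  });
}

console.log(overlappingMatches("a sasas b", "sas")); // ["sas", "sas"]
console.log(overlappingMatches("aaaa", "aa"));       // ["aa", "aa", "aa"]
```

Each match object also exposes m.index, so the positions can be recovered the same way.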
Here's a generic way to do it:
String.prototype.match_overlap = function(regexp)
{
regexp = regexp.toString().replace(/^\/|\/$/g, '');
var re = new RegExp('^' + regexp);
var matches = [];
var result;
for (var i = 0; i < this.length; i++)
if (result = re.exec(this.substr(i)))
matches.push(result);
return matches.length ? matches : null;
}
Usage:
var results = 'sasas'.match_overlap(/sas/);
Returns:
An array of (overlapping) matches, or null.
Example:
Here's a jsFiddle in which this:
document.write("<pre>");
document.write('sasas'.match_overlap(/sas/));
document.write("\n");
document.write('aaaa'.match_overlap(/aa/));
document.write("\n");
document.write('my1name2is3pilchard'.match_overlap(/[a-z]{2}[0-9][a-z]{2}/));
document.write("</pre>");
returns this:
sas,sas
aa,aa,aa
my1na,me2is,is3pi
Explanation:
To explain a little bit, we intend for the user to pass a RegExp object to this new function, match_overlap, as he or she would do normally with match. From this we want to create a new RegExp object anchored at the beginning (to prevent duplicate overlapped matches—this part probably won't make sense unless you encounter the issue yourself—don't worry about it). Then, we simply match against each substring of the subject string this and push the results to an array, which is returned if non-empty (otherwise returning null). Note that if the user passes in an expression that is already anchored, this is inherently wrong—at first I stripped anchors out, but then I realized I was making an assumption in the user's stead, which we should avoid. Finally one could go further and somehow merge the resulting array of matches into a single match result resembling what would normally occur with the //g option; and one could go even further and make up a new flag, e.g. //o that gets parsed to do overlap-matching, but this is getting a little crazy.

remove all but a specific portion of a string in javascript

I am writing a little app for Sharepoint. I am trying to extract some text from the middle of a field that is returned:
var ows_MetaInfo="1;#Subject:SW|NameOfADocument
vti_parservers:SR|23.0.0.6421
ContentTypeID:SW|0x0101001DB26Cf25E4F31488B7333256A77D2CA
vti_cachedtitle:SR|NameOfADocument
vti_title:SR|ATitleOfADocument
_Author:SW:|TheNameOfOurCompany
_Category:SW|
ContentType:SW|Document
vti_author::SR|mrwienerdog
_Comments:SW|This is very much the string I need extracted
vti_categories:VW|
vtiapprovallevel:SR|
vti_modifiedby:SR|mrwienerdog
vti_assignedto:SR|
Keywords:SW|Project Name
ContentType _Comments"
So......All I want returned is "This is very much the string I need extracted"
Do I need a regex and a string replace? How would you write the regex?
Yes, you can use a regular expression for this (this is the sort of thing they are good for). Assuming you always want the string after the pipe (|) on the line starting with "_Comments:SW|", here's how you can extract it:
var matchresult = ows_MetaInfo.match(/^_Comments:SW\|(.*)$/m);
var comment = (matchresult==null) ? "" : matchresult[1];
Note that the .match() method of the String object returns an array. The first (index 0) element will be the entire match (here, the entire match is the whole line, as we anchored it with ^ and $; note that adding the "m" after the regex makes this a multiline regex, allowing us to match the start and end of any line within the multi-line input), and the rest of the array are the submatches that we capture using parentheses. Above we've captured the part of the line that you want, so that will be present in the second item in the array (index 1).
If there is no match ("_Comments:SW|" doesn't appear in ows_MetaInfo), then .match() will return null, which is why we test it before pulling out the comment.
If you need to adjust the regex for other scenarios, have a look at the Regex docs on Mozilla Dev Network: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
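To make the shape of the result concrete, here is a small sketch run against a shortened, made-up sample in the same format as ows_MetaInfo (the variable sampleInfo is an assumption for illustration):

```javascript
// Shortened, made-up sample in the same shape as ows_MetaInfo
var sampleInfo = [
  "vti_title:SR|ATitleOfADocument",
  "_Comments:SW|This is very much the string I need extracted",
  "vti_categories:VW|"
].join("\n");

var result = sampleInfo.match(/^_Comments:SW\|(.*)$/m);
console.log(result[0]); // the whole matched line
console.log(result[1]); // "This is very much the string I need extracted"

// When the field is missing, match() returns null
console.log("no comments here".match(/^_Comments:SW\|(.*)$/m)); // null
```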
You can use this code:
var match = ows_MetaInfo.match(/_Comments:SW\|([^\n]+)/);
if (match)
document.writeln(match[1]);
I'm far from competent with RegEx, so here is my RegEx-less solution. See comments for further detail.
var extractedText = ExtractText(ows_MetaInfo);
function ExtractText(arg) {
    // Use the pipe delimiter to turn the string into an array
    var aryValues = arg.split("|");
    // The comment text ends up in the same piece as the next field name, "vti_categories:VW"
    for (var i = 0; i < aryValues.length; i++) {
        if (aryValues[i].search("vti_categories:VW") != -1)
            // Strip the field name and surrounding whitespace, leaving only the comment
            return aryValues[i].replace("vti_categories:VW", "").trim();
    }
    return null;
}
Here's a working fiddle to demonstrate.
