I have my working example here
https://stackblitz.com/edit/angular-fk2vpr?embed=1&file=src/app/app.component.ts
Here I wrote a condition so that any word which starts with { and ends with } gets highlighted and becomes editable. It works fine for a word with no spaces, like {username}, but in the case of { host} that word is not picked up for highlighting. Also, for something like {pas}!!
only the {pas} part should be highlighted, but it is being taken together with the !!.
Can anyone help me see what mistake I made in this scenario?
Thanks in advance
Try the code below; it should work.
const editable = data.match(/{+.*?}+/gi);

if (editable) {
  editable.forEach(word => {
    // replace every occurrence of the token with a space-padded,
    // whitespace-free version so it survives splitting on spaces
    const re = new RegExp(word, 'g');
    data = data.replace(re, ' ' + word.replace(/\s/gi, '') + ' ');
  });
}
As per the code above, a token like {username}!! will not work, because we are splitting on spaces. To resolve that problem we can add an extra space before and after the token. Use the line below so it works in all cases:
data = data.replace(word, ' ' + word.replace(/\s/g,'') + ' ')
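For illustration, here is a minimal sketch of how that replacement behaves on the cases from the question (the sample string below is made up):

let data = 'hello {username} from { host} {pas}!!';

// grab every {...} token, including ones that contain spaces
const editable = data.match(/{+.*?}+/gi);

if (editable) {
  editable.forEach(word => {
    // strip inner whitespace and pad with spaces so the token stays
    // a separate "word" when the text is later split on spaces
    data = data.replace(word, ' ' + word.replace(/\s/g, '') + ' ');
  });
}

console.log(data);
// → "hello  {username}  from  {host}   {pas} !!"
// { host} is normalized to {host}, and {pas} is now separated from !!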
The problem lies with mapping the result of data.split(' '). Notice that when you call that on a string like username { username} scrypt secret {password} you will not get an array with 5 elements, as you would like (i.e. ['username', '{ username}', 'scrypt', 'secret', '{password}']); it will split on every space character, so the resulting array looks more like this:
['username', '', '{', '', '', 'username}', 'scrypt', 'secret', '{password}']
If you can, I would advise making arr an array of arrays, each subarray consisting of the words you want to run the regex over. If that is not an option, however, I would suggest not splitting like that, but mapping like this:
this.mappedArr = this.arr.map(data => {
  const editable = data.match('([{].*?[}])');
  console.log(editable);
  return {
    editable,
    data: !editable
      ? data
      : data.match(/(\{+ *)?\w+( *\}+)?/g)
            .map(word => ({ word, editable: word.match('^[{].*?[}]$') }))
  };
});
This begs for refactoring, but it should work as an emergency fix.
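To make the output shape concrete, this is roughly what mappedArr would contain for the single entry 'username { username} scrypt secret {password}' (an illustrative sketch, not an exact console dump; editable holds the result of String.prototype.match, i.e. a match array or null):

[
  {
    editable: ['{ username}', /* ... */],
    data: [
      { word: 'username',    editable: null },
      { word: '{ username}', editable: ['{ username}', /* ... */] },
      { word: 'scrypt',      editable: null },
      { word: 'secret',      editable: null },
      { word: '{password}',  editable: ['{password}', /* ... */] }
    ]
  }
]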
Related
I have an array that I need to remove spaces from. For example, it returns something like
[book, row boat,rain coat]
However, I would like to remove all the white spaces.
All the guides I saw online said to use .replace, but it seems like that only works for strings. Here is my code so far.
function trimArray(wordlist)
{
    for (var i = 0; i < wordlist.length; i++)
    {
        wordlist[i] = wordlist.replace(/\s+/, "");
    }
}
I have also tried replace(/\s/g, '');
Any help is greatly appreciated!
First and foremost, you need to enclose the words in your array in quotes, which will make them strings; otherwise, in your loop you'll get an error that they're undefined variables. Alternatively, this could be achieved in a more terse manner using map(), as seen below:
const arr = ['book', 'row boat', 'rain coat'].map(str => str.replace(/\s/g, ''));
console.log(arr);
This will remove all of the spaces, even those within the text:
const result = [' book',' row boat ','rain coat '].map(str => str.replace(/\s/g, ''));
console.log(result);
and this will only remove leading and trailing spaces:
const result = [' book',' row boat ','rain coat '].map(str => str.trim());
console.log(result);
I don't even know how to properly title this question. I've been trying to make something, but I failed. I think it's best to show a few examples below of what I want to accomplish.
// Let's say I have a list of some tags/slugs.
$subjects = [
    'this-is-one',
    'might-be-two',
    'yessir',
    'indeednodash',
    'but-it-might'
];

$patterns = [
    'this-is-one',                 // should match $subjects[0]
    'mightbetwoorthree',           // should match $subjects[1]
    'yes-sir',                     // should match $subjects[2]
    'indeednodash',                // should match $subjects[3]
    'but-it-might-be-long-as-well' // should match $subjects[4]
];
So, as one might see, some of the patterns do not fully/exactly match the given subject. That's my problem. I want to make a regex that would match all those possible variations.
I tried something basic within a foreach loop, but of course it won't work, as it's not a full match...
if (preg_match("/\b$pattern\b/", $subject)) { // ... }
Any suggestions, explanations and code samples are welcome. I am trying to wrap my mind around regex, but it's not going well.
I will tag JS as well, because this doesn't necessarily have anything to do with PHP or preg_match.
function getMatchesOf(pattern, subjects) {
  var result = [];
  // compare with everything except lowercase letters (e.g. dashes) stripped
  pattern = pattern.replace(/[^a-z]/g, '');
  subjects.forEach(function(subject) {
    var _subject = subject.replace(/[^a-z]/g, '');
    if (pattern.includes(_subject))
      result.push(subject);
  });
  return result;
}
var subjects = [
  'this-is-one',
  'might-be-two',
  'yessir',
  'indeednodash',
  'but-it-might'
];

var patterns = [
  'this-is-one',
  'mightbe',
  'yes-sir',
  'indeednodash',
  'but-it-might-be-long-as-well'
];
console.log(patterns[0] + " matches: ", getMatchesOf(patterns[0], subjects));
console.log(patterns[4] + " matches: ", getMatchesOf(patterns[4], subjects));
Issue
I need to check whether each word of a string is spelled correctly by searching a MongoDB collection for each word.
I want to do a minimal number of DB queries.
The first word of each sentence is upper case, but that word could be stored upper or lower case in the dictionary. So I need a case-sensitive match for each word; only the first word of each sentence should be matched case-insensitively.
Sample string
This is a simple example. Example. This is another example.
Dictionary structure
Assume there is a dictionary collection like this
{ word: 'this' },
{ word: 'is' },
{ word: 'a' },
{ word: 'example' },
{ word: 'Name' }
In my case there are 100,000 words in this dictionary. Of course, names are stored in upper case, verbs in lower case, and so on...
Expected result
The words simple and another should be recognized as 'misspelled' words, as they do not exist in the DB.
An array with all existing words should, in this case, be ['This', 'is', 'a', 'example']. 'This' is upper case because it is the first word of a sentence; in the DB it is stored as the lower case 'this'.
My attempt so far (Updated)
const sentences = string.replace(/([.?!])\s*(?= [A-Z])/g, '$1|').split('|');

let search = [],
    words = [],
    existing,
    missing;

sentences.forEach(sentence => {
  const w = sentence.trim().replace(/[^a-zA-Z0-9äöüÄÖÜß ]/gi, '').split(' ');
  w.forEach((word, index) => {
    // the first word of a sentence is matched case-insensitively
    const regex = new RegExp(['^', word, '$'].join(''), index === 0 ? 'i' : '');
    search.push(regex);
    words.push(word);
  });
});

existing = Dictionary.find({
  word: { $in: search }
}).map(obj => obj.word);

missing = _.difference(words, existing);
Problem
The case-insensitive matches don't work properly: /^Example$/i will give me a result, but existing will contain the original lowercase example, which means Example ends up in the missing array. So the case-insensitive search works as expected, but the two result arrays don't line up. I don't know how to solve this.
Is it possible to optimize the code? I'm using two forEach loops and a difference...
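One idea I had (just a rough sketch on my side, untested and not part of the code above) is to remember which words were sentence-initial and relax the case comparison only for those, instead of relying on a plain _.difference:

// Sketch: track sentence-initial words and compare them case-insensitively.
const entries = [];   // { word, firstOfSentence }

sentences.forEach(sentence => {
  const w = sentence.trim().replace(/[^a-zA-Z0-9äöüÄÖÜß ]/gi, '').split(' ');
  w.forEach((word, index) => {
    entries.push({ word, firstOfSentence: index === 0 });
    search.push(new RegExp('^' + word + '$', index === 0 ? 'i' : ''));
  });
});

existing = Dictionary.find({ word: { $in: search } }).map(obj => obj.word);

missing = entries
  .filter(({ word, firstOfSentence }) =>
    !existing.some(e => firstOfSentence
      ? e.toLowerCase() === word.toLowerCase()
      : e === word))
  .map(e => e.word);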
This is how I would face this issue:
Use a regex to get each word (including '.') into an array:
var words = para.match(/(.+?)(\b)/g); // this expression is not perfect, but it will work
Now put all the words from your collection into an array using find(). Let's say the name of that array is wordsOfColl.
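For example (a minimal sketch, assuming the Meteor-style Dictionary collection from the question, where the cursor has a .map method):

// Pull only the 'word' field and flatten it into a plain array of strings.
const wordsOfColl = Dictionary.find({}, { fields: { word: 1 } })
                              .map(doc => doc.word);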
Now check whether the words are the way you want or not:
var prevWord = "";  // to check the first word of a sentence
words.forEach(function(word) {
  // case-insensitive membership check against the dictionary words
  if (wordsOfColl.map(function(w) { return w.toLowerCase(); }).indexOf(word.toLowerCase()) !== -1) {
    if (prevWord.replace(/\s/g, '') === '.') {
      // this is the first word of a sentence
      if (word[0] !== word[0].toUpperCase()) {
        // not capital, so generate error
      }
    }
    prevWord = word;
  } else {
    // not in collection, generate error
  }
});
I haven't tested it, so please let me know in the comments if there's an issue, or a requirement of yours that I missed.
Update
As the author of the question mentioned that he doesn't want to load the whole collection on the client, you can create a method on the server which returns an array of words, instead of giving the client access to the collection.
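For instance (a rough sketch, assuming a Meteor app as the question's code suggests; the method name dictionaryWords is made up):

// server: expose only the list of words, not the collection itself
Meteor.methods({
  dictionaryWords() {
    return Dictionary.find({}, { fields: { word: 1 } }).map(doc => doc.word);
  }
});

// client: fetch the words once and run the spell check against them
Meteor.call('dictionaryWords', (err, wordsOfColl) => {
  if (!err) {
    // ... use wordsOfColl as in the snippet above ...
  }
});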
I'm scratching my head over a CSV file I cannot parse correctly due to many errors. I extracted a sample you can download here: Test CSV File
Main errors (or what generated an error) are:
Quotes & commas (many errors when trying to parse the file with R)
Empty rows
Unexpected line break inside a field
I first decided to use regular expressions line by line to clean the data before loading it into R, but I couldn't solve the problem and it was too slow (200 MB file).
So I decided to use a CSV parser under Node.js with the following code:
'use strict';
const Fs = require('fs');
const Csv = require('csv');
let input = 'data_stack.csv';
let readStream = Fs.createReadStream(input);
let option = {delimiter: ',', quote: '"', escape: '"', relax: true};
let parser = Csv.parse(option).on('data', (data) => {
  console.log(data);
});
readStream.pipe(parser)
But:
Some rows are parsed correctly (array of strings)
Some rows are not parsed (all fields end up as one string)
Some rows are still empty (this can be solved by adding skip_empty_lines: true to the options)
I don't know how to handle the unexpected line break.
I don't know how to make this CSV clean, neither with R nor with Node.js.
Any help?
EDIT:
Following @Danny_ds's solution, I can parse it correctly. Now I cannot stringify it back correctly.
With console.log() I get a proper object, but when I try to stringify it I don't get a clean CSV (it still has line breaks and empty rows).
Here is the code I'm using:
'use strict';
const Fs = require('fs');
const Csv = require('csv');
let input = 'data_stack.csv';
let output = 'data_output.csv';
let readStream = Fs.createReadStream(input);
let writeStream = Fs.createWriteStream(output);
let opt = {delimiter: ',', quote: '"', escape: '"', relax: true, skip_empty_lines: true};
let transformer = Csv.transform(data => {
  let dirty = data.toString();
  let replace = dirty.replace(/\r\n"/g, '\r\n').replace(/"\r\n/g, '\r\n').replace(/""/g, '"');
  return replace;
});
let parser = Csv.parse(opt);
let stringifier = Csv.stringify();
readStream.pipe(transformer).pipe(parser).pipe(stringifier).pipe(writeStream);
EDIT 2:
Here is the final code that works:
'use strict';
const Fs = require('fs');
const Csv = require('csv');
let input = 'data_stack.csv';
let output = 'data_output.csv';
let readStream = Fs.createReadStream(input);
let writeStream = Fs.createWriteStream(output);
let opt = {delimiter: ',', quote: '"', escape: '"', relax: true, skip_empty_lines: true};
let transformer = Csv.transform(data => {
  let dirty = data.toString();
  let replace = dirty
    .replace(/\r\n"/g, '\r\n')
    .replace(/"\r\n/g, '\r\n')
    .replace(/""/g, '"');
  return replace;
});

let parser = Csv.parse(opt);

let cleaner = Csv.transform(data => {
  let clean = data.map(l => {
    if (l.length > 100 || l[0] === '+') {
      return "Encoding issue";
    }
    return l;
  });
  return clean;
});
let stringifier = Csv.stringify();
readStream.pipe(transformer).pipe(parser).pipe(cleaner).pipe(stringifier).pipe(writeStream);
Thanks to everyone!
I don't know how to make this CSV clean, neither with R nor with Node.js.
Actually, it is not as bad as it looks.
This file can easily be converted to a valid csv using the following steps:
replace all "" with ".
replace all \n" with \n.
replace all "\n with \n.
With \n meaning a newline, not the characters "\n" which also appear in your file.
Note that in your example file \n is actually \r\n (0x0d, 0x0a), so depending on the software you use you may need to replace \n with \r\n in the examples above. Also, in your example there is a newline after the last row, so a quote as the last character will be replaced too, but you might want to check this in the original file.
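As a minimal standalone sketch of those replacements (the same regexes the asker ended up using above; assumes \r\n line endings, that the file fits in memory, and an arbitrary output file name):

const fs = require('fs');

const raw = fs.readFileSync('data_stack.csv', 'utf8');

const fixed = raw
  .replace(/\r\n"/g, '\r\n')   // \n"  ->  \n
  .replace(/"\r\n/g, '\r\n')   // "\n  ->  \n
  .replace(/""/g, '"');        // ""   ->  "

fs.writeFileSync('data_clean.csv', fixed);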
This should produce a valid csv file. There will still be multiline fields, but that was probably intended. Now those fields are properly quoted, and any decent csv parser should be able to handle multiline fields.
It looks like the original data has had an extra pass for escaping quote characters:
If the original fields contained a comma they were quoted, and if those fields already contained quotes, the quotes were escaped with another quote - which is the right way to do it.
But then all rows containing a quote seem to have been quoted again (actually converting those rows to one quoted field), and all the quotes inside that row were escaped with another quote.
Obviously, something went wrong with the multiline fields. Quotes were added between the multiple lines too, which is not the right way to do it.
The data is not too messed up to work with. There is a clear pattern.
General steps:
1. Temporarily remove the mixed-format inner fields (the ones beginning with two or more quotes and containing all kinds of characters).
2. Remove the quotes from the start and end of quoted lines, giving clean CSV.
3. Split the data into columns.
4. Put the removed fields back.
Step 1 above is the most important. If you apply it, the problems with newlines, empty rows, quotes and commas disappear. If you look at the data you can see that columns 7, 8 and 9 contain mixed data, but it is always delimited by two quotes or more, e.g.:
good,clean,data,here,"""<-BEGINNING OF FIELD DATA> Oh no
++\n\n<br/>whats happening,, in here, pages of chinese
characters etc END OF FIELD ->""",more,clean,data
Here is a working example based on the file provided:
const fs = require('fs');

fs.readFile('./data_stack.csv', (e, data) => {
  // Take out fields that are delimited with double+ quotes
  var dirty = data.toString();
  var matches = dirty.match(/""[\s\S]*?""/g);
  matches.forEach((m, i) => {
    dirty = dirty.replace(m, "<REPL-" + i + ">");
  });

  var cleanData = dirty
    .split('\n')                        // get lines
    .filter((l, i) => i > 0)            // ignore first line with column names
    // remove first and last quotation mark from quoted lines, if present
    .map(l => l[0] === '"' ? l.substring(1, l.length - 2) : l)
    // split into columns
    .map(l => l.split(','))
    // put the replaced fields back into the data (columns 7, 8 and 9)
    .map(col => {
      if (col.length > 9) {
        col[7] = returnField(col[7]);
        col[8] = returnField(col[8]);
        col[9] = returnField(col[9]);
      }
      return col;

      function returnField(f) {
        if (f) {
          var repls = f.match(/<.*?>/g);
          if (repls)
            repls.forEach(m => {
              var num = +m.split('-')[1].split('>')[0];
              f = f.replace(m, matches[num]);
            });
        }
        return f;
      }
    });

  return cleanData;
});
Result:
Data looks pretty clean. All rows produce the expected number of columns matching the header (last 2 rows shown):
...,
[ '19403',
'560e348d2adaffa66f72bfc9',
'done',
'276',
'2015-10-02T07:38:53.172Z',
'20151002',
'560e31f69cd6d5059668ee16',
'""560e336ef3214201030bf7b5""',
'a+�a��a+�a+�a��a+�a��a+�a��',
'',
'560e2e362adaffa66f72bd99',
'55f8f041b971644d7d861502',
'foo',
'foo',
'foo#bar.com',
'bar.com' ],
[ '20388',
'560ce1a467cf15ab2cf03482',
'update',
'231',
'2015-10-01T07:32:52.077Z',
'20151001',
'560ce1387494620118c1617a',
'""""""Final test, with a comma""""""',
'',
'',
'55e6dff9b45b14570417a908',
'55e6e00fb45b14570417a92f',
'foo',
'foo',
'foo#bar.com',
'bar.com' ],
Following on from my comment:
The data is too messed up to fix in one step, don't try.
Firstly, decide whether double quotes and/or commas might be part of the data. If they are not, remove the double quotes with a simple regex.
Next, there should be 14 commas on each line. Read the file as text and count the number of commas on each line in turn. Where there are fewer than 14, check the following line, and if the sum of the commas is 14, merge the two lines. If the sum is less than 14, check the next line and continue until you have 14 commas. If the next line takes you over 14 there is a serious error, so make a note of the line numbers - you will probably have to fix those by hand. Save the resulting file.
With luck, you will now have a file that can be processed as a CSV. If not, come back with the partially tidied file and we can try to help further.
It should go without saying that you should process a copy of the original, you are unlikely to get it right first time :)
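A rough sketch of that comma-counting approach in Node.js (assuming 14 commas per logical record, no commas left inside the data once the quotes are stripped, and placeholder file names):

const fs = require('fs');

// work on a copy of the original, with all double quotes stripped first
const text = fs.readFileSync('data_copy.csv', 'utf8').replace(/"/g, '');
const lines = text.split(/\r?\n/);

const records = [];
const suspect = [];   // line numbers to fix by hand
let buffer = '';

lines.forEach((line, i) => {
  buffer = buffer === '' ? line : buffer + ' ' + line;
  const commas = (buffer.match(/,/g) || []).length;

  if (commas === 14) {            // complete record
    records.push(buffer);
    buffer = '';
  } else if (commas > 14) {       // overshot: something is seriously wrong here
    suspect.push(i + 1);
    buffer = '';
  }
  // fewer than 14 commas: keep merging with the next line
  // (anything left in buffer at the very end is also worth checking by hand)
});

fs.writeFileSync('data_merged.csv', records.join('\n'));
console.log('Lines to check by hand:', suspect);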
How can I match only numerical email addresses?
I'm testing against the following email addresses:
12345839223@gmail.com <-- want this
38482934934@gmail.com <-- want this
abcaasd@gmail.com <-- don't want this
asdasd123@gmail.com <-- don't want this
123asdasd@gmail.com <-- don't want this
I tried the following regex, but it matches some addresses with letters.
([0-9])+(@+)
The regex /^\d+(?=@)/ will achieve this for you. It looks for the start of the line, followed by one or more digits, followed by an "@" symbol.
Here's a RegEx101 test case for reference
var emails = [
  '12345839223@gmail.com',
  '38482934934@gmail.com',
  'abcaasd@gmail.com',
  'asdasd123@gmail.com',
  '123asdasd@gmail.com'
];

function emailNum(email) {
  // return the match if it exists, or false
  return (/^\d+(?=@)/.exec(email) || [false])[0];
}

for (var i in emails) document.write(emails[i] + ': ' + emailNum(emails[i]) + '<br>');
In JavaScript, you could have a function like this:
function isNumberEmail(email) {
  return /^\d+@.*\./.test(email)
}

emailsToTest = ["12345839223@gmail.com",
                "38482934934@gmail.com",
                "abcaasd@gmail.com",
                "asdasd123@gmail.com",
                "123asdasd@gmail.com"]

emailsToTest.forEach(function(email) {
  document.write(email + " - " + isNumberEmail(email))
  document.write("<br>")
})
You can use the following to test if the first part of the email is a number:
function test(val) {
  var first = val.match(/^([^@]+)/g)[0];
  return /^\d+$/g.test(first);
}

console.log(test('12345@email.com'));
console.log(test('12345678@email.com'));
console.log(test('abc12345@email.com'));
console.log(test('12345abc@email.com'));
I think this will work; I wouldn't mind if someone could verify. It adds the @ to a capture group:
/([1-9][0-9]*)+(@)/g
Edit: ^\d+@ works, as per @dustmouse's comment.