Regex to replace single or multiple occurrence of hyphen - javascript

I have written the functions below to convert spaces to hyphens and back:
space to hyphen: str.trim().replace(/\s+/g, '-')
hyphen to space: str.replace(/\-/g,' ')
Now I am trying to replace a single hyphen with a double hyphen. I can't reuse the first function because it converts a run of one or more characters rather than a single one.
Is there any way to write a regex that does all three operations in a single expression?
convert forward slash to underscore: replace(/\//g, '_')
convert space to single hyphen
convert single hyphen to double hyphen
e.g.
regex 1 changes
"Name/Er-Gourav Mukhija" into "Name_Er--Gourav-Mukhija"
regex 2 does the reverse.

You could use a callback function instead of a replace string. That way you can specify and replace all characters at once.
const input = 'Name/Er-Gourav Mukhija';
const translate = {
'/': '_',
'-': '--',
' ': '-',
};
const reverse = {
'_': '/',
'--': '-',
'-': ' ',
};
// This is just a helper function that takes
// the input string, the regex and the object
// to translate snippets.
function replaceWithObject( input, regex, translationObj ) {
return input.replace( regex, function( match ) {
return translationObj[ match ] ? translationObj[ match ] : match;
} );
}
function convertString( input ) {
// Search for /, - and spaces
return replaceWithObject( input, /(\/|\-|\s)/g, translate );
}
function reverseConvertedString( input ) {
// Search for _, -- and - (the order here is very important!)
return replaceWithObject( input, /(_|\-\-|\-)/g, reverse );
}
const result = convertString( input );
console.log( result );
console.log( reverseConvertedString( result ) );

A regex with a plain replacement string cannot do conditional replacements (i.e. a->b, c->d) in one pass. I would instead write separate statements to replace "-" -> "--", " " -> "-" and "/" -> "_".
You can use your existing code for these operations. I would recommend using this site for building and testing regexes in the future.

Consider var str = "Name/Er-Gourav Mukhija"
To convert forward slash to underscore, as you mentioned, use replace(/\//g, '_')
To convert single hyphen to double hyphen, use replace(/\-/g, '--')
To convert space to single hyphen, use replace(/\s+/g, '-')
The hyphen replacement has to run before the space replacement; otherwise the hyphens created from spaces would be doubled as well.
All three can be combined into:
str.replace(/\//g, '_').replace(/\-/g, '--').replace(/\s+/g, '-')
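As a hedged sketch of why the order of chained replacements matters: the helpers `encode` and `decode` below are hypothetical names, not taken from the posts. `encode` doubles the existing hyphens before turning spaces into hyphens, and `decode` undoes the mapping in a single pass in which "--" is tried before "-".

```javascript
// Hypothetical helper names; a sketch under the assumptions stated above.
function encode(str) {
  return str
    .trim()
    .replace(/\//g, "_")   // slash -> underscore
    .replace(/-/g, "--")   // hyphen -> double hyphen (before spaces become hyphens)
    .replace(/\s+/g, "-"); // run of whitespace -> single hyphen
}

function decode(str) {
  const reverse = { "_": "/", "--": "-", "-": " " };
  // "--" is listed before "-" so a double hyphen is consumed as one unit.
  return str.replace(/_|--|-/g, (match) => reverse[match]);
}

console.log(encode("Name/Er-Gourav Mukhija"));  // Name_Er--Gourav-Mukhija
console.log(decode("Name_Er--Gourav-Mukhija")); // Name/Er-Gourav Mukhija
```

The round trip is only reliable when the input contains no underscores and no hyphen directly after a space; "a -b", for instance, encodes to "a---b", which decodes to "a- b".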

You should use a loop to do all at once:
str = str.split("");
var newStr = "";
str.forEach(function (curChar) {
switch(curChar) {
case " ":
newStr += "-";
break;
case "/":
newStr += "_";
break;
case "-":
newStr += "--";
break;
default:
newStr += curChar;
}
});
str = newStr;
Feel free to turn this into a function if you like. I also haven't made it do the reverse; for that you would swap the assignment strings with the case strings in the switch () statement, keeping in mind that "--" spans two characters, so the loop would have to look at the next character as well.
Chaining separate regex replacements is fragile here, because a later replacement can overwrite the output of an earlier one (a hyphen produced from a space gets doubled if the hyphen rule runs afterwards).

Related

how to convert visual hebrew to logical hebrew using vanilla javascript?

I've been trying to figure out how to convert visual hebrew to logical hebrew with vanilla JavaScript by just making a function that reverses the string given to it and then just replacing the brackets with the correct ones
e.g. ( to ) or ) to (.
Here is an explanation of the difference between the two:
https://www.w3.org/International/questions/qa-visual-vs-logical
When you reverse a string, the brackets stay the same, so for example when you have :) שלום the reversed string will be םולש :( and that's why I added the switch case.
What goes wrong is Hebrew combined with English. I'm using this function to print Hebrew correctly in Minecraft Education (which prints Hebrew flipped, so that the string starts from the end and ends at the start). As you can see in the image at https://imgur.com/a/zEwHCmA , the first line is supposed to be the correct one, but the second one is the one being printed.
Is there a chance it can be done with regex?
function myHebrewPrint(text: string ): void {
let reversed=""
for(let char of text)
{
switch(char)
{
case "(":
char = ")";
break;
case ")":
char = "(";
break;
case "[":
char = "]";
break;
case "]":
char = "[";
break;
case "{":
char = "}";
break;
case "}":
char = "{";
break;
default:
break;
}
reversed=char+reversed
}
console.log(reversed)
}
Edit: I'm leaving this in case someone wants to have an attempt at this.
If I understood the requirements correctly, I think I managed to have this working. I have some explanation as comments.
function visualToLogicalHebrew(text) {
const hasHebrew = new RegExp("^[\u0590-\u05FF]+$"); //pattern to check whether the word consists only of Hebrew letters
const arr = text.split(' '); //split the whole text by space, having an array full of words
let sentence = "";
for (const word of arr) {
let isHebrew = hasHebrew.test(word);
if (isHebrew) {
sentence = word.split("").reverse().join("") + " " + sentence; //if hebrew then reverse all letters, add space and the previous sentence
}
else {
sentence = sentence + word + " ";
}
}
return sentence;
}
let result = visualToLogicalHebrew(":) שלום :subnetשלום");
console.log(result);
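The bracket-swapping switch from the question can also be expressed as a lookup table plus one regex replace. The following is only a sketch of that single step; `MIRROR` and `reverseAndMirror` are hypothetical names, and it does not address the mixed Hebrew/English ordering problem.

```javascript
// Hypothetical names; mirrors each bracket after reversing the string.
const MIRROR = { "(": ")", ")": "(", "[": "]", "]": "[", "{": "}", "}": "{" };

function reverseAndMirror(text) {
  return Array.from(text)        // split by code point rather than UTF-16 unit
    .reverse()
    .join("")
    .replace(/[()[\]{}]/g, (bracket) => MIRROR[bracket]);
}

console.log(reverseAndMirror("(abc)"));   // (cba)
console.log(reverseAndMirror("a[b]{c}")); // {c}[b]a
```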
I'm attaching my solution, based on Costa's solution and adapted to the more complex needs of my case.
The first regex pattern matches a run of Hebrew/Arabic characters plus the next character (anything except a newline), repeated so that it captures entire verses.
The function inside replace converts the matched string to an array and reverses it.
The second regular expression tests whether the extra trailing character is NOT a Hebrew/Arabic letter (a semicolon, for example, or anything else that should not be reversed) and, if so, moves it back to the end using shift and push.
The join converts the array back to a string.
function reverseRTLchars(text){
//Hebrew U+0590 to U+05FF
//Arabic U+0600 to U+06FF
return text.replace(/([\u0590-\u06FF]+.)+/g, function(m){
m = m.split('').reverse();
if(/[^\u0590-\u06FF]/.test(m[0])){
m.push(m.shift())
}
return m.join('');
})
}
var string = `
םלוע םולש! םויה ריואה גזמ המ?
ש"פוסה דעו םויהמ: תיקלח ןנועמ דע ריהב
ב םוי' 28 ץרמ 2022 (ב"פשת)
`;
document.getElementById('conten').innerHTML = reverseRTLchars(string);
<pre id="conten" dir="rtl"></pre>

How to use regex expression to split str based off multiple characters in expression?

Sorry if the wording is bad. So I'm trying to find out how to pass in a string match of multiple characters long into my dynamic regex expression.
The regex in my else statement works with 1 character being passed in so I'm trying to do the same thing except with multiple characters being passed in the first if statement.
const delimiter = str.slice(0, str.indexOf('\n'));
const strLength = delimiter.length;
if (delimiter[0] === '[' && delimiter.charAt(strLength - 1) === ']') {
const customDelimiter = delimiter.slice(delimiter.indexOf(delimiter[1]), delimiter.indexOf(delimiter.charAt(strLength - 1)));
console.log(customDelimiter) // => '***'
const regex = new RegExp(`,|\\n|\\${customDelimiter}`,'g');
return strArr = str.split(regex).filter(Boolean);
} else {
const firstChar = str.slice(0, 1); // => '*'
const regex = new RegExp(`,|\\n|\\${firstChar}`,'g');
return strArr = str.split(regex).filter(Boolean);
}
So for example I want this string:
'[*]\n11***22***33' to equal 66 b/c it should split it into an array of [11, 22, 33] using the '*' delimiter. I get an error message saying: "SyntaxError: Invalid regular expression: /,|\n|***/: Nothing to repeat".
When you use * as the delimiter, the regex becomes ,|\n|\*, which is a valid regex.
It matches a ',' or a '\n' or a '*' character.
For your string, it matches the \n and every * in [***]\n11***22***33.
But when you use *** as the delimiter, the regex becomes ,|\n|\***, which is invalid. Only the first * is escaped, so two unescaped * are left at the end. * in regex means 0 or more of the preceding pattern, and you cannot have two of them in a row.
This is a special case because * has a special meaning in regex.
If you had used a character with no special meaning in regex, it would have worked.
A simpler solution would be to use javascript split function to easily get the desired result.
You could first split the string using \n.
let splitStr = str.split('\n');
// This would return ["[***]", "11***22***33"]
and then split the element at index 1 of splitStr using the delimiter.
splitStr[1].split('***');
// splitStr[1].split(customDelimiter)
// This would return ["11", "22", "33"]
Using this you wouldn't need to use if or else statement to separate out single character delimiter and multiple character delimiter.
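If the regex-based split is kept, the delimiter has to be escaped character by character before it is placed in the pattern. A minimal sketch, assuming the header line is either a bare delimiter or a delimiter wrapped in square brackets; `escapeRegExp` and `splitNumbers` are hypothetical helper names:

```javascript
// Escape every character that has a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: the first line holds the delimiter, the rest is the data.
function splitNumbers(str) {
  const newline = str.indexOf("\n");
  const header = str.slice(0, newline);
  const body = str.slice(newline + 1);
  const delimiter =
    header.startsWith("[") && header.endsWith("]") ? header.slice(1, -1) : header;
  const regex = new RegExp(`,|\\n|${escapeRegExp(delimiter)}`, "g");
  return body.split(regex).filter(Boolean);
}

console.log(splitNumbers("[***]\n11***22***33")); // [ '11', '22', '33' ]
console.log(splitNumbers("*\n1*2,3"));            // [ '1', '2', '3' ]
```

An empty delimiter is not handled here; it would make the pattern match the empty string and split the body into single characters.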

Convert comma-separated string to nested array, RegExp?

Got this type of string:
var myString = '23, 13, (#752, #141), $, ASD, (#113, #146)';
I need to split it to an array with comma as separator but also converts (..) to an array.
This is the result I want: [23, 13, ['#752', '#141'], '$', 'ASD', ['#113', '#146']];
I got huge data-sets so its very important to make it as fast as possible. What's the fastest way? Do some trick RegExp function or do it manually with finding indexes etc.?
Here's a jsbin: https://jsbin.com/cilakewecu/edit?js,console
Convert the parens to brackets, quote the strings, then use JSON.parse:
JSON.parse('[' +
str.
replace(/\(/g, '[').
replace(/\)/g, ']').
replace(/#\d+|\w+/g, function(m) { return isNaN(m) ? '"' + m + '"' : m; })
+ ']')
> [23,13,["#752","#141"],"ASD",["#113","#146"]]
You can use RegEx
/\(([^()]+)\)|([^,()\s]+)/g
RegEx Explanation:
The RegEx contain two parts. First, to capture anything that is inside the parenthesis. Second, capture simple values (string, numbers)
\(([^()]+)\): Match anything that is inside the parenthesis.
\(: Match ( literal.
([^()]+): Match anything except ( and ) one or more number of times and add the matches in the first captured group.
\): Match ) literal.
|: OR condition in RegEx
([^,()\s]+): Match any character except , (comma), parenthesis ( and ) and space one or more number of times and add the match in the second captured group
Demo:
var myString = '23, 13, (#752, #141), ASD, (#113, #146)',
arr = [],
regex = /\(([^()]+)\)|([^,()\s]+)/g,
match; // declared here to avoid creating an implicit global
// While the string satisfies regex
while ((match = regex.exec(myString)) !== null) {
// Check if the match is parenthesised string
// then
// split the string inside those parenthesis by comma and push it in array
// otherwise
// simply add the string in the array
arr.push(match[1] ? match[1].split(/\s*,\s*/) : match[2]);
}
console.log(arr);
document.body.innerHTML = '<pre>' + JSON.stringify(arr, 0, 4) + '</pre>'; // For demo purpose only
Just use the split method.
var str = '23, 13, (#752, #141), ASD, (#113, #146)',
newstr = str.replace(/\(/gi,'[').replace(/\)/gi,']'),
splitstr = newstr.split(',');
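For the "do it manually" option raised in the question, a single pass over the characters is one possibility. This is a sketch only, not a benchmarked solution; `parseNested` is a hypothetical name, it handles just one level of parentheses, and it leaves every value as a string (so 23 comes back as '23').

```javascript
// Hypothetical single-pass parser: commas separate values, (...) opens a sub-array.
function parseNested(input) {
  const result = [];
  let target = result; // array currently receiving values
  let token = "";

  const flush = () => {
    const value = token.trim();
    if (value !== "") target.push(value);
    token = "";
  };

  for (const ch of input) {
    if (ch === ",") {
      flush();
    } else if (ch === "(") {
      flush();
      const inner = [];
      result.push(inner);
      target = inner;
    } else if (ch === ")") {
      flush();
      target = result;
    } else {
      token += ch;
    }
  }
  flush();
  return result;
}

console.log(parseNested("23, 13, (#752, #141), $, ASD, (#113, #146)"));
// [ '23', '13', [ '#752', '#141' ], '$', 'ASD', [ '#113', '#146' ] ]
```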

Javascript convert PascalCase to underscore_case/snake_case

How can I convert PascalCase string into underscore_case/snake_case string? I need to convert dots into underscores as well.
eg. convert
TypeOfData.AlphaBeta
into
type_of_data_alpha_beta
You could try the below steps.
Capture all the uppercase letters and also match the preceding optional dot character.
Then convert the captured uppercase letters to lowercase and then return back to replace function with an _ as preceding character. This will be achieved by using anonymous function in the replacement part.
This would replace the starting uppercase letter to _ + lowercase_letter.
Finally removing the starting underscore will give you the desired output.
var s = 'TypeOfData.AlphaBeta';
console.log(s.replace(/(?:^|\.?)([A-Z])/g, function (x,y){return "_" + y.toLowerCase()}).replace(/^_/, ""));
OR
var s = 'TypeOfData.AlphaBeta';
alert(s.replace(/\.?([A-Z])/g, function (x,y){return "_" + y.toLowerCase()}).replace(/^_/, ""));
Is there any way to stop it when a whole word is in uppercase, e.g. MotorRPM into motor_rpm instead of motor_r_p_m, or BatteryAAA into battery_aaa instead of battery_a_a_a?
var s = 'MotorRPM';
alert(s.replace(/\.?([A-Z]+)/g, function (x,y){return "_" + y.toLowerCase()}).replace(/^_/, ""));
str.split(/\.?(?=[A-Z])/).join('_').toLowerCase();
u're welcome
var s1 = 'someTextHere';
var s2 = 'SomeTextHere';
var s3 = 'TypeOfData.AlphaBeta';
var o1 = s1.split(/\.?(?=[A-Z])/).join('_').toLowerCase();
var o2 = s2.split(/\.?(?=[A-Z])/).join('_').toLowerCase();
var o3 = s3.split(/\.?(?=[A-Z])/).join('_').toLowerCase();
console.log(o1);
console.log(o2);
console.log(o3);
Alternatively using lodash:
lodash.snakeCase(str);
Example:
_.snakeCase('TypeOfData.AlphaBeta');
// ➜ 'type_of_data_alpha_beta'
Lodash is a fine library that provides shortcuts for many everyday JS tasks. There are many other similar string manipulation functions, such as camelCase, kebabCase, etc.
This solution solves the non-trailing acronym issue with the solutions above
I ported the code in 1175208 from Python to JavaScript.
Javascript Code
function camelToSnakeCase(text) {
return text.replace(/(.)([A-Z][a-z]+)/g, '$1_$2').replace(/([a-z0-9])([A-Z])/g, '$1_$2').toLowerCase()
}
Working Examples:
camelToSnakeCase('thisISDifficult') -> this_is_difficult
camelToSnakeCase('thisISNT') -> this_isnt
camelToSnakeCase('somethingEasyLikeThis') -> something_easy_like_this
"alphaBetaGamma".replace(/([A-Z])/g, "_$1").toLowerCase() // alpha_beta_gamma
Problem - Need to convert a camel-case string ( such as a property name ) into underscore style to meet interface requirements or for meta-programming.
Explanation
This line uses capture groups, a feature of regular expressions that lets the replacement refer to a matched part (the first pair of () is $1, the second is $2, etc.).
Each match in the string gets an underscore placed ahead of it by the _$1 replacement string. At that point the string looks like alpha_Beta_Gamma.
To correct the capitalization, the entire string is converted with toLowerCase().
Since calling toLowerCase for every match adds overhead, it's best not to put it in the per-match handler and to run it once on the entire string instead.
After toLowerCase, the resulting string is alpha_beta_gamma (in this example).
This will get you pretty far: https://github.com/domchristie/humps
You will probably have to use regex replace to replace the "." with an underscore.
I found this, but I edited it to suit your question.
const camelToSnakeCase = str => str.replace(/[A-Z]/g, letter => `_${letter.toLowerCase()}`).replace(/^_/,'')
Good examples for JS (Snake Case, Kebab Case, Camel Case, Pascal Case) are available here.
function toCamelCase(s) {
// remove all characters that should not be in a variable name
// as well as underscores and numbers from the beginning of the string
s = s.replace(/([^a-zA-Z0-9_\- ])|^[_0-9]+/g, "").trim().toLowerCase();
// uppercase letters preceded by a hyphen or a space
s = s.replace(/([ -]+)([a-zA-Z0-9])/g, function(a,b,c) {
return c.toUpperCase();
});
// uppercase letters following numbers
s = s.replace(/([0-9]+)([a-zA-Z])/g, function(a,b,c) {
return b + c.toUpperCase();
});
return s;
}
Try this function, hope it helps.
"TestString".replace(/[A-Z]/g, val => "_" + val.toLowerCase()).replace(/^_/,"")
This replaces each uppercase letter with an underscore followed by its lowercase form, then removes the leading underscore.
A Non-Regex Answer that converts PascalCase to snake_case
Note: I understand there are plenty of good answers that solve this question elegantly. I was recently working on something similar where I chose not to use regex, so I felt like posting a non-regex solution.
const toSnakeCase = (str) => {
return str.slice(0,1).toLowerCase() + str.split('').slice(1).map((char) => {
if (char !== char.toLowerCase()) return '_' + char.toLowerCase(); // true only for uppercase letters; digits and punctuation pass through
else return char;
}).join('');
}
Eg.
inputString = "ILoveJavascript" passed onto toSnakeCase()
would become "i_love_javascript"
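Several of the answers above split acronyms letter by letter (motor_r_p_m). One way to keep a run of capitals together while still converting dots is sketched below; `pascalToSnake` is a hypothetical name, and the approach assumes ASCII letters only.

```javascript
// Hypothetical helper: PascalCase/camelCase with dots -> snake_case, acronyms kept whole.
function pascalToSnake(text) {
  return text
    .replace(/\./g, "_")                        // dots become underscores
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1_$2")  // acronym followed by a word: HTTPServer -> HTTP_Server
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2")     // lowercase/digit followed by a capital
    .toLowerCase();
}

console.log(pascalToSnake("TypeOfData.AlphaBeta")); // type_of_data_alpha_beta
console.log(pascalToSnake("MotorRPM"));             // motor_rpm
console.log(pascalToSnake("thisISDifficult"));      // this_is_difficult
```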

Splitting a string by delimiter if characters exist either side of delimiter

In JavaScript, how can I split a string by a delimiter only if the delimiter has a character (non-numeric) on either side of it? Can this be accomplished via RegEx?
var str = 'this-is-hyphenated - this isn\'t';
Should result in an array: this | is | hyphenated - this isn't
how can I split a string by delimiter only if the delimiter has a character (non-numeric) on either side of it?
Given the "non-numeric" criterion, you can't use the \b (word boundary) assertion, as it treats digits as word characters (e.g. foo2 is seen as one word, not a word followed by '2').
You can do it in two steps using replace with a string that is extremely unlikely to occur (say &&&&) and capture groups:
s.replace(/([a-z])-([a-z])/ig,'$1&&&&$2').split('&&&&')
however that may not be what you want.
This is a less RegEx version
var str = 'this-is-hyphenated - this isn\'t',
chrRegEx = /[a-z]/i;
var result = str.split("-").reduce(function(result, current) {
var previous = result[result.length - 1];
if (!previous) {
return result.concat(current);
}
if (chrRegEx.test(previous[previous.length-1]) && chrRegEx.test(current[0])){
result = result.concat(current);
} else {
result[result.length - 1] += "-" + current;
}
return result;
}, []);
console.log(result);
// [ 'this', 'is', 'hyphenated - this isn\'t' ]
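Where lookbehind assertions are available (they were added in ES2018, so this is an assumption about the target runtime), the split can be done directly, without a placeholder string. A short sketch with the hypothetical helper name `splitBetweenLetters`:

```javascript
// Split on "-" only when an ASCII letter sits directly on both sides of it.
function splitBetweenLetters(text) {
  return text.split(/(?<=[a-z])-(?=[a-z])/i);
}

console.log(splitBetweenLetters("this-is-hyphenated - this isn't"));
// [ 'this', 'is', "hyphenated - this isn't" ]
```

Because both assertions are zero-width, only the hyphen itself is consumed and the neighbouring letters stay in the resulting pieces.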
