Temporarily remove URL from string - javascript

I've created a Twitter bot that copies the tweets of a certain user and then uwu-fies them, meaning it changes some characters to make them funny; Elon becomes Ewon, for example. How funny this actually is is debatable, but that's beside the point for now.
If a tweet contains a URL, the URL can't be uwu-fied since it would become invalid. The way I've solved this for now is to search for the URL using a regex, replace it with a performance.now() value (I used to use a UUID v4, but that also contains characters that would get uwu-fied), and save an object with the URL and the performance.now() value that was used.
Then, when the uwu-fication is done, I can reconstruct it using the saved object. This does work, but it feels like a bodged solution. The only other solution I could think of is generating a UUID that only contains characters that won't get uwu-fied.
EDIT:
Based on the currently accepted answer, I've solved the problem by transforming my code into this:
// Split the sentence into words
const words = sentence.split(` `);
// No g flag here: a global regex remembers lastIndex between test() calls and would miss some URLs
const pattern = /(?:https?|ftp):\/\/[\n\S]+/;
// If the word is a URL, keep it as is; otherwise uwufy it
const uwufied = words.map(word => (pattern.test(word) ? word : uwufyWord(word))).join(` `);

You can split the tweet into an array with .split(" ") and then run over that array with a for...of loop, handling the tweet word by word. At the start of each iteration, check that the "word" is not a URL, then apply your replacements.
let tweet = "Hello World. What's up?"
let arr = tweet.split(" ")
let output = ""
for (const word of arr) {
// Check that it's not an URL here
// Replace here
output += word + " "
}
// Use output here
console.log(output)
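As a variation on the word-by-word idea, splitting on a URL pattern that has a capturing group keeps the URLs in the result array, so the original whitespace survives untouched. The sketch below is only an illustration: uwufyText is a hypothetical stand-in (it replaces l and r with w), and the bot's real uwu-fication function would take its place.

```javascript
// Hypothetical stand-in for the bot's real uwu-fication logic.
function uwufyText(text) {
  return text.replace(/[lr]/g, 'w').replace(/[LR]/g, 'W');
}

// No g flag needed: split() finds every separator on its own.
const urlPattern = /((?:https?|ftp):\/\/\S+)/;

function uwufyTweet(tweet) {
  // Because the pattern has a capturing group, split() keeps the URLs:
  // even indexes hold plain text, odd indexes hold the captured URLs.
  return tweet
    .split(urlPattern)
    .map((part, index) => (index % 2 === 1 ? part : uwufyText(part)))
    .join('');
}

console.log(uwufyTweet('Elon says hello https://example.com/really ok'));
// → Ewon says hewwo https://example.com/really ok
```

Compared with the placeholder approach from the question, nothing has to be stored and restored afterwards, because the URLs never pass through the uwu-fication step.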


Is it possible to make a random string be generated in discord.js? Also to have the backtick formatting

I was wondering how to make my Discord bot generate a random string of letters and numbers every time a punishment is issued, and then include it in the log embed, the DM embed and the confirmation embed. The next time, another random one should be generated. Also, I want to have the `hi` formatting in JS, but it's going weird in my VSC and I don't know how to do that.
This is what I found for generating a random string:
function getRandomString(length) {
var randomChars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
var result = '';
for ( var i = 0; i < length; i++ ) {
result += randomChars.charAt(Math.floor(Math.random() * randomChars.length));
}
return result;
}
//usage: getRandomString(20); // pass desired length of random string
My code for the Embed is:
.setDescription(`${verify} ${user} has been **warned** with **ID** `${verify}``)
var sendEm = await msg.channel.send(warnEmbed);
msg.delete()
}
I tried to put backticks around ${verify}, but it doesn't work because it turns all the code underneath orange. Does anyone know how to fix this?
I think what you're looking for is to escape the backtick. Escaped characters are treated as any normal character in a string. It's as simple as putting a backslash in front of the character you want to escape.
So in your case, you'd need to do this:
// Using "\`" will insert an actual backtick in the string
.setDescription(`${verify} ${user} has been **warned** with **ID** \`${verify}\``)
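To see the escaping in isolation, here is a minimal sketch that builds the same kind of string outside of an embed. The values of user and verify are made up for illustration.

```javascript
// Hypothetical values standing in for the bot's real variables.
const user = 'SomeUser';
const verify = 'a1B2c3';

// \` puts a literal backtick in the string without ending the template literal.
const description = `${user} has been **warned** with **ID** \`${verify}\``;

console.log(description);
// → SomeUser has been **warned** with **ID** `a1B2c3`
```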

Replace characters of a string matched by regex

I need to find the domain names of all valid URLs in an HTML page and replace them with another domain name, but within the original domain name I need to do a second replacement. For example, if the URL https://www.example.com/path/to/somewhere appears in the HTML page, I need to eventually transform it into something like www-example-com.another.domain/path/to/somewhere.
I can do the first match and replace with the following code:
const regex = new RegExp('(https?:\/\/([^:\/\n\"\'?]+))', 'g');
txt = txt.replace(regex, "$1.another.domain");
but I have no idea how to do the second match and replace, to turn each . into -. I wonder if there is an efficient way to do this. I tried something like the following, but it does not work:
const regex = new RegExp('(https?:\/\/([^:\/\n\"\'?]+))', 'g');
txt = txt.replace(regex, "$1".replace(/'.'/g, '-') + ".another.domain");
OK, I think I know what you're looking for. I'll explain what the code below is doing.
You have 2 capture groups: the scheme and host before the first / of the path, and the path from that / onward.
You take the first capture group and convert each . to -
You then append .another.domain as a string, followed by the 2nd capture group.
const address1 = 'https://www.example.com/path/to/somewhere';
const newDomain = "another.domain"
const pattern = /(https?:\/\/[^:\/\n\"\'?]+)(\/.*)/;
const matches = pattern.exec(address1);
const converted = matches[1].replace(/\./g, "-") + `.${newDomain}${matches[2]}`;
console.log(converted);
You can use the function version of String.prototype.replace() to have some more control over the specific replacements.
For example...
const txt = 'URL is https://www.example.com/path/to/somewhere'
const newTxt = txt.replace(/(https?:\/\/)([\w.]+)/g, (_, scheme, domain) =>
`${scheme}${domain.replace(/\./g, '-')}.another.domain`)
console.log(newTxt)
Here, scheme is the first capture group (https?:\/\/) and domain is the second ([\w.]+).
If you need a fancier domain matcher (as per your question), just substitute that part of the regex.
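As a rough sketch of that substitution, the function below wraps the callback form of replace around a host pattern based on the one from the question. The name rewriteDomains is hypothetical, and the character class is an assumption: it uses \s instead of \n and also excludes < and >, so that a host ends at a space or at an HTML tag.

```javascript
// Rewrites the host of every http(s) URL in txt: dots become dashes
// and newDomain is appended. Everything after the host is left alone.
function rewriteDomains(txt, newDomain) {
  const urlPattern = /(https?:\/\/)([^:\/\s"'?<>]+)/g;
  return txt.replace(urlPattern, (match, scheme, host) =>
    `${scheme}${host.replace(/\./g, '-')}.${newDomain}`);
}

console.log(rewriteDomains('URL is https://www.example.com/path/to/somewhere', 'another.domain'));
// → URL is https://www-example-com.another.domain/path/to/somewhere
```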

RegEx for matching YouTube embed ID

I'm in non-modern JavaScript and I have a string defined as follows:
"//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0"
I want to pull out just the DmYK479EpQc but I don't know the length. I do know that I want what is after the / and before the ?
Are there a few simple lines of JavaScript that would solve this?
Use the URL object?
console.log(
(new URL("//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0", location.href)).pathname
.split('/')
.pop());
Why? Because I can likely make up a URL that defeats the regex (though for youtube it's probably unlikely)
This expression might help you do so:
(d\/)([A-Za-z0-9_-]+)(\?)
Note that the character class is written as [A-Za-z0-9_-] rather than [A-z0-9]: the range A-z also covers the punctuation characters between Z and a, and it misses the hyphen, which YouTube IDs can contain.
const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
const str = `//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0`;
const subst = `$3`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Performance Test
This JavaScript snippet measures the performance of that expression using a simple for loop that runs 1 million times.
const repeat = 1000000;
const start = Date.now();
let match;
for (let i = repeat; i > 0; i--) {
  const string = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
  const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
  match = string.replace(regex, "$3");
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
How about non-regex way
console.log("//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0".split('/').pop().split('?')[0]);
I'm not going to give a piece of code because this is a relatively simple algorithm and easy to implement.
Please note that those links have this format (correct me if I'm wrong):
https:// or http://
www.youtube.com/
embed/
Video ID (DmYK479EpQc in this case)
?parameters (note that they ALWAYS start with the character ?)
You want the ID of the video, so you can split the string into those sections, and if you store those sections in an array, the ID is at index 3 (the fourth element).
One example of what that array would look like:
['https://', 'www.youtube.com', 'embed', 'DmYK479EpQc', '?vq=hd720&rel=0']
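The description above can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: extractEmbedId is a hypothetical helper name, and instead of relying on a fixed array position it looks for the section right after embed, so it also copes with the protocol-relative URL from the question. It sticks to var and function to suit older JavaScript environments.

```javascript
// Returns the section that follows "embed" in the URL path, or null.
function extractEmbedId(url) {
  // Drop the query string, which always starts with "?".
  var withoutQuery = url.split('?')[0];
  // Drop "https://", "http://" or a bare "//" prefix.
  var withoutScheme = withoutQuery.replace(/^(https?:)?\/\//, '');
  // e.g. ['www.youtube.com', 'embed', 'DmYK479EpQc']
  var sections = withoutScheme.split('/');
  var embedIndex = sections.indexOf('embed');
  if (embedIndex === -1) {
    return null;
  }
  // A missing or empty section after "embed" also yields null.
  return sections[embedIndex + 1] || null;
}

console.log(extractEmbedId('//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0'));
// → DmYK479EpQc
```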
One option uses a regex replacement:
var url = "//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0";
var path = url.replace(/.*\/([^?]+).*/, "$1");
console.log(path);
The above regex pattern says to:
.* match and consume everything up to and
/ including the last path separator
([^?]+) then match and capture any number of non ? characters
.* then consume the rest of the input
Then, we just replace with the first capture group, which corresponds to the text after the final path separator, but before the start of the query string, should the URL have one.
You can use this regex:
(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)
(.*) match and consume everything up to
(d\/) the d/ that ends embed/
([A-Za-z0-9_-]+) then match and capture the ID: letters, digits, underscores and hyphens
(\?) then match the ? that starts the query string
(.*) then consume the rest of the input
const ytUrl = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
const position = '$3';
let result = ytUrl.replace(regex, position);
console.log('YouTube ID: ', result);
This regex splits the string into sections, and the YouTube ID is in the third capture group.
Another solution is to use split, which splits a string into an array of substrings.
const ytUrl = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
let result = ytUrl.split('/').pop().split('?').shift()
console.log('YouTube ID: ', result);
In this sample, we split the URL using / as the separator. Then we take the last element of the array with the pop method, and finally we split again using ? as the separator and take the first element of the array with the shift method.

Regex - Capture anything within parentheses and brackets

I'm really bad at regex and I'm not sure what to do with this. I want to capture anything within () or [] (including the brackets) and nothing after. For example, if I type [this is text] I want it to return exactly that. I also have JSON full of terms the user types. If a term is not in the JSON, it shouldn't print. This is the snippet of code that relates to the regex:
let sf_re = /(?:(,)\s+|\s+(xx)\s+)|\+|(,)/
if (document.querySelector(".images")){
document.querySelector(".images").innerHTML = ""
for(var i=0;i<item.length;i++){
if(/(\[(.*?)\]|\((.*?)\))/.test(item[i])){
let text = item[i]
let note = document.createElement("span")
note.innerHTML = String(text)
document.querySelector(".images").appendChild(note)
}
Here is an example of what happens
The only thing that should show is [cat cat cat]. "dog" should not appear at all because it's not on my list. In regexr it seems to work fine. I'm not sure what to add.
Edit: I think that my original post had insufficient information. The user types into an input bar which is split into an array using .split(). My main goal is to allow the user to add small lines of text. For example if my json has the terms "cat", "dog", and "pig" and the user types those terms, then it will return something. This is what I get using the suggestions below. Notice how "f" returns an arrow the first time, but not the second time. I'm not sure why this happens. I may be using "match" wrong. I tried this and I get an error "cannot read property of undefined":
let regex = /(\[(.*?)\]|\((.*?)\))/
if (document.querySelector(".images")){
document.querySelector(".images").innerHTML = ""
for(var i=0;i<item.length;i++){
if(item[i].match(regex)[0]){
let text = item[i]
let note = document.createElement("span")
note.innerHTML = String(text)
document.querySelector(".images").appendChild(note)
}
Regarding "Also I have a json full of terms the user types. If the term is not on the json then it shouldn't print.":
You can use
obj.hasOwnProperty(prop);
This tells you whether the object contains the specified property. Here is an example:
var x = {
y: 10
};
alert(x.hasOwnProperty("y")); //true
alert(x.hasOwnProperty("z")); //false
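Applied to the question, the check could be combined with the bracket test along these lines. This is only a sketch: the terms object and the visibleItems name are hypothetical, and it assumes each array item is either a single term or a single bracketed note.

```javascript
// Hypothetical term list; in the question it comes from JSON.
var terms = { cat: true, pig: true };

// A whole item wrapped in [ ] or ( ). No g flag, so test() is stateless.
var bracketed = /^(\[.*?\]|\(.*?\))$/;

function visibleItems(items, knownTerms) {
  return items.filter(function (item) {
    // call() still works if the object has its own "hasOwnProperty" key.
    var isKnownTerm = Object.prototype.hasOwnProperty.call(knownTerms, item);
    return isKnownTerm || bracketed.test(item);
  });
}

console.log(visibleItems(['cat', 'dog', '[cat cat cat]', 'f', '(note)'], terms));
// → [ 'cat', '[cat cat cat]', '(note)' ]
```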
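The regex is [\[(](.*?)[\])]
Explanation: read '[' or '(', then read as little as possible until there is a ']' or a ')'. (Inside a character class, | is a literal pipe, so it is not needed there.)
var regexp = /[\[(](.*?)[\])]/;
var match = ("[cat cat cat]dog").match(regexp);
console.log(match[0]); // [cat cat cat] (with the brackets)
console.log(match[1]); // cat cat cat (the text inside the brackets)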
Here are two simple regexes you can use. One matches square brackets and the other matches parentheses.
The third is a combination that checks for either. Keep in mind, however, that you'll only ever be able to match one of each for [ ] and ( ). If you're accepting multiple of either kind, the regex will break, since .* is greedy and returns all characters between the outermost ones.
match gives back an array, so if you know you're only getting a single result, you can grab the first item, as in result and result2. If not, you can deal with the whole array of results, as in result3.
const regex = /\[.*\]/
const regex2 = /\(.*\)/
const regex3 = /(\[.*\])|(\(.*\))/g
const result = ("[cat cat cat]dog").match(regex)[0];
const result2 = ("banana(cat cat)horse").match(regex2)[0];
const result3 = ("alfred[cat cat cat]banana(dog dog)").match(regex3);
console.log(result); //[cat cat cat]
console.log(result2); //(cat cat)
console.log(result3); // [ '[cat cat cat]', '(dog dog)' ]
The . character matches any character except a line break, and the * repeats it 0 or more times. The \ character can be used to escape [, ], ( and ) as necessary (and any other special regex character). The g flag at the end makes match return every match in the string instead of stopping at the first one.
Hopefully that should be enough to get you un-stuck.
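Two loose ends can be covered with a small sketch: match returns null when nothing matches, which may be what triggers the "cannot read property" error in the question, and a lazy .*? keeps several bracketed notes in one string separate. The extractNotes name is hypothetical.

```javascript
// Lazy .*? stops at the first closing bracket, so several notes
// in one string are returned as separate matches.
var notePattern = /\[.*?\]|\(.*?\)/g;

function extractNotes(text) {
  // match() returns null when nothing matches; fall back to an empty array.
  return text.match(notePattern) || [];
}

console.log(extractNotes('[cat cat cat]dog')); // → [ '[cat cat cat]' ]
console.log(extractNotes('dog'));              // → []
console.log(extractNotes('a[x]b(y)[z]'));      // → [ '[x]', '(y)', '[z]' ]
```

Because the function always returns an array, the calling code can check its length instead of indexing into a possibly missing result.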

Parse string regex for known keys but leave separator

OK, so I hit a little bit of a snag trying to write a regex.
Essentially, I want a string like:
error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z
to be parsed to read
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
So I want it to pull known keywords, and ignore other strings that have =.
My current regex looks like this:
(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=[\w\s\f\-\:]+(?![(error|user|dateFrom|dateTo|timeFrom|timeTo|hang)\=])
The keywords are known in advance and inserted into the regex dynamically, so I can list them explicitly.
How could I write it to include this requirement?
You could use a replace like so:
var input = "error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z";
var result = input.replace(/\s*\b((?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/g, "\n$1");
result = result.replace(/^\r?\n/, ""); // remove the newline inserted before the first key
Result:
error=some=new item
user=max
dateFrom=2013-01-15T05:00:00.000Z
dateTo=2013-01-16T05:00:00.000Z
Another way to tokenize the string:
var tokens = inputString.split(/ (?=[^= ]+=)/);
The regex looks for a space that is followed by a sequence of characters other than spaces and equals signs ending in =, and splits at those spaces.
Result:
["error=some=new item", "user=max", "dateFrom=2013-01-15T05:00:00.000Z", "dateTo=2013-01-16T05:00:00.000Z"]
Using the technique above with the regex adapted from your question:
var tokens = inputString.split(/(?=\b(?:error|user|dateFrom|dateTo|timeFrom|timeTo|hang)=)/);
Because the lookahead consumes nothing, each token keeps its trailing space, so trim the tokens if needed. This also splits the input that Qtax pointed out in the comments, "error=user=max foo=bar":
["error=", "user=max foo=bar"]
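Since the question mentions that the known keys are supplied dynamically, the separator can also be built from an array. The sketch below is one possible way to do that, with hypothetical helper names (escapeRegExp, splitByKeys). It splits on whitespace that is directly followed by a known key and =, so the separators are consumed and the tokens come back without trailing spaces.

```javascript
// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function splitByKeys(input, keys) {
  var alternatives = keys.map(escapeRegExp).join('|');
  // Split on whitespace that is directly followed by "<known key>=".
  var separator = new RegExp('\\s+(?=(?:' + alternatives + ')=)');
  return input.split(separator);
}

var keys = ['error', 'user', 'dateFrom', 'dateTo', 'timeFrom', 'timeTo', 'hang'];
var input = 'error=some=new item user=max dateFrom=2013-01-15T05:00:00.000Z dateTo=2013-01-16T05:00:00.000Z';
console.log(splitByKeys(input, keys));
```

With this variant a known key only starts a new token when whitespace precedes it, so an input such as "error=user=max" stays in one piece, unlike the lookahead-only split above.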
