Cannot get content from a URL with special characters in JavaScript? - javascript

I'm using JavaScript to get HTML content from a URL. I'm using the code below:
function getContentHTML(theUrl) {
    $.ajax({
        url: theUrl,
        crossDomain: true,
        success: function (data) {
            $("#divcontent").html(data);
        }
    });
}
If the URL is plain English (e.g. bbc.com, voa.com), it works, but if the URL contains special characters, I cannot get the content. How can I solve this? You can test with my URL.
URL : https://tw.news.yahoo.com/%E7%A9%BA%E9%9B%A3%E8%BB%8D%E6%96%B9%E5%A4%A7%E6%94%B9%E5%8F%A3-%E7%A2%BA%E5%AF%A6%E6%9C%89%E4%BB%8B%E5%85%A5%E8%88%AA%E7%AE%A1-120050818.html

Special characters are not allowed in a URL, so you need to encode them. Use the encodeURI function available in JavaScript. See RFC 1738 for more details.

Try one of these functions to decode URLs:
var uri_dec = decodeURIComponent(encodedUrl);
or
var uri_dec = decodeURI(encodedUrl);

This is because the URL is encoded: the browser automatically substitutes escape sequences for special characters in a URL. For example, a space is replaced with %20.
So you have to encode or decode the URL, as explained very clearly in these links:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/decodeURIComponent
var encodeduri = "https://tw.news.yahoo.com/%E7%A9%BA%E9%9B%A3%E8%BB%8D%E6%96%B9%E5%A4%A7%E6%94%B9%E5%8F%A3-%E7%A2%BA%E5%AF%A6%E6%9C%89%E4%BB%8B%E5%85%A5%E8%88%AA%E7%AE%A1-120050818.html";
var uri_decoded = decodeURIComponent(encodeduri); // The decodeURIComponent() function decodes a URI component.
OR
var uri_decoded = decodeURI(encodeduri); // The decodeURI() function decodes a complete URI.
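As a rough illustration of the encode/decode round trip these answers describe, the sketch below uses a shortened, made-up URL on example.com (an assumption, not the asker's actual URL). Its path contains the two characters that the question's URL encodes as %E7%A9%BA%E9%9B%A3.

```javascript
// Hypothetical URL; \u7A7A\u96E3 are the two CJK characters that the
// question's URL carries as %E7%A9%BA%E9%9B%A3.
const rawUrl = "https://example.com/\u7A7A\u96E3-120050818.html";

// encodeURI() percent-encodes the non-ASCII characters as UTF-8 bytes
// but leaves URL delimiters such as ":" and "/" alone.
const encodedUrl = encodeURI(rawUrl);
console.log(encodedUrl);
// https://example.com/%E7%A9%BA%E9%9B%A3-120050818.html

// decodeURI() reverses the transformation.
const decodedUrl = decodeURI(encodedUrl);
console.log(decodedUrl === rawUrl); // true
```

Note that the URL in the question is already percent-encoded, so encoding it a second time would turn each % into %25.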

Related

Jquery check if String consists of any URL and get it

I want to get a url from a string but I am unfamiliar with the use of regex or any such methods.
For example, I have 3 strings:
"I've navigated to www.facebook.com";
"I've navigated to www.facebook.com and to www.google.com";
"I've navigated to https://www.facebook.com";
In my case, I should get "www.facebook.com" as the URL extracted from the first string.
All I want is to get the first URL inside the string so I can make a link preview using an API I found. But I am struggling to get the URL using JavaScript or jQuery. The string comes from a textbox, and I want to get the URL on keyup.
You can use this function to extract the first URL found in the given string:
function getFirstUrl(string) {
    var pattern = /(https?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/;
    var match = string.match(pattern);
    // match is null when the string contains no URL
    return match ? match[0] : null;
}
var string = "I've navigated to www.facebook.com and to www.google.com";
var url = getFirstUrl(string);
// url === 'www.facebook.com'
I got the regex pattern from this answer and modified it to make the https:// part optional.

Remove (#) from URL in Silverlight

I have a URL e.g. http://localhost:8000/#Test/Method
I am doing following:
System.Windows.Browser.HtmlPage.Window.CurrentBookmark = string.Empty;
which just remove Test/Method but not '#'
I need to modify browser URL
as:
http://localhost:8000/
Any ideas on how to fix this?
If you are manipulating a String you can use the following method :
String Url = "Foo#Bar";
Url = Url.Replace("#", string.Empty);
Use a regular expression:
var url = "http://localhost:8000/#Test/Method";
url = url.replace(/#/g, "");
The regular expression above matches every occurrence of the '#' character and replaces it with an empty string.

jQuery AJAX / Posting data with symbols

I am sending quite a few values with my AJAX call, like this:
var postData = "aid="+aid+"&lid="+lid+"&token="+token+"&count="+count+"&license="+license;
postData = postData + "&category="+category+"&event_name="+event_name+"&set_menu="+set_menu;
postData = postData + "&set_id="+set_id+"&location="+location+"&delay="+delay;
and then sending the call like this:
$.ajax({
    type : 'GET',
    url : 'ajax/createFolderID.asp',
    dataType : 'html',
    data : postData,
    success : function() { /* do something */ },
    complete : function() { /* do something */ },
    error : function() { /* do something */ }
});
The problem is, one of the querystring values, "event_name", comes from user input. If the user enters an ampersand (&) symbol, the postData string breaks and won't send anything after that symbol.
Example case: &event_name=D&G Clothing Launch Party&set_menu=existing...
I understand what is going wrong, but not so sure what the best fix would be. Do I convert those characters to something else, or is there a way of escaping them? Also, are there any other characters that will cause harm to the script, like plus (+) or minus (-) signs, or apostrophes (')?
Escape each of your values.
var postData = "aid="+escape(aid)+"&lid="+escape(lid) ... ;
If you pass the postData to jQuery as a map, it will encode the components for you:
var postData = { aid: aid,
lid: lid,
...
If you really need to pass a string, you should use encodeURIComponent to properly encode the user data.
The W3C has some more information on form encoding.
First use a Map.
post = {
"aid":aid,
"lid":lid,
"token":token
...
}
Then generate url-encoded string.
a=[];
for(var x in post){
a.push(encodeURIComponent(x)+"="+encodeURIComponent(post[x]));
}
var postData = a.join("&");
Update 1:
If you are using jQuery no need to generate url-encoded string. Just pass the map.
Update 2:
escape is not a good choice, as it only handles ASCII properly, so I am using encodeURIComponent. See: When are you supposed to use escape instead of encodeURI / encodeURIComponent? Thanks @SamuelEdwinWard
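Just encode each value before adding it to the string:
event_name = encodeURIComponent(event_name);
Encoding the whole postData string at once would also escape the & and = separators.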
escape, unescape, encodeURI, and encodeURIComponent are the various methods you may come across when using Ajax.
However, if you use escape("http://www.google.com"), you will also escape the colon in ://, destroying your URI.
That's why you should use encodeURI or encodeURIComponent.
See also When are you supposed to use escape instead of encodeURI / encodeURIComponent?
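Applied to the example case from the question, a minimal sketch of per-value encoding might look like this. The aid and set_menu values are placeholders chosen for illustration, not the asker's real data.

```javascript
// Value taken from the question's example case.
const eventName = "D&G Clothing Launch Party";

// encodeURIComponent() escapes "&" and spaces, so the value can no
// longer be mistaken for a parameter separator.
const encodedName = encodeURIComponent(eventName);
console.log(encodedName); // D%26G%20Clothing%20Launch%20Party

// "+" is escaped too, while "-" and "'" are left as they are.
const encodedSymbols = encodeURIComponent("a+b-c'd");
console.log(encodedSymbols); // a%2Bb-c'd

// Placeholder values for the other parameters.
const postData = "aid=" + encodeURIComponent("42") +
  "&event_name=" + encodedName +
  "&set_menu=" + encodeURIComponent("existing");
console.log(postData);
// aid=42&event_name=D%26G%20Clothing%20Launch%20Party&set_menu=existing
```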

How to convert signs in url/text to hex characters? (converting = to %3D)

With the script I'm making, jQuery gets variables from a URL parameter. The value it gets is itself a URL, so if it's something like
http://localhost/index.html?url=http://www.example.com/index.php?something=some
it reads:
url = http://www.example.com/index.php?something
If it's like
http://localhost/index.html?url=http://www.example.com/index.php?something%3Dsome
it reads:
url = http://www.example.com/index.php?something%3Dsome
which would register as a valid URL. My question is: how can I search for the = sign in the url variable and replace it with the hex code %3D using jQuery or JavaScript?
Use the (built-in) encodeURIComponent() function:
url = 'http://localhost/index.html?url=' +
encodeURIComponent('http://www.example.com/index.php?something=some');
Are you looking for encodeURIComponent and decodeURIComponent?
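A short sketch of the full round trip for the URLs in this question, assuming the standard URL API is available (browsers and Node.js) for reading the parameter back:

```javascript
const innerUrl = "http://www.example.com/index.php?something=some";

// Encode the inner URL so its "?" and "=" cannot be confused with
// the outer URL's own query-string delimiters.
const outerUrl = "http://localhost/index.html?url=" + encodeURIComponent(innerUrl);
console.log(outerUrl);
// http://localhost/index.html?url=http%3A%2F%2Fwww.example.com%2Findex.php%3Fsomething%3Dsome

// Reading it back: searchParams.get() decodes the value.
const recovered = new URL(outerUrl).searchParams.get("url");
console.log(recovered === innerUrl); // true
```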

Encode URL in JavaScript

How do you safely encode a URL using JavaScript such that it can be put into a GET string?
var myUrl = "http://example.com/index.html?param=1&anotherParam=2";
var myOtherUrl = "http://example.com/index.html?url=" + myUrl;
I assume that you need to encode the myUrl variable on that second line?
Check out the built-in functions encodeURIComponent(str) and encodeURI(str).
In your case, this should work:
var myOtherUrl =
"http://example.com/index.html?url=" + encodeURIComponent(myUrl);
You have three options:
escape() will not encode: @*/+
encodeURI() will not encode: ~!@#$&*()=:/,;?+'
encodeURIComponent() will not encode: ~!*()'
But in your case, if you want to pass a URL into a GET parameter of other page, you should use escape or encodeURIComponent, but not encodeURI.
See Stack Overflow question Best practice: escape, or encodeURI / encodeURIComponent for further discussion.
Stick with encodeURIComponent(). The function encodeURI() does not bother to encode many characters that have semantic importance in URLs (e.g. "#", "?", and "&"). escape() is deprecated, and does not bother to encode "+" characters, which will be interpreted as encoded spaces on the server (and, as pointed out by others here, does not properly URL-encode non-ASCII characters).
There is a nice explanation of the difference between encodeURI() and encodeURIComponent() elsewhere. If you want to encode something so that it can safely be included as a component of a URI (e.g. as a query string parameter), you want to use encodeURIComponent().
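To make the difference concrete, here is a small comparison on a made-up sample string (chosen only because it contains several delimiter characters):

```javascript
// Sample string with a space and several URL delimiters.
const sample = "a b&c=d/e?f#g";

// encodeURI() keeps the delimiters and only escapes the space.
const viaEncodeURI = encodeURI(sample);
console.log(viaEncodeURI); // a%20b&c=d/e?f#g

// encodeURIComponent() escapes the delimiters as well.
const viaComponent = encodeURIComponent(sample);
console.log(viaComponent); // a%20b%26c%3Dd%2Fe%3Ff%23g
```

Only the second result is safe to drop into a query-string value, since the first still contains "&", "=", "?" and "#".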
The best answer is to use encodeURIComponent on values in the query string (and nowhere else).
However, I find that many APIs want " " replaced with "+", so I've had to use the following:
const encodedValue = encodeURIComponent(value).replace(/%20/g, '+');
const url = 'http://example.com?lang=en&key=' + encodedValue;
escape is implemented differently in different browsers and encodeURI doesn't encode many characters (like # and even /) -- it's made to be used on a full URI/URL without breaking it – which isn't super helpful or secure.
And as @Jochem points out below, you may want to use encodeURIComponent() on a (each) folder name, but for whatever reason these APIs don't seem to want + in folder names, so plain old encodeURIComponent works great.
Example:
const escapedValue = encodeURIComponent(value).replace(/%20/g, '+');
const escapedFolder = encodeURIComponent('My Folder'); // no replace
const url = `http://example.com/${escapedFolder}/?myKey=${escapedValue}`;
I would suggest to use the qs npm package:
qs.stringify({a:"1=2", b:"Test 1"}); // gets a=1%3D2&b=Test%201 (pass { format: 'RFC1738' } to get Test+1)
It is easier to use with a JavaScript object and it gives you the proper URL encoding for all parameters.
If you are using jQuery, I would go for the $.param method. It URL encodes an object, mapping fields to values, which is easier to read than calling an escape method on each value.
$.param({a:"1=2", b:"Test 1"}) // Gets a=1%3D2&b=Test+1
Modern solution (2021)
Since the other answers were written, the URLSearchParams API has been introduced. It can be used like this:
const queryParams = { param1: 'value1', param2: 'value2' }
const queryString = new URLSearchParams(queryParams).toString()
// 'param1=value1&param2=value2'
It also encodes non-URL characters.
For your specific example, you would use it like this:
const myUrl = "http://example.com/index.html?param=1&anotherParam=2";
const myOtherUrl = new URL("http://example.com/index.html");
myOtherUrl.search = new URLSearchParams({url: myUrl});
console.log(myOtherUrl.toString());
This solution is also mentioned here and here.
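One detail worth seeing in action: URLSearchParams uses form encoding, so a space becomes "+" rather than "%20". The parameter names below are made up for the illustration.

```javascript
// Hypothetical parameters containing a space and an ampersand.
const params = new URLSearchParams({ q: "a b&c", lang: "en" });
const query = params.toString();
console.log(query); // q=a+b%26c&lang=en

// encodeURIComponent() encodes the same space as %20 instead.
const component = encodeURIComponent("a b&c");
console.log(component); // a%20b%26c

// Parsing accepts both forms and yields the original value.
const fromPlus = new URLSearchParams(query).get("q");
const fromPercent = new URLSearchParams("q=" + component).get("q");
console.log(fromPlus, fromPercent); // a b&c a b&c
```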
encodeURIComponent() is the way to go.
var myOtherUrl = "http://example.com/index.html?url=" + encodeURIComponent(myUrl);
But you should keep in mind that there are small differences from the PHP function urlencode() and, as @CMS mentioned, it will not encode every character. The folks at http://phpjs.org/functions/urlencode/ made a JavaScript equivalent of PHP's urlencode():
function urlencode(str) {
    str = (str + '').toString();
    // Tilde should be allowed unescaped in future versions of PHP (as reflected below), but if you want to reflect current
    // PHP behavior, you would need to add ".replace(/~/g, '%7E');" to the following.
    // Regular expressions with the g flag are used so that every occurrence is replaced, not just the first.
    return encodeURIComponent(str)
        .replace(/!/g, '%21')
        .replace(/'/g, '%27')
        .replace(/\(/g, '%28')
        .replace(/\)/g, '%29')
        .replace(/\*/g, '%2A')
        .replace(/%20/g, '+');
}
I think now in 2022 to be really safe, you should always consider constructing your URLs using the URL() interface. It'll do most of the job for you. So coming to your code,
const baseURL = 'http://example.com/index.html';
const myUrl = new URL(baseURL);
myUrl.searchParams.append('param', '1');
myUrl.searchParams.append('anotherParam', '2');
const myOtherUrl = new URL(baseURL);
myOtherUrl.searchParams.append('url', myUrl.href);
console.log(myUrl.href);
// Outputs: http://example.com/index.html?param=1&anotherParam=2
console.log(myOtherUrl.href);
// Outputs: http://example.com/index.html?url=http%3A%2F%2Fexample.com%2Findex.html%3Fparam%3D1%26anotherParam%3D2
console.log(myOtherUrl.searchParams.get('url'));
// Outputs: http://example.com/index.html?param=1&anotherParam=2
Or...
const params = new URLSearchParams(myOtherUrl.search);
console.log(params.get('url'));
// Outputs: http://example.com/index.html?param=1&anotherParam=2
Something like this is assured not to fail.
To encode a URL, as has been said before, you have two functions:
encodeURI()
and
encodeURIComponent()
The reason both exist is that the first preserves the URL with the risk of leaving too many things unescaped, while the second encodes everything needed.
With the first, you could copy the newly escaped URL into address bar (for example) and it would work. However your unescaped '&'s would interfere with field delimiters, the '='s would interfere with field names and values, and the '+'s would look like spaces. But for simple data when you want to preserve the URL nature of what you are escaping, this works.
The second is everything you need to do to make sure nothing in your string interferes with a URL. It leaves various unimportant characters unescaped so that the URL remains as human readable as possible without interference. A URL encoded this way will no longer work as a URL without unescaping it.
So if you can take the time, you always want to use encodeURIComponent() -- before adding on name/value pairs encode both the name and the value using this function before adding it to the query string.
I'm having a tough time coming up with reasons to use the encodeURI() -- I'll leave that to the smarter people.
What is URL encoding:
A URL should be encoded when there are special characters located inside the URL. For example:
console.log(encodeURIComponent('?notEncoded=&+'));
We can observe in this example that all characters except the string notEncoded are encoded with % signs. URL encoding is also known as percent-encoding because it escapes each special character with a % sign followed by that character's hexadecimal code.
Why do we need URL encoding:
Certain characters have a special value in a URL string. For example, the ? character denotes the beginning of a query string. In order to successfully locate a resource on the web, it is necessary to distinguish between when a character is meant as a part of string or part of the URL structure.
How can we achieve URL encoding in JavaScript:
JavaScript offers a bunch of built-in utility functions which we can use to easily encode URLs. These are two convenient options:
encodeURIComponent(): Takes a component of a URI as an argument and returns the encoded URI string.
encodeURI(): Takes a URI as an argument and returns the encoded URI string.
Example and caveats:
Be careful not to pass a whole URL (including the scheme, e.g. https://) into encodeURIComponent(). This can turn it into a non-functional URL. For example:
// for a whole URI don't use encodeURIComponent: it will encode
// the / characters and the URL won't function properly
console.log(encodeURIComponent("http://www.random.com/specials&char.html"));
// instead use encodeURI for whole URLs
console.log(encodeURI("http://www.random.com/specials&char.html"));
We can observe that if we pass the whole URL to encodeURIComponent, the forward slashes (/) are also percent-encoded. This will cause the URL to no longer function properly.
Therefore (as the name implies) use:
encodeURIComponent on a certain part of a URL which you want to encode.
encodeURI on a whole URL which you want to encode.
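One way to apply the "encode a certain part" rule to a path is to encode each segment separately and join the results. The helper name encodePath and the sample path are made up for this sketch.

```javascript
// Hypothetical helper: encode each path segment on its own so that
// the "/" separators between segments are preserved.
function encodePath(path) {
  return path
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

const encodedPath = encodePath("My Folder/a&b.html");
console.log(encodedPath); // My%20Folder/a%26b.html

// The scheme and host are added afterwards, untouched.
const fullUrl = "http://www.random.com/" + encodedPath;
console.log(fullUrl); // http://www.random.com/My%20Folder/a%26b.html
```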
To prevent double encoding, it's a good idea to decode the URL before encoding (if you are dealing with user entered URLs for example, which might be already encoded).
Let’s say we have abc%20xyz 123 as input (one space is already encoded):
encodeURI("abc%20xyz 123") // Wrong: "abc%2520xyz%20123"
encodeURI(decodeURI("abc%20xyz 123")) // Correct: "abc%20xyz%20123"
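A possible refinement of the decode-then-encode idea: decodeURI() throws a URIError on malformed input such as a lone "%", so user-entered text may need a fallback. The helper name safeEncodeURI is an assumption for this sketch.

```javascript
// Hypothetical helper: decode first to avoid double encoding, but
// fall back to the raw input if it is not validly percent-encoded.
function safeEncodeURI(input) {
  let decoded;
  try {
    decoded = decodeURI(input);
  } catch (err) {
    // Rethrow anything that is not a malformed-escape error.
    if (!(err instanceof URIError)) throw err;
    decoded = input;
  }
  return encodeURI(decoded);
}

console.log(safeEncodeURI("abc%20xyz 123")); // abc%20xyz%20123
console.log(safeEncodeURI("100% sure"));     // 100%25%20sure
```

This approach has a limit: decodeURI() leaves escapes of reserved characters such as %26 in place, so encodeURI() still turns those into %2526.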
A similar kind of thing I tried with normal JavaScript:
function fixedEncodeURIComponent(str){
return encodeURIComponent(str).replace(/[!'()]/g, escape).replace(/\*/g, "%2A");
}
You should not use encodeURIComponent() directly.
Take a look at RFC3986: Uniform Resource Identifier (URI): Generic Syntax
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
The purpose of reserved characters is to provide a set of delimiting characters that are distinguishable from other data within a URI.
These reserved characters from the URI definition in RFC3986 ARE NOT escaped by encodeURIComponent().
MDN Web Docs: encodeURIComponent()
To be more stringent in adhering to RFC 3986 (which reserves !, ', (, ), and *), even though these characters have no formalized URI delimiting uses, the following can be safely used:
Use the MDN Web Docs function...
function fixedEncodeURIComponent(str) {
return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
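For illustration, here is what the function does to a made-up input compared with plain encodeURIComponent(). The function is repeated so the snippet runs on its own.

```javascript
function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return "%" + c.charCodeAt(0).toString(16);
  });
}

const input = "it's (a) test!*";

// Plain encodeURIComponent() leaves ! ' ( ) * untouched.
const plain = encodeURIComponent(input);
console.log(plain); // it's%20(a)%20test!*

// The stricter variant escapes them too.
const strict = fixedEncodeURIComponent(input);
console.log(strict); // it%27s%20%28a%29%20test%21%2a
```

toString(16) produces lowercase hex digits (%2a). Decoders treat them case-insensitively, but appending .toUpperCase() gives the uppercase form that RFC 3986 recommends.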
Performance
Today (2020.06.12) I performed a speed test for chosen solutions on macOS v10.13.6 (High Sierra) on browsers Chrome 83.0, Safari 13.1, and Firefox 77.0. These results can be useful when encoding a large number of URLs.
Conclusions
encodeURI (B) seems to be fastest, but it is not recommended for URLs
escape (A) is a fast cross-browser solution
solution F recommended by MDN is medium fast
solution D is slowest
Details
For solutions A, B, C, D, E, and F, I performed two tests:
for a short URL (50 characters)
for a long URL (1M characters)
function A(url) {
return escape(url);
}
function B(url) {
return encodeURI(url);
}
function C(url) {
return encodeURIComponent(url);
}
function D(url) {
return new URLSearchParams({url}).toString();
}
function E(url){
return encodeURIComponent(url).replace(/[!'()]/g, escape).replace(/\*/g, "%2A");
}
function F(url) {
return encodeURIComponent(url).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
// ----------
// TEST
// ----------
var myUrl = "http://example.com/index.html?param=1&anotherParam=2";
[A,B,C,D,E,F]
.forEach(f=> console.log(`${f.name} ?url=${f(myUrl).replace(/^url=/,'')}`));
This snippet only presents code of chosen solutions
Example results for Chrome
Nothing worked for me. All I was seeing was the HTML of the login page, coming back to the client side with code 200. (302 at first but the same Ajax request loading login page inside another Ajax request, which was supposed to be a redirect rather than loading plain text of the login page).
In the login controller, I added this line:
Response.Headers["land"] = "login";
And in the global Ajax handler, I did this:
$(function () {
var $document = $(document);
$document.ajaxSuccess(function (e, response, request) {
var land = response.getResponseHeader('land');
var redrUrl = '/login?ReturnUrl=' + encodeURIComponent(window.location);
if(land) {
if (land.toString() === 'login') {
window.location = redrUrl;
}
}
});
});
Now I don't have any issue, and it works like a charm.
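Here is a live demo of the btoa() and atob() JavaScript built-in functions, which Base64-encode and decode a string. Base64 is not URL encoding; for URL components, use encodeURIComponent() and decodeURIComponent() in the same way: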
<!DOCTYPE html>
<html>
<head>
<style>
textarea{
width: 30%;
height: 100px;
}
</style>
<script>
// Encode string to Base64
function encode()
{
var txt = document.getElementById("txt1").value;
var result = btoa(txt);
document.getElementById("txt2").value = result;
}
// Decode Base64 back to original string
function decode()
{
var txt = document.getElementById("txt3").value;
var result = atob(txt);
document.getElementById("txt4").value = result;
}
</script>
</head>
<body>
<div>
<textarea id="txt1">Some text to decode
</textarea>
</div>
<div>
<input type="button" id="btnencode" value="Encode" onClick="encode()"/>
</div>
<div>
<textarea id="txt2">
</textarea>
</div>
<br/>
<div>
<textarea id="txt3">U29tZSB0ZXh0IHRvIGRlY29kZQ==
</textarea>
</div>
<div>
<input type="button" id="btndecode" value="Decode" onClick="decode()"/>
</div>
<div>
<textarea id="txt4">
</textarea>
</div>
</body>
</html>
Encode URL String
var url = $(location).attr('href'); // Get the current URL
// Or
var url = 'folder/index.html?param=#23dd&noob=yes'; // Or specify one
var encodedUrl = encodeURIComponent(url);
console.log(encodedUrl);
// Outputs folder%2Findex.html%3Fparam%3D%2323dd%26noob%3Dyes
For more information, go to, jQuery Encode/Decode URL String.
Use fixedEncodeURIComponent function to strictly comply with RFC 3986:
function fixedEncodeURIComponent(str) {
return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
return '%' + c.charCodeAt(0).toString(16);
});
}
You can use ESAPI library and encode your URL using the below function. The function ensures that '/'s are not lost to encoding while the remainder of the text contents are encoded:
// Java, using the OWASP ESAPI encoder
String encodeUrl(String url) throws EncodingException {
    String[] arr = url.split("/");
    String encodedUrl = "";
    for (int i = 0; i < arr.length; i++) {
        encodedUrl = encodedUrl + ESAPI.encoder().encodeForHTML(ESAPI.encoder().encodeForURL(arr[i]));
        if (i < arr.length - 1) encodedUrl = encodedUrl + "/";
    }
    // return the encoded result, not the original input
    return encodedUrl;
}
Don't forget the /g flag to replace all encoded ' '
var myOtherUrl = "http://example.com/index.html?url=" + encodeURIComponent(myUrl).replace(/%20/g,'+');
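A quick sketch of why the flag matters, using a made-up value with two spaces:

```javascript
const encoded = encodeURIComponent("a b c");
console.log(encoded); // a%20b%20c

// A string pattern replaces only the first occurrence.
const firstOnly = encoded.replace("%20", "+");
console.log(firstOnly); // a+b%20c

// A regular expression with the g flag replaces every occurrence.
const everySpace = encoded.replace(/%20/g, "+");
console.log(everySpace); // a+b+c
```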
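I always use this to encode stuff for URLs. It encodes every single character, even those that don't have to be encoded, so nothing can be misread as a delimiter.
function urlEncode(text) {
    let encoded = '';
    // Percent-encode each UTF-8 byte as two hex digits, so that
    // non-ASCII characters and bytes below 0x10 are handled too
    for (const byte of new TextEncoder().encode(text)) {
        encoded += '%' + byte.toString(16).padStart(2, '0');
    }
    return encoded;
}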
const name = `bbb`;
const params = `name=${name}`;
const myOtherUrl = `http://example.com/index.html?url=${encodeURIComponent(params)}`;
console.log(myOtherUrl);
With ES6 you can use template literals (backticks) to build the URL; encodeURIComponent() still does the actual encoding.
try this - https://bbbootstrap.com/code/encode-url-javascript-26885283
