I want to open Word documents by clicking a link in my solution. The link below shows how it is structured using the ofe (open for edit) command of the Office URI scheme. This approach is really nice because it works in every browser, but I have problems with special characters.
ms-word:ofe|u|file://our.local/Testing ÅÄÖ.DOCX
I have tried different approaches to solve this problem, but it does not work when åäö is present in the path. Calling encodeURI on the path does not help, for instance.
https://learn.microsoft.com/en-us/office/client-developer/office-uri-schemes does not describe anything out of the ordinary; it only says to follow the URI spec.
Documents without special characters work great, but I cannot figure out how special characters should be encoded to make it work.
If I take the file:\... part and paste it into any browser, it works, but not with the ofe prefix. So it must be an encoding problem, since everything works fine without special characters.
Running it from cmd also works.
So in this case I guess the browser is encoding the characters before passing them to the protocol handler?
I use browserless.js (headless Chrome) to fetch the html code of a website, and then use a regular expression to find certain image URLs.
One example is the following:
https://vignette.wikia.nocookie.net/moviepedia/images/8/88/Adrien_Brody.jpg/revision/latest/top-crop/width/360/height/450?cb\u003d20141113231800\u0026path-prefix\u003dde
There are Unicode escape sequences such as \u003d, which should be decoded (in this case to =). I want to include these images in a site, and without decoding, some of them cannot be displayed (like the one above: just paste the URL and it gives broken-image.webp).
I have tried lots of things, but nothing works.
JSON.parse(JSON.stringify(...))
String.prototype.normalize()
decodeURIComponent
Curiously, the regular expression for "\u003d" (i.e. "\\u003d" in js) does not match that string above, but "u003d" does.
This is all very weird, and my current guess is that browserless is responsible for some odd formatting behind the scenes. Namely, when I console.log the URL and copy and paste it somewhere else, every method mentioned above works for decoding.
I hope that someone can help me with this.
Just to mark this one as answered. Thomas replied:
JSON.parse(`"${url}"`)
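As a minimal sketch of why that works: the scraped string contains literal backslash-u sequences, and wrapping it in double quotes lets the JSON parser interpret them. The helper decodeUnicodeEscapes below is a hypothetical alternative (not part of the answer) that replaces only \uXXXX sequences and so does not throw when the string contains a double quote or some other backslash.

```javascript
// The URL as scraped: String.raw keeps the backslashes as literal characters.
const scraped = String.raw`https://vignette.wikia.nocookie.net/moviepedia/images/8/88/Adrien_Brody.jpg/revision/latest/top-crop/width/360/height/450?cb\u003d20141113231800\u0026path-prefix\u003dde`;

// Option 1 (from the answer): let the JSON parser interpret the escapes.
// This throws if the string contains an unescaped double quote or an invalid escape.
const viaJson = JSON.parse(`"${scraped}"`);

// Option 2 (hypothetical helper): decode only \uXXXX sequences, leave the rest alone.
function decodeUnicodeEscapes(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

console.log(viaJson);
console.log(decodeUnicodeEscapes(scraped) === viaJson);
```

Both options turn \u003d into = and \u0026 into &, so the query string becomes ?cb=20141113231800&path-prefix=de.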
I have a web app which allows searching. When I go to somedomain.com/search/<QUERY>, it searches for entities according to <QUERY>. The problem is that when I try to search for . or .. it doesn't work as expected (which is pretty obvious). What surprised me, though, is that if I manually enter the URL somedomain.com/search/%2E, the browser (tested in Chrome and IE11) converts it to somedomain.com/search/ and issues a request without the necessary payload.
So far I haven't found anything that says it's not possible to make this work, so I came here. Right now I have only one option: replacing . and .. with something like __dotPlaceholder__, but this feels like a dirty hack to me.
Any solution (JS or non-JS) is welcome. Any information on why browsers strip URL-encoded dots is also a nice-to-have.
Unfortunately, RFC 3986 specifies that dot segments in a URI path are removed during normalisation, so http://example.com/a/./ becomes http://example.com/a/. Browsers treat the percent-encoded form %2E the same way as a literal dot.
See https://www.rfc-editor.org/rfc/rfc3986#page-33 for more information.
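The normalisation can be observed with the standard URL class, which follows the same parsing rules browsers use. The sketch below also shows one possible workaround, which is an assumption on my part and not something the question's app is known to support: passing the search term as a query parameter (here hypothetically named q), since dot-segment removal applies only to the path.

```javascript
// A percent-encoded dot in the path is treated as a dot segment and removed.
const single = new URL('http://somedomain.com/search/%2E');
console.log(single.pathname);

// An encoded ".." segment removes the preceding path segment as well.
const double = new URL('http://somedomain.com/search/%2E%2E');
console.log(double.pathname);

// Hypothetical workaround: carry the term in the query string instead.
// The parameter name "q" is an assumption; the server must be changed to read it.
const viaQuery = new URL('http://somedomain.com/search');
viaQuery.searchParams.set('q', '..');
console.log(viaQuery.href);
console.log(viaQuery.searchParams.get('q'));
```

The first two paths come out as /search/ and /, while the query-string variant keeps the dots intact.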
I am using some JavaScript code in SSRS to open a link in a new window on a report. The report links point to file locations on a server. The code I am using within Reporting Services for the link is:
="javascript:void(window.open('"+ "file:" & Replace(Fields!FilePath.Value,"\","/") + "','_blank'))"
This code works just fine when the file name is something 'normal' such as:
\\myserver\images\Files\1969\1-000-002_SE 82ND AVE 1_1969.pdf
However, when there are special characters (at least # for sure), I get an error message. This is what happens. An example file name would be:
\\myserver\images\Files\1978\1-001-003_SE 82nd AVE #12 1_1978.pdf
In this case what gets returned as the URL is:
\\myserver\images\Files\1978\1-001-003_SE 82nd AVE
As can be seen, the URL is cut off at the first instance of the number sign. If I copy the shortcut for the offending link, this is what I get:
javascript:void(window.open('file://myserver/images/Files/1978/1-001-003_SE%2082nd%20AVE%20#12%201_1978.pdf','_blank'))
It appears that the JavaScript is encoding the file path correctly but something is getting lost in translation between the JavaScript code and the URL.
I am unable to change the file names, so I need to come up with a way to work with the special characters. I have tried using encodeURI() but could not figure out how to format it correctly in SSRS to work with the existing JavaScript.
Any ideas would be welcomed.
URLs recognize percent-encoded characters. So, outside of your JavaScript, use an SSRS Replace function for each special character you expect to find, replacing each with its percent-encoded form. For instance, a pound sign is %23 and a space is %20.
Note, I have some pages that use pound signs to split out URL parameters, and this does NOT seem to work in those cases. However, it might work in your situation. To try this, change your function to the following:
="javascript:void(window.open('"+ "file:" & Replace(Replace(Fields!FilePath.Value,"\","/"),"#","%23") + "','_blank'))"
In case this does work for you, you can find more of these codes here.
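For context, the URL is cut off because # starts the fragment part of a URL, and encodeURI deliberately leaves # unencoded. The sketch below shows the difference in plain JavaScript; uncToFileUrl is a hypothetical helper name, and wiring it into an SSRS expression is not shown here.

```javascript
// encodeURI keeps "#" because it is a legal URL delimiter (fragment marker).
console.log(encodeURI('AVE #12'));           // space is encoded, "#" is not
// encodeURIComponent encodes "#" as %23, which is what a file name needs.
console.log(encodeURIComponent('AVE #12'));

// Hypothetical helper: turn a UNC path into a file: URL,
// percent-encoding each path segment separately.
function uncToFileUrl(uncPath) {
  const segments = uncPath.replace(/^\\\\/, '').split('\\');
  return 'file://' + segments.map((segment) => encodeURIComponent(segment)).join('/');
}

const filePath = '\\\\myserver\\images\\Files\\1978\\1-001-003_SE 82nd AVE #12 1_1978.pdf';
console.log(uncToFileUrl(filePath));
```

Encoding segment by segment keeps the / separators intact while still escaping #, spaces, and other reserved characters inside the file name.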
I am passing an object via the url using:
encodeURIComponent(JSON.stringify(myObject))
"ä" is encoded as "%C3%A4" on my local server.
Unfortunately it is encoded as "a%CC%88" on the webserver.
This breaks my app, because the string is part of the name of a database field, which isn't found when it is encoded incorrectly. And I can't guarantee that there are no ä's in field names, because the app allows users to upload their own data.
How can I make sure that "ä" is always encoded correctly?
SORRY. To make this clear: the encoding happens client-side in the browser both times. But when the web app is served from the webserver, the "ä" is encoded as "a%CC%88" instead of "%C3%A4" (I've tested both in the same Chrome browser).
Thanks for all your help. It got me to dig deeper:
I have code that runs on an event. It loops through checkboxes and creates an array of objects containing (also) the field names. The code gets the field names from an attribute named "feld" of the checkbox:
<div class="checkbox">
<label>
<input class="feld_waehlen" type="checkbox" dstyp="Taxonomie" datensammlung="SISF Index 2 (2005)" feld="Artname vollständig">Artname vollständig
</label>
</div>
running this code:
console.log("this.getAttribute('feld') = " + this.getAttribute('feld'));
gives, as expected: this.getAttribute('feld') = Artname vollständig
If while looping, I run:
console.log('encodeURIComponent("Artname vollständig") = ' + encodeURIComponent("Artname vollständig"));
the answer is correct: encodeURIComponent("Artname vollständig") = Artname%20vollst%C3%A4ndig
But if I run:
console.log("encodeURIComponent(this.getAttribute('feld')) = " + encodeURIComponent(this.getAttribute('feld')));
the answer is: encodeURIComponent(this.getAttribute('feld')) = Artname%20vollsta%CC%88ndig
This happens all in the browser. But the issue only appears, when the web-app is served from the webserver (a couchapp running on cloudant.com).
How can it be that the method "getAttribute" returns a different encoding?
The following code has been tested on Chrome 29 OS X, IE 8 Windows XP.
encodeURIComponent("ä") // "%C3%A4"
decodeURIComponent("%C3%A4") // "ä"
So basically "%C3%A4" should be the expected output.
I think the issue here might be that encodeURIComponent requires UTF-8 encoding, while your server-side language returns something else.
encodeURIComponent - MDN
Just a follow-up in case somebody runs into this issue later.
It seems to be unique to cloudant.com where my couchapp was hosted.
This is the answer I got from their very helpful support:
OK - I think I've found the culprit. The issue is that, due to internal optimisations (which are not present in CouchDB), the form of unicode strings can get changed. In this case, ä is represented as:
U+0061 LATIN SMALL LETTER A character
U+0308 COMBINING DIAERESIS character (̈)
instead of
U+00E4 LATIN SMALL LETTER A WITH DIAERESIS character (ä)
Both are semantically equivalent, so the fix is to normalize your Unicode strings before comparison. Unfortunately, JavaScript has no built-in Unicode normalization, but you can use a library such as https://github.com/walling/unorm.
It's not an issue for me any more as I changed to a virtual server running on digitalocean.com with vanilla couchdb (and am very happy with it).
But I do think this could hit others developing couchapps in German or other languages needing UTF-8 and hosting them on cloudant.com.
Thanks for your great help.
Alex
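For anyone reading this later: engines that implement ES2015 or newer do ship a built-in String.prototype.normalize, so the two forms described by support can be reconciled without a library. A minimal sketch, where encodeNormalized is a hypothetical helper name:

```javascript
// Two representations of the same visible character "ä".
const decomposed = 'a\u0308'; // U+0061 + U+0308 (combining diaeresis)
const composed = '\u00e4';    // U+00E4 (precomposed)

console.log(encodeURIComponent(decomposed)); // the "a%CC%88" form from the question
console.log(encodeURIComponent(composed));   // the "%C3%A4" form from the question

// Hypothetical helper: normalize to NFC (composed form) before encoding,
// so both inputs produce the same encoded field name.
function encodeNormalized(value) {
  return encodeURIComponent(value.normalize('NFC'));
}

console.log(encodeNormalized('Artname vollsta\u0308ndig'));
console.log(encodeNormalized('Artname vollst\u00e4ndig'));
```

Both calls to encodeNormalized print the same string, so a lookup keyed on the encoded field name no longer depends on which form the attribute value arrived in.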
I got the regexp right, but it works perfectly in Firefox ONLY. How would I make this work in a cross-browser, cross-platform manner? Since it is file name and extension validation, you are right, I am using a File Upload control.
^[a-zA-Z0-9_\.]{3,28}(.pdf|.txt|.doc|.docx|.png|.gif|.jpeg|.jpg|.zip|.rar)$
It matches under these rules: the file name must not be empty (3 to 28 characters long), and the extension must be within the group.
This works superbly in Firefox, I assume because fileUpload.value = Filename.extension in Firefox. It fails badly in Google Chrome and IE. I am using the above with the .NET RegularExpressionValidator with client script enabled.
I know how to validate it on server, so please no server side solutions.
Note:
Google Chrome:
Provides the file upload control value as c:\fakePath\filename.extension
IE:
Provides the full path.
You can't anchor the pattern with ^ if you sometimes get a full path but are only interested in the file name. The dot before the file extension should also be escaped.
You could try something like this:
[^\\/]{3,}\.(pdf|txt|doc|docx|png|gif|jpeg|jpg|zip|rar)$
As it looks, you get only the file name with Firefox but the full path with other browsers.
I'd always add a / prefix to your string and then validate the last part after the last file separator (/ or \).
This example uses a lookbehind to check for the file separator (or the manually added /) before the file name, and also enforces the maximum of 28 characters for the file name:
(?<=[\\/])[\w\.]{3,28}\.(?:pdf|txt|doc|docx|png|gif|jpeg|jpg|zip|rar)$
As things stand, your regex validates garbage like the following:
....pdf
____pdf
It also rejects perfectly valid files:
i.jpg
my-pic.jpg
pic.JPG
The easiest approach is to validate things in multiple steps:
Extract the extension:
\.[a-zA-Z]{3,4}$
Lowercase the extension and validate it against an array of acceptable values.
Optionally validate the file's name (though I'd recommend cleaning it instead):
[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*
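The steps above can be sketched in client-side JavaScript as follows. The function name validateUpload and the ALLOWED_EXTENSIONS set are hypothetical; the extension list is taken from the question, and the path-stripping step is added so the same check works whether the browser reports a bare file name or a full path.

```javascript
// Extensions accepted in the question's original pattern.
const ALLOWED_EXTENSIONS = new Set([
  'pdf', 'txt', 'doc', 'docx', 'png', 'gif', 'jpeg', 'jpg', 'zip', 'rar',
]);

// Hypothetical helper: returns true when the upload value passes all steps.
function validateUpload(value) {
  // Drop any directory part (full path in IE, c:\fakePath\ in Chrome).
  const fileName = value.split(/[\\/]/).pop();

  // Step 1: extract the extension.
  const extMatch = fileName.match(/\.([a-zA-Z]{3,4})$/);
  if (!extMatch) return false;

  // Step 2: lowercase it and check it against the accepted values.
  if (!ALLOWED_EXTENSIONS.has(extMatch[1].toLowerCase())) return false;

  // Step 3 (optional): validate the name that precedes the extension.
  const baseName = fileName.slice(0, extMatch.index);
  return /^[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*$/.test(baseName);
}

console.log(validateUpload('c:\\fakePath\\my-pic.jpg'));
console.log(validateUpload('pic.JPG'));
console.log(validateUpload('....pdf'));
```

With this split, i.jpg, my-pic.jpg, and pic.JPG are accepted, while ....pdf and ____pdf are rejected. A length limit such as the original 3 to 28 characters could be added as a separate check on baseName if it is still wanted.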