I am trying to base64 encode matplotlib generated plots and write them to a html page on GAE.
Later, digitize the plots
Finally, converting the html page to pdf using xhtml2pdf library.
The code worked well if plots are created by jqplot. However once switched to matplotlib, I am having an error called Incorrect padding, which I do not know how to solve. Thanks for any suggestions.
Step 1
Python CODE:
import matplotlib
import matplotlib.pyplot as plt
import StringIO
import urllib, base64
plt.figure(1)
plt.hist([2,3,4,5,6,7], bins=3)
fig = plt.gcf()
imgdata = StringIO.StringIO()
fig.savefig(imgdata, format='png')
imgdata.seek(0) # rewind the data
uri_1 = 'data:image/png;base64,'
uri_2 = urllib.quote(base64.b64encode(imgdata.buf))
uri_2 += "=" * ((4 - len(uri_2 ) % 4) % 4)
uri_3 = uri_1 + uri_2
uri_4 = '<img id="chart1" src = "%s"/>' % (uri_3)
Step 2
Javascript
var n_plot = $('img[id^="chart"]').size();
i=1;
var imgData = [];
while(i <= n_plot){
//sometimes the plots are generated under jqplot
try{
imgData.push($('#chart'+i).jqplotToImageStr({}));
i=i+1
}
// This case is generated by matplotlib
catch(e){
imgData.push($('#chart'+i).attr('src'));
i=i+1
}
}
imgData_json = JSON.stringify(imgData)
Step 3
Python
pdf = pisa.CreatePDF(imgData_json, file(filename, "wb"))
That's the problem:
uri_2 += "=" * ((4 - len(uri_2 ) % 4) % 4)
Why do you think the url needs extra padding? The output of b64encode will already be padded if neccessary, adding extra padding will only confuse the decoder. Some may ignore the extra padding, most however will just produce an error.
Oh, and also for data urls it's unnecessary to quote the string - base64 doesn't contain characters that really need to be escaped. That's only needed if you need to use base64 in a regular url, as + and = can cause problems, but that's not the case for data urls.
Related
I want to input any Base64 string to function and get the PDF from there. So tried this way, It download the PDF but there is a error
"Failed to load PDF document."
This is the way I tried,
let data = "SGVsbG8gd29ybGQ=" //hello world
var bufferArray = this.base64ToArrayBuffer(data);
var binary_string = window.atob(data)
var len = bufferArray.length;
var bytes = new Uint8Array(len);
for (var i = 0; i < len; i++) {
bytes[i] = binary_string.charCodeAt(i);
}
let blob = new Blob([bytes.buffer], { type: 'application/pdf' })
var url = URL.createObjectURL(blob);
window.open(url);
//convert base64 string to arraybuffer
base64ToArrayBuffer(data) {
var bString = window.atob(data);
var bLength = bString.length;
var bytes = new Uint8Array(bLength);
for (var i = 0; i < bLength; i++) {
var ascii = bString.charCodeAt(i);
bytes[i] = ascii;
}
return bytes;
};
Base64 is not pdf so hello.b64 will never morph into hello.pdf
It needs a pdf header page and trailer in decimal bytes, those cannot be easily added as base64 object wrapping as too many variables.
The text/pdf needs careful script as text to wrap around the hello text see hello example https://stackoverflow.com/a/70748286/10802527
So as Base64 for example
JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgSGVsbG8gV29ybGQgICApJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G
<iframe type="application/pdf" width="95%" height=150 src="data:application/pdf;base64,JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgSGVsbG8gV29ybGQgICApJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G">frame</iframe>
Try above but may be blocked by security it will look like this for some users but not ALL !
In comments you asked how text could be manipulated in java script, and my stock answer is java script cannot generally be easily used to build PDF or edit Base64 content. However if you have prepared placeholders it can be changed by find and replace. But must be done with care as the total file length should never be changed.
As an example take the above as a prior template and switch the content to.
JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgRmFyZS10aGVlLXdlbGwpJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G
So by find and replace SGVsbG8gV29ybGQgICAp with RmFyZS10aGVlLXdlbGwp we get a text change:- (it is important the string length is a multiple of 4 and the length is the same)
<iframe type="application/pdf" width="95%" height=150 src="data:application/pdf;base64,JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgRmFyZS10aGVlLXdlbGwpJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G">frame</iframe>
and the result be
There are strict rules to be followed when using this method:-
Hello World ) is the template, note the inclusion of white space before the ) limit thus
Fare-thee-well) is as far as substitution is allowed in this case
so source field must be pre-planned to be big enough for largest replacement and is based on a plain text length of multiples of 3 (matches base64 blocks of 4)
I am trying to run a python file (which is actually running a Deep learning model) on a button click using Node JS. I am trying to achieve this using input form in html and routes in index.js file. But this is causing this error after running for a while:
I just want to run the python file in the background, no arguments, no input or output.
This is my index.html file:
<form action="/runpython" method="POST">
<button type="submit">Run python</button>
</form>
And this is my index.js file:
function callName(req, res) {
var spawn = require("child_process").spawn;
var process = spawn("python", ["denoising.py"]);
process.stdout.on("data", function (data) {
res.send(data.toString());
});
}
app.post("/runpython", callName);
Note: This works fine if I have simple print statement in my .py file
print("Hello World!")
But running below code in .py file creates an issue
"""# import modules"""
"""# loading previously trained model"""
import noisereduce as nr
import numpy as np
import librosa
import librosa.display
import IPython.display as ipd
import matplotlib.pyplot as plt
from keras.models import load_model
import soundfile as sf
model = load_model(
r'model/denoiser_batchsize_5_epoch_100_sample_2000_org_n_n.hdf5', compile=True)
"""# testing on real world audio
"""
# function of moving point average used for minimizing distortion in denoised audio.
def moving_average(x, w):
return np.convolve(x, np.ones(w), 'valid') / w
# audio , sr = librosa.load(r'real_world_data/noise speech.wav' , res_type='kaiser_fast')
audio, sr = librosa.load(r'real_world_data/winona.wav', res_type='kaiser_fast')
# audio, sr = librosa.load(r'real_world_data/babar.wav', res_type='kaiser_fast')
# audio, sr = librosa.load(r'real_world_data/sarfaraz_eng.wav', res_type='kaiser_fast')
print(audio)
print(len(audio))
ipd.Audio(data=audio, rate=22050)
real_audio_spec = np.abs(librosa.stft(audio))
fig, ax = plt.subplots()
img = librosa.display.specshow(librosa.amplitude_to_db(
real_audio_spec, ref=np.max), y_axis='log', x_axis='time', ax=ax)
ax.set_title('Power spectrogram input real audio ')
fig.colorbar(img, ax=ax, format="%+2.0f dB")
ipd.Audio(data=audio, rate=22050)
start = 0
end = 65536
print(len(audio))
print(len(audio)/22050)
split_range = int(len(audio) / 65536)
print(split_range)
predicted_noise = []
input_audio = []
for i in range(split_range):
audio_frame = audio[start:end]
input_audio.append(audio_frame)
audio_reshape = np.reshape(audio_frame, (1, 256, 256, 1))
prediction = model.predict(audio_reshape)
prediction = prediction.flatten()
predicted_noise.append([prediction])
start = start + 65536
end = end + 65536
predicted_noise = np.asarray(predicted_noise).flatten()
input_audio = np.asarray(input_audio).flatten()
real_pred_noise_spec = np.abs(librosa.stft(predicted_noise))
"""## input audio to model"""
ipd.Audio(data=input_audio, rate=22050)
sf.write('input_audio.wav', input_audio.astype(np.float32), 22050, 'PCM_16')
fig, ax = plt.subplots()
img = librosa.display.specshow(librosa.amplitude_to_db(
real_pred_noise_spec, ref=np.max), y_axis='log', x_axis='time', ax=ax)
ax.set_title('Power spectrogram pred noise of real audio ')
fig.colorbar(img, ax=ax, format="%+2.0f dB")
ipd.Audio(data=predicted_noise, rate=22050)
sf.write('predicted_noise.wav', predicted_noise.astype(
np.float32), 22050, 'PCM_16')
ipd.Audio(data=moving_average(predicted_noise, 8), rate=22050)
denoised_final_audio = input_audio - predicted_noise
real_denoised_audio_spec = np.abs(librosa.stft(denoised_final_audio))
fig, ax = plt.subplots()
img = librosa.display.specshow(librosa.amplitude_to_db(
real_denoised_audio_spec, ref=np.max), y_axis='log', x_axis='time', ax=ax)
ax.set_title('Power spectrogram final denoised real audio ')
fig.colorbar(img, ax=ax, format="%+2.0f dB")
ipd.Audio(data=denoised_final_audio, rate=22050)
sf.write('denoised_final_audio_by_model.wav',
denoised_final_audio.astype(np.float32), 22050, 'PCM_16')
"""## moving point average of the real world denoised signal"""
real_world_mov_avg = moving_average(denoised_final_audio, 4)
print(real_world_mov_avg)
print(len(real_world_mov_avg))
ipd.Audio(data=real_world_mov_avg, rate=22050)
"""## noise reduce library"""
# !pip install noisereduce
"""### nr on real world audio"""
# if you cant import it. than you need to install it using 'pip install noisereduce'
"""#### using noise reduce directly on the real world audio to see how it works on it. """
reduced_noise_direct = nr.reduce_noise(
y=audio.flatten(), sr=22050, stationary=False)
ipd.Audio(data=reduced_noise_direct, rate=22050)
sf.write('denoised_input_audio_direct_by_noisereduce_no_model.wav',
reduced_noise_direct.astype(np.float32), 22050, 'PCM_16')
"""#### using noise reduce on model denoised final output. to make it more clean."""
# perform noise reduction
reduced_noise = nr.reduce_noise(y=real_world_mov_avg.flatten(
), sr=22050, y_noise=predicted_noise, stationary=False)
# wavfile.write("mywav_reduced_noise.wav", rate, reduced_noise)
ipd.Audio(data=reduced_noise, rate=22050)
sf.write('denoised_final_audio_by_model_than_noisereduce_applied.wav',
reduced_noise.astype(np.float32), 22050, 'PCM_16')
print("python code executed")
If there is any alternative, then please let me know. I am new to Node JS and this is the only workable method I found
Why are you using res.send(data.toString());, I don't see any use of this line in your code. Try removing the mentioned code and run again.
I'm newbie in python programming. I'm learning beautifulsoup to scrap website.
I want to extract and store the value of "stream" to my variable.
My Python code as follows :
import bs4 as bs #Importing BeautifulSoup4 Python Library.
import urllib.request
import requests
import json
import re
headers = {'User-Agent':'Mozilla/5.0'}
url = "http://thoptv.com/partners/mhdTVlive/Core.php?level=1200&channel=Dsports_HD"
page = requests.get(url)
soup = bs.BeautifulSoup(page.text,"html.parser")
pattern = re.compile('var stream = (.*?);')
scripts = soup.find_all('script')
for script in scripts:
if(pattern.match(str(script.string))):
data = pattern.match(script.string)
links = json.loads(data.groups()[0])
print(links)
This is the source javascript code to get the stream url value.
https://content.jwplatform.com/libraries/oncyToRO.js'>if( navigator.userAgent.match(/android/i)||
navigator.userAgent.match(/webOS/i)||
navigator.userAgent.match(/iPhone/i)||
navigator.userAgent.match(/iPad/i)||
navigator.userAgent.match(/iPod/i)||
navigator.userAgent.match(/BlackBerry/i)||
navigator.userAgent.match(/Windows Phone/i)) {var stream =
"http://ssrigcdnems01.cdnsrv.jio.com/jiotv.live.cdn.jio.com/Dsports_HD/Dsports_HD_800.m3u8?jct=ibxIPxc6rkq1yIUJb4RlEV&pxe=1504146411&st=AQIC5wM2LY4SfczRaEwgGl4Dyvly_3HihdlD_Oduojk5Kxs.AAJTSQACMDIAAlNLABQtNjUxNDEwODczODgxNzkyMzg5OQACUzEAAjYw";}else{var
stream =
"http://hd.simiptv.com:8080//index.m3u8?key=VIoVSsGRLRouHWGNo1epzX&exp=932213423&domain=thoptv.stream&id=461";}jwplayer("THOPTVPlayer").setup({"title":
'thoptv.stream',"stretching":"exactfit","width": "100%","file":
none,"height": "100%","skin": "seven","autostart": "true","logo":
{"file":"https://i.imgur.com/EprI2uu.png","margin":"-0",
"position":"top-left","hide":"false","link":"http://mhdtvlive.co.in"},"androidhls":
true,});jwplayer("THOPTVPlayer").onError(function(){jwplayer().load({file:"http://content.jwplatform.com/videos/7RtXk3vl-52qL9xLP.mp4",image:"http://content.jwplatform.com/thumbs/7RtXk3vl-480.jpg"});jwplayer().play();});jwplayer("THOPTVPlayer").onComplete(function(){window.location
= window.location.href;});jwplayer("THOPTVPlayer").onPlay(function(){clearTimeout(theTimeout);});
I need to extract the url from stream.
var stream = "http://ssrigcdnems01.cdnsrv.jio.com/jiotv.live.cdn.jio.com/Dsports_HD/Dsports_HD_800.m3u8?jct=ibxIPxc6rkq1yIUJb4RlEV&pxe=1504146411&st=AQIC5wM2LY4SfczRaEwgGl4Dyvly_3HihdlD_Oduojk5Kxs.AAJTSQACMDIAAlNLABQtNjUxNDEwODczODgxNzkyMzg5OQACUzEAAjYw";}
Rather then thinking complicated with regex, if the link is the only dynamically changing part, you can split the string with some known separating tokens.
x = """
https://content.jwplatform.com/libraries/oncyToRO.js'>if( navigator.userAgent.match(/android/i)|| navigator.userAgent.match(/webOS/i)|| navigator.userAgent.match(/iPhone/i)|| navigator.userAgent.match(/iPad/i)|| navigator.userAgent.match(/iPod/i)|| navigator.userAgent.match(/BlackBerry/i)|| navigator.userAgent.match(/Windows Phone/i)) {var stream = "http://ssrigcdnems01.cdnsrv.jio.com/jiotv.live.cdn.jio.com/Dsports_HD/Dsports_HD_800.m3u8?jct=ibxIPxc6rkq1yIUJb4RlEV&pxe=1504146411&st=AQIC5wM2LY4SfczRaEwgGl4Dyvly_3HihdlD_Oduojk5Kxs.AAJTSQACMDIAAlNLABQtNjUxNDEwODczODgxNzkyMzg5OQACUzEAAjYw";}else{var stream = "http://hd.simiptv.com:8080//index.m3u8?key=VIoVSsGRLRouHWGNo1epzX&exp=932213423&domain=thoptv.stream&id=461";}jwplayer("THOPTVPlayer").setup({"title": 'thoptv.stream',"stretching":"exactfit","width": "100%","file": none,"height": "100%","skin": "seven","autostart": "true","logo": {"file":"https://i.imgur.com/EprI2uu.png","margin":"-0", "position":"top-left","hide":"false","link":"http://mhdtvlive.co.in"},"androidhls": true,});jwplayer("THOPTVPlayer").onError(function(){jwplayer().load({file:"http://content.jwplatform.com/videos/7RtXk3vl-52qL9xLP.mp4",image:"http://content.jwplatform.com/thumbs/7RtXk3vl-480.jpg"});jwplayer().play();});jwplayer("THOPTVPlayer").onComplete(function(){window.location = window.location.href;});jwplayer("THOPTVPlayer").onPlay(function(){clearTimeout(theTimeout);});
"""
left1, right1 = x.split("Phone/i)) {var stream =")
left2, right2 = right1.split(";}else")
print(left2)
# "http://ssrigcdnems01.cdnsrv.jio.com/jiotv.live.cdn.jio.com/Dsports_HD/Dsports_HD_800.m3u8?jct=ibxIPxc6rkq1yIUJb4RlEV&pxe=1504146411&st=AQIC5wM2LY4SfczRaEwgGl4Dyvly_3HihdlD_Oduojk5Kxs.AAJTSQACMDIAAlNLABQtNjUxNDEwODczODgxNzkyMzg5OQACUzEAAjYw"
pattern.match() matches the pattern from the beginning of the string. Try using pattern.search() instead - it will match anywhere within the string.
Change your for loop to this:
for script in scripts:
data = pattern.search(script.text)
if data is not None:
stream_url = data.groups()[0]
print(stream_url)
You can also get rid of the surrounding quotes by changing the regex pattern to:
pattern = re.compile('var stream = "(.*?)";')
so that the double quotes are not included in the group.
You might also have noticed that there are two possible stream variables depending on the accessing user agent. For tablet like devices the first would be appropriate, while all other user agents should use the second stream. You can use pattern.findall() to get all of them:
>>> pattern.findall(script.text)
['"http://ssrigcdnems01.cdnsrv.jio.com/jiotv.live.cdn.jio.com/Dsports_HD/Dsports_HD_800.m3u8?jct=LEurobVVelOhbzOZ6EkTwr&pxe=1571716053&st=AQIC5wM2LY4SfczRaEwgGl4Dyvly_3HihdlD_Oduojk5Kxs.*AAJTSQACMDIAAlNLABQtNjUxNDEwODczODgxNzkyMzg5OQACUzEAAjYw*"', '"http://hd.simiptv.com:8080//index.m3u8?key=vaERnLJswnWXM8THmfvDq5&exp=944825312&domain=thoptv.stream&id=461"']
this code works for me
import bs4 as bs #Importing BeautifulSoup4 Python Library.
import urllib.request
import requests
import json
headers = {'User-Agent':'Mozilla/5.0'}
url = "http://thoptv.com/partners/mhdTVlive/Core.php?
level=1200&channel=Dsports_HD"
page = requests.get(url)
soup = bs.BeautifulSoup(page.text,"html.parser")
scripts = soup.find_all('script')
out = list()
for c, i in enumerate(scripts): #go over list
text = i.text
if(text[:2] == "if"): #if the (if) comes first
for count, t in enumerate(text): # then we have reached the correct item in the list
if text[count] == "{" and text[count + 1] == "v" and text[count + 5] == "s": # and if this is here that stream is set
tmp = text[count:] # add this to the tmp varible
break # and end
co = 0
for m in tmp: #loop over the results from prev. result
if m == "\"" and co == 0: #if string is starting
co = 1 #set count to "true" 1
elif m == "\"" and co == 1: # if it is ending stop
print(''.join(out)) #results
break
elif co == 1:
# as long as we are looping over the rigth string
out.append(m) #add to out list
pass
result = ''.join(out) #set result
it basicly filters the string manuely.
but if we use user1767754 method (brilliant by the way) we will end up something like this:
import bs4 as bs #Importing BeautifulSoup4 Python Library.
import urllib.request
import requests
import json
headers = {'User-Agent':'Mozilla/5.0'}
url = "http://thoptv.com/partners/mhdTVlive/Core.php?level=1200&channel=Dsports_HD"
page = requests.get(url)
soup = bs.BeautifulSoup(page.text,"html.parser")
scripts = soup.find_all('script')
x = scripts[3].text
left1, right1 = x.split("Phone/i)) {var stream =")
left2, right2 = right1.split(";}else")
print(left2)
I needed to convert the following function to python to deobfuscate a text extracted while web scraping:
function obfuscateText(coded, key) {
// Email obfuscator script 2.1 by Tim Williams, University of Arizona
// Random encryption key feature by Andrew Moulden, Site Engineering Ltd
// This code is freeware provided these four comment lines remain intact
// A wizard to generate this code is at http://www.jottings.com/obfuscator/
shift = coded.length
link = ""
for (i = 0; i < coded.length; i++) {
if (key.indexOf(coded.charAt(i)) == -1) {
ltr = coded.charAt(i)
link += (ltr)
}
else {
ltr = (key.indexOf(coded.charAt(i)) - shift + key.length) % key.length
link += (key.charAt(ltr))
}
}
document.write("<a href='mailto:" + link + "'>" + link + "</a>")
}"""
here is my converted python equivalent:
def obfuscateText(coded,key):
shift = len(coded)
link = ""
for i in range(0,len(coded)):
inkey=key.index(coded[i]) if coded[i] in key else None
if ( not inkey):
ltr = coded[i]
link += ltr
else:
ltr = (key.index(coded[i]) - shift + len(key)) % len(key)
link += key[ltr]
return link
print obfuscateText("uw#287u##Guw#287Xw8Iwu!#W7L#", "WXYVZabUcdTefgShiRjklQmnoPpqOrstNuvMwxyLz01K23J456I789H.#G!#$F%&E'*+D-/=C?^B_`{A|}~")
actionattraction$comcastWnet
but I am getting a slightly incorrect output instead of actionattraction#comcast.net I get above. Also many a times the above code gives random characters for the same html page,
The target html page has a obfuscateText function in JS with the coded and key, I extract the function signature in obsfunc and execute it on the fly:
email=eval(obsfunc)
which stores the email in above variable, but the problem is that it works most of the time but fails certain times , I strongly feel that the problem is with the arguments supplied to the python function , they may need escaping or conversion as it contains special characters? I tried passing raw arguments and different castings like repr() but the problem persisted.
Some examples for actionattraction#comcast.net wrongly computed and correctly computed using the same python function(first line is email):
#ation#ttr#ationVaoma#st!nct
obfuscateText("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~")
}ction}ttr}ction#comc}st.net
obfuscateText("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~")
actionattraction#comcast.net
obfuscateText("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~")
I've rewritten the deobfuscator:
def deobfuscate_text(coded, key):
offset = (len(key) - len(coded)) % len(key)
shifted_key = key[offset:] + key[:offset]
lookup = dict(zip(key, shifted_key))
return "".join(lookup.get(ch, ch) for ch in coded)
and tested it as
tests = [
("KMd%Y#Kdd8KMd%Y#IMY!MKcdJ#*d", "utvsrwqxpyonzm0l1k2ji3h4g5fe6d7c8b9aZ.Y#X!WV#U$T%S&RQ'P*O+NM-L/K=J?IH^G_F`ED{C|B}A~"),
("}ARGML}RRP}ARGMLjAMKA}QRiLCR", "}|{`_^?=/-+*'&%$#!#.9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA~"),
("DEWLRQDWWUDEWLRQoERPEDVWnQHW", "%&$#!#.'9876*54321+0zyxw-vutsr/qponm=lkjih?gfed^cbaZY_XWVUT`SRQPO{NMLKJ|IHGFE}DCBA~"),
("ZUhq4uh#e4Om.04O", "ksYSozqUyFOx9uKvQa2P4lEBhMRGC8g6jZXiDwV5eJcAp7rIHL31bnTWmN0dft")
]
for coded,key in tests:
print(deobfuscate_text(coded, key))
which gives
actionattraction#comcast.net
actionattraction#comcast.net
actionattraction#comcast.net
anybody#home.com
Note that all three key strings contain &; replacing it with & fixes the problem. Presumably at some point the javascript was mistakenly html-code-escaped; Python has a module which will unencode html special characters like so:
# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)
# Python 3.x:
import html.parser
html_parser = html.parser.HTMLParser()
unescaped = html_parser.unescape(my_string)
First of all, index doesn't return None, but throws an exception. In your case, W appears instead of a dot because the index returned is 0, and not inkey (which is also wrong) mistakenly beleive that a character is not present in the key.
Second, presence of & suggests that you indeed may have to find and decode HTML entities.
Finally, I'd recommend to rewrite it like
len0 = len(code)
len1 = len(key)
link = ''
for ch in code:
try:
ch = key[(key.index(ch) - len0 + len1) % len1]
except ValueError: pass
link += ch
return link
I am working with the FatSecret REST API
Im using the OAuthSimple javascript library to generate the signed url.
Here's the code I have -
params['oauth_timestamp'] = Math.floor(new Date().getTime()/1000);
params['oauth_nonce'] = '1234';
params['oauth_version'] = '1.0';
var paramStr = '';
for(var key in params){
paramStr += key+"="+params[key]+"&";
}
paramStr = paramStr.substring(0,paramStr.length-1);
var oauth = OAuthSimple();
oauth.setAction('GET');
var o = oauth.sign(
{
path:this.requestURL,
parameters: paramStr,
signatures:{
api_key:this.apiKey,
shared_secret:this.sharedSecret,
access_token: this.accessToken,
access_secret: this.accessSecret
}
});
console.log(o.signed_url);
return o.signed_url;
params is an associative array containing all the non oauth related parameters for this call.
When I use this signed url I get an "invalid/used nonce"
The OAuth Testing Tool uses the same OAuthSimple library and if I put in all the same parameters (including the timestamp) it generates exactly the same url.
The only difference is that the url generated by the testing tool works and gives me the full response from the server. The url generated by my code does't.
I tried various nonce values including sending a MD5 of the timestamp but I get the same error. The reason I'm using 1234 right now is that the testing tool uses 1234 by default and that seems to work.
Any help is appreciated. Thanks in advance.
Updating #Saravanan's answer with something that works on current browsers:
function genNonce() {
const charset = '0123456789ABCDEFGHIJKLMNOPQRSTUVXYZabcdefghijklmnopqrstuvwxyz-._~'
const result = [];
window.crypto.getRandomValues(new Uint8Array(32)).forEach(c =>
result.push(charset[c % charset.length]));
return result.join('');
}
console.info(genNonce());
The nonce value as per twitter documentation:
The value for this request was generated by base64 encoding 32 bytes of random data, and stripping out all non-word characters, but any
approach which produces a relatively random alphanumeric string should
be OK here.
Based on the above notes, I use the following javascript code to generate nonce value each time I send a request:
var nonceLen = 32;
return crypto.randomBytes(Math.ceil(nonceLen * 3 / 4))
.toString('base64') // convert to base64 format
.slice(0, nonceLen) // return required number of characters
.replace(/\+/g, '0') // replace '+' with '0'
.replace(/\//g, '0'); // replace '/' with '0'
Try this if it works!
Try this
This works every time
var nonce = Math.random().toString(36).replace(/[^a-z]/, '').substr(2);