Chrome Extension sends unreadable data to python script - javascript

I am new to Chrome extensions and just built a popup that, when submitted via JavaScript, sends info to a Python script on GAE, which works with the data. Everything works perfectly fine as long as I do not use special characters like Ä, Ö, Ü. When I do use these letters, I get the error:
Traceback (most recent call last):
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in __call__
rv = self.handle_exception(request, response, e)
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in __call__
rv = self.router.dispatch(request, response)
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in __call__
return handler.dispatch()
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "/base/data/home/apps/s~google.com:finaggintel/1.368063289009985228/main.py", line 115, in post
t.title = self.request.get('title').encode('utf-8')
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 175, in get
param_value = self.get_all(argument_name)
File "/python27_runtime/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 212, in get_all
param_value = self.params.getall(argument_name)
File "/python27_runtime/python27_lib/versions/third_party/webob-1.1.1/webob/multidict.py", line 327, in getall
return map(self._decode_value, self.multi.getall(self._encode_key(key)))
File "/python27_runtime/python27_lib/versions/third_party/webob-1.1.1/webob/multidict.py", line 301, in _decode_value
value = value.decode(self.encoding, self.errors)
File "/python27_runtime/python27_dist/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xdc in position 0: unexpected end of data
To be frank - I have no idea where to debug this issue. I tried utf-8 de- and encoding in Python (but again, this is new to me):
class News(webapp2.RequestHandler):
    def post(self):
        try:
            user_job = joblist[user][0]
            user_pod = joblist[user][1]
        except KeyError:
            user_job = 'Guest'
            user_pod = 'Guest'
        link = self.request.get('link').encode('utf-8')
        if 'http' not in self.request.get('link'):
            link ='http://'+self.request.get('link')
        else:
            link = self.request.get('link')
        t = NewsBase(parent=news_key('finaggnews'))
        t.user = user
        t.date = datetime.now()
        t.text = self.request.get('text').encode('utf-8')
        t.title = self.request.get('title').encode('utf-8')
        t.link = link
        t.upvotes = []
        t.downvotes = []
        t.put()
Am I doing something wrong? Am I even close to the issue? Thanks for your help!
EDIT: Included Traceback

You have it back to front: you should be decoding the inbound data to a unicode representation, not encoding it.
For example:
>>> x = "Ä"
>>> x.decode('utf-8')
u'\xc4'
>>>
>>> y=x.decode('utf-8')
>>> print y
Ä
>>>
So for your line
t.title = self.request.get('title').encode('utf-8')
try
t.title = self.request.get('title').decode('utf-8')
However, this assumes the incoming data really is UTF-8 encoded.
You should specify accept-charset="utf-8" in the form (or on the client when posting) so that the encoding is defined explicitly, rather than guessing and then trying to decode.
For instance, on Windows the default encoding isn't utf-8 but latin_1, and decoding latin_1 bytes as utf-8 won't work. The byte that decode('utf-8') was failing on (0xdc) can be decoded if you use decode('latin_1').
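A minimal sketch of the point above, written in Python 3 syntax (the handler in the question runs on Python 2.7, where the same codecs apply). The byte 0xdc is taken from the traceback; the variable names are only illustrative.

```python
# The single byte reported in the traceback.
raw = b"\xdc"

# Decoding it as UTF-8 fails: 0xdc starts a multi-byte sequence
# but no continuation byte follows.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding the same byte as latin_1 succeeds and yields U+00DC.
as_latin1 = raw.decode("latin_1")

# For comparison: the same character encoded as UTF-8 takes two bytes.
as_utf8_bytes = "\u00dc".encode("utf-8")

print(utf8_ok, as_latin1, as_utf8_bytes)
```

If the client is changed to send UTF-8, the two-byte form is what the server should receive for Ü.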

Related

Inserting jinja2 syntax using .setAttribute

my first question here. Hope it is clear enough!
I need to insert the jinja2 code into the "name" and "id" fields of newly created cells in a table. Highlighting indicates that JS thinks that "order" and "month" are variables, but they are not.
I am using flask for this app, and it gives an error.
jinja2.exceptions.UndefinedError: 'month' is undefined
Traceback (most recent call last)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 2548, in __call__
return self.wsgi_app(environ, start_response)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 2528, in wsgi_app
response = self.handle_exception(e)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 2525, in wsgi_app
response = self.full_dispatch_request()
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 1822, in full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 1820, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\app.py", line 1796, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\app.py", line 66, in index
return render_template("index.html", months=months, row_count=row_count)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\templating.py", line 147, in render_template
return _render(app, template, context)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\flask\templating.py", line 130, in _render
rv = template.render(context)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\jinja2\environment.py", line 1301, in render
self.environment.handle_exception()
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\jinja2\environment.py", line 936, in handle_exception
raise rewrite_traceback_stack(source=source)
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\templates\index.html", line 1, in top-level template code
{% extends "layout.html" %}
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\templates\layout.html", line 55, in top-level template code
jinja_id = 'yb_col{{month['order']}}';
File "C:\Users\Евгения\Documents\GitHub\CS50xFinalProject\env\Lib\site-packages\jinja2\environment.py", line 466, in getitem
return obj[argument]
jinja2.exceptions.UndefinedError: 'month' is undefined
Here is my JS code:
} else {
    input.setAttribute("type","number");
    input.setAttribute("class", "month");
    input.setAttribute("id", "yb_col{{month["order"]}}");
    input.setAttribute("name", "{{month["month"]}}");
}
The section after "else" is what gave me trouble.
I have tried storing the Jinja syntax in a variable and passing that variable instead. As expected, that did not work.
The name and id attributes need to use this syntax to stay consistent with the rest of the HTML.
I have also tried the element.attribute route, but in this case assigning the Jinja expression to a variable is the problem.
} else {
    jinja_id = "yb_col{{month["order"]}}";
    jinja_name = "{{month["month"]}}";
    input.setAttribute("type","number");
    input.setAttribute("class", "month");
    input.id = jinja_id;
    input.name = jinja_name;
}
Edit: using ' in the Jinja expression instead of " takes care of the highlighting, but not of the issue itself.
Edit 2: The Jinja expression is inserted just fine as long as it does not refer to the loop variable of a {% for n in x %} block. The reference to n is where the problem lies.

Parse <script type=“text/javascript” twitter python

The page source is very long.
I need to parse screen_name out of this script tag:
<script type="text/javascript" charset="utf-8" nonce="YjJmNTAwODgtODBmMy00YzQ5LWJhODItMmQwNTk0Yjg4MTI1">window.__INITIAL_STATE__={"optimist":[],"urt":{},"toasts":[],"needs_phone_verification":false,"normal_followers_count":2,"notifications":false,"pinned_tweet_ids_str":[],"profile_image_url_https":"https://pbs.twimg.com/profile_images/1174197230003208192/qK5cqalJ_normal.jpg","profile_interstitial_type":"","protected":false,"featureSwitch":{"config":{"2fa_multikey_management_enabled":{"value":false},"screen_name":"Vickson25435099","always_use_https":true,"use_cookie_personalization":false,"sleep_time":{"enabled":false,"end_time":null,"start_time":null},"geo_enabled":false,"language":"en","discoverable_by_email":true,"discoverable_by_mobile_phone":true,"personalized_trends":true,"allow_media_tagging":"none","allow_contributor_request":"all","allow_ads_personalization":true,"allow_logged_out_device_personalization":true,"allow_location_history_personalization":true,"allow_sharing_data_for_third_party_personalization":false,"allow_dms_from":"following","allow_dm_groups_from":"following","translator_type":"none","country_code":"us","nsfw_user":false,"nsfw_admin":false,"ranked_timeline_setting":1,"ranked_timeline_eligible":null,"address_book_live_sync_enabled":false,"universal_quality_filtering_enabled":"enabled","dm_receipt_setting":"all_disabled","alt_text_compose_enabled":null,"mention_filter":"unfiltered","allow_authenticated_periscope_requests":true,"protect_password_reset":false,"require_password_login":false,"requires_login_verification":false,"dm_quality_filter":"enabled","autoplay_disabled":false,"settings_metadata":{}},"fetchStatus":"loaded"},"dataSaver":{"dataSaverMode":false},"transient":{"dtabBarInfo":{"hide":false},"loginPromptShown":false,"lastViewedDmInboxPath":"/messages","themeFocus":""}},"devices":{"browserPush":{"fetchStatus":"none","pushNotificationsPrompt":{"dismissed":false,"fetchStatus":"none"},"subscribed":false,"supported":null},"devices":{"data"
:{"emails":[],"phone_numbers":[]},"fetchStatus":"none"},"notificationSettings":{"push_settings":{"error":null,"fetchStatus":"none"},"push_settings_template":{"template":{"settings":[]}},"sms_settings":{"error":null,"fetchStatus":"none"},"sms_settings_template":{"template":{"settings":[]}},"checkin_time":null}},"audio":{"conversationLookup":{}},"hashflags":{"fetchStatus":"none","hashflags":{}},"friendships":{"pendingFollowers":{"acceptedIds":[],"ids":[],"fetchStatus":{"bottom":"none","top":"none"},"hydratedIds":[]}},"homeTimeline":{"useLatest":false,"fetchStatus":"none"},"multiAccount":{"fetchStatus":"none","users":[],"badgeCounts":{},"addAccountFetchStatus":"none"},"badgeCount":{"unreadDMCount":0},"ocf_location":{"startLocation":{}},"navigation":{},"teams":{"fetchStatus":"none","teams":{}},"cardState":{},"promotedContent":{}};window.__META_DATA__={"env":"prod","isLoggedIn":true,"isRTL":false,"hasMultiAccountCookie":false,"uaParserTags":["m2","rweb","msw"],"serverDate":1614578006755,"sha":"9921d3a6d626dc45b0f5a65681ef95c891d815cd"};window.__PREFETCH_DATA__={"items":[{"key":"dataUsageSettings","payload":{"dataSaverMode":false}}],"timestamp":1614578006700};</script>
I'm trying this method:
import requests
import json
from bs4 import BeautifulSoup
x = requests.get('https://twitter.com/home')
b = BeautifulSoup(x.text, 'html.parser')
for b in b.find_all('script'):
    wis = x.text.split('window.__INITIAL_STATE__=')
    if len(wis) > 1:
        data = json.loads(wis[1].split(';')[0])
        print(data["screen_name"])
Result: KeyError "screen_name"
And this way doesn't work either:
import requests
import json
x = requests.get('https://twitter.com/home')
html = x.text.split('window.__INITIAL_STATE__=')[0]
html = html.split(';</script>')[0]
data = json.loads(html)
print(data['screen_name'])
Result
Traceback (most recent call last):
File "<string>", line 8, in <module>
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Update, using the full source HTML:
for script in b.find_all('script'):
    if 'window.__INITIAL_STATE__=' not in script.contents[0]:
        continue
    wis = script.contents[0].split('window.__INITIAL_STATE__=')
    data = json.loads(wis[1].split(';window.__META_DATA__')[0])
    print(data["settings"]["remote"]["settings"]["screen_name"])
    break
You won't get screen_name this way: it is only present for the currently logged-in user, so you have to send the request with valid cookies to get the data.
By the way, the example above contains multiple variables (JSON objects); you want the JSON between window.__INITIAL_STATE__= and ,"devices":
b = BeautifulSoup(html, 'html.parser')
for script in b.find_all('script'):
    if 'window.__INITIAL_STATE__=' not in script.contents[0]:
        continue
    wis = script.contents[0].split('window.__INITIAL_STATE__=')
    data = json.loads(wis[1].split(',"devices"')[0])
    print(data['featureSwitch']['config']['screen_name'])
    break
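Splitting on ';' or on ',"devices"' depends on what happens to follow the object in one particular page. As a hedged alternative (not part of the answer above), the standard library's json.JSONDecoder.raw_decode parses a single JSON value starting at a given index and ignores whatever comes after it. The helper name extract_js_object and the sample string are hypothetical.

```python
import json


def extract_js_object(script_text, marker):
    """Return the JSON value that directly follows `marker`, or None."""
    start = script_text.find(marker)
    if start == -1:
        return None
    # raw_decode stops at the end of the first complete JSON value,
    # so trailing JavaScript such as ';window.__META_DATA__=...' is ignored.
    obj, _end = json.JSONDecoder().raw_decode(script_text, start + len(marker))
    return obj


# Hypothetical, heavily shortened stand-in for the script contents.
sample = (
    'window.__INITIAL_STATE__={"featureSwitch":{"config":'
    '{"screen_name":"example_user"}},"note":"a;b"};'
    'window.__META_DATA__={"env":"prod"};'
)

state = extract_js_object(sample, "window.__INITIAL_STATE__=")
meta = extract_js_object(sample, "window.__META_DATA__=")
print(state["featureSwitch"]["config"]["screen_name"])
```

Because the decoder tracks string boundaries itself, a ';' inside a string value (as in "note" here) does not cut the JSON short the way split(';') would.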

How to parse JavaScript Json into Python dict type, efficiently

I am looking for a way to read the JavaScript JSON data loaded into one of the script tags of this page. I have tried various regex patterns posted on Google and Stack Overflow but got nothing.
The JSON formatter reports the data as invalid (RFC 8259).
Here is the code:
import requests,json
from scrapy.selector import Selector
headers = {'Content-Type': 'application/json', 'Accept-Language': 'en-US,en;q=0.5', 'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B179 Safari/7534.48.3'}
url = 'https://www.zocdoc.com/doctor/andrew-fagelman-md-7363?insuranceCarrier=-1&insurancePlan=-1'
response = requests.get(url,headers = headers)
sel = Selector(text = response.text)
profile_data = sel.css('script:contains(APOLLO_STATE)::text').get('{}').split('__REDUX_STATE__ = JSON.parse(')[-1].split(');\n window.ZD = {')[0]
profile_json = json.loads(profile_data)
print(type(profile_json))
The problem seems to be an invalid JSON format. The type of profile_json is str, while a small amendment to the code above produces the error stack below:
>>> profile_data = sel.css('script:contains(APOLLO_STATE)::text').get('{}').split('__REDUX_STATE__ = JSON.parse("')[-1].split('");\n window.ZD = {')[0].replace("\\","")
>>> profile_json = json.loads(profile_data)
Traceback (most recent call last):
File "/usr/lib/python3.6/code.py", line 91, in runcode
exec(code, self.locals)
File "<console>", line 1, in <module>
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 41316 (char 41315)
The original HTML contains this (heavily trimmed):
<script>
...
window.__REDUX_STATE__ = JSON.parse("{\"routing\": ...
\"awards\":[\"Journal of Urology - \\\"Efficacy, Safety, and Use of Viagra in Clinical Practice.\\\"\",\"Critical Care Resident of the Year - 2003\"],
...
The same string extracted by scrapy is this:
"awards":[
"Journal of Urology - ""Efficacy",
"Safety",
"and Use of Viagra in Clinical Practice.""",
"Critical Care Resident of the Year - 2003"
],
It appears the backslashes are removed from it, making the JSON invalid.
I don't know if this is an efficient way of handling the problem but below code resolved my problem.
>>> import js2xml
>>> profile_data = sel.css('script:contains(APOLLO_STATE)::text').get('{}')
>>> parsed = js2xml.parse(profile_data)
>>> js = json.loads(parsed.xpath("//string[contains(text(),'routing')]/text()")[0])
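As a hedged alternative to js2xml that stays within the standard library: the argument of JSON.parse("...") is a JavaScript string literal that itself contains JSON, so it has to be decoded twice. That would also explain why the first attempt returned a str. The sketch below assumes the literal is double-quoted and uses only escapes that JSON also accepts; the script_text value is a hypothetical, shortened stand-in for the page's script.

```python
import json

# Hypothetical stand-in for the script text (raw string keeps the backslashes).
script_text = (
    r'window.__REDUX_STATE__ = JSON.parse("{\"awards\":'
    r'[\"Journal - \\\"Efficacy, Safety\\\"\",\"Resident of the Year\"]}");'
)

marker = "JSON.parse("
start = script_text.index(marker) + len(marker)

# First pass: read the quoted JavaScript string literal as a JSON string.
# raw_decode stops at the closing quote and ignores the trailing ');'.
literal, _end = json.JSONDecoder().raw_decode(script_text, start)

# Second pass: the decoded string is the actual JSON document.
state = json.loads(literal)
print(type(literal).__name__, state["awards"])
```

Stripping the backslashes with replace("\\", "") removes the escaping that protects the inner quotes, which matches the broken "awards" value shown above.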

Django Ajax Post with HTML content using JSON.stringify then json.loads error

I have a textarea within a form on a webpage that has "HTML" content.
<!-- HTML -->
<textarea id="my-textarea">
<div class="this">Content here & here!</div>
</textarea>
I fetch the content with JavaScript, use encodeURIComponent to encode the string safely for the AJAX JSON request, and store it in a key/value object (a dict, in Python terms).
// Javascript
var textarea = document.getElementById('my-textarea').value;
var data = {};
data['html'] = encodeURIComponent(textarea);
console.log(data);
// prints --> {html: "%3Cdiv%20class%3D%22this%22%3EContent%20here%20%26amp%3B%20here%3C%2Fdiv%3E"}
// In AJAX function.
var json = "data=" + JSON.stringify(data);
I then send the data to Django class based view and in post I have
# Python / Django
if request.is_ajax():
    print(request.POST)
    # prints --> <QueryDict: {'data': ['{"code":"<div class="this">Content here & here</div>"}']}>
    data = request.POST.get('data', None)
    if data:
        data = json.loads(data)
This throws an error as below:
Traceback (most recent call last):
File "/home/tr/dev/host-root/apps/trenddjango2/venv/lib/python3.8/site-packages/django/core/handlers/exception.py", line 34, in inner
response = get_response(request)
File "/home/tr/dev/host-root/apps/trenddjango2/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 115, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/home/tr/dev/host-root/apps/trenddjango2/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 113, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/tr/dev/host-root/apps/trenddjango2/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 71, in view
return self.dispatch(request, *args, **kwargs)
File "/home/tr/dev/host-root/apps/trenddjango2/django/common/views/dashboard/mixins.py", line 56, in dispatch
return super(TemplateDashboardMixin, self).dispatch(request, *args, **kwargs)
File "/home/tr/dev/host-root/apps/trenddjango2/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 97, in dispatch
return handler(request, *args, **kwargs)
File "/home/tr/dev/host-root/apps/trenddjango2/django/common/views/dashboard/catalogue.py", line 550, in post
data = json.loads(jsonData)
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 22 (char 21)
I have found that if I remove the double quotes in class="this", the string loads correctly.
My question is: how do I load a JSON string containing double quotes in Python, even when they are escaped as %22?
Use \\ to escape the inner double quotes:
json.loads('{"code":"<div class=\\"this\\">Content here & here</div>"}')
This will work as you expect. The JavaScript example below shows how to do the escaping:
'<div class="this">Content here & here!</div>'.replace(/"/g, '\\\\"')
Result is:
"<div class=\\"this\\">Content here & here!</div>"

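For the Django question above, a hedged sketch of another way to look at it: the QueryDict output suggests that form decoding on the server turned %22 back into a plain double quote inside the JSON text. Serialising to JSON first and percent-encoding the whole result once avoids that. The example simulates both orders with the standard library only; parse_qs stands in for the server's form decoding, and the variable names are illustrative.

```python
import json
from urllib.parse import parse_qs, quote

html = '<div class="this">Content here & here!</div>'

# Order used in the question: percent-encode the field, then build JSON.
# Form decoding undoes %22, leaving bare quotes inside a JSON string.
broken_body = "data=" + json.dumps({"html": quote(html, safe="")})
broken_value = parse_qs(broken_body)["data"][0]
try:
    json.loads(broken_value)
    broken_parsed = True
except json.JSONDecodeError:
    broken_parsed = False

# Alternative order: build JSON first, then percent-encode the whole text
# (the JavaScript analogue would be encodeURIComponent(JSON.stringify(data))).
body = "data=" + quote(json.dumps({"html": html}), safe="")
decoded = json.loads(parse_qs(body)["data"][0])

print(broken_parsed, decoded["html"])
```

With the second order the JSON serialiser escapes the quotes itself, so no manual backslash replacement is needed on the client.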
How to convert data into json so that it can be accessed in javascript

I have fetched some data into python and I want to visualize it in google intensity maps. Here is my python code for it - (the relevant part)
query = "SELECT USState, NOFU2008 FROM " + TABLE_ID
data2008 = service.query().sql(sql=query).execute()
print data2008['columns']
query = "SELECT USState, NOFU2009 FROM " + TABLE_ID
data2009 = service.query().sql(sql=query).execute()
# print data2009
query = "SELECT USState, NOFU2010 FROM " + TABLE_ID
data2010 = service.query().sql(sql=query).execute()
variables = {"data2008": data2008}
self.render_response('index.html', json.encode(variables))
I want to show a map on the browser with three buttons 2008, 2009 and 2010 with the intensity changing on the maps according to the button that was clicked. The above code gives me the following error -
Traceback (most recent call last):
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1535, in __call__
rv = self.handle_exception(request, response, e)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1529, in __call__
rv = self.router.dispatch(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1278, in default_dispatcher
return route.handler_adapter(request, response)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 1102, in __call__
return handler.dispatch()
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 572, in dispatch
return self.handle_exception(e, self.app.debug)
File "C:\Program Files (x86)\Google\google_appengine\lib\webapp2-2.5.2\webapp2.py", line 570, in dispatch
return method(*args, **kwargs)
File "C:\Users\Yash\New folder\data\Assignments\jmankoff-byte3\main.py", line 67, in get
self.render_response('index.html', json.encode(variables))
File "C:\Users\Yash\New folder\data\Assignments\jmankoff-byte3\main.py", line 33, in render_response
values.update(context)
ValueError: dictionary update sequence element #0 has length 1; 2 is required
INFO 2014-03-03 00:15:23,483 module.py:612] default: "GET / HTTP/1.1" 500 1858
Note - In the above code I have only passed data2008. I also want the data2009 and data2010.
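The ValueError in the traceback is what dict.update raises when it is handed a string instead of a mapping. This is consistent with json.encode(variables) returning a string that render_response then passes to values.update(context). A minimal sketch under that assumption, using the standard library's json.dumps in place of json.encode; the sample data and the values dict are hypothetical, since the body of render_response is not shown.

```python
import json

# Stand-ins for the three query results; the real structure is not shown in full.
variables = {
    "data2008": {"columns": ["USState", "NOFU2008"]},
    "data2009": {"columns": ["USState", "NOFU2009"]},
    "data2010": {"columns": ["USState", "NOFU2010"]},
}

values = {"page_title": "demo"}  # hypothetical template context

# Passing the JSON *string* reproduces the failure: iterating a string
# yields 1-character items, not (key, value) pairs.
try:
    values.update(json.dumps(variables))
    update_failed = False
except ValueError:
    update_failed = True

# Keeping the context a dict and serialising only the value avoids the error.
values.update({"data_json": json.dumps(variables)})
round_trip = json.loads(values["data_json"])

print(update_failed, sorted(round_trip))
```

The template could then write the data_json value into a script block, where JavaScript can read all three years from one object.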
