Calling a JavaScript function from a Python function in Django - javascript

I have a website built with the Django framework that takes an input CSV file and does some data processing on it. I would like to use an HTML text box as a console log to let users know that the data processing is underway. The data processing is done by a Python function. Is it possible for me to add or change text in the text box at certain intervals from my Python function?
Sorry if I am not specific enough with my question, I am still learning how to use these tools!
Edit - Thanks for all the help, but I am still quite new at this and there is a lot I do not really understand. Here is an example of my Python function, not sure if it helps:
def query_result(request, job_id):
    info_dict = request.session['info_dict']
    machines = lt.trace_machine(inputFile.LOT.tolist())
    return render(request, 'tools/result.html', {'dict': json.dumps(info_dict),
                                                 'job_id': job_id})
Actually my main objective is to let the user know that the data processing has started and that the site is working. I was thinking maybe I could display an output log in an HTML text box to achieve this purpose.

No, you cannot do that directly, because by then you are already on the server side and cannot touch anything in the HTML page.
There are two ways to achieve it:
You can set up an interval function that calls the server, asks for the progress, and updates the page however you want in the callback.
You can open a socket connection between your server and the browser to push updates instantly.

While it is impossible for the server (Django) to directly update the client (browser), you can use JavaScript to make the request, and Django can return a StreamingHttpResponse. As each part of the response is received, you can update the textbox using JavaScript.
Here is a sample with pseudo code:
def process_csv_request(request):
    csv_file = get_csv_file(request)
    return StreamingHttpResponse(process_file(csv_file))

def process_file(csv_file):
    for row in csv_file:
        yield progress
        actual_processing(row)
    return "Done"
Alternatively, you could write the progress to the database or some cache, and repeatedly call an API from the frontend that returns the progress, as in the sketch below.
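A minimal sketch of that cache-based variant, assuming Django's cache framework and a job_id shared between the processing code and the polling view (both names and the actual_processing helper are placeholder assumptions, not the asker's setup):

# Sketch only: progress stored in Django's cache, read by a polling view.
from django.core.cache import cache
from django.http import JsonResponse

def process_rows(rows, job_id):
    total = len(rows)
    for i, row in enumerate(rows, start=1):
        actual_processing(row)  # hypothetical per-row processing function
        # record how far along we are so the frontend can ask for it
        cache.set('progress:%s' % job_id, int(100 * i / total), timeout=3600)

def progress(request, job_id):
    # called repeatedly from the frontend; returns the latest percentage
    return JsonResponse({'percent': cache.get('progress:%s' % job_id, 0)})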

You can achieve this with websockets using Django Channels.
Here's a sample consumer:
class Consumer(WebsocketConsumer):
    def connect(self):
        self.group_name = self.scope['user']
        print(self.group_name)  # use this for debugging, not sure what the scope returns
        # Join group
        async_to_sync(self.channel_layer.group_add)(
            self.group_name,
            self.channel_name
        )
        self.accept()

    def disconnect(self, close_code):
        # Leave group
        async_to_sync(self.channel_layer.group_discard)(
            self.group_name,
            self.channel_name
        )

    def update_html(self, event):
        status = event['status']
        # Send message to WebSocket
        self.send(text_data=json.dumps({
            'status': status
        }))
Running through the Channels 2.0 tutorial you will learn that, by putting some JavaScript on your page, each time the page loads it connects the user to a WebSocket consumer. On connect() the consumer adds the user to a group. This group name is used by your CSV processing function to send a message to the browser of any users connected to that group (in this case just one user) and update the HTML on your page.
def send_update(channel_layer, group_name, message):
    async_to_sync(channel_layer.group_send)(
        group_name,
        {
            'type': 'update_html',
            'status': message
        }
    )

def process_csv(file):
    channel_layer = get_channel_layer()
    group_name = get_user_name()  # function to get same group name as in connect()
    with open(file) as f:
        reader = csv.reader(f)
        send_update(channel_layer, group_name, 'Opened file')
        for row in reader:
            send_update(channel_layer, group_name, 'Processing Row#: %s' % row)
You would include JavaScript on your page as outlined in the Channels documentation, then have an extra onmessage function for updating the HTML:
var webSocket = new ReconnectingWebSocket(...);
webSocket.onmessage = function(e) {
    var data = JSON.parse(e.data);
    $('#htmlToReplace').html(data['status']);
};
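For the consumer above to be reachable, it also has to be registered in your Channels routing. Here is a minimal sketch in the Channels 2.0 style; the app name, module path and URL pattern are assumptions, not part of the original answer:

# routing.py - sketch only; module path and URL pattern are placeholders
from django.urls import re_path
from channels.auth import AuthMiddlewareStack
from channels.routing import ProtocolTypeRouter, URLRouter

from tools.consumers import Consumer  # hypothetical location of the consumer above

application = ProtocolTypeRouter({
    'websocket': AuthMiddlewareStack(
        URLRouter([
            re_path(r'^ws/status/$', Consumer),
        ])
    ),
})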

Related

scrapy + selenium: <a> tag has no href, but content is loaded by javascript

I'm almost there with my first attempt at using Scrapy and Selenium to collect data from a website with JavaScript-loaded content.
Here is my code:
# -*- coding: utf-8 -*-
import scrapy
from selenium import webdriver
from scrapy.selector import Selector
from scrapy.http import Request
from selenium.webdriver.common.by import By
import time

class FreePlayersSpider(scrapy.Spider):
    name = 'free_players'
    allowed_domains = ['www.forge-db.com']
    start_urls = ['https://www.forge-db.com/fr/fr11/players/?server=fr11']
    driver = {}

    def __init__(self):
        self.driver = webdriver.Chrome('/home/alain/Documents/repository/web/foe-python/chromedriver')
        self.driver.get('https://forge-db.com/fr/fr11/players/?server=fr11')

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        #time.sleep(1)
        sel = Selector(text=self.driver.page_source)
        players = sel.xpath('.//table/tbody/tr')
        for player in players:
            joueur = player.xpath('.//td[3]/a/text()').get()
            guilde = player.xpath('.//td[4]/a/text()').get()
            yield {
                'player': joueur,
                'guild': guilde
            }
        next_page_btn = self.driver.find_element_by_xpath('//a[@class="paginate_button next"]')
        if next_page_btn:
            time.sleep(2)
            next_page_btn.click()
            yield scrapy.Request(url=self.start_urls, callback=self.parse)
        # Close the selenium driver, so in fact it closes the testing browser
        self.driver.quit()

    def parse_players(self):
        pass
I want to collect user names and their respective guilds and output them to a CSV file.
For now my issue is how to proceed to the NEXT PAGE and parse the JavaScript-loaded content again.
Even if I'm able to simulate a click on the NEXT tag, I'm not 100% sure the code will work through all the pages, and I'm not able to parse the new content using the same function.
Any idea how I could solve this issue?
Thanks.
Instead of using Selenium, you should try to recreate the request that updates the table. If you look closely at the HTML under Chrome DevTools, you can see that the request is made with parameters and a response is sent back with the data in a nice structured format.
Please see the Scrapy documentation with regard to dynamic content. As it explains, the first thing to think about is: is it necessary to recreate browser activity, or can I get the information I need by reverse engineering the HTTP GET requests? Sometimes the information is hidden within <script></script> tags and you can use some regex or string methods to pull out what you want, as in the sketch below. Rendering the page and then simulating browser activity should be thought of as a last resort.
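As a rough illustration of the <script>-tag case (not needed for this particular site), pulling embedded JSON out of a page could look like this; the window.__DATA__ variable name and the pattern are purely hypothetical:

# Sketch only: extracting JSON embedded in a <script> tag with regex.
import json
import re

def extract_embedded_json(html_text):
    # look for a hypothetical "window.__DATA__ = {...};" assignment
    match = re.search(r'window\.__DATA__\s*=\s*(\{.*?\});', html_text, re.DOTALL)
    if match:
        return json.loads(match.group(1))
    return None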
Before I go into some background on reverse engineering the requests: the website you're trying to get information from only requires reverse engineering the HTTP requests.
Reverse Engineering HTTP requests in Scrapy
In terms of the actual website, we can use Chrome DevTools by right-clicking a page and selecting Inspect. Clicking the Network tab allows you to see all the requests the browser makes to render the page. In this case you want to see what happens when you click Next.
Image1: here
Here you can see all the requests made when you click next on the page. I always look for the biggest sized response as that'll most likely have your data.
Image2: here
Here you can see the request headers/params etc... things you need to make a proper HTTP request. We can see that the referring URL is actually getplayers.php with all the params to get the next page added on. If you scroll down you can see all the same parameters it sends to getplayers.php. Keep this in mind, sometimes we need to send headers, cookies and parameters.
Image3: here
Here is the preview of the data we would get back from the server if we make the correct request, it's a nice neat format which is great for scraping.
Now you could copy the headers, parameters and cookies here into Scrapy, but it's always worth checking this first: if just passing the URL in a plain HTTP request gets you the data you want, then that is the simplest way.
In this case it does, and in fact you get it in a nice neat format with all the data.
Code example
import scrapy

class TestSpider(scrapy.Spider):
    name = 'test'
    allowed_domains = ['forge-db.com']

    def start_requests(self):
        url = 'https://www.forge-db.com/fr/fr11/getPlayers.php?'
        yield scrapy.Request(url=url)

    def parse(self, response):
        for row in response.json()['data']:
            yield {'name': row[2], 'guild': row[3]}
Settings
In settings.py, you need to set ROBOTSTXT_OBEY = False. The site doesn't want you to access this data, so we need to set it to False; be careful, you could end up getting banned from the server.
I would also suggest a couple of other settings, to be respectful and to cache the results, so if you want to play around with this large dataset you don't hammer the server:
CONCURRENT_REQUESTS = 1
DOWNLOAD_DELAY = 3
HTTPCACHE_ENABLED = True
HTTPCACHE_DIR = 'httpcache'
Comments on the code
We make a request to https://www.forge-db.com/fr/fr11/getPlayers.php? and if you were to print the response you would get all the data from the table; it's quite a lot. It looks like it's in JSON format, so we use Scrapy's newer response.json() feature to handle the JSON and convert it into a Python dictionary; be sure you have an up-to-date Scrapy to take advantage of this. Otherwise you could use the json library that Python provides to do the same thing.
You have to look at the preview data a bit here, but the individual rows are within response.json()['data'][i], where i is the row index. The name and guild are within response.json()['data'][i][2] and response.json()['data'][i][3], so we loop over response.json()['data'] and grab the name and guild.
If the data weren't as structured as it is here and needed modifying, I would strongly urge you to use Items or ItemLoaders to create the fields that you then output the data into; see the sketch below. You can modify the extracted data more easily with ItemLoaders, and you can handle duplicate items etc. with a pipeline. These are just some thoughts for the future; I almost never yield a plain dictionary when extracting data, particularly with large datasets.
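A minimal sketch of what that could look like, assuming hypothetical field names name and guild matching the dict keys above:

# Sketch only: an Item plus ItemLoader; field names are assumptions.
import scrapy
from scrapy.loader import ItemLoader
from itemloaders.processors import TakeFirst

class PlayerItem(scrapy.Item):
    name = scrapy.Field()
    guild = scrapy.Field()

class PlayerLoader(ItemLoader):
    default_item_class = PlayerItem
    default_output_processor = TakeFirst()  # collapse single-value lists

# In parse(), instead of yielding a plain dict:
#     loader = PlayerLoader()
#     loader.add_value('name', row[2])
#     loader.add_value('guild', row[3])
#     yield loader.load_item()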

Force flask to wait for Asynchronous JS Function before Reload

I have a submit button for uploading images and information. When the submit button is clicked, the following happens at the same time:
POST request to the Flask app (for updating info).
JS (at the same time as above):
GET request to the Flask app for a presigned POST.
Once the presigned POST request is received, upload the file to S3.
What happens is that the POST request to Flask finishes and tells the page to reload. When this happens the AJAX request gets cancelled (if it's still in progress). Sometimes my code works and other times it doesn't. By adding a time.sleep(3) to the Flask app I can wait for the S3 upload to finish and everything works, but this is not a good solution.
How can I force Flask to wait until the JS function is complete?
I'm trying to save my server by having users upload directly to S3. It should be faster for them.
Waiting for 3 seconds works. Looking at the XHR logs in Chrome shows what is happening.
preventDefault() doesn't work because there are 2 requests happening.
@users.route("/account", methods=['GET', 'POST'])
@login_required
def account():
    form = UpdateAccountForm()
    if form.validate_on_submit():
        if form.picture.data:
            # picture_file = save_picture(form.picture.data)
            # current_user.image_file = picture_file
            time.sleep(3)
        current_user.username = form.username.data
        current_user.email = form.email.data
        db.session.commit()
        flash('Your account has been updated!', 'success')
        return redirect(url_for('users.account'))
    elif request.method == 'GET':
        form.username.data = current_user.username
        form.email.data = current_user.email
    image_file = url_for('static', filename='profile_pics/' + current_user.image_file)
    return render_template('account.html', title='Account',
                           image_file=image_file, form=form)
Ctrl + K is not working so here's a short JS version. Ctrl + K is going to my URL bar in Chrome. :-(
function uploadfile() {
    // Get presigned POST request
    // Upload file to S3
}

document.getElementById("submit").onclick = function() {
    uploadfile();
};
I know why this is working this way but I don't know of a reasonable solution. Do I have to change my design pattern? I'm using flask because I'm weaker on JS.
Just graduated a Bootcamp so I'm pretty new to this.
I could run everything through my app but it would be harder on my server...
I think I could use socket.io but it's another layer of complication....
Thanks for looking!
I changed my code to a different approach. Then I returned a few days later to get help from a friend and my code worked fine. Here is what I used to solve the problem. I don't know why this didn't work for two days...
document.getElementById("accountForm").onsubmit = function (e) {
e.preventDefault();
uploadfile();
document.getElementById('accountForm').submit();
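For reference, the presigned-POST endpoint the question mentions could look roughly like this on the Flask side; this is only a sketch assuming boto3, with placeholder bucket and key names, not the asker's actual code:

# Sketch only: returning an S3 presigned POST from Flask with boto3.
# Bucket name, key prefix and expiry are placeholder assumptions.
import boto3
from flask import Flask, jsonify, request

app = Flask(__name__)
s3 = boto3.client('s3')

@app.route('/presigned-post')
def presigned_post():
    filename = request.args.get('filename', 'upload.bin')
    post = s3.generate_presigned_post(
        Bucket='my-example-bucket',
        Key='uploads/' + filename,
        ExpiresIn=300,  # seconds the browser has to complete the upload
    )
    # the JS upload function POSTs the file to post['url'] with post['fields'] as form data
    return jsonify(post)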

Run command only once when request with .getJSON in Flask

I am trying to make a web interface which displays the serial output from my Arduino using PySerial. I am using Ajax ($.getJSON) to update my HTML string.
The problem I have now is that every time I request my JSON data, it also initialises my ser = serial.Serial('/dev/cu.wchusbserialfa140',9600), which makes the query slow and prevents real-time updates of the serial output.
My code is as follows.
I am trying my best to only execute serial.Serial() once.
@app.before_request
def before_request():
    g.status = False

@app.route('/')
def template():
    return render_template('index.html')

@app.route('/result')
def serial_monitor():
    # connect to serial port for once
    if g.status == False:
        ser = serial.Serial('/dev/cu.wchusbserialfa140',9600)
        g.status = True
        result = str(ser.readline())
        voltage = {'value':result}
    else:
        result = str(ser.readline())
        voltage = {'value':result}
    return jsonify(voltage)
My javascript:
I am using setInterval to repeat it automatically.
$.getJSON($SCRIPT_ROOT + '/result', function(data) {
    $('#voltage').text(data.value);
});
I have been trying to learn to make my little web interface and Stack Overflow has been a great help to me. I have searched and tried hard to solve this problem, but I think it is worth reaching out now.
Thank you all in advance !!
Edited:
I have hacked it a bit to make it do what I want to do for now.
However, I am planning to use a form to get the port value from the user before running the serial.Serial line. I am still looking into the session/global variable route.
global ser
ser = serial.Serial('port',9600)

@app.route('/')
def template():
    return render_template('index.html')

@app.route('/result')
def serial_monitor():
    result = str(ser.readline())
    voltage = {'value':result}
    return jsonify(voltage)
The following was the solution I found.
By setting the global variable status correctly (inside the function), I can now run any piece of code only once.
@app.route('/')
def template():
    return render_template('index.html')

status = False

@app.route('/result')
def serial_monitor():
    global status
    # connect to serial port for once
    if status == False:
        ser = serial.Serial('/dev/cu.wchusbserialfa140',9600)
        status = True
    result = str(ser.readline())
    voltage = {'value':result}
    return jsonify(voltage)
Maybe keep ser as a global variable (although this can be a problem if you use multiple process-based workers) so you don't have to open it every time; just seek or do whatever is required to get into the correct state (I know nothing about serial, so that may or may not make sense). Or maybe voltage could also be global and constantly updated in a background thread, so that the serial_monitor function only has to read the latest value of a variable.
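A rough sketch of that background-thread idea, assuming Flask and PySerial with the port path from the question; this is an illustration of the structure, not a tested implementation:

# Sketch only: one reader thread keeps the latest reading in memory,
# and the view just returns it without touching the serial port.
import threading
import serial
from flask import Flask, jsonify

app = Flask(__name__)
latest = {'value': ''}

def read_serial_forever():
    ser = serial.Serial('/dev/cu.wchusbserialfa140', 9600)
    while True:
        latest['value'] = str(ser.readline())

threading.Thread(target=read_serial_forever, daemon=True).start()

@app.route('/result')
def serial_monitor():
    # no serial I/O here, just return the most recent reading
    return jsonify(latest)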

Celery+Django -- Poll task for state and report success or failure using Django messages framework

In my Django project that uses Celery (among a number of other things), I have a Celery task that will upload a file to a database in the background. I use polling to keep track of the upload progress and display a progress bar for uploading. Here are some snippets that detail the upload process:
views.py:
from .tasks import upload_task
...
upload_task.delay(datapoints, user, description) # datapoints is a list of dictionaries, user and description are simple strings
tasks.py:
from taskman.celery import app, DBTask  # taskman is the name of the Django app that has celery.py
from celery import task, current_task

@task(base=DBTask)
def upload_task(datapoints, user, description):
    from utils.db.databaseinserter import insertIntoDatabase
    for count in insertIntoDatabase(datapoints, user, description):
        percent_completion = int(100 * (float(count) / float(len(datapoints))))
        current_task.update_state(state='PROGRESS', meta={'percent':percent_completion})
databaseinserter.py:
def insertIntoDatabase(datapoints, user, description):
    # iterate through the datapoints and upload them one by one
    # at the end of an iteration, yield the number of datapoints completed so far
The uploading code all works well, and the progress bar also works properly. However, I'm not sure how to send a Django message that tells the user that the upload is complete (or, in the event of an error, send a Django message informing the user of the error). When the upload begins, I do this in views.py:
from django.contrib import messages
...
messages.info(request, "Upload is in progress")
And I want to do something like this when an upload is successful:
messages.info(request, "Upload successful!")
I can't do that in views.py since the Celery task is fire and forget. Is there a way to do this in celery.py? In my DBTask class in celery.py I have on_success and on_failure defined, so would I be able to send Django messages from there?
Also, while my polling technically works, it's not currently ideal. The way the polling works currently is that it will endlessly check for a task regardless of whether one is in progress or not. It quickly floods the server console logs and I can imagine has a negative impact on performance overall. I'm pretty new to writing polling code so I'm not entirely sure of best practices and such as well as how to only poll when I need to. What is the best way to deal with the constant polling and the clogging of the server logs? Below is my code for polling.
views.py:
def poll_state(request):
    data = 'Failure'
    if request.is_ajax():
        if 'task_id' in request.POST.keys() and request.POST['task_id']:
            task_id = request.POST['task_id']
            task = AsyncResult(task_id)
            data = task.result or task.state
            if data == 'SUCCESS' or data == 'FAILURE':  # not sure what to do here; what I want is to exit the function early if the current task is already completed
                return HttpResponse({}, content_type='application/json')
        else:
            data = 'No task_id in the request'
            logger.info('No task_id in the request')
    else:
        data = 'Not an ajax request'
        logger.info('Not an ajax request')
    json_data = json.dumps(data)
    return HttpResponse(json_data, content_type='application/json')
And the corresponding jQuery code:
{% if task_id %}
jQuery(document).ready(function() {
    var PollState = function(task_id) {
        jQuery.ajax({
            url: "poll_state",
            type: "POST",
            data: "task_id=" + task_id,
        }).done(function(task) {
            if (task.percent) {
                jQuery('.bar').css({'width': task.percent + '%'});
                jQuery('.bar').html(task.percent + '%');
            }
            else {
                jQuery('.status').html(task);
            };
            PollState(task_id);
        });
    }
    PollState('{{ task_id }}');
})
{% endif %}
(These last two snippets come largely from previous StackOverflow questions on Django+Celery progress bars.)
The simplest answer to reduce logging and overhead is to put a timeout on your next PollState call. The way your function is written right now, it immediately polls again. Something simple like this:
setTimeout(function () { PollState(task_id); }, 5000);
This will drastically reduce your logging issue and overhead.
Regarding your Django messaging question, you'd need to pull those completed tasks out with some sort of processing. One way to do it is a Notification model (or similar), plus a piece of middleware that fetches unread notifications and injects them into the messages framework.
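A rough sketch of that idea, with a hypothetical Notification model and new-style middleware; the model, its field names, and how the Celery on_success/on_failure hooks would create notifications are assumptions, not part of the original answer:

# Sketch only: unread notifications are drained into the messages framework
# on each request; the Celery task's on_success/on_failure would create them.
from django.contrib import messages
from django.db import models

class Notification(models.Model):
    user = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    text = models.CharField(max_length=255)
    read = models.BooleanField(default=False)

class NotificationMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if request.user.is_authenticated:
            unread = Notification.objects.filter(user=request.user, read=False)
            for note in unread:
                messages.info(request, note.text)
            unread.update(read=True)
        return self.get_response(request)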
Thanks to Josh K for the tip on using setTimeout. Unfortunately I could never figure out the middleware approach, so instead I'm going with a simpler approach of sending an HttpResponse in poll_state like so:
if data == "SUCCESS":
return HttpResponse(json.dumps({"message":"Upload successful!", "state":"SUCCESS"}, content_type='application/json'))
elif data == "FAILURE":
return HttpResponse(json.dumps({"message":"Error in upload", "state":"FAILURE"}, content_type='application/json'))
The intent is to simply render a success or error message based on the JSON received. There are new problems now but those are for a different question.

Rails Live: Dynamically send result from method to browser

I am trying to notify the user's browser of a change in the status of a model. I am trying to use the Live module of Rails for that. Here is what I have got so far:
require 'json'

class Admin::NotificationsController < ActionController::Base
  include ActionController::Live

  def index
    puts "sending message"
    videos = Video.all
    response.headers['Content-Type'] = 'text/event-stream'
    begin
      if(params[:id].present?)
        response.stream.write(sse({id: params[:id]}, {event: "video_encoded"}))
      end
    rescue IOError
    ensure
      response.stream.close
    end
  end

  private

  def sse(object, options = {})
    (options.map{|k,v| "#{k}: #{v}" } << "data: #{JSON.dump object}").join("\n") + "\n\n"
  end
end
The idea behind the above controller is that when its URL gets called with a parameter, it sends this parameter (in this case the id) to the user. Here is how I am trying to call the controller:
notifications_path(id: video.id)
Unfortunately though, the following event-listener in the browser does not fire, even if I use curl to provoke an event:
var source = new EventSource('/notifications');
source.addEventListener("video_encoded", function(event) {
    console.log("message");
    console.log(event);
});
The goal of this is that I want to add a DOM element to a certain page (later on) when there is a change. There may be a better way, but Rails Live seemed like a suitable solution. Any tips or proposals for a different approach are appreciated.
Your use case does not seem like a valid use case for ActionController::Live. You are not sending streaming output to the browser; you do a one-time check on the ID and send the JSON output.
Use a regular controller and make the request with AJAX instead of EventSource.
