We are loading in the web browser the contents of the model "Post", ordered by "helium" (our variable for popularity).
Now we'd like to order the same content by "pop_date" (our variable for due date) when the user clicks a button.
Is it possible to do this instantly, without reloading the page?
The options we've seen so far:
- change the "order_by" and request a new list (a new database operation, and all the media is downloaded again)
- use Python's "sorted" function (memory issues when many posts are loaded)
models.py
class Post(models.Model):
    pub_date = models.DateTimeField('date published', auto_now=True)
    pop_date = models.DateTimeField('Popping time', blank=False)
    helium = models.IntegerField(default=0)
    (...)
views.py (query)
Post.objects.filter(pop_date__gte=timezone.now()).order_by(self.ordering,'-id')
urls.py
# Order by Helium
url(r'^(?P<slug>[-\w]+)/popular$', views.IndexView.as_view(ordering='-helium'), name='index'),
# Order by Popping time
url(r'^(?P<slug>[-\w]+)/timeline$', views.IndexView.as_view(ordering='pop_date'), name='index_now'),
I would definitely go with order_by. You can't use sorted after the user clicks the button anyway; you have to make another request. (You could eventually do the sorting in JS, but that's a whole new story: you would have to fetch all the data up front, which would prevent you from using pagination in the future.)
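If the goal is to avoid re-downloading the media on every switch, one option is to keep order_by on the server but swap only the list itself via AJAX. A minimal sketch, reusing the queryset from the question; the view name and the "post_list_fragment.html" partial are hypothetical:

# A hedged sketch, not the asker's actual code: one endpoint returning the
# re-ordered list as an HTML fragment for a click handler to swap in.
from django.http import JsonResponse
from django.template.loader import render_to_string
from django.utils import timezone

def post_list_fragment(request):
    ordering = request.GET.get('ordering', '-helium')
    if ordering not in ('-helium', 'pop_date'):
        ordering = '-helium'  # whitelist user input before passing it to order_by
    posts = Post.objects.filter(pop_date__gte=timezone.now()).order_by(ordering, '-id')
    html = render_to_string('post_list_fragment.html', {'posts': posts})
    return JsonResponse({'html': html})

The button's click handler would fetch this endpoint and replace the list container's innerHTML, so the rest of the page and its media are never reloaded.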
I am working on a flashcards app using Flask, similar to Anki if you know it. When I click on a deck, I want to go to study_session.html, where the first card of the selected deck is displayed.

Currently I have a home screen that lists the user's decks. Each list item is a button element with an onclick attribute that calls a JavaScript function taking the selected deck's name as a parameter. The function sends a POST request to the Flask route "/study-session" along with the name of the deck. This lets the route's handler fetch the selected deck and pass the appropriate flashcard to the "study_session.html" page to be displayed.

What's wrong is that the JavaScript function, the server-side route handler, or both do not seem to be executing: when I click the deck I want to study, nothing happens. I would like someone to point out my mistake and offer a fix. If I need AJAX, please show me what the JavaScript should look like. Thank you.
Here is the html that has the button with an onclick attribute:
{% for deck in user.decks %}
    <li class="list-group-item">
        <button class="btn btn-link pl-0" onclick="startStudySession({{ deck.deck_name }})">{{ deck.deck_name }}</button>
Here is the javascript:
// Takes the deck name and sends a POST request to the /study-session route.
// Then runs the study-session view
function startStudySession(deckName) {
    // Specify what route to send the request to
    fetch("/study-session", {
        method: "POST",
        body: JSON.stringify({deckName: deckName})
    }).then((_res) => {
        window.location.href = "/study-session";
    });
}
Here is the route method on the Python server:
@views.route('/study-session', methods=['GET', 'POST'])
def study_session():
    """
    Renders the study_session page with one flashcard displayed.
    Handles all requests on the study_session page.
    """
    request_data = json.loads(request.data)
    cur_deck = current_user.get_deck(request_data['deckName'])
    cur_card = cur_deck.notes[0].flashcards[0]
    front = cur_card.front_text
    back = cur_card.back_text
    print("is this working?")
    return jsonify('', render_template("study_session.html", user=current_user, front=front, back=back))
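For what it's worth, one common source of "nothing happens" with this pattern is that the fetch() POST and the subsequent window.location.href navigation are two separate requests, so the GET that actually renders the page arrives without a JSON body and json.loads(request.data) fails. A minimal sketch of one way this flow is often wired up, assuming Flask-Login's current_user and the get_deck helper from the question (the session key name is hypothetical):

import json
from flask import request, session, render_template
from flask_login import current_user

@views.route('/study-session', methods=['GET', 'POST'])
def study_session():
    if request.method == 'POST':
        # Stash the selection server-side; the client navigates afterwards.
        session['deck_name'] = json.loads(request.data)['deckName']
        return ('', 204)
    # GET: render the page using the deck stored by the earlier POST.
    cur_deck = current_user.get_deck(session.get('deck_name'))
    cur_card = cur_deck.notes[0].flashcards[0]
    return render_template('study_session.html', user=current_user,
                           front=cur_card.front_text, back=cur_card.back_text)

Separately, the onclick in the template needs quotes around the Jinja expression, onclick="startStudySession('{{ deck.deck_name }}')", otherwise the browser treats the deck name as an undefined JavaScript identifier and the function never runs.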
I'm almost there with my first attempt at using Scrapy and Selenium to collect data from a website with JavaScript-loaded content.
Here is my code:
# -*- coding: utf-8 -*-
import scrapy
from selenium import webdriver
from scrapy.selector import Selector
from scrapy.http import Request
from selenium.webdriver.common.by import By
import time

class FreePlayersSpider(scrapy.Spider):
    name = 'free_players'
    allowed_domains = ['www.forge-db.com']
    start_urls = ['https://www.forge-db.com/fr/fr11/players/?server=fr11']
    driver = {}

    def __init__(self):
        self.driver = webdriver.Chrome('/home/alain/Documents/repository/web/foe-python/chromedriver')
        self.driver.get('https://forge-db.com/fr/fr11/players/?server=fr11')

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        #time.sleep(1)
        sel = Selector(text=self.driver.page_source)
        players = sel.xpath('.//table/tbody/tr')
        for player in players:
            joueur = player.xpath('.//td[3]/a/text()').get()
            guilde = player.xpath('.//td[4]/a/text()').get()
            yield {
                'player': joueur,
                'guild': guilde
            }
        next_page_btn = self.driver.find_element_by_xpath('//a[@class="paginate_button next"]')
        if next_page_btn:
            time.sleep(2)
            next_page_btn.click()
            yield scrapy.Request(url=self.start_urls, callback=self.parse)
        # Close the selenium driver, so in fact it closes the testing browser
        self.driver.quit()

    def parse_players(self):
        pass
I want to collect user names and their guilds and output them to a CSV file.
For now my issue is how to proceed to the NEXT page and parse the JavaScript-loaded content again.
Even if I can simulate a click on the NEXT button, I'm not 100% sure the code will walk through all the pages, and I'm not able to parse the new content using the same function.
Any idea how I could solve this issue?
Thanks.
Instead of using Selenium, you should try to recreate the request that updates the table. If you look closely at the HTML under Chrome's developer tools, you can see that the request is made with parameters and a response is sent back with the data in a nicely structured format.
Please see here with regards to dynamic content in Scrapy. As it explains, the first thing to consider is: is it necessary to recreate browser activity, or can I get the information I need by reverse engineering the HTTP GET requests? Sometimes the information is hidden within <script></script> tags and you can use some regex or string methods to extract what you want. Rendering the page and then automating browser activity should be thought of as a last resort.
Now, before I go into some background on reverse engineering the requests: the website you're trying to get information from only requires reverse engineering the HTTP requests.
Reverse Engineering HTTP requests in Scrapy
Now, in terms of the website itself, we can use Chrome devtools by right-clicking a page and choosing Inspect. Clicking the Network tab allows you to see all the requests the browser makes to render the page. In this case you want to see what happens when you click next.
Image1: here
Here you can see all the requests made when you click next on the page. I always look for the biggest-sized response, as that will most likely have your data.
Image2: here
Here you can see the request headers/params etc... the things you need to make a proper HTTP request. We can see that the referring URL is actually getplayers.php, with all the params for getting the next page added on. If you scroll down you can see all the parameters it sends to getplayers.php. Keep this in mind: sometimes we need to send headers, cookies and parameters.
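For reference, if the headers, cookies or parameters had turned out to be required, they can be attached directly to a Scrapy request; the values below are placeholders, not the site's actual requirements:

yield scrapy.Request(
    url='https://www.forge-db.com/fr/fr11/getPlayers.php?',
    headers={'Referer': 'https://www.forge-db.com/fr/fr11/players/'},  # placeholder
    cookies={'session': 'placeholder'},  # placeholder
    callback=self.parse,
)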
Image3: here
Here is the preview of the data we would get back from the server if we make the correct request; it's in a nice, neat format, which is great for scraping.
Now, you could copy the headers, parameters and cookies here into Scrapy, but it's always worth checking first whether simply making an HTTP request to the URL alone gets you the data you want; if it does, that is the simplest way.
In this case it does, and in fact you get all the data back in a nice, neat format.
Code example
import scrapy

class TestSpider(scrapy.Spider):
    name = 'test'
    allowed_domains = ['forge-db.com']

    def start_requests(self):
        url = 'https://www.forge-db.com/fr/fr11/getPlayers.php?'
        yield scrapy.Request(url=url)

    def parse(self, response):
        for row in response.json()['data']:
            yield {'name': row[2], 'guild': row[3]}
Settings
In settings.py you need to set ROBOTSTXT_OBEY = False. The site doesn't want you to access this data, so we need to set it to False. Be careful: you could end up getting banned from the server.
I would also suggest a couple of other settings, to be respectful and to cache the results, so that if you want to play around with this large dataset you don't hammer the server:
CONCURRENT_REQUESTS = 1
DOWNLOAD_DELAY = 3
HTTPCACHE_ENABLED = True
HTTPCACHE_DIR = 'httpcache'
Comments on the code
We make a request to https://www.forge-db.com/fr/fr11/getPlayers.php?, and if you were to print the response you would get all the data from the table; it's quite a lot... It looks like it's in JSON format, so we use Scrapy's new feature for handling JSON: response.json() converts it into a Python dictionary. Be sure you have an up-to-date Scrapy to take advantage of this; otherwise you could use the json library that Python provides to do the same thing.
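For instance, if your Scrapy predates response.json(), a drop-in equivalent of the parse method using the standard library would be:

import json

def parse(self, response):
    data = json.loads(response.text)  # the same dict that response.json() returns
    for row in data['data']:
        yield {'name': row[2], 'guild': row[3]}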
Now, you have to look at the preview data a bit here, but the individual rows are within response.json()['data'][i], where i is the row index. The name and guild are within response.json()['data'][i][2] and response.json()['data'][i][3], so we loop over every row in response.json()['data'] and grab the name and guild.
If the data weren't as structured as it is here and needed modifying, I would strongly urge you to use Items or ItemLoaders for creating the fields that you then output. You can modify the extracted data more easily with ItemLoaders, and you can deal with duplicate items etc. using a pipeline. These are just some thoughts for the future; I almost never yield a plain dictionary when extracting data, particularly with large datasets.
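As an illustration, a minimal sketch of an Item/ItemLoader version of the spider above (assuming a recent Scrapy where the itemloaders package is available; PlayerItem and PlayerLoader are hypothetical names):

import scrapy
from scrapy.loader import ItemLoader
from itemloaders.processors import TakeFirst

class PlayerItem(scrapy.Item):
    name = scrapy.Field()
    guild = scrapy.Field()

class PlayerLoader(ItemLoader):
    # ItemLoaders collect values into lists; TakeFirst unwraps single values.
    default_output_processor = TakeFirst()

class PlayersSpider(scrapy.Spider):
    name = 'players_items'
    allowed_domains = ['forge-db.com']

    def start_requests(self):
        yield scrapy.Request(url='https://www.forge-db.com/fr/fr11/getPlayers.php?')

    def parse(self, response):
        for row in response.json()['data']:
            loader = PlayerLoader(item=PlayerItem())
            loader.add_value('name', row[2])
            loader.add_value('guild', row[3])
            yield loader.load_item()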
I'm attempting to scrape data on food seasonality from the Seasonal Food Guide but hitting a snag. The site has a fairly simple URL structure:
https://www.seasonalfoodguide.org/produce_name/state_name
I've been able to use Selenium and Beautiful Soup to successfully scrape the seasonality information from one page, but on subsequent loops the section of text I'm looking for doesn't actually load, so I get AttributeError: 'NoneType' object has no attribute 'text'. I know it's because months_list_raw is coming back empty, due to the fact that the 'wheel-months-list' portion of the page isn't loading on the second loop. Code is below. Any ideas?
for ingredient in produce_list:
    for state in state_list:
        # grab page content
        search_url = 'https://www.seasonalfoodguide.org/{}/{}'.format(ingredient, state)
        driver.get(search_url)
        page_soup = soup(driver.page_source, 'lxml')
        # grab list of months
        months_list_raw = page_soup.find('p', {'id': 'wheel-months-list'})
        months_list = months_list_raw.text
The page is being rendered on the client side, which means that when you open the page, another request is made to a backend server to fetch the data based on your selected filters. So the issue is that when you open the page and read the HTML, the content is not fully loaded yet. The simplest thing you could do is sleep for some time after opening the page with Selenium, to wait for it to fully load. I've tested your code by throwing in time.sleep(3) after the driver.get(search_url) and it worked fine.
To prevent the error from occurring and to continue with your loop, you need to check that the months_list_raw element is not None. It seems like some of the produce pages do not have any data for some states, so you will need to decide how your program should handle that.
for ingredient in produce_list:
    for state in state_list:
        # grab page content
        search_url = 'https://www.seasonalfoodguide.org/{}/{}'.format(ingredient, state)
        driver.get(search_url)
        page_soup = soup(driver.page_source, 'lxml')
        # grab list of months
        months_list_raw = page_soup.find('p', {'id': 'wheel-months-list'})
        if months_list_raw is not None:
            months_list = months_list_raw.text
        else:
            # Handle case where ingredient/state data doesn't exist
            continue
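As a slightly more robust alternative to a fixed sleep, Selenium's explicit waits block until the element actually appears; a minimal sketch under the same assumptions as the loop above (driver, soup and search_url as already defined):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

driver.get(search_url)
try:
    # Wait up to 10 seconds for the client-side render of the months list.
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, 'wheel-months-list')))
    page_soup = soup(driver.page_source, 'lxml')
    months_list = page_soup.find('p', {'id': 'wheel-months-list'}).text
except TimeoutException:
    # No data rendered for this ingredient/state combination; skip it.
    pass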
Is it possible to restrict a user from opening a page by typing its URL by hand?
Let's say I have two pages, somepage.com/home and somepage.com/other, and somewhere on the home page there is a button that redirects the user to /other. I want to make sure the user won't be able to access /other by writing its URL by hand; instead it should redirect back to the home page.
Is there maybe some decorator like login_required that I can use? Or maybe I should use some JS function?
Thanks in advance for any tips, cheers.
Try using the referrer property of the document: if the user goes to the other page by clicking the link on your home page, the referrer will point to your home page, but if the user has manually typed the other page's URL, the referrer will not point to your home page.
Return Value: A String, representing the URL of the document that loaded the current document. Returns the entire URL, including the protocol (like http://). If the current document was not opened through a link (for example, through a bookmark), an empty string is returned.
Reference: https://www.w3schools.com/jsref/prop_doc_referrer.asp
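The same check can be done server-side in Django via the Referer header (spelled HTTP_REFERER in request.META); a minimal sketch, in which the view and the 'home' URL name are hypothetical:

from django.shortcuts import redirect, render

def other_view(request):
    referer = request.META.get('HTTP_REFERER', '')
    # An empty or foreign referrer means the URL was typed by hand
    # (or the browser stripped the header): bounce back to the home page.
    if '/home' not in referer:
        return redirect('home')
    return render(request, 'other.html')

Keep in mind the Referer header is supplied by the client and can be spoofed or stripped, so treat this as a convenience measure, not security.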
Other, not so efficient, solutions:
1. Do you need login functionality? If so, simply implement it and mark the other view as login_required.
If you don't need that, read point #2.
2. You can POST some random unique value from the home page button when going to the other page, and check for that value in the other view. If the value is not found, redirect to the home page; if it is found, display the other view (see the sketch below).
If the user accesses the page by entering the URL, that value will be missing, so the redirect to the home page will occur. One thing to note: this solution is not secure; users can trick the server if you don't implement it carefully enough.
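A minimal sketch of point #2 using Django's signing framework, so the value can't be trivially forged; the view names, template names and the token field are hypothetical:

from django.core import signing
from django.shortcuts import redirect, render

def home(request):
    # A signed, timestamped token to embed in the home page's button form.
    token = signing.dumps('came-from-home')
    return render(request, 'home.html', {'token': token})

def other(request):
    try:
        # Rejects missing, tampered-with, or older-than-60-seconds tokens.
        signing.loads(request.POST.get('token', ''), max_age=60)
    except signing.BadSignature:
        return redirect('home')
    return render(request, 'other.html')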
You can build a middleware and use it.
First, you need to give each user a group or access level; I did this with a user_type field.
Now create an acl file containing named packs of URL names.
acl.py
acl_view_segment_divided = {
"pack_1": [
"url_name_1",
"url_name_2",
.
.
.
],
"pack_2": [
"url_name_3",
"url_name_4",
.
.
.
],
}
acl_view_segment = dict(
user_type_1=list(
acl_view_segment_divided["pack_1"] +
acl_view_segment_divided["pack_2"]
),
user_type_2=list(
acl_view_segment_divided["pack_1"]
),
)
Now you need to create the middleware file:
middleware.py
from django.conf import settings
from django.utils.deprecation import MiddlewareMixin
from django.core.urlresolvers import resolve  # on Django 2+: from django.urls import resolve
from acl import acl_view_segment
from django.shortcuts import render
from django.contrib import auth
from apps.main.views import page_permission_denied_view

class ACLMiddleware(MiddlewareMixin):
    @staticmethod
    def process_view(request, view_func, view_args, view_kwargs):
        if not request.user.is_authenticated():  # a property (no parentheses) on Django 1.10+
            return
        current_url = resolve(request.path_info).url_name
        if current_url in getattr(settings, 'ACL_EXEMPT_VIEWS', set()):
            return
        user_type = request.user.user_type
        acl = acl_view_segment[user_type]
        if current_url not in acl:
            return page_permission_denied_view(request)
            # or, if you prefer to redirect:
            # return redirect(home_page)
Then add the middleware to settings.py:
settings.py
MIDDLEWARE = [
...
'yourproject.middleware.ACLMiddleware',
]
If you have questions, ask in the comments. I hope your problem gets resolved.
Hello people,
I'm trying to figure this out, but I still can't do it.
I have a Rails 3 app where I'm working with invoices and payments. In the form for payments I have a collection_select where I display all the invoice numbers (extracted from a Postgres database), and what I'm trying to do is: when I select an invoice, auto-populate other text fields (provider, address, etc.) in the same form, without reloading the page.
I know I should use AJAX, JS, jQuery, but I'm a beginner in these languages, so I don't know how or where to start.
Hope you can help me... thanks.
What you are going to want to do is route an AJAX call to a controller, which will respond with JSON containing the information. You will then use jQuery to populate the different fields.
In your routes:
get "invoice/:id/get_json", :controller=>"invoice", :action=>"get_json"
In your invoice_controller:
def get_json
  invoice = Invoice.find(params[:id]) # :id comes from the route above
  render :text => invoice.to_json
end
In your invoice model (if the default to_json method is not sufficent):
def to_json
  json = "{"
  json += "id:'#{self.id}'"
  json += ",date_created:'#{self.date}'"
  ... //add other data you want to have here later
  json += "}"
end
In your javascript file,
$("#invoice_selecter").change(function(){ // calls this function when the selected value changes
    $.get("/invoice/"+$(this).val()+"/get_json", function(data, status, xhr){ // does an AJAX call to the invoice route we set up above
        data = eval('(' + data + ')'); // turn the response text into a javascript object; the parentheses make eval treat it as an expression
        $("#field_1").val(data.date_created); // sets the value of the fields to the data returned
        ...
    });
});
You are probably going to run into a few issues. I would highly recommend downloading and installing Firebug if you are not on Google Chrome; if you are, make sure you are using the developer tools. You can open them by right-clicking and hitting Inspect Element. Through this you should be able to monitor the AJAX request and whether or not it succeeded.
You are probably going to run into a few issues, i would highly recommend downloading and installing fire bug if you are not on google chrome.. and if you are, make sure you are using the development tools. I believe you can open them up by right clicking and hitting inspect element. Through this, you should be able to monitor the ajax request, and whether or not it succeeded and things.