window.location is not working - JavaScript

I'm using jQuery with Django on the server side. What I'm trying to do is get some text from the user through a form and simultaneously display it in a canvas area, the way about.me and flavors.me do. The user then drags the text to the desired position in the canvas, and when they click the next button the data must be stored in the database and the user redirected to the homepage. Everything works (the data is stored in the database) except the redirect: in the button's click handler I set window.location to "http://127.0.0.1:8000", but I never get to that page when I click the button.
I'm getting this error from the Django development server:
error: [Errno 32] Broken pipe
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 51161)
Traceback (most recent call last):
File "/usr/lib/python2.7/SocketServer.py", line 284, in _handle_request_noblock
Here is my HTML:
https://gist.github.com/2359541
Django views.py:
from cover.models import CoverModel
from django.http import HttpResponseRedirect

def coverview(request):
    if request.is_ajax():
        t = request.POST.get('top')
        l = request.POST.get('left')
        n = request.POST.get('name')
        h = request.POST.get('headline')
        try:
            g = CoverModel.objects.get(user=request.user)
        except CoverModel.DoesNotExist:
            co = CoverModel(top=t, left=l, name=n, headline=h)
            co.user = request.user
            co.save()
        else:
            g.top = t
            g.left = l
            g.name = n
            g.headline = h
            g.save()
    return HttpResponseRedirect("/")
urls.py:
url(r'^cover/check/$', 'cover.views.coverview'),
url(r'^cover/$', login_required(direct_to_template), {'template': 'cover.html'}),
Could anyone help me?
Thanks!

There's really not enough information in your question to properly diagnose this, but you can try this:
It's always a bad idea to hard-code a domain name in your JS. What happens when you take this to production, for example? If you want to send the user to the homepage (presumed from the location being set to http://127.0.0.1:8000/), then set the location simply to /. That will ensure that it will always go to the site root regardless of the IP address, domain name or port.

Part of the problem is that you're trying to post data and then immediately leaving the page by setting window.location; navigating away can abort the in-flight request, which is what produces the broken-pipe error on the server. You should only change window.location once you get the response back from the $.post():
$.post("check/", { top: t, left: l, name: n, headline: h}, function(data) {
window.location.href = "/";
});
Notice also that I removed the hardcoded URL. Use a relative one here, like Chris said.
If it still isn't working, check for JavaScript errors in the lines above. Use Firebug, Chrome Dev Tools, Opera Dragonfly, something. Check that your POST is actually going through, and post more details about that back here.
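Server-side, note that the HttpResponseRedirect("/") returned to an AJAX call never navigates the browser either; the XHR follows the redirect internally and just hands the final body to your callback. A minimal sketch of the view under this approach (the save logic is assumed to stay exactly as in the question's views.py):
from django.http import HttpResponse

def coverview(request):
    if request.is_ajax():
        # ... create or update the CoverModel exactly as in the question ...
        return HttpResponse("OK")    # the $.post callback then sets window.location to "/"
    return HttpResponse(status=400)  # non-AJAX requests are not expected here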

Related

scrapy + selenium: <a> tag has no href, but content is loaded by javascript

I'm almost there with my first attempt at using scrapy and selenium to collect data from a website with JavaScript-loaded content.
Here is my code:
# -*- coding: utf-8 -*-
import scrapy
from selenium import webdriver
from scrapy.selector import Selector
from scrapy.http import Request
from selenium.webdriver.common.by import By
import time

class FreePlayersSpider(scrapy.Spider):
    name = 'free_players'
    allowed_domains = ['www.forge-db.com']
    start_urls = ['https://www.forge-db.com/fr/fr11/players/?server=fr11']
    driver = {}

    def __init__(self):
        self.driver = webdriver.Chrome('/home/alain/Documents/repository/web/foe-python/chromedriver')
        self.driver.get('https://forge-db.com/fr/fr11/players/?server=fr11')

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        #time.sleep(1)
        sel = Selector(text=self.driver.page_source)
        players = sel.xpath('.//table/tbody/tr')
        for player in players:
            joueur = player.xpath('.//td[3]/a/text()').get()
            guilde = player.xpath('.//td[4]/a/text()').get()
            yield {
                'player': joueur,
                'guild': guilde
            }
        next_page_btn = self.driver.find_element_by_xpath('//a[@class="paginate_button next"]')
        if next_page_btn:
            time.sleep(2)
            next_page_btn.click()
            yield scrapy.Request(url=self.start_urls, callback=self.parse)
        # Close the selenium driver, so in fact it closes the testing browser
        self.driver.quit()

    def parse_players(self):
        pass
I want to collect the user names and their guilds and output them to a CSV file.
For now my issue is proceeding to the NEXT PAGE and parsing the JavaScript-loaded content again.
Even if I can simulate a click on the NEXT tag, I'm not 100% sure the code will go through all the pages, and I'm not able to parse the new content using the same function.
Any idea how I could solve this issue?
Thanks.
Instead of using selenium, you should try to recreate the request that updates the table. If you look closely under Chrome DevTools, you can see that the request is made with parameters and a response is sent back with the data in a nice structured format.
Please see here with regard to dynamic content in scrapy. As it explains, the first thing to ask is: is it necessary to recreate browser activity, or can I get the information I need by reverse engineering the HTTP requests? Sometimes the information is hidden in <script></script> tags and you can use regex or string methods to get what you want. Rendering the page and recreating browser activity should be thought of as a last resort.
Before going into the background on reverse engineering the requests: the website you're trying to get information from requires only reverse engineering the HTTP requests.
Reverse Engineering HTTP requests in Scrapy
In terms of the website itself, we can use Chrome DevTools by right-clicking the page and choosing Inspect. The Network tab lets you see all the requests the browser makes to render the page. In this case you want to see what happens when you click next.
Image1: here
Here you can see all the requests made when you click next on the page. I always look for the biggest sized response as that'll most likely have your data.
Image2: here
Here you can see the request headers/params etc., the things you need to make a proper HTTP request. We can see that the request URL is actually getplayers.php, with all the params for the next page added on. If you scroll down you can see all the parameters it sends to getplayers.php. Keep this in mind; sometimes we need to send headers, cookies and parameters.
Image3: here
Here is the preview of the data we would get back from the server if we make the correct request, it's a nice neat format which is great for scraping.
Now you could copy the headers, parameters and cookies into scrapy here, but it's always worth checking first whether just making an HTTP request to the URL gets you the data you want; if it does, that is the simplest way.
In this case it does, and in fact you get all the data back in a nice neat format.
Code example
import scrapy

class TestSpider(scrapy.Spider):
    name = 'test'
    allowed_domains = ['forge-db.com']

    def start_requests(self):
        url = 'https://www.forge-db.com/fr/fr11/getPlayers.php?'
        yield scrapy.Request(url=url)

    def parse(self, response):
        for row in response.json()['data']:
            yield {'name': row[2], 'guild': row[3]}
Settings
In settings.py you need to set ROBOTSTXT_OBEY = False. The site doesn't want you to access this data, so we need to set it to False. Be careful: you could end up getting banned from the server.
I would also suggest a couple of other settings, to be respectful and to cache the results, so that if you want to play around with this large dataset you don't hammer the server.
CONCURRENT_REQUESTS = 1
DOWNLOAD_DELAY = 3
HTTPCACHE_ENABLED = True
HTTPCACHE_DIR = 'httpcache'
Comments on the code
We make a request to https://www.forge-db.com/fr/fr11/getPlayers.php?, and if you were to print the response you would get all the data from the table; it's quite a lot. It looks like it's in JSON format, so we use scrapy's new response.json() feature to convert it into a Python dictionary; be sure you have an up-to-date scrapy (2.2+) to take advantage of this. Otherwise you could use the json library that Python provides to do the same thing.
You have to look at the preview data a bit here, but the individual rows are within response.json()['data'][i], where i is the row index. The name and guild are within response.json()['data'][i][2] and response.json()['data'][i][3], so we loop over every row in response.json()['data'] and grab the name and guild.
If the data weren't as structured as it is here and needed modifying, I would strongly urge you to use Items or ItemLoaders to create the fields that you then output. You can modify the extracted data more easily with ItemLoaders, and you can deal with duplicate items etc. using a pipeline. These are just some thoughts for the future; I almost never yield a plain dictionary when extracting data, particularly for large datasets.
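A minimal sketch of what that could look like for this data, assuming an up-to-date scrapy with the itemloaders package (the field names are illustrative):
import scrapy
from scrapy.loader import ItemLoader
from itemloaders.processors import MapCompose, TakeFirst

class PlayerItem(scrapy.Item):
    # TakeFirst() collapses the single-element list the loader collects;
    # MapCompose(str.strip) tidies each value on the way in
    name = scrapy.Field(input_processor=MapCompose(str.strip), output_processor=TakeFirst())
    guild = scrapy.Field(input_processor=MapCompose(str.strip), output_processor=TakeFirst())

# inside the spider:
def parse(self, response):
    for row in response.json()['data']:
        loader = ItemLoader(item=PlayerItem())
        loader.add_value('name', row[2])
        loader.add_value('guild', row[3])
        yield loader.load_item()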

I am unable to set data back into a CKEditor instance which I fetched from it and stored in the database

I have a Django web app which has various CKEditor instances on a web page. Using the blur event, I am saving the whole of each instance's data to the database:
{% for editor in editors %}
CKEDITOR.appendTo("{{editor.ck_id}}", {
    on: {
        blur: function(event) {
            var data = event.editor.getData();
            console.log("data of {{editor.ck_id}} is " + data);
            var request = $.ajax({
                url: "/editor/save/",
                type: "GET",
                data: {
                    content: data,
                    content_id: "{{editor.ck_id}}"
                },
                dataType: "html"
            });
        }
    }
}
// "{{editor.data}}"
);
CKEDITOR.instances['{{editor.ck_id}}'].insertHtml("{{editor.data}}");
Here ck_id and data are the two database fields for the editors. Now suppose I write this in one instance of CKEditor:
Tony Stark
When that instance loses focus, the blur event is fired and I use getData() to get the HTML data of the instance.
<p>Tony Stark</p>
is saved into the database. In the Python interpreter, when I fetch the editor's data it now shows:
<p>Tony Stark</p>
which is obvious.
Now when I restart the server and set the data of every CKEditor instance again, this exception is raised:
Uncaught SyntaxError: Unexpected token ILLEGAL
I know why this is happening; it's due to this:
CKEDITOR.instances['ck_1'].insertHtml("<p>Tony Stark</p>
");
The data which I sent, fetched from the database, was:
<p>Tony Stark</p>
and it somehow got converted to the above-mentioned text with illegal tokens. I have tried to use setData(), but with no result. Do I need to encode/decode this HTML or something?
Now my question is how to set data back into a CKEditor instance after fetching it from the database. I have posted this question on several forums, but many people have no clue what I am asking.
Is there anyone here who has tried to do the same thing and succeeded?
Thanks
PS: Adrian Ghiuta's solution seems to be working, but there is one problem. When the editor is loaded the first time, the inserted line is seen in Google Chrome's debugger as:
"<p>Tony Stark</p>"
which is rendered by the line "{{editor.data|safe}}". But when I change my editor's content to just
Tony
then in the database
<p>Tony</p>
is saved, but when I restart the server it does not render and throws this error:
Unexpected ILLEGAL TOKEN
due to this -
"<p>Tony</p>
"
Here the closing double quote is on the next line, but during the initial data load it was on the same line. Might this be causing the problem? My Chrome console shows the error at exactly that position.
(Screenshots were attached here showing the condition: the initial loading, the initial console, the editor after editing, the error shown after restarting the server and reloading the editors, and the console output.)
You can see how the error is thrown on the line where the double quote falls onto the next line. Do I have to escape the HTML or something?
Sorry for my naivety but I do not have much command over HTML.
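Edit: from the console output it looks like the raw newline inside the string literal is what breaks it, so perhaps the data needs to go through Django's escapejs filter (instead of safe), so quotes and newlines are escaped before landing inside the JavaScript string, something like:
CKEDITOR.instances['{{editor.ck_id}}'].setData("{{ editor.data|escapejs }}");
escapejs escapes quotes and newlines for use in JavaScript strings, but I haven't confirmed whether this fixes my case.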

POST to .jsp website with Python requests library

I'm using Python 3.3 and Requests 2.2.1.
I'm trying to POST to a website ending in .jsp, which then changes to a .doh ending. Using the same basic requests code outline I'm able to successfully log in and scrape other websites, but the JavaScript part of this site is not working. This is my code:
import requests
url = 'https://prodpci.etimspayments.com/pbw/include/sanfrancisco/input.jsp'
payload = {'plateNumber':'notshown', 'statePlate':'CA'} #tried CA and California
s = requests.Session() #Tried 'session' and 'Session' following different advice
post = s.post(url, data=payload)
r = s.get('https://prodpci.etimspayments.com/pbw/include/sanfrancisco/input.jsp')
print(r.text)
Finally, when I manually enter data into the web page through the Firefox browser, the page changes and the URL becomes https://prodpci.etimspayments.com/pbw/inputAction.doh, which only has content if you are redirected there after typing in a license plate.
From the printed text, I know I'm getting the content of the page as it would be without POSTing anything, but I need the content of the page once I've POSTed the payload.
For the POST payload, do I need to include something like 'submit':'submit' to simulate clicking the search button?
Am I doing the GET request on the right URL, considering the URL I POST to?
You're making a POST request and after that another GET request, which is why you get the same page with the form back. Print the response of the POST itself instead:
response = s.post(url, data=payload)
print(response.text)
Also, if you check the form markup, you'll find its action is /pbw/inputAction.doh, and that the form additionally sends a few parameters from hidden inputs. Therefore you should use that URL in your request, and probably the values from the hidden inputs as well.
With the following code I'm able to retrieve the same response as a regular request made via the browser:
import requests
url = 'https://prodpci.etimspayments.com/pbw/inputAction.doh'
payload = {
    'plateNumber': 'notshown',
    'statePlate': 'CA',
    'requestType': 'submit',
    'clientcode': 19,
    'requestCount': 1,
    'clientAccount': 5,
}
s = requests.Session()
response = s.post(url, data=payload)
print(response.text)
You can see the same thing in the browser after making the same request via the form:
...
<td colspan="2"> <li class="error">Plate is not found</li></td>
...
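As a side note, if the hidden-input values (clientcode, requestCount, clientAccount) ever change, they could be scraped from the form page rather than hard-coded; a sketch, assuming BeautifulSoup is installed:
import requests
from bs4 import BeautifulSoup

s = requests.Session()
form_url = 'https://prodpci.etimspayments.com/pbw/include/sanfrancisco/input.jsp'
soup = BeautifulSoup(s.get(form_url).text, 'html.parser')

# collect every named hidden input from the form page
payload = {inp['name']: inp.get('value', '')
           for inp in soup.select('input[type=hidden]') if inp.get('name')}
# then add the visible fields from the answer above
payload.update({'plateNumber': 'notshown', 'statePlate': 'CA', 'requestType': 'submit'})

response = s.post('https://prodpci.etimspayments.com/pbw/inputAction.doh', data=payload)
print(response.text)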

Rails Live: Dynamically send result from method to browser

I am trying to notify the user's browser of a change in the status of a model, using the Live module of Rails. Here is what I have got so far:
require 'json'

class Admin::NotificationsController < ActionController::Base
  include ActionController::Live

  def index
    puts "sending message"
    videos = Video.all
    response.headers['Content-Type'] = 'text/event-stream'
    begin
      if params[:id].present?
        response.stream.write(sse({id: params[:id]}, {event: "video_encoded"}))
      end
    rescue IOError
    ensure
      response.stream.close
    end
  end

  private

  def sse(object, options = {})
    (options.map { |k, v| "#{k}: #{v}" } << "data: #{JSON.dump object}").join("\n") + "\n\n"
  end
end
The idea behind the above controller is that when its URL gets called with a parameter, it sends this parameter (in this case the id) to the user. Here is how I am trying to call the controller:
notifications_path(id: video.id)
Unfortunately, though, the following event listener in the browser does not fire, even when I use curl to provoke an event:
var source = new EventSource('/notifications');
source.addEventListener("video_encoded", function(event) {
    console.log("message");
    console.log(event);
});
The goal is that I want to add a DOM element to a certain page (later on) when there is a change. There may be a better way, but Rails Live seemed like a suitable solution. Any tips or proposals for a different approach are appreciated.
Your use case does not seem like a valid use case for ActionController::Live. You are not sending streaming output to the browser; you do a one-time check on the ID and send the JSON output.
Use a regular controller and make the request via AJAX instead of an EventSource.
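Not part of the original answer, but a minimal sketch of that alternative, assuming the same route (the exact payload is an assumption):
class Admin::NotificationsController < ActionController::Base
  def index
    # one-shot JSON response; the page requests this via plain AJAX
    # instead of holding an EventSource connection open
    render json: { id: params[:id] }
  end
end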

Troubles using scrapy with javascript __doPostBack method

Trying to automatically grab the search results from a public search, but running into some trouble. The URL is of the form
http://www.website.com/search.aspx?keyword=#&&page=1&sort=Sorting
As I click through the pages, after visiting this page, it changes slightly to
http://www.website.com/search.aspx?keyword=#&&sort=Sorting&page=2
The problem is that if I try to visit the second link directly, without first visiting the first link, I am redirected to the first link. My current attempt is to define a long list of start_urls in scrapy.
class websiteSpider(BaseSpider):
    name = "website"
    allowed_domains = ["website.com"]
    baseUrl = "http://www.website.com/search.aspx?keyword=#&&sort=Sorting&page="
    start_urls = [(baseUrl + str(i)) for i in range(1, 1000)]
Currently this code simply ends up visiting the first page over and over again. I feel like this is probably straightforward, but I don't quite know how to get around this.
UPDATE:
Made some progress investigating this and found that the site updates each page by sending a POST request to the previous page using __doPostBack(arg1, arg2). My question now is: how exactly do I mimic this POST request using scrapy? I know how to make a POST request, but not how to pass it the arguments I want.
SECOND UPDATE:
I've been making a lot of progress! I think... I looked through examples and documentation and eventually slapped together this version of what I think should do the trick:
def start_requests(self):
    baseUrl = "http://www.website.com/search.aspx?keyword=#&&sort=Sorting&page="
    target = 'ctl00$empcnt$ucResults$pagination'
    requests = []
    for i in range(1, 5):
        url = baseUrl + str(i)
        argument = str(i + 1)
        data = {'__EVENTTARGET': target, '__EVENTARGUMENT': argument}
        currentPage = FormRequest(url, data)
        requests.append(currentPage)
    return requests
The idea is that this treats the POST request just like a form and updates accordingly. However, when I actually try to run this I get the following traceback(s) (Condensed for brevity):
2013-03-22 04:03:03-0400 [guru] ERROR: Unhandled error on engine.crawl()
dfd.addCallbacks(request.callback or spider.parse, request.errback)
File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 280, in addCallbacks
assert callable(callback)
exceptions.AssertionError:
2013-03-22 04:03:03-0400 [-] ERROR: Unhandled error in Deferred:
2013-03-22 04:03:03-0400 [-] Unhandled Error
Traceback (most recent call last):
Failure: scrapy.exceptions.IgnoreRequest: Skipped (request already seen)
Changing the question to be more directed at what this post has turned into.
Thoughts?
P.S. When the second error happens, scrapy is unable to shut down cleanly and I have to send SIGINT twice to get things to actually wrap up.
FormRequest doesn't have a positional argument in the constructor for formdata:
class FormRequest(Request):
    def __init__(self, *args, **kwargs):
        formdata = kwargs.pop('formdata', None)
so you actually have to say formdata=:
requests.append(FormRequest(url, formdata=data))
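Putting it together, a sketch of the question's start_requests with the keyword-argument fix applied; dont_filter=True is added here as an assumption, to sidestep the "request already seen" error from the dupefilter:
from scrapy.http import FormRequest

def start_requests(self):
    baseUrl = "http://www.website.com/search.aspx?keyword=#&&sort=Sorting&page="
    target = 'ctl00$empcnt$ucResults$pagination'
    for i in range(1, 5):
        data = {'__EVENTTARGET': target, '__EVENTARGUMENT': str(i + 1)}
        # formdata must be passed by keyword; a positional dict is taken
        # as the callback, which caused the AssertionError above
        yield FormRequest(baseUrl + str(i), formdata=data,
                          callback=self.parse, dont_filter=True)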
