Facebook and their method for collecting data - reverse engineering the methods? - javascript

So apparently Facebook collects a huge amount of data from users, and not just from hitting the Like button, but also the amount of time a user spends looking at someone's post (reading a friend's status update). Is there a way to see what I am actually sending to Facebook and when (the timing is relevant)? Can I view those requests on Windows 7?
Is it possible to do reverse engineering on this particular topic?

This will be available in all browsers, but if you have access to Chrome, go to "View", then "Developer", then "Developer Tools".
From here select the "Network" tab.
You can then see all traffic travelling between your browser and the internet. You can filter this list down by domain name, e.g. facebook or any other domains they use.
Click on any item in the list and then you can see the request and response.
This should help you to get started.

It's not that simple.
Facebook uses its own formats, not the simple REST/JSON conventions most applications use.
Facebook makes it very, very hard to read and understand its APIs, deliberately.
They use some kind of their own binary data encoding, so if you look at the POST data it is really just numbers and opaque, token-like (possibly encrypted) values stored as base64, and so on.
Additionally, FB does a lot of AI processing, which is no longer rocket science; the internal APIs could also be built around that. So reverse-engineering FB makes little sense. Just write your own.
I also think many very good IT specialists have already tried. Companies like FB even run internal contests on this topic to make their APIs harder to reverse-engineer. Actually, if you do some online banking, you will find more useful information about what data was sent than you will on FB.

Related

How to "store" data from an API call and refresh it (make new call) overtime (replacing old stored contents) on website

I'm a volunteer for an association/game called FAF. We have a leaderboard (https://www.faforever.com/competitive/leaderboards/1v1) of players that we populate through API calls. However, it isn't very efficient to make an API call for the rankings every time someone opens the leaderboard page. Imagine if 1000 people visit it; that would be 1000 calls to the API, all for the exact same information.
Therefore, I've been searching and searching for a method to make one API call, store the result in the code and show that to the users, and then automate that API call to run every 30 minutes to 1 hour. That way, it's just one call that stores the info for users to see, rather than a new call for the same information every time a user opens the leaderboard page. However, I can't find anything on how to do this with JS (fetch, ajax, json). I'm still learning front-end dev, so I'm not sure whether there is even a way to do this.
I would appreciate it a lot if you could link me to a resource or coding "technique" to achieve this using JS. Thanks!
What you are describing is caching. Caching is an extremely common technique to reduce server load and latency. Most web server libraries offer some kind of cache functionality, which can be found in their respective docs. Frontend caching is not quite as common, but can be achieved using local storage, as described in this blog post: https://medium.com/@brockreece/frontend-caching-strategies-38c57f59e254
Here is another article about caching, this time a bit more general, using Node.js: https://www.honeybadger.io/blog/nodejs-caching/
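To make the localStorage approach concrete, here is a minimal sketch. It is not FAF-specific: the API URL, cache key and 30-minute refresh window below are placeholder assumptions, so swap in your real endpoint.

// Cache the leaderboard in localStorage and only re-fetch when it is stale.
const CACHE_KEY = 'leaderboard-cache';      // hypothetical key name
const MAX_AGE_MS = 30 * 60 * 1000;          // refresh after 30 minutes

async function getLeaderboard() {
  const cached = JSON.parse(localStorage.getItem(CACHE_KEY) || 'null');
  if (cached && Date.now() - cached.savedAt < MAX_AGE_MS) {
    return cached.data;                     // still fresh: no API call made
  }
  const response = await fetch('https://api.example.com/leaderboards/1v1'); // placeholder URL
  const data = await response.json();
  localStorage.setItem(CACHE_KEY, JSON.stringify({ savedAt: Date.now(), data }));
  return data;
}

Note that localStorage is per browser, so each visitor still makes one call of their own; to collapse 1000 visitors into a single call you need the server-side caching described in the articles above.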

How to send a pre-built request

What I want:
I'm using a website (which I wish to keep anonymous) to buy securities. It is quite complex and, as far as I can see, coded in JavaScript.
What I would like to do is 'inject' a request to buy something from a separate process. Instead of searching manually for what I want to buy, filling out the form, clicking Buy and confirming the 'are you sure you want to place the order?' popup, I would just like to directly send whatever command/request is sent to the server when the confirm button is pressed.
To be extra clear: I simply don't want to go through the manual hassle but rather just send a pre-built request with the necessary parameters embedded.
I'm certainly not looking to do anything malicious, just make my order input faster and smoother. It is not necessary to automate login or anything like that.
I understand that this is not much to go on but I'm throwing it out there and ask the question: Can it be done?
I really don't know how this stuff works behind the scenes; maybe the request is somehow encrypted in some custom format that is next to impossible to reverse-engineer, or maybe not.
"Injecting" is probably the wrong term. Most people will think of sql injection or javascript injection which is usually malicious activity. That doesn't seem to be what you want.
What you are looking for is an automation tool. There are plenty of tools available; try a Google search for "web automation tool". Selenium (http://www.seleniumhq.org/) and PhantomJS (http://phantomjs.org/) are popular ones.
Additionally, you may be able to recreate the request that actually buys the security. If you use Chrome, you can open Developer Tools and watch what appears on the Network tab as you go through the site. Firefox and Edge have similar tools as well. When you make the purchase, you will see the actual network request that placed it. Then, depending on how the site is implemented, you may be able to replicate that request using a tool like Postman.
However before you do any of the above, I would recommend that you take a look at the TOS for the site you mention. They may specifically prohibit that kind of activity.
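To make the "replicate that request" idea concrete, here is a hedged sketch of re-sending a captured order request from Node 18+ (which has fetch built in). Every URL, header and body field below is a placeholder; the real values are whatever you copied from the Network tab. Browsers refuse to set the Cookie header from script, which is why this runs outside the browser, and again: check the TOS first.

// Re-send a captured 'confirm order' request with values copied from DevTools.
(async () => {
  const response = await fetch('https://broker.example.com/orders/confirm', { // placeholder URL
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Cookie': 'session=PASTE_CAPTURED_SESSION_COOKIE', // must be a valid logged-in session
    },
    body: JSON.stringify({ instrumentId: 'ABC123', quantity: 10, side: 'buy' }), // placeholder fields
  });
  console.log(response.status, await response.text()); // inspect what the server answered
})();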
I want to elaborate on my comment and on Michael Ratliff's answer.
Here is an example.
We have some services. The administration of these services could only be done via a web interface, in manual mode; there is no API (yes, it's 2016 and there is no API). At first there was not much administration work and we did it manually.
But time passed and the amount of administration work grew exponentially, so we reached the point where this work had to be automated (still no API, even though a few new versions had been released).
What we have done:
We opened the pages we needed in a browser, opened Inspect Element (in Firefox), opened the Network tab, filled in the web form and pressed the button we needed. On the left you see all requests to the service; clicking any request shows, on the right, a full description of what was sent and received, with all requests and their parameters. We then took those parameters, changed them and sent them back to the server. A kind of reverse engineering, really.
For automation we used PHP and cURL. By now almost all of our work with these services is automated.
And yes, we used Selenium (before PHP and cURL). You open the form you need, press Record, do some actions on the web form; Selenium collects this data, and then you can change parameters in the Selenium script and re-run it.
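If you would rather stay in JavaScript than PHP, the same record-tweak-replay idea works with the selenium-webdriver package for Node. This is only a sketch: the URL, field name and wait condition are invented placeholders for whatever form you are driving.

// npm install selenium-webdriver (plus a driver binary such as geckodriver)
const { Builder, By, until } = require('selenium-webdriver');

(async function fillAndSubmit() {
  const driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('https://admin.example.com/form');           // placeholder URL
    await driver.findElement(By.name('amount')).sendKeys('42');   // placeholder field
    await driver.findElement(By.css('button[type="submit"]')).click();
    await driver.wait(until.titleContains('Done'), 10000);        // placeholder success check
  } finally {
    await driver.quit();
  }
})();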

Incorporate LinkedIn into a webpage

I'm trying to create a webpage that can incorporate LinkedIn info (profiles, people, companies, etc.).
The things that it can/would do are the following:
When the user enters a name that is registered on LinkedIn, he gets the following:
*Name, Company, Email
*List of LinkedIn messages that are waiting for a reply
The same process runs every time the user adds a profile. I'm planning to use LinkedIn's Profile API to get the Name, Company and Email, but I can't find a working example to use as my basis.
As for the second item, I still don't know how to get the LinkedIn messages.
Here's my Layout and expected result.
How can I achieve this? Opinions and suggestions are highly appreciated, thanks!
This is far too broad a question for me to invest the time necessary to figure out all the answers for you, but let me give you some hints. First of all, from my experience with the LinkedIn API, not all the data you wish to access is available (do double-check this, though; I used the API quite a while back and things might have changed in the meantime). Where data is not available through the API, the only alternative is to somehow bypass the cross-domain policy, which in practice requires the user to install a Chrome extension/Firefox plugin that functions as a proxy for your application, or, even 'better', making your entire application a browser-plugin-based web app. Not that I am a fan of those whatsoever, but if your application is meant in any way as a dedicated LinkedIn plugin (probably as part of a greater service you're developing), then that might make the most sense.
The whole system you are describing is very long-winded and requires a large amount of development time, and a lot of the data is not accessible either directly or indirectly. You cannot get email addresses out of the API, as a security feature (otherwise bots could just harvest emails for marketing campaigns).
First of all, you will need to make an application that allows for OAuth2 connections with the LinkedIn API service. People will log onto your website and click to link their LinkedIn account with your website, and your website will receive an access token back with which to make the calls.
You will then need to build the queries that access the data you require. The LinkedIn API documentation (http://developer.linkedin.com/) isn't greatly in-depth, but it gives you a good understanding and points you where you need to go. There are also a couple of pre-built PHP APIs around, such as https://code.google.com/p/simple-linkedinphp/.
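As a rough sketch of that OAuth2 dance (the authorization-code flow): the two LinkedIn endpoint paths below match their docs at the time of writing, but treat them, the scope name and the callback URL as assumptions to verify against the current documentation.

// Step 1: redirect the user to LinkedIn's authorization page.
const authUrl = 'https://www.linkedin.com/oauth/v2/authorization' + // verify against current docs
  '?response_type=code' +
  '&client_id=YOUR_CLIENT_ID' +
  '&redirect_uri=' + encodeURIComponent('https://yourapp.example.com/callback') +
  '&scope=r_liteprofile'; // scope names change over time; check the docs

// Step 2: on your callback route, exchange the returned code for an access token.
async function exchangeCode(code) {
  const res = await fetch('https://www.linkedin.com/oauth/v2/accessToken', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code: code,
      client_id: 'YOUR_CLIENT_ID',
      client_secret: 'YOUR_CLIENT_SECRET',
      redirect_uri: 'https://yourapp.example.com/callback',
    }),
  });
  const json = await res.json();
  return json.access_token; // send as "Authorization: Bearer <token>" on API calls
}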
I have worked with many APIs, including Twitter's, Facebook's and LinkedIn's, and they all require a lot of back-end work to make sure that they are secure and return the correct data.
It would take me hours to go through exactly how to do it; it has taken me many hours to get a solid implementation in place and working with all the different calls available.
If you have minimal coding knowledge, it would be best to go to an external company with a large amount of resources and knowledge in the field who can do it for you. Otherwise it may take many months to get a working prototype.

Google custom search on the whole web and limitations (gizoogle)

I am working on a search engine that needs to have access to results from google. Here are my options:
Using the custom search API
Using a proxy to make my server send searches and return the data
I am not sure about some things though:
Is the Custom Search API limited? I may need a really large number of queries, so if usage is limited it will be a problem.
Is it "authorized" to use a proxy in Node that would send search queries to Google and intercept the results to show to my users? If I do so, wouldn't I run into some limitations?
The inspiration here is Gizoogle, which managed to plug into Google's API (they have the same results as Google) while still not using Custom Search (Custom Search displays ads, and there aren't any on that website). So I assume they have some sort of proxy, but how come Google lets them run those queries?
Edit: It turns out that the Custom Search API is also limited. So, how did Gizoogle do it?
Ok here is how I solved this problem:
It turns out that Google has a lost API (probably deprecated, so be aware of this) for client-side AJAX search. It looks like this:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=test&rsz=large
Just go to that URL to see what results it gives.
So basically here is the process:
The user types a search
It is sent to your server in ajax
The server might modify the search depending on your application (filtering forbidden words or whatever)
Your server polls the AJAX web service from Google. Don't forget to add the GET parameter userip, which is needed to avoid limitations (Google limits incoming queries from each user, so your server has to tell Google that it is making the request on behalf of this user's IP).
You send the results back to the client, and then use JavaScript to display them.
The only drawback is that the search must be made in AJAX, meaning that the page is empty at load and filled in later. You could, however, use GET parameters in the URL to preload the search and fill the page before sending it to the client.
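Here is a minimal Node sketch of step 4 of the list above. The endpoint is the deprecated one shown earlier, so it can vanish at any time; how you obtain the client's IP depends on your framework (e.g. req.ip in Express), and the response shape shown in the comment is how that old API historically wrapped results.

// Poll the deprecated Google AJAX search API on behalf of one user.
async function googleSearch(query, clientIp) {
  const url = 'http://ajax.googleapis.com/ajax/services/search/web' +
    '?v=1.0&rsz=large' +
    '&q=' + encodeURIComponent(query) +
    '&userip=' + encodeURIComponent(clientIp); // the per-user limit workaround noted above
  const res = await fetch(url);
  const payload = await res.json();
  // The old API wrapped results as { responseData: { results: [...] } }.
  return payload.responseData ? payload.responseData.results : [];
}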
Google Custom Search (GCS) has a free mode and a paid ("enterprise") mode.
Both modes are regulated by a terms of service (Custom Search Terms of Service) - make sure you read carefully.
From what I understand, you can use the free mode and search as much as you'd like. Because Google is returning the results, they also return ads, so they get paid that way.
The paid mode gives you access to the API and lets you turn off the ads and do other things. But it comes at a cost.
I've been combing through the documentation and terms and the like -- it's really not Google's best effort. But if you are using it exactly as they describe, it's pretty standard, really.
Depending on your project size and the funds available, you could get a GSA (Google Search Appliance): http://www.google.com/enterprise/search/products/gsa.html
The Dr. Oz website uses this to index and pull in results from partnered sites, and you would have the ability to include Google results as well. It's highly customizable, with everything from source-weight ranking and filtering options to custom output.

Hide urls in html/javascript file

I am using AJAX on my website, and in order to use it I have to write the name of the file, for example:
id = "123";
$.getJSON(jquerygetevent.php?id=" + id, function(json)
{
//do something
});
How can I protect the URL? I don't want people to see it and use it...
That is a limitation of using client-side scripts: there is no real way to hide it from the user. There are many ways to make it less readable (minification etc.), but in the end an end user can still view the code.
Hi Ron and welcome to the internet. The internet was (to quote Wikipedia on the subject)
The origins of the Internet reach back to research of the 1960s, commissioned by the United States government in collaboration with private commercial interests to build robust, fault-tolerant, and distributed computer networks. The funding of a new U.S. backbone by the National Science Foundation in the 1980s, as well as private funding for other commercial backbones, led to worldwide participation in the development of new networking technologies, and the merger of many networks. The commercialization of what was by the 1990s an international network resulted in its popularization and incorporation into virtually every aspect of modern human life.
Because of these origins, and because of the way the protocols surrounding HTTP resource identification (like URLs) work, there's not really any way to prevent this. Had the internet been developed as a commercial venture initially (think AOL), then they might have been able to get away with preventing the browser from showing the URL to the user.
So long as people can "view source" they can see the URLs in the page that you're referring them to visit. The best you can do is to obfuscate the links using javascript, but at best that's merely an annoyance. What can be decoded for the user can be decoded for a bot.
Welcome to the internet, may your stay be a long one!
I think the underlying issue is why you want to hide the URL. As everyone has noted, there is no way to conceal the actual resolved URL. Once it is triggered, Firebug gives you everything you need to know.
However, is the purpose to prevent a user from re-using the URL? Perhaps you can generate one-time, session-relative URLs that can only be used within the given HTTP session. If you cut/paste such a URL to someone else, they would be unable to use it. You could also have it expire if they tried to refresh. This is done all the time.
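Here is a sketch of that one-time-token idea using Express and express-session; the route names (including the PHP-ish endpoint echoed from the question) and the token scheme are just illustrative assumptions.

// npm install express express-session
const express = require('express');
const session = require('express-session');
const crypto = require('crypto');

const app = express();
app.use(session({ secret: 'change-me', resave: false, saveUninitialized: true }));

// Render the page with a fresh single-use token baked into the AJAX URL.
app.get('/', (req, res) => {
  req.session.token = crypto.randomBytes(16).toString('hex');
  res.send('<script>$.getJSON("/jquerygetevent.php?id=123&token=' +
           req.session.token + '", function (json) { /* do something */ });</script>');
});

// The AJAX endpoint: each token is valid exactly once per session.
app.get('/jquerygetevent.php', (req, res) => {
  if (req.query.token !== req.session.token) return res.sendStatus(403);
  req.session.token = null; // burn the token, so refresh or cut/paste re-use fails
  res.json({ ok: true });
});

app.listen(3000);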
Is the purpose to prevent the user from hacking your URL by providing a different query parameter? Well, you should be handling that on the server side anyway, checking whether the user is authorized. Even before activating the link, the user can use a tool like Firebug to edit your client-side code as much as they want. I've done this several times to live sites when they weren't functioning the way I wanted :)
UPDATE: A HORRIBLE hack would be to drop an invisible Java applet onto the page. Applets can also trigger requests and interact with JavaScript. Any logic could be included in the applet code, which would be invisible to the user. This, however, introduces additional browser compatibility issues, etc., but it can be done. I'm not sure whether this would show up in Firebug. A user could still monitor outgoing traffic, but it might be less obvious. It would be better to make your server side more robust.
Why not put some form of security on your PHP script instead, such as checking a session variable?
EDIT in response to comment:
I think you've maybe got the cart before the horse somehow. URLs are by nature public addresses for resources. If the resource shouldn't be publicly consumable except in specific instances (i.e. from within your page), then it's a question of defining and implementing security for the resource itself. In your case, if you only want the resource called once, why not place a single-use access key into the calling page? Then the resource will only be delivered when the page is refreshed. I'm unsure why you'd want to do this, though. Does the resource expose sensitive information? Is the script perhaps very heavy on the server? And if the resource should only be used to render the page once, rather than to update it once it's rendered, would it perhaps be better to implement it server-side?
You can "protect" (hide) anything on the client: just encrypt/encode it into a format that is complicated for a real human to read.
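For completeness, here is what such encoding typically amounts to, using the endpoint from the question base64-encoded. Note, as an earlier answer points out, that this is only an annoyance: anyone can paste the same string into atob() in their console and read it back.

// 'anF1ZXJ5Z2V0ZXZlbnQucGhwP2lkPQ==' is btoa("jquerygetevent.php?id=")
var encoded = 'anF1ZXJ5Z2V0ZXZlbnQucGhwP2lkPQ==';
var id = "123";
$.getJSON(atob(encoded) + id, function (json) {
  // do something: the URL never appears in the source as plain text,
  // but it is fully visible in the Network tab the moment this runs
});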
