XMLHttpRequest, redirections & privacy

Posted: 2016-08-25

Hi,

I need your advice in order to improve a script.

WHAT I WANT:
From an initial URL, follow multiple redirections (including nasty ones such as meta refresh!).
Get the final URL (and clean it before opening it, which already works fine).
The final goal is to de-crap links to protect safety and privacy (e.g. do not store cookies).

WHAT I DO:
Try an XMLHttpRequest, then fall back to GM_xmlhttpRequest if it fails,
because XMLHttpRequest seems to follow redirections without storing unwanted cookies.

MY CODE:

// initialURL is the link to resolve; it is defined elsewhere in the script.
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function () {
  if (xhttp.readyState !== XMLHttpRequest.DONE) { return; }
  // getResponseHeader() returns null when the header is absent.
  var location = xhttp.getResponseHeader("Location");
  if (location !== null) {
    myFunction(decodeURIComponent(location));
    return;
  }
  // No Location header: fall back to GM_xmlhttpRequest.
  GM_xmlhttpRequest({
    url: initialURL,
    method: "GET",
    onload: function (response) {
      myFunction(response.finalUrl);
    }
  });
};
xhttp.open("GET", initialURL, true);
xhttp.send();

function myFunction(url) {
  // Clean the URL
  // Open the URL
}

Any help would be much appreciated!

Posted: 2016-09-02
Edited: 2016-09-02

That wasn't really clear, so here is a better explanation.

I need to follow links with multiple redirections in order to get the final URL.
Browsers do this, but along the way they pick up a lot of cookies used for tracking and affiliation purposes.

ex:
The chain always starts from the same link shortener:
https://www.~linkshortener.com/url/?e=AGfyxEyWcav%2FzZyyfT%2BlEFYd%2FqQotmdDZqv4QR7zv4U%2BReCRPJdTho6wBc2gydD5&is_code=1&w=all&l=deal_url&i=253580
cookie set
302 - Found
https://clk.tradedoubler.com/click?p=269404&a=2027175&g=23064498
Meta Refresh
https://clk.tradedoubler.com/click?p=269404&a=2027175&g=23064498&f=0
302 - Moved Temporarily
http://solutions.tradedoubler.com/redirect/carrefour?aff_id=2027175&aff_name=affiliation.com%2Findex.php&prog_id=269404&tduid=2754bf422a1b1442f9bc813368d3a823&url=http://courses.carrefour.fr/drive/accueil%23utm_source=tradedoubler&utm_medium=display&utm_campaign=affiliation&utm_term=2027175_affiliation.com%2Findex.php%23tduid=2754bf422a1b1442f9bc813368d3a823%23xtor=AL-32280676-[5]-[2027175]_[affiliation.com%2Findex.php]-[textlink]
cookie set
301 - Moved Permanently
http://solutions.tradedoubler.com/redirect/carrefour/?aff_id=2027175&aff_name=affiliation.com%2Findex.php&prog_id=269404&tduid=2754bf422a1b1442f9bc813368d3a823&url=http://courses.carrefour.fr/drive/accueil%23utm_source=tradedoubler&utm_medium=display&utm_campaign=affiliation&utm_term=2027175_affiliation.com%2Findex.php%23tduid=2754bf422a1b1442f9bc813368d3a823%23xtor=AL-32280676-[5]-[2027175]_[affiliation.com%2Findex.php]-[textlink]
302 - Found
http://courses.carrefour.fr/drive/accueil#utm_medium=affiliation&utm_source=tradedoubler&utm_campaign=BR&utm_term=2027175_affiliation.com/index.php#tduid=2754bf422a1b1442f9bc813368d3a823#xtor=AL-32280676-[BR]-[2027175]_[affiliation.com/index.php]-[-[textlink]]
END

Once I've got the final URL, I run some regexes on it to get a clean URL before opening it:
http://courses.carrefour.fr/drive/accueil
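
The cleaning step described above could be sketched as follows. This is only an illustration: it uses the standard URL class rather than the regexes mentioned in the post, the name cleanUrl is hypothetical, and it assumes that dropping the whole query string and fragment is acceptable (some sites need their query parameters to find the page).

```javascript
// Hypothetical helper: keep only the scheme, host and path of a resolved URL.
function cleanUrl(rawUrl) {
  const parsed = new URL(rawUrl);
  parsed.search = ""; // drop ?utm_source=... and any other query parameters
  parsed.hash = "";   // drop #utm_medium=... and any other fragment
  return parsed.toString();
}
```

For a site that needs some of its parameters, a variant could delete only the known tracking keys through parsed.searchParams.delete() instead of clearing everything.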

My question is:
How do I get the final URL WITHOUT unwanted cookies?


Please note that at the beginning I used an external service, but ~linkshortener.com detected it and blacklisted it, so it doesn't work anymore.

Thanks

woxxom (Administrator)
Posted: 2016-09-03
Edited: 2016-09-03

Don't use XMLHttpRequest: it only follows redirects allowed by the cross-origin policy.

GM_xmlhttpRequest allegedly follows all redirects automatically, regardless of CORS, so it should be used. Also, try method: 'HEAD' because you don't need the HTML. Don't forget makePrivate: true.

As for meta refresh, you'll probably need to parse it manually (method: 'GET' is required), but I don't see why that might be a problem. Actually, the question still doesn't state what the problem is with the code.
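
Parsing a meta refresh by hand might look roughly like the sketch below. The function name extractMetaRefreshUrl is hypothetical, the regexes assume a quoted content attribute, and HTML entities such as &amp; inside the URL are not decoded. In a userscript, parsing the response with DOMParser would be a sturdier alternative.

```javascript
// Hypothetical helper: return the absolute target of a
// <meta http-equiv="refresh"> tag found in `html`, or null if there is none.
function extractMetaRefreshUrl(html, baseUrl) {
  // Find the first <meta ...> tag that declares http-equiv="refresh".
  const metaTag = /<meta\b[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>/i.exec(html);
  if (!metaTag) { return null; }
  // The content attribute looks like: content="0; url=https://example.com/next"
  const content = /content\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(metaTag[0]);
  if (!content) { return null; }
  const value = content[1] !== undefined ? content[1] : content[2];
  const target = /url\s*=\s*['"]?([^'"]+)/i.exec(value);
  if (!target) { return null; }
  // Resolve relative targets against the URL of the page that held the tag.
  return new URL(target[1].trim(), baseUrl).toString();
}
```

The returned URL would then be requested again, repeating until a response contains neither an HTTP redirect nor a meta refresh.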

Posted: 2016-09-03

Thanks wOxxOm,
there's no issue with the code itself; when I run into issues I just try harder ^^

I just need help improving it because I'm not good enough to understand every mechanism.

I've set the request method to HEAD and added makePrivate: true,
but if I use only GM_xmlhttpRequest (even with makePrivate), it still allows cookie creation.
That's why I use XMLHttpRequest first.

Cookies set on 301 redirects are a real pain.

woxxom (Administrator)
Posted: 2016-09-03

Apparently makePrivate is a nonstandard Scriptish thing. In Greasemonkey/Tampermonkey it's anonymous: true
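
Putting the suggestions from this thread together, a request with anonymous: true might be written as in the sketch below. The wrapper name resolveFinalUrl and the choice to pass GM_xmlhttpRequest in as a parameter are illustrative assumptions; only url, method, anonymous, onload, onerror and response.finalUrl belong to the GM_xmlhttpRequest API.

```javascript
// Sketch: resolve the final URL of a redirect chain without sending cookies.
// `gmRequest` is expected to be GM_xmlhttpRequest; passing it in as a
// parameter is an illustrative choice, not something the API requires.
function resolveFinalUrl(initialUrl, gmRequest, onResolved) {
  gmRequest({
    url: initialUrl,
    method: "HEAD",   // headers only; use "GET" when the page body is needed
    anonymous: true,  // do not send cookies with the request
    onload: function (response) {
      onResolved(response.finalUrl);
    },
    onerror: function () {
      onResolved(null); // request failed: no final URL available
    }
  });
}
```

In a userscript this would be called as resolveFinalUrl(initialURL, GM_xmlhttpRequest, myFunction), with // @grant GM_xmlhttpRequest declared in the metadata block.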

Posted: 2016-09-03

Yes! It works great.

I've got one final thing to explore: XMLHttpRequest vs. GM_xmlhttpRequest.

ex:
xml: http://www.banggood.com/bang/?tt=15981_12_191610_&r=http://www.banggood.com/fr/Original-Xiaomi-Hybrid-Dual-Drivers-Wired-Control-In-Ear-Earphone-Headphone-With-Mic-p-1010328.html
GM: http://www.banggood.com/fr/Original-Xiaomi-Hybrid-Dual-Drivers-Wired-Control-In-Ear-Earphone-Headphone-With-Mic-p-1010328.html?utm_source=tradetracker&utm_medium=tradetracker&utm_content=15981&utm_campaign=100001

I can't use XMLHttpRequest alone because it fails if there is only a 302 redirect.
Now I can use GM_xmlhttpRequest alone (with anonymous: true to prevent cookies).

However, XMLHttpRequest often returns the second-to-last URL, which almost always contains the real URL in plain text.

So I'm wondering whether I should keep using both or only GM_xmlhttpRequest.
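
If the real destination is visible inside a redirect URL, as in the "xml:" example above, it could be pulled out of the query string without a second request. The sketch below is only one possible approach: the name extractEmbeddedUrl is hypothetical, and it assumes the destination appears as the value of a query parameter that starts with http:// or https://.

```javascript
// Hypothetical helper: return the first query-parameter value that is itself
// an http(s) URL, or null when no parameter looks like one.
function extractEmbeddedUrl(rawUrl) {
  const params = new URL(rawUrl).searchParams;
  for (const value of params.values()) {
    if (/^https?:\/\//i.test(value)) {
      return value;
    }
  }
  return null;
}
```

With a helper like this, GM_xmlhttpRequest alone may be enough: the embedded URL can be tried on each intermediate link, falling back to response.finalUrl when nothing is found.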
