Changing the user-agent - php

Well, I have the following problem. I made a tool that checks the status of a website.
For example if I enter www.youtube.com it will say
http://www.youtube.com HTTP/1.0 200 OK
and for a website with a redirect it will say:
http://www.imgur.com HTTP/1.1 302 Moved Temporarily
http://imgur.com/ HTTP/1.1 200 OK
Alright, this works just as it should, but I would like to be able to select the user agent, for example Android, because YouTube on Android redirects to m.youtube.com.
I have already made a dropdown list with different user agents, but I don't know how to change the user agent of the request. When I search Google, I only find browser plugins or add-ons.
I hope someone knows of a way to do this.
Thanks in advance!

You can send a CURL request and set the user agent like this:
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
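To tie this back to the status checker in the question, here is a minimal sketch of how the selected user agent could be passed to CURL and the status line of every redirect hop collected. The names USER_AGENTS, extractStatusLines and fetchStatusLines are made up for illustration, and the Android string is only an example value, not something from the question.

```php
<?php
// Illustrative user-agent strings keyed by dropdown value (assumed names and values).
const USER_AGENTS = [
    'firefox' => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13',
    'android' => 'Mozilla/5.0 (Linux; Android 10; Pixel 3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.93 Mobile Safari/537.36',
];

// Pull every HTTP status line out of a raw header dump.
// With redirects followed, CURL returns one header block per hop.
function extractStatusLines(string $rawHeaders): array
{
    preg_match_all('/^HTTP\/\S+ \d{3}[^\r\n]*/m', $rawHeaders, $matches);
    return array_map('trim', $matches[0]);
}

// Request only the headers of $url with the given user agent,
// following redirects, and return the status line of each hop.
function fetchStatusLines(string $url, string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request; some servers answer HEAD differently from GET
    curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the returned string
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the result instead of printing it
    $rawHeaders = curl_exec($ch);
    curl_close($ch);

    return $rawHeaders === false ? [] : extractStatusLines($rawHeaders);
}
```

Calling fetchStatusLines('http://www.youtube.com', USER_AGENTS['android']) should then give one status line per hop, which you can print next to the URL as your tool already does.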

Related

Alternative to file_get_contents and curl to get remote image

I had a PHP script for a music database that automatically fetched album covers from a remote server via file_get_contents. For some time now it has not worked. I tried to do the same thing with curl and the GD library, but I get the same problem: it returns "403 - forbidden". I guess the remote server has some kind of hotlink protection. I can open the remote image URL in a browser, but I can't fetch it to my server.
Is there any alternative way to get around this issue and fetch the remote image?
To spoof the user-agent in a CURL request (other headers such as the referer can be set in a similar way), you can use this code:
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
This will probably bypass the hotlink protection; it bypasses my own ;-)
You can use Ajax to determine which image you need and load it directly in the browser.
That does not run into the hotlink protection and should work fine.

Force HTTP while fetching page source with PHP

How would I force HTTP (Not HTTPS), while getting the source code of: http://www.youtube.com/watch?v=2YqEDdzf-nY?
I've tried using file_get_contents, but it goes to HTTPS.
There is no way, because Google forces you to use HTTPS. It no longer accepts insecure connections.
They have even started to downrank websites that are not served over SSL.
As for your comment, I have done a little more research.
Maybe it depends on the user agent. I have not had time to confirm this.
Try CURL with this user agent:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101

Blocking HTTP POST attack via mod_rewrite

I have a WordPress site that is being attacked with the following HTTP POST requests:
x.x.x.x - - [15/Jul/2013:01:26:52 -0400] "POST /?CtrlFunc_stttttuuuuuuvvvvvwwwwwwxxxxxyy HTTP/1.1" 200 23304 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
x.x.x.x - - [15/Jul/2013:01:26:55 -0400] "POST / HTTP/1.1" 200 23304 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
The attack itself isn't bad enough to bring down Apache, but it does drive up the CPU usage more than I'd like it to. I would therefore like to block these using mod_rewrite -- straight to a 403 page should do it -- but I've not had any luck so far with anything I've tried. I would like to block all blank HTTP POST requests (to /) as well as /?CtrlFunc_*
What I've done as a workaround for now is block all HTTP POST traffic but that won't work long-term.
Any ideas? I've invested several hours on this and have not made much progress.
Thanks!
Instead of blocking the request via mod_rewrite, I'd use it as bait to record the IPs of the offenders. Adding them to a 96-hour blacklist in your firewall will then block all requests from them.
See Fail2ban.
Specifically, I believe that Fail2ban filters are the right place to start looking to write your url-specific case.
http://www.fail2ban.org/wiki/index.php/MANUAL_0_8#Filters
http://www.fail2ban.org/wiki/index.php/HOWTO_apache_myadmin_filter
Here is a Fail2ban blog post that creates a filter for this POST attack.
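Before writing the actual Fail2ban filter, it can help to prototype the pattern against the log lines from the question. The following is only a sketch in PHP, not a Fail2ban filter: the function name matchAttackRequest is hypothetical, and the regex assumes the Apache log format shown above. Note that the access log does not show whether the POST body was blank, so this matches every POST to / and to /?CtrlFunc_*.

```php
<?php
// Return the client address if the access-log line is a POST to "/" or
// "/?CtrlFunc_...", otherwise null. Assumes the log format from the question.
function matchAttackRequest(string $logLine): ?string
{
    $pattern = '/^(\S+) \S+ \S+ \[[^\]]+\] "POST \/(?:\?CtrlFunc_\S*)? HTTP\/[\d.]+"/';

    return preg_match($pattern, $logLine, $m) === 1 ? $m[1] : null;
}
```

Once the pattern matches only the lines you want, the same idea can be carried over to a Fail2ban filter, where the captured address is what ends up on the blacklist.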

Always getting Error 500 when asking Reddit's API for the vote count

I'm using Reddit's API to get the vote count for a given URL (like this: http://www.reddit.com/api/info.json?url=$url). I always get an Error 500 message. A snippet of my code is below. Can anyone help me?
$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,15);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
$content = curl_exec($ch);
echo $content;
curl_close($ch);
The echo always prints the following line:
<html><body><h1>500 Server Error</h1>An internal server error occured.</body></html>
Thanks for reading.
--- EDITED ---
It is working locally.
$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
reddit's API rules state the following about user agents:
Change your client's User-Agent string to something unique and descriptive, preferably referencing your reddit username.
Example: User-Agent: super happy flair bot v1.0 by /u/spladug
Many default User-Agents (like "Python/urllib" or "Java") are drastically limited to encourage unique and descriptive user-agent strings.
If you're making an application for others to use, please include a version number in the user agent. This allows us to block buggy versions without blocking all versions of your app.
NEVER lie about your user-agent. This includes spoofing popular browsers and spoofing other bots. We will ban liars with extreme prejudice.
That doesn't explain the 500 error; nevertheless, this is where I'd start if the same URL works just fine in your browser. If you also get 500 errors in your browser, then you probably aren't using the info API correctly (and have consequently found a bug).
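As a rough sketch of what following those rules could look like in the code from the question: the helper names buildRedditUserAgent, buildInfoUrl and fetchVoteInfo are made up here, and URL-encoding the url parameter is my own assumption about what might help, not a confirmed cause of the 500.

```php
<?php
// Build a descriptive user agent in the style of the example from the API rules.
function buildRedditUserAgent(string $appName, string $version, string $redditUser): string
{
    return sprintf('%s v%s by /u/%s', $appName, $version, $redditUser);
}

// Build the info.json request URL, encoding the page URL so that its own
// "?", "&" and "/" characters cannot break the query string (assumption).
function buildInfoUrl(string $pageUrl): string
{
    return 'http://www.reddit.com/api/info.json?url=' . rawurlencode($pageUrl);
}

// Fetch the raw JSON for a page URL; returns false if the request fails.
function fetchVoteInfo(string $pageUrl, string $userAgent)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, buildInfoUrl($pageUrl));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 15);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    $content = curl_exec($ch);
    curl_close($ch);

    return $content;
}
```

You would call it with your own values, for example fetchVoteInfo($url, buildRedditUserAgent('my vote counter', '1.0', 'your_reddit_username')), instead of the spoofed Firefox string.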

file_get_contents() and curl failed to get page contents, I need alternate code

Some sites are blocking file_get_contents and the curl code as well. I need PHP code that circumvents that problem. I only need to get the page contents so I can extract the title.
You probably need to set the user agent string to emulate a "real" browser:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20110319 Firefox/4.0');
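The same idea also works with file_get_contents if you pass it a stream context, which is a different route from the CURL line above. Here is a minimal sketch, assuming the site only checks the user agent; buildBrowserContext and extractTitle are hypothetical helper names, and the regex-based title extraction is a simplification that will not handle every page.

```php
<?php
// Create a stream context that sends the given user agent and follows redirects.
function buildBrowserContext(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent'      => $userAgent,
            'follow_location' => 1,
            'timeout'         => 15,
        ],
    ]);
}

// Return the decoded text of the first <title> element, or null if none is found.
function extractTitle(string $html): ?string
{
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m) !== 1) {
        return null;
    }

    return trim(html_entity_decode($m[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}
```

Usage would be along the lines of $html = file_get_contents($url, false, buildBrowserContext('Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20110319 Firefox/4.0')); followed by extractTitle($html) once you have checked that $html is not false.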
