How can I download a web page by using Wget? - php

I want to download the web page http://acm.sgu.ru/problem.php?contest=0&problem=161
I tried the command:
wget -o 161.html http://acm.sgu.ru/problem.php?contest=0&problem=161
But it does not work.
Can anyone help me?

The URL you are passing to wget contains a character (&) that has special meaning in the shell, so you have to protect it by putting the URL inside single quotes.
Option -o file logs all messages to the given file.
If you want the page to be written to the given file, use option -O file (capital O).
Try:
wget -O 161.html 'http://acm.sgu.ru/problem.php?contest=0&problem=161'
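Without the quotes, the shell splits the line at the &: it starts wget in the background with a URL that ends at contest=0, then treats problem=161 as a variable assignment. A minimal sketch of a wrapper that keeps the URL in one piece; fetch_page is a hypothetical helper name, not part of wget:

```shell
# Hypothetical helper: save one page to a file.
# $1 = output file, $2 = URL. Quoting "$2" keeps & and ? from being
# interpreted by the shell, so wget receives the full URL as one argument.
fetch_page() {
    wget -O "$1" "$2"
}

# Usage (not run here):
# fetch_page 161.html 'http://acm.sgu.ru/problem.php?contest=0&problem=161'
```

The single quotes are still needed at the call site; the double quotes inside the function only preserve what was passed in.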

Related

Compile C++ file Using PHP

I am using PHP on a Windows machine. I also use Dev C++. I can compile a .cpp file from CMD without problems using this command:
g++ hello.cpp -O3 -o hello.exe
Now I am trying to run the same command with PHP's system() function, so it looks like this:
system("g++ c:\wamp\www\grader\hello.cpp -O3 -o C:\wamp\www\grader\hello.exe");
but it doesn't compile. I am lost; what am I missing?
I also looked at this question, which is exactly what I need, but I couldn't find a useful solution for my case there:
Php script to compile c++ file and run the executable file with input file
Use the PHP exec command.
echo exec('g++ hello.cpp -O3 -o hello.exe');
should work.
There's a whole family of different exec & system commands in PHP, see here:
http://www.php.net/manual/en/ref.exec.php
If you want the output in a variable (exec() returns only the last line of the command's output), use:
$variable = exec('g++ hello.cpp -O3 -o hello.exe');
If that doesn't work, make sure that g++ is available in your path and that you're logged in with sufficient privileges to execute it.
You may also find that it's failing because PHP is being executed by your web server (unless you're also running PHP from the command prompt), and the web server's user ID may not have write access to the folder where g++ is trying to create the output file.
Temporarily granting write access to 'Everybody' on the output folder will verify whether that is the case.
Two things:
You are using double quotes and are not escaping the \ inside the path.
You are not using a full path to g++.
The first one matters because \ followed by certain characters has a special meaning in a double-quoted string (you might know \n as a newline); the second one is relevant since the PHP environment might have a different search path.
A solution might be
system("c:\\path\\to\\g++ c:\\wamp\\www\\grader\\hello.cpp -O3 -o C:\\wamp\\www\\grader\\hello.exe");
Alternatively, you can use single quotes instead of double quotes; they use different, less strict escaping rules:
system('c:\path\to\g++ c:\wamp\www\grader\hello.cpp -O3 -o C:\wamp\www\grader\hello.exe');
or use / instead of \, which Windows supports too:
system("c:/path/to/g++ c:/wamp/www/grader/hello.cpp -O3 -o C:/wamp/www/grader/hello.exe");
Which one you use is your choice, though many might consider the first one ugly and the last one bad style on Windows ;-)
Thanks to everyone. I tried the code given in the posts above and it worked like a charm.
I ran the following code using my browser
$var = exec("g++ C:/wamp/www/cpp/hello.cpp -O3 -o C:/wamp/www/cpp/hello.exe");
echo $var;
The exe file is created. I can see the result when I run the exe file, but when I run the above code in the browser, the result is not displayed on the web page. I gave full access permission to all users, but it still does not show the result on the web page.
I really need help with this since I am doing a project on simulated annealing where I want to get the result from a compiled C++ program and display it on the web page with some jQuery Highcharts.
Thanks again to all; it has helped me a lot and I have learnt a lot as well.

How to setup a wget cron job command

How do I set up a cron job command to execute a URL?
/usr/bin/wget -q http://www.domain.com/cron_jobs/job1.php >/dev/null 2>&1
Why can't I make this work? I have tried everything. The PHP script should send an email and create some files, but neither happens.
The command returns this:
Output from command /usr/bin/wget -q http://www.domain.com/cron_jobs/job1.php ..
No output generated
... but it still creates an empty file in /root on each execute!? Why?
Use curl like this:
/usr/bin/curl http://domain.com/page.php
Don't worry about the output, it will be ignored
I had the same problem. The solution is understanding that wget is outputting two things: the results of the url request AND activity messages about what it's doing.
By default, if you do not specify an output file, it will create one, seemingly named after the file in your url, in the current folder where wget is run.
If you want to specify a different output file:
-O outputfile.txt
will write the URL results to outputfile.txt, overwriting what's there.
If you wish to append to that file instead, write to standard output and append to the file from there.
Here's the trick: to write to standard output, use:
-O-
The second dash stands in for a filename and tells wget to write the URL results to standard output.
Then use the append operator, >>, to append to a file of your choice:
wget -O- http://www.invisibility.com >>/var/log/invisibility.log
The lowercase -o specifies the location of the activity log, so if you wish to log activity for the URL request, you can use:
wget -o /var/log/activity.log http://someurl.com
-q suppresses output of activity messages, so
wget -q -o /var/log/activity.log http://someurl.com
will not log any activity to the specified file, and I think that is the crux of where people get confused.
Remember:
-O is shorthand for --output-document
-o is shorthand for --output-file, which is the activity log.
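For a cron job that only needs to trigger the PHP script, the two options can be combined so that the page body is thrown away and no stray file is left in the working directory. A minimal sketch; run_cron_url is a hypothetical helper name, and the schedule in the comment is an assumption:

```shell
# Hypothetical cron helper: request a URL, discard the returned page,
# and keep wget's activity messages in a log file.
# $1 = URL, $2 = log file.
run_cron_url() {
    wget -O /dev/null -o "$2" "$1"
}

# Equivalent crontab entry (assumed schedule: every 5 minutes), using -q
# instead of a log file:
# */5 * * * * /usr/bin/wget -q -O /dev/null 'http://www.domain.com/cron_jobs/job1.php'
```

Writing the document to /dev/null is what stops the empty files from piling up; -q or -o only affects the activity messages.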
It took me hours to get it working. Thank you to the people writing down solutions.
One also needs to check whether single or double quotes are needed; otherwise the URL will be parsed wrongly, leading to error messages:
This worked (using single quotes):
/usr/bin/wget -O -q 'http://domain.com/cron-file.php'
This gave errors (using double quotes):
/usr/bin/wget -O -q "http://domain.com/cron-file.php"
I don't know if the /usr/bin/ is needed. I have read about different ways to order the -O and -q options. It is hard to find a reliable, definitive source on the web for this subject; there are so many different examples.
An online wget manual listing the available options can be found here (but check the one shipped with your Linux distro for an up-to-date version):
http://unixhelp.ed.ac.uk/CGI/man-cgi?wget
To use wget to display HTML:
wget -qO- http://www.example.com
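Because -O and -o each take an argument, the order and grouping of options matter: in wget -O -q URL the word -q is consumed as the output file name, so the page is saved to a file called -q and quiet mode is never switched on. The following is a toy model built on the shell's getopts, not wget's actual parser; show_binding is a hypothetical name used only to make the binding visible:

```shell
# Toy model (NOT wget's real parser): -q is a flag, -O and -o take an argument.
# Prints which value each option ended up with. Runs in a subshell so that
# OPTIND and the variables do not leak into the caller.
show_binding() (
    OPTIND=1
    quiet=no doc='(default)' log='(stderr)'
    while getopts 'qO:o:' opt; do
        case "$opt" in
            q) quiet=yes ;;
            O) doc=$OPTARG ;;
            o) log=$OPTARG ;;
            *) exit 2 ;;
        esac
    done
    printf 'quiet=%s document=%s log=%s\n' "$quiet" "$doc" "$log"
)

# show_binding -O -q 'http://domain.com/cron-file.php'
#   prints: quiet=no document=-q log=(stderr)
# show_binding -qO- http://www.example.com
#   prints: quiet=yes document=- log=(stderr)
```

Putting flags without arguments first, as in -qO- or -q -O -, avoids the problem.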

How to wget a file when the filename isn't known?

I am trying to automate the download of a file using wget, calling a PHP script from cron. The filename always consists of a fixed name plus a date, but the date changes depending on when the file is uploaded. There is no certainty about when the file is updated, so the final name cannot be known until the directory is checked.
An example filename is file20100818.tbz
I tried using wildcards within wget, both * and %, but they failed.
Thanks in advance,
Greg
Assuming the file type is constant then from the wget man page:
You want to download all the GIFs from a directory on an HTTP server. You tried wget http://www.server.com/dir/*.gif, but that didn't work because HTTP retrieval does not support globbing. In that case, use:
wget -r -l1 --no-parent -A.gif http://www.server.com/dir/
So, you want to use the -A flag, something like:
wget -r -l1 --no-parent -A.tbz http://www.mysite.com/path/to/files/
For the sake of clarity, because this thread shows up in Google when searching for "wget and wildcards", the answers above don't give a sensible solution, and there doesn't seem to be anything else on SO answering this:
According to the wget manual, you can use wildcards with FTP by using the option -g on (--glob=on); however, wget will return an error unless you also use all of the -r -np -nd options. Thanks to Wiseman20@ubuntuforums for showing us the way.
Sample code:
wget -r -np -nd --glob=on ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.*.tar.gz
You can loop over each date like this:
<?php
for ($i = 0; $i < 30; $i++)
{
    // Candidate name for today plus $i days, e.g. file20100818.tbz
    $filename = "file" . date("Ymd", time() + 86400 * $i) . ".tbz";
    // try the file download; if successful, break out of the loop.
}
?>
You can increase the number of tries in the for loop.
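The same idea can be sketched directly in the shell, which may be convenient when the job is started from cron. This assumes GNU date (for the -d option); BASE_URL, candidate_names and fetch_first_available are hypothetical names, and the dates run forward from today to mirror the PHP loop:

```shell
# Hypothetical base URL; replace with the real directory.
BASE_URL='http://www.mysite.com/path/to/files'

# Print one candidate file name per line for today, today+1, ... ($1 days).
# Requires GNU date for -d.
candidate_names() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        printf 'file%s.tbz\n' "$(date -d "+$i day" +%Y%m%d)"
        i=$((i + 1))
    done
}

# Try each candidate until wget exits with status 0, then print its name.
fetch_first_available() {
    for name in $(candidate_names "$1"); do
        if wget -q -O "$name" "$BASE_URL/$name"; then
            printf '%s\n' "$name"
            return 0
        fi
        rm -f "$name"   # wget -O creates the file even when the download fails
    done
    return 1
}

# Usage (not run here): fetch_first_available 30
```

wget's exit status is what decides success here, so a server that answers missing files with a 200 page would defeat the check.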

Why could wget not work with PHP's exec function?

My script tries to exec() wget but seems to fail (though no error is raised). What could be the problem? Should I tune PHP somehow? I just installed Apache and PHP on Ubuntu...
Add a third parameter to exec() to find out the exit code of wget.
Maybe wget is not in the (search) path of the apache/php process.
Did you try an absolute path to the wget executable?
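Both checks can be tried from a shell first, ideally as the same user the web server runs as. A minimal sketch; locate_tool and run_and_report are hypothetical helper names:

```shell
# Print where a command lives on the current PATH, or fail if it is missing.
locate_tool() {
    if path=$(command -v "$1"); then
        printf '%s\n' "$path"
    else
        printf '%s not found in PATH=%s\n' "$1" "$PATH" >&2
        return 127
    fi
}

# Run a command, then print and return its exit status (0 means success).
run_and_report() {
    status=0
    "$@" || status=$?
    printf 'exit status: %s\n' "$status"
    return "$status"
}

# Usage (not run here):
# locate_tool wget
# run_and_report wget -q -O /tmp/page.html 'http://example.com/'
```

The path printed by locate_tool is the absolute path to try inside exec() if the web server's PATH turns out to be different.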
What is your $_GET['one']? The name of a video file? A number? A URL? What's $file? What's $one?
Obvious error sources:
Are all of those variables set? If $one is blank, then wget has nowhere to go to fetch your file. If $_GET['one'] or $file is blank, then your output file will most likely not exist, either because the directory can't be found ($_GET['one'] is empty) or because $file is empty, causing wget to try to write to a directory name, which is not allowed.
'Illegal' characters in any of the variables. Does $file contain shell metacharacters? Any of ;?*/\ etc.? Those will all break the command line.
Why are you using wget anyways? You're passing raw query parameters out to a shell, which is just asking for trouble. It would be trivial to pass in shell metacharacters, which would allow remote users to run ANYTHING on your webserver. Consider the following query:
http://example.com/fetch.php?one=;%20rm%20-rf%20/%20;
which in your script becomes:
wget -O /var/www/videos/; rm -rf / ;/$file $one
and now your script is happily deleting everything on the server which your web server's user has permissions for.
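One way to reduce this risk is to validate every value against a whitelist before it gets near a command line, and to pass values as separate quoted arguments rather than splicing them into a string. A minimal shell sketch, assuming file names only ever need letters, digits, dots, underscores and hyphens; is_safe_name and safe_fetch are hypothetical names, and /var/www/videos is taken from the example above:

```shell
# Accept only non-empty names made of letters, digits, dot, underscore and
# hyphen, and not starting with a dot (which also rules out "..").
is_safe_name() {
    case "$1" in
        ''|.*|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: $1 = file name, $2 = URL.
safe_fetch() {
    if ! is_safe_name "$1"; then
        printf 'rejected file name: %s\n' "$1" >&2
        return 1
    fi
    case "$2" in
        http://*|https://*) ;;          # only plain web URLs are allowed
        *) printf 'rejected URL: %s\n' "$2" >&2; return 1 ;;
    esac
    # Quoted arguments reach wget as-is; the shell does not re-parse them.
    wget -O "/var/www/videos/$1" "$2"
}
```

On the PHP side the corresponding step would be to validate the request parameters before building the command at all.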

How to grab live text from a URL?

I'm trying to grab all the data (text) coming from a URL which is constantly sending text. I tried using PHP, but that would mean having the script running the whole time, which it isn't really made for (I think). So I ended up using a Bash script.
At the moment I use wget (I couldn't get curl to output the text to a file):
wget --tries=0 --retry-connrefused http://URL/ --output-document=./output.txt
So wget seems to be working pretty well, apart from one thing: every time I restart the script, wget clears the output.txt file and starts filling it again, which isn't what I want. Is there a way to tell wget to append to the text file?
Also, is this the best way to capture the live stream of data?
Should I use a different language like Python or …?
You can do wget --tries=0 --retry-connrefused $URL -O - >> output.txt.
Explanation: the option -O is short for --output-document, and a dash - means standard output.
The line command > file means "write output of command to file", and command >> file means "append output of command to file", which is what you want.
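The difference between the two operators is easy to see without any network access. A minimal sketch using a scratch file; demo_redirect is a hypothetical name:

```shell
# Write two "runs" to a scratch file, first with > and then with >>,
# and print what the file contains after each pair.
demo_redirect() {
    tmp=$(mktemp)
    printf 'run 1\n' > "$tmp"
    printf 'run 2\n' > "$tmp"      # > truncates first: only "run 2" remains
    cat "$tmp"
    printf '%s\n' '--'
    printf 'run 1\n' > "$tmp"
    printf 'run 2\n' >> "$tmp"     # >> appends: both lines remain
    cat "$tmp"
    rm -f "$tmp"
}

# demo_redirect prints:
# run 2
# --
# run 1
# run 2
```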
Curl doesn't follow redirects by default and outputs nothing if there is a redirect. I always specify the --location option just in case. If you want to use curl, try:
curl http://example.com --location --silent >> output.txt
The --silent option turns off the progress indicator.
You could try this:
while true
do
    wget -q -O - http://example.com >> filename   # -O - writes the page to standard output
    sleep 2                                       # wait 2 seconds
done
curl http://URL/ >> output.txt
The >> redirects the output from curl to output.txt, appending to any data already there. (If it were just > output.txt, that would overwrite the contents of output.txt each time you ran it.)
