PHP: Security when using cURL?

I have a page like this: a user writes a URL into a form and submits it. Once the URL is submitted, I fetch that page with cURL and search it for a string. If the string is found, the URL is added to our database. If not, the user gets an error.
I sanitize the URL with htmlspecialchars() and a regex that allows only the characters A-Z, 1-9, and the symbols :/-. I also sanitize the content retrieved from the other website with htmlspecialchars().
My question is: can they enter a URL like
www.evilwebsite.com/shell.exe or shell.txt
Would PHP run it, or simply look at the HTML output? Is it safe as it is, and if not, what should I do?
Thank you.
P.S. allow_url_fopen is disabled; that's why I use cURL.

I don't see why htmlspecialchars() or a regex would be necessary here; you don't need those. Also, there is no way that PHP will "automatically" execute the content retrieved using cURL. So yes, it is safe (unless you do things like eval() with the output).
However, when processing the retrieved content later, be aware that the input is user-provided and needs to be handled accordingly.

curl makes a request to a server and the server sends back data. If there were an executable file on a web server, you'd get back the binary contents of the file. Unless you write the file to your disk and execute it, there should be no problem. Security in that sense should not be an issue.
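To make this concrete, here is a minimal sketch of the workflow from the question: fetch the page body with cURL and search it for a marker string. The URL, the marker string, and the option values are illustrative assumptions, not from the original posts.

<?php
// Fetch the submitted page and look for a marker string.
$url    = 'http://www.example.com/';  // user-submitted URL (placeholder)
$needle = 'my-verification-string';   // string we search for (placeholder)

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // don't hang on slow hosts

$body = curl_exec($ch);
curl_close($ch);

// The response is just a string in $body; PHP never executes it,
// even if the URL pointed at shell.exe.
if ($body !== false && strpos($body, $needle) !== false) {
    // safe to record the URL (use prepared statements when inserting)
    echo "String found\n";
} else {
    echo "String not found\n";
}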

Related

Why does my PHP script create a new long URL?

My PHP script creates, for some reason, a super long new URL.
My original URL looks like this:
http://someserveryoudon'tneedtoknow/index.php
And this is what I get after running the script:
http://someserveryoudon'tneedtoknow/index.php?vorname=and&nachname=andasd&ort=asd&email=asd&sonstiges=+Bitte+nur+ausfuellen+wenn+%27Sonstiges%27+ausgewaehlt+wurde+&sonstiges=&sonstiges=&sonstiges=&sonstiges=&sonstiges=
The script is a form for entering some data into some fields. The odd words in the long URL are German words; they are used in my script as messages shown in some textboxes, and some of them are variables.
Do you know what I can do to make PHP stop this? (At least I guess it's PHP's fault.)
Yanakin
These are GET parameters. You should use POST to avoid this; POST is recommended here anyway.
Reasons why you should use POST
It keeps sensitive data out of the URL
GET parameters can end up stored anywhere. POST doesn't put the parameters in the URL; it sends them in the request body, which can prevent some attacks.
GET /signup.php?username=john&password=johnny1234567890 HTTP/1.1
or
POST /signup.php HTTP/1.1
username=john&password=johnny1234567890
Which looks better? With GET, it's all stored on your computer, in your browsing history, everywhere!
It's shorter
Not everyone wants to see https://example.com/signup.php?username=john&password=johnny1234567890&confidentialstuff=105650970950940 in the URL.
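As an illustration (the field names are taken from the long URL above, the rest is assumed), the fix is to give the form an explicit method="post" and read $_POST on the server; without a method attribute, browsers default to GET and append every field to the URL:

<?php
// Read the submitted values from the request body instead of the URL.
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $vorname  = $_POST['vorname']  ?? '';
    $nachname = $_POST['nachname'] ?? '';
    // ... handle the submission ...
}
?>
<!-- method="post" keeps the fields out of the URL -->
<form action="index.php" method="post">
  <input type="text" name="vorname">
  <input type="text" name="nachname">
  <input type="submit" value="Senden">
</form>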
It looks like you are passing data using GET parameters; that is why your URL contains data like email=asd and so on.

Security of fetching URL content in PHP

I am concerned about the safety of fetching content from an unknown URL in PHP.
We will basically use cURL to fetch the HTML content from a user-provided URL and look for Open Graph meta tags, to show the links as content cards.
Because the URL is provided by the user, I am worried about the possibility of getting malicious code in the process.
I have another question: does curl_exec() actually download the full file to the server? If yes, is it possible that viruses or malware could be downloaded when using cURL?
Using cURL is similar to using fopen() and fread() to fetch content from a file. Safe or not depends on what you're doing with the fetched content.
From your description, your server works as a kind of intermediary that extracts specific subcontent from the fetched HTML. Even if the fetched content contains malicious code, your server never executes it, so no harm will come to your server. Additionally, because your server only extracts specific subcontent (Open Graph meta tags, as you say), everything else in the fetched content is ignored, which means your users are automatically protected.
Thus, in my opinion, there is no need to worry. Of course, this relies on the assumption that the content extraction process is sound; someone should take a look at it and confirm that.
Does curl_exec actually download the full file to the server?
It depends on what you mean by "full file". If you mean "the entire HTML content", then yes. If you mean "including all the CSS and JS files that the fetched HTML content may refer to", then no.
Is it possible that viruses or malware be downloaded when using curl?
Yes. The fetched HTML content may contain malicious code; however, if you don't execute it, no harm will come to you. Again, I'm assuming that your content extraction process is sound.
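As an example of what a sound extraction process could look like (a sketch, assuming $html holds the body returned by curl_exec(); the function name is made up here), parse the HTML with DOMDocument and keep only the Open Graph tags:

<?php
// Extract only the Open Graph meta tags from fetched HTML.
function extractOpenGraph(string $html): array
{
    $tags = [];
    $doc  = new DOMDocument();

    libxml_use_internal_errors(true); // silence warnings from messy real-world HTML
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        $property = $meta->getAttribute('property');
        if (strpos($property, 'og:') === 0) {
            $tags[$property] = $meta->getAttribute('content');
        }
    }
    return $tags; // everything else in the page is ignored
}

// Escape at output time, not at storage time:
// echo htmlspecialchars($tags['og:title'] ?? '', ENT_QUOTES, 'UTF-8');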
Short answer: file_get_contents() is safe for retrieving data, and so is cURL. It is up to you what you do with that data.
A few guidelines:
1. Never run eval() on that data.
2. Don't save it to the database without filtering.
3. You don't even need file_get_contents() or cURL here. Use get_meta_tags() instead:
array get_meta_tags ( string $filename [, bool $use_include_path = false ] )
// Example
$tags = get_meta_tags('http://www.example.com/');
You will get all meta tags parsed and filtered into an array. (Note that get_meta_tags() only returns <meta> tags that have a name attribute, so Open Graph tags, which use property, will not appear.)
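For illustration (the page contents here are hypothetical), the returned array maps lowercased name attributes to content values; note that when given a URL rather than a local file, get_meta_tags() needs allow_url_fopen to be enabled:

// For a page whose <head> contains
//   <meta name="author" content="Jane Doe">
//   <meta name="keywords" content="php, security">
$tags = get_meta_tags('http://www.example.com/');
print_r($tags);
// Array
// (
//     [author] => Jane Doe
//     [keywords] => php, security
// )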
You can use httpclient.class instead of file_get_contents() or cURL, because it connects to the page through a socket. After downloading the data, you can extract the meta data using preg_match().
Expanding on the answer made by Ray Radin.
Tips on precautionary measures
He is correct that if you use a sound process to search the fetched resource, there should be no problem in fetching whatever URL is provided. Some examples are:
Don't store the file in a public-facing directory on your webserver; otherwise you expose yourself to it being executed.
Don't store it in a database; this might lead to a second-order SQL injection attack.
In general, don't store anything from the resource you are requesting; if you have to, use a specific whitelist of what you are searching for.
Check the header information
Even though there is no foolproof way of validating what you are requesting with a specific URL, there are ways to make your life easier and prevent some potential issues. For example, a URL might point to a large binary or a large image file.
Make a HEAD request first to get the header information, then look at the Content-Type and Content-Length headers to see whether the content is a plain-text HTML file (sketched below).
You should, however, not trust these headers, since they can be spoofed. Doing this will nonetheless make sure that even non-malicious content won't crash your script. Requesting image files is presumably something you don't want users to do.
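A hedged sketch of that HEAD-request check with cURL (the size limit and timeout are arbitrary illustrative values):

<?php
// Ask for the headers only, then inspect them before downloading the body.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: no body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_exec($ch);

$type   = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);            // e.g. "text/html; charset=utf-8"
$length = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD); // -1 if unknown
curl_close($ch);

// Both headers can be spoofed, so this is a sanity check, not a guarantee.
$looksLikeHtml = $type !== null && stripos($type, 'text/html') === 0;
$smallEnough   = $length < 0 || $length < 1000000; // cap at ~1 MB

if ($looksLikeHtml && $smallEnough) {
    // proceed with the full GET request
}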
Guzzle
I recommend using Guzzle to do your requests since, in my opinion, it provides some functionality that should make this easier.
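A minimal sketch of the same HEAD check with Guzzle (assumes Guzzle was installed via Composer: composer require guzzlehttp/guzzle):

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;

$client   = new Client(['timeout' => 5]);
$response = $client->request('HEAD', $url);

// Only fetch the body if the headers claim it's HTML.
if (stripos($response->getHeaderLine('Content-Type'), 'text/html') === 0) {
    $html = (string) $client->request('GET', $url)->getBody();
}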
It is safe, but you will need to do a proper data check before using it, as you should with any data input anyway.

Editing and Saving user HTML with JavaScript - how safe is it?

For example, I have a JavaScript-powered form creation tool. You use links to add HTML blocks of elements (like input fields) and TinyMCE to edit the text. These are saved via an autosave function that makes an AJAX call in the background on specific events.
The save function being called handles the database protection, but I'm wondering if a user can manipulate the DOM to add anything he wants (like custom HTML, or an unwanted script).
How safe is this, if at all?
The first thing that comes to mind is that I should probably search for, and remove, any inline JavaScript from the received HTML code.
Using PHP, jQuery, Ajax.
Not safe at all. You can never trust the client. It's easy even for a novice to modify the DOM on the client side (just install Firebug for Firefox, for example).
While it's fine to accept HTML from the client, make sure you validate and sanitize it properly with PHP on the server side.
Are you saving the full inline HTML in your database? If so, try to rework this and only save the necessary data to your backend. All fields should also be checked to ensure they are received in the expected format.
All inline JS is easily removed.
You can never trust the user!
Absolutely unsafe, unless you take the steps to make it safe, of course. Stack Overflow allows certain tags, filtered so that users can't do malicious things. You'll definitely need to do something similar.
I'd opt to sanitize input server-side so that everyone gets their input sanitized, whether they've blocked scripts or not. Using something like http://www.phpclasses.org/package/3746-PHP-Remove-unsafe-tags-and-attributes-from-HTML-code.html or http://grom.zeminvaders.net/html-sanitizer implemented with AJAX would be a pretty good solution.
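Neither of the linked classes is shown here, but as an illustration of the same idea, here is a hedged sketch using HTML Purifier (composer require ezyang/htmlpurifier); the allowed-tag whitelist and the 'html' field name are assumptions:

<?php
require 'vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// Whitelist: everything not listed here is stripped, including <script>
// and inline event handlers like onclick.
$config->set('HTML.Allowed', 'p,b,i,em,strong,ul,ol,li,a[href]');

$purifier = new HTMLPurifier($config);

// $_POST['html'] is the markup sent by the autosave AJAX call (placeholder name).
$clean = $purifier->purify($_POST['html'] ?? '');
// $clean is now safe to store and later redisplay.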

PHP sanitize and check input procedure for simplexml?

I am passing a textarea input box's contents via POST to my PHP file from HTML (no JavaScript allowed).
I then use SimpleXML to get the feed at the URL the user entered.
Unfortunately, the user can enter anything into the textarea. Which I am told is dangerous.
What is the recommended way to clean and secure the POST contents using PHP to get them ready and safe for the simplexml procedure?
(Basically, to be sure they are not malicious, and to check that they are a valid URL.)
Content inside the $_POST array consists of strings, so there's nothing inherently unsafe there.
The user enters PHP code? It surely won't be executed, so no problem there (this, among many others, is a reason not to use things like eval()). Whatever PHP function or command he writes will be read as a simple string, and strings are not harmful, whatever they contain.
The user enters malicious JavaScript? Still no problem, as JavaScript inside PHP, or inside a database for that matter, is pretty useless, since it needs a browser to execute.
This leads to the real issue: user-supplied content needs to be "sanitized" only right before passing it to the target medium. If you're going to feed a database, use the escaping tools provided by your engine. If you're going to output it on a webpage, that's when you need to protect against malicious XSS attacks.
Sanitizing a POST array per se, before actually doing anything with its content, is wrong, as you never know for sure when and where that content will be used; so don't reach for strip_tags() or whatever analogous function comes to mind right after you get the POST value. Pass it on as-is and add the necessary escaping/sanitizing just when needed; see the sketch below.
What you actually need to do, only you know, so act accordingly.
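A hedged sketch of that idea for this case (the 'feed_url' field name and the timeout are placeholders): validate that the POSTed string is a well-formed http(s) URL, fetch it, and only escape when outputting:

<?php
$url = $_POST['feed_url'] ?? '';

// Validate the one thing we actually require: a well-formed http(s) URL.
if (filter_var($url, FILTER_VALIDATE_URL) === false
        || !preg_match('#^https?://#i', $url)) {
    exit('Not a valid URL.');
}

// Fetch with cURL and hand the raw string to SimpleXML.
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$xml = curl_exec($ch);
curl_close($ch);

$feed = ($xml !== false) ? simplexml_load_string($xml) : false;

// Escape only at output time, e.g.:
// echo htmlspecialchars((string) $feed->channel->title, ENT_QUOTES, 'UTF-8');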
Which I am told is dangerous.
That is wrong.
What is the recommended way to clean and secure the POST contents
I am afraid there is nothing to secure.

GET PHP but using # instead of ?=

When using $_GET[] in PHP, it reads a variable from the query string, e.g. ?id=value.
Is there a way to read the #value instead?
No, because the hash part of the URL is client-side only and is not sent to the server.
When you enter a URL such as http://server.com/dir/file.php?a=1#something in your browser's address bar, the browser opens a connection to server.com and then issues the HTTP command GET /dir/file.php?a=1 HTTP/1.1. This is the only data sent to the server.
Hence, the server never gets the #something part, and this means there is no script on server side you could write to read that value.
Similar question explained here: How to get Url Hash (#) from server side
I've been able to work around it by getting the fragment via JavaScript and sending an AJAX request with the fragment as the $_GET contents.
Without knowing your whole case I may be off track, but there is the possibility of sending the #something to the server via a simple GET-type XMLHttpRequest, as sketched below.
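A sketch of that workaround (the file and parameter names are placeholders): JavaScript reads location.hash and re-sends it as an ordinary GET parameter, which PHP can then read:

<?php
// file.php - the fragment arrives as a normal query parameter.
if (isset($_GET['fragment'])) {
    echo 'Fragment was: ' . htmlspecialchars($_GET['fragment']);
    exit;
}
?>
<script>
// The browser never sends the #hash, so read it client-side and re-send it.
var hash = window.location.hash.slice(1); // drop the leading '#'
if (hash) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'file.php?fragment=' + encodeURIComponent(hash));
    xhr.send();
}
</script>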
Yeah, there is a way. I think what you want to do is:
$arValues = array_values($_GET);
// whatever else you want to do with the values
