Strip URLs from a file using PHP

I have a list of URLs I would like to clean back to their root domain (and subdomains). So far I have this, which is fairly simple and works how I need it to:
echo parse_url('https://app.ffgx.com/fdgdfg', PHP_URL_HOST) . '<br />';
echo parse_url('https://www.fgigo.com/fgdfg', PHP_URL_HOST) . '<br />';
However, I have a large list of over 200 URLs, so adding each URL like this would be very time-consuming.
My question is: how can I upload this script to my server and include an upload button that lets me upload a list of URLs in txt or csv format, so that it runs through the list and presents me with all the stripped URLs?

Something like this, with a txt file, where inputfile.txt contains your URLs, one per line:
https://app.ffgx.com/fdgdfg
https://www.ffgx.com/fdgdfg
Then store all the extracted domains in an array, $url_list. What you do with the array data afterwards is up to you.
$url_list = array();
$handle = @fopen("/path/inputfile.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 4096)) !== false) {
        // trim() removes the trailing newline before the URL is parsed
        $url_list[] = parse_url(trim($buffer), PHP_URL_HOST);
    }
    if (!feof($handle)) {
        echo "Error: unexpected fgets() fail\n";
    }
    fclose($handle);
}

If the URLs in your file only start with "https" or "http", you can read the text file and use str_replace() to remove or replace the unwanted parts of each URL. You can then write the modified content back into the file.
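As a rough sketch of the parse_url() approach applied to a whole list, the lines could be cleaned up before the host is extracted. The function name extractHosts() and the sample data are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical helper: turn raw lines (e.g. from file()) into a list of hosts.
function extractHosts($lines)
{
    $hosts = array();
    foreach ($lines as $line) {
        $url = trim($line);           // drop the trailing newline and stray spaces
        if ($url === '') {
            continue;                 // skip blank lines
        }
        $host = parse_url($url, PHP_URL_HOST);
        if (is_string($host) && $host !== '') {
            $hosts[] = $host;         // parse_url() yields null or false when there is no host
        }
    }
    return $hosts;
}

// Sample input standing in for the lines of an uploaded file.
$lines = array("https://app.ffgx.com/fdgdfg\n", "\n", "https://www.fgigo.com/fgdfg\n");
echo implode("\n", extractHosts($lines)), "\n";
```

With an upload form, the lines could come from file($_FILES['urls']['tmp_name']), where 'urls' is an assumed form field name.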

Related

PHP hexadecimal matrix map

I have hexadecimal content saved inside a string, however I would like to get a value that is located at a specific address (ex. 0xD3)
I have the following code to read the .bin file and show its content:
$handle = @fopen($file, "r");
if ($handle) {
while (!feof($handle)) {
$hex = bin2hex(fread ($handle , 16 ));
print $hex."\n";
$input_output .= $hex;
}
fclose($handle);
}
I am looking for a way to map the addresses in order to find the values I want.
Any ideas? Thanks in advance.
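One possible way to do this, assuming the whole file has already been converted with bin2hex() into a single string: each byte takes two hex characters, so the byte at address N starts at position N * 2. The helper name hexByteAt() is made up for this sketch.

```php
<?php
// Hypothetical helper: return the byte at $offset of a hex string as two hex digits,
// or null when the address lies outside the data.
function hexByteAt($hex, $offset)
{
    $pos = $offset * 2;               // two hex characters per byte
    if ($offset < 0 || $pos + 2 > strlen($hex)) {
        return null;
    }
    return substr($hex, $pos, 2);
}

// Sample data standing in for the contents of the .bin file.
$hex = bin2hex("\x00\x10\xAB\xFF");
echo hexByteAt($hex, 0x02), "\n";     // prints "ab"
```

If only a few values are needed, it may be simpler to skip the hex string entirely and read straight from the file with fseek($handle, 0xD3) followed by fread($handle, 1).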

Batch download URLs in PHP?

So, I have a PHP script that is supposed to download images that the user inputs. However, if the user uploads a TXT file and it contains direct links to images, it should download the images from all the URLs in the file. My script seems to be working, although it seems that only the last file is downloaded while the others are stored as files containing no data.
Here's the portion of my script where it parses the TXT
$contents = file($file_tmp);
$parts = new SplFileObject($file_tmp);
foreach($parts as $line) {
$url = $line;
$dir = "{$save_loc}".basename($url);
$fp = fopen ($destination, 'w+');
$raw = file_get_contents($url);
file_put_contents($dir, $raw);
}
How do I make it download every URL from the TXT file?
When you iterate over an SplFileObject, you get the whole line, including whitespace. Your URL will thus be something like
http://example.com/_
(PHP seems to mangle the newline to an underscore), and thus you'll get an error for many URLs. Some URLs will still work fine, since the important information comes before the trailing character: the URL of a question on this site, for instance, still resolves, but https://stackoverflow.com/_ does not. If an error occurs, file_get_contents will return false, and file_put_contents will interpret that like an empty string.
Also, the line $fp = fopen ($destination, 'w+'); is really strange. For one, since $destination is not defined, it would fail anyway. Even if $destination were defined, you'd end up with lots of open file handles and truncate that poor file on every iteration. You can just remove it.
To summarize, your code should look like this:
<?php
$file_tmp = "urls.txt";
$save_loc = "sav/";
$parts = new SplFileObject($file_tmp);
foreach ($parts as $line) {
    // Remove the trailing newline and skip empty lines
    $url = trim($line);
    if (!$url) {
        continue;
    }
    $dir = "{$save_loc}" . basename($url);
    $raw = file_get_contents($url);
    if ($raw === false) {
        echo 'failed to download ' . $url . "\n";
        continue;
    }
    file_put_contents($dir, $raw);
}
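As an alternative to trimming by hand, SplFileObject can be told to drop line endings and skip empty lines via its flags. This is a minimal sketch; the helper name readUrlLines() and the temporary sample file are assumptions for illustration.

```php
<?php
// Hypothetical helper: read the non-empty lines of a file without their line endings.
function readUrlLines($path)
{
    $file = new SplFileObject($path);
    $file->setFlags(
        SplFileObject::READ_AHEAD |
        SplFileObject::SKIP_EMPTY |
        SplFileObject::DROP_NEW_LINE
    );
    $urls = array();
    foreach ($file as $line) {
        $line = trim($line);          // also catches a stray "\r" or spaces
        if ($line !== '') {
            $urls[] = $line;
        }
    }
    return $urls;
}

// Demo with a temporary file standing in for the uploaded TXT.
$tmp = tempnam(sys_get_temp_dir(), 'urls');
file_put_contents($tmp, "http://example.com/a.png\n\nhttp://example.com/b.png\n");
print_r(readUrlLines($tmp));
unlink($tmp);
```

Each returned entry can then be passed to file_get_contents() as in the loop above.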
It looks like the line
$parts = new SplFileObject($file_tmp);
isn't necessary, and neither is
$fp = fopen ($destination, 'w+');
The file() function reads the entire file into an array. You just have to call trim() on each array element to remove the newline characters. The following code should work properly:
<?php
$save_loc = './';
$urls = file('input.txt');
foreach ($urls as $url) {
    $url = trim($url);
    $destination = $save_loc . basename($url);
    $content = file_get_contents($url);
    if ($content) {
        file_put_contents($destination, $content);
    }
}

simplexml_load_file odd behavior

I have a file (sites.txt) that has two entries:
http://www.url1.com/test1.xml
http://www.url2.com/test2
Whenever I execute the below PHP code, the 'url1.com' returns false, and the 'url2.com' is loaded into $xml. The odd part is that if I interchange the URLs in the file, i.e.
http://www.url2.com/test2
http://www.url1.com/test1.xml
It loads both. Both URLs are valid XML documents. Why does the order matter here?
Code:
if (file_exists('sites.txt')) {
$file_handle = fopen("sites.txt", "r");
while (!feof($file_handle)) {
$site = fgets($file_handle);
$xml[] = simplexml_load_file($site);
}
fclose($file_handle);
}
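Try changing your text file to a CSV, then explode the contents of the file on the delimiter:
http://www.url1.com/test1.xml,
http://www.url2.com/test2
$contents = file_get_contents("sites.txt");
$files = array_map('trim', explode(",", $contents));
Note that fopen() only returns a file handle, so the contents have to be read (here with file_get_contents()) before they can be exploded; trim() removes the line break left after each comma. It sounds like there are some other things going on in addition to this, but you may have those sorted out...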
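A likely cause, though not confirmed in the question, is that fgets() keeps the trailing newline, so every URL except the last line of the file is passed to simplexml_load_file() with a line break attached. A minimal sketch that keeps the one-URL-per-line format and strips the line endings; the helper name readSiteList() and the temporary sample file are made up for illustration.

```php
<?php
// Hypothetical helper: read one URL per line, without line endings or blank lines.
function readSiteList($path)
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return array();               // file could not be read
    }
    // trim() also removes a stray "\r" left by Windows line endings
    $lines = array_map('trim', $lines);
    return array_values(array_filter($lines, 'strlen'));
}

// Demo with a temporary file standing in for sites.txt.
$tmp = tempnam(sys_get_temp_dir(), 'sites');
file_put_contents($tmp, "http://www.url1.com/test1.xml\nhttp://www.url2.com/test2\n");
var_export(readSiteList($tmp));
echo "\n";
unlink($tmp);
```

Each entry of the returned array can then be passed to simplexml_load_file(), checking the result against false before using it.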

Parsing and Writing Files in PHP

I'm attempting to open a directory full of text files, and then read each file line-by-line, writing the information in each line to a new file. Within each text file in the directory I'm trying to iterate, the information is formed like:
JunkInfo/UserName_ID_Date_Location.Type
So I want to open every one of those text files and write a line to my new file in the form of:
UserName,ID,Date,Location,Type
Here's the code I've come up with so far:
<?php
$my_file = 'info.txt';
$writeFile = fopen($my_file, 'w') or die('Cannot open file: '.$my_file); //implicitly creates file
$files = scandir('/../DirectoryToScan');
foreach($files as $file)
{
$handle = @fopen($file, "r");
if ($handle)
{
while (($buffer = fgets($handle, 4096)) !== false)
{
$data = explode("_", $buffer);
$username = explode("/", $data[0])[1];
$location = explode(".", $data[3])[0];
$type = explode(".", $data[3])[1];
$stringToWrite = $username . "," . $data[1] . "," . $data[2] . "," . $location . "," . $type;
fwrite($writeFile, $stringToWrite);
}
if (!feof($handle))
{
echo "Error: unexpected fgets() fail\n";
}
fclose($handle);
}
}
fclose($writeFile);
?>
So my problem is, this doesn't seem to work. I just never get anything happening -- the output file is never written and I'm not sure why.
There is one potential issue with the scandir() line:
$files = scandir('/../DirectoryToScan');
The path begins with a /, which means that it is resolved from the root of the filesystem. So the directory it's trying to read is /DirectoryToScan. To fix it, you can just remove the leading /. Of course, this could be a sample path for this example that doesn't reflect your real setup, or maybe you really do have a directory with that name in the root of your system; in those cases, feel free to ignore this bit.
The next thing is how you're using fopen() on the files you're iterating through. scandir() returns only the name of each file, not the full path. You'll need to concatenate the directory name and the file name each time:
$dir = '../DirectoryToScan/';
$files = scandir($dir);
foreach ($files as $file) {
    $handle = @fopen($dir . $file, "r");
I'm currently running an older version of PHP, so directly-accessing array indexes from return-functions, such as with explode("/", $data[0])[1], doesn't work for me (it was added in PHP 5.4).
Other than that, the rest of your code looks like it should work fine (minus any potential logic/data errors that I may have overlooked).
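For the line parsing itself, a single regular expression is one way to avoid the chained explode() calls and the array dereferencing. This is only a sketch: the helper name toCsvLine() is made up, and the pattern assumes exactly one slash before the user name and no underscores inside the individual fields.

```php
<?php
// Hypothetical helper: convert "JunkInfo/UserName_ID_Date_Location.Type"
// into "UserName,ID,Date,Location,Type"; returns null if the line does not match.
function toCsvLine($line)
{
    $line = trim($line);              // drop the newline that fgets() keeps
    $pattern = '~^[^/]*/([^_]+)_([^_]+)_([^_]+)_([^_.]+)\.(.+)$~';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return implode(',', array_slice($m, 1));  // the five captured fields
}

echo toCsvLine("JunkInfo/UserName_ID_Date_Location.Type\n"), "\n";
```

When writing the result, appending PHP_EOL in the fwrite() call keeps each record on its own line, and skipping the '.' and '..' entries returned by scandir() avoids trying to open directories.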

PHP include external file and make an Array

I want to include an external file, and then get all the content, remove the only HTML on the page <br /> and replace with , and fire it into an array.
datafeed.php
john_23<br />
john_5<br />
john_23<br />
john_5<br />
grabber.php
<?php
// grab the url
include("http://site.com/datafeed.php");
//$lines = file('http://site.com/datafeed.php);
// loop through array, show HTML source as HTML source.
foreach ($lines as $line_num => $line) {
// replace the <br /> with a ,
$removeBreak = str_replace("<br />",",", $removeBreak);
$removeBreak = htmlspecialchars($line);
// echo $removeBreak;
}
// fill our array into string called SaleList
$SaleList = ->array("");
I want to load a php file from the server directory, get the HTML contents of this file and place it into a useable array.
It would look like
$SaleList = -getAndCreateArray from the file above >array("");
$SaleList = ->array("john_23, john_5, john_23");
Here is a working version of grabber.php:
<?php
function getSaleList($file) {
    $saleList = array();
    $handle = fopen($file, 'r');
    if (!$handle) {
        throw new RuntimeException('Unable to open ' . $file);
    }
    while (($line = fgets($handle)) !== false) {
        $matches = array();
        if (preg_match('/^(.*?)(\s*\<br\s*\/?\>\s*)?$/i', $line, $matches)) {
            $line = $matches[1];
        }
        array_push($saleList, htmlspecialchars($line));
    }
    if (!feof($handle)) {
        throw new RuntimeException('unexpected fgets() fail on file ' . $file);
    }
    fclose($handle);
    return $saleList;
}
$saleList = getSaleList('datafeed.php');
print_r($saleList);
?>
By using a regular expression to find the <br />, the code is able to deal with many variations such as <br>, <BR>, <BR />, <br/>, etc.
The output is:
Array
(
[0] => john_23
[1] => john_5
[2] => john_23
[3] => john_5
)
You don't seem to have grasped what include does.
If you want to process the contents of some file using your PHP code, then include is the wrong construct; you should be using file() or file_get_contents(),
i.e. the line of code you've commented out in your question.
Even where include is the right construct to use, you should never, ever include remote files directly: it's much, MUCH slower than a local filesystem read, and very insecure. There are times when it does make sense to fetch the file from a remote location and cache it locally, though.
And you should NEVER have inline HTML or inline PHP code in an include file (HTML inside PHP variables or conditional expressions, and PHP defines, classes, functions, includes and requires are OK).
Maybe you need something like this?
$file = file_get_contents('newfile.php');
echo str_replace("<br>", ",", $file);
But I didn't understand what you're trying to insert into the array...
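If the goal is an array rather than a comma-separated string, one option is to split the text on the line breaks directly. This is a sketch under that assumption; the helper name splitOnBreaks() and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: split HTML text on <br> variants and
// return the trimmed, non-empty pieces as an array.
function splitOnBreaks($html)
{
    $parts = preg_split('~<br\s*/?>~i', $html);  // matches <br>, <br/>, <BR />, ...
    $parts = array_map('trim', $parts);
    return array_values(array_filter($parts, 'strlen'));
}

// Sample text standing in for the contents of datafeed.php.
$html = "john_23<br />\njohn_5<br />\njohn_23<br />\njohn_5<br />\n";
print_r(splitOnBreaks($html));
```

The input string could come from file_get_contents(), as in the snippet above.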
