PHP string of IP addresses in a .txt file?

I'm writing a feature for an admin panel that blocks IP addresses at the Apache level. The file is called blacklist.txt and looks like 10.0.0.1,10.0.0.2,10.0.0.3, ... all on a single line, with each IP address separated by a comma. After reading What is the best way to write a large file to disk in PHP?, I am still unsure of the best practices on the matter.
Here's what I want to do: if an administrator presses the 'ban hammer', the file is read looking for strpos($file, $ip); if the IP is not found, it is appended to the end of the file and the .htaccess file blocks it accordingly.
Question: is a .txt file suitable for this potentially large amount of data? I do not want to execute a database query to check if someone is banned every time a page is requested.
EDIT:
The purpose is to block single IP addresses that have 10 failed login attempts in the past 12 hours. I would think that the 'recover my password' feature would prevent a normal client from doing this.

Question: is a .txt file suitable for this potentially large amount of data?
No, it is not. A database with proper indexing is.

First, for reading your file in CSV format, there are many approaches. For example:
// Parse every line of the file as one CSV row
$rows = array_map('str_getcsv', file('myfile.csv'));
// Treat the first row as the column names
$header = array_shift($rows);
$csv = array();
foreach ($rows as $row) {
    // Re-key each row by the header columns
    $csv[] = array_combine($header, $row);
}
src: http://steindom.com/articles/shortest-php-code-convert-csv-associative-array
To check the list on each page load while minimizing reads of that file, you can use a memory cache, something like memcache, and then search the array for the incoming IP. Note: a memory cache is faster than a database query.
PHP shared memory: http://www.php.net/manual/en/book.shmop.php
memcache: php.net/memcache
Array search: php.net/in_array
Also, to return the key if the value is found: php.net/array_search
Note: in a 1 MB file you can store ~65K IPs, assuming each IP takes the longest form "255.255.255.255," (16 bytes).
It's even better if you make the IP the key of the array; then, instead of searching the array for that IP, you can check whether the key exists with php.net/array_key_exists.
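To make that concrete, here is a minimal sketch combining the two ideas above. The filename blacklist.txt comes from the question; the local memcached server, the cache key, and the 5-minute TTL are assumptions for illustration, not part of the original answer.
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$banned = $memcached->get('banned_ips');
if ($banned === false) {
    // Cache miss: read the one-line file once and index it by IP for O(1) lookups
    $ips = explode(',', rtrim(file_get_contents('blacklist.txt'), ", \n"));
    $banned = array_fill_keys($ips, true);
    $memcached->set('banned_ips', $banned, 300); // refresh from disk every 5 minutes
}

if (array_key_exists($_SERVER['REMOTE_ADDR'], $banned)) {
    http_response_code(403);
    exit('Banned');
}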

Related

File truncated when reading

I am writing some JSON results to files in PHP on shared hosting (fwrite).
Then I read those files to extract the JSON results (file_get_contents).
Sometimes (maybe one time in more than a thousand) the file appears truncated when I read it: I can only read a multiple of 32768 bytes of the file.
I added some code to copy the file I am reading whenever the JSON string is not valid, and I then get two different files: the original one was correctly written, as it contains a valid JSON string, and the copy contains only the beginning of the original, with a size of x*32768 bytes.
Would you have any idea of what the problem could be and how to solve it? (I don't know how to investigate further.)
Thank you
Without example code it is impossible to give a 'fix my code' answer, but when doing this sort of file write/read programming, you should follow a simple process (which, from the description, is missing one fairly critical step!).
First, write to a TEMP file (you are writing to a file, but it is important here to write to a TEMP file; otherwise, you could have race conditions).
An easy way to do that in PHP:
$yourData = "whateverYourDataIs....";
$goodfilename = 'whateverYourGoodFileNameIsForYourData.json';
$tempfilename = 'tempfile' . time(); // MANY ways to do this (lots of SO posts on it) - just get a unique name every time you write. ('Unique' may not be needed if you only occasionally do a write, but it is a good safety measure to avoid collisions, and time() works for many programs.)
// Now, write your data to $tempfilename.
$written = file_put_contents($tempfilename, $yourData);
if ($written === false) {
    // The write failed, so do whatever 'error' handling you may need.
    // Since it failed there should be no file, but it's not a bad idea to attempt to delete it.
    unlink($tempfilename);
}
else {
    // The write succeeded, so let's do a 'sanity check' on the file to make sure it is good JSON (a 'paranoid' check, but "better safe than sorry", right?).
    if (json_decode(file_get_contents($tempfilename)) !== null) {
        // We know the file is good JSON, so now RENAME (this is really fast, so collisions are almost impossible). NOTE: see the comments on http://php.net/manual/en/function.rename.php for some potential challenges and workarounds if you have trouble with rename.
        rename($tempfilename, $goodfilename);
    }
    // Now the GOOD file will contain your new data - and those read issues are gone! (Though never say 'never' - it may be possible, but very unlikely!)
}
This may or may not be your issue directly, and you will have to adapt it to fit your code, but as a safety factor - and a good way to avoid collisions - it should give you ~100% read success, which I believe is what you are after!
If this doesn't help, then some direct code will be needed to provide a more complete answer.
As suggested by @UlrichEckhardt's comment, it was due to a read/write concurrency problem: I was trying to read a file that was being written. I solved this by simply waiting before trying to read the file again.
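As a hedged alternative to blind waiting (not from the original thread), the reader and writer could also cooperate through advisory locks with flock(); the filename data.json and the payload here are placeholders.
// Writer side: take an exclusive lock while writing (data.json is a hypothetical name)
$json = json_encode(array('example' => true)); // your JSON payload
$fh = fopen('data.json', 'c');
if (flock($fh, LOCK_EX)) {
    ftruncate($fh, 0);  // empty the file only while holding the lock
    fwrite($fh, $json);
    fflush($fh);        // push the data out before releasing the lock
    flock($fh, LOCK_UN);
}
fclose($fh);

// Reader side: a shared lock makes the read wait for any in-progress write
$fh = fopen('data.json', 'r');
if (flock($fh, LOCK_SH)) {
    $json = stream_get_contents($fh);
    flock($fh, LOCK_UN);
}
fclose($fh);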

How to avoid storing 1 million elements in an array in PHP

I'm parsing a 1,000,000-line CSV file in PHP to recover this data: IP address, DNS name, and cipher suites used.
In order to know whether some DNS names (having several mail servers) have different cipher suites in use on their servers, I have to store in an array an object containing the DNS name, a list of the IP addresses of its servers, and a list of the cipher suites it uses. At the end I have an array of 1,000,000 elements. To count the DNS names having different cipher suite configurations on their servers, I do:
$res = 0;
foreach ($this->allDNS as $dnsObject) {
    if (count($dnsObject->getCiphers()) > 1) { // if it has several different configs
        $res++;
    }
}
return $res;
Problem: this consumes too much memory; I can't run my code on the 1,000,000-line CSV (if I don't store the data in an array, I parse the CSV file in 20 seconds...). Is there a way around this problem?
NB: I already put
ini_set('memory_limit', '-1');
but this line just bypasses the memory error.
Saving all of that CSV data will definitely take its toll on memory.
One logical solution to your problem is to have a database store all of that data.
You may refer to this link for a tutorial on parsing your CSV file and storing it in a database.
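As an illustration of that idea (not from the original answer), here is a minimal PDO sketch; the table layout, connection details, and CSV column order are all assumptions.
// Assumes a table: CREATE TABLE scan (dns VARCHAR(255), ip VARCHAR(45), cipher VARCHAR(64));
$pdo = new PDO('mysql:host=localhost;dbname=scans', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO scan (dns, ip, cipher) VALUES (?, ?, ?)');

$fh = fopen('data.csv', 'r');
fgetcsv($fh); // skip the header row
while (($row = fgetcsv($fh)) !== false) {
    // Assumes the CSV columns are IP, DNS, cipher suite in that order
    $stmt->execute(array($row[1], $row[0], $row[2]));
}
fclose($fh);

// The original question then reduces to a single query:
// SELECT COUNT(*) FROM (
//     SELECT dns FROM scan GROUP BY dns HAVING COUNT(DISTINCT cipher) > 1
// ) AS t;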
Write the processed data (for each line separately) into one file (or a database):
file_put_contents('data.txt', $parsingresult, FILE_APPEND);
FILE_APPEND will append $parsingresult to the end of the file's content.
Then you can access the processed data with file_get_contents() or file().
Anyway, I think using a database and some pre-processing would be the best solution if this is needed more often.
You can use fgetcsv() to read and parse the CSV file one line at a time. Keep the data you need and discard the line:
// Store the useful data here
$data = array();
// Open the CSV file
$fh = fopen('data.csv', 'r');
// The first line probably contains the column names
$header = fgetcsv($fh);
// Read and parse one data line at a time
while ($row = fgetcsv($fh)) {
// Get the desired columns from $row
// Use $header if the order or number of columns is not known in advance
// Store the gathered info into $data
}
// Close the CSV file
fclose($fh);
This way it uses the minimum amount of memory needed to parse the CSV file.
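Building on that pattern, here is a hedged sketch of how the original cipher-suite question could be answered while streaming, keeping only one small set per DNS name instead of an object per row; the column positions are assumptions.
$ciphersByDns = array(); // dns => set of distinct cipher suites seen so far
$fh = fopen('data.csv', 'r');
fgetcsv($fh); // skip the header row
while (($row = fgetcsv($fh)) !== false) {
    // Assumed layout: $row[0] = IP address, $row[1] = DNS name, $row[2] = cipher suite
    $ciphersByDns[$row[1]][$row[2]] = true;
}
fclose($fh);

$res = 0;
foreach ($ciphersByDns as $ciphers) {
    if (count($ciphers) > 1) { // this DNS name has more than one cipher suite config
        $res++;
    }
}
echo $res;
Memory now scales with the number of distinct DNS names and cipher suites, not with the number of CSV rows.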

Parse CSV content

This must be relatively easy, but I'm struggling to find a solution. I receive data using a proprietary network protocol with encryption, and at the end the entire received content ends up in a variable. The content is actually that of a CSV file, and I need to parse this data.
If this were a regular file on disk, I could use fgetcsv; if I could somehow break the content into individual records, I could use str_getcsv. But how can I break this content into records? Simply reading until a newline will not work, because CSV can contain values with line breaks in them. Below is an example set of data:
ID,SLN,Name,Address,Contract no
123,102,Market 1a,"Main street, Watertown, MA, 02471",16
125,97,Sinthetics,"Another address,
Line 2
City, NY 10001",16
167,105,"Progress, ahead",,18
All of this data is held inside one variable - and I need to parse it.
Of course, I could always write this data into a temporary file on disk and then read/parse it using fgetcsv, but that seems extremely inefficient to me.
If fgetcsv works for you, consider this:
// Note: each php://temp handle is an independent stream, so the write and
// the read must happen on the same handle.
$stream = fopen("php://temp", "r+");
fwrite($stream, $your_data_here);
rewind($stream);
// $result = fgetcsv($stream); ...
For more on php://temp, see the php:// wrapper
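Continuing from the $stream above, consumption might look like this; $records is just a hypothetical accumulator.
$records = array();
while (($row = fgetcsv($stream)) !== false) {
    $records[] = $row; // fgetcsv correctly handles quoted fields with embedded newlines
}
fclose($stream);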

PHP invalidating a CSV file

Hey guys, I've seen a lot of options on fread (which requires a file, or writing to memory), but I am trying to invalidate an input based on a string that has already been accepted (unknown format). I have something like this:
if (FALSE !== str_getcsv($this->_contents, "\n"))
{
    foreach (preg_split("/\n/", $this->_contents) as $line)
    {
        $data[] = explode(',', $line);
    }
    print_r($data); die;
    $this->_format = 'csv';
    $this->_contents = $this->trimContents($data);
    return true;
}
This works fine on a real CSV or a CSV-filled variable, but when I try to pass it garbage to invalidate it, something like https://www.gravatar.com/avatar/625a713bbbbdac8bea64bb8c2a9be0a4 (which is garbage, since it's a PNG), it believes it's CSV anyway and keeps chugging along until the program chokes. How can I fix this? I have not seen any CSV validators that are not at least several classes deep; is there a simple three- or four-liner to (in)validate?
is there a simple three- or four-liner to (in)validate?
Nope. CSV is so loosely defined - it has no telltale signs like header bytes, and there isn't even a standard for which character is used to separate columns! - that there is technically no way to tell whether a file is CSV or not; even your PNG could technically be a gigantic one-column CSV with some esoteric field and line separators.
For validation, look at what purpose you are using the CSV files for and what input you are expecting. Are the files going to contain address data, separated into, say, 10 columns? Then look at the first line of the file, and see whether enough columns exist, and whether they contain alphanumeric data. Are you looking for a CSV file full of numbers? Then parse the first line, and look for the kinds of values you need. And so on...
If you have an idea of the kinds of CSVs likely to make it to your system, you could apply some heuristics -- at the risk of not accepting valid CSVs. For instance, you could look at line length, consistency of line length, special characters, etc...
If all you are doing is checking for the presence of commas and newlines, then any sufficiently large random file will likely contain both and thus pass such a CSV test.
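To sketch the heuristic approach suggested above (thresholds and the comma separator are assumptions, and note that this line-based check would wrongly reject valid CSVs with newlines inside quoted fields):
function looksLikeCsv($contents, $minCols = 2, $sampleLines = 5)
{
    $lines = preg_split("/\r\n|\n/", trim($contents));
    $expected = count(str_getcsv($lines[0]));
    if ($expected < $minCols) {
        return false; // too few columns to plausibly be our kind of CSV
    }
    // Every sampled line should parse to the same column count as the first
    foreach (array_slice($lines, 1, $sampleLines) as $line) {
        if (count(str_getcsv($line)) !== $expected) {
            return false;
        }
    }
    return true;
}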

Modify a line with "KEY - AMOUNT" of a file in PHP

This has been bugging me for ages now, but I can't figure it out.
Basically I'm using a hit counter which stores unique IP addresses in a file. But what I'm trying to do is get it to count how many hits each IP address has made.
So instead of the file reading:
222.111.111.111
222.111.111.112
222.111.111.113
I want it to read:
222.111.111.111 - 5
222.111.111.112 - 9
222.111.111.113 - 41
This is the code I'm using:
$file = "stats.php";
$ip_list = file($file);
$visitors = count($ip_list);
if (!in_array($_SERVER['REMOTE_ADDR'] . "\n", $ip_list))
{
    $fp = fopen($file, "a");
    fwrite($fp, $_SERVER['REMOTE_ADDR'] . "\n");
    fclose($fp);
    $visitors++;
}
What I was trying to do is change it to:
if (!in_array($_SERVER['REMOTE_ADDR'] . " - [ANY NUMBER] \n", $ip_list))
{
    $fp = fopen($file, "a");
    fwrite($fp, $_SERVER['REMOTE_ADDR'] . " - 1 \n");
    fclose($fp);
    $visitors++;
}
else if (in_array($_SERVER['REMOTE_ADDR'] . " - [ANY NUMBER] \n", $ip_list))
{
    // CHANGE [ANY NUMBER] TO [ANY NUMBER]+1
}
I think I can figure out the final adding part, but how do I represent the [ANY NUMBER] part so that it finds the IP whatever the following number is?
I realise I'm probably going about this all wrong, but if someone could give me a clue I'd really appreciate it.
Thanks.
This is a bad idea; don't do it this way.
It's normal to store website statistics in the filesystem, but not with pre-aggregation applied to them.
If you're going to use the filesystem, then do post-aggregation on the data; otherwise, use a database.
What you are doing is a very bad idea
But let's first answer the actual question you are asking.
To do that, you will have to first process the file into some kind of data structure that allows it. I'd personally recommend an array in the form IP => AMOUNT.
For example (untested code):
$fd = file($file);
$ip_list = array();
foreach ($fd as $line) {
    list($ip, $amount) = explode("-", $line);
    $ip_list[$ip] = $amount;
}
Note that the code is not perfect: it would leave a space at the end of $ip and another in front of $amount due to the nature of your original data. But it works well enough to point you in the right direction. A more "accurate" solution would involve regular expressions or modifying the original data source to a more convenient format.
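For completeness, a hedged sketch of the full read-increment-write cycle built on that IP => AMOUNT idea (untested, and it inherits all the performance caveats below; the ' - ' separator matches the question's format):
$file = 'stats.php';
$ip_list = array();
foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    list($ip, $amount) = explode(' - ', $line);
    $ip_list[trim($ip)] = (int) $amount;
}

// Increment this visitor's count (starts at 1 if the IP is new)
$visitor = $_SERVER['REMOTE_ADDR'];
$ip_list[$visitor] = isset($ip_list[$visitor]) ? $ip_list[$visitor] + 1 : 1;

// Rewrite the whole file in "IP - AMOUNT" format
$out = '';
foreach ($ip_list as $ip => $amount) {
    $out .= $ip . ' - ' . $amount . "\n";
}
file_put_contents($file, $out);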
Now the real answer to your actual problem
Your process will quickly become a performance bottleneck, as you would have to open that file, process it, and write it all back again afterwards (not sure if you can do in-line editing of an open file) for every request.
As you are trying to do some kind of per-IP hit count, there are a lot of better solutions to your problem:
Use an existing solution for it (like piwik)
Use an actual database for your data
Keep your file simple with just a list of IPs and post-process it off-line periodically to make it be the format you want
You can avoid writing that file altogether if you have access to your webserver's logs (and they are setup to log every request with the originating IP) and you can post-process that file instead
in_array() simply does a basic string match. It will NOT look for substrings. Ignoring how bad an idea it is to use a flat file for data storage, what you want is preg_grep, which allows you to use regexes:
$ip_list = file('ips.txt');
$matches = preg_grep('/^\d+\.\d+\.\d+\.\d+ - \d+$/', $ip_list);
Of course, this is a very basic and very broken IP address match. Note that preg_grep preserves the keys of the original array, so you can locate the matched lines in $ip_list, but you would still have to parse out the count and rewrite the line yourself to actually CHANGE the value.
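If you do stay with the flat file, here is a hedged sketch of finding and incrementing one IP's line with a capture group (building on $ip_list above; the \s* allows for the trailing space in the question's " - 1 \n" format):
$quoted = preg_quote($_SERVER['REMOTE_ADDR'], '/');
foreach ($ip_list as $i => $line) {
    if (preg_match('/^' . $quoted . ' - (\d+)\s*$/', $line, $m)) {
        // Rewrite this line with the incremented count
        $ip_list[$i] = $_SERVER['REMOTE_ADDR'] . ' - ' . ($m[1] + 1) . "\n";
        break;
    }
}
file_put_contents('ips.txt', implode('', $ip_list));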
