This has been bugging me for ages now but I can't figure it out.
Basically I'm using a hit counter which stores unique IP addresses in a file. But what I'm trying to do is get it to count how many hits each IP address has made.
So instead of the file reading:
222.111.111.111
222.111.111.112
222.111.111.113
I want it to read:
222.111.111.111 - 5
222.111.111.112 - 9
222.111.111.113 - 41
This is the code I'm using:
$file = "stats.php";
$ip_list = file($file);
$visitors = count($ip_list);
if (!in_array($_SERVER['REMOTE_ADDR'] . "\n", $ip_list))
{
$fp = fopen($file,"a");
fwrite($fp, $_SERVER['REMOTE_ADDR'] . "\n");
fclose($fp);
$visitors++;
}
What I was trying to do is change it to:
if (!in_array($_SERVER['REMOTE_ADDR'] . " - [ANY NUMBER] \n", $ip_list))
{
$fp = fopen($file,"a");
fwrite($fp, $_SERVER['REMOTE_ADDR'] . " - 1 \n");
fclose($fp);
$visitors++;
}
else if (in_array($_SERVER['REMOTE_ADDR'] . " - [ANY NUMBER] \n", $ip_list))
{
CHANGE [ANY NUMBER] TO [ANY NUMBER]+1
}
I think I can figure out the last adding part, but how do I represent the [ANY NUMBER] part so that it finds the IP whatever the following number is?
I realise I'm probably going about this all wrong, but if someone could give me a clue I'd really appreciate it.
Thanks.
This is a bad idea; don't do it this way.
It's normal to store website statistics in the filesystem, but not with pre-aggregation applied to them.
If you are going to use the filesystem, then do post-aggregation on the data; otherwise use a database.
What you are doing is a very bad idea.
But let's first answer the actual question you are asking.
To be able to do that, you will have to process the file into some kind of data structure that allows it. I'd personally recommend an array in the form of IP => AMOUNT.
For example (untested code):
$fd = file($file);
$ip_list = array();
foreach ($fd as $line) {
list($ip, $amount) = explode("-", $line);
$ip_list[$ip] = $amount;
}
Note that the code is not perfect, as it would leave a space at the end of $ip and another in front of $amount due to the nature of your original data. But it works well enough to point you in the right direction. A more "accurate" solution would involve regular expressions or modifying the original data source to a more convenient format.
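To actually answer the "how do I increment it" part: a rough, untested sketch of the full read, increment and write-back cycle, assuming the "IP - COUNT" file format you described:
$file = "stats.php";
$ip_list = array();

// read the current counts into an array of IP => AMOUNT
foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    list($ip, $amount) = explode("-", $line);
    $ip_list[trim($ip)] = (int) trim($amount);
}

// bump the counter for the current visitor (or add them with a count of 1)
$me = $_SERVER['REMOTE_ADDR'];
$ip_list[$me] = isset($ip_list[$me]) ? $ip_list[$me] + 1 : 1;

// rebuild the file in the "IP - COUNT" format and overwrite it
$out = '';
foreach ($ip_list as $ip => $amount) {
    $out .= $ip . " - " . $amount . "\n";
}
file_put_contents($file, $out);

$visitors = count($ip_list); // number of unique IPs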
Now the real answer to your actual problem
Your process will quickly become a performance bottleneck, as you would have to open that file, process it, and write it all back again afterwards (I'm not sure you can edit an open file in place) for every single request.
As you are trying to do some kind of per-IP hit count, there are a lot of better solutions to your problem:
Use an existing solution for it (like Piwik)
Use an actual database for your data
Keep your file simple, with just a list of IPs, and post-process it off-line periodically into the format you want (see the sketch after this list)
You can avoid writing that file altogether if you have access to your webserver's logs (and they are set up to log every request with the originating IP); you can post-process that file instead
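For that post-processing option, the off-line aggregation can be as simple as array_count_values(); a rough sketch, assuming stats.php keeps one bare IP per line as in your current code:
// off-line post-aggregation: count how often each IP appears in the raw log
$ips    = file('stats.php', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$counts = array_count_values($ips);   // IP => number of hits
arsort($counts);                      // most active IPs first

foreach ($counts as $ip => $hits) {
    echo $ip . " - " . $hits . "\n";
}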
in_array() simply does a basic string match. It will NOT look for substrings. Ignoring how bad an idea it is to use a flat file for data storage, what you want is preg_grep(), which allows you to use regexes:
$ip_list = file('ips.txt');
$matches = preg_grep('/^\d+\.\d+\.\d+\.\d+ - \d+$/', $ip_list);
Of course, this is a very basic and very loose IP address match, and by itself it will not CHANGE the value in $ip_list. Note, though, that preg_grep() preserves the original array keys, so you can use those to locate and rewrite the matched lines.
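For what it's worth, a rough (untested) sketch of using those preserved keys to bump a counter, assuming each line is exactly "IP - COUNT":
$ip_list = file('ips.txt', FILE_IGNORE_NEW_LINES);
$quoted  = preg_quote($_SERVER['REMOTE_ADDR'], '/');

$matches = preg_grep('/^' . $quoted . ' - \d+$/', $ip_list); // original keys are kept

if ($matches) {
    $key   = key($matches);                                   // index of the matched line
    $count = (int) substr(strrchr($matches[$key], '-'), 1);   // current count after the dash
    $ip_list[$key] = $_SERVER['REMOTE_ADDR'] . ' - ' . ($count + 1);
} else {
    $ip_list[] = $_SERVER['REMOTE_ADDR'] . ' - 1';
}

file_put_contents('ips.txt', implode("\n", $ip_list) . "\n");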
I've made this script to extract data from a CSV file.
$url = 'https://flux.netaffiliation.com/feed.php?maff=3E9867FCP3CB0566CA125F7935102835L51118FV4';
$data = array_map(function($line) { return str_getcsv($line, '|'); }, file($url));
It's working exactly as I want, but I've just been told that it's not the proper way to do it and that I really should use fgetcsv instead.
Is that right? I've tried many ways to do it with fgetcsv but didn't manage to get anything close.
Here is an example of what I would like to get as an output:
$data[4298][0] = 889698467841
$data[4298][1] = Figurine Funko Pop! - N° 790 - Disney : Mighty Ducks - Coach Bombay
$data[4298][2] = 108740
$data[4298][3] = 14.99
First of all, there is no ONE proper way to do things in programming. It is up to you and depends on your use case.
I just downloaded the CSV file and it is about 20MB. With your solution you download the whole file at once. If you have no memory restrictions and you do not need to give fast feedback to the caller (that is, if the delay of downloading the whole file does not matter), your solution is the better one when you want to guarantee that the whole content gets processed. You read all the content at once, and the further processing does not depend on other factors like your Internet connection.
If you want to use fgetcsv, you would read from the URL line by line, sequentially. Your connection has to stay open until each line has been processed. You do not need a big memory allocation, but it would take longer to process the whole content.
Both methods have their pros and cons. You should know what your goal is and how often you will run this script. Consider your use case and decide which method is best for you.
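For completeness, a streaming version with fgetcsv() could look roughly like this (untested; it assumes the feed stays reachable for the whole run):
$url  = 'https://flux.netaffiliation.com/feed.php?maff=3E9867FCP3CB0566CA125F7935102835L51118FV4';
$data = [];

$handle = fopen($url, 'r');
if ($handle !== false) {
    // read one pipe-separated record at a time instead of loading
    // the whole ~20MB feed into memory first
    while (($row = fgetcsv($handle, 0, '|')) !== false) {
        $data[] = $row;
    }
    fclose($handle);
}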
Here is the same result without array_map():
$url = 'https://flux.netaffiliation.com/feed.php?maff=3E9867FCP3CB0566CA125F7935102835L51118FV4';
$lines = file($url);
$data = [];
foreach($lines as $line)
{
$data[] = str_getcsv(trim($line), '|');
//optionally:
//$data[] = explode('|',trim($line));
}
$lines = null;
First of all, I'm aware of the vast, vast keyspace of Bitcoin addresses. However, I've been experimenting with Vanitygen these days, and I was wondering: if all the generated addresses were piped directly to a local server running the compiled blockchain instead of being written to a file, wouldn't that be feasible?
1. With the current source of vanitygen, is it possible to directly drop chunks of addresses on the local server (say 'insight') and check for a positive balance?
How would you start with that?
Thanks in advance.
Here's my PHP code (feel free to use it):
<?php
$lines = file('in.csv', FILE_IGNORE_NEW_LINES);
$i = 0;
foreach ($lines as $line_num => $line) {
    $address = explode(',', $line);
    $variablee = file_get_contents($address[0]);
    $i++;
    if ($variablee != "0") {
        $file = 'out.txt';
        $current = file_get_contents($file);
        $current .= $line;
        file_put_contents($file, $current);
    }
    echo "\n" . $i;
}
?>
Update: There is just one question here, which is how to direct vanitygen-generated addresses to a local server running the compiled blockchain rather than writing them to a file. The code shown above works at about 1,000 addresses/second, while I've heard of people checking as many as 50K addresses/second for a positive balance. I've tried using cwebsocket from here but can't figure out a way to implement it into vanitygen.
Update: My code as of now checks about 1,000 addresses/second.
To import the addresses, you need to format the private key into "Wallet Import Format", or WIF.
See: https://en.bitcoin.it/wiki/Wallet_import_format
The native client will want to reindex the whole blockchain per address if you import keypairs that were not generated by the client.
The native client also has a cap on how many addresses it will track.
I'm writing a feature for an admin panel that blocks IP addresses at the Apache level. The file is called blacklist.txt and looks like 10.0.0.1,10.0.0.2,10.0.0.3, ... all on a single line, with each IP address separated by a comma. After reading What is the best way to write a large file to disk in PHP?, I am still unsure of the best practices on the matter.
Here's what I want to do: if an administrator presses the 'ban hammer', the file is read and searched with strpos($file, $ip); if the IP is not found, it is appended to the end of the file and the .htaccess file blocks accordingly.
Question: is a .txt file suitable for this potentially large amount of data? I do not want to execute a query to check whether someone is banned every time a page is requested.
EDIT:
The purpose is to block single IP addresses that have 10 failed login attempts in the past 12 hours. I would think that the 'recover my password' feature would prevent a normal client from doing this.
Question: is a .txt file suitable for this potentially large amount of data?
No, it is not. A database with proper indexing is.
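For illustration, a banned-IP check against an indexed table can be as small as this (the table name, column name, DSN and credentials are all placeholders):
// hypothetical banned_ips table with a unique index on the ip column
$pdo = new PDO('mysql:host=localhost;dbname=admin_panel', 'user', 'password');

$stmt = $pdo->prepare('SELECT 1 FROM banned_ips WHERE ip = ? LIMIT 1');
$stmt->execute(array($_SERVER['REMOTE_ADDR']));

if ($stmt->fetchColumn()) {
    header('HTTP/1.1 403 Forbidden');
    exit('Banned.');
}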
First, for reading your file in CSV format, there are many ways to do it; for example:
$rows = array_map('str_getcsv', file('myfile.csv'));
$header = array_shift($rows);
$csv = array();
foreach ($rows as $row) {
$csv[] = array_combine($header, $row);
}
src: http://steindom.com/articles/shortest-php-code-convert-csv-associative-array
For checking on each page load, and to minimize the reading of that file,
you can use a memory cache (something like memcache), then search the array for the incoming IP. Note: a memory cache is faster than a database query.
PHP shared memory: http://www.php.net/manual/en/book.shmop.php
memcache: php.net/memcache
Array search: php.net/in_array
To return the key if the value is found: php.net/array_search
Note: in a 1 MB file you can store roughly 65K IPs, assuming each entry takes the form "255.255.255.255," (16 bytes).
It's even better if you use the IP as the array key; then, instead of searching the array for the IP, you can simply check whether the key exists: php.net/array_key_exists
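A rough sketch of that last idea, assuming the single-line, comma-separated blacklist.txt format from the question:
// build the lookup table once: IPs become keys, so each check is O(1)
$raw       = trim(file_get_contents('blacklist.txt'), ", \r\n");
$blacklist = array_flip(explode(',', $raw));

$ip = $_SERVER['REMOTE_ADDR'];
if (array_key_exists($ip, $blacklist)) {   // or simply isset($blacklist[$ip])
    header('HTTP/1.1 403 Forbidden');
    exit('Your IP has been banned.');
}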
This question was asked on a message board, and I want to get a definitive answer and intelligent debate about which method is more semantically correct and less resource intensive.
Say I have a file with each line in that file containing a string. I want to generate an MD5 hash for each line and write it to the same file, overwriting the previous data. My first thought was to do this:
$file = 'strings.txt';
$lines = file($file);
$handle = fopen($file, 'w+');
foreach ($lines as $line)
{
fwrite($handle, md5(trim($line))."\n");
}
fclose($handle);
Another user pointed out that file_get_contents() and file_put_contents() were better than using fwrite() in a loop. Their solution:
$thefile = 'strings.txt';
$newfile = 'newstrings.txt';
$current = file_get_contents($thefile);
$explodedcurrent = explode("\n", $current);
$temp = '';
foreach ($explodedcurrent as $string)
$temp .= md5(trim($string)) . "\n";
file_put_contents($newfile, $temp);
My argument is that since the main goal of this is to get the file into an array, and file_get_contents() is the preferred way to read the contents of a file into a string, file() is more appropriate and allows us to cut out another unnecessary function, explode().
Furthermore, by directly manipulating the file using fopen(), fwrite(), and fclose() (which is the exact same as one call to file_put_contents()) there is no need to have extraneous variables in which to store the converted strings; you're writing them directly to the file.
My method is the exact same as the alternative - the same number of opens/closes on the file - except mine is shorter and more semantically correct.
What do you have to say, and which one would you choose?
This should be more efficient and less resource-intensive than the previous two methods:
$file = 'passwords.txt';
$passwords = file($file);
$converted = fopen($file, 'w+');
$i = 0;
while (count($passwords) > 0)
{
    fwrite($converted, md5(trim($passwords[$i])) . "\n"); // hash and write one line
    unset($passwords[$i]); // free each processed line from memory
    $i++;
}
fclose($converted);
echo 'Done.';
As one of the comments suggests, do what makes more sense to you, since you might come back to this code in a few months and you want to spend the least amount of time trying to understand it.
However, if speed is your concern, then I would create two test cases (you pretty much already have them) and time them: store a timestamp at the beginning of the script, then subtract it from a timestamp taken at the end to work out how long the script took to run. Prepare a few test files; I would go for about three: two extremes and one normal file. Then see which version runs faster.
http://php.net/manual/en/function.time.php
I would think that the differences would be marginal, but it also depends on your file sizes.
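A minimal timing harness along those lines; microtime(true) gives sub-second resolution, and the two wrapper functions are just placeholders for your two versions:
// hypothetical wrappers around the two approaches being compared
function hash_with_fwrite($file)            { /* method 1 goes here */ }
function hash_with_file_put_contents($file) { /* method 2 goes here */ }

$start = microtime(true);
hash_with_fwrite('strings.txt');
printf("fwrite loop:       %.4f s\n", microtime(true) - $start);

$start = microtime(true);
hash_with_file_put_contents('strings.txt');
printf("file_put_contents: %.4f s\n", microtime(true) - $start);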
I'd propose writing to a new temporary file while you process the input one. Once done, overwrite the input file with the temporary one.
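Something like this sketch, where tempnam() puts the scratch file next to the original so the final rename() stays on the same filesystem:
$file = 'strings.txt';
$tmp  = tempnam(dirname($file), 'md5_');   // temporary file in the same directory

$in  = fopen($file, 'r');
$out = fopen($tmp, 'w');

while (($line = fgets($in)) !== false) {
    fwrite($out, md5(trim($line)) . "\n"); // process one line at a time
}

fclose($in);
fclose($out);
rename($tmp, $file);                       // replace the original in one step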
I have a CSV file with records sorted on the first field. I managed to write a function that does a binary search through that file, using fseek() for random access.
However, this is still a pretty slow process, since when I seek to some file position I actually need to look left for a \n character, so I can make sure I'm reading a whole line (once the whole line is read, I can check the first-field value mentioned above).
Here is the function that returns the line containing the character at position $x:
function fgetLineContaining( $fh, $x ) {
    if( $x < 0 || $x > 125145411 ) // 125145411 is the last pos in my file
        return "";
    // now go as much left as possible, until newline is found
    // or beginning of the file
    $c = "";
    while( $x > 0 && $c != "\n" && $c != "\r" ) {
        fseek( $fh, $x );
        $x--; // go left in the file
        $c = fgetc( $fh );
    }
    $x += 2; // skip newline char
    fseek( $fh, $x );
    return fgets( $fh, 1024 ); // return the line from the beginning until \n
}
While this is working as expected, I have to say that my CSV file has ~1.5 million lines, and these left-seeks slow things down quite a bit.
Is there a better way to seek a line containing position x inside a file?
Also, it would be much better if an object of a class could be saved to a file without serializing it, thus enabling reading of the file object by object. Does PHP support that?
Thanks
I think you really should consider using SQLite or MySQL again (like others have suggested in the comments). Most of the suggestions about pre-calculating indexes are already implemented "properly" in these SQL engines.
You said the speed wasn't good enough in SQL. Did you have the fields indexed properly? How were you querying the data? Were you using bulk queries? Were you using prepared statements? Did the SQL process have enough RAM to keep its indexes in memory?
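To make that comparison fair, the SQLite side needs an index on the field you search by. A rough sketch (the file, table and column names are all assumptions):
// one-off import of the CSV into an indexed SQLite table
$db = new PDO('sqlite:records.sqlite');
$db->exec('CREATE TABLE IF NOT EXISTS records (sort_key TEXT, payload TEXT)');
$db->exec('CREATE INDEX IF NOT EXISTS idx_records_key ON records (sort_key)');

$insert = $db->prepare('INSERT INTO records (sort_key, payload) VALUES (?, ?)');
$db->beginTransaction();                      // bulk insert inside one transaction
if (($fh = fopen('data.csv', 'r')) !== false) {
    while (($row = fgetcsv($fh)) !== false) {
        $insert->execute(array($row[0], implode(',', $row)));
    }
    fclose($fh);
}
$db->commit();

// every later lookup is a cheap indexed query on the first field
$find = $db->prepare('SELECT payload FROM records WHERE sort_key = ? LIMIT 1');
$find->execute(array('some-key'));
$line = $find->fetchColumn();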
One thing you can possibly try to speed things up under the current algorithm is to load the (~100MB?) file onto a RAM disk. No matter what you choose to do, either CSV or SQLite, this WILL help speed things up, especially if hard drive seek time is your bottleneck.
You could possibly even read the whole file into PHP arrays (assuming your computer has enough RAM for that). That would allow you to do your search via index ($big_array[$offset]) lookups.
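If it does fit, a binary search over the file() array is straightforward because the rows are already sorted on the first field; a sketch (untested, with a made-up lookup key):
// load the sorted CSV once, then binary-search on the first field
$lines = file('data.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

function find_row(array $lines, $needle) {
    $lo = 0;
    $hi = count($lines) - 1;
    while ($lo <= $hi) {
        $mid   = (int) (($lo + $hi) / 2);
        $field = strstr($lines[$mid], ',', true);  // first CSV field
        $cmp   = strcmp($field, $needle);
        if ($cmp === 0) {
            return str_getcsv($lines[$mid]);       // found: return the parsed row
        } elseif ($cmp < 0) {
            $lo = $mid + 1;
        } else {
            $hi = $mid - 1;
        }
    }
    return null;                                   // not found
}

$row = find_row($lines, 'some-key');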
Also, one thing to keep in mind: PHP isn't exactly fast at doing low-level work. You might want to consider moving away from PHP in favor of C or C++.