substr() PHP not working for array elements - php

$nomadspage = "http://www.nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/";
$html = file_get_contents($nomadspage);
$count = preg_match_all('/<a href="([^"]+)">[^<]*<\/a>/i', $html, $files);
unset($files[1]); //deletes repeat array from preg_match
$files = $files[0]; //deletes container array from preg_match
foreach ($files as $key => $value) {
if (substr($value, 0, 3) !== "gfs") {
unset($files[$key]);
}
}
var_dump($files);
I have an array of file names from an HTTP directory listing. I want to filter these file names so that every file that doesn't start with the three letters gfs is removed from the array. However, for some reason, substr() does not work: it does not pull the expected substring from the file names, so the if statement does not behave as intended. Does anybody know why this is happening and how to fix it?

$files[0] contains the strings that match the entire regular expression, so substr($value, 0, 3) is always "<a ". You should set $files to $files[1], not $files[0], because $files[1] contains all the matches of the ([^"]+) group. That also means removing the unset($files[1]) line.
Actually, it's best not to use regular expressions to parse HTML. Use a DOM parser library, such as the DOMDocument class.
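As a rough illustration of the DOMDocument suggestion, here is a minimal sketch. The $html string is a made-up stand-in for the directory listing (in the question it would come from file_get_contents), and the file names in it are invented for the example.

```php
<?php
// Hypothetical sample standing in for the directory listing HTML.
$html = '<html><body>'
      . '<a href="gfs.2012112100/">gfs.2012112100/</a>'
      . '<a href="gdas.2012112100/">gdas.2012112100/</a>'
      . '<a href="gfs.2012112106/">gfs.2012112106/</a>'
      . '</body></html>';

$doc = new DOMDocument();
// Suppress the warnings that imperfect real-world HTML can trigger.
@$doc->loadHTML($html);

$files = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $href = $anchor->getAttribute('href');
    // Keep only the names that start with "gfs".
    if (strncmp($href, 'gfs', 3) === 0) {
        $files[] = $href;
    }
}

var_dump($files);
```

Because the href attribute is read directly, there is no surrounding "<a " text to trip up the prefix check.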

Related

correct regex date pattern for dd/mm/yyyy

I need to update the same line, which contains a date in dd/mm/yyyy format along with some text, in a group of files. I have checked the answers given to similar questions here, but couldn't get any of the suggested patterns to work in my code.
My current PHP code is:
<?php
// get the system date
$sysdate = date("d/m/Y");
// open the directory
$dir = opendir($argv[1]);
$files = array();
// sorts the files alphabetically
while (($file = readdir($dir)) !== false) {
$files[] = $file;
}
closedir($dir);
sort($files);
// for each ordered file will run the in the clauses part
foreach ($files as $file) {
$lines = '';
// filename extension is '.hql'
if (strpos($file,".hql") != false || strpos($file,".HQL") != false)
{
$procfile = $argv[1] . '\\' . $file;
echo "Converting filename: " . $procfile . "\n";
$handle = fopen($procfile, "r");
$lines = fread($handle, filesize($procfile));
fclose($handle);
$string = $lines;
// What are we going to change runs in here
$pattern = '[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]';
$replacement = $sysdate;
$lines = preg_replace($pattern, $replacement, $string);
echo $lines;
$newhandle = fopen($procfile, 'w+');
fwrite($newhandle, $lines);
fclose($newhandle);
// DONE
}
}
closedir($dir);
?>
When I run this code on command prompt, it doesn’t give any error message and it seems to be running properly. But once it finishes and I check my files, I see that the content of each file is getting deleted and they all become 0 KB files with nothing in them.
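You have no delimiters set in place for your regular expression. Without them, preg_replace() fails and returns NULL, and writing that NULL back is what empties your files.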
A delimiter can be any (non-alphanumeric, non-backslash, non-whitespace) character.
You want to use a delimiter besides / so you avoid having to escape / already in your pattern.
You could use the following to change your format:
$pattern = '~[0-9]{4}/[0-9]{2}/[0-9]{2}~';
This one also does basic checks (day between 01 and 31, month between 01 and 12) for the dd/mm/yyyy format:
(0(?!0)|[1-2]|3(?=[0-1]))\d\/(0(?!0)|1(?=[0-2]))\d\/\d{4}
See it live: http://regex101.com/r/jG9nD5
You should surround the regular expression with a delimiter character.
For example:
$pattern = '![0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]!';
/ is commonly used, but because the regular expression contains / itself, I used ! instead.
Besides the lack of delimiters (# and ~ are favorites, if / is used in the pattern), you are looking for 4 digits at the beginning: yyyy/mm/dd. Decide what you're looking for. You might also be able to do something like
[0-9]{4}/[0-9]{2}/[0-9]{2}
or even
\d{4}/\d{2}/\d{2}
... I know those will work in Perl, but I haven't tried them with PHP (they ought to work, as the "p" in preg stands for Perl, but no guarantees).
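To check the delimiter advice from the answers above, here is a small sketch. The sample text and the fixed replacement date are made up for the example; the three delimited patterns are the ones suggested above, and the last call reuses the shape of the original undelimited pattern to show the failure mode.

```php
<?php
// Made-up sample line; in the question this would be the file contents.
$text  = "LOAD DATE 2012/11/21 and again 2011/01/05";
$today = '21/11/2012'; // fixed value so the result is predictable

// Any of these delimiters works; ~ and # avoid escaping the slashes.
$a = preg_replace('~[0-9]{4}/[0-9]{2}/[0-9]{2}~', $today, $text);
$b = preg_replace('#\d{4}/\d{2}/\d{2}#', $today, $text);
// With / as the delimiter, every / inside the pattern must be escaped.
$c = preg_replace('/\d{4}\/\d{2}\/\d{2}/', $today, $text);

echo $a, "\n";
var_dump($a === $b && $b === $c);

// Without delimiters the leading [ is taken as the delimiter, the call
// fails with a warning (suppressed here) and the result is NULL.
$bad = @preg_replace('[0-9][0-9]/[0-9][0-9]', $today, $text);
var_dump($bad);
```

Checking the return value for NULL before writing it back to the file would have kept the original contents intact.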
Why use regex? Use the DateTime class for validation.
var_dump(validateDate('2012-02-28', 'Y-m-d')); # true
var_dump(validateDate('28/02/2012', 'd/m/Y')); # true
var_dump(validateDate('30/02/2012', 'd/m/Y')); # false
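The validateDate() helper used in the calls above is not shown in the answer. The following is only a sketch of how such a helper is commonly written with DateTime::createFromFormat; the body and the default format are assumptions, not the answerer's code.

```php
<?php
// Assumed implementation; only the name and arguments come from the calls above.
function validateDate($date, $format = 'Y-m-d')
{
    $d = DateTime::createFromFormat($format, $date);
    // createFromFormat() rolls 30/02 over into March instead of failing,
    // so the date is valid only if formatting it back gives the same string.
    return $d !== false && $d->format($format) === $date;
}

var_dump(validateDate('2012-02-28', 'Y-m-d'));
var_dump(validateDate('28/02/2012', 'd/m/Y'));
var_dump(validateDate('30/02/2012', 'd/m/Y'));
```

The round-trip comparison is what rejects dates such as 30 February.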
Your code can be rewritten in short like this:
#!/usr/bin/php
<?php
// get the system date
$sysdate = date('d/m/Y');
// change working directory to the specified one
chdir($argv[1]);
// loop over the *.hql files in sorted order
foreach (glob('*.{hql,HQL}', GLOB_BRACE) as $file) {
echo "Converting filename: $argv[1]\\$file\n";
$contents = file_get_contents($file);
$contents = preg_replace('#\d{4}/\d{2}/\d{2}#', $sysdate, $contents);
echo $contents;
file_put_contents($file, $contents);
}
The problem was with the missing PCRE regex delimiters as others already pointed out. Even after fixing this, the code was not really nice.
The glob and file_get_contents functions are available as of PHP 4.3.0. The file_put_contents function is available as of PHP 5.
glob makes your code more succinct, readable and even portable, as you won't have to mention the directory separator anywhere except the info message. You used \\ but should have used DIRECTORY_SEPARATOR if you wanted your code to be portable.
The file_get_contents function fetches the whole contents of a file as a string. The file_put_contents function does the opposite – stores a string in a file. If you want it in PHP 4, use this implementation:
if (!function_exists('file_put_contents')):
function file_put_contents($filename, $data) {
$handle = fopen($filename, 'w');
$result = fwrite($handle, $data);
fclose($handle);
return $result;
}
endif;
Also notice that the final ?> is not necessary in PHP.

get the results of curl in variables

I have a piece of code that so far returns data like this when I use print $result;
ssl_card_number=41**********1111
ssl_exp_date=0213
ssl_amount=132.86
ssl_salestax=0.00
ssl_invoice_number=5351353519500
ssl_result=0
ssl_result_message=APPROVED
ssl_txn_id=00000000-0000-0000-0000-00000000000
ssl_approval_code=123456
ssl_cvv2_response=P
ssl_avs_response=X
ssl_account_balance=0.00
ssl_txn_time=11/21/2012 12:38:20 PM
That's from view page source.
The page itself shows it as:
ssl_card_number=41**********1111 ssl_exp_date=0213 ssl_amount=132.86 ssl_salestax=0.00 ssl_invoice_number=8601353519473 ssl_result=0 ssl_result_message=APPROVED ssl_txn_id=00000000-0000-0000-0000-00000000000 ssl_approval_code=123456 ssl_cvv2_response=P ssl_avs_response=X ssl_account_balance=0.00 ssl_txn_time=11/21/2012 12:37:54 PM
I need to be able to handle each of the "keys" in a better way and don't know how. Maybe explode them?
One possible approach:
parse_str(preg_replace('#\s+(?=\w+=)#', '&', $result), $array);
var_dump($array);
Explanation: preg_replace will turn all the whitespace before the param names into '&' symbol - making this string similar to the regular GET request url. Then parse_str (the function created specifically for parsing such urls) will, well, parse this string (sent as the first param), making an associative array of it.
In fact, you don't even have to use preg_replace here if each param=value string begins on a new line; str_replace("\n", '&', $result) should do the trick.
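A minimal sketch of the newline variant, assuming the response uses one key=value pair per line. The $result string is a shortened, made-up sample in the same format as the question's output.

```php
<?php
// Shortened sample in the same key=value format as the question.
$result = "ssl_result=0\nssl_result_message=APPROVED\n"
        . "ssl_txn_time=11/21/2012 12:38:20 PM\n";

// Turn line breaks into '&' so the text looks like a query string.
// trim() drops the trailing newline; "\r\n" is handled as well.
$query = str_replace(array("\r\n", "\n"), '&', trim($result));

parse_str($query, $fields);

echo $fields['ssl_result_message'], "\n";
echo $fields['ssl_txn_time'], "\n";
```

One caveat: parse_str() URL-decodes values, so a literal + or % inside a value would be altered. If that can happen in the response, splitting on = by hand is the safer choice.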
An alternative approach:
$array = array();
// split on the whitespace that precedes each "key=" token
$pairs = preg_split('#\s+(?=\w+=)#', trim($result));
foreach ($pairs as $pair) {
    list ($key, $value) = explode('=', $pair, 2);
    $array[$key] = $value;
}
Here you first create an array of 'key-value pair' strings, then split each element by =: the first part would be the key, the second - the value.
You can use the regular expression reported by @raina77ow or you could use explodes (riskier):
<?php
$tmps = explode("\n",$result); //this gives you each line separate
foreach($tmps as $tmp){
list($key,$value) = explode('=',$tmp,2);
echo $key.' has value '.$value."\n";
//you can even create vars with the "key" if you are sure that they key is a "clean" string:
$$key=$value;
//or put everything into an array - similar to the regexp
$result_array[$key] = $value;
}
?>

This script won't find Absolute Urls

The code below is supposed to scan links and index them in the array $links, but for some reason they won't index.
I am starting to wonder whether my regex is wrong and how I can improve it. Or is it my file_get_contents call? Is it used correctly?
$links = Array();
$URL = 'http://www.theqlick.com'; // change it for urls to grab
// grabs the urls from URL
$file = file_get_contents($URL);
$abs_url = preg_match_all("'^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$^'", $file, $link);
if (!empty($abs_url)) {
$links[] = $abs_url;
}
In your preg_match_all you are saving into $link not $links.
preg_match_all Returns the number of full pattern matches (which might be zero), or FALSE if an error occurred (c) php.net
preg_match_all("'^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$^'", $file, $matches);
if (!empty($matches)) {
    $links = $matches;
}
Your regex is wrong. You have a head anchor ^ at the end of the pattern, adjacent to a tail anchor $. I don't think the anchors are needed at all. Additionally, the variable you are storing the matches in is $link (no s). Plus, your pattern delimiter appears to be the ' character. Was that intentional? Fortunately it would work, but I'm guessing you didn't intend that.
Try this:
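// PREG_SET_ORDER groups the results per match, so $match[0] is the full URL of each match
$matchCount = preg_match_all("/(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?/", $file, $matches, PREG_SET_ORDER);
if ($matchCount)
{
    foreach ($matches as $match)
    {
        $links[] = $match[0];
    }
}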
Read up on PHP regular expressions.
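As a simpler alternative, here is a sketch that requires the scheme, so only absolute URLs are picked up. The pattern is not the one from the answers above; it is a deliberately loose assumption (everything after http:// or https:// up to whitespace, a quote or an angle bracket), and the $file string is made-up sample HTML.

```php
<?php
// Made-up HTML standing in for the downloaded page.
$file = '<a href="http://www.example.com/about">About</a> '
      . '<a href="/relative/path">Relative</a> '
      . '<img src="https://cdn.example.com/logo.png">';

$links = array();
// Requiring the scheme keeps relative paths and plain words out.
if (preg_match_all('~https?://[^\s"\'<>]+~i', $file, $matches)) {
    $links = $matches[0]; // index 0 holds the full-pattern matches
}

print_r($links);
```

Using ~ as the delimiter avoids escaping the slashes in the pattern.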

PhP Regex or Basename()

I'm trying to extract filenames from a list of files with paths like:
/a/b/c/d/file1.jpg
/e/f/g/h/file2.png
/i/j/k/l/file3.txt
I want to get the string that is a valid filename (for Linux), that comes after the last "/", and that belongs to a JPEG file (ends with ".jpg").
In this example, "file1" would be the only valid match.
At the moment I have this RegEx :
/(?<=\/)(.*?)(?=\.(js))/gim
I don't really know if it's better to do this with RegEx or if it's better / possible with basename().
The goal I want to achieve is to get all the strings that match to be placed in an array.
Don't know if I'm doing this right though.
Regex isn't required here. I've assumed you can get your paths into an array.
<?php
$text = file_get_contents("list.txt");
$foo = explode(PHP_EOL, $text);
$bar = array();
foreach($foo as $key => $value){
if(pathinfo($value, PATHINFO_EXTENSION) == "jpg"){
$bar[] = basename($foo[$key],".".pathinfo($value, PATHINFO_EXTENSION));
}
}
print_r($bar);
?>
Outputs:
Array ( [0] => file1 )
Live example: http://codepad.viper-7.com/ewkUHs
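A slightly shorter variant of the same idea is sketched below, using the filename and extension entries that pathinfo() returns. The paths are the ones from the question, placed in an array directly instead of being read from list.txt; the lowercase comparison is an added assumption so that ".JPG" would also count.

```php
<?php
$paths = array('/a/b/c/d/file1.jpg', '/e/f/g/h/file2.png', '/i/j/k/l/file3.txt');

$names = array();
foreach ($paths as $path) {
    $info = pathinfo($path);
    // 'extension' is missing for names without a dot, hence the isset().
    if (isset($info['extension']) && strtolower($info['extension']) === 'jpg') {
        $names[] = $info['filename']; // base name without the extension
    }
}

print_r($names);
```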

In PHP, if I find a word in a file, can I make the line that the word came from into a $string

I want to find a word in a large list file.
Then, if and when that word is found, take the whole line of the list file that the word was found in?
so far I have not seen any PHP string functions to do this
Use a line-delimited regular expression to find the word, then your match will contain the whole line.
Something like:
preg_match_all('/^.*WORD.*$/m', $filecontents, $matches);
Then $matches[0] will hold the full lines in which WORD was found. The m modifier makes ^ and $ match at the start and end of every line.
You could use preg_match:
$arr = array();
preg_match("/^.*yourSearch.*$/m", $fileContents, $arr);
$arr[0] will then contain the first matching line; use preg_match_all() to collect every matching line.
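$path = "/path/to/wordlist.txt";
$word = "Word";
$lines = array();
$handle = fopen($path,'r');
$currentline = 1; //in case you want to know which line you got it from
// fgets() returns false at end of file, which ends the loop
while (($line = fgets($handle)) !== false)
{
    // strpos() returns 0 for a match at the start of the line, so compare against false
    if (strpos($line,$word) !== false)
    {
        $lines[$currentline] = $line;
    }
    $currentline++;
}
fclose($handle);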
If you want to only find a single line where the word occurs, then instead of saving it to an array, save it somewhere and just break after the match is made.
This should work quickly on files of any size (using file() on large files probably isn't a good idea, because it loads the whole file into memory).
Try this one:
$searchString = "search";
$result = preg_grep("/^.*{$searchString}.*$/", file('/path/to/your/file.txt'));
print_r($result);
Explanation:
file() reads your file and produces an array of lines
preg_grep() returns the array elements in which the pattern matches
$result is the resulting array.
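If the search word can contain characters that are special in a regex, it may be worth escaping it first. The sketch below assumes that situation; the $lines array is a made-up stand-in for what file() would return, and the search word is chosen to contain a dot.

```php
<?php
// Stand-in for file('/path/to/your/file.txt'); each entry keeps its newline.
$lines = array("alpha one\n", "beta two\n", "gamma one.five\n");

$word = 'one.five';
// preg_quote() escapes the dot so it only matches a literal dot.
$pattern = '/' . preg_quote($word, '/') . '/';

// preg_grep() keeps the original keys, i.e. the zero-based line indexes.
$result = preg_grep($pattern, $lines);

print_r($result);
```

Since the keys are preserved, adding 1 to a key gives the line number in the file.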
