create indexed array inside preg_replace_callback function - php

I have simple code to replace links in text with numbers 1 to whatever the number of links in text is.
echo preg_replace_callback('/regex/',
function ($links) {
$reg = "/regex/i";
preg_match($reg, $links[0], $url);
static $count = 1;
return '['. $count++ .']';
}, $html);
This works fine. Now I want a link that appears several times in the text to get the same number each time. My idea is to build an indexed array and, inside the function, compare the current link with the links already in the array. If a match is found, the already indexed link is used together with its index number (which becomes the link's number in the text).
Problem: when I create and add urls to index inside function like this:
$arr=array();
$arr[] = $url[0];
var_dump($arr) shows all links with same index number 0. How can I solve this? Thank you.

You are probably getting this because you declare the array inside the callback. The callback is called once for each match, so a local array is recreated on every call and doesn't remember previous values.
One simple way is to define the array outside and import it into your callback by reference with use (&$arr), which lets the function read and modify a variable outside its scope...
$arr = array();
echo preg_replace_callback('/regex/',
function ($links) use (&$arr) {
$reg = "/regex/i";
preg_match($reg, $links[0], $url);
if ( isset($arr[$url[0]]) ) {
// Link seen before: reuse the number it was given
$count = $arr[$url[0]];
}
else {
// New link: give it the next free number, starting at 1
$count = count($arr) + 1;
$arr[$url[0]] = $count;
}
return '['. $count .']';
}, $html);
Mandatory Note: (for me anyway) processing HTML/XML is usually better done using something like DOMDocument as it can understand both the structure and semantics of the document.
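As a rough illustration of that note, here is a minimal sketch that numbers links with DOMDocument instead of a regex. The sample HTML, the variable names and the choice to key the numbering on the href attribute are assumptions made for the example, not part of the original question.

```php
<?php
// Hypothetical input; the original post does not show its HTML.
$html = '<p><a href="https://a.example/">A</a>, '
      . '<a href="https://b.example/">B</a> and '
      . '<a href="https://a.example/">A again</a></p>';

$doc = new DOMDocument();
// These flags stop DOMDocument from wrapping the fragment in <html><body>
// and from adding a doctype.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$numbers = []; // href => number assigned on first sight

// Copy the live node list to a plain array first, because nodes are
// replaced while looping.
foreach (iterator_to_array($doc->getElementsByTagName('a')) as $link) {
    $href = $link->getAttribute('href');
    if (!isset($numbers[$href])) {
        $numbers[$href] = count($numbers) + 1;
    }
    $marker = $doc->createTextNode('[' . $numbers[$href] . ']');
    $link->parentNode->replaceChild($marker, $link);
}

echo $doc->saveHTML();
```

Repeated links end up with the same marker because the number is looked up by href rather than counted per match.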

Related

PHP code performance optimization help needed

I have the following code, which checks if an element exists, and if it exists, it checks for the same name with an incremented number at the end.
For example, it checks if the key "test" exists in the array $this->elements, and if it exists, it checks for "test2", and so on, until the key doesn't exist.
My original code is:
if (isset($this->elements[$desired])) {
$inc = 0;
do {
$inc++;
$new_desired = $desired . $inc;
} while (isset($this->elements[$new_desired]));
$desired = $new_desired;
}
I tried with:
if (isset($this->elements[$desired])) {
return $this->generateUniqueElement($desired, $postfix);
}
private function generateUniqueElement($desired, $postfix) {
$new_desired = $desired . $postfix;
return isset($this->elements[$new_desired]) ? $this->generateUniqueElement($desired, ++$postfix) : $new_desired;
}
But in my tests there's no speed improvement.
Any idea how I can improve the code? Across all the pages, this code is called over 10,000 times, and sometimes even over 100k times.
Thanks in advance!
Without further knowledge on how you generate this list, here's an idea:
$highestElementIds = [];
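// The names are the array keys (the question tests them with isset()), so iterate the keys
foreach(array_keys($this->elements) as $element) {
// Split the name into its text part and an optional trailing number
preg_match('/^(.*?)(\d*)$/', $element, $matches);
$text = $matches[1];
$id = (int)$matches[2]; // a name without a number counts as 0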
if(!isset($highestElementIds[$text])) {
$highestElementIds[$text] = $id;
} else {
if($id > $highestElementIds[$text]) {
$highestElementIds[$text] = $id;
}
}
}
// find some element by a simple array access
$highestElementIds['test']; // will return 2 in your example
If your code is really being called 100k times, it should be a lot faster to iterate your list only once and then get the highest id directly from an array which contains the highest number (since you don't need to iterate through it again).
That being said, I still wonder what's the actual reason for having such a huge array in the first place...
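One way to avoid rescanning altogether is to remember, per base name, the next suffix to try. The sketch below assumes a small hypothetical class (ElementRegistry, add(), $nextSuffix are invented names) and follows the numbering of the question's loop, where the first duplicate of "test" becomes "test1".

```php
<?php
// Hypothetical container, not the asker's actual class.
final class ElementRegistry
{
    private array $elements = [];
    private array $nextSuffix = []; // base name => next number to try

    public function add(string $desired, $value): string
    {
        $name = $desired;
        if (isset($this->elements[$name])) {
            // Resume where the last search for this base name stopped
            $n = $this->nextSuffix[$desired] ?? 1;
            do {
                $name = $desired . $n++;
            } while (isset($this->elements[$name]));
            $this->nextSuffix[$desired] = $n;
        }
        $this->elements[$name] = $value;
        return $name;
    }
}

$registry = new ElementRegistry();
$names = [
    $registry->add('test', 'a'),
    $registry->add('test', 'b'),
    $registry->add('test', 'c'),
];
print_r($names);
```

The do/while loop is kept as a safety net in case a suffixed name was added directly, but in the common case it runs once per call instead of once per existing duplicate.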
Typical unique IDs are either random (UUID or random chars) or sequential numbers. The latter is as simple as it gets and it can be generated with a simple counter:
function generateNewElement($postfix) {
static $i = 0;
return sprintf('%d%s', $i++, $postfix);
}
echo generateNewElement('foo'), PHP_EOL;
echo generateNewElement('foo'), PHP_EOL;
echo generateNewElement('foo'), PHP_EOL;
echo generateNewElement('foo'), PHP_EOL;
0foo
1foo
2foo
3foo
Of course this is just a generic solution so it may not fit your specific use case.

Variable Variables for Array Key

Currently I am attempting to call a multidimensional array, using a string as a key or keys. I would like to use the following code, but I think the key is being interpreted as a string. Any solution?
$data= [];
$data['volvo'] = "nice whip";
$test = "['volvo']";
$data['drivers']['mike'] = "decent";
$test2 = "['drivers']['mike']";
echo $data$test; // should read 'nice whip'
echo $data$test2; // should read 'decent'
You just use the variable (which should just be the string and not PHP syntax) in place of the string literal.
$cars = [];
$cars['volvo'] = 'nice whip';
$test = 'volvo';
echo $cars[$test];
If you need a dynamic array access solution, you could also write a function, which does the actual array access like this:
function path($array, $path) {
$path = is_array($path) ? $path : explode('.', $path);
$current = $array;
while (count($path)) {
$seg = array_shift($path);
if (!isset($current[$seg])) throw new Exception('Invalid path segment: ' . $seg);
$current = $current[$seg];
}
return $current;
}
In your case, this would look like this:
echo path($data, 'volvo');
echo path($data, 'drivers.mike');
or
echo path($data, ['volvo']);
echo path($data, ['drivers', 'mike']);
The problem is that you can't pass multiple levels in one string like that. (If you could, PHP would have to start looking for code fragments inside string array keys. And how would it know whether to interpret them as fragments and split the string key up, or keep treating it as one string?)
Alt 1
One solution is to change the structure of $data, and make it a single level array. Then you supply keys for all the levels needed to find your data, joined together as a string. You would of course need to find a separator that works in your case. If the keys are plain strings then something simple like underscore should work just fine. Also, this wouldn't change the structure of your database, just the way data is stored.
// $data is passed in as a parameter, because a function cannot see variables of the outer scope
function getDbItem(array $data, array $keys) {
// Use this instead to get the "old version" of the key back. (I.e it's the same data)
// $joinedKey = "['".implode("'],['", $keys)."']";
$joinedKey = implode('_', $keys);
return $data[$joinedKey];
}
// Example
$data = [
'volvo' => 'nice whip',
'drivers_mike' => 'decent'
];
var_dump(getDbItem($data, ['drivers', 'mike'])); // string(6) "decent"
Alt 2
Another way is to not change number of levels in $data, but simply traverse it using the keys passed in:
$tgt = $data;
foreach($keys as $key) {
if (array_key_exists($key, $tgt)) {
$tgt = $tgt[$key];
}
else {
// Non existing key. Handle properly.
}
}
// Example
$keys = ['drivers', 'mike'];
// run the above code
var_dump($tgt); // string(6) "decent"
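The traversal in Alt 2 leaves the missing-key branch open. One possible way to fill it in, sketched here with a hypothetical helper name (getByKeys) and a caller-supplied default, is to stop at the first missing key and return the default:

```php
<?php
// Hypothetical helper: walk $data along $keys, or return $default
// as soon as a level is missing or is not an array.
function getByKeys(array $data, array $keys, $default = null)
{
    $tgt = $data;
    foreach ($keys as $key) {
        if (!is_array($tgt) || !array_key_exists($key, $tgt)) {
            return $default;
        }
        $tgt = $tgt[$key];
    }
    return $tgt;
}

$data = ['volvo' => 'nice whip', 'drivers' => ['mike' => 'decent']];

var_dump(getByKeys($data, ['drivers', 'mike']));
var_dump(getByKeys($data, ['drivers', 'anna'], 'unknown'));
```

Whether a default value, an exception or a logged warning is the right reaction depends on the calling code; the default is just the least intrusive choice for a sketch.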

Array permutations while maintaining headings

I have an array that contains any number of elements, and is allowed to be a multidimensional array, too. My testing example of such array data is:
$arr = array(
array('Material-A', 'Material-B'),
array('Profile-A', 'Profile-B', 'Profile-C'),
array('Thread-A', 'Thread-B'),
// ... any number of elements
);
From this multidimensional array I need to create a single array that is linear in the following format.
$arrFormated = array(
'Material-A',
'Material-A_Profile-A',
'Material-A_Profile-A_Thread-A',
'Material-A_Profile-A_Thread-B',
'Material-A_Profile-A_Thread-C',
'Material-A_Profile-B',
'Material-A_Profile-B_Thread-A',
'Material-A_Profile-B_Thread-B',
'Material-A_Profile-B_Thread-C',
'Material-A_Profile-C',
'Material-A_Profile-C_Thread-A',
'Material-A_Profile-C_Thread-B',
'Material-A_Profile-C_Thread-C',
'Material-B',
'Material-B_Profile-A',
'Material-B_Profile-A_Thread-A'
// Repeat similar pattern found above, etc...
);
For a recursive function, the best that I've been able to come up with thus far is as follows:
private function showAllElements($arr)
{
for($i=0; $i < count($arr); $i++)
{
$element = $arr[$i];
if (gettype($element) == "array") {
$this->showAllElements($element);
} else {
echo $element . "<br />";
}
}
}
However, this code is nowhere close to producing my desired results. The output of the above code is:
Material-A
Material-B
Profile-A
Profile-B
Profile-C
Thread-A
Thread-B
Could somebody please help me with the recursive side of this function so I may get my desired results?
I'd generally recommend thinking about what you want to be recursive. You tried to work with the current element in every recursion step, but your method needs to look at the next array element of the original Array in each recursion step. In this case, it's more useful to pass an index to your recursive function, because the 'current element' (the $arr in showAllElements($arr)) is not helpful.
I think this code should do it:
$exampleArray = array(
array('Material-A', 'Material-B'),
array('Profile-A', 'Profile-B', 'Profile-C'),
array('Thread-A', 'Thread-B','Thread-C'),
// ... any number of elements
);
class StackOverflowQuestion37823464{
public $array;
public function dumpElements($level = 0 /* default parameter: start at first element if no index is given */){
$return=[];
if($level==count($this->array)-1){
$return=$this->array[$level]; /* This is the anchor of the recursion. If the given index is the index of the last array element, no recursion is necessary */
}else{
foreach($this->array[$level] as $thislevel) { /* otherwise, every element of the current step will need to be concatenated... */
$return[]=$thislevel;
foreach($this->dumpElements($level+1) as $stringifyIt){ /*...with every string from the next element and following elements*/
$return[]=$thislevel.'_'.$stringifyIt;
}
}
}
return $return;
}
}
$test=new StackOverflowQuestion37823464();
$test->array=$exampleArray;
var_dump($test->dumpElements());
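The same recursion can also be written as a plain function. The sketch below is a variation, not the answer's exact code: the function name and the shortened sample data are made up, and the combinations for the deeper levels are computed once per level instead of once per element.

```php
<?php
// Hypothetical function name; $groups is the list of option lists.
function prefixCombinations(array $groups, int $level = 0): array
{
    if ($level >= count($groups)) {
        return []; // past the last group: nothing left to append
    }
    // Everything that can follow an item of this level, built once
    $tails = prefixCombinations($groups, $level + 1);

    $out = [];
    foreach ($groups[$level] as $item) {
        $out[] = $item;
        foreach ($tails as $tail) {
            $out[] = $item . '_' . $tail;
        }
    }
    return $out;
}

$result = prefixCombinations([
    ['M-A', 'M-B'],
    ['P-A'],
    ['T-A', 'T-B'],
]);
print_r($result);
```

For this small input the result has eight entries, starting with M-A, M-A_P-A, M-A_P-A_T-A, M-A_P-A_T-B and then the same pattern for M-B.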

PHP Loop through XML files, Put in Array, Then Sort

I have a folder of XML files that look like this, all with different timestamps
<?xml version="1.0"?>
<comment>
<timestamp>1390601221</timestamp>
</comment>
I'm using the glob function to put all of these into an array
$xmls = glob("xml/*.xml");
Then I'm trying to put the timestamp value and xml path into a new array so I can sort by the timestamp. This is how I'm doing it.
$sorted_xmls = array();
foreach ($xmls as $xml) {
$raw_xml = file_get_contents($xml);
$data = simplexml_load_string($raw_xml);
$time = $data->timestamp;
array_push($sorted_xmls, array($time, $xml));
}
All of this seems to work fine. Now I want to sort by timestamp. With the newest first.
foreach ($sorted_xmls as $key => $row) {
$final_sorted[$key] = $row[0];
}
array_multisort($final_sorted, SORT_ASC);
It doesn't seem to be working as expected. Am I doing something wrong? I assume it's on the sorting portion
You are calling array_multisort() in the wrong way here. The way you need to call it is as in example #3 on the manual page, "sorting database results".
The way this works is that you pass the "columns" you want to sort by, and the flags to sort that column by, in order, then pass the target array (the array that will actually be sorted) as the last argument.
So if you change your last line to this (with SORT_DESC rather than SORT_ASC, since you want the newest first):
array_multisort($final_sorted, SORT_DESC, $sorted_xmls);
...then $sorted_xmls should be sorted in the way you would like it to be.
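To see the column-then-target calling convention in isolation, here is a small sketch with made-up in-memory rows in place of the loaded files. It assumes the timestamps have already been converted to integers, and it uses SORT_DESC because the question asks for the newest entry first.

```php
<?php
// Made-up rows in the question's [timestamp, path] layout.
$sorted_xmls = [
    [1390601221, 'xml/a.xml'],
    [1390609999, 'xml/b.xml'],
    [1390600000, 'xml/c.xml'],
];

// The "column" to sort by: one timestamp per row, in row order.
$timestamps = array_column($sorted_xmls, 0);

// Sort the column descending and reorder $sorted_xmls the same way.
array_multisort($timestamps, SORT_DESC, SORT_NUMERIC, $sorted_xmls);

print_r(array_column($sorted_xmls, 1));
```

After the call the paths come out as xml/b.xml, xml/a.xml, xml/c.xml.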
However, a more efficient, albeit more complex, way to do this might be to sort the $xmls array directly using usort(), and load the files from disk at the same time.
$xmls = glob("xml/*.xml");
usort($xmls, function($a, $b) {
// Temporary array to hold the loaded timestamps
// Because this is declared static in a closure, it will be freed when
// the closure goes out of scope, i.e. when usort() returns
// If you want to store the timestamps for use later, you can import a
// reference to an external variable into the closure with a use() element
static $timestamps = array();
// Load XML from disk if not already loaded; cast to int because
// ->timestamp is a SimpleXMLElement object, not a number
if (!isset($timestamps[$a])) {
$timestamps[$a] = (int) simplexml_load_file($a)->timestamp;
}
if (!isset($timestamps[$b])) {
$timestamps[$b] = (int) simplexml_load_file($b)->timestamp;
}
// Return values appropriate for sorting
if ($timestamps[$a] == $timestamps[$b]) {
return 0;
}
return $timestamps[$a] < $timestamps[$b] ? 1 : -1;
});
print_r($xmls);
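A simpler variant of the same idea is to read every timestamp once into a path => timestamp map and sort that map. In this sketch, in-memory XML strings stand in for the files on disk (an assumption made so the example is self-contained); with real files, simplexml_load_file($path) would take the place of simplexml_load_string().

```php
<?php
// Stand-ins for the files returned by glob(); paths and values are made up.
$files = [
    'xml/a.xml' => '<?xml version="1.0"?><comment><timestamp>1390601221</timestamp></comment>',
    'xml/b.xml' => '<?xml version="1.0"?><comment><timestamp>1390609999</timestamp></comment>',
    'xml/c.xml' => '<?xml version="1.0"?><comment><timestamp>1390600000</timestamp></comment>',
];

$timestamps = [];
foreach ($files as $path => $xml) {
    // Cast: ->timestamp is a SimpleXMLElement, not an integer
    $timestamps[$path] = (int) simplexml_load_string($xml)->timestamp;
}

// Sort by value, highest (newest) first, keeping the path keys
arsort($timestamps, SORT_NUMERIC);
$sortedPaths = array_keys($timestamps);

print_r($sortedPaths);
```

This trades the lazy loading of the usort() version for code that is easier to follow, and the map is still available afterwards if the timestamps are needed again.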

Find index of value in associative array in php?

If you have an array $p that you populated in a loop like so:
$p[] = array( "id"=>$id, "Name"=>$name);
What's the fastest way to search for John in the Name key, and if found, return the $p index? Is there a way other than looping through $p?
I have up to 5000 names to find in $p, and $p can also potentially contain 5000 rows. Currently I loop through $p looking for each name, and if found, parse it (and add it to another array), splice the row out of $p, and break 1, ready to start searching for the next of the 5000 names.
I was wondering if there is a faster way to get the index than looping through $p, e.g. an isset-type way?
Thanks for taking a look guys.
Okay so as I see this problem, you have unique ids, but the names may not be unique.
You could initialize the array as:
array($id=>$name);
And your searches can be like:
array_search($name,$arr);
This will work very well, as the native method of finding a needle in a haystack will be better optimized than your own implementation.
e.g.
$id = 2;
$name= 'Sunny';
$arr = array($id=>$name);
echo array_search($name,$arr);
Echoes 2
The major advantage in this method would be code readability.
If you know that you are going to need to perform many of these types of search within the same request then you can create an index array from them. This will loop through the array once per index you need to create.
$piName = array();
foreach ($p as $k=>$v)
{
$piName[$v['Name']] = $k;
}
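As a usage sketch for such an index: since names may repeat, the variant below stores a list of row indices per name rather than a single one, and then answers each lookup with a single array access. The sample rows and the variable name $byName are assumptions for the example.

```php
<?php
// Made-up rows in the question's layout.
$p = [
    ['id' => 1, 'Name' => 'John'],
    ['id' => 2, 'Name' => 'Sunny'],
    ['id' => 3, 'Name' => 'John'],
];

// One pass over $p: name => list of row indices
$byName = [];
foreach ($p as $k => $row) {
    $byName[$row['Name']][] = $k;
}

// Each lookup is now a hash access instead of a loop over $p
$johns  = $byName['John'] ?? [];
$nobody = $byName['Nobody'] ?? [];

print_r($johns);
```

With 5000 names to look up in 5000 rows, this replaces up to 5000 scans of $p with one scan plus 5000 key lookups.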
If you only need to perform one or two searches per page then consider moving the array into an external database, and creating the index there.
$index = 0;
$search_for = 'John';
$result = array_reduce($p, function($r, $v) use (&$index, $search_for) {
if($v['Name'] == $search_for) {
$r[] = $index;
}
++$index;
return $r;
}, []); // start from an empty array so $result is never null
$result will contain all the indices of elements in $p where the element with key Name had the value John. (This of course only works for an array that is indexed numerically beginning with 0 and has no “holes” in the index.)
Edit: Possibly even easier to just use array_filter, but that will not return the indices only, but all array element where Name equals John – but indices will be preserved:
$result2 = array_filter($p, function($elem) {
return $elem["Name"] == "John" ? true : false;
});
var_dump($result2);
Which one suits your needs better, and which one is faster, is for you to figure out.
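If only the indices are wanted, two short variants may be worth comparing as well. Both are sketches on made-up data: the first takes the keys of the array_filter() result, the second lets array_keys() search the Name column directly. The second only matches the original indices when $p is numbered from 0 without gaps, because array_column() renumbers its result.

```php
<?php
// Made-up rows in the question's layout.
$p = [
    ['id' => 1, 'Name' => 'John'],
    ['id' => 2, 'Name' => 'Sunny'],
    ['id' => 3, 'Name' => 'John'],
];

// Variant 1: filter the rows, then keep only the preserved keys
$viaFilter = array_keys(array_filter($p, function ($row) {
    return $row['Name'] === 'John';
}));

// Variant 2: search the Name column; the third argument asks for
// a strict (===) comparison
$viaColumn = array_keys(array_column($p, 'Name'), 'John', true);

print_r($viaFilter);
print_r($viaColumn);
```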
