The script below assigns numerical IDs to paragraphs (e.g. [p id="1"]) in articles extracted from my database, except for the last paragraph, which gets [p id="Last"].
$c = 1;
$r = preg_replace_callback('/(<p( [^>]+)?>)/i', function ($res) {
global $c;
return '<p'.$res[2].' id="'.intval($c++).'">';
}, $text);
$r = preg_replace('/(<p.*?)id="'.($c-1).'"(>)/i', '\1id="Last"\2', $r);
$text = $r;
It works, but when I have error reporting on, I get the following notice: Undefined offset: 2. It isn't critical, but it's a nuisance when I'm testing my pages. Any idea how I can get rid of it?
I've improved the regex by:
Removing a group /<p( [^>]+)?>/i
Changing ( [^>]+)? to ([^>]*). This way the group itself is no longer optional; only the characters inside it are. That means the group is always present in the match array, even when it is empty.
Just a preference: I changed the delimiters, giving ~<p([^>]*)>~i
Now let's attack the php code:
$text = '<p>test</p> another <p class="test">test</p> and another one <p style="color:red">';
$c = 1;
$r = preg_replace_callback('~<p([^>]*)>~i', function($res) use (&$c){
return '<p'.$res[1].' id="'.$c++.'">';
}, $text);
var_dump($r, $c);
Note that I used a closure use (&$c) with a reference &. This way we can update $c.
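The snippet above numbers every paragraph but does not handle the id="Last" part of the question. Here is a minimal sketch of one way to fold that in, assuming it is acceptable to count the opening tags first. The sample $text and the $total variable are made up for illustration, and the pattern is the same one as above (so it would also match tags such as <pre>).

```php
<?php
// Hypothetical sample input; the real $text comes from the asker's database.
$text = '<p>one</p> <p class="x">two</p> <p>three</p>';

// Count the opening <p> tags first so the callback knows which one is last.
$total = preg_match_all('~<p([^>]*)>~i', $text, $unused);

$c = 0;
$result = preg_replace_callback('~<p([^>]*)>~i', function ($m) use (&$c, $total) {
    $c++;
    // The final tag gets "Last"; every other tag gets its running number.
    $id = ($c === $total) ? 'Last' : (string) $c;
    return '<p' . $m[1] . ' id="' . $id . '">';
}, $text);

echo $result, "\n";
```

This avoids the second preg_replace() pass from the question, at the cost of scanning the text twice.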
So, basically I'm trying to count the number of landline phone numbers in a list of both landlines and mobile phone numbers $mobile_list (071234567890,02039989435,0781...)
$mobile_array = explode(",",$mobile_list); // turn into an array
$landlines = array_count_values($mobile_array); // create count variable
echo $landlines["020..."]; // print the number of numbers
So, I understand the basic function for counting specific elements, but I don't see where I can specify that an element 'starts with' or 'contains' a sequence. With the above you can only specify an exact phone number (obviously not useful).
Any help would be great!
I don't see any reason to first explode the string to an array, and then check each array item.
That is a complete waste of performance!
I suggest using preg_match_all and match with word boundary "020".
That means the "word" has to start with 020.
$mobile_list = "071234567890,02039989435,0781,020122,123020";
preg_match_all("/\b020\d+\b/", $mobile_list, $m);
var_dump($m);
echo count($m[0]); // 2
https://3v4l.org/ucSDm
The lightest and fastest method I have found is to explode on ",020".
The first item of the returned array (index 0) is ambiguous: the split alone doesn't tell us whether it was a 020 number, so I check it separately.
$temp = explode(",020", $mobile_list);
$cnt = count($temp);
if(substr($temp[0],0,3) != "020") $cnt--;
echo $cnt;
A small scale test shows this as the fastest method.
https://3v4l.org/rD54d
You can use array_reduce() to count the occurrences of strings beginning with '020'
$mobile_list = "02039619491,07143502893,02088024526,07351261813,02095694897";
$mobile_array = explode(',', $mobile_list);
function landlineCount($carry, $item)
{
if (substr($item, 0, 3) === '020') {
return $carry += 1;
}
return $carry;
}
$count = array_reduce($mobile_array, 'landlineCount', 0); // start at 0 so an empty result is 0, not NULL
echo $count;
prints 3
I'm sure the OP has finished what they needed to do hours ago but for fun here is a faster way to count the landlines.
I hadn't spotted that the question's original code was exploding the string.
That isn't necessary: you can just count the substrings with substr_count(). This would miss the first number, which has no comma before it, so I check for that separately with substr().
If you need the total count of all numbers you can just count the commas with substr_count() again and add one.
$count = substr($mobile_list, 0, 3) === '020' ? 1 : 0;
$count += substr_count($mobile_list, ",020");
$totalCount = substr_count($mobile_list, ",") + 1;
echo $count;
echo $totalCount;
Here is the benchmark, run 1000 times to get an average.
https://3v4l.org/Sma66
Use the array_filter() or preg_grep() functions to find all numbers that contain or start with a given number sequence.
Note: There are easier and better solutions in other answers that cover the request to find values that start with a given number sequence.
Because you mentioned "but I don't see where I can specify if an element 'starts with' or 'contains' a sequence", my code assumes that you want to find any occurrence of the sequence, not only at the start of each item.
$mobile_list = '02000, 02032435, 039002300, 00305600';
$mobile_array = explode(",",$mobile_list); // turn into an array
$sequence = '020'; // the digit sequence to search for
function filter_phone_numbers($mobile_array, $sequence){
return array_filter($mobile_array, function ($item) use ($sequence) {
if (stripos($item, $sequence) !== false) {
return true;
}
return false;
});
}
$filtered_items = array_unique (filter_phone_numbers($mobile_array, $sequence)); //use array_unique in case we find same number that both contains or starts with sequence
echo count($filtered_items);
Or with preg_grep():
$mobile_list = '02000, 02032435, 039002300, 00305600';
$mobile_array = explode(",",$mobile_list); // turn into an array
$sequence = preg_quote('020', '~'); // the digit sequence to search for, escaped for use in the pattern
function grep_phone_numbers($mobile_array, $sequence){
return preg_grep('~' . $sequence . '~', $mobile_array);
}
//use array_unique in case we find same number that both contains or starts with sequence
$filtered_items = array_unique(grep_phone_numbers($mobile_array, $sequence));
echo count($filtered_items);
I recommend doing this with the database. The database is designed to manage data and can do it a lot more efficiently than PHP can. You can simply put it into a query and get the result you want in one go:
SELECT * FROM phone_numbers WHERE number LIKE '020%'
If you get the data from the database anyway, that LIKE adds a little time to the query, but less than it takes PHP to loop, call strpos and store the results. Also, as you return a smaller dataset, fewer resources are used.
I have a PHP script that I wrote probably 10 years ago. I don't remember what version PHP was on at the time but my script worked just fine without complaint from the interpreter. Now, I've had to move my scripts to a new web host and, under PHP 7.x, the interpreter complains loudly about a certain line of this script and I'm looking for an elegant way to get it to shut up.
The offending line is:-
list($degrees, $minutes, $seconds) = preg_split("/ /", $coord);
The $coord variable contains a GPS coordinate in one of three forms: "degrees minutes seconds", "degrees decimal-minutes", or "decimal-degrees". So, the preg_split() may return 1, 2, or 3 elements. If it returns only 1 or 2 elements, the interpreter complains loudly about the undefined references to $seconds and/or $minutes. I see that there is a LIMIT parameter that I could specify for preg_split() that gives it a maximum number of elements to return, but there doesn't seem to be a complementary parameter to tell it the MINIMUM number of elements to return. Any suggestions welcome.
Sample coords: '-97.74019' or '-97 44.411' or '-97 44 24.7'
Totally agree with Anant, but you can do it in a bit more elegant way:
<?php
$coordArray = preg_split('/ /', $coord);
$degrees = $coordArray[0] ?? 0;
$minutes = $coordArray[1] ?? 0;
$seconds = $coordArray[2] ?? 0;
The line gives an E_NOTICE ("Notice: Undefined offset: X in Command line code on line 1"). You can either change the error reporting level to not include E_NOTICE or just disable the error reporting for this particular line with the @ operator. There is no harm in using @ here. Unmatched variables will be assigned NULL (in PHP 5 and 7):
$coord = "x y";
@list($degrees, $minutes, $seconds) = preg_split("/ /", $coord);
var_dump($degrees, $minutes, $seconds);
Gives:
string(1) "x"
string(1) "y"
NULL
I wouldn't generally recommend it, but to suppress all E_NOTICE error notices, you can unset it in the error_reporting setting:
ini_set("error_reporting", E_ALL&~E_NOTICE);
You can convert that code as shown below:-
$d_m_s = preg_split("/ /", $coord);
$degrees = !empty($d_m_s[0]) ? $d_m_s[0] : 0;
$minutes = !empty($d_m_s[1]) ? $d_m_s[1] : 0;
$seconds = !empty($d_m_s[2]) ? $d_m_s[2] : 0;
Note:-
Also, your original code will generate only a notice (if error reporting is on for notices too). Look here:- https://eval.in/707979
It will not stop the program execution, but you can (and should) resolve this notice with the above code.
Wrap the variably-sized array with array_pad as demonstrated below:
list($degrees, $minutes, $seconds) = array_pad( preg_split("/ /", $coord), 3, null);
Your intuition is right that you need a MINIMUM number of elements to return, and that is exactly what array_pad will do for you.
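To see the padding with the three sample forms from the question, here is a small sketch. The splitCoord() helper name is made up for illustration; it only wraps the array_pad() call above.

```php
<?php
// Hypothetical helper: always returns exactly three elements,
// filling any missing minutes/seconds with null.
function splitCoord($coord)
{
    return array_pad(preg_split('/ /', $coord), 3, null);
}

// The three sample coordinates from the question.
foreach (['-97.74019', '-97 44.411', '-97 44 24.7'] as $coord) {
    list($degrees, $minutes, $seconds) = splitCoord($coord);
    var_dump($degrees, $minutes, $seconds);
}
```

Since list() now always receives three elements, no notice is raised; the missing parts simply come back as NULL. Pass 0 instead of null as the third argument to array_pad() if a numeric default is preferable.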
I want to replace a number in a string after multiplying it by a variable. I have the following PHP:
$desc = "+2.23% critical damage";
$count = 3;
Now I want to use the value of $count * $desc within a new string, as shown here:
$sum = "+6.69% critical damage";
How do I manage this? How can I multiply the numbers in this string with $count?
Three possibilities.
Ugly as hell, but working (use floatval() on the string):
<?php
$desc = "+2.23% critical damage";
$count = 3;
$sum = floatval($desc) * $count . "% critical damage";
echo $sum;
# 6.69% critical damage
?>
Second: a regex approach.
<?php
# define a regex to allow digits and points directly connected
$regex = '~[\d.]+~';
preg_match($regex, $desc, $num);
$sum = floatval($num[0]) * $count . "% critical damage";
echo $sum;
# 6.69% critical damage
?>
Third possibility (probably the best): try to refine your actual requirements and edit your question :)
In case the rest of your string is dynamic, it may be more robust to replace the floating point number using preg_replace_callback() -- this way you are only touching the number in the string.
sprintf() is very handy for standardizing the format of the product.
Code: (Demo)
$desc = "+2.23% critical damage";
$count = 3;
var_export(
preg_replace_callback(
'~^[+-]\d+(?:\.\d+)?~',
function($m) use ($count) {
return sprintf('%+.2f', $m[0] * $count);
},
$desc,
1
)
);
Output:
'+6.69% critical damage'
So, I got some ideas off here about how to do this and took on board some of the code suggestions; I have LaTeX files with components in the form
{upper}{lower} where upper could be anything from plain text to LaTeX including its own nested {} and lower could be blank or substantial LaTeX. Desired output is a pair of PHP strings $upper and $lower that contain only the content of the two parent braces.
$upperlowerQ='some string'; // in format {upper}{lower}
$qparts=nestor($upperlowerQ);
$upper=$qparts[0];
$lower=$qparts[1];
function nestor($subject) {
$result = false;
preg_match_all('~[^{}]+|\{(?<nested>(?R)*)\}~', $subject, $matches);
foreach($matches['nested'] as $match) {
if ($match != "") {
$result[] = $match;
$nesty = nestor($match);
if ($nesty)
$result = array_merge($result,$nesty);
}
}
return $result;
}
This function works for about 95% of my data (this upper/lower splitting is called in a loop for about 1,000 times) but it is failing on a few. An example of something it fails on looks like this:
{Draw an example of a reciprocal graph in the form $y=\frac{a}{x}$}{
\begin{tikzpicture}
\begin{axis}[xmin=-8,xmax=8,ymin=-5,ymax=12,samples=50,grid=both,grid style={gray!30},xtick={-8,...,8},ytick={-5,...,12},axis x line = bottom,
axis y line = left, axis lines=middle]
\end{axis}
\end{tikzpicture}\par
%ans: smooth reciprocal function plotted.
}
which gives:
$upper as Draw an example of a reciprocal graph in the form $y=\frac{a}{x}$ (which is correct) but $lower as a, which is the numerator of the fraction in the upper part... any ideas appreciated. It is always $lower that is wrong... $upper seems correct.
For any future readers, @Jonny5's response above worked perfectly. eval.in
Added from comments
Try using regex like this: {((?:[^}{]+|(?R))*)} for only extracting what's inside the outer { } and to check if exactly 2 items are matched by returned matchcount of preg_match_all.
$upper = ""; $lower = "";
if(preg_match_all('/{((?:[^}{]+|(?R))*)}/', $str, $out) == 2) {
$upper=$out[1][0]; $lower=$out[1][1];
}
See test at eval.in
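For reference, here is a self-contained sketch of the same idea run on a shortened, made-up sample in the {upper}{lower} format (the $str value is only illustrative, not the asker's data). The braces in the pattern are escaped here purely for readability.

```php
<?php
// Hypothetical sample with nested braces in both the upper and lower part.
$str = '{Form $y=\frac{a}{x}$}{\begin{axis}[grid style={gray!30}]\end{axis}}';

$upper = '';
$lower = '';
// Each top-level {...} is one match; (?R) recurses into nested braces,
// so inner pairs never become matches of their own.
if (preg_match_all('/\{((?:[^{}]+|(?R))*)\}/', $str, $out) === 2) {
    $upper = $out[1][0];
    $lower = $out[1][1];
}

echo $upper, "\n", $lower, "\n";
```

Because only the outermost pairs are matched, the {a} inside \frac{a}{x} can no longer be picked up as $lower.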
I am using a terrible wrapper of PDFLib that doesn't handle the problem PDFLib has with cells that exceed the character limit (which is around 1600 characters per cell).
So I need to break a large paragraph into smaller strings that fit neatly into the cells, without breaking up words, and as close to the end of the line as possible.
I am completely stumped about how to do this efficiently (I need it to run in a reasonable amount of time)
Here is my code, which cuts the block up into substrings based on character length alone, ignoring the word and line requirements I stated above:
SPE_* functions are static functions from the wrapper class,
SetNextCellStyle calls are used to draw a box around the outline of the cells
BeginRow is required to start a row of text.
EndRow is required to end a row of text, it must be called after BeginRow, and if the preset number of columns is not completely filled, an error is generated.
AddCell adds the string to the second parameter number of columns.
function SPE_divideText($string,$cols,$indent,$showBorders=false)
{
$strLim = 1500;
$index = 0;
$maxIndex = round((strlen($string) / 1500-.5));
$retArr= array();
while(substr($string, $strLim -1500,$strLim)!=FALSE)
{
$retArr[$index] = substr($string, $strLim -1500,$strLim);
$strLim+=1500;
SPE_BeginRow();
SPE_SetNextCellStyle('cell-padding', '0');
if($indent>0)
{
SPE_Empty($indent);
}
if($showBorders)
{
SPE_SetNextCellStyle('border-left','1.5');
SPE_SetNextCellStyle('border-right','1.5');
if($index == 0)
{
SPE_SetNextCellStyle('border-top','1.5');
}
if($index== $maxIndex)
{
SPE_SetNextCellStyle('border-bottom','1.5');
}
}
SPE_AddCell($retArr[$index],$cols-$indent);
SPE_EndRow();
$index++;
}
}
Thanks in advance for any help!
Something like this should work.
function substr_at_word_boundary($string, $chars = 100)
{
preg_match('/^.{0,' . $chars. '}(?:.*?)\b/iu', $string, $matches);
$new_string = $matches[0];
return ($new_string === $string) ? $string : $new_string;
}
$string = substr_at_word_boundary($string, 1600);
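The function above returns only the first piece. If the whole paragraph has to be cut into cell-sized pieces, PHP's built-in wordwrap() may be enough. This is a rough sketch, not a drop-in for the wrapper: splitAtWordBoundaries() is a made-up name, it assumes existing line breaks can be flattened to spaces, and wordwrap() counts bytes rather than multibyte characters.

```php
<?php
// Hypothetical helper: split $text into chunks of at most $limit bytes,
// breaking at spaces wherever possible.
function splitAtWordBoundaries($text, $limit)
{
    // "\n" is used as the chunk separator below, so existing whitespace
    // (including line breaks) is collapsed to single spaces first.
    $flat = preg_replace('/\s+/', ' ', trim($text));
    if ($flat === '') {
        return [];
    }
    // The final true forces a break inside any single word longer than $limit.
    return explode("\n", wordwrap($flat, $limit, "\n", true));
}

$chunks = splitAtWordBoundaries('The quick brown fox sat over the lazy dog', 15);
print_r($chunks);
```

Each element of $chunks could then be passed to SPE_AddCell() inside the BeginRow/EndRow loop from the question, and count($chunks) - 1 gives the index of the last row for the bottom border.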