I'm trying to create a basic concordance script that will print the ten words before and after the value found inside an array. I did this by splitting the text into an array, identifying the position of the value, and then printing -10 and +10 with the searched value in the middle. However, this only presents the first such occurrence. I know I can find the others by using array_keys (found in positions 52, 78, 80), but I'm not quite sure how to cycle through the matches, since array_keys also results in an array. Thus, using $matches (with array_keys) in place of $location below doesn't work, since you cannot use the same operands on an array as an integer. Any suggestions? Thank you!!
<?php
$text = <<<EOD
The spread of a deadly new virus is accelerating, Chinese President Xi Jinping warned, after holding a special government meeting on the Lunar New Year public holiday.
The country is facing a "grave situation" Mr Xi told senior officials.
The coronavirus has killed at least 42 people and infected some 1,400 since its discovery in the city of Wuhan.
Meanwhile, UK-based researchers have warned of a real possibility that China will not be able to contain the virus.
Travel restrictions have come in place in several affected cities. From Sunday, private vehicles will be banned from central districts of Wuhan, the source of the outbreak.
EOD;
$new = explode(" ", $text);
$location = array_search("in", $new, FALSE);
$concordance = 10;
$top_range = $location + $concordance;
$bottom_range = $location - $concordance;
while($bottom_range <= $top_range) {
echo $new[$bottom_range] . " ";
$bottom_range++;
}
?>
You can just iterate over the values returned by array_keys, using array_slice to extract the $concordance words either side of the location and implode to put the sentence back together again:
$words = explode(' ', $text);
$concordance = 10;
$results = array();
foreach (array_keys($words, 'in') as $idx) {
$results[] = implode(' ', array_slice($words, max($idx - $concordance, 0), $concordance * 2 + 1));
}
print_r($results);
Output:
Array
(
[0] => least 42 people and infected some 1,400 since its discovery in the city of Wuhan.
Meanwhile, UK-based researchers have warned of a
[1] => not be able to contain the virus.
Travel restrictions have come in place in several affected cities. From Sunday, private vehicles will
[2] => able to contain the virus.
Travel restrictions have come in place in several affected cities. From Sunday, private vehicles will be banned
)
If you want to avoid generating similar phrases where a word occurs twice within $concordance words (e.g. indexes 1 and 2 in the above array), you can maintain a position for the end of the last match, and skip occurrences that occur in that match:
$words = explode(' ', $text);
$concordance = 10;
$results = array();
$last = 0;
foreach (array_keys($words, 'in') as $idx) {
if ($idx < $last) continue;
$results[] = implode(' ', array_slice($words, max($idx - $concordance, 0), $concordance * 2 + 1));
$last = $idx + $concordance;
}
print_r($results);
Output
Array
(
[0] => least 42 people and infected some 1,400 since its discovery in the city of Wuhan.
Meanwhile, UK-based researchers have warned of a
[1] => not be able to contain the virus.
Travel restrictions have come in place in several affected cities. From Sunday, private vehicles will
)
Demo on 3v4l.org
Try this:
<?php
$text = <<<EOD
The spread of a deadly new virus is accelerating, Chinese President Xi Jinping warned, after holding a special government meeting on the Lunar New Year public holiday.
The country is facing a "grave situation" Mr Xi told senior officials.
The coronavirus has killed at least 42 people and infected some 1,400 since its discovery in the city of Wuhan.
Meanwhile, UK-based researchers have warned of a real possibility that China will not be able to contain the virus.
Travel restrictions have come in place in several affected cities. From Sunday, private vehicles will be banned from central districts of Wuhan, the source of the outbreak.
EOD;
$words = explode(" ", $text);
$concordance = 10; // range -+
$result = []; // result array
$index = 0;
if (count($words) === 0) // be sure there is no empty array
exit;
do {
$location = array_search("in", $words, true); // strict search; returns false when nothing is found
if ($location === false) // break loop if "in" is not found (index 0 is a valid match)
break;
$count = count($words);
// clamp the range to valid array indexes
$minRange = max($location - $concordance, 0); // index can't be less than 0
$maxRange = min($location + $concordance, $count - 1); // index can't exceed the last element
for ($range = $minRange; $range <= $maxRange; $range++) { // <= so the last word of the window is included
$result[$index][] = $words[$range]; // group words by index
}
unset($words[$location]); // delete the matched "in" so the next search finds the following one
$words = array_values($words); // reindex array (later positions shift down by one)
$index++;
} while (true); // the break above ends the loop
print_r($result); // <--- here's your results
?>
I am using
ps -l -u user
to get the running processes of a given user.
Now, when I want to split the information into arrays in PHP I am in trouble because ps outputs the data for humans to read without fixed delimiters. So you can't split with space or tab as regex.
So far I can only detect the columns by character positions.
Is there any way in php to split a string into an array at certain positions? Something like:
$array=split_columns($string, $positions=array(1, 10, 14))
to cut a string into pieces at positions 1, 10 and 14?
I decided to try a regex approach with dynamic pattern building. Not sure it is the best way, but you can give it a try:
function split_columns ($string, $indices) {
$pat = "";
foreach ($indices as $key => $id) {
if ($key==0) {
$pat .= "(.{" . $id . "})";
} else if ($key<count($indices)) {
$pat .= "(.{" . ($id-$indices[$key-1]) . "})";
}
}
$pats = '~^'.$pat.'(.*)$~m';
preg_match_all($pats, $string, $arr);
return array_slice($arr, 1);
}
$string = "11234567891234567\n11234567891234567"; // 1: '1', 2: '123456789', 3: '1234', 4: '567'
print_r (split_columns($string, $positions=array(1, 10, 14)));
See the PHP demo
The point is:
Build the pattern dynamically by checking the indices, subtracting the previous index value from each subsequent one, and appending (.*)$ at the end to match the rest of the line.
The m modifier is necessary for ^ to match the start of the line and $ the end of the line.
The array_slice($arr, 1); will remove the full match from the resulting array.
A sample regex (meeting the OP's requirements) will look like ^(.{1})(.{9})(.{4})(.*)$
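The same cuts can also be made without a regex. Here is a minimal sketch using substr, assuming the positions are ascending byte offsets and PHP 8 (where substr returns an empty string rather than false for out-of-range offsets). The name split_at_positions is made up for illustration and is not part of the answer above:

```php
<?php
// Hypothetical helper: cut $line at the given ascending byte offsets.
function split_at_positions(string $line, array $positions): array
{
    $parts = [];
    $start = 0;
    foreach ($positions as $pos) {
        // Take the bytes from the previous cut up to (not including) $pos.
        $parts[] = substr($line, $start, $pos - $start);
        $start = $pos;
    }
    // Whatever is left after the last cut becomes the final column.
    $parts[] = substr($line, $start);
    return $parts;
}

print_r(split_at_positions('11234567891234567', [1, 10, 14]));
// pieces: '1', '123456789', '1234', '567'
```

Unlike the regex version, this handles one line at a time, so for multi-line ps output you would call it once per line.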
I modified Wiktor's solution, as I don't need that much information.
function split_columns ($string, $indices) {
$pat = "";
foreach ($indices as $key => $id) {
if ($key==0) {
$pat .= "(.{" . $id . "})";
} else if ($key<count($indices)) {
$pat .= "(.{" . ($id-$indices[$key-1]) . "})";
}
}
$pats = '~^'.$pat.'(.*)$~m';
preg_match_all($pats, $string, $arr, PREG_SET_ORDER);
$arr=$arr[0];
return array_slice($arr, 1);
}
In PHP preg_split will help you here. You can split by a number of whitespaces e.g.:
<?
$text = '501 309 1 4004 0 4 0 2480080 10092 - S 0 ?? 0:36.77 /usr/sbin/cfpref
501 310 1 40004004 0 37 0 2498132 33588 - S 0 ?? 0:23.86 /usr/libexec/Use
501 312 1 4004 0 37 0 2471032 8008 - S 0 ?? 19:06.48 /usr/sbin/distno';
$split = preg_split ( '/\s+/', $text);
print_r($split);
If you know the number of columns you can then go through the array and take that number of columns as one row.
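Taking "that number of columns as one row" could look like the following sketch. It assumes every row has exactly 15 whitespace-separated fields, as in the sample above, and that no field (such as the command) contains a space of its own:

```php
<?php
// Sample rows copied from the answer above; 15 whitespace-separated fields each.
$text = '501 309 1 4004 0 4 0 2480080 10092 - S 0 ?? 0:36.77 /usr/sbin/cfpref
501 310 1 40004004 0 37 0 2498132 33588 - S 0 ?? 0:23.86 /usr/libexec/Use
501 312 1 4004 0 37 0 2471032 8008 - S 0 ?? 19:06.48 /usr/sbin/distno';

$columns = 15; // assumed column count for this sample
$fields  = preg_split('/\s+/', trim($text)); // flat list of all fields
$rows    = array_chunk($fields, $columns);   // regroup into one array per row

foreach ($rows as $row) {
    // Print the second and the last field of each row.
    echo $row[1], ' => ', $row[14], PHP_EOL;
}
```

If the last column can contain spaces, splitting each line separately with a limit, e.g. preg_split('/\s+/', $line, 15), keeps the remainder of the line together in the final field.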
I have a string that is random in nature, for example 'CBLBTTCCBB'. My goal is to count the occurrences of the string CTLBT in the string CBLBTTCCBB. Letters cannot be reused once they have been counted. For example, once CTLBT is formed, the remaining random letters for the next iteration will be BCCBB.
The scenario is that we have scratch card where users can win letters to form the word CTLBT. Based on the records of the user the letters that he won are in a string CBLBTTCCBB that is ordered from left to right based on the purchase of the scratch card.
I thought of using strpos but it seems inappropriate since it uses the exact arrangement of the substring from larger string.
Any thoughts on how to solve this?
Thanks!
Note:
Question is not a duplicate of How to count the number of occurrences of a substring in a string? since the solution posted in the given link is different. substr_count counts the occurrence of a substring from a string that assumes the string is in a correct order in which the substring will be formed.
Instead of strpos, you could use preg_replace to remove the letters of the word one at a time:
function rand_substr_count($haystack, $needle)
{
$result = $haystack;
for($i=0; $i<strlen($needle); $i++) {
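$result = preg_replace('/'.preg_quote($needle[$i], '/').'/', '', $result, 1); // remove one occurrence of this letter; preg_quote guards against regex metacharacters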
}
if (strlen($haystack) - strlen($result) == strlen($needle)) {
return 1 + rand_substr_count($result, $needle);
} else {
return 0;
}
}
echo rand_substr_count("CBLBTTCCBB", "CTLBT");
If I understood correctly I would do this (with prints for showing the results):
<?
# The string to test
$string="CBLBTTCCBBTLTCC";
# The winning word
$word="CTLBT";
# Get the letters from the word to use them as unique array keys
$letters=array_unique(str_split($word));
print("Letters needed are:\n".print_r($letters,1)."\n");
# Initialize the work array
$match=array("count" => array(),"length"=> array());
# Iterate over the letters to build the array with the number of times each letter occurs
foreach ($letters as $letter) {
$match['length'][$letter] = substr_count($word,$letter); #Number of time this letter appears in the winning word
$match['count'][$letter] = floor(substr_count($string,$letter) / $match['length'][$letter]); # count the letter (will be 0 if none present) and divide by the number of time it should appear, floor it so we have integer
}
print("Count of each letter divided by their appearance times:\n".print_r($match['count'],1)."\n");
# Get the minimum of all letter to know the number of times we can make the winning word
$wins = min($match['count']);
# And print the result
print("Wins: $wins\n");
?>
which outputs:
Letters needed are:
Array
(
[0] => C
[1] => T
[2] => L
[3] => B
)
Count of each letter divided by their appearance times:
Array
(
[C] => 5
[T] => 2
[L] => 2
[B] => 4
)
Wins: 2
Since you want to count the combinations regardless of order, the minimum of these per-letter counts is the number of times the user wins. If one letter is not present at all, the result is 0.
I'll let you turn this into a function and remove the print lines you don't need ;)
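For reference, a function version of the same idea could look like the sketch below. The name count_formable_words is made up for illustration, and it uses array_count_values and intdiv (PHP 7+) instead of substr_count and floor:

```php
<?php
// Hypothetical helper: how many times can $word be assembled from $letters,
// ignoring order and using each won letter at most once?
function count_formable_words(string $letters, string $word): int
{
    if ($word === '') {
        return 0; // nothing to form
    }
    $have = array_count_values(str_split($letters)); // letters the user has won
    $need = array_count_values(str_split($word));    // letters one word requires

    $wins = PHP_INT_MAX;
    foreach ($need as $letter => $required) {
        // A missing letter counts as 0, which forces the result to 0.
        $wins = min($wins, intdiv($have[$letter] ?? 0, $required));
    }
    return $wins;
}

echo count_formable_words("CBLBTTCCBB", "CTLBT"), PHP_EOL;      // 1
echo count_formable_words("CBLBTTCCBBTLTCC", "CTLBT"), PHP_EOL; // 2
```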
I have a MySQL database table column that has a size value in multiple formats where users manually entered different value formats.
Using PHP, I need to iterate over the DB table and process this field to grab a width and height value from each row when the column value matches the pattern we create...
Below are 90% of the values in these formats. Many are the same format but with either a single or double digit to the left or right side of a lowercase or capital X.
Using PHP, how could I match each string and strip all non-numeric characters from the values on the left and right side of the X?
left = width
right side = height
1x1
1X1
1"x1"
12x12
12X12
12"x12"
12"X12"
NULL
'' (empty field)
I just need to get these values into a width and height variable in PHP.
If I can grab everything to the left of the lowercase or capital X, as well as everything to the right of it, and strip all non-numbers, then I think it would work easily.
There are other values as well, and those should be ignored as they will not fit the pattern. Below are examples of some of the odd values I have found so far...
18" channel letters
64x20 x 2
Glass Dimensions: 12"x72"
172.61 cm x 28.46 cm
230.15 cm x 42.07 cm
24x24 Interior Double Sided
These types of values should be ignored so I can manually edit them later.
I've written up a function called rough_strip_all that should strip all characters in a string except for those listed. Adding this step may resolve the issue for you, but if it doesn't, you might have to look into recompiling to enable the UTF8 support for PCRE.
<?php
// Strips out all characters except for those in allowed set
function rough_strip_all( $string, $allowed_set = '0123456789x. ' )
{
// Takes the allowed set, splits it into character by character,
// then converts each character in the array to its ASCII value
$allowed_ascii = array_map( function($a) {
return ord( $a );
}, str_split( $allowed_set ) );
$return = '';
for( $i = 0, $ilen = strlen( $string ); $i < $ilen; $i++ ) // strlen, because ord() works on single bytes
{
// Check if the ASCII value of the current byte is in the list of allowed
// ASCII characters given. If it is, add it to the return string
$ascii = ord( $string[$i] ); // the $string{$i} syntax was removed in PHP 8
if( in_array( $ascii, $allowed_ascii ) )
{
$return .= $string[$i];
}
}
// Returns the newly compiled string
return $return;
}
// Original string
$string = "Misc text: 35.25”x 21.00” 123 extra text 456";
// Display original string
echo "Original string: {$string}<br />";
// Strips out all characters except the following: '0123456789x. '
$string = rough_strip_all( strtolower( $string ) );
// Strip out all characters except for numbers, letter x, decimal points, and spaces
$string = preg_replace( '/([^0-9x \.])/ui', '', $string );
// Find anything that fits the number X number format (including decimal numbers)
preg_match( '/([0-9]+(\.[0-9]+)?) ?x ?([0-9]+(\.[0-9]+)?)/ui', $string, $values );
// Match found
if( !empty( $values ) )
{
// Set dimensions in easy to read variables
$dimension_a = $values[1];
$dimension_b = $values[3];
// Values returned
echo "Dimension A: {$dimension_a}<br />";
echo "Dimension B: {$dimension_b}<br />";
}
// No match found
else
{
echo "No match found.";
}
?>
This should also work for the additional outliers you added, as it strips out all non-essential characters first and then attempts to make a match. I've also added some display logic so you can see the original string and what each dimension is after it's been processed, or a message if there has been no match.
Even easier would be preg_match_all("/[0-9]+/", $string, $matches);
Test cases:
1x1
1\"x1
12X12
Array ( [0] => Array ( [0] => 1 [1] => 1 ) )
Array ( [0] => Array ( [0] => 1 [1] => 1 ) )
Array ( [0] => Array ( [0] => 12 [1] => 12 ) )
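Note that grabbing every run of digits also "matches" the odd values that were supposed to be skipped (for example 64x20 x 2). One way to accept only the clean formats and leave everything else for manual editing is to anchor the pattern at both ends. A minimal sketch, assuming whole-number sizes with an optional straight double quote after each number; parse_size is a made-up name:

```php
<?php
// Hypothetical helper: returns ['width' => int, 'height' => int] for values
// such as 1x1, 12X12 or 12"x12", and null for anything else.
function parse_size(?string $value): ?array
{
    if ($value === null) {
        return null; // NULL column
    }
    // ^...$ makes the whole value fit the pattern; the i flag accepts x and X.
    if (preg_match('/^\s*(\d+)"?\s*x\s*(\d+)"?\s*$/i', $value, $m) !== 1) {
        return null; // empty or odd value, leave it for manual editing
    }
    return ['width' => (int) $m[1], 'height' => (int) $m[2]];
}

foreach (['12"X12"', '64x20 x 2', 'Glass Dimensions: 12"x72"', ''] as $raw) {
    $size = parse_size($raw);
    echo var_export($raw, true), ' => ', $size ? "{$size['width']} by {$size['height']}" : 'skipped', PHP_EOL;
}
```

Decimal values such as 172.61 cm x 28.46 cm are deliberately rejected here, since the question asks for those to be handled by hand.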
I have a range of whole numbers that might or might not have some numbers missing. Is it possible to find the smallest missing number without using a loop structure? If there are no missing numbers, the function should return the maximum value of the range plus one.
This is how I solved it using a for loop:
$range = [0,1,2,3,4,6,7];
// sort just in case the range is not in order
asort($range);
$range = array_values($range);
$first = true;
for ($x = 0; $x < count($range); $x++)
{
// don't check the first element
if ( ! $first )
{
if ( $range[$x - 1] + 1 !== $range[$x])
{
echo $range[$x - 1] + 1;
break;
}
}
// if we're on the last element, there are no missing numbers
if ($x + 1 === count($range))
{
echo $range[$x] + 1;
}
$first = false;
}
Ideally, I'd like to avoid looping completely, as the range can be massive. Any suggestions?
Algo solution
There is a way to check whether a number is missing using a formula (explained here). To add the numbers from 1 to 100, we don't need to sum them one by one; we just compute (100 * (100 + 1)) / 2. So how is this going to solve our issue?
We take the first and the last element of the array and calculate the expected sum with this formula. We then use array_sum() to calculate the actual sum. If the results are the same, no number is missing. Otherwise we can "backtrack" the missing number by subtracting the actual sum from the expected one. This of course only works if exactly one number is missing (and, as written below, if the range starts at 1 once the zero is dropped); it will fail if several are missing. So let's put this in code:
$range = range(0,7); // Creating an array
echo check($range) . "\r\n"; // check
unset($range[3]); // unset offset 3
echo check($range); // check
function check($array){
if($array[0] == 0){
unset($array[0]); // get rid of the zero
}
sort($array); // sorting
$first = reset($array); // get the first value
$last = end($array); // get the last value
$sum = ($last * ($first + $last)) / 2; // the algo
$actual_sum = array_sum($array); // the actual sum
if($sum == $actual_sum){
return $last + 1; // no missing number
}else{
return $sum - $actual_sum; // missing number
}
}
Output
8
3
Online demo
If there are several numbers missing, then just use array_map() or something similar to do an internal loop.
Regex solution
Let's take this to a new level and use regex ! I know it's nonsense, and it shouldn't be used in real world application. The goal is to show the true power of regex :)
So first let's make a string out of our range in the following format: I,II,III,IIII for the range 1 to 4.
$range = range(0,7);
if($range[0] === 0){ // get rid of 0
unset($range[0]);
}
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
echo $str;
The output should be something like: I,II,III,IIII,IIIII,IIIIII,IIIIIII.
I've come up with the following regex: ^(?=(I+))(^\1|,\2I|\2I)+$. So what does this mean ?
^ # match begin of string
(?= # positive lookahead, we use this to not "eat" the match
(I+) # match I one or more times and put it in group 1
) # end of lookahead
( # start matching group 2
^\1 # match begin of string followed by what's matched in group 1
| # or
,\2I # match a comma, with what's matched in group 2 (recursive !) and an I
| # or
\2I # match what's matched in group 2 and an I
)+ # repeat one or more times
$ # match end of line
Let's see what's actually happening ....
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^
(I+) do not eat but match I and put it in group 1
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^
^\1 match what was matched in group 1, which means I gets matched
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^^^ ,\2I match a comma, then what was matched in group 2 (one I in this case), and add an I to it
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^^^^ \2I match what was matched previously in group 2 (,II in this case) and add an I to it
I,II,III,IIII,IIIII,IIIIII,IIIIIII
^^^^^ \2I match what was matched previously in group 2 (,III in this case) and add an I to it
We're moving forward since there is a + sign which means match one or more times,
this is actually a recursive regex.
We put the $ to make sure it's the end of string
If the number of I's don't correspond, then the regex will fail.
See it working and failing. Let's put it in PHP code:
$range = range(0,7);
if($range[0] === 0){
unset($range[0]);
}
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
if(preg_match('#^(?=(I*))(^\1|,\2I|\2I)+$#', $str)){
echo 'works !';
}else{
echo 'fails !';
}
Now let's also return the number that's missing. We remove the $ end anchor so the regex does not fail, and we use group 2 to find the missing number:
$range = range(0,7);
if($range[0] === 0){
unset($range[0]);
}
unset($range[2]); // remove 2
$str = implode(',', array_map(function($val){return str_repeat('I', $val);}, $range));
preg_match('#^(?=(I*))(^\1|,\2I|\2I)+#', $str, $m); // REGEEEEEX !!!
$n = strlen($m[2]); //get the length ie the number
$sum = array_sum($range); // array sum
if($n == $sum){
echo $n + 1; // no missing number
}else{
echo $n - 1; // missing number
}
Online demo
EDIT: NOTE
This question is about performance. Functions like array_diff and array_filter are not magically fast. They can add a huge time penalty. Replacing a loop in your code with a call to array_diff will not magically make things fast, and will probably make things slower. You need to understand how these functions work if you intend to use them to speed up your code.
This answer uses the assumption that no items are duplicated and no invalid elements exist to allow us to use the position of the element to infer its expected value.
This answer is theoretically the fastest possible solution if you start with a sorted list. The solution posted by Jack is theoretically the fastest if sorting is required.
In the series [0,1,2,3,4,...], the n'th element has the value n if no elements before it are missing. So we can spot-check at any point to see if our missing element is before or after the element in question.
So you start by cutting the list in half and checking to see if the item at position x = x
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
Yup, list[4] == 4. So move halfway from your current point to the end of the list.
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
Uh-oh, list[6] == 7. So somewhere between our last checkpoint and the current one, one element was missing. Divide the difference in half and check that element:
[ 0 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 ]
^
In this case, list[5] == 5
So we're good there. So we take half the distance between our current check and the last one that was abnormal. And oh.. it looks like cell n+1 is one we already checked. We know that list[6]==7 and list[5]==5, so the element number 6 is the one that's missing.
Since each step divides the number of elements to consider in half, you know that your worst-case performance is going to check no more than log2 of the total list size. That is, this is an O(log(n)) solution.
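The walkthrough above can be sketched as a standard binary search. This is only an illustration under the same assumptions as the answer: a zero-indexed, sorted list of unique non-negative integers that starts at 0. The name first_missing_sorted is made up:

```php
<?php
// Hypothetical helper: smallest missing integer in a sorted, duplicate-free
// list that starts at 0. Returns count($list) when nothing is missing.
function first_missing_sorted(array $list): int
{
    $lo = 0;            // everything before $lo is known to be in place
    $hi = count($list); // the first gap is at or before $hi

    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if ($list[$mid] === $mid) {
            $lo = $mid + 1; // no gap up to $mid, look to the right
        } else {
            $hi = $mid;     // a gap occurs at or before $mid, look to the left
        }
    }
    return $lo;
}

echo first_missing_sorted([0, 1, 2, 3, 4, 5, 7, 8, 9]), PHP_EOL; // 6
echo first_missing_sorted(range(0, 7)), PHP_EOL;                 // 8
```

Each iteration halves the interval between $lo and $hi, which is where the O(log(n)) bound comes from.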
If this whole arrangement looks familiar, it's because you learned it back in your second year of college in a Computer Science class. It's a minor variation on the binary search algorithm, one of the most widely used index schemes in the industry. Indeed, this question appears to be a perfectly contrived application for this searching technique.
You can of course repeat the operation to find additional missing elements, but since you've already tested the values at key elements in the list, you can avoid re-checking most of the list and go straight to the interesting ones left to test.
Also note that this solution assumes a sorted list. If the list isn't sorted then obviously you sort it first. Except, binary searching has some notable properties in common with quicksort. It's quite possible that you can combine the process of sorting with the process of finding the missing element and do both in a single operation, saving yourself some time.
Finally, summing up the list is just a simple math trick thrown in for good measure. The sum of the numbers from 1 to N is N*(N+1)/2, and if you've already determined that some elements are missing, then obviously just subtract the missing ones.
Technically, you can't really do without the loop (unless you only want to know if there's a missing number). However, you can accomplish this without first sorting the array.
The following algorithm uses O(n) time with O(n) space:
$range = [0, 1, 2, 3, 4, 6, 7];
$N = count($range);
$temp = str_repeat('0', $N); // assume all values are out of place
foreach ($range as $value) {
if ($value < $N) {
$temp[$value] = 1; // value is in the right place
}
}
// count number of leading ones
echo strspn($temp, '1'), PHP_EOL;
It builds an ordered identity map of N entries, marking each value against its position as "1"; in the end all entries must be "1", and the first "0" entry is the smallest value that's missing.
Btw, I'm using a temporary string instead of an array to reduce physical memory requirements.
I honestly don't get why you wouldn't want to use a loop. There's nothing wrong with loops. They're fast, and you simply can't do without them. However, in your case, there is a way to avoid having to write your own loops, using PHP core functions. They do loop over the array, though, but you simply can't avoid that.
Anyway, I gather that what you're after can easily be written in 3 lines:
function highestPlus(array $in)
{
$compare = range(min($in), max($in));
$diff = array_diff($compare, $in);
return empty($diff) ? max($in) +1 : reset($diff);//array_diff preserves keys, so $diff[0] may not exist
}
Tested with:
echo highestPlus(range(0,11));//echoes 12
$arr = array(9,3,4,1,2,5);
echo highestPlus($arr);//echoes 6
And now, to shamelessly steal Pé de Leão's answer (but "augment" it to do exactly what you want):
function highestPlus(array $range)
{//an unreadable one-liner... horrid, so don't, but know that you can...
return min(array_diff(range(0, max($range)+1), $range));//max+1 is always in the diff, so min() never sees an empty array; note this version counts from 0
}
How it works:
$compare = range(min($in), max($in));//range(lowest value in array, highest value in array)
$diff = array_diff($compare, $in);//get all values present in $compare, that aren't in $in
return empty($diff) ? max($in) +1 : reset($diff);
//-------------------------------------------------
// read as:
if (empty($diff))
{//every number in min-max range was found in $in, return highest value +1
return max($in) + 1;
}
//there were numbers in min-max range, not present in $in, return first missing number
//(array_diff preserves keys, so use reset() rather than $diff[0]):
return reset($diff);
That's it, really.
Of course, if the supplied array might contain null or falsy values, or even strings, and duplicate values, it might be useful to "clean" the input a bit:
function highestPlus(array $in)
{
$clean = array_filter(
$in,
'is_numeric'//or even is_int
);
$compare = range(min($clean), max($clean));
$diff = array_diff($compare, $clean);//duplicates aren't an issue here
return empty($diff) ? max($clean) + 1 : reset($diff);
}
Useful links:
The array_diff man page
The max and min functions
Good Ol' range, of course...
The array_filter function
The array_map function might be worth a look
Just as array_sum might be
$range = array(0,1,2,3,4,6,7);
// sort just in case the range is not in order
asort($range);
$range = array_values($range);
$indexes = array_keys($range);
$diff = array_diff($indexes,$range);
echo reset($diff); // >> will print: 5 (array_diff preserves keys, so $diff[0] is not set here)
// if $diff is an empty array - you can print
// the "maximum value of the range plus one": $range[count($range)-1]+1
echo min(array_diff(range(0, max($range)+1), $range));
Simple
$array1 = array(0,1,2,3,4,5,6,7);// array with actual number series
$array2 = array(0,1,2,4,6,7); // array with your custom number series
$missing = array_diff($array1,$array2);
sort($missing);
echo $missing[0];
$range = array(0,1,2,3,4,6,7);
$max=max($range);
$expected_total=($max*($max+1))/2; // sum if no number was missing.
$actual_total=array_sum($range); // sum of the input array.
if($expected_total==$actual_total){
echo $max+1; // no difference, so no missing number: echo the maximum plus one.
}else{
echo $expected_total-$actual_total; // the difference will be the missing number.
}
you can use array_diff() like this
<?php
$range = array("0","1","2","3","4","6","7","9");
asort($range);
$len=count($range);
if($range[$len-1]==$len-1){
$r=$range[$len-1];
}
else{
$ref= range(0,$len-1);
$result = array_diff($ref,$range);
$r=implode($result);
}
echo $r;
?>
function missing( $v ) {
static $p = -1;
$d = $v - $p - 1;
$p = $v;
return $d?1:0;
}
$result = array_search( 1, array_map( "missing", $ARRAY_TO_TEST ) );