I'm creating an Excel (CSV) file with a bunch of different data points that I'm scraping off the web with Python.
One of those data points is a nested array, which becomes a string either when it's inserted into the CSV file or when it's read by the PHP file on my server.
The whole idea behind using the nested array is so that I can insert each pair of images and thumbnails into their respective columns in a single row on a separate MySQL table.
Nested Array
images_and_thumbnails = [
['https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg']
]
PHP Script to Process the Excel File
$str2 = 'INSERT INTO deals_images_and_thumbnails (asin, image, thumbnail) VALUES (:asin, :image, :thumbnail)';
$sta2 = $conn->prepare($str2);
$file = fopen($_SESSION['file'], 'r');
while (!feof($file)) {
    while ($row = fgetcsv($file)) {
        if (count($row) === 31) {
            // $asin is not defined in this excerpt
            $images_and_thumbnails = $row[8]; // this is a string, not an array
            foreach ($images_and_thumbnails as $value) {
                $sta2->execute([
                    'asin' => $asin,
                    'image' => $value[0],
                    'thumbnail' => $value[1]
                ]);
            }
        }
    }
}
The issue is that $images_and_thumbnails is a string, which is obviously "an invalid argument" for the foreach loop.
Is there any way to convert the string back to an array?
Will simply removing the double quotes do the job?
If the format of $images_and_thumbnails is fixed, you could use explode to split it up:
$images_and_thumbnails = "[
['https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg']
]";
foreach (explode('],', $images_and_thumbnails) as $i_and_t) {
$value = explode("', '", trim($i_and_t, "[]' \t\r\n"));
print_r($value);
}
However if it can be variable with spacing, it is better to use preg_split:
foreach (preg_split('/\'\s*\]\s*,\s*\[\s*\'/', $images_and_thumbnails) as $i_and_t) {
$value = preg_split('/\'\s*,\s*\'/', trim($i_and_t, "[]' \t\r\n"));
print_r($value);
}
If you're 100% certain that the data is safe, you could also eval it i.e.
eval ("\$images_and_thumbnails = [
['https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg']
];");
print_r($images_and_thumbnails);
The explode and preg_split methods print each pair in turn, as shown below; the eval version holds the same pairs nested inside a single outer array:
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg
)
Array (
[0] => https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg
[1] => https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg
)
A single call of preg_match_all() with the PREG_SET_ORDER flag will set up a multidimensional array that makes isolating your desired data a snap. Furthermore, if you wanted to validate the input data, you could write a stricter pattern to ensure you are getting valid jpg strings.
If this was my task and I had no control over the format of the input data, this is how I would parse it. One call does it all.
Code:
$string = <<<STRING
images_and_thumbnails = [
['https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg'],
['https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg', 'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg']
]
STRING;
if (preg_match_all("~\s*\['([^']*)',\s*'([^']*)']~", $string, $out, PREG_SET_ORDER)) {
foreach ($out as $row) {
var_export($row); // to demonstrate what is generated
$image = $row[1]; // for your actual usage
$thumbnail = $row[2]; // for your actual usage
echo "\n---\n";
}
}
Output:
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/615JCt72MXL._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41rExpVS75L._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/71Ss5tJW-4L._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41RpAwvZJ5L._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/6157znz2BeL._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41mSje9rDSL._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/815wlLde-gL._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/51jty5d4BpL._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/71D2gVlCUOL._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41kCBJYI%2BCL._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/71EfsMWdx0L._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41utl4%2B%2B%2BoL._US40_.jpg',
)
---
array (
0 => ' [\'https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg\', \'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg\']',
1 => 'https://images-na.ssl-images-amazon.com/images/I/61m4mFpIvVL._UY575_.jpg',
2 => 'https://images-na.ssl-images-amazon.com/images/I/41S27BGn0UL._US40_.jpg',
)
---
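To connect the pattern above back to the loop in the question, the matching could be wrapped in a small helper that returns plain [image, thumbnail] pairs. This is only a sketch: the name parseImagePairs is made up for illustration, the example.com URLs are placeholders, and it assumes every URL is single-quoted and contains no single quote itself.

```php
<?php
// Hypothetical helper (not from the answers above): turn the stringified
// nested list from the CSV cell into an array of [image, thumbnail] pairs.
function parseImagePairs(string $cell): array
{
    $pairs = [];
    // Capture the two single-quoted values inside each [ ... ] group.
    if (preg_match_all("~\[\s*'([^']*)'\s*,\s*'([^']*)'\s*\]~", $cell, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $pairs[] = [$match[1], $match[2]];
        }
    }
    return $pairs; // empty array when nothing matched
}

// Placeholder data standing in for the value read from the CSV column.
$cell = "[['https://example.com/a.jpg', 'https://example.com/a_thumb.jpg'], "
      . "['https://example.com/b.jpg', 'https://example.com/b_thumb.jpg']]";

foreach (parseImagePairs($cell) as [$image, $thumbnail]) {
    echo $image, ' => ', $thumbnail, "\n";
}
```

Inside the fgetcsv() loop from the question, iterating over parseImagePairs($row[8]) in the same way would give the two values to pass to the prepared statement.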
With some help from this forum I found a way to get an array that perfectly fits my task. I ran into a follow-up problem, though:
I have the following example array:
Array (
[A] => Array (
[D] => Array (
[A] => Array (
[M] => Array (
[result] => ADAM )
[N] => Array (
[result] => ADAN )
)
)
)
[H] => Array (
[E] => Array (
[N] => Array (
[R] => Array (
[Y] => Array (
[result] => HENRY )
)
[N] => Array (
[E] => Array (
[S] => Array (
[result] => HENNES )
)
)
)
)
)
)
The letters are the indexes, and I end up with a result array for each name.
Now I am looking for a way to search this array with a specific search string, and it should be possible from the third character on. So if I search for 'ADA', I want to get the value from all following result arrays, which would be "ADAM" and "ADAN", as both follow on Array['A']['D']['A'].
I didn't have any trouble starting the search at the right index, but I can't figure out a way to access all the 'result' arrays. I only found ways to search for a final value (ADAM, ADAN), but as stated, I'm looking for all final values reachable from my search point.
So basically I want to get all values from the result arrays that follow the last character of my search string as an index of my array. Hopefully that explanation points out what I'm looking for.
Thanks in advance!
In short:
//my Input
$searchstring = 'ADA';
//Output i want
"ADAM", "ADAN";
//Input
$searchstring = 'ADAM';
//Output
"ADAM"
EDIT: I edited this question to include my approach so far, as a comment pointed out I should (thanks for that!). I tried to go this way:
Once I had my example array, I tried to select only the necessary part of the structure:
$searchquery = 'HEN';
//search query as an array
$check = str_split($searchquery);
//second instance of the original array, which is named $result
$finalsearch = $result;
foreach($check as $key) {
    $finalsearch = $finalsearch[$key];
}
//check with output whether I selected the right area
print_r($finalsearch);
Output I got from this:
Array ( [R] => Array ( [Y] => Array ( [result] => HENRY ) ) [N] => Array ( [E] => Array ( [S] => Array ( [result] => HENNES ) ) ) )
So I am in the right area of the structure.
Then I tried to find ways to search for all instances of the index 'result'.
I found the following functions and approaches, all of which let me search for a specific value but not for the indexes.
$newArray = array_values($finalsearch);
array_search($searchquery, $finalsearch);
That was the point where I started going in circles.
The first part is to find the start point for the list; this is just a case of looping over each character in the search string and moving on to that value in the array.
Once you have found the start point, you can use array_walk_recursive() which will only visit the leaf nodes - so this will only be the names (in this case), so create a list of all these nodes and return them...
function getEntry ( array $result, string $search ) {
    for($i = 0; isset($search[$i]); $i++){
        if (!isset($result[$search[$i]])) {
            return []; // the search string does not exist in the array
        }
        $result = $result[$search[$i]];
    }
$output = [];
array_walk_recursive($result, function ( $data ) use (&$output) {
$output[] = $data;
});
return $output;
}
$searchstring = 'ADA';
print_r(getEntry($result, $searchstring));
which should give...
Array
(
[0] => ADAM
[1] => ADAN
)
This script first iterates over the characters of $searchstring, descending one level of the array per character. If every key is found, it walks the remaining sub-array recursively and adds all result values to the $results array. After that it implodes the $results array and echoes it.
$searchstring = 'HE';
$results = [];
for( $i = 0; $i < strlen( $searchstring ); $i++ ) {
    // @ suppresses the warning for a missing key; die() then reports it
    $sub = @( isset( $sub ) ? $sub[$searchstring[$i]] : $array[$searchstring[$i]] )
        or die( 'no results found' );
}
array_walk_recursive( $sub, function( $value ) use ( &$results ) {
    $results[] = $value;
});
echo implode( ', ', $results );
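The answers above collect every leaf below the start point. If the structure could ever hold leaves other than result, a variant like the following would restrict the collection to result keys and return an empty list when the prefix is missing. This is a sketch only; findByPrefix and collectResults are illustrative names, and the sample tree is a cut-down copy of the one in the question.

```php
<?php
// Walk down the nested array one character at a time;
// return [] as soon as a character has no matching key.
function findByPrefix(array $tree, string $prefix): array
{
    $node = $tree;
    foreach (str_split($prefix) as $char) {
        if (!isset($node[$char]) || !is_array($node[$char])) {
            return [];
        }
        $node = $node[$char];
    }
    return collectResults($node);
}

// Recursively gather only the values stored under a 'result' key.
function collectResults(array $node): array
{
    $found = [];
    foreach ($node as $key => $value) {
        if ($key === 'result') {
            $found[] = $value;
        } elseif (is_array($value)) {
            $found = array_merge($found, collectResults($value));
        }
    }
    return $found;
}

$tree = [
    'A' => ['D' => ['A' => [
        'M' => ['result' => 'ADAM'],
        'N' => ['result' => 'ADAN'],
    ]]],
];

print_r(findByPrefix($tree, 'ADA'));  // ADAM and ADAN
print_r(findByPrefix($tree, 'ADAM')); // ADAM only
print_r(findByPrefix($tree, 'XYZ'));  // empty array
```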
I am writing code in PHP to collect all the hashtags which I've used in all my media posts and see in how many posts I've used the hashtag and how many likes the post with that hashtag received in total.
I have collected all of the media posts in my database and am now able to export this information. Here is an example of the multidimensional array which is being output:
Array
(
[0] => Array
(
[id] => 1
[caption] => #londra #london #london_only #toplondonphoto #visitlondon #timeoutlondon #londres #london4all #thisislondon #mysecretlondon #awesomepix #passionpassport #shootermag #discoverearth #moodygrams #agameoftones #neverstopexploring #beautifuldestinations #artofvisuals #roamtheplanet #jaw_dropping_shots #fantastic_earth #visualsoflife #bdteam #nakedplanet #ourplanetdaily #earthfocus #awesome_earthpix #exploretocreate #londoneye
[likesCount] => 522
)
[1] => Array
(
[id] => 2
[caption] => #londra #london #london_only #toplondonphoto #visitlondon #timeoutlondon #londres #london4all #thisislondon #mysecretlondon #awesomepix #passionpassport #shootermag #discoverearth #moodygrams #agameoftones #neverstopexploring #beautifuldestinations #artofvisuals #roamtheplanet #jaw_dropping_shots #fantastic_earth #visualsoflife #bdteam #nakedplanet #ourplanetdaily #earthfocus #awesome_earthpix #harrods #LDN4ALL_One4All
[likesCount] => 1412
)
)
I am able to separate these hashtags out using the following function:
function getHashtags($string) {
$hashtags= FALSE;
preg_match_all("/(#\w+)/u", $string, $matches);
if ($matches) {
$hashtagsArray = array_count_values($matches[0]);
$hashtags = array_keys($hashtagsArray);
}
return $hashtags;
}
Now I want to create a multidimensional array for each hashtag which should look like this:
Array
(
[0] => Array
(
[hash] => #londra
[times_used] => 2
[total_likes] => 153
)
[1] => Array
(
[hash] => #london
[times_used] => 12
[total_likes] => 195
)
)
I am quite new to this and not sure how to achieve this. Help and suggestions are appreciated!
It would be easier to use the hashtags as keys in your array. You can
transform it later to your final format if you want to. The idea is to
traverse your input array and within each element iterate on the given
hashtags string, increasing counters.
And if your hashtags are always in a string like that, separated by
whitespace, you can also get an array of them with explode() or
preg_split() for finer control.
$posts = []; # your input array goes here
$tags = [];
foreach ($posts as $post) {
$hashtags = explode(' ', $post['caption']);
foreach ($hashtags as $tag) {
if (!key_exists($tag, $tags)) {
# first time seeing this one, initialize an entry
$tags[$tag]['counter'] = 0;
$tags[$tag]['likes'] = 0;
}
$tags[$tag]['counter']++;
$tags[$tag]['likes'] += $post['likesCount'];
}
}
Transforming to something closer to your original request:
$result = array_map(function($hashtag, $data) {
return [
'hash' => $hashtag,
'times_used' => $data['counter'],
'total_likes' => $data['likes'],
'average_likes' => $data['likes'] / $data['counter'],
];
}, array_keys($tags), $tags);
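The two steps above could also be folded into one function. The sketch below assumes the caption and likesCount keys shown in the question, and it swaps explode() for the preg_match_all() pattern from the question so that repeated spaces or stray punctuation do not turn into tags. tallyHashtags is a made-up name, and counting a tag at most once per post is an assumption about the desired behaviour.

```php
<?php
// Hypothetical helper: build the list of hash / times_used / total_likes rows.
function tallyHashtags(array $posts): array
{
    $tags = [];
    foreach ($posts as $post) {
        preg_match_all('/#\w+/u', $post['caption'], $matches);
        // array_unique(): a tag repeated inside one caption counts once for that post.
        foreach (array_unique($matches[0]) as $tag) {
            if (!isset($tags[$tag])) {
                $tags[$tag] = ['hash' => $tag, 'times_used' => 0, 'total_likes' => 0];
            }
            $tags[$tag]['times_used']++;
            $tags[$tag]['total_likes'] += $post['likesCount'];
        }
    }
    return array_values($tags); // drop the tag keys to get a plain list
}

// Shortened sample data in the shape shown in the question.
$posts = [
    ['id' => 1, 'caption' => '#london #londoneye', 'likesCount' => 522],
    ['id' => 2, 'caption' => '#london #harrods', 'likesCount' => 1412],
];

print_r(tallyHashtags($posts));
```

Keeping the tag as the array key while counting makes each lookup a single isset() call; array_values() then produces the numerically indexed format from the question.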
When I do a print_r on my $_POST, I have an array that may look like this:
Array
(
[action] => remove
[data] => Array
(
[row_1] => Array
(
[DT_RowId] => row_1
[name] => Unit 1
[item_price] => 150.00
[active] => Y
[taxable] => Y
[company_id] => 1
)
)
)
The row_1 value can be anything formatted like row_?
I want that number as a string, whatever the number is. That key and the DT_RowID value will always be the same if that helps.
Right now I am doing this, but it seems like a bad way of doing it:
//the POST is a multidimensional array... the key inside the 'data' array has the id in it, like this: row_2. I'm getting the key here and then removing the letters to get only the id number.
foreach ($_POST['data'] AS $key => $value) {
$id_from_row_value = $key;
}
//get only number from key = still returning an array
preg_match_all('!\d+!', $id_from_row_value, $just_id);
//found I had to use [0][0] since it's still a multidimensional array to get the id value
$id = $just_id[0][0];
It works, but I'm guessing there's a faster way of getting that number from the $_POST array.
<?php
$array = [
'data' => [
'row_1' => [],
'row_2' => [],
]
];
$nums = [];
foreach ($array['data'] as $key => $val) {
$nums[] = preg_replace('#[^\d]#', '', $key);
}
var_export($nums);
Outputs:
array (
0 => '1',
1 => '2',
)
Please remember that regular expressions, such as those used by preg_match(), are not the fastest solution. What I would do is split the string on _ and take the second part, like this:
$rowId = explode("_", "row_2")[1];
And put that into your loop to process all elements.
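When the data array holds a single row, as in the question, the loop may not be needed at all. A minimal sketch, assuming PHP 7.3 or later for array_key_first() and using a hard-coded array ($post, an illustrative name) in place of $_POST:

```php
<?php
// Stand-in for $_POST; the real request data would be used instead.
$post = [
    'action' => 'remove',
    'data'   => [
        'row_7' => ['DT_RowId' => 'row_7', 'name' => 'Unit 1'],
    ],
];

// First (and here only) key of the 'data' array.
$key = array_key_first($post['data']);

// Split on the first underscore only and keep the part after it;
// null if the key has no underscore.
$id = explode('_', $key, 2)[1] ?? null;

var_dump($id); // string(1) "7"
```

Note that the loop in the question ends up with the last key, while array_key_first() takes the first one; with a single row the two are the same.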
I am having trouble sorting through a multidimensional array that I am pulling from an XML feed that can be different every time. I need to find something and place the result in a variable. I am still learning PHP and this unfortunately is a bit over my head.
I'll break down what I have. An example of my array contained in $valsarray:
Array
( [0] => Array
(
[tag] => GIVENNAME
[type] => complete
[level] => 8
[value] => peter
)
[1] => Array
(
[tag] => FAMILYNAME
[type] => complete
[level] => 8
[value] => rabbit
)
[2] => Array
(
[tag] => COMPLETENUMBER
[type] => complete
[level] => 9
[value] => 123-345-4567
)
[3] => Array
(
[tag] => URIID
[type] => complete
[level] => 9
[value] => customerEmail@gmail.com
)
)
Now I understand that I can get the result by using: $phone = $valsarray[2]['value'];
However, my problem is that if no phone number was given, the XML feed will not contain the phone number array so Array 3 would become Array 2.
So my question is how would I go about looping through the arrays to find if COMPLETENUMBER exists and then assigning the phone number contained in value to a $phone variable?
Here's one way:
$tags = array_column($valsarray, null, 'tag');
if(isset($tags['COMPLETENUMBER'])) {
$phone = $tags['COMPLETENUMBER']['value'];
}
Or if you only care about value:
$tags = array_column($valsarray, 'value', 'tag');
if(isset($tags['COMPLETENUMBER'])) {
$phone = $tags['COMPLETENUMBER'];
}
So in short:
Get an array of the value values indexed by tag
If COMPLETENUMBER index is set then get the value from value
After the array_column() you can then get whatever value you want:
$email = $tags['URIID'];
This loop would do it:
foreach($valsarray as $fieldArray)
{
if ($fieldArray['tag'] === 'COMPLETENUMBER')
{
$phone = $fieldArray['value'];
break;
}
}
If you need to do this type of thing repeatedly on the same array, you'd be better off reindexing it than searching each time. You could reindex like this:
foreach($valsarray as $key => $fieldArray)
{
$valsarray[$fieldArray['tag']] = $fieldArray;
unset($valsarray[$key]);
}
After reindexing it, you can do this for any field you want:
$phone = $valsarray['COMPLETENUMBER']['value'];
You can array_filter to get only COMPLETENUMBER entries, and set $phone if one is found.
$items = array_filter($valsarray, function($x) { return $x['tag'] == 'COMPLETENUMBER'; });
$phone = $items ? reset($items)['value'] : null;
Based on your other comments, if you want to get the values for a subset of tags from the array, you can use in_array in the array_filter callback. This could be wrapped in the array_column suggested in AbraCadaver's answer to get an array of values for any of the tags you're interested in:
$tags = ['COMPLETENUMBER', 'URIID'];
$data = array_column(array_filter($valsarray, function($x) use ($tags) {
return in_array($x['tag'], $tags);
}), 'value', 'tag');
The result would be like:
array (size=2)
'COMPLETENUMBER' => string '123-345-4567' (length=12)
'URIID' => string 'customerEmail@gmail.com' (length=23)
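If the lookup is needed in several places, the array_column() approach could be wrapped in a function with a fallback for missing tags. A sketch, with getTagValue as an illustrative name and a shortened copy of the question's array without the phone entry:

```php
<?php
// Hypothetical helper: return the value for a tag, or $default if the tag is absent.
function getTagValue(array $valsarray, string $tag, $default = null)
{
    // Build a tag => value map, then look the tag up.
    $values = array_column($valsarray, 'value', 'tag');
    return $values[$tag] ?? $default;
}

$valsarray = [
    ['tag' => 'GIVENNAME',  'type' => 'complete', 'level' => 8, 'value' => 'peter'],
    ['tag' => 'FAMILYNAME', 'type' => 'complete', 'level' => 8, 'value' => 'rabbit'],
];

var_dump(getTagValue($valsarray, 'GIVENNAME'));      // string(5) "peter"
var_dump(getTagValue($valsarray, 'COMPLETENUMBER')); // NULL, the tag is not present
```

Because the lookup goes by tag name, it does not matter whether a missing phone number shifts the numeric positions of the remaining entries.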
Can someone please put me out of my misery and explain why I'm missing the middle value when I try to push the results of a preg_match into another array? It's either something silly or a vast gap in my understanding. Either way I need help. Here is my code:
<?php
$text = 'The group, which gathered at the Somerfield depot in Bridgwater, Somerset,
on Thursday night, complain that consumers believe foreign meat which has been
processed in the UK is British because of inadequate labelling.';
$word = 'the';
preg_match_all("/\b" . $word . "\b/i", $text, $matches, PREG_OFFSET_CAPTURE);
$word_pos = array();
for($i = 0; $i < sizeof($matches[0]); $i++){
$word_pos[$matches[0][$i][0]] = $matches[0][$i][1];
}
echo "<pre>";
print_r($matches);
echo "</pre>";
echo "<pre>";
print_r($word_pos);
echo "</pre>";
?>
I get this output:
Array
(
[0] => Array
(
[0] => Array
(
[0] => The
[1] => 0
)
[1] => Array
(
[0] => the
[1] => 29
)
[2] => Array
(
[0] => the
[1] => 177
)
)
)
Array
(
[The] => 0
[the] => 177
)
So the question is: why am I missing the [the] => 29? Is there a better way? Thanks.
PHP arrays are 1:1 mappings, i.e. one key points to exactly one value. So you are overwriting the middle value since it also has the key the.
The easiest solution would be using the offset as the key and the matched string as the value. However, depending on what you want to do with the results a completely different structure might be more appropriate.
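The offset-as-key idea could look like the following. This is a sketch using a shorter sample sentence than the one in the question; array_column() with numeric column indexes takes the matched word (index 0) as the value and its offset (index 1) as the key.

```php
<?php
$text = 'The cat and the dog and the bird';
$word = 'the';

// preg_quote() keeps the search word from being read as regex syntax.
preg_match_all('/\b' . preg_quote($word, '/') . '\b/i', $text, $matches, PREG_OFFSET_CAPTURE);

// Each entry of $matches[0] is [matched string, byte offset].
// Offsets are unique, so no entry overwrites another.
$word_pos = array_column($matches[0], 0, 1);

print_r($word_pos); // keys 0, 12 and 24 with values The, the, the
```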
First you assign $word_pos["the"] = 29 and then you OVERWRITE IT with $word_pos["the"] = 177.
You don't overwrite The because indexes are case sensitive.
So maybe use an array of objects like this:
$object = new stdClass;
$object->word = "the"; // for example
$object->pos = 29; // example :)
and assign it to an array
$positions = array(); // just init once
$positions[] = $object;
alternatively you can assign an associative array instead of the object, so it would be like
$object = array(
'word' => 'the',
'pos' => 29
);
OR assign the way you do, but instead of overwriting, just add it to an array, like:
$word_pos[$matches[0][$i][0]][] = $matches[0][$i][1];
instead of
$word_pos[$matches[0][$i][0]] = $matches[0][$i][1];
so you get something like:
Array
(
[The] => Array
(
[0] => 0
)
[the] => Array
(
[0] => 29
[1] => 177
)
)
Hope that helps :)
What is actually happening:
when i=0,
$word_pos[The] = 0 //matches[0][0][0]=The
when i=1
$word_pos[the] = 29
when i=2
$word_pos[the] = 177 //here this "the" key overrides the previous one
//so your middle 'the' is going to be lost :(
Now an array based solution can be like this :
for($i = 0; $i < sizeof($matches[0]); $i++){
if (array_key_exists ( $matches[0][$i][0] , $word_pos ) ) {
$word_pos[$matches[0][$i][0]] [] = $matches[0][$i][1];
}
else $word_pos[$matches[0][$i][0]] = array ( $matches[0][$i][1] );
}
Now if you dump $word_pos the output should be :
Array
(
[The] => Array
(
[0] => 0
)
[the] => Array
(
[0] => 29
[1] => 177
)
)
Hope that helps.
Reference: array_key_exists