Display correct data from .json file - php

I apologise in advance if this is vague; I have battled with this for nigh on 10 hours now and have got nowhere...
My client has been screwed over by their SEO company and I am trying to help save their business. One major problem is that their website's content is dependent on a data feed hosted on the SEO company's website. The content for different areas around their main town is auto-generated by this script. Part of the script is hosted on the client's server, and this communicates with the feed.
I have found a JSON file with all the data that the feed uses, but I am struggling to get the script to read from this file instead.
There appear to be two functions I should be interested in: generateContentFromFeed and generateTextFromFile. Currently it appears to be using generateContentFromFeed (it may be using generateTextFromFile elsewhere, but I'm fairly sure it isn't; I believe that function is there in case you don't want to get the data externally).
I have tried swapping the functions around and changing the source of the feed in the config file, but with no joy. All this achieved was outputting the entire contents of the JSON file.
Files are below:
content.class.php
<?php
/**
* This class manipulates content from files / JSON into an array
* Also includes helpful functions for formatting content
*/
class Content {
public $links = array();
public $replacements = array();
private function stripslashes_gpc($string) {
if (get_magic_quotes_gpc()) {
return stripslashes($string);
} else {
return $string;
}
}
private function sourceContentToArray($source) {
// array
if (is_array($source)) {
$array = $source;
// json
} elseif (json_decode($source) != FALSE || json_decode($source) != NULL) {
$array = json_decode($source, 1);
// file
} elseif (file_exists($source)) {
if (!$array = json_decode(file_get_contents($source), 1)) {
echo "File empty or corrupt";
return false;
}
} else {
echo 'Source content not recognised';
return false;
}
return $array;
}
public function generateContent($source, $type="paragraph", $label="", $subject, $num=0, $randomize_order=false) {
if (!$array = $this->sourceContentToArray($source))
return false;
$array = empty($this->links) ? $this->loadLinks($array, $subject) : $array;
$this->loadGlobalReplacements($array, $subject);
$ca = array();
foreach ($array['content'] as $c) {
if ($c['type'] == $type) {
if (empty($label) || (!empty($label) && $c['label'] == $label)) {
$ca[] = $c;
}
}
}
$rc = array();
foreach ($ca as $k => $a) {
$rc[] = $this->randomizeContent($a, $subject.$k);
}
if ((!is_array($num) && $num >= 1) || (is_array($num) && $num[0] >= 1)) {
$rc = $this->arraySliceByInteger($rc, $subject, $num);
} else if ((!is_array($num) && $num > 0 && $num < 1) || (is_array($num) && $num[0] > 0 && $num[0] < 1)) {
$rc = $this->arraySliceByPercentage($rc, $subject, $num);
} else {
if ($randomize_order == true)
$rc = $this->arraySliceByPercentage($rc, $subject, 1);
}
return $rc;
}
public function formatContent($source, $type, $subject, $find_replace=array()) {
$c = "";
foreach ($source as $k => $s) {
$text = "";
if ($type == "list" || $type == "paragraph") {
$text .= "<h3>";
foreach ($s['title'] as $t) {
$text .= $t." ";
}
$text .= "</h3>";
}
if ($type == "list") {
$text .= "<ul>";
} else if ($type == "paragraph") {
$text .= "<p>";
}
foreach ($s['parts'] as $b) {
if ($type == "list")
$text .= "<li>";
$text .= $b." ";
if ($type == "list")
$text .= "</li>";
}
if ($type == "list") {
$text .= "</ul>";
} else if ($type == "paragraph") {
$text .= "</p>";
}
$text = $this->findReplace($s['replacements'], $text, $subject.$k."1");
$text = $this->injectLinks($this->links, $text, $subject.$k."2");
$text = $this->findReplace($this->replacements, $text, $subject.$k."3");
$text = $this->findReplace($find_replace, $text, $subject.$k."4");
$text = $this->aAnReplacement($text);
$text = $this->capitaliseFirstLetterOfSentences($text);
$c .= $this->stripslashes_gpc($text);
}
return $c;
}
public function injectLinks($links, $text, $subject) {
global $randomizer;
if (empty($links))
return $text;
foreach ($links as $k => $link) {
$_link = array();
if (preg_match("/\{L".($k+1)."\}/", $text)) {
preg_match_all("/\{L".($k+1)."\}/", $text, $vars);
foreach ($vars[0] as $vark => $varv) {
$_link['link'] = empty($_link['link']) ? $this->links[$k]['link'] : $_link['link'];
$l_link = $randomizer->fetchEncryptedRandomPhrase($_link['link'], 1, $subject.$k.$vark);
unset($_link['link'][array_search($l_link, $_link['link'])]);
$_link['link'] = array_values($_link['link']);
$_link['text'] = empty($_link['text']) ? $this->links[$k]['text'] : $_link['text'];
$l_text = $randomizer->fetchEncryptedRandomPhrase($_link['text'], 2, $subject.$k.$vark);
unset($_link['text'][array_search($l_text, $_link['text'])]);
$_link['text'] = array_values($_link['text']);
$link_html = empty($l_link) ? $l_text : "".$l_text."";
$text = preg_replace("/\{L".($k+1)."\}/", $link_html, $text, 1);
$this->removeUsedLinksFromPool($l_link);
}
}
}
return $text;
}
private function loadLinks($source, $subject) {
global $randomizer;
if (!empty($source['links'])) {
foreach ($source['links'] as $k => $l) {
$source['links'][$k]['link'] = preg_split("/\r?\n/", trim($l['link']));
$source['links'][$k]['text'] = preg_split("/\r?\n/", trim($l['text']));
}
$this->links = $source['links'];
}
return $source;
}
private function loadGlobalReplacements($source, $subject) {
global $randomizer;
$source['replacements'] = $this->removeEmptyIndexes($source['replacements']);
foreach ($source['replacements'] as $k => $l) {
$source['replacements'][$k] = preg_split("/\r?\n/", trim($l));
}
$this->replacements = $source['replacements'];
return $source;
}
private function removeUsedLinksFromPool($link) {
foreach ($this->links as $key => $links) {
foreach ($links['link'] as $k => $l) {
if ($l == $link) {
unset($this->links[$key]['link'][$k]);
}
}
}
}
private function randomizeContent($source, $subject) {
global $randomizer;
$source['title'] = $this->removeEmptyIndexes($source['title']);
foreach ($source['title'] as $k => $t) {
$source['title'][$k] = trim($randomizer->fetchEncryptedRandomPhrase(preg_split("/\r?\n/", trim($t)), 1, $subject.$k));
}
$source['parts'] = $this->removeEmptyIndexes($source['parts']);
foreach ($source['parts'] as $k => $b) {
$source['parts'][$k] = trim($randomizer->fetchEncryptedRandomPhrase(preg_split("/\r?\n/", trim($b)), 2, $subject.$k));
}
$source['structure'] = trim($source['structure']);
if ($source['type'] == "list") {
$source['parts'] = array_values($source['parts']);
$source['parts'] = $randomizer->randomShuffle($source['parts'], $subject."9");
} else if ($source['structure'] != "") {
$source['structure'] = $randomizer->fetchEncryptedRandomPhrase(preg_split("/\r?\n/", $source['structure']), 3, $subject);
preg_match_all("/(\{[0-9]{1,2}\})/", $source['structure'], $matches);
$sc = array();
foreach ($matches[0] as $match) {
$sc[] = str_replace(array("{", "}"), "", $match);
}
$bs = array();
foreach ($sc as $s) {
$bs[] = $source['parts'][$s];
}
$source['parts'] = $bs;
}
$source['replacements'] = $this->removeEmptyIndexes($source['replacements']);
foreach ($source['replacements'] as $k => $r) {
$source['replacements'][$k] = preg_split("/\r?\n/", trim($r));
}
return $source;
}
private function removeEmptyIndexes($array, $reset_keys=false) {
foreach($array as $key => $value) {
if ($value == "") {
unset($array[$key]);
}
}
if (!empty($reset_keys))
$array = array_values($array);
return $array;
}
private function arraySliceByPercentage($array, $subject, $decimal=0.6) {
global $randomizer;
$array = $randomizer->randomShuffle($array, $subject);
if (is_array($decimal))
$decimal = $randomizer->fetchEncryptedRandomPhrase(range($decimal[0], $decimal[1], 0.1), 1, $subject);
$ac = count($array);
$n = ceil($ac * $decimal);
$new_array = array_slice($array, 0, $n);
return $new_array;
}
private function arraySliceByInteger($array, $subject, $number=10) {
global $randomizer;
$array = $randomizer->randomShuffle($array, $subject);
if (is_array($number))
$number = $randomizer->fetchEncryptedRandomPhrase(range($number[0], $number[1]), 1, $subject);
$new_array = array_slice($array, 0, $number);
return $new_array;
}
private function aAnReplacement($text) {
$irregular = array(
"hour" => "an",
"europe" => "a",
"unique" => "a",
"honest" => "an",
"one" => "a"
);
$text = preg_replace("/(^|\W)([aA])n ([^aAeEiIoOuU])/", "$1$2"." "."$3", $text);
$text = preg_replace("/(^|\W)([aA]) ([aAeEiIoOuU])/", "$1$2"."n "."$3", $text);
foreach ($irregular as $k => $v) {
if (preg_match("/(^|\W)an? ".$k."/i", $text)) {
$text = preg_replace("/(^|\W)an? (".$k.")/i", "$1".$v." "."$2", $text);
}
}
return $text;
}
private function capitaliseFirstLetterOfSentences($text) {
$text = preg_replace("/(<p>|<li>|^|[\n\t]|[\.\?\!]+\s)(<a.*?>)?(?!%)([a-z]{1})/se", "'$1$2'.strtoupper('$3')", $text);
return $text;
}
public function findReplace($find_replace, $input, $subject) {
global $randomizer;
if (!empty($find_replace)) {
foreach ($find_replace as $key => $val) {
if (is_array($val)) {
$fr = $val;
$pattern = "/".preg_quote($key)."/i";
preg_match_all($pattern, $input, $vars);
foreach ($vars[0] as $vark => $varv) {
$fr = empty($fr) ? $val : $fr;
$new_val = $randomizer->fetchEncryptedRandomPhrase($fr, 1, $subject.$key.$vark);
unset($fr[array_search($new_val, $fr)]);
$fr = array_values($fr);
$input = preg_replace($pattern, $new_val, $input, 1);
}
} else {
$pattern = "/".preg_quote($key)."/i";
$input = preg_replace($pattern, $val, $input);
}
}
}
return $input;
}
public function generateTextFromFile($file, $subject, $find_replace=array()) {
global $randomizer;
$content = trim(file_get_contents($file));
$lines = preg_split("/\r?\n/s", $content);
$text = $randomizer->fetchEncryptedRandomPhrase($lines, 4, $subject);
$text = $this->findReplace($find_replace, $text, $subject);
return $text;
}
public function generateContentFromFeed($file, $type, $label="", $subject, $num=0, $find_replace=array()) {
global $cfg;
$vars = array ( "api_key" => $cfg['feed']['api_key'],
"subject" => $subject,
"file" => $file,
"type" => $type,
"label" => $label,
"num" => $num,
"find_replace" => json_encode($find_replace));
$encoded_vars = http_build_query($vars);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"http://feeds.redsauce.com/example/index.php");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $encoded_vars);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);
return $content;
}
}
?>
config.php
<?php
// LOCAL SERVER
// Paths
$cfg['basedir'] = "public_html/directory/"; // change this for includes and requires.
$cfg['baseurl'] = "/directory/"; // change this for relative links and linking to images, media etc.
$cfg['fullbaseurl'] = "http://customerssite.com/directory/"; // change this for absolute links and linking to pages
/*
// Database - local
$cfg['database']['host'] = "127.0.0.1";
$cfg['database']['name'] = "username";
$cfg['database']['user'] = 'root'; // change this to the database user
$cfg['database']['password'] = ''; // change this to the database password
*/
// Database - web
$cfg['database']['host'] = "localhost";
$cfg['database']['name'] = "db";
$cfg['database']['user'] = 'user'; // change this to the database user
$cfg['database']['password'] = 'pass'; // change this to the database password
// Errors
$cfg['errors']['display'] = 1; // change this to display errors on or off (on for testing, off for production)
$cfg['errors']['log'] = 0; // change this to log errors to /log/error.log
// Caching
$cfg['caching']['status'] = 0; // determines whether the cache is enabled. 1 for enabled, 0 for disabled
$cfg['caching']['expiry_time'] = 604800; // determines expiry time of cached items (604800 seconds = 7 days)
$cfg['caching']['directory'] = $cfg['basedir']."public_html/_cache/"; // directory in which cached files are stored
// Analytics
$cfg['analytics']['status'] = 1;
$cfg['analytics']['tracking_id'] = "UA-21030138-1";
// Javascript
$cfg['maps']['status'] = 0; // load google maps javascript
$cfg['jquery_ui']['status'] = 0; // load jquery ui javascript + css
// Site defaults
$cfg['site_name'] = "Customer"; // change this to the site name
$cfg['default']['page_title'] = "Customer"; // change this to the default title of all pages
$cfg['default']['page_description'] = "Customer"; // change this to the default meta-description of all pages
$cfg['default']['email'] = "Customer"; // change this to the administrators email for receiving messages when users signup etc
$cfg['default']['category']['id'] = 130;
// Email
$cfg['email']['address'] = "Customer"; // change this to the administrators email for receiving messages when users signup etc
$cfg['email']['from'] = $cfg['site_name']." <Customer>"; // email will appear to have come from this address
$cfg['email']['subject'] = "Enquiry from ".$cfg['site_name']; // subject of the email
$cfg['email']['success_message'] = "Thank you for your enquiry. Someone will be in touch with you shortly."; // message to display if the email is sent successfully
$cfg['email']['failure_message'] = "There was an error processing your enquiry. Please try again."; // message to display if the email is not sent successfully
// Content feed
$cfg['feed']['api_key'] = "Customer"; // enter the unique content feed api key here
$cfg['feed']['url'] = "http://feeds.customersseocompany.com/customer/index.php"; // URL to connect to to pull the content feed
// Dates
date_default_timezone_set('Europe/London');
$cfg['default']['date_format'] = "d/m/Y H:i:s";
// Options
$cfg['listings']['type'] = "ajax"; // options are none (to display no listings), page (to display the listings on the page) or ajax (to load the listings via AJAX)
$cfg['listings']['num_per_page'] = 10; // maximum number of listings to show per page
// Routing
$cfg['routes'] = array(
'category' => array(
"(?P<category>.+?)-category",
"_pages/category.php"
),
't1' => array(
"(?P<category>.+?)-in-(?P<t1_name>.+?)_(?P<county>.+)",
"_pages/tier1.php"
),
't1_map' => array(
"t1-map-data-(?P<t1_name>.+)-(?P<page_num>[0-9]+)",
"_ajax/map-data.php"
),
't2_map' => array(
"t2-map-data-(?P<t2_name>.+)-(?P<t1_name>.+)-(?P<page_num>[0-9]+)",
"_ajax/map-data.php"
),
'ajax_listings' => array(
"ajax-listings-(?P<category>.+?)-(?P<t2_id>.+)-(?P<page_num>[0-9]+)",
"_ajax/listings.php"
),
'search' => array(
"^search\/?$",
"_pages/search.php"
),
'single' => array(
".*?-(?P<listing_id>[0-9]+)",
"_pages/single.php"
)
);
// Site specific
// Encoding
header('Content-type: text/html; charset=UTF-8');
?>
Part of the code used in tier1.php (the file used to generate each location and keywords page), which generates the paragraph text:
<?php
$randParas = array(1, 2, 3);
$numParas = $randomizer -> fetchEncryptedRandomPhrase($randParas, 1, $_SERVER['REQUEST_URI']);
$content = $content->generateContentFromFeed(
// file
"categories/".$category[0]['category_slug']."/paragraphs",
// type
"paragraph",
// label
"",
// subject
$_SERVER['REQUEST_URI']."7",
// num
$numParas,
// find_ replace
array(
"XX" => $t1_location[0]['definitive_name'],
"YY" => $cfg['site_name']
)
);
?>
A snippet from one of the json files (I am aware the code I have pasted terminates early, I just wanted to put a part of it here as the file is huge!):
{"filename":" xx keywords","content":[{"title":{"1":"xx example\r\nexample in xx\r\nexample in the xx region\r\nexample in the xx area","2":"","3":"","4":""},"type":"paragraph","label":"","structure":"","parts":{"1":"Stylish and practical, a xx keyword \r\nPractical and stylish, a xx keyword \r\nUseful and pragmatic, a xx keyword \r\nPragmatic and useful, a xx keyword \r\nModern and convenient, a xx keyword ",
If you need any more info, please tell me what you need. I really appreciate the help in advance, as I really want to help this client out. They are great people who do not deserve to be hit by a negative SEO company.
What would be the most helpful is if somebody knows what this script is! I can then buy/download it and generate my own feed using the data.
If you can help with either the code to generate the feed, or how I should go about getting the data out of the json files and filter it correctly, that would be great!
Many thanks,
Kevin
ps. Sorry for all the code, I have been told off before for not posting enough!
EDIT : Here is the code I am now using after the suggestion below:
<?php
$randParas = array(1, 2, 3);
$numParas = $randomizer -> fetchEncryptedRandomPhrase($randParas, 1, $_SERVER['REQUEST_URI']);
$content = $content->generateContent(
// file
"content/categories/".$category[0]['category_slug']."/paragraphs.json",
// type
"paragraph",
// label
"",
// subject
$_SERVER['REQUEST_URI']."7",
// num
$numParas,
// find_ replace
array(
"XX" => $t1_location[0]['definitive_name'],
"YY" => $cfg['site_name']
)
);
?>
<?php
if($numParas == '3'){
$divClass = 'vertiCol';
}
elseif($numParas == '2'){
$divClass = 'vertiColDub';
}
else {
$divClass = 'horizCol';
}
$randImages = glob('_images/categories/'.$category[0]['category_slug'].'/*.jpg');
$randImages = $randomizer->randomShuffle($randImages, $_SERVER['REQUEST_URI']);
$fc = preg_replace("/<h3>.*?<\/h3>/", "$0", $content);
$fc = preg_replace("/<h3>.*?<\/h3><p>.*?<\/p>/", "<div class=\"contCol ".$divClass."\">$0$1</div>", $content);
$fca = preg_split("/(\.|\?|\!)/", $fc, -1, PREG_SPLIT_DELIM_CAPTURE);
$prStr = '';
foreach ($fca as $fck => $fcv) {
if ($fck % 3 == 0 && $fck != 0 && !in_array($fcv, array(".", "?", "!")))
$prStr .= "</p>\n<p>";
$prStr .= $fcv;
}
preg_match_all('/<div class="contCol '.$divClass.'"><h3>.*?<\/h3>.*?<\/div>/s', $prStr, $matches);
$randAlign = array(
array('topleft'),
array('topright'),
#array('bottomleft'),
# array('bottomright')
);
$i=0;
$randSelectAlign = $randomizer->fetchEncryptedRandomPhrase($randAlign, 0, $_SERVER['REQUEST_URI']);
$randSelectAlign = $randSelectAlign[0];
foreach($matches[0] as $newPar){
$i++;
if($randSelectAlign=='topleft'){
echo str_replace('</h3><p>', '</h3><p><span class="imgWrap" style="float:left"><img src="'.$cfg['baseurl'].$randImages[$i].'" width="170" /></span>', $newPar);
}
elseif($randSelectAlign=='topright'){
echo str_replace('</h3><p>', '</h3><p><span class="imgWrap" style="float:right"><img src="'.$cfg['baseurl'].$randImages[$i].'" width="170" /></span>', $newPar);
}
elseif($randSelectAlign=='bottomleft'){
echo str_replace('</p></div>', '<span class="imgWrap" style="float:left"><img src="'.$cfg['baseurl'].$randImages[$i].'" width="170" /></span></div>', $newPar);
}
elseif($randSelectAlign=='bottomright'){
echo str_replace('</p></div>', '<span class="imgWrap" style="float:right"><img src="'.$cfg['baseurl'].$randImages[$i].'" width="170" /></span></div>', $newPar);
}
else {
}
//randomly from float array based on server uri!
//randomly select a way to display the images here etc
#
}
?>
Which is giving me the following error messages:
Notice: Array to string conversion in /home/account/public_html/subdomain/directory/_pages/tier1.php on line 230
Notice: Array to string conversion in /home/account/public_html/subdomain/directory/_pages/tier1.php on line 231
Warning: preg_split() expects parameter 2 to be string, array given in /home/account/public_html/subdomain/directory/_pages/tier1.php on line 233
Warning: Invalid argument supplied for foreach() in /home/account/public_html/subdomain/directory/_pages/tier1.php on line 237
These lines of code are :
230 $fc = preg_replace("/<h3>.*?<\/h3>/", "$0", $content);
231 $fc = preg_replace("/<h3>.*?<\/h3><p>.*?<\/p>/", "<div class=\"contCol ".$divClass."\">$0$1</div>", $content);
233 $fca = preg_split("/(\.|\?|\!)/", $fc, -1, PREG_SPLIT_DELIM_CAPTURE);
235 $prStr = '';
237 foreach ($fca as $fck => $fcv) {
if ($fck % 3 == 0 && $fck != 0 && !in_array($fcv, array(".", "?", "!")))
$prStr .= "</p>\n<p>";
$prStr .= $fcv;
}
I presume this is because it's spitting the content out as an array? Is there anything in particular I need to do to the data to make it output correctly?
Many thanks,
Kevin

Having glanced over the code, I think I have some idea what it's supposed to do, though I'm still confused about the randomizer stuff ;-)
You can pass your JSON data as a string into the generateContent() method as the first parameter ($source).
The generateContent() returns an array, so you will have to run it through formatContent() (I think).
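A minimal sketch of why the formatting step matters, using simplified stand-ins rather than the real Content class. Here `$items` only mimics the array shape that generateContent() appears to return (entries with `title` and `parts`), `formatParagraphs()` is a hypothetical, cut-down imitation of formatContent() for the "paragraph" type, and the place name is made up:

```php
<?php
// Hypothetical stand-in for the array returned by Content::generateContent():
// a list of items, each with 'title' and 'parts' arrays.
$items = [
    ['title' => ['Example in XX'], 'parts' => ['Stylish and practical.', 'Made by YY.']],
];

// Simplified imitation of Content::formatContent() for type "paragraph":
// it joins each item into <h3>...</h3><p>...</p> and applies the replacements.
function formatParagraphs(array $items, array $findReplace): string
{
    $html = '';
    foreach ($items as $item) {
        $text = '<h3>' . implode(' ', $item['title']) . '</h3>'
              . '<p>' . implode(' ', $item['parts']) . '</p>';
        $html .= str_ireplace(array_keys($findReplace), array_values($findReplace), $text);
    }
    return $html;
}

$html = formatParagraphs($items, ['XX' => 'Leeds', 'YY' => 'Customer']);

// $html is a string now, so preg_replace() and preg_split() will accept it.
echo $html, "\n";
```

In tier1.php the equivalent would presumably be to pass the result of generateContent() to formatContent() together with the find/replace array, and to store the result in a variable other than $content so the Content object is not overwritten. Going by the signatures in content.class.php, the sixth argument of generateContent() is $randomize_order, not a find/replace array.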

Related

Data Matrix Barcode split to different data

I have a Data Matrix barcode with the following input:
Data Matrix Barcode = 0109556135082301172207211060221967 21Sk4YGvF811210721
I would like the output below:
Items | Output
Gtin | 09556135082301
Expire Date | 21-07-22
Batch No | 60221967
Serial No | Sk4YGvF8
Prod Date | 21-07-21
But my code doesn't detect anything after the space:
$str = "0109556135082301172207211060221967 21Sk4YGvF811";
if ($str != null){
$ais = explode("_",$str);
for ($aa=0;$aa<sizeof($ais);$aa++)
{
$ary = $ais[$aa];
while(strlen($ary) > 0) {
if (substr($ary,0,2)=="01"){
$gtin = substr($ary,2,14);
$ary = substr($ary,-(strlen($ary)-16));
}
else if (substr($ary,0,2)=="17"){
$expirydate = substr($ary,6,2)."-".substr($ary,4,2)."-20".substr($ary,2,2);
$ary = substr($ary,-(strlen($ary)-8));
}
else if (substr($ary,0,2)=="10"){
$batchno = substr($ary,2,strlen($ary)-2);
$ary = "";
}
else if (substr($ary,0,2)=="21"){
$serialno = substr($ary,2,strlen($ary)-2);
$ary = "";
}
else if (substr($ary,0,2)=="11"){
$proddate = substr($ary,6,2)."-".substr($ary,4,2)."-20".substr($ary,2,2);
$ary = substr($ary,-(strlen($ary)-8));
}
else {
$oth = "";
}
}
}
My code's output (https://onecompiler.com/php/3yg6gs5ea) is not what I expected. Is there any way to fix it?
Solution
You can use a regex to make it short and easy.
Note
Your string does not contain all the characters of the code given in the description.
Code
$str = "0109556135082301172207211060221967 21Sk4YGvF811210721";
$datePattern = '/(\d\d)(\d\d)(\d\d)/';
$dateReplacement = '\3-\2-\1';
$matches = [];
preg_match('/(?<gtin>.{18})(?<expire>.{6})(?<batch>.*?)\s(?<serial>.{12}(?<prod>.{6}))/', $str, $matches);
$matches['expire'] = preg_replace($datePattern, $dateReplacement, $matches['expire']);
$matches['prod'] = preg_replace($datePattern, $dateReplacement, $matches['prod']);
$matches = array_filter($matches, fn($value, $key) => !is_numeric($key), ARRAY_FILTER_USE_BOTH);
$keyMap = [
'gtin' => 'Gtin',
'expire' => 'Expire Date',
'batch' => 'Batch No.',
'serial' => 'Serial No.',
'prod' => 'Production Date',
];
foreach ($keyMap as $key => $output) {
echo "<tr><td>$output</td><td>{$matches[$key]}</td></tr>\n";
}
Output
<tr><td>Gtin</td><td>010955613508230117</td></tr>
<tr><td>Expire Date</td><td>21-07-22</td></tr>
<tr><td>Batch No.</td><td>1060221967</td></tr>
<tr><td>Serial No.</td><td>21Sk4YGvF811210721</td></tr>
<tr><td>Production Date</td><td>21-07-21</td></tr>
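The output above still carries the application identifier prefixes (01, 10, 21) inside the captured values, whereas the question asks for the bare values. A hedged sketch of one way to strip them: anchor each identifier outside its named group. It assumes the fields always arrive in this order (01, 17, 10, 21, 11) and that a space terminates the batch number, as in the sample string; `toDayMonthYear()` is a hypothetical helper name.

```php
<?php
$str = "0109556135082301172207211060221967 21Sk4YGvF811210721";

// Each application identifier sits outside its group, so only values are captured.
// The serial number is matched lazily up to the final "11" + six-digit date.
$pattern = '/^01(?<gtin>\d{14})17(?<expire>\d{6})10(?<batch>\S+)\s21(?<serial>.+?)11(?<prod>\d{6})$/';

// Turn YYMMDD into DD-MM-YY.
function toDayMonthYear(string $yymmdd): string
{
    return preg_replace('/(\d\d)(\d\d)(\d\d)/', '\3-\2-\1', $yymmdd);
}

$fields = [];
if (preg_match($pattern, $str, $m)) {
    $fields = [
        'Gtin'        => $m['gtin'],
        'Expire Date' => toDayMonthYear($m['expire']),
        'Batch No'    => $m['batch'],
        'Serial No'   => $m['serial'],
        'Prod Date'   => toDayMonthYear($m['prod']),
    ];
}

foreach ($fields as $label => $value) {
    echo $label, ': ', $value, "\n";
}
```

A barcode with the fields in a different order, or without the space separator, would not match this pattern; a loop that reads one identifier at a time would be needed for that.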

output and call array from class function (rollingcurl)

Excuse my English, please.
I use Rollingcurl to crawl various pages.
Rollingcurl: https://github.com/LionsAd/rolling-curl
My class:
<?php
class Imdb
{
private $release;
public function __construct()
{
$this->release = "";
}
// SEARCH
public static function most_popular($response, $info)
{
$doc = new DOMDocument();
libxml_use_internal_errors(true); //disable libxml errors
if (!empty($response)) {
//if any html is actually returned
$doc->loadHTML($response);
libxml_clear_errors(); //remove errors for yucky html
$xpath = new DOMXPath($doc);
//get all the h2's with an id
$row = $xpath->query("//div[contains(@class, 'lister-item-image') and contains(@class, 'float-left')]/a/@href");
$nexts = $xpath->query("//a[contains(@class, 'lister-page-next') and contains(@class, 'next-page')]");
$names = $xpath->query('//img[@class="loadlate"]');
// NEXT URL - ONE TIME
$Count = 0;
$next_url = "";
foreach ($nexts as $next) {
$Count++;
if ($Count == 1) {
/*echo "Next URL: " . $next->getAttribute('href') . "<br/>";*/
$next_link = $next->getAttribute('href');
}
}
// RELEASE NAME
$rls_name = "";
foreach ($names as $name) {
$rls_name .= $name->getAttribute('alt');
}
// IMDB TT0000000 RLEASE
if ($row->length > 0) {
$link = "";
foreach ($row as $row) {
$tt_info .= @get_match('/tt\\d{7}/is', $doc->saveHtml($row), 0);
}
}
}
$array = array(
$next_link,
$rls_name,
$tt_info,
);
return ($array);
}
}
Output/Return:
$array = array(
$next_link,
$rls_name,
$tt_info,
);
return ($array);
Call:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
function get_match($regex, $content, $pos = 1)
{
/* do your job */
preg_match($regex, $content, $matches);
/* return our result */
return $matches[intval($pos)];
}
require "RollingCurl.php";
require "imdb_class.php";
$imdb = new Imdb;
if (isset($_GET['action']) || isset($_POST['action'])) {
$action = (isset($_GET['action'])) ? $_GET['action'] : $_POST['action'];
} else {
$action = "";
}
echo " 2222<br /><br />";
if ($action == "most_popular") {
$popular = '&num_votes=1000,&production_status=released&groups=top_1000&sort=moviemeter,asc&count=40&start=1';
if (isset($_GET['date'])) {
$link = "https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,".$_GET['date'].$popular;
} else {
$link = "https://www.imdb.com/search/title?title_type=feature,tv_movie&release_date=,2018".$popular;
}
$urls = array($link);
$rc = new RollingCurl([$imdb, 'most_popular']); //[$imdb, 'most_popular']
$rc->window_size = 20;
foreach ($urls as $url) {
$request = new RollingCurlRequest($url);
$rc->add($request);
}
$stream = $rc->execute();
}
If I output everything as "echo" in the class, everything is also displayed. However, I want to call everything individually.
If I now try to output it like this, it doesn't work.
$stream[0]
$stream[1]
$stream[3]
Does anyone have any idea how this might work?
Thank you very much in advance.
RollingCurl doesn't do anything with the return value of the callback, and doesn't return it to the caller. $rc->execute() just returns true when there's a callback function. If you want to save anything, you need to do it in the callback function itself.
You should make most_popular a non-static function and give the class a property $results that you initialize to [] in the constructor. Then it can do:
$this->results[] = $array;
After you do
$rc->execute();
you can do:
foreach ($imdb->results as $result) {
echo "Release name: $result[1]<br>TT Info: $result[2]<br>";
}
It would be better if you put the data you extracted from the document in arrays rather than concatenated strings, e.g.
$this->rls_names = [];
foreach ($names as $name) {
$this->rls_names[] = $name->getAttribute('alt');
}
$this->tt_infos = [];
// assumes the XPath result for the links was assigned to $rows rather than $row
foreach ($rows as $row) {
$this->tt_infos[] = @get_match('/tt\\d{7}/is', $doc->saveHtml($row), 0);
}
$this->next_link = $nexts->item(0)->getAttribute('href'); // no need for a loop to get the first element of the node list
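The pattern described in this answer can be sketched without RollingCurl itself. In the sketch below, `runRequests()` is a hypothetical stand-in that, like the behaviour described above, calls the callback for each response and ignores its return value; `Collector`, `handle()` and the example URL are likewise made up for illustration.

```php
<?php
// Hypothetical stand-in for RollingCurl: it invokes the callback for each
// response and discards whatever the callback returns.
function runRequests(array $responses, callable $callback): bool
{
    foreach ($responses as $url => $body) {
        $callback($body, ['url' => $url]);
    }
    return true;
}

class Collector
{
    public $results = [];

    // Non-static, so the extracted data can be stored on the instance.
    public function handle($response, $info)
    {
        preg_match('/tt\d{7}/', $response, $m);
        $this->results[] = [
            'url' => $info['url'],
            'tt'  => $m[0] ?? null,
        ];
    }
}

$collector = new Collector();
$ok = runRequests(
    ['https://example.test/a' => '<a href="/title/tt0111161/">x</a>'],
    [$collector, 'handle']
);

// The return value of the runner carries no data; the instance does.
foreach ($collector->results as $result) {
    echo $result['url'], ' => ', $result['tt'], "\n";
}
```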

trim not behaving consistently in php

I have an array of email ids - they are in the below format
<my_link_1@mysite.com>
<my_link_35@mysite.com>
<my_link_40@mysite.com>
I then use a foreach loop and parse each of the email ids as shown in the below function
protected function getLinkIds($emailAddresses = array()) {
$links = array();
foreach ($emailAddresses as $email) {
$username = substr($email, 0, strpos($email, '@'));
if (false === strpos($username, 'my_link_')) {
continue;
}
CakeLog::write('debug', 'getLinkIds - username : '.$username);
CakeLog::write('debug', 'getLinkIds - trim(username) : ' . trim($username, '<>"'));
$linkstr = explode('my_link_', $username);
$links[trim($username, '<>"')] = $linkstr[1];
}
return $links;
}
I expect an array that looks like below
[my_link_1] => 1
[my_link_35] => 35
[my_link_40] => 40
but instead I get an array like below
[my_link_1] => 1
[<my_link_35] => 35
[<my_link_40] => 40
For some reason trim does not trim the left angle bracket beyond the first email id - baffling!!!
As I mentioned earlier in the comments on your question, use:
var_dump($emailAddresses);
to check whether the other entries start with a space. If they do, trim fails in your code sample because none of your trim calls is set to remove spaces.
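A small sketch of that behaviour, assuming an entry with a leading space like the one var_dump() might reveal (the sample value is made up):

```php
<?php
// An entry with a leading space before the bracket.
$raw = ' <my_link_35';

// The character list has no space in it, so trimming stops at the very
// first character and the bracket is never reached.
$unchanged = trim($raw, '<>"');

// Adding a space to the list lets trim() remove both the space and the bracket.
$cleaned = trim($raw, ' <>"');

var_dump($unchanged, $cleaned);
```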
To make sure the value does not start with a space, you could have done the following:
protected function getLinkIds($emailAddresses = array())
{
$links = array();
foreach ($emailAddresses as $email)
{
$email = trim($email);
//.... rest of your code
}
return $links;
}
Or even as simple as:
protected function getLinkIds($emailAddresses = array())
{
$links = array();
foreach ($emailAddresses as $email)
{
$username = trim(substr($email, 0, strpos($email, '@')), ' <');
//.... rest of your code
}
return $links;
}
Below is the original answer, written before the OP replied in the comments.
You were almost there. Once you had extracted the username, all that was left to remove was the initial <, which you could have done by changing this line:
$username = substr($email, 0, strpos($email, '@'));
to add trim, like so:
$username = trim(substr($email, 0, strpos($email, '@')), ' <');
That leaves my_link_#, on which you can use a simple left trim (ltrim) to get what you want:
$id = ltrim($username, 'my_link_');
Now you have the # alone, so your code would look like:
protected function getLinkIds($emailAddresses = array())
{
$links = array();
foreach ($emailAddresses as $email)
{
$username = trim(substr($email, 0, strpos($email, '@')), ' <');
if (false === strpos($username, 'my_link_')) {
continue;
}
CakeLog::write('debug', 'getLinkIds - username : '.$username);
CakeLog::write('debug', 'getLinkIds - trim(username) : ' . trim($username, '<>"'));
$links[$username] = ltrim($username, 'my_link_');
}
return $links;
}
Output:
Array
(
[my_link_1] => 1
[my_link_35] => 35
[my_link_40] => 40
)
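One caveat about the ltrim() call above, shown as a hedged sketch: ltrim() treats its second argument as a set of characters rather than as a prefix. It works for numeric ids because digits are not in that set, but it would eat into an id made of the same letters. `stripPrefix()` is a hypothetical helper that removes an exact prefix instead, and the `my_link_kim` value is invented for the demonstration.

```php
<?php
// ltrim() strips any leading characters found in the set "my_link_".
$a = ltrim('my_link_40', 'my_link_');   // "40": digits are not in the set
$b = ltrim('my_link_kim', 'my_link_');  // "": every character is in the set

// Hypothetical helper: remove the exact prefix only, if it is present.
function stripPrefix(string $value, string $prefix): string
{
    return strncmp($value, $prefix, strlen($prefix)) === 0
        ? substr($value, strlen($prefix))
        : $value;
}

$c = stripPrefix('my_link_kim', 'my_link_'); // "kim"

var_dump($a, $b, $c);
```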
You can do it like this:
$emails = array(
'<my_link_1@mysite.com>',
'<my_link_35@mysite.com>',
'<my_link_40@mysite.com>'
);
$result = array();
foreach($emails as $email) {
$email = trim($email, '<>');
$tmp = strstr($email, '@', true);
preg_match('/_([0-9]*)$/', $tmp, $matches);
$result[$tmp] = $matches[1];
}
print_r($result);
Result:
Array
(
[my_link_1] => 1
[my_link_35] => 35
[my_link_40] => 40
)
http://codepad.viper-7.com/lN8WpI
Trim is working fine. Look at the link above.
But here's probably the cleanest way to do it:
$email = "<my_link_40@mysite.com>";
$email = trim("<my_link_40@mysite.com>", "<>"); //my_link_40@mysite.com
$email = current(explode("@", $email)); //my_link_40
$email_int = (int) filter_var($email, FILTER_SANITIZE_NUMBER_INT); //40
$array = array( $email => $email_int );
var_dump($array);
array(1)
{
["my_link_40"] => int(40)
}

My if statement is not allowing a change to an non-indexed array value

In the script below I have an array that stores all the links, titles and descriptions from a web page. If there is no description, I want it to use the first 20 characters of a p tag instead, via a function that already works. The only problem is that I have the jigsaw pieces and just can't seem to put them together. I want my if statement to use the function getWord instead of getMetas() when the description is empty.
function getMetas($link) {
$str1 = file_get_contents($link);
if (strlen($str1)>0) {
preg_match_all( '/<meta.*?name=("|\')description("|\').*?content=("|\')(.*?)("|\')/i', $str1, $description);
if (count($description) > 1) {
return $description[4];
}
}
}
Then my function goes here, but there is no need to see it as I know it works.
function getWord() {
$html = file_get_contents($link);
preg_match('%(<p[^>]*>.*?</p>)%i', $html, $re);
$res = get_custom_excerpt($re[1]);
}
$outputs = array();
foreach ($links as $thisLink) {
$output[] = array("link" => $thisLink, "title" => Titles($thisLink), "description" => getMetas($thisLink));
if ($output['description'] == null) {
$output['description'] = $res;
}
$outputs[] = $output;
}
print_r($output);

How can I make my function work with my if statement that changes my array

In the script below I have an array that stores all the links, titles and descriptions from a web page. If there is no description, I want it to use the first 20 characters of a p tag instead, via a function that already works. The only problem is that I have the jigsaw pieces and just can't seem to put them together. I want my if statement to use the function getWord instead of getMetas() when the description is empty.
function getMetas($link) {
$str1 = file_get_contents($link);
if (strlen($str1)>0) {
preg_match_all( '/<meta.*?name=("|\')description("|\').*?content=("|\')(.*?)("|\')/i', $str1, $description);
if (count($description) > 1) {
return $description[4];
}
}
}
Then my function goes here (get_custom_excerpt), but there is no need to see it as I know it works.
function getWord() {
$html = file_get_contents($link);
preg_match('%(<p[^>]*>.*?</p>)%i', $html, $re);
$res = get_custom_excerpt($re[1]);
}
$outputs = array();
foreach ($links as $thisLink) {
$output[] = array("link" => $thisLink, "title" => Titles($thisLink), "description" => getMetas($thisLink));
if ($output['description'] == null) {
$output['description'] = getWord($res);
}
$outputs[] = $output;
}
print_r($output);
Is this what you want?
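function getMetas($link) {
$str1 = file_get_contents($link);
if (strlen($str1) > 0) {
// preg_match() returns 1 only when a description meta tag is found;
// preg_match_all() always fills $description, so count() could not detect a miss
if (preg_match('/<meta.*?name=("|\')description("|\').*?content=("|\')(.*?)("|\')/i', $str1, $description) && $description[4] !== '') {
return $description[4];
}
// no usable meta description: fall back to an excerpt of the first <p>
return getWord($str1);
}
}
function getWord($html) {
if (preg_match('%(<p[^>]*>.*?</p>)%i', $html, $re)) {
return get_custom_excerpt($re[1]);
}
return null; // no <p> tag found
}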
BTW, parsing HTML with regexp is very fragile, it would be better to use a DOM parser library.
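As a rough illustration of the DOM-based approach, here is a sketch using PHP's bundled DOMDocument and DOMXPath. `describePage()` is a hypothetical name, it works on an HTML string rather than a URL, and the plain 20-character cut stands in for get_custom_excerpt(), whose real behaviour is not shown in the question.

```php
<?php
// Sketch: read the meta description via the DOM and fall back to the first <p>.
function describePage(string $html, int $maxChars = 20): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world HTML
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // Prefer the meta description when it exists and is not empty.
    $meta = $xpath->query('//meta[@name="description"]/@content')->item(0);
    if ($meta !== null && trim($meta->nodeValue) !== '') {
        return trim($meta->nodeValue);
    }

    // Otherwise use the first characters of the first paragraph, if any.
    $p = $xpath->query('//p')->item(0);
    return $p === null ? null : substr(trim($p->textContent), 0, $maxChars);
}

$withMeta = describePage('<html><head><meta name="description" content="A shop"></head><body><p>Hi</p></body></html>');
$withoutMeta = describePage('<html><body><p>The quick brown fox jumps over the lazy dog</p></body></html>');

var_dump($withMeta, $withoutMeta);
```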
How is this?
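$outputs = array();
foreach ($links as $thisLink) {
// build one row as a plain associative array (no [] on $output here)
$output = array("link" => $thisLink, "title" => Titles($thisLink), "description" => getMetas($thisLink));
if ($output['description'] == null) {
// $res is not defined in this scope; this assumes getWord() accepts the link and returns the excerpt
$output['description'] = getWord($thisLink);
}
$outputs[] = $output;
}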
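The key change in that loop is building each row as a plain associative array before appending it. A self-contained sketch of the same idea, with lookup arrays standing in for the asker's getMetas() and getWord() (the file names and texts are made up):

```php
<?php
// Hypothetical stand-ins for getMetas() and getWord() results per link.
$metaFor    = ['a.html' => 'Meta text', 'b.html' => null];
$excerptFor = ['a.html' => 'First words of a', 'b.html' => 'First words of b'];

$outputs = [];
foreach (array_keys($metaFor) as $link) {
    // Build one row as an associative array, so $row['description'] exists...
    $row = ['link' => $link, 'description' => $metaFor[$link]];

    // ...fill in the fallback when there is no description...
    if ($row['description'] === null) {
        $row['description'] = $excerptFor[$link];
    }

    // ...and only then append the finished row to the list.
    $outputs[] = $row;
}

print_r($outputs);
```

With `$output[] = array(...)` as in the question, the row ends up at `$output[0]`, so `$output['description']` refers to a key that was never set.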
