I have an array that looks like this:
[6625] => Trump class="mediatype"> href="/news/picture">Slideshow: [6628] => href="http://www.example.com/news/picture/god=USRTX1N84J">GOP [6630] => nation
I need to pull out everything inside href="" from the array elements and put it into a new array.
I have tried:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
echo getStringInBetween($arr[0], 'href="', '"')
?>
Try this code and adapt it to suit your needs:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
foreach ($arr as $val) {
if (strpos($val, 'href') !== false) {
echo getStringInBetween($val, 'href="', '"');
}
}
?>
When run, this example output google.com/hello.
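The loop above only echoes each link, while the question asks for the values in a new array. A minimal sketch of one way to collect them, assuming the markup uses double-quoted href attributes; the sample string and the $links variable are illustrative, not part of the original code:

```php
<?php
// Sample markup standing in for the fetched page (an assumption for illustration).
$html = '<a class="mediatype" href="/news/picture">Slideshow</a> '
      . '<a href="http://www.example.com/news/picture/god=USRTX1N84J">GOP</a>';

// Capture everything between href=" and the next double quote.
preg_match_all('/href="([^"]*)"/i', $html, $matches);

// $matches[1] holds only the captured group, i.e. the link targets themselves.
$links = $matches[1];

print_r($links);
```

Running the pattern over the whole page avoids the explode(" ", ...) step, so attribute values that contain spaces are not cut in half.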
I am trying to generate html from a given string pattern, similar to a plugin.
There are three patterns, a no arg, a single arg and a multi arg string pattern. I can't change this pattern since it's from a CMS.
{pluginName} or {pluginName=3} or {pluginName id=3|view=simple|arg999=asv}
An example:
<p>Hi this is a html page</p>
<p>The following line should generate html</p>
{pluginName=3}
<p>The following line also should generate html</p>
{pluginName id=3|view=simple|arg999=asv}
My goal is to replace those "tags" with something (the processing itself is not relevant to this question). However, I want to be able to pass the given args to a class/function that handles that logic.
This is my first attempt, without regexes, since I don't know how to approach this problem with them (and mainly because they are slower).
<?php
function processPlugins($text, $pos = 0, $start = '{', $end = '}') {
$plugins = array('plugin1', 'plugin2');
while(($pos = strpos($text, $start, $pos)) !== false) {
$startPos = $pos;
$pos += strlen($start);
foreach($plugins as $plugin) {
if(substr($text, $pos, strlen($plugin)) === $plugin
&& ($endPos = strpos($text, $end, $pos + strlen($plugin))) !== false) {
$char = substr($text, $pos + strlen($plugin), 1); // 1 is strlen of (= or ' ')
$pos += strlen($plugin) + 1; // 1 is strlen of (= or ' ')
$argString = substr($text, $pos, $endPos - $pos);
if($char === ' ') { //Multi arg
$params = explode('|', trim($argString));
$paramDict = array();
foreach ($params as $param) {
list($k, $v) = array_pad(explode('=', $param), 2, null);
$paramDict[$k] = $v;
}
//$output = $plugin->processDictionary($paramDict);
var_dump($paramDict);
} elseif ($char === '=') { //One arg
//$output = $plugin->processArg($argString);
echo $argString . "\n";
} elseif ($char === $end) { //No arg
//$output = $plugin->processNoArg();
echo $plugin. "\n";
}
$pos = $endPos + strlen($end);
break;
}
}
}
}
processPlugins('{plugin1}');
processPlugins('{plugin2=3}');
processPlugins('{plugin2 arg1=b|arg2=d}');
The previous code works in a PHP sandbox.
This code seems to work (for now) but it seems sketchy. Would you approach this problem differently? Could I refactor this code somehow?
If you opt for string manipulation functions over regex, why not use explode for stripping the input down to the significant part?
Here is an alternative implementation:
function processPlugins($text, $pos = 0, $start = '{', $end = '}') {
$t = substr($text, $pos);
if($pos > 0) {
echo "$pos characters removed from the beginning: $t" . PHP_EOL;
} else {
echo "Starting with '$t'" . PHP_EOL;
}
$parts = explode($start, $t);
$t = $parts[1];
$parts = explode($end, $t);
$t = $parts[0];
echo "The part between curly braces: '$t'" . PHP_EOL;
$t = str_replace(['plugin1', 'plugin2'], '', $t);
echo "After plugin name has been removed: '$t'" . PHP_EOL;
$n = strlen($t);
if(!$n) {
echo "Processing complete: " . trim($parts[0]) . PHP_EOL . PHP_EOL;
return;
}
$params = explode('|', $t);
echo 'Key-Values: ' . json_encode($params) . PHP_EOL;
$kv = [];
foreach($params as $p) {
list($k, $v) = explode('=', trim($p));
echo " Item: '$p', Key: '$k', Value: '$v'" . PHP_EOL;
if($k === '') {
echo "Processing complete: $v" . PHP_EOL . PHP_EOL;
return;
}
$kv[$k] = $v;
}
echo "Processing complete: " . json_encode($kv) . PHP_EOL . PHP_EOL;
}
echo '<pre>';
processPlugins('{plugin1}');
processPlugins('{plugin2=3}');
processPlugins('{plugin2 arg1=b|arg2=d}');
Of course the echo lines could be thrown away. With them in place we get this output:
Starting with '{plugin1}'
The part between curly braces: 'plugin1'
After plugin name has been removed: ''
Processing complete: plugin1
Starting with '{plugin2=3}'
The part between curly braces: 'plugin2=3'
After plugin name has been removed: '=3'
Key-Values: ["=3"]
Item: '=3', Key: '', Value: '3'
Processing complete: 3
Starting with '{plugin2 arg1=b|arg2=d}'
The part between curly braces: 'plugin2 arg1=b|arg2=d'
After plugin name has been removed: 'arg1=b|arg2=d'
Key-Values: [" arg1=b","arg2=d"]
Item: ' arg1=b', Key: 'arg1', Value: 'b'
Item: 'arg2=d', Key: 'arg2', Value: 'd'
Processing complete: {"arg1":"b","arg2":"d"}
This version works with inputs having more than one plugin token.
function processPlugins($text, $pos = 0, $start = '{', $end = '}') {
$processed = [];
$t = substr($text, $pos);
$parts = explode($start, $t);
array_shift($parts);
foreach($parts as $part) {
$pparts = explode($end, $part);
$t = trim($pparts[0]);
$t = str_replace(['plugin1', 'plugin2'], '', $t);
$n = strlen($t);
if(!$n) {
$processed[] = trim($pparts[0]);
continue;
}
$params = explode('|', $t);
$kv = [];
foreach($params as $p) {
list($k, $v) = explode('=', trim($p));
if(trim($k) === '') {
$processed[] = trim($v);
continue 2;
}
$kv[trim($k)] = trim($v);
}
$processed[] = $kv;
}
return $processed;
}
function test($case) {
$p = processPlugins($case);
echo "$case => " . json_encode($p) . PHP_EOL;
}
$cases = [
'{plugin1}',
'{plugin2=3}',
'{plugin2 arg1=b|arg2=d}',
'text here {plugin1} and more{plugin2=55}here {plugin2 arg1=b|arg2=d} till the end'
];
foreach($cases as $case) {
test($case);
}
The output:
{plugin1} => ["plugin1"]
{plugin2=3} => ["3"]
{plugin2 arg1=b|arg2=d} => [{"arg1":"b","arg2":"d"}]
text here {plugin1} and more{plugin2=55}here {plugin2 arg1=b|arg2=d} till the end => ["plugin1","55",{"arg1":"b","arg2":"d"}]
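For comparison, here is a hedged sketch of the regex route the question set aside. It replaces each token in place and hands the parsed arguments to a callback, which is the stated goal. The names parsePluginArgs, renderPlugins, and the demo handler are hypothetical, and the code assumes PHP 7.1 or later:

```php
<?php
// Hypothetical helper: turns the raw text after the plugin name into arguments.
function parsePluginArgs(string $raw)
{
    if ($raw === '') {
        return null;                          // {pluginName}
    }
    if ($raw[0] === '=') {
        return substr($raw, 1);               // {pluginName=3}
    }
    $args = [];                               // {pluginName id=3|view=simple}
    foreach (explode('|', trim($raw)) as $pair) {
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, null);
        $args[trim($key)] = $value;
    }
    return $args;
}

// Replaces every known plugin token with whatever $handler returns for it.
function renderPlugins(string $text, array $plugins, callable $handler): string
{
    $names = implode('|', array_map(function ($n) {
        return preg_quote($n, '/');
    }, $plugins));
    // Group 1: plugin name. Group 2 (optional): "=value" or " k=v|k=v".
    $pattern = '/\{(' . $names . ')((?:=| )[^}]*)?\}/';

    return preg_replace_callback($pattern, function (array $m) use ($handler) {
        return $handler($m[1], parsePluginArgs($m[2] ?? ''));
    }, $text);
}

// Demo handler (an assumption): shows the name and the parsed arguments.
$out = renderPlugins(
    'a {plugin1} b {plugin2=55} c {plugin2 arg1=b|arg2=d}',
    ['plugin1', 'plugin2'],
    function (string $name, $args) {
        return $name . ':' . json_encode($args);
    }
);
echo $out, PHP_EOL;
```

Unlike the str_replace approach, this keeps the plugin name attached to its arguments, so the handler knows which plugin each token belongs to.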
I want my code to list files with a specific extension only. I have a list of PHP files and I use this code to list all of them on a page.
What I need is to list one specific extension only: PHP files.
Right now my code lists all files in the same folder.
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
Can someone please post an answer that edits this and shows me how to make it list PHP files only?
Hope this helps. It will only read files whose names end with php.
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if (substr($myfile, -3) != "php")
continue;
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
glob() is pretty much made to do this exactly, and it avoids regex in a loop, substringing on an arbitrary number of chars, etc.
foreach (glob("*.php") as $script) {
echo "$script size " . filesize($script) . PHP_EOL;
}
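Applied to the question's loop, the glob() idea could look like the sketch below. The function name listPhpFiles and its $exclude parameter are hypothetical; the index.php exclusion mirrors the check in the original code:

```php
<?php
// Hypothetical helper: returns the names of the .php files in $dir, minus exclusions.
function listPhpFiles(string $dir, array $exclude = ['index.php']): array
{
    $names = [];
    // glob() never returns "." or "..", so no dot check is needed.
    // It can return false on error, hence the fallback to an empty array.
    foreach (glob(rtrim($dir, '/\\') . '/*.php') ?: [] as $path) {
        $name = basename($path);
        if (!in_array($name, $exclude, true)) {
            $names[] = $name;
        }
    }
    return $names;
}

print_r(listPhpFiles(__DIR__));
```

Each returned name can then be passed to file_get_contents() and get_string_between() as before. Note that glob() treats characters such as [ and * in the directory path as pattern syntax, so this sketch assumes a plain directory name.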
Something like this (untested):
$Dir = new DirectoryIterator($dir);
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
if ($fileinfo->isDot()) continue;
var_dump( $fileinfo );
}
Or even better
$Dir = new FilesystemIterator(__DIR__, FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS | FilesystemIterator::KEY_AS_PATHNAME );
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
// No isDot() check needed: SKIP_DOTS already drops '.' and '..', and the SplFileInfo items have no isDot() method.
var_dump( $fileinfo );
}
This second one skips the dot entries ('.' and '..') and converts Windows-style \ path separators to Unix-style / (mainly useful on Windows), automatically.
I am having a lot of trouble finding a string between two strings.
This is the code I currently have:
<?
$html = file_get_contents('mywebsite');
$tags = explode('<',$html);
foreach ($tags as $tag)
{
// skip scripts
if (strpos($tag,'script') !== FALSE) continue;
// get text
$text = strip_tags('<'.$tag);
// only if text present remember
if (trim($text) != '') $texts[] = $text;
//print_r($text);
echo($text);
}
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = $text;
$parsed = get_string_between($fullstring, "tag1", "tag2");
print_r($parsed);
echo ($parsed);
?>
I think the problem happens on this line:
$fullstring = $text;
I am not entirely sure whether $text holds the stripped-down HTML from the loop above. When I run this code I get the stripped webpage as expected, but nothing from between the tags I am setting.
Does anyone know why this might be happening, or what I am missing?
I think it's because $text is overwritten on every pass through the loop, so by the time you assign $text to $fullstring it only holds the text of the last tag (PHP loops do not create a new scope, so it is not null, but it is not the whole page either). I don't fully understand what you are trying to do, but try this and see if it works:
$fullstring = "";
foreach ($tags as $tag){
#your code as usual
echo($text);
$fullstring = $fullstring.$text;
}
Then delete the $fullstring = $text; line.
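Put together, the accumulation fix might look like the following sketch. The inline $html sample stands in for file_get_contents('mywebsite') and is an assumption, as are the tag1 and tag2 markers placed in it:

```php
<?php
// Same helper as in the question.
function get_string_between($string, $start, $end) {
    $string = ' ' . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return '';
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}

// Inline sample standing in for the downloaded page (assumption).
$html = '<p>intro tag1</p><script>var x = 1;</script>'
      . '<p>wanted text</p><p>tag2 outro</p>';

$fullstring = '';
foreach (explode('<', $html) as $tag) {
    // Skip script fragments, as in the question.
    if (strpos($tag, 'script') !== false) continue;
    $text = strip_tags('<' . $tag);
    if (trim($text) !== '') {
        $fullstring .= $text;   // accumulate instead of overwriting
    }
}

echo get_string_between($fullstring, 'tag1', 'tag2'), PHP_EOL;
```

Because every fragment is appended, the two markers end up in the same string even when they sit in different tags, which is what the search needs.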
You can use this:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
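The helper above returns an empty string when the start marker is missing and does not check whether the end marker exists. A stricter variant is sketched below; the name stringBetween and the choice to return null on a miss are assumptions, and the nullable return type needs PHP 7.1 or later:

```php
<?php
// Hypothetical stricter variant: returns null when either delimiter is missing.
function stringBetween(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;                 // start marker not found
    }
    $from += strlen($start);

    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;                 // end marker not found after the start
    }
    return substr($haystack, $from, $to - $from);
}

var_dump(stringBetween('this is my [tag]dog[/tag]', '[tag]', '[/tag]'));
var_dump(stringBetween('this is my [tag]dog', '[tag]', '[/tag]'));
```

Comparing against false with === also removes the need to prepend a space to the input, since a match at position 0 is no longer mistaken for a miss.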
I want to cut the string if its length is greater than 80. If the string contains a tag and the cut would fall between the opening and closing tag, the crop should happen only after the closing tag. This is my code:
<?php
echo $word='hello good morning<span class="em emj2"></span> <span class="em emj13"></span> <span class="em emj19"></span> <span class="em emj13"></span> hai';
$a=strlen($word);
if($a>80)
{
echo substr($word,0,80);
}
else
echo $word;
?>
I know this is not a good answer by Stack Overflow standards, as I don't have time to explain exactly how every part of it works. But this is a function I use to crop strings while keeping the HTML intact.
function truncate($text, $length, $suffix = '…', $isHTML = true) {
$i = 0;
$simpleTags=array('br'=>true,'hr'=>true,'input'=>true,'image'=>true,'link'=>true,'meta'=>true);
$tags = array();
if($isHTML){
preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
foreach($m as $o){
if($o[0][1] - $i >= $length)
break;
$t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
if($t[0] != '/' && (!isset($simpleTags[$t])))
$tags[] = $t;
elseif(end($tags) == substr($t, 1))
array_pop($tags);
$i += $o[1][1] - $o[0][1];
}
}
$output = substr($text, 0, $length = min(strlen($text), $length + $i));
$output2 = (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');
$pos = (int)end(end(preg_split('/<.*>| /', $output, -1, PREG_SPLIT_OFFSET_CAPTURE)));
$output.=$output2;
$one = substr($output, 0, $pos);
$two = substr($output, $pos, (strlen($output) - $pos));
preg_match_all('/<(.*?)>/s', $two, $tags);
if (strlen($text) > $length) { $one .= $suffix; }
$output = $one . implode($tags[0]);
$output = str_replace('</!-->','',$output);
return $output;
}
Then simply call it like so:
echo truncate($your_string, 80, '…', true);
I have an HTML file as the source. It contains several instances of the following code:
<span itemprop="name">NAME</span>
where the NAME part is different each time.
How can I write PHP code that goes through the HTML, extracts all the names between "<span itemprop="name">" and "</span>", and puts them in an array?
I have tried this code, but it doesn't work:
$prev=$html;
for($i=0; $i<10; $i++){
$current = explode('<span itemprop="name">', $prev);
$cur = explode('</span>', $current[1]);
$names[] = $cur[0];
$prev = $current[2];
}
print_r($names);
A better way than the one you planned would probably be to use PHP's DOMDocument, Simple HTML DOM, or any other DOM parser.
Here is an example of working DOMDocument code:
$doc = new DOMDocument();
$doc->loadHTML('<html><body><span itemprop="name">1</span><span itemprop="name">2</span><span itemprop="name">3</span></body></html>');
$finder = new DomXPath($doc);
$nodes = $finder->query("//*[contains(@itemprop, 'name')]");
foreach($nodes as $node)
{
echo $node->nodeValue . '<br />';
}
Outputs:
1
2
3
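Since the question asks for the names in an array, the same approach can collect them instead of echoing. This is a sketch with made-up sample names; it uses an exact attribute match rather than contains(), which is a deliberate swap so that values such as "surname" are not picked up:

```php
<?php
// Sample markup with illustrative names (an assumption).
$html = '<html><body><span itemprop="name">Alice</span>'
      . '<span itemprop="name">Bob</span></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep warnings from imperfect markup quiet
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$names = [];
// Select only <span> elements whose itemprop attribute is exactly "name".
foreach ($xpath->query('//span[@itemprop="name"]') as $node) {
    $names[] = trim($node->textContent);
}

print_r($names);
```

The libxml calls are optional for well-formed input but help when loading real pages, which often trigger parser warnings.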
I kinda feel bad for saying this... but you could use a regular expression
preg_match_all('/<span itemprop="name">(.*?)<\/span>/i', $html, $matches);
var_dump($matches[1]); // the captured names are stored in $matches[1]
This function will get us the "NAME"
function getbetween($content,$start,$end) {
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
This function will replace only the first occurrence
<?php
function str_replace_once($search, $replace, $subject) {
$firstChar = strpos($subject, $search);
if($firstChar !== false) {
$beforeStr = substr($subject,0,$firstChar);
$afterStr = substr($subject, $firstChar + strlen($search));
return $beforeStr.$replace.$afterStr;
} else {
return $subject;
}
}
?>
Now a loop:
$start = '<span itemprop="name">';
$end = '</span>';
while(strpos($content, $start) !== false) { // !== false so a match at position 0 still counts
$name = getbetween($content, $start, $end);
$content = str_replace_once($start.$name.$end, '',$content);
echo $name.'<br>';
}
Use this function:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)