PHP: substring occurrences between two strings in an HTML file - php

I have an HTML file as a source. It contains several instances of the following code:
<span itemprop="name">NAME</span>
where the NAME part is different each time.
How can I write PHP code that goes through the HTML, extracts all the names between "<span itemprop="name">" and "</span>", and puts them in an array?
I have tried this code, but it doesn't work:
$prev=$html;
for($i=0; $i<10; $i++){
$current = explode('<span itemprop="name">', $prev);
$cur = explode('</span>', $current[1]);
$names[] = $cur[0];
$prev = $current[2];
}
print_r($names);

A better way would probably be to use PHP's DOMDocument, Simple HTML DOM, or any other DOM parser rather than the approach you planned.
Here is an example of working DOMDocument code:
$doc = new DOMDocument();
$doc->loadHTML('<html><body><span itemprop="name">1</span><span itemprop="name">2</span><span itemprop="name">3</span></body></html>');
$finder = new DomXPath($doc);
$nodes = $finder->query("//*[contains(@itemprop, 'name')]");
foreach($nodes as $node)
{
echo $node->nodeValue . '<br />';
}
Outputs:
1
2
3
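Since the question asks for an array rather than echoed output, the same idea can be sketched as a function that returns the names. This is a minimal sketch, assuming the markup shown in the question; the helper name extractItempropNames and the sample names are hypothetical, and it matches the attribute exactly instead of using contains().

```php
<?php
// Hypothetical helper: returns the text of every <span itemprop="name">.
function extractItempropNames(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $names = [];
    foreach ($xpath->query('//span[@itemprop="name"]') as $node) {
        $names[] = trim($node->textContent);
    }
    return $names;
}

// Sample input (assumed for illustration only).
$html = '<html><body><span itemprop="name">Alice</span><p>x</p>'
      . '<span itemprop="name">Bob</span></body></html>';
print_r(extractItempropNames($html));
```

With the sample input, the returned array holds "Alice" and "Bob" in document order.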

I kind of feel bad for saying this... but you could use a regular expression:
preg_match_all('/<span itemprop="name">(.*?)<\/span>/i', $html, $matches);
var_dump($matches[1]); // the captured names are stored in $matches[1]

This function will get us the "NAME"
function getbetween($content,$start,$end) {
$r = explode($start, $content);
if (isset($r[1])){
$r = explode($end, $r[1]);
return $r[0];
}
return '';
}
This function will replace only the first occurrence
<?php
function str_replace_once($search, $replace, $subject) {
$firstChar = strpos($subject, $search);
if($firstChar !== false) {
$beforeStr = substr($subject,0,$firstChar);
$afterStr = substr($subject, $firstChar + strlen($search));
return $beforeStr.$replace.$afterStr;
} else {
return $subject;
}
}
?>
now a loop
$start = '<span itemprop="name">';
$end = '</span>';
while(strpos($content, $start) !== false) { // !== false also catches a match at position 0
$name = getbetween($content, $start, $end);
$content = str_replace_once($start.$name.$end, '',$content);
echo $name.'<br>';
}

use this function:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)
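Note that get_string_between returns only the first match. For the original question, which needs every name, one option is to keep searching from the end of the previous match. The following is a sketch under that assumption; get_all_between is a hypothetical name, not a built-in function.

```php
<?php
// Hypothetical helper: collects every substring found between $start and $end.
function get_all_between(string $string, string $start, string $end): array
{
    if ($start === '' || $end === '') {
        return []; // empty delimiters would never advance the search
    }
    $results = [];
    $offset = 0;
    while (($ini = strpos($string, $start, $offset)) !== false) {
        $ini += strlen($start);
        $stop = strpos($string, $end, $ini);
        if ($stop === false) {
            break; // opening delimiter without a closing one
        }
        $results[] = substr($string, $ini, $stop - $ini);
        $offset = $stop + strlen($end); // continue after this match
    }
    return $results;
}

$html = '<span itemprop="name">Alice</span> and <span itemprop="name">Bob</span>';
print_r(get_all_between($html, '<span itemprop="name">', '</span>'));
```

Because the search uses an offset instead of modifying the input, the source string is left untouched and a match at position 0 is handled without the leading-space trick.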

Related

Processing text in PHP finding a matching character

How can I process text that contains placeholder codes?
So suppose I have text as below
Hello {::first_name::} {::last_name::},
How are you?
Your organisation is {::organisation::}
Any text between {:: and ::} should be evaluated to get its value.
I tried exploding the text into an array using a space as the delimiter, then checking each item for "{::". If found, I take the string between "{::" and "::}" and call the database to get the value of that field.
So basically these will be db fields.
Below is the code I have tried
$msg = "Hello {::first_name::} {::last_name::},
How are you?
Your organisation is {::organisation::}";
$msg_array = explode(" ", $msg);
foreach ($msg_array as $str) {
if (strpos($str, "{::") !== false) {
$field_str = get_string_between($str, "{::", "::}");
$field_value = $bean->$field_str; //Logic that gets the value of the field
$msgStr .= $field_value . " ";
} else {
$msgStr .= $str . " ";
}
}
function get_string_between($string, $start, $end)
{
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
Your script seems fine.
If you are looking for an alternative, you can try using preg_match_all() with str_replace(array, array, source):
<?php
$bean = new stdClass();
$bean->first_name = 'John';
$bean->last_name = 'Doe';
$bean->organisation = 'PHP Company';
$string = "Hello {::first_name::} {::last_name::}, How are you? Your organisation is {::organisation::}";
// find all placeholders
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$placeholders = $matches[0];
//strings inside placeholders
$parts = $matches[1];
// return values from $bean by matching object property with strings inside placeholders
$replacements = array_map(function($value) use ($bean) {
// use trim() to remove unexpected space
return $bean->{trim($value)};
}, $parts);
echo $newstring = str_replace($placeholders, $replacements, $string);
Short format:
$string = "Hello {::first_name::} {::last_name::}, How are you? Your organisation is {::organisation::}";
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$replacements = array_map(function($value) use ($bean) {
return $bean->{trim($value)};
}, $matches[1]);
echo str_replace($matches[0], $replacements, $string);
And if you prefer to use a function:
function holder_replace($string, $source = null) {
if (is_object($source)) {
preg_match_all('/{::(.+?)::}/i', $string, $matches);
$replacements = array_map(function($value) use ($source) {
return (property_exists($source, trim($value))) ? $source->{trim($value)} : $value;
}, $matches[1]);
return str_replace($matches[0], $replacements, $string);
}
return $string;
};
echo holder_replace($string, $bean);
OUTPUT:
Hello John Doe, How are you? Your organisation is PHP Company
Or you can simply use str_replace function:
$data = "{:: string ::}";
echo str_replace("::}", "",str_replace("{::", "", $data));
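The preg_match_all() plus str_replace() approach above can also be collapsed into a single pass with preg_replace_callback(). This is only a sketch: fill_placeholders is a hypothetical name, the values come from a plain array instead of the $bean object, and it assumes field names consist of word characters only.

```php
<?php
// Hypothetical helper: replaces {::field::} placeholders with entries of $values.
function fill_placeholders(string $template, array $values): string
{
    return preg_replace_callback(
        '/\{::\s*(\w+)\s*::\}/',
        function (array $m) use ($values) {
            // Unknown placeholders are left as they are.
            return array_key_exists($m[1], $values) ? (string) $values[$m[1]] : $m[0];
        },
        $template
    );
}

// Sample values (assumed; in the question they would come from the database).
$values = [
    'first_name'   => 'John',
    'last_name'    => 'Doe',
    'organisation' => 'PHP Company',
];
echo fill_placeholders(
    'Hello {::first_name::} {::last_name::}, your organisation is {::organisation::}',
    $values
), PHP_EOL;
```

Doing the replacement inside the callback avoids splitting the message on spaces, so placeholders followed by punctuation or a line break are handled the same way as any other.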

php listing specific file extension only

I want my code to list files with a specific extension only. I have a folder of PHP files, and I use this code to list all of them on a page.
What I need is to list one specific extension only: PHP files.
Right now my code lists all files in the same folder.
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
Can someone please post an answer that edits this and shows me how to make it list PHP files only?
Hope this helps. It will only read files whose names end with "php":
<?php
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$files1 = scandir(dirname(__FILE__));
?>
<?php
foreach($files1 as $myfile){
if (substr($myfile, -3) != "php")
continue;
if($myfile!='.' && $myfile!='..' && $myfile!='index.php'){
$value = file_get_contents($myfile);
$fullstring = $value;
$parsed = get_string_between($fullstring, '$Cont1 = \'', '\'');
$parsed1 = get_string_between($fullstring, '$Con2 = \'', '\'');
?>
<?php
}
}
?>
glob() is pretty much made to do this exactly, and it avoids regex in a loop, substringing on an arbitrary number of chars, etc.
foreach (glob("*.php") as $script) {
echo "$script size " . filesize($script) . PHP_EOL;
}
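To keep the question's exclusion of index.php, the glob() call can be wrapped in a small function. This is a sketch with assumed names: list_php_files and its $exclude parameter are hypothetical, and it assumes the directory path contains no glob metacharacters such as * or [.

```php
<?php
// Hypothetical helper: names of the .php files in $dir, minus excluded names.
function list_php_files(string $dir, array $exclude = ['index.php']): array
{
    $paths = glob(rtrim($dir, '/\\') . '/*.php');
    if ($paths === false) {
        return []; // glob() reports errors by returning false
    }
    $names = array_map('basename', $paths);
    // array_values() reindexes the list after entries are removed.
    return array_values(array_diff($names, $exclude));
}

foreach (list_php_files(__DIR__) as $name) {
    echo $name, PHP_EOL;
}
```

Since glob() only returns matching entries, the '.' and '..' checks from the original loop are no longer needed.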
Something like this (untested):
$Dir = new DirectoryIterator(__DIR__);
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
if ($fileinfo->isDot()) continue;
var_dump( $fileinfo );
}
Or even better
$Dir = new FilesystemIterator(__DIR__, FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS | FilesystemIterator::KEY_AS_PATHNAME );
$iterator = new RegexIterator($Dir, '/\.php$/i', RegexIterator::MATCH);
foreach($iterator as $fileinfo) {
if ($fileinfo->isDot()) continue;
var_dump( $fileinfo );
}
This second one lets you skip the dot entries ('.' and '..') and converts Windows-style \ separators to Unix-style / (mainly useful on Windows), automatically.

echo hyperlinks in array

I have an array that looks like this:
[6625] => Trump class="mediatype"> href="/news/picture">Slideshow: [6628] => href="http://www.example.com/news/picture/god=USRTX1N84J">GOP [6630] => nation
I need to be able to pull out anything within href="" of the array and put into a new one.
I have tried:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
echo getStringInBetween($arr[0], 'href="', '"')
?>
Try this code and adapt it to suit your needs:
<?php
$homepage = file_get_contents('http://www.example.com/');
$arr = explode(" ",$homepage);
function getStringInBetween($string, $start, $end){
$string = " " . $string;
$initial = strpos($string, $start);
if ($initial == 0) return "";
$initial += strlen($start);
$length = strpos($string, $end, $initial) - $initial;
return substr($string, $initial, $length);
}
foreach ($arr as $val) {
if (strpos($val, 'href') !== false) {
echo getStringInBetween($val, 'href="', '"');
}
}
?>
When run, this example output google.com/hello.
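Splitting the page on spaces breaks as soon as an attribute value contains a space, so a DOM parser may be a more robust way to collect the href values into a new array. The sketch below assumes the links are ordinary <a> elements; extract_hrefs and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the href value of every <a> element in $html.
function extract_hrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally; scraped pages are rarely valid HTML.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        if ($link->hasAttribute('href')) {
            $hrefs[] = $link->getAttribute('href');
        }
    }
    return $hrefs;
}

// Sample markup (assumed); in the question it would come from file_get_contents().
$html = '<p><a class="mediatype" href="/news/picture">Slideshow</a> '
      . '<a href="http://www.example.com/news">GOP</a> '
      . '<a name="top">no link</a></p>';
print_r(extract_hrefs($html));
```

Anchors without an href attribute, such as the last one in the sample, are skipped.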

Get substring between two strings PHP - Reading HTML

I am having a ton of trouble finding a string between two strings.
This is the code I currently have:
<?
$html = file_get_contents('mywebsite');
$tags = explode('<',$html);
foreach ($tags as $tag)
{
// skip scripts
if (strpos($tag,'script') !== FALSE) continue;
// get text
$text = strip_tags('<'.$tag);
// only if text present remember
if (trim($text) != '') $texts[] = $text;
//print_r($text);
echo($text);
}
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = $text;
$parsed = get_string_between($fullstring, "tag1", "tag2");
print_r($parsed);
echo ($parsed);
?>
I think the problem happens on this line:
$fullstring = $text;
I am not entirely sure whether $text holds the stripped-down HTML from the loop above. When I run this code, I get the stripped-out web page as I expect, but I get nothing between the tags I am setting.
Does anyone know why this might be happening or what I am missing?
I think it's because $text is reassigned on every pass through the foreach loop, so when you assign $text to $fullstring afterwards it only holds the text of the last tag, which may well be empty. I don't understand exactly what you are trying to do, but try this and see if it works:
$fullstring = "";
foreach ($tags as $tag){
#your code as usual
echo($text);
$fullstring = $fullstring.$text;
}
and delete the $fullstring = $text line.
You can use this:
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
$fullstring = 'this is my [tag]dog[/tag]';
$parsed = get_string_between($fullstring, '[tag]', '[/tag]');
echo $parsed; // (result = dog)

Iterate through array, and run function on each value

I need to iterate through an array and apply the same edit to each value.
<?php
Function parseStatus($Input, $Start, $End){
$String = " " . $Input;
$Init = StrPos($String, $Start);
If($Init == 0){
Return '';
}
$Init += StrLen($Start);
$Length = StrPos($String, $End, $Init) - $Init;
Return SubStr($String, $Init, $Length);
}
Function getAllStatuses($Username){
$DOM = new DOMDocument();
$DOM->validateOnParse = True;
#$DOM->loadHtml(File_Get_Contents('http://lifestream.aol.com/stream/' . $Username));
$xPath = new DOMXPath($DOM);
$Stream = $DOM->getElementById('stream')->nodeValue; // return stream content for display name
$Nodes = $xPath->query('//div[@class="stream"]');
$Name = Explode(' ', Trim($Stream));
$User = $Name[0];
$Statuses = Array();
ForEach($Nodes as $Node){
ForEach($Node->getElementsByTagName('li') as $Key => $Tags){
$Statuses[] = $Tags->nodeValue;
}
}
ForEach($Statuses as $Status){
If(StrPos($Status, 'Services')){
Echo 'services is definitely in there';
$New = AIM::parseStatus($Status, $User, 'Services');
Echo $New;
Break;
}
}
?>
The issue is that $New only echoes the very first output. How do I get it to run through each value in the array and do the same thing?
Expected output:
[name as start] what i need [word Services]
Then on each value in the array, do the same thing so it'd be like:
what i need
again what i need but different string
etc.
Thanks for any help.
The Break; in your foreach loop is, well, breaking the loop.
Remove the Break; and it should work.
Have a read here:
http://www.php.net/break
break ends execution of the current for, foreach, while, do-while or switch structure.
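To illustrate, here is a sketch of the loop without the Break, using continue to skip entries that do not contain the marker word. The sample statuses and the $user value are made up, and parse_status is a standalone copy of the question's parseStatus with explicit false checks added.

```php
<?php
// Standalone version of the question's parseStatus, so the sketch runs on its own.
function parse_status(string $input, string $start, string $end): string
{
    $string = ' ' . $input;
    $init = strpos($string, $start);
    if ($init === false) {
        return '';
    }
    $init += strlen($start);
    $stop = strpos($string, $end, $init);
    if ($stop === false) {
        return '';
    }
    return substr($string, $init, $stop - $init);
}

// Hypothetical sample data standing in for the scraped statuses.
$user = 'bob';
$statuses = [
    'bob went hiking Services',
    'unrelated line',
    'bob cooked dinner Services',
];

$parsed = [];
foreach ($statuses as $status) {
    if (strpos($status, 'Services') === false) {
        continue; // skip this entry, but keep looping
    }
    $parsed[] = trim(parse_status($status, $user, 'Services'));
}
print_r($parsed);
```

Here continue moves on to the next element, whereas break would leave the loop after the first match, which is the behaviour described in the question.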
