I can make this work:
if ($content = file_get_contents("http://www.somerandomwebsite.com")) {
echo $content;
}
But is there a way to do this?
if ($content = file_get_contents("site:somerandomwebsite.com")) {
echo "Still Indexed!";
}
else {
echo "Google does not love you anymore";
}
You want this URL: http://www.google.com/search?q=site%3Asomerandomwebsite.com
But just checking if there is content is not enough, you will need to parse the actual HTML of the resulting page.
You would do better to use this: http://code.google.com/apis/websearch/docs/ (although it is deprecated, I have not found the replacement - anyone know what it is?)
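For completeness, a rough sketch of the scraping approach - this assumes Google serves the results page to a plain file_get_contents() request (it often won't for automated clients) and that the "did not match any documents" marker is still present, which can change at any time:
<?php
// Illustrative only: Google may block or CAPTCHA automated requests.
$query   = urlencode('site:somerandomwebsite.com');
$results = @file_get_contents('http://www.google.com/search?q=' . $query);

if ($results !== false && strpos($results, 'did not match any documents') === false) {
    echo "Still Indexed!";
} else {
    echo "Google does not love you anymore";
}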
I've searched around and around and I'm not sure how this really works.
I have the tags
<taghere>content</taghere>
and I want to pull the "content" so I can run an if statement depending on what the "content" is, as the "content" varies depending on the page, e.g.
<taghere>HelloWorld</taghere>
$content = //function that returns the text between <taghere> and </taghere>
if($content == "HelloWorld")
{
//execute function;
}
else if($content =="Bonjour")
{
//execute seperate function
}
I tried using preg but it doesn't seem to work; it just returns whatever value is in the lines field instead of actually giving me the information within the tags.
If I understand your question correctly, you want the data INSIDE the tag "taghere".
If you are parsing HTML, you should use DOMDocument
Try something similar to this:
<?php
// Assuming your content (the html where those tags are found) is available as $html
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your HTML
libxml_clear_errors();
// Note: Tag names are case sensitive
$text = $doc->getElementsByTagName('taghere')->item(0)->textContent;
// Echo the content
echo $text;
You can use DOMDocument and loadXML to do this:
<?php
function doAction($word=""){
$html="<taghere>$word</taghere>";
$doc = new DOMDocument();
$doc->loadXML($html);
//discard white space
$hTwo = $doc->getElementsByTagName('taghere'); // use your desired tag here
if($hTwo->item(0)->nodeValue== "HelloWorld")
{
echo "1";
}
else if($hTwo->item(0)->nodeValue== "Bonjour")
{
echo "2";
//execute separate function
}
}
doAction("Bonjour");
You cannot do it like that. Technically it is possible, but it's overkill, and you mixed up PHP with HTML in a way that doesn't work.
To achieve the thing that you want you have to do something like this:
$content = 'something';
if ($content === 'something') {
//do something
}
if ($content === 'something else') {
//do something else
}
echo '<tag>'. $content . '</tag>' ;
Of course you can change $content in the ifs.
Don't forget, you can always add an ID to a tag so you can reference it with JavaScript.
<tag id='tagid'>blah blah blah </tag>
<script>
document.getElementById('tagid')
</script>
This might be a much simpler way to get what you are thinking about than some of the other responses.
I don't know what regex you tried, and therefore can't say what was wrong with it. It might have been the escaping of the <.
<?php
if(preg_match('#\<taghere>(.*)\</taghere>#', $document, $a)){
$content = $a[1];
}
?>
I suppose there will be only one match.
I want to replace braces with <?php ?> in a file with a php extension.
I have a class as a library, and in this class I have three functions like these:
function replace_left_delimeter($buffer)
{
return($this->replace_right_delimeter(str_replace("{", "<?php echo $", $buffer)));
}
function replace_right_delimeter($buffer)
{
return(str_replace("}", "; ?> ", $buffer));
}
function parser($view,$data)
{
ob_start(array($this,"replace_left_delimeter"));
include APP_DIR.DS.'view'.DS.$view.'.php';
ob_end_flush();
}
and I have a view file with a php extension like this:
{tmp} tmpstr
In the output I see just tmpstr, and in the page source in the browser I get
<?php echo $tmp; ?>
tmpstr
In the included file, <? is shown as <!--? and treated as a comment. Why?
What you're trying to do here won't work. The replacements carried out by the output buffering callback occur after PHP code has already been parsed and executed. Introducing new PHP code tags at this stage won't cause them to be executed.
You will need to instead preprocess the PHP source file before evaluating it, e.g.
$tp = file_get_contents(APP_DIR.DS.'view'.DS.$view.'.php');
$tp = str_replace("{", "<?php echo \$", $tp);
$tp = str_replace("}", "; ?>", $tp);
eval($tp);
However, I'd strongly recommend using an existing template engine; this approach will be inefficient and limited. You might want to give Twig a shot, for instance.
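For reference, a minimal Twig sketch (assuming Twig is installed via Composer and the template lives in a views/ directory; the file and variable names are just illustrative):
<?php
require_once 'vendor/autoload.php';

// views/view.html.twig would contain: {{ tmp }} tmpstr
$loader = new \Twig\Loader\FilesystemLoader('views');
$twig   = new \Twig\Environment($loader);

echo $twig->render('view.html.twig', ['tmp' => 'some value']);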
Do this:
function parser($view,$data)
{
$data=array("data"=>$data);
$template=file_get_contents(APP_DIR.DS.'view'.DS.$view.'.php');
$replace = array();
foreach ($data as $key => $value) {
#if $data is array...
$replace = array_merge(
$replace,array("{".$key."}"=>$value)
);
}
$template=strtr($template,$replace);
echo $template;
}
and ignore the other two functions.
How about this:
process.php:
<?php
$contents = file_get_contents('php://stdin');
$contents = preg_replace('/\{([a-zA-Z_][a-zA-Z_0-9]*)\}/', '<?php echo $\1; ?>', $contents);
echo $contents;
bash script:
php process.php < my_file.php
Note that the above works by doing a one-off search and replace. You can easily modify the script if you want to do this on the fly.
Note also that modifying PHP code from within PHP code is a bad idea. Self-modifying code can lead to hard-to-find bugs, and is often associated with malicious software. If you explain what you are trying to achieve - your purpose - you might get a better response.
I am very new to both PHP and XML. What I am trying to do in PHP is read in XML from a call to a URL, and then parse the XML.
I can get this to work in the example below when $urlip = 'localfile.xml', but not when I put in a URL. I've checked the URL by going to it with my browser, and I can see the XML. I also did a view source, copied it, and pasted the XML into the local file, and that works fine.
What am I doing wrong in trying to get the XML from the URL?
Thank you
The error being returned is:
Error loading XML Start tag expected, '<' not found
Here is my code snippet:
$urlip="test.xml";# for debugging since I cannot read from the url yet! not sure why....
if (($xml = file_get_contents($urlip))===false) {
echo "error fetching XML\n";
} else {
libxml_use_internal_errors(true);
$data = simplexml_load_string($xml,null,LIBXML_NOCDATA);
if (!$data) {
echo "Error loading XML\n";
foreach(libxml_get_errors() as $error) {
echo "\t", $error->message;
}
} else {
foreach ($data as $item) {
$type = $item->TAB_TYPE;
$number=$item->ALT_ID;
$title = $item->SHORT_DESCR;
$searchlink = $item->ID;
$rsite=$item->CATEGORY;
echo "type $type, number $number, title $title, search link $searchlink, site $rsite\n";
}
}
}
The most likely situation, from what it looks like:
Your call to the remote URL returns an empty string, which is not === false, so it falls through to the else branch.
After that you try to parse the empty string as XML, which cannot work, so it gives you that error.
Your steps to solve it:
configure PHP to allow opening remote URLs, as the comments on your question state (the allow_url_fopen setting)
use another way to get content from the URL - the cURL library works well (see the sketch below)
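For the second option, a rough sketch of fetching the XML with cURL before handing it to simplexml_load_string() (your existing parsing code stays the same):
<?php
$ch = curl_init($urlip);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

$xml = curl_exec($ch);
if ($xml === false) {
    echo "error fetching XML: " . curl_error($ch) . "\n";
} else {
    // hand $xml to simplexml_load_string() exactly as in your snippet
    $data = simplexml_load_string($xml, null, LIBXML_NOCDATA);
}
curl_close($ch);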
I'm testing a parser using SIMPLE_HTML_DOM, and while parsing the returned HTML DOM from this URL: HERE
it is not finding the H1 elements...
I tried returning all the divs with success.
I'm using a simple request to diagnose this problem:
foreach($html->find('H1') as $value) { echo "<br />F: ".htmlspecialchars($value); }
While looking at the source code I realized that h1 is uppercase (H1), but SIMPLE_HTML_DOM handles that:
//PaperG - If lowercase is set, do a case insensitive test of the value of the selector.
if ($lowercase) {
$check = $this->match($exp, strtolower($val), strtolower($nodeKeyValue));
} else {
$check = $this->match($exp, $val, $nodeKeyValue);
}
if (is_object($debugObject)) {$debugObject->debugLog(2, "after match: " . ($check ? "true" : "false"));}
Can anybody help me understand what is going on here?
Try this:
$oHtml = str_get_html($html);
foreach($oHtml->find('h1') as $element)
{
echo $element->innertext;
}
You can also use a regular expression; the following function returns an array of all h1 tags' inner text:
function getH1($yourhtml)
{
$h1tags = preg_match_all("/(<h1.*>)(\w.*)(<\/h1>)/isxmU", $yourhtml, $patterns);
$res = array();
array_push($res, $patterns[2]);
array_push($res, count($patterns[2]));
return $res;
}
Found it...
But I can't explain it!
I tested with other code including H1 (uppercase) and it worked.
While playing with the SIMPLE_HTML_DOM code I commented out the "remove_noise" calls and now it's working perfectly. I think it's because this website has invalid HTML and the noise remover is removing too much, not stopping after the closing script tags:
// $this->remove_noise("'<\s*script[^>]*[^/]>(.*?)<\s*/\s*script\s*>'is");
// $this->remove_noise("'<\s*script\s*>(.*?)<\s*/\s*script\s*>'is");
Thank you all for your help.
I'm looking to create a PHP script where a user provides a link to a webpage, and the script gets the contents of that webpage and parses them based on what they contain.
For example, if a user provides a YouTube link:
http://www.youtube.com/watch?v=xxxxxxxxxxx
Then, it will grab the basic information about that video (thumbnail, embed code?)
Or they might provide a vimeo link:
http://www.vimeo.com/xxxxxx
Or even if they were to provide any link, without a video attached, such as:
http://www.google.com/
And it could grab just the page Title or some meta content.
I'm thinking I'd have to use file_get_contents, but I'm not exactly sure how to use it in this context.
I'm not looking for someone to write the entire code, but perhaps provide me with some tools so that I can accomplish this.
You can use either the cURL or the HTTP library. You send an HTTP request, and can use the library to get the information from the HTTP response.
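For example, a minimal sketch that fetches a page and pulls out its <title> with DOMDocument (this assumes allow_url_fopen is enabled; otherwise swap in cURL):
<?php
$url  = 'http://www.google.com/'; // the user-supplied link
$html = @file_get_contents($url);

if ($html !== false) {
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $titles = $doc->getElementsByTagName('title');
    echo $titles->length ? trim($titles->item(0)->textContent) : '';
}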
I know this question is quite old, but I'll answer just in case someone hits it looking for the same thing.
Use oEmbed (http://oembed.com/) for YouTube, Vimeo, Wordpress, Slideshare, Hulu, Flickr and many other services. If not in the list or you want to make it more precise, you can use this:
http://simplehtmldom.sourceforge.net/
It's a sort of jQuery for PHP, meaning you can use HTML selectors to get portions of the code (i.e.: all the images, get the contents of a div, return only text (no HTML) contents of a node, etc).
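For the oEmbed route, a minimal sketch against YouTube's oEmbed endpoint (the field names follow the oEmbed spec; check each provider's documentation for its exact endpoint URL):
<?php
$videoUrl = 'http://www.youtube.com/watch?v=xxxxxxxxxxx'; // the user-supplied link
$endpoint = 'https://www.youtube.com/oembed?format=json&url=' . urlencode($videoUrl);

$json = @file_get_contents($endpoint);
if ($json !== false) {
    $info = json_decode($json, true);
    echo $info['title'];         // video title
    echo $info['thumbnail_url']; // thumbnail image URL
    echo $info['html'];          // embed code
}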
With Simple HTML DOM you could do something like this (it could be done more elegantly, but this is just an example):
require_once("simple_html_dom.php");
function getContent ($item, $contentLength)
{
$raw = "";
$content = "";
$html = null;
$images = "";
if (isset ($item->content) && $item->content != "")
{
$raw = $item->content;
$html = str_get_html ($raw);
$content = str_replace("\n", "<BR /><BR />\n\n", trim($html->plaintext));
try
{
foreach($html->find('img') as $image) {
if ($image->width != "1")
{
// Don't include images smaller than 100px height
$include = false;
$height = $image->height;
if ($height != "" && $height >= 100)
{
$include = true;
}
/*else
{
list($width, $height, $type, $attr) = getimagesize($image->src);
if ($height != "" && $height >= 100)
$include = true;
}*/
if ($include == true)
{
$images = $images . '<div class="theImage"><img src="'.$image->src.'" alt="'.$image->alt.'" class="postImage" border="0" /></div>';
}
}
}
}
catch (Exception $e) {
// Do nothing
}
$images = '<div id="images">'.$images.'</div>';
}
else
{
$raw = $item->summary;
$content = str_get_html ($raw)->plaintext;
}
return (substr($content, 0 , $contentLength) . (strlen ($content) > $contentLength ? "..." : "") . $images);
}
file_get_contents() would work in this case, assuming that you have allow_url_fopen set to true in your php.ini. What you would do is something like:
$pageContent = #file_get_contents($url);
if ($pageContent) {
preg_match_all('#<embed.*</embed>#', $pageContent, $matches);
$embedStrings = $matches[0];
}
That said, file_get_contents() won't give you much in the way of error handling other than receiving the content on success or false on failure. If you would like richer control over the request and access to the HTTP response codes, use the curl functions and in particular curl_getinfo() to look at the response codes, mime types, encoding, etc. Once you get the content via either curl or file_get_contents(), your code for parsing it to look for the HTML of interest will be the same.
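A rough sketch of the cURL version, using curl_getinfo() to check the response code and content type before parsing (treat the exact options as a starting point, not a complete implementation):
<?php
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

$pageContent = curl_exec($ch);
$httpCode    = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
curl_close($ch);

if ($pageContent !== false && $httpCode === 200 && strpos((string)$contentType, 'text/html') === 0) {
    // run the same preg_match_all() parsing as above on $pageContent
}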
Maybe Thumbshots or Snap already have some of the functionality you want?
I know that's not exactly what you are looking for, but at least for the embedded stuff that might be handy. Also, txwikinger already answered your other question, but maybe that helps you anyway.