Good morning. I want to extract all segments of PHP code from a file located on my local server. The problem is that I don't seem to be getting anywhere: there are no PHP errors, just browser errors.
$file_contents = "<xmp>".file_get_contents("../www.cms.actwebdesigns.co.uk2/pageIncludes/instalation/selectMainPages.php")."</xmp>";
if(preg_match_all("#<\?php((?!\?>).)*#is", $file_contents, $matches))
{
foreach($matches[0] as $phpCode)
{
$code = "<xmp>".$phpCode."\n?></xmp>";
}
}
echo "dsds";
?>
Could someone please point me in the right direction?
I am now working with this:
$file_contents = token_get_all(file_get_contents("../www.cms.actwebdesigns.co.uk2/logged.php"));
$start=0;
$end=0;
$segmentArray = array();
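foreach($file_contents as $token)
{
// A token is either array(id, text, line) or a single-character string
$tokenName = is_array($token) ? token_name($token[0]) : "";
if($start==0 && $end==0 && $tokenName=="T_OPEN_TAG")
{
$start=1;
// Begin a new segment for this block of PHP code
$segmentArray[] = array();
}
if($start==1 && $end==0 && $tokenName!="T_CLOSE_TAG")
{
$entryNo = count($segmentArray) - 1;
$segmentArray[$entryNo][] = $token;
}
if($tokenName=="T_CLOSE_TAG")
{
$start=0;
}
}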
You might want to tokenize the PHP script using the Tokenizer extension:
http://php.net/manual/en/book.tokenizer.php
The extension has been built into PHP since v4.3.0.
$tokens = token_get_all(file_get_contents($file));
http://www.php.net/manual/en/function.token-get-all.php
I'm not sure how to use this. It puts all the code into an array. To use it, wouldn't I have to implode it or something, and then I'm back to square one?
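One way to get from the token array back to whole code segments is to concatenate the text of each token between an opening and a closing tag. The following is only a sketch: extractPhpSegments() is a made-up helper name, and it assumes you want each block returned as a plain string.

```php
<?php
// Sketch: collect every block of PHP code in a source string as plain text.
// extractPhpSegments() is a hypothetical helper, not part of PHP itself.
function extractPhpSegments(string $source): array
{
    $segments = array();
    $current = null; // null means "currently outside PHP code"
    foreach (token_get_all($source) as $token) {
        // Tokens are array(id, text, line) or a one-character string such as ";"
        $id = is_array($token) ? $token[0] : null;
        $text = is_array($token) ? $token[1] : $token;
        if ($current === null) {
            if ($id === T_OPEN_TAG || $id === T_OPEN_TAG_WITH_ECHO) {
                $current = $text; // a new segment starts at the opening tag
            }
            continue; // inline HTML between blocks is skipped
        }
        $current .= $text;
        if ($id === T_CLOSE_TAG) {
            $segments[] = $current;
            $current = null;
        }
    }
    if ($current !== null) {
        $segments[] = $current; // the file ended without a closing tag
    }
    return $segments;
}
```

Calling extractPhpSegments(file_get_contents($file)) should then give one string per PHP block, so no regex or implode is needed.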
Related
Let's say that you have a class in php with functions and all that. How could you check if there is no code outside the class?
I tried to code this checker with PHP and did it with regex and tokens but nothing worked for me :/
An example:
<?php
class example {
var $name;
var $password;
function __construct($name, $password) {
$this->name = $name;
$this->password = $password;
}
----Allowed code----
}
----Not allowed code----
?>
EDIT: (SOLVED)
Thanks @user3163495 for all the information.
Here is what I did:
1. I tried to get the class name inside the file with these two functions:
function getClass($tokens) {
$clases = array();
for($i = 0; $i < count($tokens); $i++) {
//Type of the current token
$tokenName = getTokenName($tokens, $i);
if($tokenName === "T_CLASS") {
//Searchs the name that it has in the file.
return getClassName($tokens, $i);
}
}
return "";
}
function getClassName($tokens, $i) {
$index = $i + 1;
//Line in which is the class inside the file
$lineaClase = getTokenLine($tokens, $i);
//Line to be updated while searching
$lineaTemp = getTokenLine($tokens, $index);
//Type of token to be updated while searching
$tokenName = getTokenName($tokens, $index);
//Searchs in the parsed array for the first class token
while($index < count($tokens) &&
$lineaClase === $lineaTemp &&
($tokenName === "UNKOWN" || $tokenName === "T_WHITESPACE")) {
$index++;
$tokenName = getTokenName($tokens, $index);
$lineaTemp = getTokenLine($tokens, $index);
}
//Returns the name of the class
return getTokenContent($tokens, $index);
}
Then I injected PHP code at the end of the file that I wanted to check for containing only a class. I saved this new content to a new file called temp.php and finally shell-executed it to capture the echo of the injected code, which has the form beginning_of_class:end_of_class. This is where I used what @user3163495 told me; thank you again.
function codeInjection($clase, $contenido) {
//PHP code to inject, thanks to @user3163495
$codigoPHP = "<?php \$class = new ReflectionClass(\"{$clase}\"); \$comienzo = \$class->getStartLine(); \$fin = \$class->getEndLine(); echo \$comienzo . \":\" . \$fin; ?>";
$contenido .= $codigoPHP;
//Creating temp file
file_put_contents("temp.php", $contenido);
//Returning result of execution
return shell_exec("php temp.php");
}
Next, I removed from the parsed token array those tokens whose line was between the beginning and the end of the class. Finally, I go through the remaining array looking for anything other than a comment, whitespace, etc.
(Variable names are in Spanish; if you don't understand what some of them mean, feel free to ask.)
If you are wanting to "scan" the questionable file from another script to see if there is any code outside the class, then you could use the ReflectionClass in PHP.
Step 1: get the file name that your class is defined in
$class = new ReflectionClass("example");
$fileName = $class->getFileName();
Step 2: get the starting and ending lines of code that the class definition occupies in the file
$startLine = $class->getStartLine();
$endLine = $class->getEndLine();
$numLines = $endLine - $startLine;
Step 3: use file_get_contents() on the file name you obtained in Step 1, and see if there is any forbidden code before the start line or after the end line. You'll have to test and play around with what you get as I don't know exactly where getStartLine() and getEndLine() consider "start" and "end", respectively.
I hope you get the idea.
Some code lifted from this answer: https://stackoverflow.com/a/7909101/3163495
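To make Step 3 a bit more concrete, here is a rough sketch. linesOutsideClass() is a hypothetical helper name, and it assumes the class has already been loaded (for example via require) so that ReflectionClass can find it. It only filters out blank lines and lone PHP tags, so comments outside the class would still be reported.

```php
<?php
// Hypothetical helper: return the non-empty lines that lie outside the
// class definition, keyed by their 1-based line number.
function linesOutsideClass(string $className): array
{
    $class = new ReflectionClass($className);
    $lines = file($class->getFileName(), FILE_IGNORE_NEW_LINES);
    $start = $class->getStartLine();
    $end = $class->getEndLine();

    $outside = array();
    foreach ($lines as $i => $line) {
        $lineNo = $i + 1; // file() is 0-based, reflection line numbers are 1-based
        if ($lineNo >= $start && $lineNo <= $end) {
            continue; // inside the class definition
        }
        $trimmed = trim($line);
        // Assumption: blank lines and bare opening/closing tags are allowed
        if ($trimmed === '' || $trimmed === '<?php' || $trimmed === '?' . '>') {
            continue;
        }
        $outside[$lineNo] = $line;
    }
    return $outside;
}
```

An empty result would suggest that the file contains nothing but the class, under the assumptions above.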
The following code works with all YouTube domains except for youtu.be. An example would be: http://www.youtube.com/watch?v=ZedLgAF9aEg would turn into: ZedLgAF9aEg
My question is how would I be able to make it work with http://youtu.be/ZedLgAF9aEg.
I'm not so great with regex so your help is much appreciated. My code is:
$text = preg_replace("#[&\?].+$#", "", preg_replace("#http://(?:www\.)?youtu\.?be(?:\.com)?/(embed/|watch\?v=|\?v=|v/|e/|.+/|watch.*v=|)#i", "", $text)); }
$text = (htmlentities($text, ENT_QUOTES, 'UTF-8'));
Thanks again!
//$url = 'http://www.youtube.com/watch?v=ZedLgAF9aEg';
$url = 'http://youtu.be/ZedLgAF9aEg';
if (FALSE === strpos($url, 'youtu.be/')) {
parse_str(parse_url($url, PHP_URL_QUERY), $id);
$id = $id['v'];
} else {
$id = basename($url);
}
echo $id; // ZedLgAF9aEg
This works for both versions of the URL. Don't use a regex for this: PHP has built-in functions for parsing URLs, as demonstrated above, which are faster and more robust against breaking.
Your regex appears to solve the problem as it stands now? I didn't try it in php, but it appears to work fine in my editor.
The first part of the regex, http://(?:www\.)?youtu\.?be(?:\.com)?/ matches http://youtu.be/ and the second part, (embed/|watch\?v=|\?v=|v/|e/|.+/|watch.*v=|) ends with |) which means it can match nothing (making it optional). In other words, it would trim away http://youtu.be/ leaving only the id.
A more intuitive way of writing it would be to make the whole grouping optional, I suppose, but as far as I can tell your regex is already solving your problem:
#http://(?:www\.)?youtu\.?be(?:\.com)?/(embed/|watch\?v=|\?v=|v/|e/|.+/|watch.*v=)?#i
Note: Your regex would work with the www.youtu.be.com domain as well. It would be stripped away, but something to watch out for if you use this for validating input.
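If validation matters, one option is to check the host explicitly instead of relying on the pattern. This is only a sketch: youtubeIdFromUrl() is a hypothetical name, the list of accepted hosts is my own assumption, and it only handles the watch?v= and youtu.be forms.

```php
<?php
// Hypothetical helper: return the video ID, or null when the host is not
// one of the accepted YouTube hosts (the host list is an assumption).
function youtubeIdFromUrl(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null;
    }
    $host = strtolower($parts['host']);

    if ($host === 'youtu.be') {
        // Short links carry the ID as the path, e.g. http://youtu.be/ZedLgAF9aEg
        $id = trim($parts['path'] ?? '', '/');
        return $id !== '' ? $id : null;
    }

    if (in_array($host, array('youtube.com', 'www.youtube.com', 'm.youtube.com'), true)) {
        // Long links carry the ID in the "v" query parameter
        parse_str($parts['query'] ?? '', $query);
        if (isset($query['v']) && is_string($query['v']) && $query['v'] !== '') {
            return $query['v'];
        }
    }
    return null;
}
```

With this, a look-alike host such as www.youtu.be.com is rejected rather than silently stripped.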
Update:
If you only want to match URLs inside [youtube][/youtube] tags, you could use lookarounds.
Something along the lines of:
(?<=\[youtube\])(?:http://(?:www\.)?youtu\.?be(?:\.com)?/(?:embed/|watch\?v=|\?v=|v/|e/|[^\[]+/|watch.*v=)?)(?=.+\[/youtube\])
You could further refine it by making the .+ in the look ahead only match valid URL characters etc.
Try this, hope it'll help you
function YouTubeUrl($url)
{
if($url!='')
{
$newUrl='';
$videoLink1=$url;
$findKeyWord='youtu.be';
$toBeReplaced='www.youtube.com';
if(IsContain($videoLink1,'watch?v='))
{
$newUrl=tMakeUrl($videoLink1);
}
else if(IsContain($videoLink1, $findKeyWord))
{
$videoLinkArray=explode('/',$videoLink1);
$Protocol='';
if(IsContain($videoLink1,'://'))
{
$protocolArray=explode('://',$videoLink1);
$Protocol=$protocolArray[0];
}
$file=$videoLinkArray[count($videoLinkArray)-1];
$newUrl='www.youtube.com/watch?v='.$file;
if($Protocol!='')
$newUrl=$Protocol.'://'.$newUrl;
else
$newUrl=tMakeUrl($newUrl);
}
else
$newUrl=tMakeUrl($videoLink1);
return $newUrl;
}
return '';
}
function IsContain($string,$findKeyWord)
{
if(strpos($string,$findKeyWord)!==false)
return true;
else
return false;
}
function tMakeUrl($url)
{
$tSeven=substr($url,0,7);
$tEight=substr($url,0,8);
if($tSeven!="http://" && $tEight!="https://")
{
$url="http://".$url;
}
return $url;
}
You can use the function below for any YouTube video ID.
I hope this will help you.
function checkYoutubeId($id)
{
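// The oEmbed endpoint expects a full, URL-encoded video URL in its "url" parameter
$youtube = "https://www.youtube.com/oembed?url=". urlencode("https://www.youtube.com/watch?v=". $id) ."&format=json";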
$curl = curl_init($youtube);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$return = curl_exec($curl);
curl_close($curl);
return json_decode($return, true);
}
This function returns the YouTube video details if the ID matches a YouTube video.
A small improvement to @rvalvik's answer would be to cover mobile links (I noticed this while working with a customer who used an iPad to browse and copy and paste links). In this case, the host starts with m (mobile) instead of www. The regex then becomes:
#(https?://)?(?:www\.)?(?:m\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?.*?v=))([\w\-]{10,12}).*#x
Hope it helps.
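For what it's worth, here is how a pattern along these lines could be used with preg_match(). It is a sketch, not a drop-in replacement: matchYoutubeId() is a made-up name, the pattern is my own variant of the one above (anchored at the start and using ~ as the delimiter), and it keeps the assumption that an ID is 10 to 12 word characters or hyphens.

```php
<?php
// Variant of the pattern above; the ID length of 10 to 12 is an assumption
// carried over from that pattern, not a documented YouTube rule.
$pattern = '~^(?:https?://)?(?:www\.|m\.)?'
         . '(?:youtu\.be/|youtube\.com/(?:embed/|v/|watch\?(?:.*&)?v=))'
         . '([\w-]{10,12})~i';

// Hypothetical helper: return the captured ID, or null when nothing matches.
function matchYoutubeId(string $url, string $pattern): ?string
{
    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}
```

The first capture group holds the ID, so matchYoutubeId('http://m.youtube.com/watch?v=ZedLgAF9aEg', $pattern) should hand back just the ID part.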
A slight improvement of another answer:
// strpos() returns an offset or FALSE, never TRUE, so compare against FALSE
if (strpos($url, 'feature=youtu.be') !== FALSE || strpos($url, 'youtu.be') === FALSE )
{
parse_str(parse_url($url, PHP_URL_QUERY), $id);
$id = $id['v'];
}
else
{
$id = basename($url);
}
This handles URLs where youtu.be appears only in the query string, as in the referring feature=youtu.be parameter (it does happen!), rather than as the host.
Other answers miss the point that some YouTube links are part of a playlist and also have a list parameter, which is required for the embed code. So to extract the embed code from such a link, one could try this JS code:
let urlEmbed = "https://www.youtube.com/watch?v=iGGolqb6gDE&list=PL2q4fbVm1Ik6DCzm9XZJbNwyHtHGclcEh&index=32"
let embedId = urlEmbed.split('v=')[1];
let parameterStringList = embedId.split('&');
if (parameterStringList.length > 1) {
embedId = parameterStringList[0];
let listString = parameterStringList.filter((parameterString) =>
parameterString.includes('list')
);
if (listString.length > 0) {
listString = listString[0].split('=')[1];
embedId = `${parameterStringList[0]}?list=${listString}`;
}
}
console.log(embedId)
Try it out here: https://jsfiddle.net/AMITKESARI2000/o62dwj7q/
try this :
$string = explode("=","http://www.youtube.com/watch?v=ZedLgAF9aEg");
echo $string[1];
would turn into: ZedLgAF9aEg
Hi everyone once again!
We need some help to develop and implement multi-curl functionality in our crawler. We have a huge array of "links to be scanned" and we loop through them with a foreach.
Let's use some pseudo code to understand the logic:
1) While ($links_to_be_scanned > 0).
2) Foreach ($links_to_be_scanned as $link_to_be_scanned).
3) Scan_the_link() and run some other functions.
4) Extract the new links from the xdom.
5) Push the new links into $links_to_be_scanned.
6) Push the current link into $links_already_scanned.
7) Remove the current link from $links_to_be_scanned.
Now, we need to define a maximum number of parallel connections and be able to run this process for each link in parallel.
I understand that we're gonna have to create a $links_being_scanned or some kind of queue.
I'm really not sure how to approach this problem to be honest, if anyone could provide some snippet or idea to solve it, it would be greatly appreciated.
Thanks in advance!
Chris;
Extended:
I just realized that the tricky part is not the multi-curl itself, but the number of operations performed on each link after the request.
Even with multi-curl, I would eventually have to find a way to run all these operations in parallel. The whole algorithm described below would have to run in parallel.
So now rethinking, we would have to do something like this:
While (There's links to be scanned)
Foreach ($Link_to_scann as $link)
If (There's less than 10 scanners running)
Launch_a_new_scanner($link)
Remove the link from $links_to_be_scanned array
Push the link into $links_on_queue array
Endif;
And each scanner does (This should be run in parallel):
Create an object with the given link
Send a curl request to the given link
Create a dom and an Xdom with the response body
Perform other operations over the response body
Remove the link from the $links_on_queue array
Push the link into the $links_already_scanned array
I assume we could approach this by creating a new PHP file with the scanner algorithm and using pcntl_fork() for each parallel process?
Even with multi-curl, I would still end up waiting in a regular foreach loop for the other operations.
I assume I would have to approach this using fsockopen or pcntl_fork.
Suggestions, comments, partial solutions, and even a "good luck" will be more than appreciated!
Thanks a lot!
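The bookkeeping described above (links to be scanned, links on the queue, links already scanned, and a cap on parallel scanners) can be sketched independently of how the requests are actually made. LinkQueue and its method names are hypothetical; whatever launches the scanners (multi-curl, forked processes, etc.) would call next() to get work and done() when a scan finishes.

```php
<?php
// Hypothetical LinkQueue: tracks pending, in-flight and already-seen links
// and never hands out more than $maxParallel links at once.
class LinkQueue
{
    private $pending = array();  // links waiting to be scanned
    private $inFlight = array(); // links currently being scanned (link => true)
    private $seen = array();     // every link ever queued (link => true)
    private $maxParallel;

    public function __construct(int $maxParallel)
    {
        $this->maxParallel = $maxParallel;
    }

    // Queue new links, ignoring any that were queued or scanned before.
    public function push(array $links): void
    {
        foreach ($links as $link) {
            if (!isset($this->seen[$link])) {
                $this->seen[$link] = true;
                $this->pending[] = $link;
            }
        }
    }

    // Return the next link, or null if the limit is reached or nothing is pending.
    public function next(): ?string
    {
        if (count($this->inFlight) >= $this->maxParallel || !$this->pending) {
            return null;
        }
        $link = array_shift($this->pending);
        $this->inFlight[$link] = true;
        return $link;
    }

    // Mark a link as finished so that another one can be started.
    public function done(string $link): void
    {
        unset($this->inFlight[$link]);
    }

    public function isFinished(): bool
    {
        return !$this->pending && !$this->inFlight;
    }
}
```

The main loop would keep calling next() until it returns null, wait for any scanner to finish, push the links it found, call done(), and repeat until isFinished() is true.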
DISCLAIMER: This answer links an open-source project with which I'm involved. There. You've been warned.
The Artax HTTP client is a socket-based HTTP library that (among other things) offers custom control over the number of concurrent open socket connections to individual hosts while making multiple asynchronous HTTP requests.
Limiting the number of concurrent connections is easily accomplished. Consider:
<?php
use Artax\Client, Artax\Response;
require dirname(__DIR__) . '/autoload.php';
$client = new Client;
// Defaults to max of 8 concurrent connections per host
$client->setOption('maxConnectionsPerHost', 2);
$requests = array(
'so-home' => 'http://stackoverflow.com',
'so-php' => 'http://stackoverflow.com/questions/tagged/php',
'so-python' => 'http://stackoverflow.com/questions/tagged/python',
'so-http' => 'http://stackoverflow.com/questions/tagged/http',
'so-html' => 'http://stackoverflow.com/questions/tagged/html',
'so-css' => 'http://stackoverflow.com/questions/tagged/css',
'so-js' => 'http://stackoverflow.com/questions/tagged/javascript'
);
$onResponse = function($requestKey, Response $r) {
echo $requestKey, ' :: ', $r->getStatus();
};
$onError = function($requestKey, Exception $e) {
echo $requestKey, ' :: ', $e->getMessage();
};
$client->requestMulti($requests, $onResponse, $onError);
IMPORTANT: In the above example the Client::requestMulti method makes all the specified requests asynchronously. Because the per-host concurrency limit is set to 2, the client will open new connections for the first two requests and subsequently reuse those same sockets for the other requests, queuing requests until one of the two sockets becomes available.
You could try something like this. I haven't checked it, but you should get the idea:
$request_pool = array();
function CreateHandle($url) {
$handle = curl_init($url);
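// needed so that curl_multi_getcontent() returns the body; set other curl options here
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);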
return $handle;
}
function Process($data) {
global $request_pool;
// do something with data, e.g. extract new links from it
// $some_new_url is a placeholder for each new link found in $data
array_push($request_pool , CreateHandle($some_new_url));
}
function RunMulti() {
global $request_pool;
$multi_handle = curl_multi_init();
$active_request_pool = array();
$running = 0;
$active_request_count = 0;
$active_request_max = 10; // adjust as necessary
do {
// top up the active set from the waiting pool
while(($active_request_count < $active_request_max) && (count($request_pool) > 0)) {
$request = array_shift($request_pool);
curl_multi_add_handle($multi_handle , $request);
$active_request_count++;
}
// drive the transfers, then wait for activity on any handle
curl_multi_exec($multi_handle , $running);
curl_multi_select($multi_handle);
while($info = curl_multi_info_read($multi_handle)) {
$curl_handle = $info['handle'];
// Process() may push new handles onto $request_pool
call_user_func('Process' , curl_multi_getcontent($curl_handle));
curl_multi_remove_handle($multi_handle , $curl_handle);
curl_close($curl_handle);
$active_request_count--;
}
// re-count the pool here, since Process() may have added requests
} while($active_request_count > 0 || count($request_pool) > 0);
curl_multi_close($multi_handle);
}
You should look for a more robust solution to your problem. RabbitMQ is a very good one that I have used. There is also Gearman, but the choice is yours.
I prefer RabbitMQ.
I will share the code I used to collect email addresses from a certain website.
You can modify it to fit your needs.
There were some problems with relative URLs in it.
Note that I do not use cURL here.
<?php
error_reporting(E_ALL);
$home = 'http://kharkov-reklama.com.ua/jborudovanie/';
$writer = new RWriter('C:\parser_13-09-2012_05.txt');
set_time_limit(0);
ini_set('memory_limit', '512M');
function scan_page($home, $full_url, &$writer) {
static $done = array();
$done[] = $full_url;
// Scan only internal links. Do not scan all the internet!))
if (strpos($full_url, $home) === false) {
return false;
}
$html = @file_get_contents($full_url);
if (empty($html) || (strpos($html, '<body') === false && strpos($html, '<BODY') === false)) {
return false;
}
echo $full_url . '<br />';
preg_match_all('/([A-Za-z0-9_\-]+\.)*[A-Za-z0-9_\-]+@([A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]\.)+[A-Za-z]{2,4}/', $html, $emails);
if (!empty($emails) && is_array($emails)) {
foreach ($emails as $email_group) {
if (is_array($email_group)) {
foreach ($email_group as $email) {
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
$writer->write($email);
}
}
}
}
}
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
preg_match_all("/$regexp/siU", $html, $matches, PREG_SET_ORDER);
if (is_array($matches)) {
foreach($matches as $match) {
if (!empty($match[2]) && is_scalar($match[2])) {
$url = $match[2];
if (!filter_var($url, FILTER_VALIDATE_URL)) {
$url = $home . $url;
}
if (!in_array($url, $done)) {
scan_page($home, $url, $writer);
}
}
}
}
}
class RWriter {
private $_fh = null;
private $_written = array();
public function __construct($fname) {
$this->_fh = fopen($fname, 'w+');
}
public function write($line) {
if (in_array($line, $this->_written)) {
return;
}
$this->_written[] = $line;
echo $line . '<br />';
fwrite($this->_fh, "{$line}\r\n");
}
public function __destruct() {
fclose($this->_fh);
}
}
scan_page($home, 'http://kharkov-reklama.com.ua/jborudovanie/', $writer);
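As for the relative URL problem mentioned above: the script simply prepends $home to anything that is not a valid absolute URL. A slightly more careful approach would resolve each href against the page it was found on. The following is a rough sketch under stated simplifications: resolveUrl() is a hypothetical helper, the base URL is assumed to be a well-formed absolute http(s) URL, and "../" or "./" segments are not normalised.

```php
<?php
// Hypothetical helper: resolve $href against the URL of the page ($base).
// Simplification: "../" and "./" segments are left as they are.
function resolveUrl(string $base, string $href): string
{
    if (parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href; // already absolute (or unparsable): leave untouched
    }
    $b = parse_url($base);
    $origin = $b['scheme'] . '://' . $b['host']
            . (isset($b['port']) ? ':' . $b['port'] : '');

    if (substr($href, 0, 2) === '//') {
        return $b['scheme'] . ':' . $href; // protocol-relative link
    }
    if (substr($href, 0, 1) === '/') {
        return $origin . $href; // root-relative link
    }
    // Path-relative link: keep the directory part of the base path
    $path = $b['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}
```

In scan_page(), the line $url = $home . $url; could then become something like $url = resolveUrl($full_url, $url);.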
I have this very simple PHP script:
<?php
require 'functions.php';
$token = "some-number";
$id = "other-number";
$albums = get_url_contents("https://graph.facebook.com/".$id."/albums?access_token=".$token);
$aObject = json_decode($albums, true);
foreach ($aObject['data'] as $i => $a) {
$photos = get_url_contents("https://graph.facebook.com/".$a['id']."/photos?access_token=".$token);
$bObject = json_decode($photos, true);
foreach ($bObject['data'] as $y => $b) {
if (strpos($b['name'],"#test1") !== false) {
echo($b['name']."<br>".$b['source']."<br>".$b['created_time']."<br>");
}
}
}
?>
The execution time is always more than 10 seconds. Is there any way to notify the user with a progress message or something similar?
ok I learned something new.
it is possible. Look there: http://bytes.com/topic/php/answers/5153-status-note-scripts-run-long-time
You could display a message at the beginning of the script execution and hide it via JavaScript at the end of the execution.
I hope you understand what I mean
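Roughly, the idea could look like this. It is only a sketch: runWithNotice() and the "loading" element ID are made-up names, and whether the message really reaches the browser before the slow part finishes depends on output buffering and compression settings on the server and any proxy in between.

```php
<?php
// Hypothetical helper: show a notice, run the slow job, then hide the notice.
function runWithNotice(callable $slowJob): void
{
    echo '<div id="loading">Loading photos, please wait...</div>';

    // Push the notice out before the slow work starts.
    // Server or proxy buffering may still delay it.
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();

    $slowJob(); // e.g. the Graph API loops from the question

    // Hide the notice once all the real output has been sent
    echo '<script>document.getElementById("loading").style.display = "none";</script>';
}
```

The two foreach loops from the question would go inside the function passed to runWithNotice().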
How about an asynchronous load with JS or jQuery?
I was following this tutorial.
I need to use a PHP file's output in my HTML file to dynamically load images into a gallery. I call
function setOutput()
{
if (httpObject.readyState == 4)
document.getElementById('main').src = httpObject.responseText;
alert("set output: " + httpObject.responseText);
}
from
function doWork()
{
httpObject = getHTTPObject();
if (httpObject != null) {
httpObject.open("GET", "gallery.php?no=0", true);
httpObject.send(null);
httpObject.onreadystatechange = setOutput;
}
}
However, the alert returns the php file, word for word. It's probably a really stupid error, but I can't seem to find it.
The php file:
<?php
if (isset($_GET['no'])) {
$no = $_GET['no'];
if ($no <= 10 && $no >1) {
$xml = simplexml_load_file('gallery.xml');
echo "images/" . $xml->image[$no]->src;
}
else die("Number isn't between 1 and 10");
}
else die("No number set.");
?>
If the alert is returning the contents of the PHP file instead of the results of executing it, then the server is not executing it.
Test by accessing the URI directly (instead of going via JavaScript).
You probably need to configure PHP support on the server.
Your server doesn't serve/parse PHP files! You could test your JavaScript code by setting the content of gallery.php to the output you expect to receive.