I am working on a PHP script that pulls quest data from Wowhead: what starts and ends each quest, whether that is an item or an NPC, and its ID or name, respectively. This is the relevant portion of the script; the rest handles database insertion. This is the completed snippet I came up with, in case anyone is interested. Also, seeing as this will run about 15,000 times, is this the best method of obtaining and storing the data?
<?php
$quests = array();
//$questlimit = 14987;
$questlimit = 5;
$currentquest = 1;
$questsprocessed = 0;
while($questsprocessed != $questlimit)
{
echo "<br>";
echo " Start of iteration: ".$questsprocessed." ";
echo "<br>";
echo " Attempting to process quest: ".$currentquest." ";
echo "<br>";
$quests[$currentquest] = array();
$baseurl = 'http://wowhead.com/quest=';
$fullurl = $baseurl.$currentquest;
$data = drupal_http_request($fullurl);
$queststartloc1 = strpos($data->data, 'quest_start');
$queststartloc2 = strpos($data->data, 'quest_end');
if($queststartloc1 === false) // strict check: strpos() may legitimately return 0
{$currentquest++; echo "No data for this quest"; echo "<br>"; continue;}
$questendloc1 = strpos($data->data, 'quest_end');
$questendloc2 = strpos($data->data, 'x5DDifficulty');
$startcaptureLength = $queststartloc2 - $queststartloc1;
$endcaptureLength = $questendloc2 - $questendloc1;
$quest_start_raw = substr($data->data,$queststartloc1, $startcaptureLength);
$quest_end_raw = substr($data->data, $questendloc1, $endcaptureLength);
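// The /e modifier was removed in PHP 7; preg_replace_callback() does the same job.
$startDecoded = preg_replace_callback('~\\\\x([A-Fa-f0-9]{2})~', function ($m) { return chr(hexdec($m[1])); }, $quest_start_raw);
$endDecoded = preg_replace_callback('~\\\\x([A-Fa-f0-9]{2})~', function ($m) { return chr(hexdec($m[1])); }, $quest_end_raw);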
$quests[$currentquest]['Start'] = array();
$quests[$currentquest]['End'] = array();
if(strstr($startDecoded, 'npc'))
{
$quests[$currentquest]['Start']['Type'] = "npc";
preg_match('~npc=(\d+)~', $startDecoded, $startmatch);
}
else
{
$quests[$currentquest]['Start']['Type'] = "item";
preg_match('~item=(\d+)~', $startDecoded, $startmatch);
}
$quests[$currentquest]['Start']['ID'] = $startmatch[1];
if(strstr($endDecoded, 'npc'))
{
$quests[$currentquest]['End']['Type'] = "npc";
preg_match('~npc=(\d+)~', $endDecoded, $endmatch);
}
else
{
$quests[$currentquest]['End']['Type'] = "item";
preg_match('~item=(\d+)~', $endDecoded, $endmatch);
}
$quests[$currentquest]['End']['ID'] = $endmatch[1];
//var_dump($quests[$currentquest]);
echo " End of iteration: ".$questsprocessed." ";
echo "<br>";
echo " Processed quest: ".$currentquest." ";
echo "<br>";
$currentquest++;
$questsprocessed++;
}
?>
These are called "escape sequences". Normally they are used to represent characters that are not otherwise printable, but they can encode any character. In PHP, you can decode them like this:
$text = '
quest_start\\x5DStart\\x3A\\x20\\x5Bitem\\x3D16305\\x5D\\x5B\\x2Ficon\\x5D\\x5B\\x2Fli\\x5D\\x5Bli\\x5D\\x5Bicon\\x20name\\x3Dquest_end\\x5DEnd\\x3A\\x20\\x5Burl\\x3D\\x2Fnpc\\x3D12696\\x5DSenani\\x20Thunderheart\\x5B\\x2Furl\\x5D\\x5B\\x2Ficon\\x5D\\x5B\\x2Fli\\x5D\\x5Bli\\x5DNot\\x20sharable\\x5B\\x2Fli\\x5D\\x5Bli
';
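// The /e modifier was removed in PHP 7; preg_replace_callback() does the same job.
$decoded = preg_replace_callback('~\\\\x([A-Fa-f0-9]{2})~', function ($m) { return chr(hexdec($m[1])); }, $text);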
Which gives you a string similar to this:
quest_start]Start: [item=16305][/icon][/li][li][icon name=quest_end]End: [url=/npc=12696]Senani Thunderheart[/url][/icon][/li][li]Not sharable[/li][li
(obviously some kind of BBCode). To remove all the BBCode tags, one more replacement is necessary:
$clean = preg_replace('~(\[.+?\])+~', ' ', $decoded);
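Putting the decoding step together with the ID extraction the question asks about, here is a minimal sketch that should work on current PHP versions. The function names `decodeHexEscapes` and `extractEndpoint` are hypothetical helpers made up for this illustration, and the sample string is a shortened version of the one above.

```php
<?php
// Decode \xNN escape sequences into the characters they stand for.
function decodeHexEscapes($text)
{
    return preg_replace_callback(
        '~\\\\x([A-Fa-f0-9]{2})~',
        function ($m) {
            return chr(hexdec($m[1]));
        },
        $text
    );
}

// Hypothetical helper: find the first npc=/item= reference between two markers.
// Returns array('Type' => ..., 'ID' => ...) or null when nothing is found.
function extractEndpoint($decoded, $fromMarker, $toMarker = null)
{
    $from = strpos($decoded, $fromMarker);
    if ($from === false) {
        return null;
    }
    $to = ($toMarker === null) ? false : strpos($decoded, $toMarker, $from);
    $section = ($to === false)
        ? substr($decoded, $from)
        : substr($decoded, $from, $to - $from);

    if (preg_match('~(npc|item)=(\d+)~', $section, $m)) {
        return array('Type' => $m[1], 'ID' => (int) $m[2]);
    }
    return null;
}

// Shortened sample taken from the text above.
$raw = 'quest_start\\x5DStart\\x3A\\x20\\x5Bitem\\x3D16305\\x5D'
     . '\\x5Bicon\\x20name\\x3Dquest_end\\x5DEnd\\x3A\\x20\\x5Burl\\x3D\\x2Fnpc\\x3D12696\\x5D';

$decoded = decodeHexEscapes($raw);
$start = extractEndpoint($decoded, 'quest_start', 'quest_end');
$end = extractEndpoint($decoded, 'quest_end');
```

Limiting the start section to the text before `quest_end` avoids picking up the quest-ender's ID when the start section has no `npc=` or `item=` reference of its own.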
<?php
$id = array(
"UC4MubF2asPHQbXN44DVNeXg",
"UCXuDgoo_oiZf8UkIXs3Y_kw",
"UCMnDuOzzJrWzr5tfemDcqlQ",
"UC9FH1mkHLFQuPPEPu9CfR1A",
"UCfyEAw41i7PRetP2erYf9dg",
);
// Getting Emails from DB and Looping with Channel Title and Video Title
$conn = mysqli_connect("localhost","username","password","dbname");
$query = "SELECT * FROM prospects ORDER BY email";
$rows = mysqli_query($conn, $query);
while($row = mysqli_fetch_array($rows)) {
$to[] = $row['email'];
}
$size = sizeof($id);
//echo "<br>Number of channel Ids: ".$size."<br><br>";
// Looping each Url for Each Email to get Data
foreach($to as $t){
echo $t."<br>";
for ($i = 0; $i < $size; $i++) {
$url = "https://www.youtube.com/channel/$id[$i]";
// echo $url."<br>";
$channel = trim(explode('https://www.youtube.com/channel/', $url)[1]);
// echo $channel."<br>";
$rss = "https://www.youtube.com/feeds/videos.xml?channel_id=$channel";
// echo $rss."<br>";
$xml = simplexml_load_file($rss);
$title = $xml->title;
// echo $title."<br>";
$videoTitle = $xml->entry[0]->title;
// echo $videoTitle."<br>";
$id[$i] = $xml->id;
// echo $id[$i]."<br>";
// $idOnly = substr($id[$i] , strpos($id[$i] , "yt:channel:") + 11);
// echo $idOnly."<br>";
$pub[$i] = $xml->entry[0]->published;
// echo $pub[$i]."<br>";
$realDate_ = new DateTime($pub[$i]);
// var_dump($realDate_);
$realDate2_ = $realDate_->format("D, d M Y")."<br>";
//echo $realDate2_;
$today_ = new DateTime();
// $today2_ = $today_->format("D, d M Y")."<br>";
// echo $today2_;
// if ($today2_ == $realDate2_) {
// echo "true";
// This is the Result with Every Email
echo $content = "• <a href='https://bladingflix.com/render.php?email=$t&channel=$idOnly'>".$title." - ".$videoTitle."</a><br>";
// }
}
}
?>
For the first email the data shows properly, but not for the others.
Purpose: take the collected links, add the user's email to each URL, and repeat for each email.
Please make the following changes to your existing code.
In the inner for() loop you are overwriting the $id array, as shown below:
$id[$i] = $xml->id;
Because of this, the channel IDs are gone by the time the loop runs for the second email. Instead of replacing the $id[$i] value, create a new variable:
$xmlId = $xml->id;
$idOnly = substr($xmlId, strpos($xmlId , "yt:channel:") + 11);
Also uncomment the $idOnly variable.
It will then work. You should also try to refactor this code.
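As a rough illustration of that fix, the sketch below keeps the list of channel IDs read-only and derives `$idOnly` from a separate variable. The helper `channelIdFromFeedId`, the example email addresses, and the stand-in feed ID string are assumptions made for this example; in the real script the value would come from `$xml->id`.

```php
<?php
// Hypothetical helper: strip the "yt:channel:" prefix from a feed <id> value.
function channelIdFromFeedId($feedId)
{
    $feedId = (string) $feedId; // SimpleXMLElement values need a string cast
    $prefix = 'yt:channel:';
    $pos = strpos($feedId, $prefix);
    return ($pos === false) ? null : substr($feedId, $pos + strlen($prefix));
}

$channelIds = array('UC4MubF2asPHQbXN44DVNeXg', 'UCXuDgoo_oiZf8UkIXs3Y_kw');
$emails = array('first@example.com', 'second@example.com'); // placeholder data

$links = array();
foreach ($emails as $email) {
    foreach ($channelIds as $channelId) {
        // Stand-in for $xml->id from the feed; $channelIds itself is never modified.
        $xmlId = 'yt:channel:' . $channelId;
        $idOnly = channelIdFromFeedId($xmlId);
        $links[] = 'render.php?email=' . urlencode($email) . '&channel=' . urlencode($idOnly);
    }
}
```

Since the feed data does not depend on the email address, one possible refactoring is to load each feed once before the email loop and reuse the results.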
I have some code that uses ODBC calls. I had no problem pulling data from an MS SQL database until this bit of code. Instead of saving the string to a variable, it outputs the string and saves the integer 1 to the variable.
Below is the relevant code:
$qry = 'SELECT * FROM cst_AdEssayDetails_vw WHERE AdEnrollSchedID = ?';
$essay = odbc_prepare($conn,$qry);
if(odbc_execute($essay,array($AdEnrollSchedID))){
if (odbc_num_rows($essay)>0){
while (odbc_fetch_row($essay)){
$setTopic = odbc_result($essay,'setTopic');
$essay_active = odbc_result($essay,'active');
$essay_date = date("m-d-y", strtotime(odbc_result($essay,'StartDate')));
$essayID = odbc_result($essay,'EssayID');
$EnrollSchedID = odbc_result($essay,'AdEnrollSchedID');
$intro = odbc_result($essay,'intro');
$support = odbc_result($essay,'support');
$conclusion = odbc_result($essay,'conclusion');
$essay_comment = odbc_result($essay,'comment');
$submitted = odbc_result($essay,'submitted');
$dateSubmitted = date("m-d-y", strtotime(odbc_result($essay,'dateSubmitted')));
$essay_grade = odbc_result($essay,'grade');
$AdStudentEssayID = odbc_result($essay,'AdStudentEssayID');
$topic = odbc_result($essay,'topic');
echo '<intro1>'.$intro.'</intro1>';
echo '<intro_len>'.strlen($intro).'</intro_len>';
if (1 > strlen($essay_comment)){
$essay_comment = ' ';
} else {
$essay_comment = urlencode($essay_comment);
}
if (1 > strlen($essay_grade)) {
$essay_grade = ' ';
}
if (2 < strlen($topic)){
$topic = urlencode($topic);
} else {
$topic = ' ';
}
if (2 < strlen($intro)){
$intro = urlencode($intro);
} else {
$intro = ' ';
}
if (2 < strlen($support)){
$support = urlencode($support);
} else {
$support = ' ';
}
if (2 < strlen($conclusion)){
$conclusion = urlencode($conclusion);
} else {
$conclusion = ' ';
}
echo '<setTopic>'.$setTopic.'</setTopic>';
echo '<essay_active>'.$essay_active.'</essay_active>';
echo '<essay_date>'.$essay_date.'</essay_date>';
echo '<essayID>'.$essayID.'</essayID>';
echo '<topic>'.$topic.'</topic>';
echo '<intro>'.$intro.'</intro>';
echo '<support>'.$support.'</support>';
echo '<conclusion>'.$conclusion.'</conclusion>';
echo '<essay_comment>'.$essay_comment.'</essay_comment>';
echo '<submitted>'.$submitted.'</submitted>';
echo '<dateSubmitted>'.$dateSubmitted.'</dateSubmitted>';
echo '<essay_grade>'.$essay_grade.'</essay_grade>';
echo '<AdStudentEssayID>'.$AdStudentEssayID.'</AdStudentEssayID>';
}
} else {
The output I get from this is:
'Test Introduction Paragraph
Test Supporting Paragraph
2nd test supporting paragraph
Test Conclusion Paragraph
Test Essay Topic
<intro1>1</intro1>
<intro_len>1</intro_len>
<setTopic>%3Cp%3ETest%20Topic%3C%2Fp%3E</setTopic>
<essay_active>1</essay_active>
<essay_date>01-01-70</essay_date>
<essayID>29</essayID>
<topic></topic>
<intro></intro>
<support></support>
<conclusion></conclusion>
<essay_comment>1</essay_comment>
<submitted>1</submitted>
<dateSubmitted>07-11-16</dateSubmitted>
<essay_grade></essay_grade>
<AdStudentEssayID>59</AdStudentEssayID>'
The output I expect is:
'<intro1>Test Introduction Paragraph</intro1>
<intro_len>1</intro_len>
<setTopic>%3Cp%3ETest%20Topic%3C%2Fp%3E</setTopic>
<essay_active>1</essay_active>
<essay_date>01-01-70</essay_date>
<essayID>29</essayID>
<topic>Test Essay Topic</topic>
<intro>Test Introduction Paragraph</intro>
<support>Test Supporting Paragraph 2nd test supporting paragraph</support>
<conclusion>Test Conclusion Paragraph</conclusion>
<essay_comment>1</essay_comment>
<submitted>1</submitted>
<dateSubmitted>07-11-16</dateSubmitted>
<essay_grade></essay_grade>
<AdStudentEssayID>59</AdStudentEssayID>'
So the question is: why is it doing this? Why does it output 'intro', 'support', 'conclusion', and 'topic' instead of saving them to the variables?
I'm currently facing a problem here, I have many different strings like these:
"he#00ff00llo"
"#cc9200test"
And so on.
I'm outputting this on an HTML page (the strings come from my database).
What I want to achieve is to read codes such as #00ff00 and render the text that follows in that color.
EDIT:
These are usernames, and most of the time they look like this: #ff0000#SomeName, which I then want to turn into <span style="color: #ff0000">#SomeName</span>. There are also usernames with several color codes.
EDIT:
My friend gave me the code below, which solved my problem in PHP. :)
The code posted by Oriol below also works. :)
This code works (PHP):
<?php
function colorCodesRenderProperly($name)
{
$name = htmlspecialchars($name);
if(preg_match('/^(#[0-9a-fA-F]{6})+$/', $name) === 1)
{
return $name;
}
preg_match_all('/#[0-9a-fA-F]{6}/', $name, $codes);
$replaced = array();
$codes_original = $codes;
$i = 0;
$count = 1;
foreach($codes[0] as &$code)
{
if(in_array($codes_original[0][$i], $replaced))
{
continue;
}
$code = sprintf('%02s', dechex((hexdec($code[1].$code[2])/255*128)))
.sprintf('%02s', dechex((hexdec($code[3].$code[4])/255*128)))
.sprintf('%02s', dechex((hexdec($code[5].$code[6])/255*128)));
$name = str_replace($codes_original[0][$i], "<span style=\"color: #$code;\">", $name, $count);
$replaced[] = $codes_original[0][$i];
$i++;
$count = 1;
}
while($i > 0)
{
$name .= "</span>";
$i--;
}
return $name;
}
?>
Try this:
var txt = document.createTextNode("he#00ff00llo"),
wrapper = document.createElement('span'),
regExp = /#[\da-f]{6}/i,
pos;
wrapper.appendChild(txt);
while(~(pos = txt.nodeValue.search(regExp))) {
txt = txt.splitText(pos);
var span = wrapper.cloneNode(false);
span.style.color = txt.nodeValue.substr(0,7);
txt.nodeValue = txt.nodeValue.substr(7);
span.appendChild(txt);
wrapper.appendChild(span);
}
// append wrapper to the DOM
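For comparison, here is a shorter server-side sketch of the transformation described in the question. The function name `colorize` is hypothetical, and unlike the accepted PHP code above it uses the colors as-is rather than darkening them; it also assumes the text after a code never begins with six hex digits that belong to the name.

```php
<?php
// Hypothetical helper: wrap the text following each #rrggbb code in a colored span.
function colorize($name)
{
    // Split on color codes but keep them in the result:
    // even indexes hold text, odd indexes hold the codes.
    $parts = preg_split('/(#[0-9a-fA-F]{6})/', $name, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = '';
    $color = null;
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            $color = $part; // remember the most recent color code
            continue;
        }
        if ($part === '') {
            continue; // nothing to print between two adjacent codes
        }
        $text = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        $html .= ($color === null)
            ? $text
            : '<span style="color: ' . $color . '">' . $text . '</span>';
    }
    return $html;
}

echo colorize('#ff0000#SomeName'), "\n";
echo colorize('he#00ff00llo'), "\n";
```

Each piece of text gets its own span, so no closing tags have to be counted at the end.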
I'm using a PHP script that is executed by cron. I have set the cron time and command in cPanel as well.
But whenever the cron job runs, I receive a mail like this:
/home/letsview/public_html/getfeed.php: line 1: ?php: No such file or directory
/home/letsview/public_html/getfeed.php: line 3: syntax error near unexpected token `'/home/letsview/public_html/wp-config.php''
/home/letsview/public_html/getfeed.php: line 3: `include_once('/home/letsview/public_html/wp-config.php');'
I have set this command in cPanel: "/home/letsview/public_html/getfeed.php"
I have also tried the suggestion from "PHP: Error trying to run a script via Cron job" and added /usr/local/lib/php/ at the top of the file, but it is still not working.
Here is the code of the cron file getfeed.php:
<?php
#!/usr/local/lib/php/
include_once('/home/letsview/public_html/wp-config.php');
include_once('/home/letsview/public_html/wp-includes/wp-db.php');
include_once('/home/letsview/public_html/wp-admin/includes/file.php');
include_once('/home/letsview/public_html/wp-admin/includes/image.php');
include_once('/home/letsview/public_html/wp-admin/includes/media.php');
global $wpdb;
//property_type
$xml = simplexml_load_file("/home/letsview/public_html/letsviewproperties.xml",'SimpleXMLElement', LIBXML_NOCDATA);
$TotalPostadded = 0;
$TotalUseradded = 0;
foreach($xml as $child)
{
//Insert Post
$postdata = array();
$postdata['post_title'] = trim($child->title);
$postdata['post_content'] = trim($child->content);
//$postdata['guid'] = trim($child->url);
$postdata['post_status'] = 'publish';
$postdata['post_type'] = 'post';
$postdata['post_date'] = date('Y-m-d H:i:s');
//Insert Post Meta
$postmetadata = array();
$addresstext = trim($child->FullAddress->address1);
if($addresstext != ''){
$addresstext .= ', ';
}
$addresstext .= trim($child->FullAddress->address2);
if($addresstext != ''){
$addresstext .= ', ';
}
$addresstext .= trim($child->FullAddress->address3);
if($addresstext != ''){
$addresstext .= ', ';
}
$addresstext .= trim($child->FullAddress->address4);
$postmetadata['price'] = trim($child->price);
$postmetadata['property_type'] = trim($child->type);
$postmetadata['bed_rooms'] = trim($child->rooms);
$postmetadata['bath_rooms'] = trim($child->bathrooms);
$postmetadata['address'] = $addresstext;
$postmetadata['add_city'] = trim($child->city);
$postmetadata['add_state'] = trim($child->FullAddress->region);
$postmetadata['add_country'] = trim($child->FullAddress->country);
$postmetadata['add_zip_code'] = trim($child->postcode);
$postmetadata['other_guid'] = trim($child->url);
$postmetadata['post_from_feed'] = true;
//Insert Author(agent)
$authordata = array();
$authormetadata = array();
if(!empty($child->agent->agent_name)){
//Author data
$authordata['user_login'] = trim(pg_create_string($child->agent->agent_name));
$authordata['user_nicename'] = trim($child->agent->agent_name);
$authordata['display_name'] = trim($child->agent->agent_name);
$authordata['user_email'] = trim($child->agent->agent_email);
$authordata['user_url'] = trim($child->url);
$authordata['role'] = trim('agent');
$authordata['user_registered'] = date('Y-m-d H:i:s');
//Author meta data
$authormetadata['user_phone'] = trim($child->agent->agent_phone);
$authormetadata['user_address'] = trim($child->agent->agent_address);
}
foreach($child->pictures as $pictures)
{
$postimagedata = array();
$imageloop = 0;
foreach($pictures as $picture)
{
$postimagedata[$imageloop] = (string)$picture->picture_url;
$imageloop++;
}
}
$postmetadata['post_from_feed_images'] = serialize($postimagedata);
if($postdata['post_title'] != ''){
$sql = "select count(post_title) as post from ".$wpdb->prefix."posts where post_title = '".$postdata['post_title']."' and post_status = '".$postdata['post_status']."'";
$sqlresult = $wpdb->get_results($sql);
foreach ( $sqlresult as $post ) {
if($post->post == 0)
{
if(!empty($authordata)){
$user_id = wp_insert_user( $authordata );
if(!empty($user_id) && empty($user_id->errors)){
$TotalUseradded++;
echo "User added = ".$user_id."<br />";
if(!empty($authormetadata)){
foreach($authormetadata as $meta_key=>$meta_value){
add_user_meta( $user_id, $meta_key, $meta_value);
echo "User Meta = ".$meta_key." Inserted<br />";
}
}
}elseif(!empty($user_id->errors)){
$userdata = get_user_by('email', $authordata['user_email']);
$user_id = $userdata->ID;
echo "User fetched = ".$user_id."<br />";
}
$postdata['post_author'] = $user_id;
}
$post_id = wp_insert_post($postdata);
if(!empty($post_id)){
$TotalPostadded++;
echo "<br />"."Post Inserted = ".$post_id;
$properties_category_id = 109;
$cat = "INSERT INTO wp_term_relationships ( object_id, term_taxonomy_id ) VALUES ( '".$post_id."','".$properties_category_id."' )";
$catid = $wpdb->query($cat);
echo "<br />"."Post attached to Category ID = ".$properties_category_id."<br />";
if(!empty($postmetadata)){
foreach($postmetadata as $key=>$value){
add_post_meta($post_id, $key,$value, true);
echo "Post Meta = ".$key." Inserted<br />";
}
}
}
}
}
}
}
$cron = "<br />"."Cron Done";
$cron .= "<br />"."Total Post added = ".$TotalPostadded;
$cron .= "<br />Total User added = ".$TotalUseradded;
echo $cron;
mail('xxxxxx#xxxxx.com','Lets view Properties Cron',$cron);
function pg_create_string($text)
{
// replace all non letters or digits with -
$text = preg_replace('/\W+/', '-', $text);
// trim and lowercase
$text = strtolower(trim($text, '-'));
return $text;
}
?>
Can anyone help me?
The start of the file should be:
#!/usr/bin/php
<?php
This assumes that your PHP binary is in the folder /usr/bin. If it isn't, then change the #! line appropriately.
Even better:
#!/usr/bin/env php
<?php
will almost certainly work as it uses the system's env command to work out where php is.
Add this to the very top of your file and make the file executable with chmod (for example 555 or 775):
#!/usr/local/lib/php
<?php
// your code
Where /usr/local/lib/php is the path to php.
Or if that doesn't work you can change the cron command:
/usr/local/lib/php /home/letsview/public_html/getfeed.php
#!/usr/local/lib/php
<?php
// php code
And are you sure the PHP CLI binary is in /usr/local/lib?
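If you are unsure which path belongs in the #! line or in the cron command, one way to check is to let PHP report it. The sketch below relies only on the built-in constants `PHP_SAPI` and `PHP_BINARY`; the function name `describeRuntime` is made up for this example. Run it from the same shell account that cron uses, since a script run through the web server may report a different binary.

```php
<?php
// Hypothetical helper: report how the script is being run and by which binary.
function describeRuntime()
{
    return array(
        'sapi'   => PHP_SAPI,   // "cli" when started from a shell or from cron
        'binary' => PHP_BINARY, // full path of the PHP executable, if known
    );
}

$info = describeRuntime();
echo 'SAPI:   ' . $info['sapi'] . PHP_EOL;
echo 'Binary: ' . $info['binary'] . PHP_EOL;
```

The reported binary path is the one to put after `#!` or in front of the script path in the cron command.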
Because crawling the web can take a lot of time, I want to use pcntl_fork() to create multiple children and split the work into parts:
1. Master - crawls the domain.
2. Child - when it receives a link, it must crawl that link found on the domain.
3. Child - must do the same as 2 when it receives a new link.
Can I create as many children as I want, or do I have to set a maximum number of children?
Here's my code:
class MyCrawler extends PHPCrawler
{
function handlePageData(&$page_data)
{ // CHECK DOMAIN
$domain = $_POST['domain'];
$keywords = $_POST['keywords'];
//$tags = get_meta_tags($page_data["url"]);
//$iKeyFound = null;
$find = $keywords;
$str = file_get_contents($page_data["url"]);
if(strpos($str, $find) !== false && $page_data["received"] == true) // strpos() returns 0 for a match at the very start
{
$keywords = $_POST['keywords'];
if($page_data["header"]){
echo "<table border='1' >";
echo "<tr><td width='300'>Status:</td><td width='500'> ".strtok($page_data["header"], "\n")."</td></tr>";}
else echo "<table border='1' >";
// PRINT FIRST LINE
echo "<tr><td>Page requested:</td><td> ".$page_data["url"]."</td></tr>";
// PRINT WEBSITE STATUS
// PRINT WEB PAGE
echo "<tr><td>Referer-page:</td><td> ".$page_data["referer_url"]."</td></tr>";
// CONTENT RECEIVED?
if ($page_data["received"]==true)
echo "<tr><td>Content received: </td><td>".$page_data["bytes_received"] / 1024 . " Kbytes</td></tr></table>";
else
echo "<tr><td>Content:</td><td> Not received</td></tr></table>";
$domain = $_POST['domain'];
$link = mysql_connect('localhost', 'crawler', 'DRZOIDBERGGG');
if (!$link)
{
die('Could not connect: ' . mysql_error());
}
mysql_select_db("crawler");
if(empty($page_data["referer_url"]))
$page_data["referer_url"] = $page_data["url"];
strip_tags($str, '<p><b>');
$matches = $keywords;
//$match = preg_match_all("'/<(*.?)(*.?)>(*.?)'".$keywords."'(*.?)<\/($1)>/'", $str, $matches, PREG_SET_ORDER);
//echo $match;
$doc = new DOMDocument();
$doc->loadHTML($str);
$xPath = new DOMXpath($doc);
$xPathQuery = "//text()[contains(translate(.,'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'), '".strtoupper($keywords)."')]";
$elements = $xPath->query($xPathQuery);
if($elements->length > 0){
foreach($elements as $element){
print "Gevonden: " .$element->nodeValue."<br />";
}}
$result = mysql_query("SELECT * FROM crawler WHERE data = '".$element->nodeValue."' ") ;
if(mysql_num_rows($result)>0)
echo 'Column already exist';
else{
echo 'added';
mysql_query("INSERT INTO crawler (id, domain, url, keywords, data) VALUES ('', '".$page_data["referer_url"]."', '".$page_data["url"]."', '".$keywords."', '".$element->nodeValue. "' )");
}
echo '<br>';
echo "<br><br>";
echo str_pad(" ", 5000); // "Force flush", workaround
flush();
}
I forgot to say: I need a workaround for 32-bit Windows (x86),
because pcntl_fork() is not supported on my client.
I wonder if you wouldn't be better served by going with something like Gearman for this.
It's a job manager that runs on your system and you submit jobs to it (via php if you like), and then it assigns them to workers (again, written in php), who then report back with their result. It's pretty robust and flexible in that you can let it run more workers to handle more workload.
shell_exec() can do this, but I don't know exactly how to use it for this case.
Look into this: http://in.php.net/manual/en/ref.pcntl.php#37369
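As a rough sketch of the "start separate processes" idea without pcntl_fork(), the code below uses proc_open() to run child PHP processes with a fixed upper limit on how many run at once. Everything specific here is an assumption for illustration: the function name `runChildren`, the limit of 4, and the inline child code, which only upper-cases its argument where a real crawler would call a separate worker script. Command quoting differs on Windows, so the command line should be checked there before relying on it.

```php
<?php
// Hypothetical helper: process each URL in its own child PHP process,
// never running more than $maxChildren children at the same time.
function runChildren(array $urls, $maxChildren = 4)
{
    // Stand-in for the real work; a crawler would run a worker script instead.
    $childCode = 'echo strtoupper($argv[1]);';

    $queue = array_values($urls);
    $running = array();
    $results = array();

    while ($queue || $running) {
        // Start new children while we are below the limit.
        while ($queue && count($running) < $maxChildren) {
            $url = array_shift($queue);
            $cmd = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($childCode)
                 . ' -- ' . escapeshellarg($url);
            $proc = proc_open($cmd, array(1 => array('pipe', 'w')), $pipes);
            if ($proc !== false) {
                $running[] = array('url' => $url, 'proc' => $proc, 'out' => $pipes[1]);
            }
        }

        // Collect the output of children that have finished.
        foreach ($running as $key => $child) {
            $status = proc_get_status($child['proc']);
            if (!$status['running']) {
                $results[$child['url']] = stream_get_contents($child['out']);
                fclose($child['out']);
                proc_close($child['proc']);
                unset($running[$key]);
            }
        }

        usleep(10000); // short pause so the loop does not spin at full speed
    }

    return $results;
}

$results = runChildren(array('a', 'b', 'c'), 2);
```

A limit is worth having in any case: every child costs memory and a connection to the target server, so the practical maximum depends on the machine and on how much load the crawled site tolerates. This sketch reads each child's output only after it exits, which is fine for small outputs; children that print a lot would need their pipes drained while they run.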