UPDATE: I think the CakePHP updateAll is the problem. If I comment out the updateAll and pr() the results, I get as many language detections in 1-2 seconds as I otherwise get in 5 minutes! I only need to update one row, and I can identify that row by author and title. Is there a better and faster way?
I'm using detectlanguage.com to detect all English texts in my SQL database. My database consists of about 500,000 rows. I have tried many things to detect the language of all my texts faster, but at the current rate it will take many days.
I only send the first 20% of each text (see my code).
I tried copying my function and running the copies at the same time. The code below is the copy that handles all texts with a title starting with A.
I can only run 6 functions at the same time (on localhost). I tried a 7th function in a new tab, but got:
Waiting for available socket....
public function detectLanguageA()
{
    set_time_limit(0);
    ini_set('max_execution_time', 0);
    $mydatas = $this->datas;
    $alldatas = $mydatas->find('all')->where(['SUBSTRING(datas.title,1,1) =' => 'A'])->where(['datas.lang =' => '']);
    foreach ($alldatas as $row) {
        $text = $row->text;
        // Send only the first 20% of the text to the API.
        $textLength = (int) round(strlen($text) * 0.2);
        $text = substr($text, 0, $textLength);
        $title = $row->title;
        $author = $row->author;
        $languageCode = DetectLanguage::simpleDetect($text);
        $mydatas->updateAll(
            ['lang' => $languageCode], // fields
            ['author' => $author, 'textTitle' => $title] // conditions
        );
    }
}
I hope someone has an idea for my problem. At this rate the language detection for all my texts will take more than one week.
My computer has been running for over 20 hours with only short interruptions, but I have only detected the language of about 13,000 texts, and there are 500,000 texts in my database.
I have now tried sending the texts in batches, but it is also too slow. I always send 20 texts in one array, and I think that is the maximum.
Is it possible that the CakePHP 3.x updateAll function is what makes it so slow?
THE PROBLEM WAS THE CAKEPHP updateAll
Now I'm using http://book.cakephp.org/3.0/en/orm/saving-data.html#updating-data with a for loop, and everything is fast:
use Cake\ORM\TableRegistry;
$articlesTable = TableRegistry::get('Articles');
for ($i = 1; $i < 460000; $i++) {
    // Load one row by primary key, detect its language, and save it back.
    $oneArticle = $articlesTable->get($i);
    $languageCode = DetectLanguage::simpleDetect($oneArticle->lyrics);
    $oneArticle->lang = $languageCode;
    $articlesTable->save($oneArticle);
}
Related
I am trying to build an 'analysis' feature for my translation software. A translator will be able to analyze a project which checks for similarities in a glossary.
On a project with 10,000 rows (each row contains a source text between 1-500 characters) with a glossary containing 25,000 terms, my current analysis algorithm takes a RIDICULOUS amount of time. I need to get this down to a couple of minutes maximum.
My algorithm looks something like this (I removed code that doesn't affect performance):
foreach ($rows as $row) { //10,000 rows
    $source = $row["source"];
    $matchPercent = $glossary->findMatch($source); //This line of code is extremely slow
    $matchPercents[$matchPercent]++;
}
//Now I have an array of all the matching percentages and how many rows fall into each percentage match
public function findMatch($source)
{
    $highestMatchPercent = 0;
    foreach ($this->terms as $term) { //25,000 terms
        $matchPercent = 0;
        similar_text(strtolower($source), strtolower($term), $matchPercent);
        $matchPercent = floor($matchPercent);
        if ($matchPercent > $highestMatchPercent) $highestMatchPercent = $matchPercent;
        if ($highestMatchPercent == 100) return $highestMatchPercent; //Added efficiency
    }
    return $highestMatchPercent;
}
How can I achieve similar results and speed this process up?
I've tried levenshtein, but its maximum character limit is a problem.
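One way to cut the work without changing the results is sketched below. It rests on two observations: the loop above lowercases the same strings on every call, and similar_text() can never match more characters than the shorter string contains, so its percentage is bounded by 200 * min(len1, len2) / (len1 + len2). Terms whose bound cannot beat the best score so far can be skipped without calling similar_text() at all. The function names prepareGlossary and findBestMatch are made up for this illustration, and the speed-up depends on how much the term lengths vary, so treat it as a starting point to measure rather than a guaranteed fix.

```php
<?php
// Hypothetical helper: lowercase every term once and build a hash index
// for exact matches. $terms is assumed to be a plain list of strings.
function prepareGlossary(array $terms): array
{
    $lowerTerms = array_map('strtolower', $terms);
    return [$lowerTerms, array_flip($lowerTerms)];
}

// Hypothetical replacement for findMatch(): same result, fewer similar_text() calls.
function findBestMatch(string $source, array $lowerTerms, array $exactIndex): int
{
    $source = strtolower($source);
    if (isset($exactIndex[$source])) {
        return 100; // exact match found via hash lookup, no scan needed
    }
    $srcLen = strlen($source);
    $best = 0;
    foreach ($lowerTerms as $term) {
        $termLen = strlen($term);
        if ($srcLen + $termLen === 0) {
            continue;
        }
        // Upper bound on the similar_text() percentage for this pair.
        $upper = 200 * min($srcLen, $termLen) / ($srcLen + $termLen);
        if ($upper <= $best) {
            continue; // cannot beat the current best, skip the expensive call
        }
        similar_text($source, $term, $percent);
        $percent = (int) floor($percent);
        if ($percent > $best) {
            $best = $percent;
        }
    }
    return $best;
}
```

Call prepareGlossary() once per analysis run and pass its two return values to findBestMatch() for each row.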
I have this code on my website. The function is called on every page, but it's slow (I did a lot of research: without this function the TTFB is about 100ms, but with it the TTFB can be as much as 2 seconds).
The function replaces every text in [] with a link if a match is found in the card database. E.g. [Inner Fire] becomes the following in the page output:
<a href="/card/id/name" class="quality1">Inner Fire</a>
It works really well, but there are 3000 cards in the database and it is slow. Can anyone come up with a better solution to speed the process up?
Thank you in advance.
Some clarifications before the code:
sql_query:
function sql_query($conn, $query)
{
return mysqli_query($conn, $query);
}
Similar function with sql_fetch.
char_convert: converts utf-8 characters to HTML entity (decimal)
function coloredcard($text)
{
    global $conn;
    $query = "SELECT id, quality, name, collectible FROM cards";
    $result = sql_query($conn, $query);
    while ($card = sql_fetch($result))
    {
        $name_replace = strtolower(str_replace(str_split("\' "), "-", $card['name']));
        $to = '<a href="/card/'.$card['id'].'/'.$name_replace.'" class="quality'.$card['quality'].'">'.$card['name'].'</a>';
        if ($card['collectible']!=0) //if collectible, replace [card_names]
        {
            $from = '['.char_convert($card['name']).']';
            $text = str_ireplace($from, $to, $text);
        }
        else //if not collectible replace (noncollectible card names)
        {
            $from = '('.char_convert($card['name']).')';
            $text = str_ireplace($from, $to, $text);
        }
    }
    return $text;
}
Please let me know if you need further information.
The best way to accelerate this code will be to limit the number of cards that need to be fetched from the database. I'm not going to write the code for you, but here's an outline of how that could work:
Extract all the card names which are [linked] in the page, e.g. using preg_match_all().
Perform a single SQL query to load all of those cards, using WHERE name IN ('name1', 'name2', 'name3', …).
Loop through the result of that query and perform replacements on the HTML where appropriate.
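The three steps above can be sketched roughly as follows. This is only an illustration: linkCards and the $fetchCards callback are hypothetical names, the callback stands in for the single WHERE name IN (...) query (which should use bound parameters or proper escaping), and the link format is guessed from the example output in the question. Only the [bracketed] case is handled here.

```php
<?php
// Sketch only. $fetchCards is a hypothetical callback that receives the list
// of names found in the text, runs ONE query such as
//   SELECT id, quality, name FROM cards WHERE name IN (...)
// and returns the matching rows as associative arrays.
function linkCards(string $text, callable $fetchCards): string
{
    // 1. Collect every [Card Name] that actually appears in the text.
    if (!preg_match_all('/\[([^\[\]]+)\]/', $text, $matches)) {
        return $text; // nothing to link, so no query at all
    }
    $names = array_values(array_unique($matches[1]));

    // 2. Load only those cards and build the search/replace lists.
    $search = [];
    $replace = [];
    foreach ($fetchCards($names) as $card) {
        // Assumed slug format: lowercase, quotes and spaces turned into dashes.
        $slug = strtolower(str_replace(["'", ' '], '-', $card['name']));
        $search[] = '[' . $card['name'] . ']';
        $replace[] = sprintf(
            '<a href="/card/%d/%s" class="quality%d">%s</a>',
            $card['id'],
            $slug,
            $card['quality'],
            htmlspecialchars($card['name'])
        );
    }

    // 3. Apply all replacements in a single call.
    return str_ireplace($search, $replace, $text);
}
```

Names that are not in the database are simply left as they are, because the query returns no row for them.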
Just my 2 cents:
You are not going to display 3000+ cards at the same time, are you? So why not implement an infinite loader which requests only a handful of them (10 or so) and then asks for more as the user scrolls down?
$query = "SELECT id, quality, name, collectible FROM cards LIMIT ".(int)$offset.",10";
Solution no. 2:
Have another table in which you store which cards are needed on which page, something like:
cardpage(cardid, pageid)
and use a JOIN query between the card and cardpage tables.
You can use MySQL's own string functions to do the replacement stuff while fetching data, much faster than iterating in PHP:
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html
You're making 3000 calls to str_replace(). You can accomplish the same result in one. See the docs for str_replace(), notably that the first and second parameters can be arrays:
$search = ['things', 'to', 'search', 'for', ... ];
$replace = ['things', 'to', 'replace', 'with', ... ];
$output = str_replace($search, $replace, $input);
Also, cache the output so that you only have to perform the replacement once.
I've had excellent support here, so I figured I'd try again, as I have no clue where to even begin looking for this answer.
I have a simple MySQL table named "testimonials", which contains 3 columns: "id", "name", "content".
What I want to do, is display testimonials within a fixed size block. Just simply displaying the content is no problem at all, however where I'm stuck is the (somewhat) unique way I'm trying to make it work. I would like to display a random item on each page load, and then to check the character length of the "content" within the testimonial, and if it's equal to or greater than XX length, then just display the one testimonial, otherwise if it's less than XX in length to display a second testimonial (assuming it combined with the first doesn't break the container box).
The box in question is 362px in width and 353px in height using 14px font with Verdana. An example of how the testimonial will appear on the page is like this:
"This is the testimonial content, some nice message from a client."
-- Billy Bob, Owner, Crazy Joe's Tavern
The "name" column in the database holds everything in bold (minus the -- of course), in case someone felt the need to ask.
As I typed that, I felt as if I was asking for a miracle, but I'll still post the question, hoping someone might just know the answer. As always, thanks for any help I may get. If this is simply asking too much, I'm not totally against the idea of only displaying one testimonial at a time and making a ground rule that each one has to contain a minimum of XX characters.
Thanks!
Quick Update: I didn't expect to get answers so quickly. I'm not at my desk at the moment, so as soon as I sit back down I'll go through and see which answer fits best. However, do you guys get together and try to make your answer more complex than the previous answer? lol, thanks though, to anyone who's offering help, you guys rock!
Final edit: I decided against this whole idea, as it just way overcomplicated everything. For the time being, I'm just going to display all testimonials and make them scroll, while I work on a jQuery snippet to make it prettier. Thanks everyone for your help though! Should I decide to do this after all, I'll be trying my chosen answer.
You just need a loop. Pseudo-code:
$length = 0;
$target = 200; // or whatever
while( $length < $target ) {
    $comment = getOneComment();
    displayComment($comment);
    $length += strlen( $comment['content'] ); // assuming getOneComment() returns an associative array
}
To make it pretty, if the display box is going to be a fixed height, you could use some jQuery to toggle whether to show the second comment or not.
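A runnable variant of that loop might look like the sketch below. The function name pickTestimonials is made up, and it assumes the rows have already been loaded into an array and put in random order (for example with shuffle()). Unlike the pseudo-code, it stops when the pool runs out, so it cannot loop forever on a small table.

```php
<?php
// Sketch only: $pool is assumed to be a list of rows, each with a 'content'
// key, already in random order. $target is the character budget for the box.
function pickTestimonials(array $pool, int $target): array
{
    $picked = [];
    $length = 0;
    foreach ($pool as $testimonial) {
        if ($length >= $target) {
            break; // the box is considered full
        }
        $picked[] = $testimonial;
        $length += strlen($testimonial['content']);
    }
    return $picked;
}
```

Like the pseudo-code, this keeps adding testimonials until the running total reaches the target, so the last one added may push the total past it.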
Assuming you have testimonials in an array:
$testimonials = array(
    't1' => array(
        'content' => 'testimonials 1 content..',
        'author' => 'the author'
    ),
    't2' => array(
        'content' => 'testimonials 2 content..',
        'author' => 'the author 2'
    ),
);
You could have a maxLengthTestimonialsContent and a maxLenthAllTestimonnials variable:
$maxLengthTestimonialsContent = 120;
$maxLenthAllTestimonnials = 240;
And now with a simple loop you build the array of testimonials that you will show:
$testimonialsToShow = array();
$totalLength = 0;
foreach ($testimonials as $t) {
    $length = strlen($t['content']);
    // Always keep the first testimonial. Keep a later one only if it is
    // shorter than maxLengthTestimonialsContent and the running total
    // stays within maxLenthAllTestimonnials.
    if (!empty($testimonialsToShow)
        && ($length >= $maxLengthTestimonialsContent
            || $totalLength + $length > $maxLenthAllTestimonnials)) {
        break;
    }
    $testimonialsToShow[] = $t;
    $totalLength += $length;
}
Something like this is what you would need.
<?php
$str = $res["testimonial"];
if (strlen($str) < 50) {
    // The first testimonial is short:
    // logic to retrieve and display second testimonial
}
?>
Obviously there's some more processing you'll have to come up with to determine if the second testimonial is short enough to fit or not. But that should get you started.
EDIT:
For the randomization, I use this on my own site:
$referrals = mysql_query("SELECT id FROM ts_testimonials");
$referralView = array();
$i = 0;
while ($newReferral = mysql_fetch_array($referrals)) {
    $referralView[$i] = $newReferral['id'];
    $i++;
}
if (sizeof($referralView) >= 1){
    $referralTop = rand(0,sizeof($referralView)-1);
    $newReferralTop = mysql_fetch_array(mysql_query("SELECT * FROM ts_testimonials WHERE id = '".$referralView[$referralTop]."'"));
    if (sizeof($referralView) >=2){
        $referralBottom = rand(0,sizeof($referralView)-1);
        while ($referralBottom == $referralTop) {
            $referralBottom = rand(0,sizeof($referralView)-1);
        }
        $newReferralBottom = mysql_fetch_array(mysql_query("SELECT * FROM ts_testimonials WHERE id = '".$referralView[$referralBottom]."'"));
    }
}
I'm not especially experienced in PHP, but I've been trying to create a basic blog for a friend's website. I thought the easiest thing to do for now would be to use static files, so I'm using XML to store the blog entries. I've managed to set it up so that I can display the posts as I want them. However, I now want a nav bar which will allow me to select posts based on date, as most blogs have.
The files are simply named 1.xml, 2.xml, 3.xml etc. so I can iterate through them. The code below shows how the data array is organised (it's an array within an array, so the first-level index is the number in the filename minus 1). I'm having a lot of trouble working out how to create the nav bar (ul, li etc.) from this data. Presumably I'd need the years to be unique, then the months within each year, and then the days, so that each title (obviously a link) comes under the proper date.
$data = array();
for ($i = 1; $i <= $numberOfPosts; $i++) {
    $filename = './blogentries/' . $i . '.xml';
    if (!file_exists($filename))
        throw new Exception();
    $blogentry = simplexml_load_file($filename);
    $title = $blogentry->title;
    $dateD = $blogentry->date->day;
    $dateM = $blogentry->date->month;
    $dateY = $blogentry->date->year;
    if (strlen($dateY) === 2) $dateY = '20' . $dateY;
    $entryParagraphs = $blogentry->entry->children();
    $data[] = array(
        (string)$title,
        array(
            (string)$dateY,
            (string)$dateM,
            (string)$dateD
        ),
        $entryParagraphs
    );
}
Thanks for any help you can give. And sorry if I've not been as eloquent as I might have been, I hope you'll forgive my relative ignorance!
From what I understand, I would go with this type of solution:
First of all, if you know a little bit of OOP, create an Article class.
After that, here is what I would do for what you are asking:
Instead of building a flat list of arrays (each inner array should really be an object of that class), I would index the entries by date:
$data[(string)$dateY][(string)$dateM][(string)$dateD] = $blogentry;
Then you have all your articles classified by year, then month, then day, so it becomes really simple to build what you are asking for.
edit :
When I said it should be a class, I'm talking about this array :
array(
    (string)$title,
    array(
        (string)$dateY,
        (string)$dateM,
        (string)$dateD
    ),
    $entryParagraphs
)
It's typically what a class is designed for.
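To show how the year/month/day indexing leads to a nav bar, here is a minimal sketch. The function names groupByDate and renderNav, and the flat $posts input with 'title', 'year', 'month', 'day' and 'url' keys, are assumptions made for the example rather than anything from the question. Each day holds a list of posts, so two posts on the same date do not overwrite each other.

```php
<?php
// Sketch only: $posts is a hypothetical flat list of arrays with the keys
// 'title', 'year', 'month', 'day' and 'url' (all plain strings).
function groupByDate(array $posts): array
{
    $tree = [];
    foreach ($posts as $post) {
        // The trailing [] appends, so several posts can share one date.
        $tree[$post['year']][$post['month']][$post['day']][] = $post;
    }
    // Sort every level newest first.
    krsort($tree);
    foreach ($tree as $year => $months) {
        krsort($months);
        foreach ($months as $month => $days) {
            krsort($days);
            $months[$month] = $days;
        }
        $tree[$year] = $months;
    }
    return $tree;
}

// Turn the nested array into nested <ul>/<li> markup.
function renderNav(array $tree): string
{
    $html = '<ul>';
    foreach ($tree as $year => $months) {
        $html .= '<li>' . $year . '<ul>';
        foreach ($months as $month => $days) {
            $html .= '<li>' . $month . '<ul>';
            foreach ($days as $day => $posts) {
                foreach ($posts as $post) {
                    $html .= sprintf(
                        '<li>%s: <a href="%s">%s</a></li>',
                        $day,
                        htmlspecialchars($post['url']),
                        htmlspecialchars($post['title'])
                    );
                }
            }
            $html .= '</ul></li>';
        }
        $html .= '</ul></li>';
    }
    return $html . '</ul>';
}
```

Because the years and months are array keys, they are unique by construction, which is the property the question was looking for.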
I'm having an issue producing this array multiple times, depending on how many coupons were purchased.
Right now it looks like this:
$coupon_array = array(
    'user_id'=>$_POST["user_id"],
    'mergent_id'=>$_POST["merchant_id"],
    'deals_id'=>$_POST["deal_id"],
    'order_id'=>$order_id,
    'secret'=>$secret,
    'expire_time'=>$time,
    'create_time'=>$time,
    'status'=>1
);
$this->common_model->insertData('coupon', $coupon_array);
But I have a post value such as:
"quantity"=>$_POST["quantity"]
and I would like to run the insert that many times. Example:
$quantity x $this->common_model->insertData('coupon', $coupon_array);
Sorry for my English, I hope I've explained this so it's understandable... ;)
Another thing: when we insert the coupons they all have the same md5($secret). Is it possible to give each coupon a different code?
$secret = md5($secret);
$coupon_array = array(
    'user_id'=>$_POST["user_id"],
    'mergent_id'=>$_POST["merchant_id"],
    'deals_id'=>$_POST["deal_id"],
    'order_id'=>$order_id,
    'secret'=>$secret,
    'expire_time'=>$time,
    'create_time'=>$time,
    'status'=>1
);
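Well, if I understand what you want, you can use a for loop:
$quantity = (int) $this->input->post('quantity');
for ($i = 0; $i < $quantity; $i++) {
    // Hash the previous secret together with the counter so each coupon gets a different code.
    $coupon_array['secret'] = md5($coupon_array['secret'].$i);
    $this->common_model->insertData('coupon', $coupon_array);
}
Also, in CodeIgniter prefer $this->input->post('...') over $_POST["..."]: it does not raise a notice when the key is missing, and it can apply XSS filtering when you pass TRUE as the second argument. More info can be found in the documentation for the Input class.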
for ($i=0; $i<$quantity; $i++) {
    $this->common_model->insertData('coupon', $coupon_array);
}
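If each coupon should get its own unpredictable code, one option is to generate a fresh random secret per row instead of hashing a shared value. The sketch below uses PHP's random_bytes() (PHP 7+) in place of the md5() approach from the question; buildCoupons is a hypothetical helper name, and the insert call is left to the caller.

```php
<?php
// Sketch only: returns $quantity copies of $base, each with its own random
// 32-character hex secret. buildCoupons is a made-up helper name.
function buildCoupons(array $base, int $quantity): array
{
    $coupons = [];
    for ($i = 0; $i < $quantity; $i++) {
        $coupon = $base;
        $coupon['secret'] = bin2hex(random_bytes(16)); // 16 random bytes as 32 hex characters
        $coupons[] = $coupon;
    }
    return $coupons;
}
```

Each returned row could then be passed to $this->common_model->insertData('coupon', $row) inside a foreach loop.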