I have the function below, which matches users on specific whitelisted fields. It works brilliantly for small amounts of data, but in our production environment we can have more than 1 million user records, and Eloquent is (understandably) slow when creating models in $query->get() at the end. I asked a question this morning about how to speed this up, and the accepted answer worked a treat. The only problem now is that the query sent to DB::select($query->toSql()... has lost all of the extra relational information I need. Is there any way (keeping as much of the current function as possible) to add joins to DB::select so that I keep the speed without losing the relations, or will it require a complete rewrite?
The recipients query should include relations for tags, contact details, contact preferences etc, but the resulting sql from $query->toSql() has no joins and only references the one table.
public function runForResultSet()
{
$params = [];
// Need to ensure that when criteria is empty - we don't run
if (count($this->segmentCriteria) <= 0) {
return;
}
$query = Recipient::with('recipientTags', 'contactDetails', 'contactPreferences', 'recipientTags.tagGroups');
foreach ($this->segmentCriteria as $criteria) {
$parts = explode('.', $criteria['field']);
$fieldObject = SegmentTableWhiteListFields::where('field', '=', $parts[1])->get();
foreach ($fieldObject as $whiteList) {
$params[0] = [$criteria->value];
$dateArgs = ((strtoupper($parts[1]) == "AGE" ? false : DatabaseHelper::processValue($criteria)));
if ($dateArgs != false) {
$query->whereRaw(
DatabaseHelper::generateOperationAsString(
$parts[1],
$criteria,
true
),
[$dateArgs['prepared_date']]
);
} else {
// Need to check for empty value as laravel's whereRaw will not run if the provided
// params are null/empty - In which case we need to use whereRaw without params.
if (!empty($criteria->value)) {
$query->whereRaw(
\DatabaseHelper::generateOperationAsString(
$parts[1],
$criteria
),
$params[0]
);
} else {
$query->whereRaw(
\DatabaseHelper::generateOperationAsString(
$parts[1],
$criteria
)
);
}
}
}
}
// Include any tag criteria
foreach ($this->segmentRecipientTagGroupCriteria as $criteria) {
$startTagLoopTime = microtime(true);
switch (strtoupper($criteria->operator)) {
// IF NULL check for no matching tags based on the tag group
case "IS NULL":
$query->whereHas(
'recipientTags',
function ($subQuery) use ($criteria) {
$subQuery->where('recipient_tag_group_id', $criteria->recipient_tag_group_id);
},
'=',
0
);
break;
// IF NOT NULL check for at least 1 matching tag based on the tag group
case "IS NOT NULL":
$query->whereHas(
'recipientTags',
function ($subQuery) use ($criteria) {
$subQuery->where('recipient_tag_group_id', $criteria->recipient_tag_group_id);
},
'>=',
1
);
break;
default:
$query->whereHas(
'recipientTags',
function ($subQuery) use ($criteria) {
$dateArgs = (DatabaseHelper::processValue($criteria));
$subQuery->where('recipient_tag_group_id', $criteria->recipient_tag_group_id);
if ($dateArgs != false) {
$subQuery->whereRaw(
DatabaseHelper::generateOperationAsString(
'name',
$criteria,
true
),
[$dateArgs['prepared_date']]
);
} else {
// Need to check for empty value as laravel's whereRaw will not run if the provided
// params are null/empty - In which case we need to use whereRaw without params.
if (!empty($criteria->value)) {
$subQuery->whereRaw(
\DatabaseHelper::generateOperationAsString(
'name',
$criteria
),
[$criteria->value]
);
} else {
$subQuery->whereRaw(
\DatabaseHelper::generateOperationAsString(
'name',
$criteria
)
);
}
}
},
'>=',
1
);
}
}
//$collection = $query->get(); // slow when dealing with > 25k rows
$collection = DB::select($query->toSql(), $query->getBindings()); // fast but loses joins / relations
// return the response
return \ApiResponse::respond($collection);
}
By lost relational information, do you mean the eagerly loaded relations whose names you passed to with()?
This information was not lost, as it was never in the query. When you load relations like that, Eloquent runs separate SQL queries to fetch related objects for the objects from your main result set.
If you want columns from those relations in your result set, you need to explicitly add joins to your query. You can find information about how to do this in the documentation: https://laravel.com/docs/5.1/queries#joins
I have a controller API method where I insert many rows (around 4000 - 8000). Before inserting a new row I also check whether a venue with the same name has already been added in the zone, so that's another Eloquent call. My issue is that I usually get timeout errors because the row inserting takes too long. I use set_time_limit(0), but this seems too hacky.
I think the key is the validation check I do before inserting a new row.
//Check if there is a venue with same name and in the same zone already added
$alreadyAdded = Venue::where('name', $venue['name'])->whereHas('address', function ($query) use ($address){
$query->where('provinceOrState' , $address['provinceOrState']);
})->orWhere('venueId',$venue['venueId'])->first();
Is there a way I can improve the performance of this method? This is my complete method call:
public function uploadIntoDatabase(Request $request)
{
set_time_limit(0);
$count = 0;
foreach($request->input('venuesToUpload') as $index => $venue)
{
//Check if there is a venue with same name and in the same zone already added
$alreadyAdded = Venue::where('name', $venue['name'])->whereHas('address', function ($query) use ($address){
$query->where('provinceOrState' , $address['provinceOrState']);
})->orWhere('venueId',$venue['venueId'])->first();
if(!$alreadyAdded)
{
$newVenue = new Venue();
$newVenue->name = $venue['name'];
$newVenue->save();
$count++;
}
}
return response()->json([
'message' => $count.' new venues uploaded to database',
]);
}
Use only one query to insert the venues:
$newVenues = [];
$count = 0;
foreach($request->input('venuesToUpload') as $index => $venue) {
//Check if there is a venue with same name and in the same zone already added
$alreadyAdded = Venue::where('name', $venue['name'])->whereHas('address', function ($query) use ($address){
$query->where('provinceOrState' , $address['provinceOrState']);
})->orWhere('venueId',$venue['venueId'])->count();
if(!$alreadyAdded) {
$newVenues[] = ['name' => $venue['name']];
}
}
if ($newVenues) {
$count = count($newVenues);
Venue::insert($newVenues);
}
As for the verification part, change first() to count(), because you don't need to retrieve the data, only the information that it exists. And since you're verifying with both name and id, you could write a custom query that verifies all values at once: build a static table from the request inputs and left join it on the existing venues table, keeping the rows where venues.id IS NULL.
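One way to read this advice: rather than running one existence query per uploaded venue, load the existing ids and names once and check each upload against in-memory lookup tables. The sketch below shows only that filtering step in plain PHP. The function name filterNewVenues and the array shapes (venueId, name, provinceOrState on every row) are assumptions for illustration, not the asker's schema.

```php
<?php
/**
 * Return the uploaded venues that match no existing venue,
 * either by venueId or by (name, provinceOrState).
 * Hypothetical helper: the array shapes are assumed for this sketch.
 */
function filterNewVenues(array $uploaded, array $existing): array
{
    // Build lookup tables once; isset() on array keys is a constant-time check.
    $knownIds = [];
    $knownNames = [];
    foreach ($existing as $row) {
        $knownIds[$row['venueId']] = true;
        $knownNames[$row['name'] . '|' . $row['provinceOrState']] = true;
    }

    $new = [];
    foreach ($uploaded as $venue) {
        $nameKey = $venue['name'] . '|' . $venue['provinceOrState'];
        if (isset($knownIds[$venue['venueId']]) || isset($knownNames[$nameKey])) {
            continue; // already stored, or seen earlier in this upload
        }
        // Remember it so duplicates inside the same upload are skipped too.
        $knownIds[$venue['venueId']] = true;
        $knownNames[$nameKey] = true;
        $new[] = $venue;
    }
    return $new;
}

$existing = [
    ['venueId' => 1, 'name' => 'Blue Hall', 'provinceOrState' => 'ON'],
];
$uploaded = [
    ['venueId' => 1, 'name' => 'Renamed Hall', 'provinceOrState' => 'ON'], // same id
    ['venueId' => 2, 'name' => 'Blue Hall', 'provinceOrState' => 'ON'],    // same name and zone
    ['venueId' => 3, 'name' => 'Blue Hall', 'provinceOrState' => 'QC'],    // different zone
];
$newVenues = filterNewVenues($uploaded, $existing);
```

The filtered array could then be passed to a single Venue::insert() call, as in the answer's code.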
I have a complicated filter for my hotels, and in the end I have a collection whose parent models I want to sort by a nested relationship. This is what I have:
public function resultFilter($from_date, $to_date, $bed_count, $city_id, $stars, $type_id, $hits, $price, $accommodation_name, $is_recommended, $start_price, $end_price, $has_discount, $facility_id)
{
// $data = QueryBuilder::for(Accommodation::class)
// ->allowedFilters(['city_id','grade_stars','accommodation_type_id'])
// ->allowedIncludes('gallery')
// ->when($bed_count, function ($q, $bed_count) {
// $q->with([
// 'accommodationRoomsLimited' => function ($q) use ($bed_count) {
// $q->where('bed_count', $bed_count);
// }
// ]);
// })
// ->paginate(10);
// ->get();
// ->orderBy('hits','DESC')->paginate(10);
$data = Accommodation::with(['city','accommodationFacilities', 'gallery', 'accommodationRoomsLimited.discount', 'accommodationRoomsLimited', 'accommodationRoomsLimited.roomPricingHistorySearch' => function ($query) use ($from_date, $to_date) {
$query->whereDate('from_date', '<=', $from_date);
$query->whereDate('to_date', '>=', $to_date);
}])->when($bed_count, function ($q, $bed_count) {
$q->whereHas('accommodationRoomsLimited', function($query) use ($bed_count) {
$query->where('bed_count', $bed_count);
});
})->when($accommodation_name, function ($query, $accommodation_name) {
$query->where('name', 'like', $accommodation_name);
})->when($is_recommended, function ($query,$is_recommended){
$query->where('is_recommended', $is_recommended);
})->when($start_price, function ($query, $start_price) {
$query->with([
'accommodationRoomsLimited.roomPricingHistorySearch' => function ($q) use ($start_price) {
$q->where('sales_price', '<', $start_price);
}
]);
})->when($has_discount, function ($query, $has_discount) {
$query->with([
'accommodationRoomsLimited' => function ($q) use ($has_discount) {
$q->has('discount');
}
]);
})
->whereIn('city_id', $city_id)
->whereIn('grade_stars', $stars)
->orWhere('accommodation_type_id', $type_id);
if ($hits) { // or == 'blabla'
$data = $data->orderBy('hits','DESC');
} elseif ($price) { // == A-Z or Z-A for order asc,desc
$f = $data->get();
foreach ($f as $datas) {
foreach ($datas->accommodationRoomsLimited as $g) {
dd($data);
$data = $data->accommodationRoomsLimited()->orderBy($g->roomPricingHistorySearch->sales_price);
}
}
}
$data = $data->paginate(10);
return $data;
}
As you can see in the code, I added the sales_price that I want to sort $data by when $price exists in the request. In short: I want to sort $data by sales_price in the query above.
NOTE: these filters may get more complicated, so any other best practice or better approach, such as Spatie Query Builder or local scopes, would be appreciated, although I tried both and they have their own limitations.
I've faced that problem before. I need to explain a little about eager loading first.
You can't order the parent query by an eager-loaded relation; you can only order the data after you fetch it. This is because eager loading splits what would be a join into separate queries for better performance. For example, say you query accomodation, which has a relation with city. The accomodation table has 1,000 records and the city table has 10,000 records. Let's say the maximum number of ids per eager-loading query is 250 and there are 780 unique city_id values in the accomodation table. Five queries will be generated.
$data = Accomodation::with('city')->get();
select * from accomodation
select * from city where id in [unique_id_1 - unique_id_250]
select * from city where id in [unique_id_251 - unique_id_500]
select * from city where id in [unique_id_501 - unique_id_750]
select * from city where id in [unique_id_751 - unique_id_780]
Laravel then builds the relation from the city query results. This avoids the N+1 problem without a join query, so it should be faster.
Now imagine you want to order accomodation by city.name using the with method in the query builder. Let's take the 3rd query as an example.
$data = Accomodation::with([
'city' => function($q) { return $q->orderBy('name'); },
])->get();
the query will be:
select * from city where id in [unique_id_251 - unique_id_500] order by name
The city results will be ordered, but Laravel will read them the same way: it creates the accomodation models first, then relates them to the city query results. So the order of the city rows won't affect the order of the accomodation rows.
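To see why ordering the related query does not reorder the parents, here is a plain-PHP imitation of the matching step. It is only a sketch: the arrays stand in for the two query results, the names and ids are made up, and Laravel's real implementation is more involved.

```php
<?php
// First "query": parents, in whatever order the database returned them.
$accommodations = [
    ['id' => 1, 'name' => 'Hotel A', 'city_id' => 30],
    ['id' => 2, 'name' => 'Hotel B', 'city_id' => 10],
    ['id' => 3, 'name' => 'Hotel C', 'city_id' => 20],
];

// Second "query": related rows, already ordered by name.
$cities = [
    ['id' => 10, 'name' => 'Amsterdam'],
    ['id' => 20, 'name' => 'Berlin'],
    ['id' => 30, 'name' => 'Cairo'],
];

// Index the related rows by id, then attach each one to its parent.
$citiesById = array_column($cities, null, 'id');

$result = [];
foreach ($accommodations as $row) {
    $row['city'] = $citiesById[$row['city_id']];
    $result[] = $row;
}

// The parents keep their original order; the city ordering is lost.
$parentOrder = array_column($result, 'name');
$cityOrder = array_map(fn ($r) => $r['city']['name'], $result);
```

Here $parentOrder is still Hotel A, Hotel B, Hotel C, while $cityOrder comes out as Cairo, Amsterdam, Berlin.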
So how do you order it? I found a couple of ways to achieve that.
Join query. This is the easiest way, but it will make the query slower. If your data isn't really big, the join query won't hurt your performance, and gaining maybe 0.003 seconds isn't really worth 8 hours of your time.
sortBy on the collection. You can sort with a collection method.
For example, if you want to order the accomodation by country.name through the city relation, this script will help you.
$data = Accomodation::with('city.country')->get();
$data = $data->sortBy(function($item) {
return $item->city->country->name;
});
Flatten the collection. This flattens the collection so the results look like a join query result, then sorts it. You can use the collection's map method. I believe all the filterable and searchable fields should be included in the mapped data.
$data->map(function($item) {
return [
'city_name' => $item->city->name,
...
all_searchable_data,
all_shareable_data,
...
];
})->sortBy('city_name');
Change the eager loading direction if possible. You can order by changing the base model. For example, use city instead of accomodation as the base to order by city.name:
$data = City::with('accomodation')->orderBy('name')->get();
Lastly, if your data rarely changes (for example every 2 hours), you might consider using a cache. You only need to invalidate the cache every 2 hours and create a new one. In my experience, a cache is always faster than querying the database when the data is big. You just need to know the interval or event at which to invalidate the cache.
Which one you choose is up to you. But please remember that processing bulk data with Laravel collections can be slower than querying the database, possibly because of PHP performance.
For me the best way is to use eager loading, then ->map() the result, then cache it. Why map before caching? Because selecting only some attributes reduces the cache size, which gains more performance. I would also say it produces more readable and beautiful code.
Bonus
This is how I do it.
$data = Cache::remember("accomodation", 10, function() {
$data = Accommodation::with([
'city',
...
])
->get();
return $data->map(function($item) {
return [
'city_name' => $item->city->name,
...
all_additional_data,
all_searchable_data,
all_shareable_data,
...
];
});
});
return $data->filter(function($item) use ($searchKey, $searchAnnotiation, $searchValue) {
switch ($searchAnnotiation) {
case '>':
return $item[$searchKey] > $searchValue;
break;
case '<':
return $item[$searchKey] < $searchValue;
break;
}
})->sortBy($sortKey)->paginate();
The cache saves the processed data, so the execution time needed is just fetching the data from the cache, filtering it, sorting it, and then paginating it. You can add further caching anywhere in that flow for faster results.
$data->paginate() works because I created a paginate macro for Collection.
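The macro itself is not shown in the answer. As a rough idea of what such a macro has to compute, here is a plain-PHP sketch that slices an already filtered and sorted array into one page. The function name paginateArray and the shape of the returned array are assumptions for illustration, not Laravel's paginator API.

```php
<?php
/**
 * Slice an in-memory list into a single page plus basic paging metadata.
 * Hypothetical helper; pages are 1-based.
 */
function paginateArray(array $items, int $perPage, int $page): array
{
    $total = count($items);
    $offset = max(0, ($page - 1) * $perPage);

    return [
        'data' => array_slice($items, $offset, $perPage),
        'total' => $total,
        'per_page' => $perPage,
        'current_page' => $page,
        'last_page' => max(1, (int) ceil($total / $perPage)),
    ];
}

// 25 items, 10 per page: page 2 holds items 11 to 20, and there are 3 pages.
$page2 = paginateArray(range(1, 25), 10, 2);
```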
I'm trying to make an artisan command in Laravel to remove all venues that have the same address and leave the one with the lowest ID number (so the first created).
For this I need to check 3 fields: 'street', 'house_number', 'house_number_addition'
This is how far I've got:
$venues = Venue::select('street', 'house_number', 'house_number_addition', DB::raw('COUNT(*) as count'))
->groupBy('street', 'house_number', 'house_number_addition')
->having('count', '>', 1)
->get();
foreach ($venues as $venue) {
$this->comment("Removing venue: {$venue->street} {$venue->house_number} {$venue->house_number_addition}");
$venue->delete();
}
Only the delete is not working, and it is not giving an error either.
To be able to delete an item, Eloquent needs to know its id. If you make sure your models' id is queried, you can call delete() without issues.
In your query, however, that won't work, because you have a GROUP BY clause and SQL doesn't allow you to select the ungrouped id column (see here).
The easiest solution here is to use Eloquent's Collection class to filter the models, something like:
$uniqueAddresses = [];
Venue::all()
->filter(function(Venue $venue) use (&$uniqueAddresses) {
$address = sprintf("%s.%s.%s",
$venue->street,
$venue->house_number,
$venue->house_number_addition);
if (in_array($address, $uniqueAddresses)) {
// address is a duplicate
return $venue;
}
$uniqueAddresses[] = $address;
})->map(function(Venue $venue) {
$venue->delete();
});
Or, to make your delete query a little more efficient (depending on how big your dataset is):
$uniqueAddresses = [];
$duplicates = [];
Venue::all()
->map(function(Venue $venue) use (&$uniqueAddresses, &$duplicates) {
$address = sprintf("%s.%s.%s",
$venue->street,
$venue->house_number,
$venue->house_number_addition);
if (in_array($address, $uniqueAddresses)) {
// address is a duplicate
$duplicates[] = $venue->id;
} else {
$uniqueAddresses[] = $address;
}
});
DB::table('venues')->whereIn('id', $duplicates)->delete();
Note: the last one will permanently delete your models; it doesn't work with Eloquent's SoftDeletes functionality.
You could, of course, also write a raw query to do all this.
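A variation on the collection approach above, sketched in plain PHP: sort by id first so the lowest id per address is the one that survives, and key the seen addresses in an associative array instead of scanning a list with in_array. The function name duplicateVenueIds is hypothetical, and the rows are plain arrays here rather than Eloquent models.

```php
<?php
/**
 * Return the ids to delete so that only the lowest id per
 * (street, house_number, house_number_addition) remains.
 * Hypothetical helper for illustration.
 */
function duplicateVenueIds(array $venues): array
{
    // Sort by id so the first row seen for an address is the oldest one.
    usort($venues, fn ($a, $b) => $a['id'] <=> $b['id']);

    $seen = [];
    $duplicates = [];
    foreach ($venues as $venue) {
        // json_encode keeps the three parts distinct, including null values.
        $key = json_encode([
            $venue['street'],
            $venue['house_number'],
            $venue['house_number_addition'],
        ]);
        if (isset($seen[$key])) {
            $duplicates[] = $venue['id'];
        } else {
            $seen[$key] = true;
        }
    }
    return $duplicates;
}

$venues = [
    ['id' => 7, 'street' => 'Main St', 'house_number' => 12, 'house_number_addition' => 'A'],
    ['id' => 3, 'street' => 'Main St', 'house_number' => 12, 'house_number_addition' => 'A'],
    ['id' => 5, 'street' => 'Main St', 'house_number' => 12, 'house_number_addition' => null],
    ['id' => 9, 'street' => 'Main St', 'house_number' => 12, 'house_number_addition' => 'A'],
];
$ids = duplicateVenueIds($venues);
```

In this sample, ids 7 and 9 are flagged while 3 and 5 are kept. The resulting ids could be passed to the whereIn('id', $duplicates)->delete() call shown above.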
My model looks like:
protected $appends = array('status');
public function getStatusAttribute()
{
if ($this->someattribute == 1) {
$status = 'Active';
} elseif ($this->someattribute == 2) {
$status = 'Canceled';
...
} else {
$status = 'Some antoher status';
}
return $status;
}
And I want to order a collection of these models by this status attribute. Is it possible?
Model::where(...)->orderBy(???)
p.s. I need exactly orderBy, not sortBy solution.
There is no way to make Eloquent do this because it only creates a SQL query. It does not have the ability to translate the PHP logic in your code into a SQL query.
A workaround is to loop over the results afterwards and manually check the appended fields. Collections may be useful here.
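As a minimal sketch of that workaround in plain PHP: compute the label for each fetched row and sort on it in memory. The statusLabel function is a hypothetical stand-in for the accessor in the question, and the rows are plain arrays rather than models.

```php
<?php
// Hypothetical stand-in for the accessor: maps the stored integer to its label.
function statusLabel(int $someattribute): string
{
    switch ($someattribute) {
        case 1:
            return 'Active';
        case 2:
            return 'Canceled';
        default:
            return 'Some other status';
    }
}

$rows = [
    ['id' => 1, 'someattribute' => 2],
    ['id' => 2, 'someattribute' => 3],
    ['id' => 3, 'someattribute' => 1],
];

// Sort by the computed label after the rows have been fetched.
usort($rows, fn ($a, $b) => strcmp(
    statusLabel($a['someattribute']),
    statusLabel($b['someattribute'])
));

$orderedIds = array_column($rows, 'id');
```

This sorts in PHP after the query has run, so it does not satisfy a strict orderBy requirement; it only illustrates the in-memory alternative.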
You can either use
Model::where(...)->orderBy('someattribute')->get();
in which case you will only get the integer value of someattribute, or you can build the status in the query as follows:
Model::select(DB::raw('(CASE WHEN someattribute = 1 THEN "Active" WHEN someattribute = 2 THEN "Canceled" ELSE "Some antoher status" END) AS status'))
->orderBy('someattribute', 'desc')->get();
I have two models, User and Training, with a many-to-many relationship between them. I'm using the Laravel Datatables package to display a table of all the users. This is what the data controller method (which retrieves the query results and creates a Datatables table) looks like:
public function getData()
{
$users = User::select(array('users.id', 'users.full_name', 'users.email', 'users.business_unit', 'users.position_id'))
->where('users.is_active', '=', 1);
return \Datatables::of($users)
->remove_column('id')
->make();
}
How can I add a column to the created table which displays the total number of relations for each user (that is, how many Trainings does each User have)?
The brute force way would be to try a User::selectRaw(...) which has a built in subquery to get the count of trainings for the user and expose it as a field.
However, there is a more built-in way to do this. You can eager load the relationship (to avoid the n+1 queries), and use the DataTables add_column method to add in the count. Assuming your relationship is named trainings:
public function getData() {
$users = User::with('trainings')->select(array('users.id', 'users.full_name', 'users.email', 'users.business_unit', 'users.position_id'))
->where('users.is_active', '=', 1);
return \Datatables::of($users)
->add_column('trainings', function($user) {
return $user->trainings->count();
})
->remove_column('id')
->make();
}
The name of the column in add_column should be the same name as the loaded relationship. If you use a different name for some reason, then you need to make sure to remove the relationship column so it is removed from the data array. For example:
return \Datatables::of($users)
->add_column('trainings_count', function($user) {
return $user->trainings->count();
})
->remove_column('id')
->remove_column('trainings')
->make();
Edit
Unfortunately, if you want to order on the count field, you will need the brute force method. The package does its ordering by calling ->orderBy() on the Builder object passed to the of() method, so the query itself needs the field on which to order.
However, even though you'll need to do some raw SQL, it can be made a little cleaner. You can add a model scope that will add in the count of the relations. For example, add the following method to your User model:
Note: the following function only works for hasOne/hasMany relationships. Please refer to Edit 2 below for an updated function to work on all relationships.
public function scopeSelectRelatedCount($query, $relationName, $fieldName = null)
{
$relation = $this->$relationName(); // ex: $this->trainings()
$related = $relation->getRelated(); // ex: Training
$parentKey = $relation->getQualifiedParentKeyName(); // ex: users.id
$relatedKey = $relation->getForeignKey(); // ex: trainings.user_id
$fieldName = $fieldName ?: $relationName; // ex: trainings
// build the query to get the count of the related records
// ex: select count(*) from trainings where trainings.user_id = users.id
$subQuery = $related->select(DB::raw('count(*)'))->whereRaw($relatedKey . ' = ' . $parentKey);
// build the select text to add to the query
// ex: (select count(*) from trainings where trainings.user_id = users.id) as trainings
$select = '(' . $subQuery->toSql() . ') as ' . $fieldName;
// add the select to the query
return $query->addSelect(DB::raw($select));
}
With that scope added to your User model, your getData function becomes:
public function getData() {
$users = User::select(array('users.id', 'users.full_name', 'users.email', 'users.business_unit', 'users.position_id'))
->selectRelatedCount('trainings')
->where('users.is_active', '=', 1);
return \Datatables::of($users)
->remove_column('id')
->make();
}
If you wanted the count field to have a different name, you can pass the name of the field in as the second parameter to the selectRelatedCount scope (e.g. selectRelatedCount('trainings', 'training_count')).
Edit 2
There are a couple issues with the scopeSelectRelatedCount() method described above.
First, the call to $relation->getQualifiedParentKeyName() will only work on hasOne/hasMany relations. This is the only relationship where that method is defined as public. All the other relationships define this method as protected. Therefore, using this scope with a relationship that is not hasOne/hasMany throws an Illuminate\Database\Query\Builder::getQualifiedParentKeyName() exception.
Second, the count SQL generated is not correct for all relationships. Again, it would work fine for hasOne/hasMany, but the manual SQL generated would not work at all for a many to many relationship (belongsToMany).
I did, however, find a solution to both issues. After looking through the relationship code to determine the reason for the exception, I found Laravel already provides a public method to generate the count SQL for a relationship: getRelationCountQuery(). The updated scope method that should work for all relationships is:
public function scopeSelectRelatedCount($query, $relationName, $fieldName = null)
{
$relation = $this->$relationName(); // ex: $this->trainings()
$related = $relation->getRelated(); // ex: Training
$fieldName = $fieldName ?: $relationName; // ex: trainings
// build the query to get the count of the related records
// ex: select count(*) from trainings where trainings.user_id = users.id
$subQuery = $relation->getRelationCountQuery($related->newQuery(), $query);
// build the select text to add to the query
// ex: (select count(*) from trainings where trainings.user_id = users.id) as trainings
$select = '(' . $subQuery->toSql() . ') as ' . $fieldName;
// add the select to the query
return $query->addSelect(DB::raw($select));
}
Edit 3
This update allows you to pass a closure to the scope that will modify the count subquery that is added to the select fields.
public function scopeSelectRelatedCount($query, $relationName, $fieldName = null, $callback = null)
{
$relation = $this->$relationName(); // ex: $this->trainings()
$related = $relation->getRelated(); // ex: Training
$fieldName = $fieldName ?: $relationName; // ex: trainings
// start a new query for the count statement
$countQuery = $related->newQuery();
// if a callback closure was given, call it with the count query and relationship
if ($callback instanceof \Closure) {
call_user_func($callback, $countQuery, $relation);
}
// build the query to get the count of the related records
// ex: select count(*) from trainings where trainings.user_id = users.id
$subQuery = $relation->getRelationCountQuery($countQuery, $query);
// build the select text to add to the query
// ex: (select count(*) from trainings where trainings.user_id = users.id) as trainings
$select = '(' . $subQuery->toSql() . ') as ' . $fieldName;
$queryBindings = $query->getBindings();
$countBindings = $countQuery->getBindings();
// if the new count query has parameter bindings, they need to be spliced
// into the existing query bindings in the correct spot
if (!empty($countBindings)) {
// if the current query has no bindings, just set the current bindings
// to the bindings for the count query
if (empty($queryBindings)) {
$queryBindings = $countBindings;
} else {
// the new count query bindings must be placed directly after any
// existing bindings for the select fields
$fields = implode(',', $query->getQuery()->columns);
$numFieldParams = 0;
// shortcut the regex if no ? at all in fields
if (strpos($fields, '?') !== false) {
// count the number of unquoted parameters (?) in the field list
$paramRegex = '/(?:(["\'])(?:\\\.|[^\1])*\1|\\\.|[^\?])+/';
$numFieldParams = preg_match_all($paramRegex, $fields) - 1;
}
// splice into the current query bindings the bindings needed for the count subquery
array_splice($queryBindings, $numFieldParams, 0, $countBindings);
}
}
// add the select to the query and update the bindings
return $query->addSelect(DB::raw($select))->setBindings($queryBindings);
}
With the updated scope, you can use the closure to modify the count query:
public function getData() {
$users = User::select(array('users.id', 'users.full_name', 'users.email', 'users.business_unit', 'users.position_id'))
->selectRelatedCount('trainings', 'trainings', function($query, $relation) {
return $query
->where($relation->getTable().'.is_creator', false)
->where($relation->getTable().'.is_speaker', false)
->where($relation->getTable().'.was_absent', false);
})
->where('users.is_active', '=', 1);
return \Datatables::of($users)
->remove_column('id')
->make();
}
Note: as of this writing, the bllim/laravel4-datatables-package datatables package has an issue with parameter bindings in subqueries in the select fields. The data will be returned correctly, but the counts will not ("Showing 0 to 0 of 0 entries"). I have detailed the issue here. The two options are to manually update the datatables package with the code provided in that issue, or to not use parameter binding inside the count subquery. Use whereRaw to avoid parameter binding.
I would set up your DB tables and Eloquent models using the conventions provided at http://laravel.com/docs/4.2/eloquent. In your example you would have three tables.
trainings
training_user
users
Your models would look something like this.
class Training extends Eloquent {
public function users() {
return $this->belongsToMany('User');
}
}
class User extends Eloquent {
public function trainings() {
return $this->belongsToMany('Training');
}
}
You can then use Eloquent to get a list of users and eager load their trainings.
// Get all users and eager load their trainings
$users = User::with('trainings')->get();
If you want to count the number of trainings per user you can simply iterate over $users and count the size of the trainings collection.
foreach ( $users as $v ) {
$numberOfTrainings = sizeof($v->trainings);
}
Or you can simply do it in pure SQL. Note that my example below assumes you follow Laravel's conventions for naming tables and columns.
SELECT
u.*, COUNT(p.user_id) AS number_of_trainings
FROM
users u
LEFT JOIN
training_user p ON u.id = p.user_id
GROUP BY
u.id
Now that you have a couple of ways to count the number of relations, you can use whatever method you like to store that value somewhere. Just remember that if you store that number as a value in the user table you'll need to update it every time a user creates/updates/deletes a training (and vice versa!).