I'm using a Google Analytics API Class in PHP made by Doug Tan to retrieve Analytics data from a specific profile.
Check the URL here: http://code.google.com/intl/nl/apis/analytics/docs/gdata/gdataArticlesCode.html
When you create a new instance of the class, you can pass in the profile ID, your Google account and password, a date range, and whatever dimensions and metrics you want to pull from Analytics.
For example, I want to see how many people visited my website from different countries in 2009.
// make a new instance of the class
$ga = new GoogleAnalytics($email, $password);
// website profile example id
$ga->setProfile('ga:4329539');
// date range
$ga->setDateRange('2010-02-01', '2010-03-08');
// array to receive data for the requested metrics and dimensions
$array = $ga->getReport(
    array(
        'dimensions' => 'ga:country',
        'metrics'    => 'ga:visits',
        'sort'       => '-ga:visits',
    )
);
Now that you know how this API class works, I'd like to address my problem: speed.
It takes a lot of time to retrieve multiple types of data from the Analytics database, especially if you're building different arrays with different metrics/dimensions. How can I speed up this process?
Is it possible to store all the possible data in a cache so I can retrieve it without loading it over and over again?
You can certainly cache the data; precisely how/where it is cached is entirely up to you. You can use anything from per-request caching (which will be pretty useless for this particular problem) to things like APC, memcached, a local database, or even just saving the raw results to files. None of these will make the actual retrieval of the data from Google any quicker, of course.
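For example, here is a minimal file-based cache sketch reusing the getReport() call from the question; the cache path and the one-hour TTL are arbitrary assumptions:
// Serve the cached report if it is fresh enough; otherwise refetch and store it.
$cacheFile = __DIR__ . '/cache/ga_country_report.json';
$ttl = 3600; // refresh at most once per hour
if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
    $array = json_decode(file_get_contents($cacheFile), true);
} else {
    $array = $ga->getReport(array(
        'dimensions' => 'ga:country',
        'metrics'    => 'ga:visits',
        'sort'       => '-ga:visits',
    ));
    file_put_contents($cacheFile, json_encode($array));
}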
On that note, it is likely (not having seen the code) that the requests to Google are being executed sequentially. It should be possible to extend the PHP class to request multiple sets of data from Google in parallel (e.g. with cURL Multi).
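A rough sketch of that parallel approach; $urls and $authToken are assumptions, since the class's internals aren't shown here:
// Fire all report requests at once and collect the responses as they finish.
$mh = curl_multi_init();
$handles = array();
foreach ($urls as $i => $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Authorization: GoogleLogin auth=' . $authToken));
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh); // wait for activity instead of busy-looping
} while ($running > 0);
$responses = array();
foreach ($handles as $i => $ch) {
    $responses[$i] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);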
I faced the same problem and decided to use a cron job that saves the data in a .json file I can then use for display.
// Classes from the google/analytics-data package (GA4 Data API)
use Google\Analytics\Data\V1beta\BetaAnalyticsDataClient;
use Google\Analytics\Data\V1beta\DateRange;
use Google\Analytics\Data\V1beta\Dimension;
use Google\Analytics\Data\V1beta\Metric;

// Assumes credentials are configured, e.g. via GOOGLE_APPLICATION_CREDENTIALS
$client = new BetaAnalyticsDataClient();

$globalTrendData = $client->runReport([
'property' => 'properties/' . $property_id,
'dateRanges' => [
new DateRange([
'start_date' => '20daysAgo',
'end_date' => 'yesterday',
]),
],
'dimensions' => [
new Dimension(['name' => 'pagePath',]),
new Dimension(['name' => 'pageTitle',]),
new Dimension(['name' => 'city',]),
new Dimension(['name' => 'sessionSource',]),
new Dimension(['name' => 'date',])
],
'metrics' => [
new Metric(['name' => 'screenPageViews',]),
new Metric(['name' => 'userEngagementDuration',]),
new Metric(['name' => 'activeUsers',]),
]
]);
// Flatten each row; the (array) casts wrap each scalar value as array(0 => value),
// which is why the JSON below has "0" keys.
foreach ($globalTrendData->getRows() as $key => $row) {
$saved['globalTrendData'][$key]['dimension']['pagePath'] = (array) $row->getDimensionValues()[0]->getValue() ;
$saved['globalTrendData'][$key]['dimension']['pageTitle'] = (array) $row->getDimensionValues()[1]->getValue() ;
$saved['globalTrendData'][$key]['dimension']['city'] = (array) $row->getDimensionValues()[2]->getValue() ;
$saved['globalTrendData'][$key]['dimension']['source'] = (array) $row->getDimensionValues()[3]->getValue() ;
$saved['globalTrendData'][$key]['dimension']['date'] = (array) $row->getDimensionValues()[4]->getValue() ;
$saved['globalTrendData'][$key]['metric']['screenPageViews'] = (array) $row->getMetricValues()[0]->getValue() ;
$saved['globalTrendData'][$key]['metric']['userEngagementDuration'] = (array) $row->getMetricValues()[1]->getValue() ;
$saved['globalTrendData'][$key]['metric']['activeUsers'] = (array) $row->getMetricValues()[2]->getValue() ;
}
file_put_contents($GLOBALS['serverPath'].'/monitoring/statistics.json',json_encode($saved, JSON_PRETTY_PRINT)) ;
JSON file output example:
"globalTrendData": {
"0": {
"dimension": {
"pagePath": {
"0": "\/modeles-maison\/liste"
},
"pageTitle": {
"0": "Plans de maisons 100% personnalisables - adapt\u00e9s \u00e0 votre style et \u00e0 votre budget"
},
"city": {
"0": "(not set)"
},
"source": {
"0": "(direct)"
},
"date": {
"0": "20220128"
}
},
"metric": {
"screenPageViews": {
"0": "18"
},
"userEngagementDuration": {
"0": "152"
},
"activeUsers": {
"0": "1"
}
}
}
}
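On the display side, the page then just reads the pre-built file instead of calling the API on every request; a minimal sketch, assuming the same path as above:
$stats = json_decode(file_get_contents($GLOBALS['serverPath'].'/monitoring/statistics.json'), true);
foreach ($stats['globalTrendData'] as $row) {
    echo $row['dimension']['pagePath'][0] . ': ' . $row['metric']['screenPageViews'][0] . " views\n";
}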
I have a collection like this
{
"name": "Sai Darshan"
}
{
"name": "Sathya"
}
{
"name": "Richie"
}
I want to match the documents with the names "Sathya" and "Richie".
How can I achieve this using $match?
This is what I currently tried:
$db = $this->dbMongo->selectDB("userData");
$collection = $db->selectCollection("userObject");
$aggregationFields = [
[
'$match' => [
'name'=> 'Sathya',
'name'=> 'Richie',
]
]
];
$cursor = $collection->aggregate($aggregationFields)->toArray();
Currently I am getting only the document
{
"name": "Richie"
}
I am expecting to fetch both documents i.e. the documents with the name "Sathya" and "Richie".
I expect to do this with $match itself because I have further pipelines I want to pass this data to.
Is there any way I can achieve this?
Any help is appreciated.
Thank you.
@nimrod serok answered in the comments: use the $in operator.
What is probably happening with the query in the description is that the driver is de-duplicating the name key, so the query that the database receives only includes the filter for 'name' => 'Richie'. You can see some reference to that here in the documentation, and JavaScript itself will demonstrate the same behavior:
> filter = { name: 'Sathya', name: 'Richie' };
{ name: 'Richie' }
>
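With $in, the pipeline from the question becomes (a sketch reusing the question's variable names):
$aggregationFields = [
    [
        '$match' => [
            'name' => ['$in' => ['Sathya', 'Richie']],
        ],
    ],
];
$cursor = $collection->aggregate($aggregationFields)->toArray();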
I am very new to MongoDB and PHP and I am trying to add an object to an already existing document.
The problem is that I see the $push operator being used a lot, but I can't seem to get it working in my own project.
My Document
{
"_id": {
"$oid": "622dfd21f8976876e162c303"
},
"name": "testuser",
"password": "1234"
}
Let's say this is a document inside my collection users. I want to add an object called domains, with some data inside it, like below:
"domains": {
"name": "example.com"
}
I have tried a bunch of different things, but they do not seem to work.
What I tried:
I have tried doing it like this:
$doc = array(
"domains" => array("name": "nu.nl")
);
$collection->insert($doc);
And like this:
$insertOneResult = $collection->updateOne(
{"name" : "testuser"},
'$push': {"domains": {"name": "example.com"}}
);
But they both do not work.
Can anyone help me with this problem or comment a link which would help me with this problem?
Thanks in advance!
You were really close!
db.users.update({
"name": "testuser"
},
{
"$push": {
"domains": {
"name": "example.com"
}
}
})
Try it on mongoplayground.net.
I haven't used PHP since the ancient GForge days, but my guess at using the PHP connector would be (corrections welcome!):
<?php
$collection = (new MongoDB\Client)->test->users;
$updateResult = $collection->updateOne(
['name' => 'testuser'],
['$push' => ['domains' => ['name' => 'example.com']]]
);
printf("Matched %d document(s)\n", $updateResult->getMatchedCount());
printf("Modified %d document(s)\n", $updateResult->getModifiedCount());
I implemented Elasticsearch using PHP for binary documents (fscrawler). It works just fine with the default settings: I can search the documents for the word I want, and the results are case insensitive. However, I now want to do exact matches on top of the current search, i.e. if the query is enclosed in quotes, I want only results that match the query exactly, even case-sensitively.
My mapping looks like this:
"settings": {
"number_of_shards": 1,
"index.mapping.total_fields.limit": 2000,
"analysis": {
"analyzer": {
"fscrawler_path": {
"tokenizer": "fscrawler_path"
}
},
"tokenizer": {
"fscrawler_path": {
"type": "path_hierarchy"
}
}
}
.
.
.
"content": {
"type": "text",
"index": true
},
My query for the documents looks like this:
if ($q2 == '') {
$params = [
'index' => 'trial2',
'body' => [
'query' => [
'match_phrase' => [
'content' => $q
]
]
]
];
$query = $client->search($params);
$data['q'] = $q;
}
For exact matches (does not work):
if ($q2 == '') {
$params = [
'index' => 'trial2',
'body' => [
'query' => [
'filter' =>[
'term' => [
'content' => $q
]
]
]
]
];
$query = $client->search($params);
$data['q'] = $q;
}
The content field is the body of the document. How do I implement exact matching for a specific word or phrase in the content field?
Your content field, from what I understand, would be significantly large, as many documents may be more than 2-3 MB, and that's a lot of words.
There'd be no point in using a keyword field for exact matching here, as per the answer to your earlier question where I referred to using keyword. You should use the keyword datatype for exact match only if your data is structured.
From what I understand, the content field you have is unstructured. In that case you would want to make use of the Whitespace Analyzer on your content field.
Also, for exact phrase matching you may take a look at the Match Phrase query.
Below are a sample index, documents, and queries that should cover your use case.
Mapping:
PUT mycontent_index
{
"mappings": {
"properties": {
"content":{
"type":"text",
"analyzer": "whitespace" <----- Note this
}
}
}
}
Sample Documents:
POST mycontent_index/_doc/1
{
"content": """
There is no pain you are receding
A distant ship smoke on the horizon
You are only coming through in waves
Your lips move but I can't hear what you're saying
"""
}
POST mycontent_index/_doc/2
{
"content": """
there is no pain you are receding
a distant ship smoke on the horizon
you are only coming through in waves
your lips move but I can't hear what you're saying
"""
}
Phrase Match:(To search a sentence with words in order)
POST mycontent_index/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": { <---- Note this for phrase match
"content": "There is no pain"
}
}
]
}
}
}
Match Query:
POST mycontent_index/_search
{
"query": {
"bool": {
"must": [
{
"match": { <---- Use this for token based search
"content": "there"
}
}
]
}
}
}
Note that your responses will differ accordingly.
For an exact match on a single word, just use a simple Match query.
Note that when you do not specify any analyzer, ES by default uses the Standard Analyzer, which converts all tokens to lower case before storing them in the inverted index. The Whitespace Analyzer does not lower-case tokens, so There and there are stored as two different tokens in your ES index.
I'm assuming you are aware of the Analysis and Analyzer concepts; if not, I'd suggest you go through the links, as that will help you follow what I'm describing.
Updated Answer:
Having understood your requirements: there is no way to apply multiple analyzers to a single field, so you basically have two options:
Option 1: Use multiple indexes
Option 2: Use a multi-field in your mapping, as shown below
Either way, your script or service layer would contain the logic for routing to a different index or field depending on the input value (queries enclosed in double quotes versus plain tokens); see the sketch after the mapping.
Multi Field Mapping:
PUT <your_index_name>
{
"mappings":{
"properties":{
"content":{
"type":"text", <--- Field with standard analyzer
"fields":{
"whitespace":{
"type":"text", <--- Field with whitespace
"analyzer":"whitespace"
}
}
}
}
}
}
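With the multi-field mapping, the PHP side could route quoted queries to the whitespace sub-field. A sketch; the quote-detection logic is my assumption, not part of the mapping above:
// Quoted input => case-sensitive phrase match on the whitespace sub-field;
// plain input => the standard-analyzed field, as before.
$isQuoted = preg_match('/^".*"$/', trim($q));
$field = $isQuoted ? 'content.whitespace' : 'content';

$params = [
    'index' => 'trial2',
    'body' => [
        'query' => [
            'match_phrase' => [
                $field => trim($q, '"')
            ]
        ]
    ]
];
$query = $client->search($params);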
Ideally, I would prefer the first solution, i.e. making use of multiple indexes with different mappings. However, I would strongly advise you to revisit your use case, because managing querying like this doesn't make much sense; but again, it's your call.
Note: A single-node cluster is the worst possible setup you can run, especially in production.
I'd suggest you ask about that in a separate question, detailing your document count, your expected growth over the next five years, whether your use case is read-heavy or write-intensive, and whether other teams may also want to leverage that cluster. I'd also suggest you read more and discuss with your team or manager to get more clarity on your scenarios.
Hope this helps.
I need to get the duration of all the videos in a YouTube playlist.
I know that the API does not return the duration of each video when listing all of them, but it does return it when the query is made for a particular video.
Using PHP, I collect all the IDs from the playlist and then query each ID to get the data for the videos. The script works correctly, but it is too slow. Is there any way to optimize it?
function youtube_automusic($listas, $api_key, $resultados){
$nresultados = $resultados;
$lista_reproduccion_random = $listas;
$lista_reproduccion = $lista_reproduccion_random[array_rand($lista_reproduccion_random)];
$url_playlist = "https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&fields=items(snippet(resourceId(videoId)))&type=video&videoCategoryId=10&maxResults=".$nresultados."&playlistId=".$lista_reproduccion."&key=".$api_key;
$data = dlPage($url_playlist); // dlPage() is a custom HTTP fetch helper defined elsewhere
$data_decode = json_decode($data, true);
$number_song = 1;
$respuesta = array();
foreach ($data_decode as $items){
foreach ($items as $item){
$lista_ids =$item['snippet']['resourceId']['videoId'];
$url_video = "https://www.googleapis.com/youtube/v3/videos?id=".$lista_ids."&part=snippet,contentDetails&fields=items(etag,id,snippet(publishedAt,title,thumbnails(default(url)),tags),contentDetails(duration))&key=".$api_key;
$data_video = dlPage($url_video);
$data_video_decode = json_decode($data_video, true);
foreach ($data_video_decode as $items_videos){
foreach ($items_videos as $item_video){
$data_final = array(
'etag' => $item_video['etag'],
'idvideo' => $item_video['id'],
'titulovideo' => $item_video['snippet']['title'],
'thumbnail' => $item_video['snippet']['thumbnails']['default']['url'],
'duracion' => $item_video['contentDetails']['duration'],
'videoplay' => $number_song++
);
array_push($respuesta, $data_final);
}
}
}
}
return json_encode($respuesta);
}
With your code, a 50-item playlist takes 51 API calls.
Instead of making a single videos request for each video in the playlist, get all the video IDs in the playlist first and then make videos requests for up to 50 at a time (the id parameter takes a comma-separated list of up to 50 IDs).
A 50-item playlist then only takes 2 API calls.
That should be much faster.
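A rough sketch of the batched approach, reusing the asker's dlPage() helper and URL parameters (untested, so treat it as a starting point):
// Collect all video IDs from the playlist response first...
$video_ids = array();
foreach ($data_decode['items'] as $item) {
    $video_ids[] = $item['snippet']['resourceId']['videoId'];
}

// ...then request details for up to 50 IDs per videos call.
$respuesta = array();
foreach (array_chunk($video_ids, 50) as $chunk) {
    $url_video = "https://www.googleapis.com/youtube/v3/videos?id=" . implode(',', $chunk)
        . "&part=snippet,contentDetails"
        . "&fields=items(etag,id,snippet(publishedAt,title,thumbnails(default(url)),tags),contentDetails(duration))"
        . "&key=" . $api_key;
    $data_video_decode = json_decode(dlPage($url_video), true);
    foreach ($data_video_decode['items'] as $item_video) {
        $respuesta[] = array(
            'idvideo'     => $item_video['id'],
            'titulovideo' => $item_video['snippet']['title'],
            'duracion'    => $item_video['contentDetails']['duration'],
        );
    }
}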
I just ran a test here
Request:
GET https://www.googleapis.com/youtube/v3/videos?part=snippet%2CcontentDetails&id=Ks-_Mh1QhMc&fields=items(etag%2Cid%2Csnippet(publishedAt%2Ctitle%2Cthumbnails(default(url))%2Ctags)%2CcontentDetails(duration))&key={YOUR_API_KEY}
Results:
{
"items": [
{
"etag": "\"RmznBCICv9YtgWaaa_nWDIH1_GM/aCBUdsaX0W34z3It8a8FCh5uteo\"",
"id": "Ks-_Mh1QhMc",
"snippet": {
"publishedAt": "2012-10-01T15:27:35.000Z",
"title": "Your body language may shape who you are | Amy Cuddy",
"thumbnails": {
"default": {
"url": "https://i.ytimg.com/vi/Ks-_Mh1QhMc/default.jpg"
}
},
"tags": [
"Amy Cuddy",
"TED",
"TEDTalk",
"TEDTalks",
"TED Talk",
"TED Talks",
"TEDGlobal",
"brain",
"business",
"psychology",
"self",
"success"
]
},
"contentDetails": {
"duration": "PT21M3S"
}
}
]
}
I suggest that you run the same request in the Google APIs Explorer with the video ID you are having an issue with, to verify that it's not a case of the API not returning the duration.
Background Information
I have the following data in my mongo database:
{ "_id" :
ObjectId("581c97b573df465d63af53ae"),
"ph" : "+17771111234",
"fax" : false,
"city" : "abd",
"department" : "",
"description" : "a test"
}
I am now writing a script that will loop through a CSV file that contains data that I need to append to the document. For example, the data might look like this:
+17771111234, 10:15, 12:15, test#yahoo.com
+17771111234, 1:00, 9:00, anothertest#yahoo.com
Ultimately I want to end up with a mongo document that looks like this:
{ "_id" :
ObjectId("581c97b573df465d63af53ae"),
"ph" : "+17771111234",
"fax" : false,
"city" : "abd",
"department" : "",
"description" : "a test",
"contact_locations": [
{
"stime": "10:15",
"etime": "12:15",
"email": "test#yahoo.com"
},
{
"stime": "1:00",
"etime": "9:00",
"email": "anothertest#yahoo.com"
},
]
}
Problem
The code I've written is actually creating new documents instead of appending to the existing ones. And actually, it's not even creating a new document per row in the CSV file... which I haven't debugged enough yet to really understand why.
Code
For each row in the csv file, I'm running the following logic
while(!$csv->eof() && ($row = $csv->fgetcsv()) && $row[0] !== null) {
//code that massages the $row into the way I need it to look.
$data_to_submit = array('contact_locations' => $row);
echo "proving that the record already exists...: <BR>";
$cursor = $contact_collection->find(array('phnum'=>$row[0]));
var_dump(iterator_to_array($cursor));
echo "now attempting to update it....<BR>";
// $cursor = $contact_collection->update(array('phnum'=>$row[0]), $data_to_submit, array('upsert'=>true));
$cursor = $contact_collection->insert(array('phnum'=>$row[0]), $data_to_submit);
echo "AFTER UPDATE <BR><BR>";
$cursor = $contact_collection->find(array('phnum'=>$row[0]));
var_dump(iterator_to_array($cursor));
}
Questions
Is there a way to "append" to documents? Or do I need to grab the existing document, save it as an array, merge my contact locations array with the main document, and then re-save?
How can I query to see if the "contact_locations" object already exists inside a document?
Hi, yes you can do it!
First you need to find your document and push the new value you need.
Use findAndModify and $addToSet:
$cursor = $contact_collection->findAndModify(
array("ph" => "+17771111234"),
array('$addToSet' =>
array(
"contact_locations" => array(
"stime"=> "10:15",
"etime"=> "12:15",
"email"=> "test#yahoo.com"
)
)
)
);
The best part is that $addToSet won't add the same value twice, so you will never end up with duplicate entries :)
Here are the docs: https://docs.mongodb.com/manual/reference/operator/update/addToSet/
I'm not sure of the exact syntax in PHP as I've never done it before, but I'm currently doing the same thing in JS with MongoDB, and $push is the operator you're looking for. Also, if I may be a bit nitpicky, I recommend changing $contact_collection to $contact_locations as a variable name; array variable names are usually plural, and being more descriptive is always better. Also make sure you first find the array in MongoDB that you want to append to, and that you use the MongoDB update command.