Member Feed in MongoDB - PHP

Which of the following two designs is better in MongoDB for a member_feed collection? The first stores one document per activity:
{
"member_id": "153442",
"recent_activity": {
"content_id": "196004",
"content_type_id": "10",
"social_network_id": "9"
},
"_id": ObjectId("5352958667fa3812548e65da"),
"_type": {
"0": "Altibbi_Mongo_MemberFeed"
}
}
Should I repeat this document with a different "recent_activity" object each time an activity happens in the social network, or build one document per member that holds an array of recent activity objects, like this?
{
"member_id": "153442",
"recent_activity": [
{
"content_id": "196004",
"content_type_id": "10",
"social_network_id": "9"
},
{
"content_id": "196005",
"content_type_id": "10",
"social_network_id": "9"
},
{
"content_id": "196004",
"content_type_id": "10",
"social_network_id": "9"
}
],
"_id": ObjectId("5352958667fa3812548e65da"),
"_type": {
"0": "Altibbi_Mongo_MemberFeed"
}
}
Which is better for I/O, inserting, updating and selecting? I am using beanstalkd for queuing.

If you expect a lot of activity records per user, store them in separate documents rather than in an array inside one document. Reason: there is a hard limit on document size (16 MB), and an array that keeps growing will eventually hit it.
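A minimal sketch of the one-document-per-activity layout, written in Python for brevity (the question uses PHP). The `created_at` field and both helper names are assumptions added for illustration; they are not part of the original schema.

```python
from datetime import datetime, timezone


def build_activity_document(member_id, activity, created_at=None):
    """Return one feed document for a single activity (one document per event)."""
    return {
        "member_id": member_id,
        "recent_activity": dict(activity),
        # Hypothetical field: lets the feed be sorted newest-first.
        "created_at": created_at or datetime.now(timezone.utc),
    }


def feed_query(member_id):
    """Return the filter that selects every activity document of one member."""
    return {"member_id": member_id}


doc = build_activity_document(
    "153442",
    {"content_id": "196004", "content_type_id": "10", "social_network_id": "9"},
)
# With a driver such as pymongo this would be used roughly as:
#   collection.insert_one(doc)
#   collection.find(feed_query("153442")).sort("created_at", -1).limit(20)
```

Each insert is then a small independent write, and reading a feed is a query on `member_id`, so an index on that field is the natural companion to this layout.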


PHP - Populating a multidimensional array with a sql result, specifying keys to aggregate similar results

I'm trying to build an array that will be passed as JSON to an Angular view.
The structure of the SQL result is roughly:
Order Number
Supplier
Qty
Date
OrderState
ProductA(including size)
GenericA(excluding size)
LibA
TypeA(Package,Batch,Product)
SizeA
SizeNameA
ProductB
GenericB
LibB
TypeB
SizeB
SizeNameB
QtyPerSizeB
ProductC
GenericC
LibC
TypeC
SizeC
SizeNameC
QtyPerSizeC
The query uses left joins, so the cost applies to the last non-null product.
If it's a package :
ProductA -> Package
ProductB -> Batch (Child of A)
ProductC -> Product (Child of B)
If it's a batch or a package with no batches :
ProductA -> Package/Batch
ProductB -> Product
ProductC -> null
If it's only products :
ProductA -> Product
ProductB/C -> null
So I need to have a hierarchy of some sort so I can loop through these objects.
Something similar to this :
{
"order": {
"infos": {},
"products": {
"name": "x",
"type": "Package",
"sizes": {
"1": {
"qty": "",
"price": ""
},
"2": {
"qty": "",
"price": ""
}
},
"children": {
"name": "x",
"type": "Batch",
"sizes": {
"1": {
"qty": "",
"price": ""
},
"2": {
"qty": "",
"price": ""
}
},
"children": {
"name": "x",
"type": "Batch",
"sizes": {
"1": {
"qty": "",
"price": ""
},
"2": {
"qty": "",
"price": ""
}
}
}
}
}
}
}
I tried setting each of them by looping through all the results, but because of the complexity the code takes too long to develop, modify and maintain.
So I was wondering whether there is a way to specify another array containing keys and instructions (something like "parent of", "sum of", "child of"), and then merge the SQL result into a multidimensional array that follows the structure above.
I look forward to reading your comments, you geniuses, and thanks for reading !
Raekh
Okay, no responses.
I ended up creating classes that describe the data that I want.
If you're in need of a similar structure, contact me and I'll share how I did it.
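For readers who want a starting point, here is a minimal sketch of a class-based approach, in Python rather than PHP. It is not the poster's actual code. It assumes each SQL row is a dict keyed by the column names listed in the question (`ProductA`, `TypeA`, `SizeA`, `Qty`, and so on) and that the row's `Qty` belongs to the last non-null product level. `ProductNode` and `build_tree` are hypothetical names.

```python
from dataclasses import dataclass, field

LEVELS = ("A", "B", "C")  # column suffixes, outermost product first


@dataclass
class ProductNode:
    name: str
    type: str
    sizes: dict = field(default_factory=dict)     # size -> {"qty": ...}
    children: dict = field(default_factory=dict)  # child name -> ProductNode

    def to_dict(self):
        return {
            "name": self.name,
            "type": self.type,
            "sizes": self.sizes,
            "children": [child.to_dict() for child in self.children.values()],
        }


def build_tree(rows):
    """Fold flat left-join rows into a nested product hierarchy."""
    roots = {}
    for row in rows:
        siblings = roots
        last = None
        for suffix in LEVELS:
            name = row.get("Product" + suffix)
            if name is None:
                break  # deeper levels are null for this row
            node = siblings.setdefault(name, ProductNode(name, row["Type" + suffix]))
            siblings = node.children
            last = (node, suffix)
        if last is not None:
            node, suffix = last
            # The quantity is attached to the last non-null level.
            node.sizes[row["Size" + suffix]] = {"qty": row["Qty"]}
    return [node.to_dict() for node in roots.values()]
```

Calling `json.dumps(build_tree(rows))` would then give a structure close to the one sketched in the question, with `children` as a list so that a package can hold several batches.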

How to filter distinct data in CloudSearch:AWS?

I have a question about AWS CloudSearch. I did the following:
1) Created a domain
2) Uploaded the data and indexed it
I have data fields such as user_id, user_name, user_details, etc.
My objective is to get the grouped/distinct values of a particular field along with their total counts. CloudSearch does not support GROUP BY or DISTINCT keywords, so I went through the CloudSearch documentation and did it by adding facet.user_id={} to my query string.
But I need the user_name field along with user_id and count. Please advise.
Here is my full query : ?q="Tamil Selvan"&facet.user_id={}
Here is my query result :
{
"status": {
"rid": "isTcmOYp+AEKhpbc",
"time-ms": 6
},
"hits": {
"found": 986,
"start": 0,
"hit": []
},
"facets": {
"user_id": {
"buckets": [{
"value": "5",
"count": 213
}, {
"value": "182",
"count": 197
}]
}
}
}
My expected result :
{
"status": {
"rid": "isTcmOYp+AEKhpbc",
"time-ms": 6
},
"hits": {
"found": 986,
"start": 0,
"hit": []
},
"facets": {
"user_id": {
"buckets": [{
"value": "5",
"user_name":"Tamil Selvan",
"count": 213
}, {
"value": "182",
"user_name":"Tamil Selvi",
"count": 197
}]
}
}
}
The proper solution would be to look up the user_names for the user_id facet values from your datastore (which CloudSearch is not, or at least should not be).
CloudSearch is a search solution; you shouldn't be trying to ask it which user_name belongs to some user_id, as that's a question for your data store.
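A minimal sketch of that lookup step, in Python. It assumes the facet buckets have already been parsed from the CloudSearch response and that `lookup_user_name` is a hypothetical callable backed by your own data store. A plain dict stands in for the data store here.

```python
def attach_user_names(buckets, lookup_user_name):
    """Return copies of the facet buckets with a user_name added to each."""
    return [
        {**bucket, "user_name": lookup_user_name(bucket["value"])}
        for bucket in buckets
    ]


# Stand-in for the real data store (hypothetical data taken from the question).
users = {"5": "Tamil Selvan", "182": "Tamil Selvi"}

buckets = [{"value": "5", "count": 213}, {"value": "182", "count": 197}]
enriched = attach_user_names(buckets, users.get)
```

In practice the lookup would usually be one batched query for all facet values rather than one call per bucket.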

How to update document in one collection when a document of other collection is changed or updated in MongoDB

I've created two collections "Users" and "Posts".
Users document structure is as follows:
{
"_id": {
"$oid": "54dde0e32a2a999c0f00002a"
},
"first_name": "Vamsi",
"last_name": "Krishna",
"email": "vamshi#test.com",
"password": "5f4dcc3b5aa765d61d8327deb882cf99",
"date_of_birth": "1999-01-05",
"gender": "male",
"status": "Active",
"date_created": "2015-02-13 12:32:50"
}
While posts document structure is:
{
"_id": {
"$oid": "54e1a2892a2a99d00500002b"
},
"post_description": "Test post 1",
"posted_by": {
"id": "54dde0e32a2a999c0f00002a",
"first_name": "Vamsi",
"last_name": "Krishna",
"gender": "male"
},
"posted_on": "2015-02-16 08:55:53",
"comments": [],
"likes": {
"count": 0,
"liked_by": []
}
}
My question is: when a user updates their information, the change should be reflected everywhere it appears, such as posted by, commented by and liked by. How can I achieve that?
I'm using PHP.
Thanks!
MongoDB has nothing like SQL's ON UPDATE CASCADE, so you have to do this in your application: whenever you update user information, also update every document in other collections that embeds a copy of that user.
As you might have guessed, this is very inefficient when there are a lot of such documents, which suggests the schema embeds too much. Store only the user's ID in each post and use it to look up the user in the users collection.
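If you do keep the embedded copy, the application-side fan-out could look roughly like this sketch (Python here, the same idea applies in PHP). `build_posted_by_update` is a hypothetical helper that only builds the filter and update documents; it relies on MongoDB's dot notation and the `$set` operator.

```python
def build_posted_by_update(user_id, changed_fields):
    """Return (filter, update) documents that refresh the author copy embedded in posts."""
    if not changed_fields:
        raise ValueError("changed_fields must not be empty")
    query = {"posted_by.id": user_id}
    update = {
        "$set": {f"posted_by.{name}": value for name, value in changed_fields.items()}
    }
    return query, update


query, update = build_posted_by_update(
    "54dde0e32a2a999c0f00002a", {"first_name": "Vamshi"}
)
# With a driver such as pymongo: posts.update_many(query, update)
```

Comments and likes would need their own, similar updates, which is exactly the cost the answer warns about.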

Which schema is better in web service API design

Our team is going to develop mobile applications (iPhone and Android) for our existing website, so that users can read our content more easily through the app.
But our team has different views on the JSON schema of the API responses. Sample responses are below.
Schema type 1:
{
"success": 1,
"response": {
"threads": [
{
"thread_id": 9999,
"title": "Topic haha",
"content": "blah blah blah",
"category": {
"category_id": 100,
"category_name": "Chat Room",
"category_permalink": "http://sample.com/category/100"
},
"user": {
"user_id": 1,
"name": "Hello World",
"email": "helloworld#hello.com",
"user_permalink": "http://sample.com/user/Hello_World"
},
"post_ts": "2012-12-01 18:16:00T0800"
},
{
"thread_id": 9998,
"title": "asdasdsad ",
"content": "dsfdsfdsfds dsfdsf ds",
"category": {
"category_id": 101,
"category_name": "Chat Room 2",
"category_permalink": "http://sample.com/category/101"
},
"user": {
"user_id": 2,
"name": "Hello baby",
"email": "hellobaby#hello.com",
"user_permalink": "http://sample.com/user/2"
},
"post_ts": "2012-12-01 18:15:00T0800"
}
]
}
}
Schema type 2:
{
"success": 1,
"response": {
"threads": [
{
"thread_id": 9999,
"title": "Topic haha",
"content": "blah blah blah",
"category": 100,
"user": 1,
"post_ts": "2012-12-01 18:16:00T0800"
},
{
"thread_id": 9998,
"title": "asdasdsad ",
"content": "dsfdsfdsfds dsfdsf ds",
"category": 101,
"user": 2,
"post_ts": "2012-12-01 18:15:00T0800"
}
],
"category": [
{
"category_id": 100,
"category_name": "Chat Room",
"category_permalink": "http://sample.com/category/100"
},
{
"category_id": 101,
"category_name": "Chat Room 2",
"category_permalink": "http://sample.com/category/101"
}
],
"user": [
{
"user_id": 1,
"name": "Hello World",
"email": "helloworld#hello.com",
"user_permalink": "http://sample.com/user/Hello_World"
},
{
"user_id": 2,
"name": "Hello baby",
"email": "hellobaby#hello.com",
"user_permalink": "http://sample.com/user/Hello_baby"
}
]
}
}
Some developers claim that schema type 2:
can reduce the data size when category and user entities are heavily duplicated; they say it reduces the plain-text response size by at least 20-40%
uses less memory when the response is parsed into JSON objects, because there is less data
lets category and user be stored in a hash map, which makes them easy to reuse
reduces the overhead of retrieving data
I have no idea whether schema type 2 is really an improvement. I have read a lot of API documentation and have never seen this type of schema design. To me, it looks like a relational database. I have no experience designing web service APIs, so I have a few questions:
Does it go against API design principles (easy to read, easy to use)?
Is it really faster, and does it use less memory, when parsed on iOS / Android?
Does it reduce the overhead between client and server?
Thank you.
When I build this kind of application for Android, I parse the JSON just once and put it in a database. Later I use a ContentProvider to access it. In your case you could use the 2nd schema but without the user and category parts, and lazy-load those instead. That is only a good solution if categories and users repeat often.
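To show what the client has to do with a schema type 2 response, here is a minimal sketch in Python (a mobile client would do the same in Java or Swift). `index_by` and `resolve_threads` are hypothetical helper names, and the input is assumed to be the already parsed `response` object from the example above.

```python
def index_by(items, key):
    """Build a hash map from a list of entities, keyed by the given id field."""
    return {item[key]: item for item in items}


def resolve_threads(response):
    """Replace the category and user ids in each thread with the full entities."""
    categories = index_by(response["category"], "category_id")
    users = index_by(response["user"], "user_id")
    return [
        {
            **thread,
            "category": categories[thread["category"]],
            "user": users[thread["user"]],
        }
        for thread in response["threads"]
    ]
```

The extra step is small, but every client has to implement it, which is the readability trade-off raised in the question.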

Split string into HTML entities

I'm trying to use PHP to create a JSON representation of a paragraph of text, keeping information about links/formatting etc.
Essentially, I want to convert this string:
"Hello <a href='www.google.com'>World!</a>. How are <b>you</b> today?"
Into these 7 JSON objects:
"1": {
"_id": "1",
"_type": "TEXT",
"value": "Hello "
},
"2": {
"_id": "2",
"_type": "TEXT",
"value": "World!",
"_attributes": {
"3": {
"_id": "3",
"_type": "LINK",
"src": "www.google.com"
}
}
},
"4": {
"_id": "4",
"_type": "TEXT",
"value": " How are "
},
"5": {
"_id": "5",
"_type": "TEXT",
"value": "you",
"_attributes": {
"6": {
"_id": "6",
"_type": "FORMATTING",
"bold": true
}
}
},
"7": {
"_id": "7",
"_type": "TEXT",
"value": " today?"
}
I've hunted the internet/google and found plenty about splitting HTML, but can't seem to describe what I want. I need to separate the plain text from the link/formatting and create a single entity for each.
The "FORMATTING" attribute just adds "bold"/"underline"/"subscript" etc fields as appropriate.
Nested tags will simply create multiple attributes for their text entity.
I don't yet know how I'd handle a 2-word hyperlink that has one word bolded... perhaps it'll have to have 2 hyperlink attributes.
Any help MUCH appreciated!!
A DOMDocument is what you need. If you can live with slightly different names, you barely have to do any work, too.
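The answer points to PHP's DOMDocument. As an illustration of the same idea, here is a minimal sketch that uses Python's standard `html.parser` instead. The tag-to-field mapping in `FORMATTING_TAGS` and the names `EntitySplitter` and `split_entities` are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed mapping from formatting tags to FORMATTING fields.
FORMATTING_TAGS = {"b": "bold", "u": "underline", "sub": "subscript"}


class EntitySplitter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.entities = []  # TEXT entities in document order
        self._open = []     # stack of (tag, attrs) currently open
        self._next_id = 1

    def _new_id(self):
        value = str(self._next_id)
        self._next_id += 1
        return value

    def handle_starttag(self, tag, attrs):
        self._open.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        # Close the most recent matching open tag; ignore stray closers.
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i]
                break

    def handle_data(self, data):
        entity = {"_id": self._new_id(), "_type": "TEXT", "value": data}
        attributes, formatting = {}, {}
        for tag, attrs in self._open:
            if tag == "a":
                link_id = self._new_id()
                attributes[link_id] = {
                    "_id": link_id, "_type": "LINK", "src": attrs.get("href"),
                }
            elif tag in FORMATTING_TAGS:
                formatting[FORMATTING_TAGS[tag]] = True
        if formatting:
            fmt_id = self._new_id()
            attributes[fmt_id] = {"_id": fmt_id, "_type": "FORMATTING", **formatting}
        if attributes:
            entity["_attributes"] = attributes
        self.entities.append(entity)


def split_entities(html):
    """Return the TEXT entities of an HTML fragment, keyed by id."""
    parser = EntitySplitter()
    parser.feed(html)
    parser.close()
    return {entity["_id"]: entity for entity in parser.entities}
```

Because every text node inherits all tags open around it, a two-word link with one bold word simply becomes two TEXT entities that each carry their own LINK attribute, which matches the guess in the question.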
