Parsing this json file using PHP - php

I have the following json file. I need code in php to parse it. I tried all possible ways but no luck. I am not an expert in json parsing.
{
"_id" : { "oid" : "5213785fe4b0780ba56884d3" },
"author" : "Remate.ph",
"message" : "Suspected ASG abducts trader in Zamboanga City http://t.co/4hUttNPI",
"time_created" : 1357958119000,
"version" : "v2.1",
},
{
"_id" : { "oid" : "5213785fe4b0780ba56884d6" },
"author" : "Clydepatrick Jayme ",
"message" : "RT #iFootballPlanet: 74' Osasuna 0 - 0 Real Madrid\n\n-ASG.",
"time_created" : 1358022721000,
"version" : "v2.1",
}
I modified the above json file in the following way. And I wrote the php parsing code for it and it works fine.
{
"info1":{
"_id" : { "oid" : "5213785fe4b0780ba56884d3" },
"author" : "Remate.ph",
"message" : "Suspected ASG abducts trader in Zamboanga City http://t.co/4hUttNPI",
"time_created" : 1357958119000,
"version" : "v2.1",
},
"info2":{
"_id" : { "oid" : "5213785fe4b0780ba56884d6" },
"author" : "Clydepatrick Jayme ",
"message" : "RT #iFootballPlanet: 74' Osasuna 0 - 0 Real Madrid\n\n-ASG.",
"time_created" : 1358022721000,
"version" : "v2.1",
}
}
I used the following code to parse it.
<?php
//$string=file_get_contents("/Users/Anirudh/Downloads/test.json");
$string=file_get_contents("/Users/Anirudh/Sem-1/RA/Test/test.json");
$json_a=json_decode($string,true);
$jsonIterator = new RecursiveIteratorIterator(new RecursiveArrayIterator($json_a),RecursiveIteratorIterator::SELF_FIRST);
//$jsonIterator = new RecursiveArrayIterator(json_decode($string, TRUE));
foreach ($jsonIterator as $key => $val) {
echo "$key => $val\n";
}
?>
Can someone help me in getting the first json format parsed using PHP ?

Your JSON file has two errors:
The two top-level items separated by a comma (the ones you named info1 and info2) must be elements of an array: wrap the whole string in brackets [ and ].
The last item of each object (version) must not be followed by a comma: change "version" : "v2.1", to "version" : "v2.1".
Corrected JSON:
[{
"_id" : { "oid" : "5213785fe4b0780ba56884d3" },
"author" : "Remate.ph",
"message" : "Suspected ASG abducts trader in Zamboanga City http://t.co/4hUttNPI",
"time_created" : 1357958119000,
"version" : "v2.1"
},
{
"_id" : { "oid" : "5213785fe4b0780ba56884d6" },
"author" : "Clydepatrick Jayme ",
"message" : "RT #iFootballPlanet: 74' Osasuna 0 - 0 Real Madrid\n\n-ASG.",
"time_created" : 1358022721000,
"version" : "v2.1"
}]
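With the corrected JSON, decoding could look roughly like the sketch below. The JSON is inlined as a string here only to keep the example self-contained (in the question it comes from file_get_contents), and the $records and $authors variable names are my own.

```php
<?php
// Corrected JSON from above, inlined so the sketch is self-contained.
$string = <<<'JSON'
[{
"_id" : { "oid" : "5213785fe4b0780ba56884d3" },
"author" : "Remate.ph",
"message" : "Suspected ASG abducts trader in Zamboanga City http://t.co/4hUttNPI",
"time_created" : 1357958119000,
"version" : "v2.1"
},
{
"_id" : { "oid" : "5213785fe4b0780ba56884d6" },
"author" : "Clydepatrick Jayme ",
"message" : "RT #iFootballPlanet: 74' Osasuna 0 - 0 Real Madrid\n\n-ASG.",
"time_created" : 1358022721000,
"version" : "v2.1"
}]
JSON;

// Decode into nested associative arrays.
$records = json_decode($string, true);

// json_decode() returns null on invalid input, so check for errors explicitly.
if ($records === null && json_last_error() !== JSON_ERROR_NONE) {
    exit('Invalid JSON: ' . json_last_error_msg() . "\n");
}

// The top level is now a list, so a plain foreach visits each record.
$authors = [];
foreach ($records as $record) {
    $authors[] = $record['author'];
    echo $record['_id']['oid'], ' => ', $record['author'], "\n";
}
```

Checking json_last_error() is what would have exposed the problem in the first place: the original file makes json_decode() return null instead of an array.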

Related

Reading data from an array with json

In a php script I have the following
$jsonurlpers = "https://api.openarch.nl/1.1/records/show.json?archive=nha&identifier=8554bfba-9fd9-4ca2-876c-feb7da095d6c";
$jsondatapers = file_get_contents($jsonurlpers);
That string I want to decode with the next command
$jsonpers = json_decode($jsondatapers ,true)[0];
But somewhere it did not work...
When I open the URL in $jsonurlpers I see:
Event
EventType "Overlijden"
EventDate
Year "1901"
Month "7"
Day "7"
EventPlace
Place "Egmond-Binnen"
RelationEP
Can someone tell me how I can read the value behind EventType into a variable? I tried different options but none works...
Thanks,
Fred
I did try this, e.g.
$itemevent = $jsonpers['Event'];
$itemeventtype = $itemevent['EventType'];
and I thought
$itemeventtype would have a value of "Overlijden", but it was empty.
The (json-)output from the url you gave looks like this:
{
"Event" : {
"EventType" : "Overlijden",
"EventDate" : {
"Month" : "7",
"Day" : "7",
"Year" : "1901"
},
"EventPlace" : {
"Place" : "Egmond-Binnen"
}
},
"RelationEP" : [
{
"RelationType" : "Overledene",
"EventKeyRef" : "Event1",
"PersonKeyRef" : "Person1"
},
{
"PersonKeyRef" : "Person2",
"RelationType" : "Vader",
"EventKeyRef" : "Event1"
},
{
"PersonKeyRef" : "Person3",
"RelationType" : "Moeder",
"EventKeyRef" : "Event1"
}
],
"Source" : {
"SourcePlace" : {
"Place" : "Egmond-Binnen / Egmond-Binnen"
},
"SourceAvailableScans" : {
"Scan" : {
"Uri" : "https://nha.blob.core.windows.net/scans/BS%20Overlijden/Egmond-Binnen/1901/RNH_O_EGB_1901_006-a.jpg",
"OrderSequenceNumber" : "1"
}
},
"SourceLastChangeDate" : "2015-07-31",
"SourceRemark" : {
"Value" : "Datadump ExportBS+Overlijden_20210302_102010.csv van NHA via e-mail"
},
"SourceReference" : {
"InstitutionName" : "Noord-Hollands Archief",
"DocumentNumber" : "14",
"Place" : "Haarlem"
},
"RecordGUID" : "{8554bfba-9fd9-4ca2-876c-feb7da095d6c}",
"SourceIndexDate" : {
"To" : "1901-12-31",
"From" : "1901-01-01"
},
"SourceDate" : {
"Year" : "1901",
"Month" : "7",
"Day" : "8"
},
"SourceType" : "BS Overlijden"
},
"Person" : [
{
"BirthPlace" : {
"Place" : "Egmond-Binnen"
},
"PersonName" : {
"PersonNameLastName" : "Baltus",
"PersonNameFirstName" : "Aafje"
},
"Gender" : "Vrouw",
"Age" : {
"PersonAgeYears" : "8 maanden"
}
},
{
"PersonName" : {
"PersonNameFirstName" : "Jan",
"PersonNameLastName" : "Baltus"
}
},
{
"PersonName" : {
"PersonNameLastName" : "Kuijper",
"PersonNameFirstName" : "Grietje"
}
}
]
}
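After "Person":, you see a [, which means there is an array of Persons.
The first person has a "BirthPlace"; for the other persons it is apparently unknown, because it is not present in the JSON.
In the output shown above, the top level is an object rather than a list, so decode it without the trailing [0]: $jsonpers = json_decode($jsondatapers, true);
print($jsonpers["Event"]["EventType"]); should then give: "Overlijden"
print($jsonpers["Person"][0]["PersonName"]["PersonNameFirstName"]); should give: "Aafje"
and
print($jsonpers["Person"][2]["PersonName"]["PersonNameFirstName"]); should give: "Grietje"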
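A minimal sketch of reading these values, assuming the response has the shape shown above. An abbreviated copy of that output is inlined as a string instead of fetching the URL, and the $names variable and the 'unknown' fallback are my own additions.

```php
<?php
// Abbreviated copy of the response shown above (only Event and Person kept).
$jsondatapers = <<<'JSON'
{
  "Event": {
    "EventType": "Overlijden",
    "EventDate": {"Month": "7", "Day": "7", "Year": "1901"},
    "EventPlace": {"Place": "Egmond-Binnen"}
  },
  "Person": [
    {"BirthPlace": {"Place": "Egmond-Binnen"},
     "PersonName": {"PersonNameLastName": "Baltus", "PersonNameFirstName": "Aafje"}},
    {"PersonName": {"PersonNameFirstName": "Jan", "PersonNameLastName": "Baltus"}},
    {"PersonName": {"PersonNameLastName": "Kuijper", "PersonNameFirstName": "Grietje"}}
  ]
}
JSON;

// The top level is an object, so there is no [0] after json_decode().
$jsonpers = json_decode($jsondatapers, true);

// "Event" is an object: its members are read by key.
$itemeventtype = $jsonpers['Event']['EventType'];
echo $itemeventtype, "\n";

// "Person" is an array: loop over it, or index it with [0], [1], ...
$names = [];
foreach ($jsonpers['Person'] as $person) {
    // BirthPlace is optional, so fall back to a placeholder when it is absent.
    $birthPlace = $person['BirthPlace']['Place'] ?? 'unknown';
    $names[] = $person['PersonName']['PersonNameFirstName'];
    echo $person['PersonName']['PersonNameFirstName'], ' (born: ', $birthPlace, ")\n";
}
```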

Parse json with PHP. Getting Undefined property: stdClass errors

I am having some issues parsing a json file from Jenkins using PHP
{
"actions" : [
{
"causes" : [
{
"shortDescription" : "Started by an SCM change"
}
]
},
{
},
{
},
{
"buildsByBranchName" : {
"origin/release_5.6.0" : {
"buildNumber" : 242,
"buildResult" : null,
"marked" : {
"SHA1" : "fde4cfd86b8511d328037b9e9c55876007bb6e67",
"branch" : [
{
"SHA1" : "fde4cfd86b8511d328037b9e9c55876007bb6e67",
"name" : "origin/release_5.6.0"
}
]
},
"revision" : {
"SHA1" : "fde4cfd86b8511d328037b9e9c55876007bb6e67",
"branch" : [
{
"SHA1" : "fde4cfd86b8511d328037b9e9c55876007bb6e67",
"name" : "origin/release_5.6.0"
}
]
}
},
"origin/release_5.7.0" : {
"buildNumber" : 315,
"buildResult" : null,
"marked" : {
"SHA1" : "ae2cbf69a25e0632e0f1d3eeb27a907b154efce0",
"branch" : [
{
"SHA1" : "ae2cbf69a25e0632e0f1d3eeb27a907b154efce0",
"name" : "origin/release_5.7.0"
}
]
},
"revision" : {
"SHA1" : "ae2cbf69a25e0632e0f1d3eeb27a907b154efce0",
"branch" : [
{
"SHA1" : "ae2cbf69a25e0632e0f1d3eeb27a907b154efce0",
"name" : "origin/release_5.7.0"
}
]
}
},
I have tried doing the following
//Read in JSON object
$json_file2 = file_get_contents('url.com/json');
//Decode JSON file
$test = json_decode($json_file2); //object
//print_r($json_file2);
echo $test->causes;
I am also trying to access the different sections in "buildsByBranchName". I have tried many different variations of the code above, but I keep getting "Undefined property: stdClass" errors.
You are not accessing that value properly: causes resides under actions, which is an array. Echoing causes directly won't work either, because causes is itself an array.
// This is an array so you can't use echo here.
$causes = $test->actions[0]->causes;
// echo out the shortDescription
echo $causes[0]->shortDescription;
or
echo $test->actions[0]->causes[0]->shortDescription;
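For the "buildsByBranchName" part of the question, a sketch along the following lines may help. It assumes the structure from the posted (truncated) sample, which is shortened and closed here so that it parses. The $buildNumbers and $single variables are my own names, and the index 3 only holds for this sample. Property names such as origin/release_5.6.0 are not valid PHP identifiers, so they need the ->{'...'} syntax.

```php
<?php
// Shortened version of the Jenkins output from the question.
$json_file2 = <<<'JSON'
{
  "actions": [
    {"causes": [{"shortDescription": "Started by an SCM change"}]},
    {},
    {},
    {"buildsByBranchName": {
      "origin/release_5.6.0": {"buildNumber": 242, "buildResult": null},
      "origin/release_5.7.0": {"buildNumber": 315, "buildResult": null}
    }}
  ]
}
JSON;

$test = json_decode($json_file2); // objects, as in the question

// Not every action has buildsByBranchName, so look for the one that does.
$buildNumbers = [];
foreach ($test->actions as $action) {
    if (!isset($action->buildsByBranchName)) {
        continue;
    }
    // Iterating a stdClass yields property name => value pairs.
    foreach ($action->buildsByBranchName as $branch => $build) {
        $buildNumbers[$branch] = $build->buildNumber;
        echo $branch, ': build ', $build->buildNumber, "\n";
    }
}

// Direct access to one branch: braces are needed because of the "/" and ".".
$single = $test->actions[3]->buildsByBranchName->{'origin/release_5.6.0'}->buildNumber;
```

Passing true as the second argument to json_decode() would avoid the brace syntax, since array keys can be arbitrary strings.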

Immense term error in elasticsearch

I'm working on a membership administration program, for which we want to use Elasticsearch as the search engine. At this point we're having problems with indexing certain fields, because they generate an 'immense term' error on the _all field.
Our settings:
curl -XGET 'http://localhost:9200/my_index?pretty=true'
{
"my_index" : {
"aliases" : { },
"mappings" : {
"Memberships" : {
"_all" : {
"analyzer" : "keylower"
},
"properties" : {
"Amount" : {
"type" : "float"
},
"Members" : {
"type" : "nested",
"properties" : {
"Startdate membership" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"Enddate membership" : {
"type" : "date",
"format" : "dateOptionalTime"
},
"Members" : {
"type" : "string",
"analyzer" : "keylower"
}
}
},
"Membership name" : {
"type" : "string",
"analyzer" : "keylower"
},
"Description" : {
"type" : "string",
"analyzer" : "keylower"
},
"elementId" : {
"type" : "integer"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1441310632366",
"number_of_shards" : "1",
"analysis" : {
"filter" : {
"my_char_filter" : {
"type" : "asciifolding",
"preserve_original" : "true"
}
},
"analyzer" : {
"keylower" : {
"filter" : [ "lowercase", "my_char_filter" ],
"tokenizer" : "keyword"
}
}
},
"number_of_replicas" : "1",
"version" : {
"created" : "1040599"
},
"uuid" : "nn16-9cTQ7Gn9NMBlFxHsw"
}
},
"warmers" : { }
}
}
We use the keylower analyzer because we don't want the full name to be split on whitespace: we want to be able to search for 'john johnson' in the _all field as well as in the 'Members' field.
The 'Members' field can contain multiple members, which is where the problems start. When the field only contains a couple of members (as in the example below), there is no problem. However, the field may contain hundreds or thousands of members, which is when we get the immense term error.
curl 'http://localhost:9200/my_index/_search?pretty=true&q=*:*'
{
"took":1,
"timed_out":false,
"_shards":{
"total":1,
"successful":1,
"failed":0
},
"hits":{
"total":1,
"max_score":1.0,
"hits":[
{
"_index":"my_index",
"_type":"Memberships",
"_id":"15",
"_score":1.0,
"_source":{
"elementId":[
"15"
],
"Membership name":[
"My membership"
],
"Amount":[
"100"
],
"Description":[
"This is the description."
],
"Members":[
{
"Members":"John Johnson",
"Startdate membership":"2015-01-09",
"Enddate membership":"2015-09-03"
},
{
"Members":"Pete Peterson",
"Startdate membership":"2015-09-09"
},
{
"Members":"Santa Claus",
"Startdate membership":"2015-09-16"
}
]
}
}
]
}
}
NOTE: The above example works! It's only when the field 'Members' contains (a lot) more members that we get the error. The error we get is:
"error":"IllegalArgumentException[Document contains at least one
immense term in field=\"_all\" (whose UTF8 encoding is longer than the
max length 32766), all of which were skipped. Please correct the
analyzer to not produce such terms. The prefix of the first immense
term is: '[...]...', original message: bytes can be at most 32766 in
length; got 106807]; nested: MaxBytesLengthExceededException[bytes can
be at most 32766 in length; got 106807]; " "status":500
We only get this error on the _all field, not on the original Members field. With ignore_above, it's no longer possible to search the _all field on full name. With the standard analyzer, I would find this document if I searched for 'Santa Johnson', because the _all field has the tokens 'Santa' and 'Johnson'. That's why I use keylower for these fields.
What I would like is an analyzer that tokenizes per field, but doesn't break up the values within the fields themselves. What happens now is that the entire 'Members' field is fed to _all as one token, including the child fields. So the token in the example above would be:
John Johnson 2015-01-09 2015-09-03 Pete Peterson 2015-09-09 Santa Claus 2015-09-16
Is it possible to tokenize these fields in such a way that every field is fed to _all as a separate token, but without breaking up the values in the fields themselves? So that the tokens would be:
John Johnson
2015-01-09
2015-09-03
Pete Peterson
2015-09-09
Santa Claus
2015-09-16
Note: We use the Elasticsearch php library.
There is a much better way of doing this. Whether or not a phrase search can span multiple field values is determined by position_offset_gap (in 2.0 it will be renamed to position_increment_gap). This parameter specifies how many positions are "inserted" between the last token of one field value and the first token of the next. In Elasticsearch prior to 2.0 this gap defaults to 0, which is what causes the issue you describe.
By combining the copy_to feature with an explicit position_offset_gap, you can create an alternative my_all field that does not have this issue. By naming this new field in the index.query.default_field setting, you tell Elasticsearch to use it instead of _all when a query specifies no fields.
curl -XDELETE "localhost:9200/test-idx?pretty"
curl -XPUT "localhost:9200/test-idx?pretty" -d '{
"settings" :{
"index": {
"number_of_shards": 1,
"number_of_replicas": 0,
"query.default_field": "my_all"
}
},
"mappings": {
"doc": {
"_all" : {
"enabled" : false
},
"properties": {
"Members" : {
"type" : "nested",
"properties" : {
"Startdate membership" : {
"type" : "date",
"format" : "dateOptionalTime",
"copy_to": "my_all"
},
"Enddate membership" : {
"type" : "date",
"format" : "dateOptionalTime",
"copy_to": "my_all"
},
"Members" : {
"type" : "string",
"analyzer" : "standard",
"copy_to": "my_all"
}
}
},
"my_all" : {
"type": "string",
"position_offset_gap": 256
}
}
}
}
}'
curl -XPUT "localhost:9200/test-idx/doc/1?pretty" -d '{
"Members": [{
"Members": "John Johnson",
"Startdate membership": "2015-01-09",
"Enddate membership": "2015-09-03"
}, {
"Members": "Pete Peterson",
"Startdate membership": "2015-09-09"
}, {
"Members": "Santa Claus",
"Startdate membership": "2015-09-16"
}]
}'
curl -XPOST "localhost:9200/test-idx/_refresh?pretty"
echo
echo "Should return one hit"
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"match_phrase" : {
"my_all" : "John Johnson"
}
}
}'
echo
echo "Should return one hit"
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"query_string" : {
"query" : "\"John Johnson\""
}
}
}'
echo
echo "Should return no hits"
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"match_phrase" : {
"my_all" : "Johnson 2015-01-09"
}
}
}'
echo
echo "Should return no hits"
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"query_string" : {
"query" : "\"Johnson 2015-01-09\""
}
}
}'
echo
echo "Should return no hits"
curl "localhost:9200/test-idx/doc/_search?pretty=true" -d '{
"query": {
"match_phrase" : {
"my_all" : "Johnson Pete"
}
}
}'
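To see why the gap changes the outcome of the phrase queries above, here is a toy model in plain PHP. It is not how Elasticsearch or Lucene is implemented, only an illustration of the idea: tokens get consecutive positions, the gap is added between values, and a phrase matches only when its tokens sit on consecutive positions. The functions indexValues and phraseMatches are hypothetical names invented for this sketch.

```php
<?php
// Toy model: split each value on whitespace and number the tokens,
// adding $gap extra positions between consecutive values.
function indexValues(array $values, int $gap): array
{
    $positions = []; // position => token
    $next = 0;
    foreach (array_values($values) as $i => $value) {
        if ($i > 0) {
            $next += $gap;
        }
        foreach (preg_split('/\s+/', strtolower(trim($value))) as $token) {
            $positions[$next++] = $token;
        }
    }
    return $positions;
}

// A phrase matches when its tokens occupy consecutive positions.
function phraseMatches(array $positions, string $phrase): bool
{
    $tokens = preg_split('/\s+/', strtolower(trim($phrase)));
    foreach (array_keys($positions) as $start) {
        $matched = true;
        foreach ($tokens as $offset => $wanted) {
            if (($positions[$start + $offset] ?? null) !== $wanted) {
                $matched = false;
                break;
            }
        }
        if ($matched) {
            return true;
        }
    }
    return false;
}

$values = ['John Johnson', 'Pete Peterson', 'Santa Claus'];
$noGap = indexValues($values, 0);
$withGap = indexValues($values, 256);

var_dump(phraseMatches($noGap, 'Johnson Pete'));   // bool(true): phrase spans two values
var_dump(phraseMatches($withGap, 'Johnson Pete')); // bool(false): the gap separates them
var_dump(phraseMatches($withGap, 'John Johnson')); // bool(true): still matches within a value
```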

Find a document with Doctrine ODM with equals condition on nested array of objects

I got this kind of document:
{
"_id" : ObjectId("54ad5c3b9a703a3c088b4567"),
"hard" : 750,
"coordinates" : {
"x" : 0.2388169910939489,
"y" : 0.7996551291084174
},
"indicator" : 500,
"networkIdList" : {
"networkIdData" : [
{
"networkId" : "abc123",
"type" : "SomeNetwork"
},
{
"networkId" : "123asdf",
"type" : "AnotherNetWork"
},
{
"networkId" : "abc123",
"type" : "OneMoreNetwork"
}
]
}
}
I need to perform a query to find the document that has "networkId" = "abc123" AND "type" = "SomeNetwork".
I have tried this instruction:
$this->documentManager->createQueryBuilder('Mydocument')
->field('networkIdList.networkIdData.$.networkGamingId')->equals('abc123')
->field('networkIdList.networkIdData.$.type')->equals('')
->getQuery()
->execute();
But the cursor returns no data.
I also tried
->where("function() {return this.networkIdList.networkIdData.$.networkGamingId == 'abc123'}")
but in this case I got an error that says the object $ has no properties.
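To find the document that has "networkId" = "abc123" AND "type" = "SomeNetwork", query the nested fields with plain dot notation, without the $ operator and using the field names as they appear in the document: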
$qb = $dm->createQueryBuilder('Foo')
->field('networkIdList.networkIdData.networkId')->equals('abc123')
->field('networkIdList.networkIdData.type')->equals('SomeNetwork');
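One caveat, as far as I understand MongoDB's matching rules: two separate dot-notation conditions on an array of embedded documents can be satisfied by different array elements, so a document with "abc123" in one element and "SomeNetwork" in another would also match. If both conditions must hold for the same element, Doctrine ODM's query builder offers elemMatch() for that. The sketch below does not call Doctrine at all. It only models the two behaviours on plain PHP arrays, and matchesAnyElement and matchesSameElement are hypothetical helper names.

```php
<?php
// Dot-notation style: each condition may be met by a different element.
function matchesAnyElement(array $elements, array $conditions): bool
{
    foreach ($conditions as $field => $value) {
        if (!in_array($value, array_column($elements, $field), true)) {
            return false;
        }
    }
    return true;
}

// elemMatch style: a single element must meet all conditions.
function matchesSameElement(array $elements, array $conditions): bool
{
    foreach ($elements as $element) {
        if (count(array_intersect_assoc($conditions, $element)) === count($conditions)) {
            return true;
        }
    }
    return false;
}

$conditions = ['networkId' => 'abc123', 'type' => 'SomeNetwork'];

// The array from the question: element 0 satisfies both conditions.
$fromQuestion = [
    ['networkId' => 'abc123', 'type' => 'SomeNetwork'],
    ['networkId' => '123asdf', 'type' => 'AnotherNetWork'],
    ['networkId' => 'abc123', 'type' => 'OneMoreNetwork'],
];

// A made-up array where the two values sit in different elements.
$mixed = [
    ['networkId' => 'abc123', 'type' => 'OneMoreNetwork'],
    ['networkId' => '123asdf', 'type' => 'SomeNetwork'],
];

var_dump(matchesAnyElement($fromQuestion, $conditions));  // bool(true)
var_dump(matchesSameElement($fromQuestion, $conditions)); // bool(true)
var_dump(matchesAnyElement($mixed, $conditions));         // bool(true): probably not wanted
var_dump(matchesSameElement($mixed, $conditions));        // bool(false)

// With Doctrine ODM the same-element variant would presumably be written as:
// $qb->field('networkIdList.networkIdData')->elemMatch(
//     $qb->expr()->field('networkId')->equals('abc123')->field('type')->equals('SomeNetwork')
// );
```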

Parsing JSON results

I understand how to parse JSON with PHP, however I don't understand how to read it by eye. Can someone please help me understand this?
Here is my code
<?php
$json = file_get_contents('json.txt');
$json_output = json_decode($json);
foreach ( $json_output->query as $stf )
{
echo "{$stf->response->domains->name}\n";
}
?>
Here is a sample of the json result
{ "query" : { "host" : "test.com",
"tool" : "pro"
},
"response" : { "domain_count" : "13",
"domains" : [ { "last_resolved" : "2012-01-11",
"name" : "test1.com"
},
{ "last_resolved" : "2012-01-11",
"name" : "test2.com"
},
As you can see I tried query->response->domains->name and it didn't work.
How would I read name?
Thank you in advance
response is a sibling of query, not a child of it, and response->domains is an indexed array, so you need to pick an index, say [0], and then get the ->name from that.
echo $json_output->response->domains[0]->name."\n";
Or loop over all of the domains:
foreach ( $json_output->response->domains as $domain )
{
echo $domain->name."\n";
}
Study this http://json.org/
If you're trying to read it by eye, it might help to reformat:
{
"query" : {
"host" : "test.com",
"tool" : "pro"
},
"response" : {
"domain_count" : "13",
"domains" : [{
"last_resolved" : "2012-01-11",
"name" : "test1.com"
},{
"last_resolved" : "2012-01-11",
"name" : "test2.com"
}]
}
}
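Reading the reformatted structure top-down maps directly onto the PHP access path: each { is an object (use ->), each [ is an array (loop or index). A small sketch, using the reformatted sample above as inline input. The question's sample was truncated, so only the two domains shown are included, and the $names variable is my own.

```php
<?php
// The reformatted sample from above, inlined as a string.
$json = <<<'JSON'
{
  "query" : {
    "host" : "test.com",
    "tool" : "pro"
  },
  "response" : {
    "domain_count" : "13",
    "domains" : [{
      "last_resolved" : "2012-01-11",
      "name" : "test1.com"
    },{
      "last_resolved" : "2012-01-11",
      "name" : "test2.com"
    }]
  }
}
JSON;

$json_output = json_decode($json);

// "query" and "response" are both members of the top-level object.
echo 'Host: ', $json_output->query->host, "\n";

// "domains" is an array inside "response": iterate to reach each name.
$names = [];
foreach ($json_output->response->domains as $domain) {
    $names[] = $domain->name;
    echo $domain->name, ' (last resolved ', $domain->last_resolved, ")\n";
}
```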
