How to create csv file using raw text data

I am quite confused about how to create a CSV file and insert its data into a database.
Suppose I have the text data below. The full set has 45,000 records; I am posting a few of them here.
Winged Wheels in France, by Michael Myers Shoemaker 45790
A Battle Fought on Snow Shoes, by Mary Cochrane Rogers 45789
The German Classics of the Nineteenth and Twentieth Centuries, 45788
Volume 11, by Friedrich Spielhagen, Theodor Storm,
Wilhelm Raabe, Marion D. Learned and Ewald Eiserhardt
[Subtitle: Masterpieces of German Literature
Translated Into English]
Zofloya ou le Maure, Tomes 1-4, by Charlotte Dacre 45787
[Subtitle: Histoire du XVe siècle]
[Language: French]
Their Majesties as I Knew Them, by Xavier Paoli 45786
[Subtitle: Personal Reminiscences of the
Kings and Queens of Europe]
[Translator: Alexander Teixeira de Mattos]
New York Times Current History: The European War, Vol. 8, 45785
Pt. 2, No. 1, July 1918, by Various
Gallery of Comicalities, by Robert Cruikshank, 45784
George Cruikshank and Robert Seymour
[Subtitle: Embracing Humorous Sketches]
Katri, by Emil Nervander 45783
[Subtitle: Kertomus 17 vuosi-sadasta]
[Language: Finnish]
The Little Brown Jug at Kildare, by Meredith Nicholson 45782
[Illustrator: James Montgomery Flagg]
Beaumont & Fletcher's Works (6 through 10), by Francis Beaumont 45781
and John Fletcher
[Subtitle: The Queen of Corinth; Bonduca; The Knight of the
Burning Pestle; Loves Pilgrimage; The Double Marriage]
Beaumont & Fletcher's Works (1 through 5), by Francis Beaumont 45780
and John Fletcher
[Subtitle: A Wife for a Month; The Lovers Progress;
The Pilgrim; The Captain; The Prophetess]
The Washington Historical Quarterly, Volume V, 1914, by Various 45779
[Editor: Edmond S. Meany]
Minstrelsy of the Scottish Border Volume III of 3, by Walter Scott 45778
[Subtitle: Consisting of Historical and Romantic Ballads,
Collected In the Southern Counties of Scotland; With
a Few Of Modern Date, Founded Upon Local Tradition.
In Three Volumes. Vol. III]
What I want is simply to put "Winged Wheels in France, by Michael Myers Shoemaker" in one column of the CSV and 45790 in another. Then I will be able to add them to my database.
Moreover, for an entry such as:
The German Classics of the Nineteenth and Twentieth Centuries,
Volume 11, by Friedrich Spielhagen, Theodor Storm,
Wilhelm Raabe, Marion D. Learned and Ewald Eiserhardt
[Subtitle: Masterpieces of German Literature
Translated Into English]
I want to insert the above text this way:
The German Classics of the Nineteenth and Twentieth Centuries,
Volume 11, by Friedrich Spielhagen, Theodor Storm,
Wilhelm Raabe, Marion D. Learned and Ewald Eiserhardt
that is, without this portion:
[Subtitle: Masterpieces of German Literature
Translated Into English]
The ", by" should also be omitted, so I actually need three columns in the CSV, and my new data would look like this:
1 | Winged Wheels in France | Michael Myers Shoemaker | 45790
2 | The German Classics of the Nineteenth and Twentieth Centuries,
Volume 11 | Friedrich Spielhagen, Theodor Storm,
Wilhelm Raabe, Marion D. Learned and Ewald Eiserhardt | 45788
Please help me get this into an Excel file and create a CSV from it.
Thank you all.

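Try something like this
(not tested)
// Strip the [Subtitle: ...] / [Language: ...] blocks; the s flag lets one block span several lines.
$f = file_get_contents('yourtextfile.txt');
$f = preg_replace('/\[(.*?)\]/s', '', $f);
file_put_contents('temp.txt', $f);
// Read the cleaned file back line by line, skipping empty lines.
$file = file('temp.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// A prepared statement keeps quotes in titles (e.g. "Fletcher's") from breaking the query.
$stmt = mysqli_prepare($connection, 'INSERT INTO table1 (column1) VALUES (?)');
foreach ($file as $line) {
    if (trim($line) === '') {
        continue;
    }
    mysqli_stmt_bind_param($stmt, 's', $line);
    mysqli_stmt_execute($stmt);
}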
edit
$f = file_get_contents('tt.txt');
$f = preg_replace('/\[(.*?)\]/s','',$f);
$keywords = preg_split("/[ ]{15}/", $f);
print_r(array_filter($keywords));
I'm still not clear whether those numbers are actually in your file or whether you have just mentioned them here.
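If the numbers are in the file, the three-column layout asked for in the question could be sketched as follows. This is only a sketch under assumptions: every record starts on a line that ends with whitespace followed by the number, every other line is a continuation of the previous record, and the first ", by " separates title from author. The names parse_catalog and finish_record are made up for this example.

```php
<?php
// Hypothetical helper: split the joined record text into [title, author, number].
function finish_record(array $rec): array
{
    // Assumption: the first ", by " separates title and author.
    $parts = explode(', by ', $rec['text'], 2);
    return [trim($parts[0]), trim($parts[1] ?? ''), $rec['number']];
}

// Hypothetical helper: turn the raw listing into rows of three columns.
function parse_catalog(string $text): array
{
    // Drop [Subtitle: ...], [Language: ...] and similar blocks, even across lines.
    $text = preg_replace('/\[.*?\]/s', '', $text);
    $records = [];
    $current = null;
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (preg_match('/^(.*\S)\s+(\d+)$/', $line, $m)) {
            // A line ending in a number starts a new record.
            if ($current !== null) {
                $records[] = finish_record($current);
            }
            $current = ['text' => $m[1], 'number' => $m[2]];
        } elseif ($current !== null) {
            // Anything else continues the current record.
            $current['text'] .= ' ' . $line;
        }
    }
    if ($current !== null) {
        $records[] = finish_record($current);
    }
    return $records;
}

$sample = "Winged Wheels in France, by Michael Myers Shoemaker 45790\n"
    . "The German Classics of the Nineteenth and Twentieth Centuries, 45788\n"
    . " Volume 11, by Friedrich Spielhagen, Theodor Storm,\n"
    . " Wilhelm Raabe, Marion D. Learned and Ewald Eiserhardt\n"
    . " [Subtitle: Masterpieces of German Literature\n"
    . " Translated Into English]\n";

// Write the rows as CSV; swap php://output for a file path to create a file.
$out = fopen('php://output', 'w');
foreach (parse_catalog($sample) as $row) {
    fputcsv($out, $row, ',', '"', '\\');
}
fclose($out);
```

A continuation line that happens to end in a number would be mistaken for a new record, so it is worth spot-checking the output against the source before loading it into the database.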

Related

web scraping : how would you detect new items in a list?

I'm working on some PHP code that would grab a music playlist from a remote radio page - which means it is continuously updated.
I would like to store the tracks history in my database.
My problem is that I need to detect when new entries have been added to the remote tracklist, knowing that :
I don't know how often the remote page will be updated
I don't know how many tracks are displayed on the remote page. Sometimes it will be a single track, sometimes it will be a few dozen.
The same track can show up several times.
For example, I will get this data when grabbing the page for the first time :
Dead Combo — Esse Olhar Que Era Só Teu
Myron & E — If I Gave You My Love
Hooverphonic — Badaboum
Alain Chamfort — Bambou - Pilooski / Jayvich Reprise
William Onyeabor — Atomic Bomb
Curtis Mayfield — Move on up - Extended version
Mos Def — Ms. Fat Booty
Nicki Minaj — Feeling Myself
Disclosure — You & Me (Flume remix)
Otis Redding — My Girl - Remastered Mono
Then on the second time I'll get :
Charles Aznavour — Emmenez moi
Mos Def — Ms. Fat Booty
Rag'n'Bone Man — Human
Bernard Lavilliers — Idées noires
Julien Clerc — Ma préférence
The Rolling Stones — Just Your Fool
Dead Combo — Esse Olhar Que Era Só Teu
Myron & E — If I Gave You My Love
Hooverphonic — Badaboum
Alain Chamfort — Bambou - Pilooski / Jayvich Reprise
As you can see, the second time, entries 7->10 seem to be the same as entries 1->4 of the first list (so entries 1->6 are the new ones); track #2 was already played in the first list but seems to have been replayed since.
The new entries here would be :
Charles Aznavour — Emmenez moi
Mos Def — Ms. Fat Booty
Rag'n'Bone Man — Human
Bernard Lavilliers — Idées noires
Julien Clerc — Ma préférence
The Rolling Stones — Just Your Fool
I store tracks entries in a table, and tracks history in another one.
Structure of the tracks table
| ID | artist | title | album |
--------------------------------------------------
| 12 | Mos Def | Ms. Fat Booty | |
Structure of the tracks history table
| ID | track ID | time |
------------------------------------------
| 24 | 12 | 2016-07-03 13:40:26 |
Have you got any ideas on how I could handle this ?
Thanks !

I think you're trying to find the items at the end of the second list that match those at the beginning of the first?
If you can store both lists in an array (the old list in $previous and the new list in $current), this function should help:
function find_old_tracks($previous, $current)
{
for ($i = 0; $i < count($current); $i++)
{
if (isset($previous[$i]) && $previous[$i] == $current[$i]) continue;
return find_old_tracks($previous, array_slice($current, $i + 1));
}
return array_slice($previous, 0, $i);
}
It scans through $current for contiguous matches to $previous, recursing on the remainder every time it finds a mismatch. When I run this:
$previous = array(
'Dead Combo — Esse Olhar Que Era Só Teu',
'Myron & E — If I Gave You My Love',
'Hooverphonic — Badaboum',
'Alain Chamfort — Bambou - Pilooski / Jayvich Reprise',
'William Onyeabor — Atomic Bomb',
'Curtis Mayfield — Move on up - Extended version',
'Mos Def — Ms. Fat Booty',
'Nicki Minaj — Feeling Myself',
'Disclosure — You & Me (Flume remix)',
'Otis Redding — My Girl - Remastered Mono'
);
$current = array(
'Charles Aznavour — Emmenez moi',
'Mos Def — Ms. Fat Booty',
'Rag Bone Man — Human',
'Bernard Lavilliers — Idées noires',
'Julien Clerc — Ma préférence',
'The Rolling Stones — Just Your Fool',
'Dead Combo — Esse Olhar Que Era Só Teu',
'Myron & E — If I Gave You My Love',
'Hooverphonic — Badaboum',
'Alain Chamfort — Bambou - Pilooski / Jayvich Reprise'
);
$old_tracks = find_old_tracks($previous, $current);
$new_tracks = array_slice($current, 0, count($current) - count($old_tracks));
print "NEW TRACKS: " . implode('; ', $new_tracks);
print "<br /><br />OLD TRACKS: " . implode('; ', $old_tracks);
my output is:
NEW TRACKS: Charles Aznavour — Emmenez moi; Mos Def — Ms. Fat Booty;
Rag Bone Man — Human; Bernard Lavilliers — Idées noires; Julien Clerc
— Ma préférence; The Rolling Stones — Just Your Fool
OLD TRACKS: Dead Combo — Esse Olhar Que Era Só Teu; Myron & E — If I
Gave You My Love; Hooverphonic — Badaboum; Alain Chamfort — Bambou -
Pilooski / Jayvich Reprise
You can do what you like with that info on the database end.
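One edge case worth noting: when a partial match is followed by a mismatch, the recursion above skips ahead and can miss an overlap that starts inside the part it skipped. A different technique, shown here only as a sketch, is to look for the longest tail of the new list that equals the head of the old list. The function name split_new_tracks and the sample values are made up for the example.

```php
<?php
// Hypothetical helper: returns [new tracks, tracks already seen].
// It finds the longest tail of $current that equals the head of $previous.
function split_new_tracks(array $previous, array $current): array
{
    $max = min(count($previous), count($current));
    for ($len = $max; $len > 0; $len--) {
        $tail = array_slice($current, -$len);
        $head = array_slice($previous, 0, $len);
        if ($tail === $head) {
            return [array_slice($current, 0, count($current) - $len), $tail];
        }
    }
    // No overlap at all: treat every entry as new.
    return [$current, []];
}

$previous = ['Track A', 'Track B', 'Track C'];
$current  = ['Track X', 'Track B', 'Track A', 'Track B'];

[$new, $old] = split_new_tracks($previous, $current);
print "NEW TRACKS: " . implode('; ', $new) . PHP_EOL;
print "OLD TRACKS: " . implode('; ', $old) . PHP_EOL;
```

With repeated tracks the overlap can still be ambiguous (a replayed track at the boundary looks the same as an old one), so storing the grab time alongside each history row helps when checking results by hand.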

Regex to isolate specific word in body of text until delimiter

From the output below I would like to isolate the substring that starts with the word "industry" (in any case) and runs up to the next delimiter, typically "|". I get $output from an API, so its contents always differ, but the general shape is something like: blah blah blah |industry = industry info| blah blah blah. If the word industry exists in the output, I would just like to get industry = industry info. Is there a generic regex that can do this? The specific output I get back is:
<?php
$output = '{{other uses|UBS (disambiguation)}} {{Use dmy dates|date=April
2015}} {{Infobox company |name = UBS Group AG |logo = [[File:UBS
Logo.svg|200px|UBS Group AG Logo]] |type = [[Aktiengesellschaft]]
([[Aktiengesellschaft|AG]])
[[Public company]] |traded_as = {{SWX|UBSG}} {{SWX|UBSN}}
{{nyse|UBS}} |foundation=1854 |predecessor = [[Union Bank of
Switzerland]] and [[Swiss Bank Corporation]] merged in 1998;
[[PaineWebber]] merged in 2000 |location = [[Zürich]]
[[Basel]] |key_people = [[Axel A. Weber]] (Chairman){{br}}[[Sergio
Ermotti]] (CEO) {{br}} |area_served = Worldwide |industry =[[Banking]],
[[Financial services]] |products = [[Investment Banking]]
[[Investment Management]] [[Wealth Management]] [[Private Banking]]
[[Commercial Bank|Corporate Banking]]
[[Private Equity]]
[[Finance and Insurance]]
[[Retail Banking|Consumer Banking]]
[[Mortgage loans|Mortgages]]
[[Credit Cards]] |revenue = {{Increase}} [[Swiss franc|CHF]]28.027
billion (2014) |operating_income = {{Decrease}} CHF2.461 billion (2014)
{{cite web|title=UBS Annual Report
2014|url=http://www.ubs.com/global/en/about_ubs/
investor_relations/annualreporting/2014/_jcr_content/par/
columncontrol_0/col1/linklist/link.1899571414.file/
bGluay9wYXRoPS9jb250ZW50L2RhbS9zdGF0aWMvZ2xvYmFsL2ludmV
zdG9yX3JlbGF0aW9ucy9hbm51YWwyMDE0L2FubnVhbC1yZXBv
cnQtZ3JvdXAtMjAxNC1lbi5wZGY=/annual-report-group-2014-
en.pdf|publisher=UBS.com|accessdate=May 3, 2015}}
|assets = {{Increase}} CHF1.062 trillion (2014) |equity = {{Increase}}
CHF54.368 billion (2014) |num_employees = {{Decrease}} 60,155 (2014)
|caption=We Will Not Rest |homepage = [https://www.ubs.com/ UBS.com] }}
'''UBS AG''' is a Swiss global [[financial services]] company,
incorporated in the [[Canton of Zurich]],{{cite web|title=Trade Register:
UBS AG|url=http://www.moneyhouse.ch/en/u/ubs_ag_CH-270.3.004.646-4.htm}}
and co-headquartered in [[Zürich]] and [[Basel]].{{cite
web|url=https://www.ubs.com/global/en/about_ubs/
investor_relations/faq/about.html|title=Corporate information - UBS
Global topics|work=ubs.com|accessdate=March 29, 2015}} The company
provides [[investment banking]], [[asset management]], and [[wealth
management]] services for private, corporate, and institutional clients
worldwide, and for retail clients in Switzerland as well.{{cite
web|url=https://www.ubs.com/global/en/about_ubs/
investor_relations/our_businesses.html|title=Our clients & businesses -
UBS Global topics|work=ubs.com|
accessdate=March 29, 2015}} The name ''UBS'' was originally an
abbreviation for the [[Union Bank of Switzerland]], but it ceased to be a
representational abbreviation after the bank's merger with [[Swiss Bank
Corporation]] in 1998. The company traces its origins to 1856, when the
earliest of its predecessor banks was founded.{{cite web|title=150 years
of banking tradition|url=https://www.ubs.com/global/en/about_ubs/
about_us/history/_jcr_content/rightpar/
teaser_0/linklist/link.651908116.file/
bGluay9wYXRoPS9jb250ZW50L2RhbS91YnMvZ2xvY
mFsL2Fib3V0X3Vicy9hYm91dF91cy9oaXN0b3J5X29mX3
Vicy8xNTBfeWVhcnNfb2ZfYmFua2luZ19FTkcucGRm/
150_years_of_banking_ENG.pdf|work=ubs.com|
accessdate=March 29, 2015}} UBS is the biggest
bank in Switzerland, operating in more than 50
countries with about 60,000 employees around the world, as of 2014.{{cite
web|title=About us: UBS in a few
words|url=https://www.ubs.com/global/en/
about_ubs/about_us/ourprofile.html|work=ubs.com}} It is considered the
world's largest manager of private wealth assets, with over [[Swiss
franc|CHF]]2.2 trillion in invested assets,J.P.Morgan Cazenove Europe';
?>

[^|]*\bindustry\b[^|]*
Try this with the i flag. See the demo:
https://regex101.com/r/uF4oY4/79
This matches the run of text between two | delimiters that contains the word industry.
$re = "/[^|]*\\bindustry\\b[^|]*/i";
$str = $output; // the same text as $output in the question
preg_match_all($re, $str, $matches);

I would not apply a regex on a large input string like yours. As you can see in the regex debugger, vks' regex makes about 340,000 steps to finally fetch you a result.
I suggest splitting the string with | first, and then grepping out the info you need.
$chks = explode("|", $output);
foreach ($chks as $chk) {
if (strpos($chk,'industry =') !== false) {
echo $chk;
}
}
Result:
industry =[[Banking]],
[[Financial services]]
See IDEONE demo
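If a regex is still preferred, anchoring it on the delimiter keeps the work small. The sketch below assumes the field is always introduced by a "|" and that its value contains no "|" itself (a wiki link such as [[A|B]] inside the value would cut it short). The function name extract_field is made up for the example.

```php
<?php
// Hypothetical helper: returns "field = value" for the first "|field = ..." found,
// matching the field name case-insensitively, or null when it is absent.
function extract_field(string $text, string $field): ?string
{
    $pattern = '/\|\s*(' . preg_quote($field, '/') . '\s*=[^|]*)/i';
    if (preg_match($pattern, $text, $m)) {
        return trim($m[1]);
    }
    return null;
}

echo extract_field('blah blah blah |industry = industry info| blah blah blah', 'industry'), PHP_EOL;
```

Because the pattern starts at a literal "|", the engine only attempts a match at each delimiter instead of at every character of the input.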

Parsing feed that's inside namespace

I have tried several different approaches here, but to no avail. Can someone explain to me how the following type of feed can be parsed? Below is an excerpt
<?xml version='1.0'?>
<NoticeResults xmlns:sql="urn:schemas-microsoft-com:xml-sql">
<sql:query>
<Notice>
<PersonId>174171199</PersonId>
<NamePrefix></NamePrefix>
<NameAdditionalPrefix></NameAdditionalPrefix>
<FirstName>Donna</FirstName>
<MiddleName></MiddleName>
<LastName>Autrey</LastName>
<NameSuffix></NameSuffix>
<NameAdditionalSuffix></NameAdditionalSuffix>
<MaidenName></MaidenName>
<City></City>
<State>IL</State>
<Country>United States</Country>
<DateEntered>2015-02-17T00:00:00</DateEntered>
<DateCompleted>2015-02-17T00:00:00</DateCompleted>
<DateExpired>2015-03-19T00:00:00</DateExpired>
<NoticeText><p><B>PEKIN </B>- Donna Lou (Morris) Autrey, 83, of Pekin passed away at 1:10 p.m. Sunday, Feb. 15, 2015, at Autumn Accolade in Green Valley.</p><p>A graveside service will be at 11 a.m. Thursday at Lakeside Cemetery. There will be no visitation. Preston-Hanley Funeral Homes &amp; Crematory is in charge of arrangements.</p>
</NoticeText>
<NoticeType>Courtesy</NoticeType>
<Status>Active</Status>
<FromToYears></FromToYears>
<AffiliateSite>PJStar</AffiliateSite>
<AffiliateAdId>Autrey_02/17/2015_4673625</AffiliateAdId>
<PublishedBy>Peoria Journal Star</PublishedBy>
<DisplayURL>http://www.legacy.com/Link.asp?I=LS000174171199</DisplayURL>
<LocationList></LocationList>
<ShowInSpotlight>0</ShowInSpotlight>
<DateCreated>2015-02-17T01:12:37.510</DateCreated>
<RowVersion>793618506</RowVersion>
<GuestBookURL></GuestBookURL>
</Notice>
<Notice>
<PersonId>174171209</PersonId>
<NamePrefix></NamePrefix>
<NameAdditionalPrefix></NameAdditionalPrefix>
<FirstName>Lois</FirstName>
<MiddleName></MiddleName>
<LastName>Barden</LastName>
<NameSuffix></NameSuffix>
<NameAdditionalSuffix></NameAdditionalSuffix>
<MaidenName></MaidenName>
<City>Peoria</City>
<State>IL</State>
<Country>United States</Country>
<DateEntered>2015-02-17T00:00:00</DateEntered>
<DateCompleted>2015-02-17T00:00:00</DateCompleted>
<DateExpired>2015-03-19T00:00:00</DateExpired>
<NoticeText><img src="/Images/Cobrands/PJStar/Photos/C7KC6NVUW02_021715.jpg" align="left" border="0" vspace="4" hspace="10" lgyOrigName="C7KC6NVUW02.jpg"><p><B>PEORIA </B>- Lois Mae Barden, age 84, of Peoria passed away Sunday, Feb. 15, 2015, at OSF Saint Francis Medical Center in Peoria.</p><p>Lois was born on Sept. 12, 1930, in Macomb, Ill. She married James W. Barden on June 12, 1948, in Peoria. He preceded her in death in June of 1988. </p><p>She also was preceded in death by her parents and 12 siblings.</p><p>Surviving are her children, Kathy (Ted) Kindred of Manito, Ill., Lynn (Allen) Simer of Nineveh, Ind., and Jim (Deb) Barden of North Pekin, Ill.; four grandchildren; six great-grandchildren; one great-great-grandson; and her two sisters, Doris Butts and Maggie Hoyt.</p><p>Lois enjoyed spending time at home with her family. She was always there to care for them when they were in need. She also enjoyed listening to Christian music, crafting, cooking and crocheting.</p><p>A visitation will be from 1 to 2 p.m. Thursday, Feb. 19, 2015, at Davison-Fulton Woodland Chapel in Peoria. Burial will follow at Swan Lake Memory Gardens in Peoria. </p><p>Memorials may be made to Illinois Cancer Care.</p><p>Online condolences may be made through <a href="http://www.davison-fulton.com" target="_new" rel="nofollow">www.davison-fulton.com</a>.</p><center><br><a rel="nofollow" href="http://www.davison-fulton.com" target="_blank"><img src="/Images/Cobrands/PJStar/Logos/www.davison-fulton.com.jpg" border="0"></a></center>
</NoticeText>
<NoticeType>Paid</NoticeType>
<Status>Active</Status>
<FromToYears></FromToYears>
<AffiliateSite>PJStar</AffiliateSite>
<AffiliateAdId>Barden_02/17/2015_102257420</AffiliateAdId>
<PublishedBy>Peoria Journal Star</PublishedBy>
<DisplayURL>http://www.legacy.com/Link.asp?I=LS000174171209</DisplayURL>
<LocationList></LocationList>
<FHIndex>10868</FHIndex>
<FHName>Davison-Fulton Woodland Chapel</FHName>
<FHKnownByName1>Davison-Fulton Woodland Chapel</FHKnownByName1>
<FHAddressLine1>2021 North University Street</FHAddressLine1>
<FHCity>Peoria</FHCity>
<FHStateProvince>IL </FHStateProvince>
<FHZipCode>61604 </FHZipCode>
<FHPhoneNumber1>3096885700 </FHPhoneNumber1>
<FHUrl>www.davison-fulton.com</FHUrl>
<ShowInSpotlight>0</ShowInSpotlight>
<DateCreated>2015-02-17T01:13:01.857</DateCreated>
<ImageUrl>http://mi-cache.legacy.com/legacy/images/Cobrands/PJStar/Photos/C7KC6NVUW02_021715.jpg</ImageUrl>
<RowVersion>793622447</RowVersion>
<GuestBookURL>http://www.legacy.com/Link.asp?I=GB000174171209</GuestBookURL>
</Notice>
</sql:query>
</NoticeResults>
I have been banging my head against the table on this all day (huge waste of 8 hours). All we have to work with in our offices is PHP (a limitation of our CMS). Can someone please explain how I would go about parsing a feed like this? I've gotten pretty far, but still can't seem to figure this out. Please also explain why you do whatever you do in order to make this work. I'll have to parse several feeds structured this way and I've never dealt with this before. Any help will be greatly appreciated.
Thanks!
Chris

You can access the elements using the DOM extension. Here is an example of looping through XML, from w3schools:
<?php
$xmlDoc = new DOMDocument();
$xmlDoc->load("doc.xml");
$x = $xmlDoc->documentElement;
foreach ($x->childNodes AS $item) {
// Do whatever is needed
}
?>
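For the feed in the question, the part that usually trips people up is that sql:query lives in the urn:schemas-microsoft-com:xml-sql namespace while the Notice elements inside it have no namespace. A minimal sketch with the same DOM extension follows: it looks up the wrapper by namespace URI and then reads each Notice into an array. The function name parse_notices and the shortened sample feed are assumptions for illustration.

```php
<?php
const SQL_NS = 'urn:schemas-microsoft-com:xml-sql';

// Hypothetical helper: returns one associative array per <Notice>.
function parse_notices(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // <sql:query> must be looked up by its namespace URI, not by its prefix.
    $query = $doc->getElementsByTagNameNS(SQL_NS, 'query')->item(0);
    if ($query === null) {
        return [];
    }

    $notices = [];
    // The <Notice> elements themselves are not namespaced.
    foreach ($query->getElementsByTagName('Notice') as $notice) {
        $row = [];
        foreach ($notice->childNodes as $child) {
            // Skip whitespace text nodes between the elements.
            if ($child instanceof DOMElement) {
                $row[$child->nodeName] = trim($child->textContent);
            }
        }
        $notices[] = $row;
    }
    return $notices;
}

$sample = <<<'XML'
<?xml version="1.0"?>
<NoticeResults xmlns:sql="urn:schemas-microsoft-com:xml-sql">
  <sql:query>
    <Notice><PersonId>174171199</PersonId><FirstName>Donna</FirstName><LastName>Autrey</LastName><City></City></Notice>
    <Notice><PersonId>174171209</PersonId><FirstName>Lois</FirstName><LastName>Barden</LastName><City>Peoria</City></Notice>
  </sql:query>
</NoticeResults>
XML;

foreach (parse_notices($sample) as $notice) {
    echo $notice['FirstName'], ' ', $notice['LastName'], PHP_EOL;
}
```

The prefix sql is only a local alias, so matching on the URI keeps the code working even if another feed declares the same namespace under a different prefix.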

How to parse a json in php

I am parsing the below json in php
{
"responseHeader":{
"status":0,
"QTime":22,
"params":{
"fl":"title,id",
"indent":"true",
"q":"einstein",
"hl.simple.pre":"<em>",
"hl.simple.post":"</em>",
"wt":"json",
"hl":"true",
"rows":"3"}},
"response":{"numFound":63,"start":0,"docs":[
{
"id":"1",
"title":"Albert Einstein"},
{
"id":"2088",
"title":"Nationalism"},
{
"id":"1551",
"title":"Dean Koontz"}]
},
"highlighting":{
"1":{
"text":[" for school exam September The Collected Papers of Albert <em>Einstein</em> Vol Doc s Unthinking for authority"]},
"2088":{
"text":[" in a letter to Alfred Kneser June Doc in The Collected Papers of Albert <em>Einstein</em> Vol Nationalism"]},
"1551":{
"text":[" changes since meeting Travis Did you get the leash on him yet <em>Einstein</em> Part Chapter Nora s query during"]}}}
Using json_decode and looping over the resulting array, I can get the individual elements in the docs section:
foreach ($myArray['response']['docs'] as $doc) {
echo $doc['id'] . "<br/>";
echo $doc['title'] . "<br/>";
}
I am now trying to figure out how to get the values from the highlighting section of this JSON. I want to get the text fields in the highlighting part and store them in an array.
"highlighting":{
"1":{
"text":[" for school exam September The Collected Papers of Albert <em>Einstein</em> Vol Doc s Unthinking for authority"]},
"2088":{
"text":[" in a letter to Alfred Kneser June Doc in The Collected Papers of Albert <em>Einstein</em> Vol Nationalism"]},
"1551":{
"text":[" changes since meeting Travis Did you get the leash on him yet <em>Einstein</em> Part Chapter Nora s query during"]}}}
The array should be like this,
"1" => " for school exam September The Collected Papers of Albert <em>Einstein</em> Vol Doc s Unthinking for authority"
"2088" => " in a letter to Alfred Kneser June Doc in The Collected Papers of Albert <em>Einstein</em> Vol Nationalism"
How to achieve this? Is there any way to map the id element of the docs to the number specified in the highlighting part?

You may try this (Example)
$myArray = json_decode($json, true);
$highlighting = array();
foreach($myArray['highlighting'] as $key => $value)
{
$highlighting[$key] = $value['text'][0];
}
Result :
Array (
[1] => for school exam September...
[2088] => in a letter to Alfred ...
[1551] => changes since meeting ...
)
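As for mapping the id of each doc to its highlighting entry: the ids are the keys of the highlighting object, so a direct lookup should be enough. A small sketch, assuming every doc has an id and that a doc may lack a highlight; the function name docs_with_highlights and the shortened JSON are made up for the example.

```php
<?php
// Hypothetical helper: combines each doc with its highlight snippet, keyed by id.
function docs_with_highlights(string $json): array
{
    $data = json_decode($json, true);
    $result = [];
    foreach ($data['response']['docs'] as $doc) {
        $id = $doc['id'];
        $result[$id] = [
            'title'   => $doc['title'],
            // null when there is no highlight for this id
            'snippet' => $data['highlighting'][$id]['text'][0] ?? null,
        ];
    }
    return $result;
}

$json = '{"response":{"docs":[{"id":"1","title":"Albert Einstein"},'
    . '{"id":"2088","title":"Nationalism"}]},'
    . '"highlighting":{"1":{"text":["Papers of Albert <em>Einstein</em>"]},'
    . '"2088":{"text":["letter to Alfred Kneser"]}}}';

print_r(docs_with_highlights($json));
```

Note that PHP turns numeric string keys such as "2088" into integer keys, so $result[2088] and $result['2088'] refer to the same entry.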

PHP, Memory and Iteration

I've been noticing some odd behavior while experimenting with benchmarking SplFixedArrays. Take this little snippet of code, for instance...
<?php
$splFixedArray = new \SplFixedArray( 100000 );
echo number_format( memory_get_usage() ) . PHP_EOL;
$variable = 'Truffaut single-origin coffee wayfarers, church-key asymmetrical 90\'s trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PBR fingerstache Bushwick Cosby sweater. McSweeney\'s mumblecore semiotics, twee quinoa tofu +1 fingerstache pop-up. Echo Park bitters disrupt irony. Truffaut single-origin coffee wayfarers, church-key asymmetrical 90\'s trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PBR fingerstache Bushwick Cosby sweater.';
var_dump( $variable );
for( $i = 0; $i < 100000; $i++ )
{
$splFixedArray[ $i ] = $variable;
}
echo number_format( memory_get_usage() );
Which outputs...
1,032,080
string(1209) "Truffaut single-origin coffee wayfarers, church-key asymmetrical 90's trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PB"...
1,032,384
Now, let's add a simple random integer onto the end while in the for loop...
<?php
$splFixedArray = new \SplFixedArray( 100000 );
echo number_format( memory_get_usage() ) . PHP_EOL;
$variable = 'Truffaut single-origin coffee wayfarers, church-key asymmetrical 90\'s trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PBR fingerstache Bushwick Cosby sweater. McSweeney\'s mumblecore semiotics, twee quinoa tofu +1 fingerstache pop-up. Echo Park bitters disrupt irony. Truffaut single-origin coffee wayfarers, church-key asymmetrical 90\'s trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PBR fingerstache Bushwick Cosby sweater.';
var_dump( $variable );
for( $i = 0; $i < 100000; $i++ )
{
$splFixedArray[ $i ] = $variable . rand();
}
echo number_format( memory_get_usage() );
Which results in this...
1,034,320
string(1209) "Truffaut single-origin coffee wayfarers, church-key asymmetrical 90's trust fund hashtag before they sold out thundercats photo booth. Godard sustainable roof party keffiyeh, Odd Future chillwave mlkshk kogi VHS leggings hoodie art party next level dreamcatcher yr. Blog american apparel aesthetic tattooed farm-to-table, stumptown viral whatever mixtape raw denim Williamsburg skateboard flexitarian actually tofu. Echo Park lomo disrupt PBR, jean shorts irony fingerstache blog kale chips. Street art iPhone PB"...
129,834,272
What I'm curious about is why function calls are resulting in stacked memory usage. Is it normal that memory would not be freed up after the iteration?

This behavior is expected.
In the first case, you are storing the same value multiple times, which can be implemented as a single instance of the value and a batch of references to it, with a copy-on-write semantic for cases when the value at a given array index is changed.
In the second case, you are storing many different values, which can't be handled the same way; memory must be allocated for the full contents of each value, which results in the difference you see in memory consumption between the two cases.
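The difference can be reproduced with a smaller experiment. The sketch below assumes PHP 7.4 or later (for arrow functions); the helper name measure_fill and the sizes are arbitrary choices for illustration, and the exact byte counts will vary by PHP version and platform.

```php
<?php
// Hypothetical helper: bytes allocated while filling an SplFixedArray of $n slots.
function measure_fill(int $n, callable $makeValue): int
{
    $array = new SplFixedArray($n);
    $before = memory_get_usage();
    for ($i = 0; $i < $n; $i++) {
        $array[$i] = $makeValue($i);
    }
    return memory_get_usage() - $before;
}

$text = str_repeat('x', 1200);

// Same string in every slot: each slot only references the one existing value.
$shared = measure_fill(10000, fn (int $i): string => $text);

// A different string in every slot: each concatenation allocates a new value.
$distinct = measure_fill(10000, fn (int $i): string => $text . $i);

echo 'shared:   ', number_format($shared), PHP_EOL;
echo 'distinct: ', number_format($distinct), PHP_EOL;
```

So the growth comes from storing 100,000 distinct strings rather than from calling rand(); that memory is released once the array and its values are no longer referenced.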
