Array
(
[sEcho] => 1
[iTotalRecords] => 7521
[iTotalDisplayRecords] => 1
[aaData] => Array
(
[0] => Array
(
[0] => Nordic Capital Buys SiC Processing
[1] => 2010-06-21/nordic-capital-buys-sic-processing
[2] => PEHub Media
[3] => Business
[4] => completed
[5] => Nordic Capital has acquired a 70% stake in SiC Processing AG, a German industrial recycling company, from Frog Capital. No sale price was disclosed. SiC Processing’s founding family retains a 25% holding, while former lead investor Zouk Ventures retains a 5% stake.
[6] => Admin, China, Frog Capital, Germany, Italy, Iyad Omari, Manufacturing, Norway, PEHub Media, Photovoltaic Wafer Manufacturing, Renewable Energy, Semiconductor, United States
)
)
)
echo json_encode($myArr);
{"sEcho":"1","iTotalRecords":7521,"iTotalDisplayRecords":"1","aaData":[["
Nordic Capital Buys SiC Processing</a></div>","
2010-06-21/nordic-capital-buys-sic-processing</div>","PEHub Media","Business","completed",null,"
Admin, China, Frog Capital, Germany, Italy, Iyad Omari, Manufacturing, Norway, PEHub Media, Photovoltaic Wafer Manufacturing, Renewable Energy, Semiconductor, United States]]}
Note the null in the middle of the string, right after "completed".
Why is this, and what escaping or manipulation do I need to perform in order to encode it?
I have tried addslashes.
From the manual:
Note that if you try to encode an array containing non-utf values, you'll get null values in the resulting JSON string. You can batch-encode all the elements of an array with the array_map function:
// Pass the callback name as a string; this handles a flat array of strings only.
$encodedArray = array_map('utf8_encode', $myArr);
echo json_encode($encodedArray);
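The $myArr in the question is nested (aaData holds an array of rows), so a flat array_map over the top level would not reach the strings. A minimal sketch of a recursive conversion, assuming the mbstring extension is loaded and the source bytes are ISO-8859-1; toUtf8 is a hypothetical helper name, not a PHP built-in:

```php
<?php
// Hypothetical helper: recursively convert every non-UTF-8 string in a
// nested array to UTF-8. Assumes the input bytes are ISO-8859-1.
function toUtf8($value, string $from = 'ISO-8859-1')
{
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            $value[$key] = toUtf8($item, $from);
        }
        return $value;
    }
    if (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
        return mb_convert_encoding($value, 'UTF-8', $from);
    }
    return $value; // ints, nulls and already-valid UTF-8 strings pass through
}

// "\xE9" is "é" in ISO-8859-1 and is not valid UTF-8 on its own.
$myArr = array('sEcho' => 1, 'aaData' => array(array("Caf\xE9", 'completed')));
echo json_encode(toUtf8($myArr));
```

Strings that are already valid UTF-8 are left alone, so running the helper twice should not double-encode them.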
Actually it doesn't return null, http://codepad.org/A34KdUf5.
Maybe your PHP version doesn't support json_encode().
Works for me on 5.2.13. Ensure you're using at least PHP 5.2.0 and that PHP wasn't compiled with --disable-json. You may also want to check that error reporting (and/or logging) is enabled.
The simpler way is $store_name = utf8_encode($name_of_variable), but please make sure that your character set is ISO-8859-1.
I have an array in PHP with the following values:
Array
(
[0] => Clarithromycin 250mg/5ml oral susptake TWO 5ml spoonsful TWICE each day DISCARD REMAINING AFTER TEN DAYSSHAKE THE BOTTLE WELL BEFORE USING.SPACE THE DOSES EVENLY. KEEP TAKING UNTIL THE COURSE IS FINISHED, UNLESS YOU ARE TOLD TO STOP.
[1] => Lactulose 3.1-3.7g/5ml oral solntake ONE to FOUR 5ml spoonsful TWICE each day when required (PRN : to be taken when necessary)
[2] => Mirtazapine orodisp 30mg tabsOne To Be Taken At NightALLOW THE TABLETS TO DISSOLVE ON YOUR TONGUE, THEN SWALLOW WITH THE SALIVA. TAKE AFTER FOOD.WARNING: THIS MEDICINE MAY MAKE YOU FEEL SLEEPY. IF THIS HAPPENS, DO NOT DRIVE OR USE TOOLS OR MACHINES. DO NOT DRINK ALCOHOL.
[3] => Senna 7.5mg/5ml oral soln SFTwo 5ml Spoonfuls To Be Taken At NightSHAKE THE BOTTLE WELL BEFORE USING.THIS MEDICINE MAY COLOUR YOUR URINE. THIS IS HARMLESS.
[4] => SUDOCREM ANTISEPTIC HEALING CREAMas directedFOR EXTERNAL USE ONLY. (MD mean As directed)
[5] => CIRCADIN MR 2MG TABSOne To Be Taken At NightSWALLOW WHOLE. DO NOT CHEW OR CRUSH.TAKE WITH OR JUST AFTER FOOD, OR A MEAL.WARNING: THIS MEDICINE MAY MAKE YOU FEEL SLEEPY. IF THIS HAPPENS, DO NOT DRIVE OR USE TOOLS OR MACHINES. DO NOT DRINK ALCOHOL.
[6] => Memantine 10mg tabsOne To Be Taken Each Day
[7] => Omeprazole gr 10mg capsOne To Be Taken Each DaySWALLOW WHOLE.DO NOT CHEW OR CRUSH.
[8] => Senna 7.5mg tabsTwo To Be Taken At NightTHIS MEDICINE MAY COLOUR YOUR URINE. THIS IS HARMLESS.
)
I want to separate the medicine from the description at specific words (e.g. "take two", "one to be taken", "one to", "to be taken", etc.), so that the result looks like this:
array
(
[0] => Array
(
[medicine] => Clarithromycin 250mg/5ml oral susp
[description] => take TWO 5ml spoonsful TWICE each day DISCARD REMAINING AFTER TEN DAYSSHAKE THE BOTTLE WELL BEFORE USING.SPACE THE DOSES EVENLY. KEEP TAKING UNTIL THE COURSE IS FINISHED, UNLESS YOU ARE TOLD TO STOP.
)
[1] => Array
(
[medicine] => Lactulose 3.1-3.7g/5ml oral soln
[description] => take ONE to FOUR 5ml spoonsful TWICE each day when required (PRN : to be taken when necessary)
)
[2] => Array
(
[medicine] => Mirtazapine orodisp 30mg tabs
[description] => One To Be Taken At NightALLOW THE TABLETS TO DISSOLVE ON YOUR TONGUE, THEN SWALLOW WITH THE SALIVA. TAKE AFTER FOOD.WARNING: THIS MEDICINE MAY MAKE YOU FEEL SLEEPY. IF THIS HAPPENS, DO NOT DRIVE OR USE TOOLS OR MACHINES. DO NOT DRINK ALCOHOL.
)
.
.
.
.
)
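Try this:
foreach ($array as $key => $value) {
    // Limit to 2 parts so a later occurrence of the separator does not cut the description.
    $explode = explode('separate', $value, 2);
    $array[$key] = array(
        'medicine' => $explode[0],
        // The separator may be missing from some entries.
        'description' => isset($explode[1]) ? $explode[1] : null
    );
}
Just replace 'separate' with the word you want to split your text on.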
// Main array: $arrOfMedicineAndDesc holds the raw "medicine + description" strings.
// Words to split on; note that the matched word itself is dropped from the output.
$arrOfsplittedData = [];
$intCount = 0;
foreach ($arrOfMedicineAndDesc as $medicineAndDesc) {
    $lowerCaseMedicineAndDesc = strtolower($medicineAndDesc);
    $splitMedicineAndDesc = multiexplode(array("oral", "tabsone", "cream", "capsone", "tabstwo"), $lowerCaseMedicineAndDesc);
    // Only store the entry if one of the words was actually found.
    if (isset($splitMedicineAndDesc[1])) {
        $arrOfsplittedData[$intCount]["medicine"] = $splitMedicineAndDesc[0];
        $arrOfsplittedData[$intCount]["description"] = $splitMedicineAndDesc[1];
    }
    $intCount++;
}
// Replace every delimiter with the first one, then explode on that single delimiter.
function multiexplode($delimiters, $string) {
    $ready = str_replace($delimiters, $delimiters[0], $string);
    $launch = explode($delimiters[0], $ready);
    return $launch;
}
echo "<pre>";
print_r($arrOfsplittedData);
die;
You can be crafty with regex to do this:
<?php
$array = array(
"Clarithromycin 250mg/5ml oral susptake TWO 5ml spoonsful TWICE each day DISCARD REMAINING AFTER TEN DAYSSHAKE THE BOTTLE WELL BEFORE USING.SPACE THE DOSES EVENLY. KEEP TAKING UNTIL THE COURSE IS FINISHED, UNLESS YOU ARE TOLD TO STOP.",
"Lactulose 3.1-3.7g/5ml oral solntake ONE to FOUR 5ml spoonsful TWICE each day when required (PRN : to be taken when necessary)",
"Mirtazapine orodisp 30mg tabsOne To Be Taken At NightALLOW THE TABLETS TO DISSOLVE ON YOUR TONGUE, THEN SWALLOW WITH THE SALIVA. TAKE AFTER FOOD.WARNING: THIS MEDICINE MAY MAKE YOU FEEL SLEEPY. IF THIS HAPPENS, DO NOT DRIVE OR USE TOOLS OR MACHINES. DO NOT DRINK ALCOHOL.",
"Senna 7.5mg/5ml oral soln SFTwo 5ml Spoonfuls To Be Taken At NightSHAKE THE BOTTLE WELL BEFORE USING.THIS MEDICINE MAY COLOUR YOUR URINE. THIS IS HARMLESS.",
"SUDOCREM ANTISEPTIC HEALING CREAMas directedFOR EXTERNAL USE ONLY. (MD mean As directed)",
"CIRCADIN MR 2MG TABSOne To Be Taken At NightSWALLOW WHOLE. DO NOT CHEW OR CRUSH.TAKE WITH OR JUST AFTER FOOD, OR A MEAL.WARNING: THIS MEDICINE MAY MAKE YOU FEEL SLEEPY. IF THIS HAPPENS, DO NOT DRIVE OR USE TOOLS OR MACHINES. DO NOT DRINK ALCOHOL.",
"Memantine 10mg tabsOne To Be Taken Each Day",
"Omeprazole gr 10mg capsOne To Be Taken Each DaySWALLOW WHOLE.DO NOT CHEW OR CRUSH.",
"Senna 7.5mg tabsTwo To Be Taken At NightTHIS MEDICINE MAY COLOUR YOUR URINE. THIS IS HARMLESS."
);
$splitWords = [
"take \w+?\b",
"to be taken"
]; //Regex of what you want to split on
$regex = "/(".implode("|",$splitWords).")/";
$replaced = preg_replace($regex, "\n$1", $array); //Replace what you found with a newline + what you found
print_r(array_map(function ($v) {
$array = explode("\n",$v);
return [
"medicine" => $array[0],
"description" => isset($array[1])?$array[1]:null
]; // If your array has more than 2 elements, you may need to loop here.
},$replaced)); //Split sentences on the newlines.
Here's an example:
http://sandbox.onlinephpfunctions.com/code/b2e2ceef17152e9a888a7b68fab3acb2e9f10fe3
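Since the sample data mixes upper and lower case ("take TWO", "One To Be Taken"), a case-insensitive match that cuts only at the first phrase may be closer to the requested output. A rough sketch under stated assumptions: splitMedicine is a hypothetical helper and the phrase list is illustrative, not complete. A medicine name that itself contains one of the phrases would be split too early.

```php
<?php
// Hypothetical helper: split at the first dosage phrase, keeping the phrase
// in the description. The phrase list is an assumption; extend it as needed.
function splitMedicine(string $line, array $phrases): array
{
    $quoted = array_map(function ($p) {
        return preg_quote($p, '/');
    }, $phrases);
    $pattern = '/' . implode('|', $quoted) . '/i'; // i: ignore case

    if (preg_match($pattern, $line, $m, PREG_OFFSET_CAPTURE)) {
        $pos = $m[0][1]; // byte offset of the first matching phrase
        return [
            'medicine'    => rtrim(substr($line, 0, $pos)),
            'description' => substr($line, $pos),
        ];
    }
    // No phrase found: keep the whole line as the medicine.
    return ['medicine' => $line, 'description' => null];
}

$phrases = ['take', 'one to be taken', 'two to be taken', 'as directed'];
print_r(splitMedicine('Memantine 10mg tabsOne To Be Taken Each Day', $phrases));
```

Using preg_match with PREG_OFFSET_CAPTURE instead of inserting a marker character avoids any clash with newlines already present in the data.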
Ohare:Montrose:I-290 Circle:IL:IL
Ohare-Montrose-I_290-Circle-IL-IL
EB:Kennedy Expy:O'Hare:IL-43 (Harlem Ave):IL:IL
NB:I-894/US-45:Hale Interchange:Zoo Interchange:WI:IL
NB
I-894/US-45
Hale
Interchange
Zoo Interchange
WI
IL
WB:Indiana-East-West:Eastpoint:Middlebury:IN:25:IL
WB
Indiana-East-West
Eastpoint
Middlebury
IN
25
IL
I am trying to extract words from two different sources that use different conventions.
I am using regex for that, but I cannot create a single regex that deals with both options.
If I split on : or -, then the first one gets extracted as
Ohare, Montrose, I, 290 Circle, IL, IL
How can I get a regex to split on : or - but ignore 'I-', 'IL-', 'US-', 'Indiana-East-West' and many others that I may find?
This is what I have so far, but it is not working as I want:
Regex
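You can use this regex, which relies on the PCRE verbs (*SKIP)(*F) to match and discard the tokens that must stay intact, so that only the remaining : and - act as separators: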
(?:(?:IL?|US)-|Indiana-East-West)(*SKIP)(*F)|[:-]
RegEx Demo
Example Code:
$s = 'NB:I-894/US-45:Hale Interchange:Zoo Interchange:WI:IL';
print_r(preg_split('/(?:(?:IL?|US)-|Indiana-East-West)(*SKIP)(*F)|[:-]/' , $s));
Array
(
[0] => NB
[1] => I-894/US-45
[2] => Hale Interchange
[3] => Zoo Interchange
[4] => WI
[5] => IL
)
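Because the question expects "many others" to turn up, it may help to build the pattern from a list of protected tokens rather than hard-coding them. A minimal sketch, assuming the same (*SKIP)(*F) technique as above; splitRoute and the token list are hypothetical names chosen for illustration:

```php
<?php
// Hypothetical helper: split on ":" or "-" but leave the listed tokens intact.
function splitRoute(string $s, array $protected): array
{
    $quoted = array_map(function ($t) {
        return preg_quote($t, '/');
    }, $protected);
    // Protected tokens are matched first and discarded via (*SKIP)(*F);
    // only the remaining ":" and "-" characters act as separators.
    $pattern = '/(?:' . implode('|', $quoted) . ')(*SKIP)(*F)|[:-]/';
    return preg_split($pattern, $s);
}

$protected = ['IL-', 'I-', 'US-', 'Indiana-East-West'];
print_r(splitRoute('WB:Indiana-East-West:Eastpoint:Middlebury:IN:25:IL', $protected));
```

Note that a short token such as 'I-' also protects any hyphen that follows a capital I inside a longer word, so the list should be checked against real data.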
I wrote a regex to parse out a string in the form of:
Job Title (<numeric job number>) Location, State, Country
with this:
(?P<jobTitle>[a-zA-Z0-9,\:\/\s]+)[\s]+\((?P<jobCode>[0-9]+)\)[\s]+(?P<location>[a-zA-Z0-9,\s]+)
But I ran into a problem when a job came in this form instead:
Job Title (extra information) (<numeric job number>) Location, State, Country
So my question is, how can I take in everything before the numeric job number as the 'jobTitle', the numeric portion as the 'jobCode', and everything after that as the 'location'?
For example
Super Cool Job (12345) Cool Place, California, United States
jobTitle => Super Cool Job
jobCode => 12345
location => Cool Place, California, United States
Another Cool Job (Not in california) (54321) Paris, France
jobTitle => Another Cool Job (Not in california)
jobCode => 54321
location => Paris, France
You could be looking for something like:
(.*\S)\s+\((\d+)\)\s+(\S.*)
With this simple regex, your strings will be in Groups 1, 2 and 3
$jobs='Super Cool Job (12345) Cool Place, California, United States
Another Cool Job (Not in california) (54321) Paris, France';
$regex = '/^(.*?)\s+\((\d+)\)\s+(.*)$/m'; // m flag: ^ and $ match at the start and end of every line
if(preg_match_all($regex,$jobs,$matches, PREG_SET_ORDER)) {
echo "<pre>";
print_r($matches);
echo "</pre>";
}
OUTPUT:
Array
(
[0] => Array
(
[0] => Super Cool Job (12345) Cool Place, California, United States
[1] => Super Cool Job
[2] => 12345
[3] => Cool Place, California, United States
)
[1] => Array
(
[0] => Another Cool Job (Not in california) (54321) Paris, France
[1] => Another Cool Job (Not in california)
[2] => 54321
[3] => Paris, France
)
)
If you want to extract all the fields, you can use this:
^(?<title>\D+) \((?<id>\d+)\)(?: (?<desc>[^,]+),)? (?<city>[^,]+), (?<country>[^,]+)$
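To get the three fields named in the question (jobTitle, jobCode, location) directly, the greedy pattern shown earlier can be combined with named groups. A small sketch; parseJob is a hypothetical wrapper, and it assumes each string contains exactly one parenthesised all-digit job number:

```php
<?php
// Hypothetical helper: returns the three named fields, or null if the
// line does not contain a parenthesised numeric job code.
function parseJob(string $line): ?array
{
    // Greedy .* lets the title absorb any earlier "(extra information)" group.
    $pattern = '/^(?P<jobTitle>.*\S)\s+\((?P<jobCode>\d+)\)\s+(?P<location>\S.*)$/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return [
        'jobTitle' => $m['jobTitle'],
        'jobCode'  => $m['jobCode'],
        'location' => $m['location'],
    ];
}

print_r(parseJob('Another Cool Job (Not in california) (54321) Paris, France'));
```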
I'm using str_getcsv to parse tab-separated values returned from a NoSQL query. However, I'm running into a problem, and the only solution I've found is illogical.
Here's some sample code to demonstrate (FYI, it seems the tabs aren't being preserved when showing here)...
$data = '0 16 Gruesome Public Executions In North Korea - 80 Killed http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou... 1384357511 http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw 0 The Young Turks 1 2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4 35afc4001e1a50fb463dac32de1d19e7';
$data = str_getcsv($data,"\t",NULL);
echo '<pre>'.print_r($data,TRUE).'</pre>';
Pay particular attention to the fact that one column (beginning with "North Korea...") actually starts with a double quote " but doesn't finish with one. This is why I supply NULL as the third parameter (enclosure) to override the default " enclosure value.
Here is the result:
Array
(
[0] => 0
[1] => 16
[2] => Gruesome Public Executions In North Korea - 80 Killed
[3] => http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata
[4] =>
[5] => North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou... 1384357511 http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw 0 The Young Turks 1 2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4 35afc4001e1a50fb463dac32de1d19e7
)
As you can see, the quote is breaking the function. Logically, I thought I would be able to use NULL or an empty string '' as the third parameter for str_getcsv (enclosure), but neither worked.
The only thing I could use to get str_getcsv to work properly was a space character ' '. That doesn't make any sense to me because none of the columns start or end with whitespace.
$data = '0 16 Gruesome Public Executions In North Korea - 80 Killed http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou... 1384357511 http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw 0 The Young Turks 1 2013-11-13 12:53:31 9ab8f5607183ed258f4f98bb80f947b4 35afc4001e1a50fb463dac32de1d19e7';
$data = str_getcsv($data,"\t",' ');
echo '<pre>'.print_r($data,TRUE).'</pre>';
Now the result is:
Array
(
[0] => 0
[1] => 16
[2] => Gruesome Public Executions In North Korea - 80 Killed
[3] => http://www.youtube.com/watch?v=Dtx30AQpcjw&feature=youtube_gdata
[4] =>
[5] => "North Korea staged gruesome public executions of 80 people this month, some for offenses as minor as watching South Korean entertainment videos or being fou...
[6] => 1384357511
[7] => http://gdata.youtube.com/feeds/api/videos/Dtx30AQpcjw
[8] => 0
[9] => The Young Turks
[10] =>
[11] =>
[12] =>
[13] =>
[14] => 1
[15] => 2013-11-13 12:53:31
[16] => 9ab8f5607183ed258f4f98bb80f947b4
[17] => 35afc4001e1a50fb463dac32de1d19e7
)
So my question is: why does it work with a space as the enclosure, but not with NULL or an empty string? Also, are there repercussions to this?
UPDATE 1: It seems this reduced the number of errors I was receiving in our logs, but it didn't eliminate them, so I'm guessing that the ' ' I used as the enclosure has caused unintended side effects, albeit less troubling than the previous problem. But my question remains the same: why can't I use NULL or an empty string as the enclosure, and is there a better way of doing this?
Just to give a starting point ...
You might want to consider working with the string itself instead of using a function like str_getcsv in your case.
But be aware that there are at least some pitfalls if you choose this route (it might be your only option, though):
Handling of escaped characters
Line breaks within the data (not meant as delimiters)
If you know that your string has no tabs other than those ending the fields, and no line breaks other than those delimiting a row, you might be fine with this:
$data = explode("\n", $the_whole_csv_string_block);
foreach ($data as $line)
{
$arr = explode("\t", $line);
// $arr[0] will have every first field of every row, $arr[1] the 2nd, ...
// Usually this is what I want when working with a csv file
// But if you rather want a multidimensional array, you can simply add
// $arr to a different array and after this loop you are good to go.
}
Otherwise, treat this as a starting point and tweak it to your individual situation. Hope it helps.
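The multidimensional variant mentioned in the comments above could look roughly like this. It is only a sketch under the same assumptions (no tabs inside fields, no line breaks inside fields); parseTsv is a hypothetical name:

```php
<?php
// Hypothetical helper: turn a block of tab-separated lines into an array
// of rows. Quotes get no special treatment, so a stray " stays in its field.
function parseTsv(string $block): array
{
    $rows = [];
    // Accept Unix, Windows and old Mac line endings.
    foreach (preg_split('/\r\n|\n|\r/', $block) as $line) {
        if ($line === '') {
            continue; // skip blank lines, e.g. after a trailing newline
        }
        $rows[] = explode("\t", $line);
    }
    return $rows;
}

$block = "0\t16\t\"North Korea staged\t1384357511\n1\t17\tAnother title\t1384357512\n";
print_r(parseTsv($block));
```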
Simply use chr(0) as enclosure and escape:
$data = str_getcsv($data, "\t", chr(0), chr(0));
I have the following PHP regex:
#<tr[\s\S]*?<a class="b1"[\s\S]*?<em[^>]*>([^<]*)[\s\S]*?stars_small_([0-9].[0-9])#
Which I am using on this site:
Gamespy
I get back this data:
[1] => Array
(
[0] => AC/DC Live: Rock Band Track Pack
[1] => Ace Combat 6: Fires of Liberation
[2] => All-Pro Football 2K8
[3] => Alone in the Dark
[4] => Armored Core 4
[5] => Army of Two
[6] => Army of Two: The 40th Day
)
[2] => Array
(
[0] => 3.5
[1] => 2.5
[2] => 3.5
[3] => 3.5
[4] => 2.5
[5] => 3.5
[6] => 3.5
)
This is the kind of result I am looking for; however, I don't seem to be getting back all of the data. I should get the following titles with scores, but for some reason I am only getting some of them.
AC/DC Live: Rock Band Track Pack
Ace Combat 6: Fires of Liberation
Afro Samurai
Alan Wake
Aliens vs. Predator
All-Pro Football 2K8
Alone in the Dark
Amped 3
Armored Core 4
Army of Two
Army of Two: The 40th Day
Assassin's Creed
Assassin's Creed II
Assassin's Creed: Brotherhood
Avatar: The Game
I have tested my regex here:
http://www.solmetra.com/scripts/regex/index.php
Using this HTML:
http://justpaste.it/20u5
Any help explaining why I am only getting back some of the results would be greatly appreciated. Thanks
Change the sub-pattern stars_small_([0-9].[0-9]) to stars_small_([0-9](?:\.[0-9])?), as some of the URLs have only one digit in the SRC attribute of the IMG tag. The decimal part becomes optional, and the dot is escaped so that it only matches a literal ".".
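A quick way to check the corrected pattern is to run it against a row with a whole-number score. The HTML below is a made-up stand-in for the real page (its markup and scores are assumptions), kept just close enough for the pattern to apply:

```php
<?php
// Hypothetical sample: one row with a single-digit score, one with a decimal.
$html = '<tr><td><a class="b1" href="#"><em class="t">Afro Samurai</em></a></td>'
      . '<td><img src="stars_small_3.gif"></td></tr>'
      . '<tr><td><a class="b1" href="#"><em class="t">Alan Wake</em></a></td>'
      . '<td><img src="stars_small_4.5.gif"></td></tr>';

// Same pattern as in the question, with the optional, escaped decimal part.
$pattern = '#<tr[\s\S]*?<a class="b1"[\s\S]*?<em[^>]*>([^<]*)[\s\S]*?stars_small_([0-9](?:\.[0-9])?)#';

preg_match_all($pattern, $html, $matches);
print_r($matches[1]); // titles
print_r($matches[2]); // scores
```

With the original sub-pattern, the first row cannot match at its own image, so the lazy [\s\S]*? runs on to the next row's image and that row's title is lost, which is consistent with the missing titles in the question.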