grabbing an array value, but only the text - how? - php

I'm trying to grab the value of an array that I got from parsing an xml file (using PHP's simpleXML) so that I can throw it into a database table. The problem I'm having is that one of the array values has a div and "a" tags in it after a sentence or two (which is what I really want). I'm not sure how to grab only the text. The array value looks like this:
[0] => The central purpose and philosophy of this podcast series<div class="feedflare">
So I'm assuming that maybe I could do some kind of function that grabs the value up to the point of the "<" and stop there and throw this new variable into the database. I'm kind of a n00b with PHP so I don't even know where to start doing that. Any help is greatly appreciated.

Sounds like strip_tags() is what you're looking for. Just do:
$text = strip_tags($my_array[0]);
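One thing to keep in mind: strip_tags() removes the markup but keeps any text that sits inside the tags. If only the leading sentence is wanted, cutting at the first "<" is an alternative. A rough sketch of both, where everything after the feedflare div in the sample value is made up for illustration (the question only shows the start of it):

```php
<?php
// Sample value from the question; the <a> tag and closing tags are hypothetical.
$value = 'The central purpose and philosophy of this podcast series<div class="feedflare"><a href="#">link text</a></div>';

// strip_tags() drops the tags but keeps the text inside them.
$stripped = strip_tags($value);

// Cutting at the first "<" keeps only the text before any markup.
$pos = strpos($value, '<');
$leading = ($pos === false) ? $value : substr($value, 0, $pos);
$leading = trim($leading);

echo $stripped, "\n";
echo $leading, "\n";
```

Which one fits depends on whether the link text inside the div should end up in the database.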

Related

Accessing XML attributes data

I have two lines of XML data that have attributes but also contain data inside them, and they are repeating fields. They are being stored in a SimpleXML variable.
<inputField Type="Name">John Doe</inputField>
<inputField Type="DateOfHire">Tomorrow</inputField>
(Clearly this isn't real data, but the syntax is what is actually in my data; I'm just using string data in them.)
Everything that I've seen says to access the data like this, which I have tried and it worked perfectly. But my data is dynamic, so the data isn't always going to be in the same place, and this doesn't fit my needs.
$xmlFile->inputField[0];
$xmlFile->inputField[1];
This works fine until one of the lines is missing, and I can have anywhere from 0 to 5 lines. So what I was wondering was is there any way that I can access the data by attribute name? So potentially like this.
$xmlFile->inputField['Name'];
or
$xmlFile->inputField->Name;
I use these as examples strictly to illustrate what I'm trying to do, I am aware that neither of the above lines of code are syntactically correct.
Just a note this information is being generated externally so I cannot change the format.
If anyone needs clarification feel free to let me know and would be happy to elaborate.
Maybe like this?
echo $xmlFile->inputField->attributes()->Name;
And what are you using? DOMDocument or SimpleXML?
You don't say, but I assume you're using SimpleXMLElement?
If you want to access every item, just iterate:
foreach ($xmlFile->inputField as $inputField) { ... }
If you want to access an attribute use array notation:
$inputField['Type']
If you want to access only one specific element, use xpath:
$xmlFile->xpath('inputField[@Type="Name"]');
Perhaps you should read through the basic examples of usage in the SimpleXMLElement documentation?
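Putting those pieces together, here is a minimal sketch. The <form> wrapper element is an assumption, since the question only shows the two inputField lines:

```php
<?php
// Hypothetical <form> wrapper; only the inputField lines come from the question.
$xml = <<<XML
<form>
  <inputField Type="Name">John Doe</inputField>
  <inputField Type="DateOfHire">Tomorrow</inputField>
</form>
XML;

$xmlFile = simplexml_load_string($xml);

// Index the fields by their Type attribute, so a missing line does not shift anything.
$fields = [];
foreach ($xmlFile->inputField as $inputField) {
    $fields[(string) $inputField['Type']] = (string) $inputField;
}

// xpath() returns an array of matching elements (empty if nothing matches).
$matches = $xmlFile->xpath('inputField[@Type="Name"]');
$name = $matches ? (string) $matches[0] : null;

echo $name, "\n";
```

With the lookup array you can then test isset($fields['DateOfHire']) before using a field that may be absent.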
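For example, you can grab the data like this:
$xmlFile = simplexml_load_file($file);
foreach ($xmlFile->inputField as $res) {
    // The attribute is named Type; its value ("Name", "DateOfHire") identifies the field.
    echo $res["Type"], ": ", $res, "\n";
}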

Create value with specific parts of a text file

Ok, I am working on a flatfile shoutbox, and I am trying to achieve a way to get the username from the flatfile and making it a variable so I can use it to make a call to the database to check if the user is admin so they can delete/ban users directly from the shoutbox.
This is an example line in the flatfile
<div><i><div class='date'>12/08/2012 18:56 pm </div></i> <div class='groupAdmin'><b>Admin</b></div><b>kira423:</b> hiya :D</div>
So I wanna take the username which is kira423 in this case and create a variable such as $shoutname and make it equal kira423
I have tried a google search and looked around on here, but was unable to find an answer, so I am hoping that I can get some insight on how to do this with a question of my own here.
Thanks,
Kira
You should use preg_match for those tasks like this:
preg_match_all('|<div class=\'date\'>(?P<date>.*?) .*<a.*>(?P<user>.*)</a>|i', $data, $matches);
var_dump($matches);
Iterating through all array elements:
foreach ($matches['user'] as $key => $user) {
var_dump($user);
}
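Note that the pattern above expects an <a> tag, which the sample line in the question does not contain. Here is a sketch against the sample line as posted, assuming the username always sits in a <b>name:</b> pair right before the message:

```php
<?php
// Sample line copied from the question.
$line = "<div><i><div class='date'>12/08/2012 18:56 pm </div></i> <div class='groupAdmin'><b>Admin</b></div><b>kira423:</b> hiya :D</div>";

// Capture the date, then the name inside the <b>name:</b> pair.
// [^<:]+ stops at the colon, so <b>Admin</b> (no colon) is skipped.
$pattern = "~<div class='date'>(?P<date>.*?)\s*</div>.*<b>(?P<user>[^<:]+):</b>~i";

$shoutname = null;
$shoutdate = null;
if (preg_match($pattern, $line, $m)) {
    $shoutname = $m['user'];
    $shoutdate = $m['date'];
}

echo $shoutname, "\n";
```

If the stored markup ever changes, the pattern has to change with it, which is one more argument for storing the fields separately.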
I think you should just parse each line in the flatfile as HTML (only simple HTML tags are used), just as described in PHP Parse HTML code (or type "php parse HTML" into Google). Then you can access the username (kira423) from an array or whatever.
PS HTML is not the best way to store messages for display. Even CSV seems better: it'd be "kira423;date;some text", which is easier to read and makes each part easier to access. When displaying, use the standard decorator pattern.

mySql retrieving data between square brackets

I have strings of data in a field named content, one record may look something like:
loads of text ... [attr1] some text [attr2] more text [attr3] more text etc...
What I'm looking to do is get all the text within the square brackets; so that I can put it into a PHP array. Is this even possible with mySql?
I've seen the following post: Looking to extract data between parentheses in a string via MYSQL, but they are looking to extract only one value from between their parentheses, and I have an unknown number of them. After reading that post I've thought of doing something like the following:
SELECT substr(content,instr(content,"["), instr(content,"]")) as attrList from myTable
Which would grab me the following:
[attr1] some text [attr2] some more text [attr3]
and I can use PHP to strip the rest of the text out and then explode the string into an array, but is there a better way to do this just using mySql where I can retrieve something like:
[attr1][attr2][attr3]
I was thinking perhaps regex, but I see that just returns true or false, which doesn't help me a lot.
After even more research, I'm not sure it's possible in mySql, and I might need the results in string or array form depending on where I'm using them in my app.
So I've created a new method to return the list after I've got the data from the database (with a little help from this post: PHP: Capturing text between square brackets):
public function attrList($array=false)
{
preg_match_all("/\[.*?\]/",$this->content,$matches);
$params = str_replace(array('[',']'),'',$matches[0]);
return ($array===false) ? implode(', ',$params) : $params;
}
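For anyone who wants to try this outside the class, here is a standalone sketch of the same idea. It takes the string as a parameter instead of reading $this->content, and uses a capture group so the str_replace() step is not needed:

```php
<?php
// Standalone variant of the method above; $content replaces $this->content.
function attrList(string $content, bool $asArray = false)
{
    // The capture group grabs the text between the brackets directly.
    preg_match_all('/\[(.*?)\]/', $content, $matches);
    $params = $matches[1];

    return $asArray ? $params : implode(', ', $params);
}

// Sample record from the question.
$content = 'loads of text ... [attr1] some text [attr2] more text [attr3] more text etc...';

$asString = attrList($content);
$asArray  = attrList($content, true);

echo $asString, "\n";
```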

PHP: How can I access this XML entity when its name contains a reserved word?

I'm trying to parse this feed: http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5?type=xml&inc=url-rels
I want to grab the URLs inside the 'relation-list' tag.
I've tried fetching the URL with PHP using simplexml_load_file(), but I can't access it using $feed->artist->relation-list as PHP interprets "list" as the list() function.
I have a feeling I'm going about this wrong (not much XML experience), and even if I was able to get hold of the elements I want, I don't know how to extract their attributes (I just want the type and target fields).
Can anyone gently nudge me in the right direction?
Thanks.
Matt
Have a look at the examples on the php.net page, they actually tell you how to solve this:
// $feed->artist->relation-list
$feed->artist->{'relation-list'}
To get an attribute of a node, just use the attribute name as array index on the node:
foreach( $feed->artist->{'relation-list'}->relation as $relation ) {
$target = (string)$relation['target'];
$type = (string)$relation['type'];
// Do something with it
}
(Untested)
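Here is a small self-contained version to try it out. The XML is a simplified, made-up stand-in for the feed (no namespace, placeholder URLs); only the element and attribute names come from the question:

```php
<?php
// Hypothetical, trimmed-down stand-in for the feed; the URLs are placeholders.
$xml = <<<XML
<metadata>
  <artist>
    <relation-list>
      <relation type="Wikipedia" target="http://example.com/wiki"/>
      <relation type="Discogs" target="http://example.com/discogs"/>
    </relation-list>
  </artist>
</metadata>
XML;

$feed = simplexml_load_string($xml);

// Curly-brace syntax lets you use element names that are not valid PHP identifiers.
$links = [];
foreach ($feed->artist->{'relation-list'}->relation as $relation) {
    // Cast to string; otherwise you get SimpleXMLElement objects.
    $links[(string) $relation['type']] = (string) $relation['target'];
}

print_r($links);
```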

Assistance with building an inverted-index

It's part of an information retrieval thing I'm doing for school. The plan is to create a hashmap of words using the first two letters of the word as a key and any words with the two letters saved as a string value. So,
hashmap["ba"] = "bad barley base"
Once I'm done tokenizing a line I take that hashmap, serialize it, and append it to the text file named after the key.
The idea is that if I take my data and spread it over hundreds of files I'll lessen the time it takes to fulfill a search by lessening the density of each file. The problem I am running into is when I'm making 100+ files in each run it happens to choke on creating a few files for whatever reason and so those entries are empty. Is there any way to make this more efficient? Is it worth continuing this, or should I abandon it?
I'd like to mention I'm using PHP. The two languages I know relatively intimately are PHP and Java. I chose PHP because the front end will be very simple to do and I will be able to add features like autocompletion/suggested search without a problem. I also see no benefit in using Java. Any help is appreciated, thanks.
I would use a single file to get and put the serialized string. I would also use json as the serialization.
Put the data
$string = "bad barley base";
$data = explode(" ",$string);
$hashmap["ba"] = $data;
$jsonContent = json_encode($hashmap);
file_put_contents("a-z.txt",$jsonContent);
Get the data
$jsonContent = file_get_contents("a-z.txt");
$hashmap = json_decode($jsonContent);
foreach($hashmap as $firstTwoCharacters => $value) {
if ($firstTwoCharacters == 'ba') {
$wordCount = count($value);
}
}
You didn't explain the problem you are trying to solve. I'm guessing you are trying to make a full text search engine, but you don't have document ids in your hashmap so I'm not sure how you are using the hashmap to find matching documents.
Assuming you want a full text search engine, I would look into using a trie for the data structure. You should be able to fit everything in it without it growing too large. Nodes that match a word you want to index would contain the ids of the documents containing that word.
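To make the trie idea a bit more concrete, here is a rough sketch. The node layout (nested arrays with 'children' and 'docs' keys), the function names, and the sample documents are all assumptions for illustration, not a tuned implementation:

```php
<?php
// Minimal trie sketch: each node is an array with 'children' and 'docs'.
// Words are split byte-wise, so this assumes single-byte characters.
function trieInsert(array &$root, string $word, int $docId): void
{
    $node = &$root;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            $node['children'][$ch] = ['children' => [], 'docs' => []];
        }
        $node = &$node['children'][$ch];
    }
    // Use the id as key so each document is stored once per word.
    $node['docs'][$docId] = true;
}

function trieLookup(array $root, string $word): array
{
    $node = $root;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            return [];
        }
        $node = $node['children'][$ch];
    }
    return array_keys($node['docs']);
}

// Hypothetical documents, keyed by document id.
$documents = [1 => 'bad barley', 2 => 'barley base'];

$trie = ['children' => [], 'docs' => []];
foreach ($documents as $docId => $text) {
    foreach (explode(' ', $text) as $word) {
        trieInsert($trie, $word, $docId);
    }
}

$hits = trieLookup($trie, 'barley');
print_r($hits);
```

The whole structure can be serialized to a single file with json_encode(), which avoids creating hundreds of small files per run.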
