Create value with specific parts of a text file - php

Ok, I am working on a flatfile shoutbox, and I am trying to find a way to get the username from the flatfile and store it in a variable, so I can use it in a database query that checks whether the user is an admin. Admins should be able to delete/ban users directly from the shoutbox.
This is an example line in the flatfile
<div><i><div class='date'>12/08/2012 18:56 pm </div></i> <div class='groupAdmin'><b>Admin</b></div><b>kira423:</b> hiya :D</div>
So I wanna take the username which is kira423 in this case and create a variable such as $shoutname and make it equal kira423
I have tried a google search and looked around on here, but was unable to find an answer, so I am hoping that I can get some insight on how to do this with a question of my own here.
Thanks,
Kira

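You should use preg_match (or preg_match_all) for tasks like this. In your example line the username sits in the <b> element that ends with a colon, so the pattern can capture it like this:
preg_match_all('|<div class=\'date\'>(?P<date>.*?)\s*</div>.*?<b>(?P<user>[^<:]+):</b>|i', $data, $matches);
var_dump($matches);
Iterating through all matched usernames:
foreach ($matches['user'] as $key => $user) {
    var_dump($user);
}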
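As a minimal sketch of the same idea for a single line, assuming every line has the shape shown in the question: the helper name extractShoutName is made up for illustration, and the pattern relies on the username being the only <b> element whose text ends with a colon.

```php
<?php
// Sample line copied from the question; in practice it would be read from the flatfile.
$line = "<div><i><div class='date'>12/08/2012 18:56 pm </div></i> <div class='groupAdmin'><b>Admin</b></div><b>kira423:</b> hiya :D</div>";

// Hypothetical helper: returns the username from one shoutbox line, or null if none is found.
function extractShoutName(string $line): ?string
{
    // "<b>Admin</b>" is skipped because its text is not followed by a colon.
    if (preg_match('~<b>([^<:]+):</b>~', $line, $m) === 1) {
        return $m[1];
    }
    return null;
}

$shoutname = extractShoutName($line);
```

$shoutname can then be passed to the database lookup (ideally as a bound parameter of a prepared statement) to check the user's group.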

I think you should just parse each line in the flatfile as HTML (only simple HTML tags are used), as described in "PHP Parse HTML code" (or type "php parse HTML" into Google). Then you can access the username (kira423) from an array or whatever structure the parser gives you.
PS: HTML is not the best way to store messages for display. Even CSV would be better, e.g. "kira423;date;some text". It is easier to read and each part is easier to access. When displaying, use the standard decorator pattern.
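A rough sketch of the delimited storage idea, assuming one shout per line in the order user;date;text. The function name parseShoutLine and the field order are assumptions made for illustration, not something the shoutbox already uses.

```php
<?php
// Hypothetical helper: splits one "user;date;text" line into named parts.
function parseShoutLine(string $line): array
{
    // Limit of 3 keeps any further semicolons inside the message text.
    $parts = explode(';', rtrim($line, "\r\n"), 3);
    // Pad so that short or malformed lines still yield three fields.
    [$user, $date, $text] = array_pad($parts, 3, '');

    return ['user' => $user, 'date' => $date, 'text' => $text];
}

$shout = parseShoutLine("kira423;12/08/2012 18:56;hiya :D; how are you?\n");
$shoutname = $shout['user'];
```

With this layout the username is available without any HTML parsing, and the markup is added only when the shout is rendered.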

Related

extract info from jpeg with PHP

I want to extract variable-length pieces of information from a JPEG file using PHP, but it is not EXIF data.
If I open the JPEG with a simple text editor, I can see that the information I want is at the end of the file and separated by \00.
Like this:
\00DATA\00DATA\00DATA\00DATA\000\00DATA
Now if I use PHP's file_get_contents() to load the file into a string, the dividers \00 are gone and other symbols show up.
Like so:
ÿëžDATADATADATADATADATA ÿÙ
Could somebody please explain:
Why do the \00 dividers vanish?
How can I get the information using PHP?
EDIT
The question is solved, but for those seeking a smarter solution, here is the file I try to obtain the DATA parts from: https://www.dropbox.com/s/5cwnlh2kadvi6f7/test-img.jpg?dl=0 (yes I know its corrupted)
Use $data = exif_read_data("PATH/some.jpg") instead; it returns the image's header data. You can check the manual here - http://php.net/manual/en/function.exif-read-data.php
I came up with a solution on my own. May not be pretty, but works for me.
Using urlencode(file_get_contents()) I was able to retrieve the \00 parts as %00.
So now it reads like this:
%00DATA%00DATA%00DATA%00DATA%000%00DATA
I can split the string at the %00 parts.
I am going to accept this answer, once SO lets me do so and nobody comes up with a better solution.
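A possible alternative that skips the urlencode() step: the \00 shown by the editor is a NUL byte, which is still present in the string returned by file_get_contents() but is not printable, so it only looks as if it vanished. The sketch below splits directly on that byte; the function name splitOnNul and the sample bytes are made up, and in practice the input would be the string read from the file.

```php
<?php
// Hypothetical helper: splits binary data on NUL bytes and drops empty pieces.
function splitOnNul(string $bytes): array
{
    $pieces = explode("\0", $bytes);
    // Compare against '' explicitly so that a field containing just "0" is kept.
    return array_values(array_filter($pieces, fn(string $p): bool => $p !== ''));
}

// Stand-in for the tail of the file, built the same way as the example in the question.
$tail = "\0" . implode("\0", ['DATA', 'DATA', 'DATA', 'DATA', '0', 'DATA']);
$fields = splitOnNul($tail);
```

On a real file the first piece would contain the image bytes that precede the first NUL, so it would have to be discarded or trimmed separately.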

Searching for a link in a website and displaying it PHP

Hello, I'm a newbie in PHP. I am trying to make a search function using PHP that only searches inside the website, without any database.
Basically, if I search for a string such as "Health", it should display the lines
The Joys of Health
Healthy Diets
This snippet is the only thing I could find that, if properly coded, would output the "lines" I want:
$myPage = array("directory.php","pages.php");
$lines = file($myPage[n]);
echo $lines[n];
I haven't tried it yet, but before I do I want to ask: is there a better way to do this?
If my files have too many lines, won't it put a lot of load on the server?
The file() function will return an array. You should use file_get_contents() instead, as it returns a string.
Then, use regular expressions to find specific text within a link.
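A minimal sketch of that approach, assuming the links are ordinary <a href="..."> tags. The function name findLinks and the sample markup are made up; the HTML string would normally come from file_get_contents().

```php
<?php
// Hypothetical helper: returns every link whose visible text contains $term (case-insensitive).
function findLinks(string $html, string $term): array
{
    $hits = [];
    // Capture the quoted href value and the link text of each <a> element.
    preg_match_all('~<a\b[^>]*\bhref=(["\'])(.*?)\1[^>]*>(.*?)</a>~is', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $text = trim(strip_tags($m[3]));
        if (stripos($text, $term) !== false) {
            $hits[] = ['href' => $m[2], 'text' => $text];
        }
    }
    return $hits;
}

// Stand-in for the contents of one page.
$html = '<ul><li><a href="joys.php">The Joys of Health</a></li>'
      . '<li><a href="diets.php">Healthy Diets</a></li>'
      . '<li><a href="contact.php">Contact</a></li></ul>';
$results = findLinks($html, 'Health');
```

A regular expression is fine for simple, predictable markup like this, but it is not a general HTML parser.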
Your goal is fine, but the method you're thinking about is not. The file() function reads a file line by line and puts each line into an array. This assumes the HTML is laid out in a human-readable fashion, one item per line, which is not always the case. However, if you're the one providing the HTML and you make sure the structure is well defined, that's OK. Here is a complete version of the example you provided (keep in mind it's the 'wrong' way of solving your problem, but if you want to follow that pattern, it works):
function pagesearch(array $pages, string $string): array {
    $tags = [];
    if (empty($pages) || $string === '') {
        return $tags;
    }
    foreach ($pages as $page) {
        // file() returns false when the page cannot be read
        $lines = file($page);
        if ($lines === false) {
            continue;
        }
        foreach ($lines as $line) {
            // mb_strpos() returns 0 for a match at the very start of the line,
            // so compare against false instead of relying on truthiness
            if (mb_strpos($line, $string) !== false) {
                $tags[$page][] = $line;
            }
        }
    }
    return $tags;
}
This will return you an array with all the pages you referenced with all occurrences of the word you look for, separated by page. As I said, it's not the way you want to solve this, but it's a way.
Hope that helps
You say you do not want to use any database, but the term database is very broad and includes the file system, so in effect you want to search a database without having a database.
That makes no sense. In your case one database at least is the file-system. If you can accept the fact that you want to search a database (here your html files) but you do not want to use a database to store anything related to the search (e.g. some index or cached results), then what you suggest is basically how it is working: A real-time, text-based, line-by-line file-search.
Sure it is very rudimentary but as your constraint is "no database", you have already found the only possible way. And yes it will stress your server when used because real-time search is expensive.
Otherwise, Lucene/Solr is normally used for the job, but that is a database, and even a server.

Accessing XML attributes data

I have two lines of XML data that have attributes but also contain data inside them, and they are repeating fields. They are being stored in a SimpleXML variable.
<inputField Type="Name">John Doe</inputField>
<inputField Type="DateOfHire">Tomorrow</inputField>
(Clearly this isnt real data but the syntax is actually in my data and I'm just using string data in them)
Everything that I've seen says to access the data like this, which I have tried and it worked perfectly. But my data is dynamic, so the data isn't always going to be in the same place, and this doesn't fit my needs.
$xmlFile->inputField[0];
$xmlFile->inputField[1];
This works fine until one of the lines is missing, and I can have anywhere from 0 to 5 lines. So what I was wondering is: is there any way that I can access the data by attribute value? Potentially like this:
$xmlFile->inputField['Name'];
or
$xmlFile->inputField->Name;
I use these as examples strictly to illustrate what I'm trying to do, I am aware that neither of the above lines of code are syntactically correct.
Just a note this information is being generated externally so I cannot change the format.
If anyone needs clarification feel free to let me know and would be happy to elaborate.
Maybe like this?
echo $xmlFile->inputField->attributes()->Name;
And what are you using? DOMDocument or SimpleXML?
You don't say, but I assume you're using SimpleXMLElement?
If you want to access every item, just iterate:
foreach ($xmlFile->inputField as $inputField) { ... }
If you want to access an attribute use array notation:
$inputField['Type']
If you want to access only one specific element, use xpath:
$xmlFile->xpath('inputField[@Type="Name"]');
Perhaps you should read through the basic examples of usage in the SimpleXMLElement documentation?
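Putting the xpath approach together as a small sketch: the wrapper element name form and the helper name fieldByType are assumptions made for illustration, and the helper expects a type value that contains no double quotes.

```php
<?php
// Hypothetical document; only the two inputField lines come from the question.
$xmlFile = simplexml_load_string(
    '<form>'
    . '<inputField Type="Name">John Doe</inputField>'
    . '<inputField Type="DateOfHire">Tomorrow</inputField>'
    . '</form>'
);

// Hypothetical helper: returns the text of the inputField with the given Type, or null.
function fieldByType(SimpleXMLElement $xml, string $type): ?string
{
    // xpath() returns an array of matching elements (empty when nothing matches).
    $matches = $xml->xpath('inputField[@Type="' . $type . '"]');
    return $matches ? (string) $matches[0] : null;
}

$name = fieldByType($xmlFile, 'Name');
$hired = fieldByType($xmlFile, 'DateOfHire');
$missing = fieldByType($xmlFile, 'Salary');
```

Because the lookup is by attribute value, it keeps working when some of the 0 to 5 lines are absent or arrive in a different order.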
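For example, you can grab the data like this:
$xmlFile = simplexml_load_file($file);
foreach ($xmlFile->inputField as $res) {
    // $res['Type'] is the attribute; casting $res to string gives the element text
    echo $res['Type'] . ': ' . (string) $res . "\n";
}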

grabbing an array value, but only the text - how?

I'm trying to grab the value of an array that I got from parsing an xml file (using PHP's simpleXML) so that I can throw it into a database table. The problem I'm having is that one of the array values has a div and "a" tags in it after a sentence or two (which is what I really want). I'm not sure how to grab only the text. The array value looks like this:
[0] => The central purpose and philosophy of this podcast series<div class="feedflare">
So I'm assuming that maybe I could do some kind of function that grabs the value up to the point of the "<" and stop there and throw this new variable into the database. I'm kind of a n00b with PHP so I don't even know where to start doing that. Any help is greatly appreciated.
Sounds like strip_tags() is what you're looking for. Just do:
$text = strip_tags($my_array[0]);
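Note that strip_tags() removes the tags but keeps any text inside them, so link text from the "a" tags would remain. If only the sentence before the first tag is wanted, as the question suggests, a sketch like the following may fit better. The function name textBeforeFirstTag is made up, and the tail of the sample value is invented because the question truncates it.

```php
<?php
// Hypothetical helper: returns everything before the first "<", trimmed.
function textBeforeFirstTag(string $value): string
{
    $pos = strpos($value, '<');
    // No tag at all: keep the whole value.
    return trim($pos === false ? $value : substr($value, 0, $pos));
}

// Stand-in value; the markup after the sentence is invented for the example.
$my_array = ['The central purpose and philosophy of this podcast series<div class="feedflare"><a href="#">Share</a></div>'];
$text = textBeforeFirstTag($my_array[0]);
```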

Assistance with building an inverted-index

It's part of an information retrieval thing I'm doing for school. The plan is to create a hashmap of words using the first two letters of the word as a key, with all words starting with those two letters saved as a string value. So,
hashmap["ba"] = "bad barley base"
Once I'm done tokenizing a line I take that hashmap, serialize it, and append it to the text file named after the key.
The idea is that if I take my data and spread it over hundreds of files I'll lessen the time it takes to fulfill a search by lessening the density of each file. The problem I am running into is when I'm making 100+ files in each run it happens to choke on creating a few files for whatever reason and so those entries are empty. Is there any way to make this more efficient? Is it worth continuing this, or should I abandon it?
I'd like to mention I'm using PHP. The two languages I know relatively intimately are PHP and Java. I chose PHP because the front end will be very simple to do and I will be able to add features like autocompletion/suggested search without a problem. I also see no benefit in using Java. Any help is appreciated, thanks.
I would use a single file to get and put the serialized string. I would also use json as the serialization.
Put the data
$string = "bad barley base";
$data = explode(" ",$string);
$hashmap["ba"] = $data;
$jsonContent = json_encode($hashmap);
file_put_contents("a-z.txt",$jsonContent);
Get the data
$jsonContent = file_get_contents("a-z.txt");
// Pass true so JSON objects are decoded to associative arrays
$hashmap = json_decode($jsonContent, true);
// Look the key up directly instead of looping over every entry
$wordCount = isset($hashmap['ba']) ? count($hashmap['ba']) : 0;
You didn't explain the problem you are trying to solve. I'm guessing you are trying to make a full text search engine, but you don't have document ids in your hashmap so I'm not sure how you are using the hashmap to find matching documents.
Assuming you want a full text search engine, I would look into using a trie for the data structure. You should be able to fit everything in it without it growing too large. Nodes that match a word you want to index would contain the ids of the documents containing that word.
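A rough sketch of such a trie using nested PHP arrays, assuming integer document ids and words that are already tokenized and lowercased. The function names trieInsert and trieLookup and the node layout (children, docs) are made up for illustration.

```php
<?php
// Hypothetical helper: adds $word to the trie and records that $docId contains it.
function trieInsert(array &$trie, string $word, int $docId): void
{
    if ($word === '') {
        return;
    }
    $node = &$trie;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            $node['children'][$ch] = ['children' => [], 'docs' => []];
        }
        // Walk down one level, keeping a reference so changes land in $trie.
        $node = &$node['children'][$ch];
    }
    // Store ids as keys so that repeated inserts do not create duplicates.
    $node['docs'][$docId] = true;
}

// Hypothetical helper: returns the ids of documents containing exactly $word.
function trieLookup(array $trie, string $word): array
{
    if ($word === '') {
        return [];
    }
    $node = $trie;
    foreach (str_split($word) as $ch) {
        if (!isset($node['children'][$ch])) {
            return [];
        }
        $node = $node['children'][$ch];
    }
    return array_keys($node['docs']);
}

$trie = ['children' => [], 'docs' => []];
trieInsert($trie, 'bad', 1);
trieInsert($trie, 'barley', 2);
trieInsert($trie, 'bad', 3);
$docsWithBad = trieLookup($trie, 'bad');
```

The whole structure can be serialized once with json_encode() into a single file, which avoids creating hundreds of small files per run.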
