-
October 16th, 2008, 04:35 PM
#1
[RESOLVED] Need help on data extract script
I need help extracting info from an HTML table.
The table has 5 columns and many rows. I want to extract the table information into an array so I can save it in a database.
This is the HTML code I'm dealing with:
Code:
<tr>
<td bgcolor=#FFFFFF><b>
Data1 </b></td>
<td bgcolor=#FFFFFF>Data2</td>
<td bgcolor=#FFFFFF colspan="2">Data3</td>
<td bgcolor=#FFFFFF>Data4</td>
<td bgcolor=#FFFFFF>Data5</td>
</tr>
<tr>
<td bgcolor=#FFFFFF><b>
Data1 </b></td>
<td bgcolor=#FFFFFF>Data2</td>
<td bgcolor=#FFFFFF colspan="2">Data3</td>
<td bgcolor=#FFFFFF>Data4</td>
<td bgcolor=#FFFFFF>Data5</td>
</tr>
PHP should be able to handle this easily with preg_match_all(), but I'm unable to come up with a regular expression for this one. Please help!
Thanks for your attention!
Last edited by bubu; October 16th, 2008 at 04:38 PM.
All consequences are eternal in some way.
-
October 16th, 2008, 06:07 PM
#2
Re: Need help on data extract script
This type of thing gets *really* ugly as you go along, since slight changes to the page will often break the extraction code.
Basically, you'll need to match across each <tr>...</tr> pair (using a multi-line match) and then iterate over the internal contents as needed.
How structured is the page? The version shown above doesn't make it particularly easy to parse.
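A minimal sketch of that two-pass idea, in case it helps. The sample HTML and the variable names ($rowMatches, $cellMatches, $rows) are made up for illustration, and I'm reading "multi-line match" as the PCRE "s" modifier, which lets "." match line breaks. Treat it as a starting point under those assumptions, not a robust parser.

```php
<?php
// Hypothetical sample input; the markup of the real page may differ.
$html = <<<HTML
<tr>
<td bgcolor=#FFFFFF><b>
Data1 </b></td>
<td bgcolor=#FFFFFF>Data2</td>
</tr>
<tr>
<td bgcolor=#FFFFFF>Data3</td>
<td bgcolor=#FFFFFF>Data4</td>
</tr>
HTML;

$rows = array();
// First pass: one match per <tr>...</tr>.
// "s" lets "." match line breaks, "i" ignores tag case,
// and the lazy ".*?" stops at the first closing </tr>.
if (preg_match_all('/<tr\b[^>]*>(.*?)<\/tr>/is', $html, $rowMatches)) {
    foreach ($rowMatches[1] as $rowHtml) {
        // Second pass: pull the cells out of this row only.
        preg_match_all('/<td\b[^>]*>(.*?)<\/td>/is', $rowHtml, $cellMatches);
        $cells = array();
        foreach ($cellMatches[1] as $cellHtml) {
            // Drop inner tags such as <b> and surrounding whitespace.
            $cells[] = trim(strip_tags($cellHtml));
        }
        $rows[] = $cells;
    }
}
print_r($rows);
```

Matching rows first means a missing or extra cell only affects its own row, at the cost of a second regular expression.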
-
October 16th, 2008, 07:12 PM
#3
Re: Need help on data extract script
Thank you so much for your reply! I've been trying all day. I've tried the craziest regular expressions from books, Google, the PHP manual and my own head, with no success. =(
I'm trying to get a list of items from this page:
http://www.ittf.com/ittf_equipment/R...Company=ANDRO&
There's a table with items; each row in the HTML table would be a row in the database table. So I need to get the data in the TDs. But I can't even get the TRs!
I've been trying something like:
Code:
$html = file_get_contents('http://www.ittf.com/ittf_equipment/Racket_Coverings1.asp?s_Company=ANDRO');
// Clean out unused HTML to leave only the table TRs:
$start = strpos($html, '<tr', strpos($html, 'Rubber ID Stamp'));
$stop = strpos($html, '<td colspan="6" bgcolor="#CCCCCC">', $start);
// strpos() returns false when nothing is found, so compare strictly.
if ($stop === false)
{
    $stop = strpos($html, '</table>', $start);
    if ($stop === false)
    {
        echo 'Error while parsing HTML.';
        exit;
    }
    else
    {
        $stop = $stop - 10;
    }
}
else
{
    $stop = $stop - 20;
}
$html = substr($html, $start, $stop - $start);
$html = str_replace("\r\n", "", $html);
// Finished cleaning the HTML. Now $html has only the important data.
// OUR REGULAR EXPRESSION DOESN'T WORK (insert desperate screams here)
$pattern = '|<tr>(.*)</tr>|i';
if (!preg_match_all($pattern, $html, $matches))
{
    echo '<br>Error: No item found.';
    exit;
}
print_r($matches);
=(
Last edited by bubu; October 16th, 2008 at 07:24 PM.
-
October 16th, 2008, 07:22 PM
#4
Re: Need help on data extract script
Since you know that it is always 5 across, that makes it very simple. The following script should get you more than started.
PHP Code:
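// $contents is assumed to hold the page HTML (for example, $html from the previous post).
// Strip line breaks so that no cell is split across lines.
$contents = str_replace(array("\r", "\n"), '', $contents);
// Capture the contents of every <td>...</td>, whatever attributes the tag has.
preg_match_all('/<td\b[^>]*>(.*?)<\/td>/i', $contents, $matches);
$row = 1;
$column = 1;
foreach ($matches[1] as $match) {
    $match = trim(strip_tags($match));
    echo $row . '-' . $column . ': ' . $match . '<br />';
    $column++;
    // The table is 5 cells wide, so wrap to the next row after the fifth cell.
    if ($column == 6) {
        $column = 1;
        $row++;
    }
}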
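The lazy `.*?` in the pattern above is doing a lot of the work. Here is a small comparison with the greedy `(.*)` from the earlier post, run on a made-up single-line sample (the $html string and the $greedy/$lazy names are only for illustration). The real page may also have attributes on its <tr> tags, which a literal `<tr>` would not match, but I haven't checked that.

```php
<?php
// Hypothetical two-row sample on one line (line breaks already stripped).
$html = '<tr><td>A</td></tr><tr><td>B</td></tr>';

// Greedy: ".*" runs on to the LAST </tr>, so both rows come back as one match.
preg_match_all('|<tr>(.*)</tr>|i', $html, $greedy);

// Lazy: ".*?" stops at the FIRST </tr>, giving one match per row.
preg_match_all('|<tr>(.*?)</tr>|i', $html, $lazy);

echo count($greedy[1]) . "\n"; // prints 1
echo count($lazy[1]) . "\n";   // prints 2
```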
If the post was helpful...Rate it! Remember to use [code] or [php] tags.
-
October 16th, 2008, 07:37 PM
#5
Re: Need help on data extract script
PeejAvery! You probably saved me hours of head bashing! Thank you!
I changed a few small things so that I get the whole table in one array, like I wanted:
Code:
$row = 1;
$column = 1;
$items = array();
foreach ($matches[1] as $match)
{
    $items[$row][$column] = trim($match);
    $column++;
    if ($column == 6)
    {
        $column = 1;
        $row++;
    }
}
print_r($items);
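For what it's worth, the same grouping could probably be done without the counters by using array_chunk(). A sketch, assuming a flat list of cell values like the one in $matches[1] (the $cells sample data here is made up):

```php
<?php
// Hypothetical flat list of cell values, standing in for $matches[1].
$cells = array(' a1', 'a2 ', 'a3', 'a4', 'a5', 'b1', 'b2', 'b3', 'b4', 'b5');

// Trim every value, then split the list into rows of 5 cells each.
// Note: the keys are 0-based here, unlike the 1-based loop above.
$items = array_chunk(array_map('trim', $cells), 5);

print_r($items);
```

One difference to keep in mind: if the cell count is not a multiple of 5, the last row simply comes out shorter.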
Thank you very much PeejAvery! Thank you also mmetzger for your attention.
-
October 16th, 2008, 08:01 PM
#6
Re: Need help on data extract script
You're most welcome. Glad I could save you so much stress.