erxuan
September 4th, 2002, 07:28 PM
I was asked to write a program that constructs the tag tree for a given HTML file. For example, the tree has <HTML> as its root, which has two children, <HEAD> and <BODY>.
How can I construct the tree efficiently? Are there any useful resources I can use to write this program?
I need to finish it in a very limited time. Thanks. :confused:
poccil
September 4th, 2002, 07:56 PM
Try this page to get started:
http://codeguru.earthweb.com/data-misc/C-XML.html
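HTML is closely related to XML, so an XML parser is a reasonable starting point if your files are well-formed (strictly speaking, only XHTML is guaranteed to be valid XML).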
Bob Davis
September 4th, 2002, 07:56 PM
It sounds like you may want to code your own data structure to hold the tree. If you need a general tree (any number of children per node), a common approach is to represent it as a binary tree using the "first child, next sibling" representation, which is handy if you already have a binary tree class you can wrap. For any node, the left link points to its first child and the right link points to its next sibling on the same level. I did this once in Java, but the code was sloppy because I based it on a poorly written example from a book. If that isn't clear, someone else may be able to clarify. I don't know of any general tree classes available on the Web, but once you have the structure written, the algorithm to build the tree should be fairly straightforward.
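For what it's worth, here is a minimal sketch of the "first child, next sibling" idea in Python, in case seeing it written out helps. It is not the Java code mentioned above. The names Node, TagTreeBuilder, build_tag_tree, dump and VOID_TAGS are made up for illustration, the last_child pointer is an extra I added so that appending a child does not have to walk the sibling chain, and "#document" is a synthetic root that sits above <HTML>. Tag events come from the standard library's html.parser module, which reports tag names in lowercase. The list of tags without a closing tag is deliberately partial.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (partial list, assumed for this sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """General-tree node stored with two links, like a binary tree node."""

    def __init__(self, tag):
        self.tag = tag
        self.first_child = None   # the "left" link
        self.next_sibling = None  # the "right" link
        self.last_child = None    # extra pointer so add_child is O(1)

    def add_child(self, child):
        if self.first_child is None:
            self.first_child = child
        else:
            self.last_child.next_sibling = child
        self.last_child = child

    def children(self):
        # Follow the first-child link once, then the sibling chain.
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


class TagTreeBuilder(HTMLParser):
    """Keeps a stack of currently open tags while the parser reports events."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")  # synthetic root above <html>
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].add_child(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray closing tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def build_tag_tree(html):
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Return the tree as indented lines, one tag per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children():
        lines.extend(dump(child, depth + 1))
    return lines
```

Calling print("\n".join(dump(build_tag_tree(text)))) on the contents of an HTML file prints one tag per line, indented by depth. Each tag is visited once, so the build is linear in the size of the input for well-nested markup. Real-world HTML with missing closing tags will need more careful handling than this sketch provides.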