  1. #1
    Join Date
    Sep 1999
    Location
    Upstate NY
    Posts
    6

    Need advice on an application i need to write

    First off, let me say that I am very new to Java. I took an intro to Java class at college, but that mainly covered creating applets and painting pictures to web pages. I have done NO Java development since the intro class (18+ months ago); I have been doing all C++. Now the company I work for needs an application written, and they are talking about doing it in Java. The application will simply go to a web page and validate all of the links throughout the entire web page.

    The problem is that we are writing the application specifically for a web page we created, and that web page is extremely large and complicated. When you go to the main page, you first log in with a username and password, and then a web page is created on the fly after the Domino server retrieves information about you from a Notes database. Every web page, except the main page, is created on the fly with data from the Notes database.

    Does anyone have any advice they could give me for this? I am not working alone, but as far as skills go I am VERY far behind everyone else. Is there specific information I will need in order to interface with Notes/Domino? Will a good book on network programming with Java cover it? Are there any online resources that could help me get up to speed for this project? I know this is asking a lot, and I appreciate any and all advice or suggestions anyone has, even if it's "run out of the building screaming" :)


  2. #2
    Join Date
    May 1999
    Location
    Pune, MH, India.
    Posts
    453

    Re: Need advice on an application i need to write

    Well, I don't know of any book on this issue. But when you say all the web pages are generated on the fly by the web server, there should be some kind of CGI program creating the pages. It may be Perl, ISAPI, or even ASP.

    But it's no different from retrieving a static web page from the server, because the CGI program does all the work for you and returns the HTML code for the page.

    So if you are retrieving a web page as...


    URL urlPage = new URL( "http://microsoft.com/default.htm");




    The method for retrieving dynamic pages from CGI is the same, e.g.


    URL urlPage = new URL( "http://site_address/pageGenerator.pl");




    Both will return HTML code. You don't have to access the database; that's done by the CGI program.
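
    Since a generated page comes back over HTTP like any other, a link check can be as small as requesting the URL and looking at the response status. What follows is only a minimal sketch, not a finished checker: the names `LinkCheck`, `isOkStatus` and `isReachable` are made up for illustration, the 5-second timeouts are arbitrary, and it assumes the server answers HEAD requests. It does not deal with the login step from the original question; pages behind the login would probably need a session established first.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class LinkCheck {
    // Assumption for this sketch: 2xx and 3xx mean the link works,
    // 4xx and 5xx mean it is broken.
    static boolean isOkStatus(int status) {
        return status >= 200 && status < 400;
    }

    // Sends a HEAD request and reports whether the link looks valid.
    // Any failure (malformed address, DNS error, refused connection,
    // timeout, non-HTTP URL) is counted as a broken link.
    static boolean isReachable(String address) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) new URL(address).openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000); // arbitrary timeout, in milliseconds
            conn.setReadTimeout(5000);
            try {
                return isOkStatus(conn.getResponseCode());
            } finally {
                conn.disconnect();
            }
        } catch (IOException | ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Each command-line argument is treated as one address to check.
        for (String address : args) {
            String verdict = isReachable(address) ? "OK      " : "BROKEN  ";
            System.out.println(verdict + address);
        }
    }
}
```

    It would be run as `java LinkCheck` followed by one or more addresses. Some servers refuse HEAD, so falling back to a normal GET for those is a reasonable variation.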

    - UnicMan
    http://members.tripod.com/unicman

  3. #3
    Join Date
    Sep 1999
    Location
    Dubai, UAE
    Posts
    38

    Re: Need advice on an application i need to write

    Hi, check out the JDC Tech Tips issue dated Sept 23. It is useful provided you are using, or intend to use, Java 2 or a version of Swing that supports EditorKits. For your convenience I am enclosing the related excerpt. It will just give you a set of links; you'll have to maintain a list, go through each link, decide whether you want to check it or not, do the check if so, and continue... 90% likely with some recursive function.
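
    The loop described above (keep a list, visit each link, decide, check, continue) can also be written with a work queue instead of recursion, which avoids deep call stacks on a large site. This is a rough sketch under stated assumptions: `LinkWalker`, `walk`, `linksOf` and `shouldFollow` are hypothetical names, and the link extraction and the "do I follow this?" decision are passed in as functions so the example can run against a small in-memory site instead of a real server.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

class LinkWalker {
    // Visits every page reachable from 'start', each one exactly once.
    // 'linksOf' returns the links found on a page; 'shouldFollow' decides
    // whether a link is worth visiting (for example, same site only).
    static Set<String> walk(String start,
                            Function<String, List<String>> linksOf,
                            Predicate<String> shouldFollow) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(start);
        while (!pending.isEmpty()) {
            String page = pending.remove();
            if (!visited.add(page)) {
                continue; // already handled; this also breaks link cycles
            }
            for (String link : linksOf.apply(page)) {
                if (shouldFollow.test(link) && !visited.contains(link)) {
                    pending.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Tiny in-memory "site" standing in for real pages.
        Map<String, List<String>> site = new HashMap<>();
        site.put("/home", Arrays.asList("/a", "/b", "http://elsewhere/"));
        site.put("/a", Arrays.asList("/home", "/b"));

        Set<String> seen = walk(
            "/home",
            page -> site.getOrDefault(page, Collections.emptyList()),
            link -> link.startsWith("/")); // follow local links only

        System.out.println(seen); // prints [/home, /a, /b]
    }
}
```

    In a real checker, `linksOf` would fetch and parse the page, and the place where a page is taken off the queue is where the actual link check would go.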

    ---------- Begin excerpt from the Tech Tips ----------
    There are many applications that fetch an HTML page from the Web
    and then extract the links from the page. For example, a
    link-checker application fetches a page, extracts the links, and
    then checks the links to see if they refer to actual pages.

    The HTML 3.2 support in the Java(tm) 2 platform makes it fairly easy
    to find and parse links. This tip demonstrates how to use that
    support.

    The first step is to create an editor kit. The purpose of an editor
    kit is to parse data in some format, such as HTML or RTF, and store
    the information in a data structure that fully represents the data.
    This data structure, called a Document, allows you to examine and
    modify the data in a convenient way.

    Let's look at an example. In the following example program, we're
    going to examine the HTML data in a Document object. The program
    looks for A (anchor) tags and extracts the HREF attribute information
    from these tags.


    import java.io.*;
    import java.net.*;
    import javax.swing.text.*;
    import javax.swing.text.html.*;

    class GetLinks {
        public static void main(String[] args) {
            EditorKit kit = new HTMLEditorKit();
            Document doc = kit.createDefaultDocument();

            // The Document class does not yet handle charsets properly.
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            try {
                // Create a reader on the HTML content.
                Reader rd = getReader(args[0]);

                // Parse the HTML.
                kit.read(rd, doc, 0);

                // Iterate through the elements of the HTML document.
                ElementIterator it = new ElementIterator(doc);
                javax.swing.text.Element elem;
                while ((elem = it.next()) != null) {
                    SimpleAttributeSet s = (SimpleAttributeSet)
                        elem.getAttributes().getAttribute(HTML.Tag.A);
                    if (s != null) {
                        System.out.println(s.getAttribute(HTML.Attribute.HREF));
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
                System.exit(1);  // report failure to the caller
            }
            System.exit(0);  // normal completion
        }

        // Returns a reader on the HTML data. If 'uri' begins
        // with "http:", it's treated as a URL; otherwise,
        // it's assumed to be a local filename.
        static Reader getReader(String uri) throws IOException {
            if (uri.startsWith("http:")) {
                // Retrieve from Internet.
                URLConnection conn = new URL(uri).openConnection();
                return new InputStreamReader(conn.getInputStream());
            } else {
                // Retrieve from file.
                return new FileReader(uri);
            }
        }
    }





    This program takes one parameter from the command line. If the
    parameter starts with "http:", the program treats the parameter as
    a URL and fetches the HTML from that URL. Otherwise, the parameter
    is treated as a filename and the HTML is fetched from that file.

    For example,

    $ java GetLinks http://java.sun.com

    retrieves the HTML from the main page at java.sun.com.

    The editor kit is an HTMLEditorKit object that contains an HTML
    parser. It creates a Document object that can represent HTML. And
    it's the editor kit's read() method that parses the HTML and stores
    the information in the Document.

    Once the HTML data is saved in the Document object, we're ready to
    look for links. This is done by creating an iterator (using
    ElementIterator) that iterates over all the visible text pieces
    (called elements) in the HTML. For each text piece, we check to see
    if it has been formatted for linking, in other words, whether the
    text is formatted with the A (anchor) tag. We do this by calling
    getAttributes().getAttribute(HTML.Tag.A). If the text piece has been
    formatted with the A tag, the method call returns the set of
    attributes of the A tag used to format that text piece. Otherwise
    the method call simply returns null.

    Note: The name getAttributes() is a little confusing because it has
    nothing to do with HTML attributes; the "attributes" in this case
    are all the HTML tags (such as an A tag) that were used to format
    that text piece.
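
    (An aside that is not part of the Tech Tips text.) The same lookup can be wrapped so that it returns the links as a list instead of printing them, which is what a link checker needs. The sketch below makes a few assumptions of its own: `HrefCollector` and `hrefs` are invented names, the HTML is passed in as a String, duplicates are dropped, and anchors that have no HREF (for example `<a name="top">`) are skipped rather than reported as null.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;

class HrefCollector {
    // Parses 'html' and returns each distinct HREF value found on an A tag,
    // in document order.
    static List<String> hrefs(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        Document doc = kit.createDefaultDocument();
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        try {
            kit.read(new StringReader(html), doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("could not parse HTML", e);
        }

        List<String> found = new ArrayList<>();
        ElementIterator it = new ElementIterator(doc);
        Element elem;
        while ((elem = it.next()) != null) {
            // The A tag, if any, is stored as an attribute of the text piece.
            Object anchor = elem.getAttributes().getAttribute(HTML.Tag.A);
            if (anchor instanceof AttributeSet) {
                Object href = ((AttributeSet) anchor).getAttribute(HTML.Attribute.HREF);
                if (href != null && !found.contains(href.toString())) {
                    found.add(href.toString());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<html><body><a name=\"top\">Top</a> "
            + "<a href=\"a.html\">A</a> and <a href=\"http://x.org/b\">B</a>"
            + "</body></html>";
        for (String href : hrefs(page)) {
            System.out.println(href);
        }
    }
}
```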

    Now we have the set of attributes of the A tag used to format
    a piece of text; it's stored in a SimpleAttributeSet object. So we
    just need to get the value of the HREF attribute and we're done.
    We can do this by calling getAttribute(HTML.Attribute.HREF) on the
    A tag's attribute set.
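
    (Again, not part of the Tech Tips text.) The HREF values come out exactly as written in the page, so many of them will be relative ("page2.html", "/top.html") and some will not be web links at all ("mailto:..."). Before a link can be checked it has to be resolved against the address of the page it was found on. Here is one possible sketch using `java.net.URI`; the class name `LinkResolver`, the method name `resolve`, and the choice to return null for anything that is not http or https are assumptions made for this example.

```java
import java.net.URI;

class LinkResolver {
    // Resolves 'href' against the page it came from. Returns null for links
    // that are not http/https (mailto:, javascript:, ...) or cannot be parsed.
    static String resolve(String pageUrl, String href) {
        try {
            URI target = URI.create(pageUrl).resolve(href.trim());
            String scheme = target.getScheme();
            if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                return target.toString();
            }
            return null;
        } catch (IllegalArgumentException e) {
            return null; // e.g. spaces or other illegal characters in the href
        }
    }

    public static void main(String[] args) {
        String page = "http://example.com/a/b.html"; // placeholder address
        System.out.println(resolve(page, "c.html"));     // http://example.com/a/c.html
        System.out.println(resolve(page, "/top.html"));  // http://example.com/top.html
        System.out.println(resolve(page, "mailto:someone@example.com")); // null
    }
}
```

    Real pages often contain hrefs with unescaped spaces; this sketch simply skips them, whereas a sturdier version might try to escape them first.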

