Project 10: Continuing the web browser

By now you have finished, or soon will finish, the part of the code that processes the text, puts it into the two panes, and reacts to the scroll bar. The next and final (big) step is to get the hyperlinks working.

For review and clarification, here is our goal: When a page is loaded, the drop-down box in the upper right part of the window will contain a list of all the parts of the text that are hyperlinked. When the user selects one from the drop-down box, the page that it links to is loaded into the browser in place of the current page.

This project description will walk you through the required steps.

Grabbing the links and the linked text

Let's take a look at how a hyperlink looks in a sample webpage.

<p><b>Bears</b> are <a href="/wiki/Mammal" title="Mammal">mammals</a> of the

We're interested in two things: the href (/wiki/Mammal) and the text that is linked (mammals). We're going to ignore the title. (Arguably the title rather than the linked text is what we should extract, but I would contend that the user is more likely to notice the exact text as it appears on the page.)

So your first task is to modify your code that strips out the tags so that it handles anchor/href tags differently. Ultimately we will need to record the links and the texts, but first, test that you can extract them correctly by printing the links and texts to the screen while your program is processing the web page.
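As a rough illustration of what "extracting" means here, the following sketch pulls the href and the linked text out of a string using a regular expression. It is only one possible approach and is not meant to match the structure of your own tag-stripping code; the class name LinkExtractor and the method extract() are made up for this example, and the pattern assumes the href value is enclosed in double quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project skeleton.
class LinkExtractor {
    // Group 1 captures the href value; group 2 captures everything
    // between <a ...> and </a>. Assumes a double-quoted href.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\s[^>]*?href=\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one {href, text} pair per anchor tag found in the HTML.
    static List<String[]> extract(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            // Strip any tags nested inside the link, such as <i>...</i>.
            String text = m.group(2).replaceAll("<[^>]*>", "").trim();
            links.add(new String[] { href, text });
        }
        return links;
    }

    public static void main(String[] args) {
        String sample = "<p><b>Bears</b> are "
                + "<a href=\"/wiki/Mammal\" title=\"Mammal\">mammals</a> of the";
        // Print each link and its text, as the first test step suggests.
        for (String[] link : extract(sample)) {
            System.out.println(link[0] + " -> " + link[1]);
        }
    }
}
```

Run on the sample line above, this prints the pair /wiki/Mammal and mammals. The title attribute is skipped over, in keeping with the decision to ignore it.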

Once you have the basic link extraction working, you may find that you get some extra links that we would rather ignore, for example links to anchor points within the current page. I suggest including an ad hoc test to weed out those links.
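One way such an ad hoc test might look is sketched below. It assumes that a link to an anchor point within the current page has an href beginning with "#", and it also discards empty hrefs; the class and method names are invented for this example, and you may well find other kinds of links you want to exclude.

```java
// Hypothetical helper, not part of the project skeleton.
class LinkFilter {
    // Returns true if the href looks like a link to another page.
    static boolean isPageLink(String href) {
        if (href == null || href.isEmpty()) {
            return false;               // nothing to follow
        }
        if (href.startsWith("#")) {
            return false;               // anchor point within the current page
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPageLink("/wiki/Mammal")); // prints true
        System.out.println(isPageLink("#References"));  // prints false
    }
}
```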

Storing the links

Once you are able to extract the links and linked texts, find some way to store them so that they can be used later; specifically, renderPage() will use them to modify the JComboBox. My implementation uses two ArrayLists; alternatively, a HashMap could be used. Implement this, and test that it works by moving the screen-dump of the texts and links to renderPage() instead of the method where you strip out tags.
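Here is a minimal sketch of the two-list idea, in which position i of one list holds a linked text and position i of the other holds its href. The class LinkStore and its method names are invented for illustration and are not the code from my implementation. If you go the map route instead, keep in mind that a plain HashMap does not remember insertion order, whereas a LinkedHashMap does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container, not part of the project skeleton.
class LinkStore {
    // Parallel lists: texts.get(i) is the linked text for hrefs.get(i).
    private final List<String> texts = new ArrayList<>();
    private final List<String> hrefs = new ArrayList<>();

    // Records one link; call this while stripping out tags.
    void add(String text, String href) {
        texts.add(text);
        hrefs.add(href);
    }

    // Forgets all links; call this before processing a new page.
    void clear() {
        texts.clear();
        hrefs.clear();
    }

    int size() {
        return texts.size();
    }

    String textAt(int i) {
        return texts.get(i);
    }

    // Returns the href of the first link with this text, or null if none.
    String hrefFor(String text) {
        int i = texts.indexOf(text);
        return i < 0 ? null : hrefs.get(i);
    }

    public static void main(String[] args) {
        LinkStore store = new LinkStore();
        store.add("mammals", "/wiki/Mammal");
        // The screen-dump, as it might appear in renderPage().
        for (int i = 0; i < store.size(); i++) {
            System.out.println(store.textAt(i) + " -> " + store.hrefFor(store.textAt(i)));
        }
    }
}
```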

Working with JComboBoxes

Study the API entry for JComboBox and my use of it in the WikipediaSearch example. In brief, it contains a set of Objects (Strings, for our purposes) that it displays in the drop-down menu, and it maintains a "currently selected item": not only can the user select an item, but code can also set a certain item to be "selected". There are two things we want to do with the JComboBox:

1. Whenever a page is loaded, replace the items in the menu with the linked texts from that page.
2. When the user selects an item, load the page that it links to.

Notice that most of the first point can be achieved using the methods addItem() and removeItemAt() or removeItem(); removeAllItems() empties the menu in one call. (There is also a method removeAll(), but it does not do what we want here: it is inherited from Container and removes child components rather than menu items.)
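To make the first point concrete, here is a small sketch that empties a JComboBox and refills it from a list of linked texts, using removeAllItems() and addItem() from the JComboBox API. The class LinkMenu and the method resetItems() are hypothetical names chosen for this example, and the sketch deliberately leaves out any handling of the events that these changes may fire.

```java
import java.util.Arrays;
import java.util.List;
import javax.swing.JComboBox;

// Hypothetical helper, not part of the project skeleton.
class LinkMenu {
    // Replaces whatever is in the menu with the given linked texts.
    static void resetItems(JComboBox<String> box, List<String> texts) {
        box.removeAllItems();           // drop the previous page's links
        for (String text : texts) {
            box.addItem(text);          // one menu entry per linked text
        }
    }

    public static void main(String[] args) {
        JComboBox<String> box = new JComboBox<>();
        resetItems(box, Arrays.asList("mammals", "bears"));
        // Loading a second page replaces the first page's entries.
        resetItems(box, Arrays.asList("claws"));
        System.out.println(box.getItemCount());
    }
}
```

In your browser, a call like this would most naturally live in renderPage(), fed by whatever structure you chose for storing the links.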

The second point will involve putting an ActionListener on the JComboBox which will react when the user selects something. You can then find out what was selected using getSelectedItem() on the JComboBox.

One thing to be careful about: whenever you modify the JComboBox in code, such as when removing or adding an item, you can trigger an actionPerformed() on the ActionListener. You need to find a way to distinguish between a user-initiated action and one done in code. I suggest having an initial blank item in the menu, setting the "selected item" to null while changing the menu, or some combination of those.
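The sketch below shows one way the blank-item idea might play out. The listener asks the JComboBox for its selected item and simply ignores the event when the selection is null (the menu has just been emptied) or the blank placeholder. The class LinkMenuListener, the method install(), and the onSelect callback are all assumptions made for this example; in your program the callback's role would be played by something like clickLink(). Note also that the demo uses setSelectedItem() in code to stand in for a user's click.

```java
import java.util.function.Consumer;
import javax.swing.JComboBox;

// Hypothetical helper, not part of the project skeleton.
class LinkMenuListener {
    // Attaches a listener that forwards only real selections to onSelect.
    static void install(JComboBox<String> box, Consumer<String> onSelect) {
        box.addActionListener(e -> {
            Object selected = box.getSelectedItem();
            // null: the menu was just emptied; "": the blank placeholder.
            if (selected == null || selected.toString().isEmpty()) {
                return;
            }
            onSelect.accept(selected.toString());
        });
    }

    public static void main(String[] args) {
        JComboBox<String> box = new JComboBox<>();
        install(box, text -> System.out.println("selected: " + text));
        box.addItem("");                // blank item becomes the selection
        box.addItem("mammals");
        box.addItem("bears");
        box.setSelectedItem("bears");   // stands in for a user's choice
        box.removeAllItems();           // fires an event with a null selection
    }
}
```

The trade-off of this design is that a link whose text is empty can never be selected, which seems acceptable since such an entry would be invisible in the menu anyway.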

Test to see that the right item is selected by printing the selection to the terminal---say, in the method clickLink(). Do this before trying to render the right page (which is going to take more work...)

Dealing with relative URIs

Now the final, hard part. If the hyperlinks in the webpage were fully-formed absolute URLs (such as http://en.wikipedia.org/wiki/Mammal), then loading the next page wouldn't be so bad. However, most webpages contain relative links (/wiki/Mammal).

To load the correct page, you need to resolve the relative link to an absolute link. And here's the hint I'll give: look at the method resolve() in class URI.
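To give a feel for what resolve() does, here is a small sketch using java.net.URI. The address of the current page (http://en.wikipedia.org/wiki/Bear) is an assumption made up for the example, as are the class name LinkResolver and its static method; working out where this fits into your own code is still up to you.

```java
import java.net.URI;

// Hypothetical helper, not part of the project skeleton.
class LinkResolver {
    // Resolves an href found on currentPage into an absolute address.
    // URI.create() throws IllegalArgumentException on malformed input,
    // for example an href containing a space.
    static String resolve(String currentPage, String href) {
        URI base = URI.create(currentPage);
        return base.resolve(href).toString();
    }

    public static void main(String[] args) {
        String current = "http://en.wikipedia.org/wiki/Bear"; // assumed address
        System.out.println(resolve(current, "/wiki/Mammal"));
        // prints http://en.wikipedia.org/wiki/Mammal
    }
}
```

An href that is already absolute comes back unchanged, so you can pass every link through the same call without testing which kind it is.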

Then, load a new page, and make sure everything still works---for example, the links in the drop-down menu get reset. And then, you've written a working web browser!

DUE: Wednesday, Oct 27, at 5:00


Thomas VanDrunen
Last modified: Fri Oct 22 12:11:07 CDT 2010