Run Geobot!


This utility scans a site domain, one HTML page at a time, for links to geospatial data, identifying them by their file extensions. In addition to putting the geospatial data links into a database, it also writes them to an HTML page. So at the end of a run you have a text file of HTML links indicating where the robot has been and an HTML page indicating what the robot has picked up along the way. The robot also takes the <title> </title> text from each traversed HTML page and uses it as a description for the links on that page. The following file formats are supported.
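As a rough illustration of the scan described above, here is a minimal Python sketch that pulls the <title> text and the data links out of one page. It is not Geobot's actual code. The names PageScanner and scan_page are hypothetical, and the GEO_EXTENSIONS set is an assumed placeholder, not the robot's real list of supported formats.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath

# ASSUMPTION: placeholder extensions for illustration only; they are not
# taken from Geobot's own list of supported formats.
GEO_EXTENSIONS = {".shp", ".e00", ".dem", ".dxf"}


class PageScanner(HTMLParser):
    """Collects the page title and every <a href> target on one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_page(base_url, html):
    """Return (title, data_links) for one page, filtering links by extension."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    title = " ".join(scanner.title.split())  # collapse stray whitespace
    data_links = [
        link for link in scanner.links
        if posixpath.splitext(urlparse(link).path)[1].lower() in GEO_EXTENSIONS
    ]
    return title, data_links
```

The extension test is case-insensitive and looks only at the path part of the URL, so a link such as data/roads.SHP would still be picked up under these assumptions.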

On some sites the geospatial datasets are linked from HTML pages that are generated by CGI scripts. The HTTP utility that I use cannot access these pages, so the robot cannot parse them.

Note that the robot traverses HTML pages by taking links off of pages that it has previously visited. If the initial page has no HTML links on it, the robot cannot work past that page. You can check this by using Ctrl-U in your browser and looking at the <a href=" "> </a> fields. It's best to start the robot off on pages that have lots of links within the site domain you entered.
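The traversal described above can be sketched as a breadth-first walk that only follows links inside the starting domain. This is a sketch under stated assumptions, not the robot's real implementation: crawl and LinkCollector are hypothetical names, the fetch argument is an assumed callable that returns a page's HTML (or None if the page cannot be read), and max_pages is an illustrative safety limit.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl(start_url, fetch, max_pages=100):
    """Visit pages breadth-first, following only links on the start domain.

    fetch is a hypothetical callable: fetch(url) returns HTML text or None.
    Returns the list of visited URLs in the order they were read.
    """
    domain = urlparse(start_url).netloc
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreadable page: nothing to harvest
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href)).url
            parts = urlparse(link)
            if (parts.scheme in ("http", "https")
                    and parts.netloc == domain
                    and link not in seen):
                seen.add(link)
                queue.append(link)
    return visited
```

If the start page has no links, the queue empties after the first page and the list of visited pages contains only that page, which matches the behaviour described above.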

NOTE, PAGES WITH HTML FRAMES USUALLY DON'T WORK BECAUSE THERE AREN'T ANY LINKS ON SUCH PAGES FOR THE ROBOT TO GRAB. DO A CTRL-U AND YOU'LL SEE WHAT I MEAN.
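If you would rather not read the page source by eye, a small script can do the same check. The sketch below counts the <a href> links on a page and lists any frame sources it finds. The names FrameCheck and check_start_page are hypothetical and not part of Geobot.

```python
from html.parser import HTMLParser


class FrameCheck(HTMLParser):
    """Counts <a href> links and records the src of each frame or iframe."""

    def __init__(self):
        super().__init__()
        self.anchor_count = 0
        self.frame_sources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.anchor_count += 1
        elif tag in ("frame", "iframe") and attrs.get("src"):
            self.frame_sources.append(attrs["src"])


def check_start_page(html):
    """Return (number of links, list of frame sources) for one page."""
    checker = FrameCheck()
    checker.feed(html)
    return checker.anchor_count, checker.frame_sources
```

A result with zero links and a non-empty list of frame sources suggests a frames page. In that case one of the listed frame source pages is probably a better place to start the robot, though that is an inference, not something the robot does for you.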


Information used by the robot and database:

URL address of the HTML page where geospatial data is linked:
(http://domain/directories/filename.html)

Optional domain of FTP sites that are linked to the above HTML page(s):
(ftp://domain)


Information needed to delete all references
to your site from the database:

Full name or password (make sure it's
something you can remember):

E-mail address:




Email: anp@geo.ed.ac.uk