July 16, 2020

HTML and Hyperlinking


Tim Berners-Lee understood the need for electronic document delivery at CERN. He had witnessed the problems encountered by physicists attempting to find information relevant to their work ("the technical details of past projects are lost forever"). Duplication of projects, the inability to access data due to differing terminals and software, and differing document formats (SGML, Unix, CERNDOC etc.) were some of the most obvious problems.

Already familiar with SGML, he produced a simple subset for formatting documents and named it Hypertext Markup Language (HTML). In 1990 he and Robert Cailliau wrote software for viewing these documents. This browser was named WorldWideWeb (one word); it was later renamed Nexus to differentiate it from the "information space", which by then was named the World Wide Web.

Berners-Lee promoted the concept of "universal readership":

  • One program developed to access data
  • Can be read by anyone, anywhere and on any machine

The sequential access of information was considered to be inadequate; linked information was the way forward.

  • Concept of linking documents already understood by others (Bush, Engelbart, Nelson). TBL's idea was to hyperlink CERN's electronic documents.
  • No DTD was written for his first version, HTML 1 (TBL wanted to keep it simple; there would be no need for knowledge of SGML)
  • HTML tags do not convey semantics of content, just document structure
  • Simple tag set e.g. <h1></h1> for main heading; <p></p> for a paragraph. (go here for HTML basics)
  • A DTD was written for HTML 2. Didn't last long before it was "hijacked"
  • Mosaic browser written. First attempt at customising HTML. Included an image tag to display images in Web documents. Tim is not pleased.
  • Browser Wars! Netscape (Mosaic) versus MS Internet Explorer. Each new version offered additional features, hence moving further away from TBL's HTML.
  • A return to the same old problems; namely proprietary software and a lack of common formats.
  • HTML becomes "bloated": by concentrating on document appearance, it fails to provide document structure.
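
The simple tag set and the hyperlinking idea from the notes above can be illustrated with a short Python sketch. It is not taken from the original notes: the sample page, the file names in its links, and the helper names (PAGE, LinkCollector, collect) are all hypothetical. It uses only the standard library's html.parser module to read a tiny document built from <h1>, <p> and <a> tags and to list the main heading and the link targets.

```python
from html.parser import HTMLParser

# Hypothetical sample page: a main heading, one paragraph, two hyperlinks.
PAGE = """<h1>Past projects at CERN</h1>
<p>See the <a href="detector-notes.html">detector notes</a> and the
<a href="http://example.org/beam.html">beam report</a>.</p>"""


class LinkCollector(HTMLParser):
    """Collects <h1> text and the href target of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)
        elif tag == "h1":
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Only text inside <h1>...</h1> is recorded as a heading.
        if self._in_h1:
            self.headings.append(data)


def collect(page):
    """Return (headings, links) found in an HTML string."""
    collector = LinkCollector()
    collector.feed(page)
    collector.close()
    return collector.headings, collector.links


if __name__ == "__main__":
    headings, links = collect(PAGE)
    print("Headings:", headings)
    print("Links:", links)
```

The tags mark structure (a heading, a paragraph) and the href attributes carry the links between documents; nothing in the markup says how the page should look.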

Berners-Lee and others (W3C) attempt to stop the rot by returning to standards (go here for more on the history of the Web).

Enter XML!

