This link has been bookmarked by 9 people. It was first bookmarked on 28 Oct 2006, by Mike Wesch.
-
09 Sep 14
Greg Lloyd
Ted Nelson, 2 Oct 1997: I want to discuss what I consider one of the worst mistakes of the current software world, embedded markup; which is, regrettably, the heart of such current standards as SGML and HTML. (There are many other embedded markup systems; an interesting one is RTF. But I will concentrate on the SGML-HTML theology because of its claims and fervor.)
-
Embedded Markup
I want to discuss what I consider one of the worst mistakes of the current software world, embedded markup; which is, regrettably, the heart of such current standards as SGML and HTML. (There are many other embedded markup systems; an interesting one is RTF. But I will concentrate on the SGML-HTML theology because of its claims and fervor.)
There is no one reason this approach is wrong; I believe it is wrong in almost every respect. But I must be honest and acknowledge my objection as a serious paradigm conflict, or (if you will) religious conflict. In paradigm conflict and religious conflict, there can be no hope of doctrinal victory; the best we can seek is for both sides to understand each other fully and cordially.
-
SGML's advocates expect, or wish to enforce, a universal linear representation of hierarchical structure.
-
I believe that if this is a factual claim of appropriateness, it is a delusion; if it is an enforcement, it is an intolerable imposition which drastically curtails the representation of non-hierarchical media structure.
-
-
Network electronic publishing offers a unique special-case solution to the copyright problem that has not been generally recognized. I call it transpublishing. Let me explain.
-
In paper publishing, there are two copyright realms: a fortified zone of copyrighted material, defended by its owners and requiring prior negotiation by publishers for quotation and re-use; and an unfortified zone, the open sea of public domain, where anything may be quoted freely--but whose materials tend to be outdated and less desirable for re-use.
-
Transpublishing makes possible a new realm between these two, where everything may be treated as boilerplate (as with public-domain material), but where publishers relinquish none of their rights and receive revenue exactly proportional to use.
-
Two different parties have legitimate concerns. Original rightsholders are concerned for their territory of copyrighted material, as defined by law, so that they may maintain and benefit from their hard-won assets. But the public (everybody else, as well as rightsholders in their time off) would like to re-use and republish these materials in different ways.
-
What if a system could exist which would satisfy all parties--copyright holders and those who would like to quote and republish? What if materials could be quoted without restriction, or size limit, by anyone, without red tape or negotiation--but all publishers would continue to furnish the downloaded copies, and would be exactly rewarded, being paid for each copy?
-
Transpublishing versus Embedded Markup
-
Embedded markup drastically interferes with transclusive re-use. For one thing, any arbitrary section of an HTML document may not have correct tags (since the tags overlap and extend over potentially long attribute fields). This means HTML-based transclusion cannot be handled by a simple tag, but probably requires some sort of proxy server.
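A minimal sketch of the problem, assuming a toy document and a deliberately simple balance check (the names `TagBalance` and `is_balanced` are illustrative, and void elements such as `<br>` are not handled): an arbitrary character range cut from a well-formed HTML string can carry a close tag whose open was left behind and an open tag whose close was cut off.

```python
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Track open tags and record close tags that have no matching open."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.stray_closes = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.stray_closes.append(tag)


def is_balanced(fragment):
    """True if every tag in the fragment is opened and closed in order."""
    checker = TagBalance()
    checker.feed(fragment)
    checker.close()
    return not checker.open_tags and not checker.stray_closes


doc = "<p>Embedded <em>markup</em> interferes with <b>re-use</b>.</p>"

# Quote an arbitrary range: from the word "markup" through the word "re-use".
start = doc.index("markup")
end = doc.index("re-use") + len("re-use")
excerpt = doc[start:end]

print(is_balanced(doc))      # the whole document is well-formed
print(excerpt)               # the excerpt has a stray </em> and an unclosed <b>
print(is_balanced(excerpt))
```

The excerpt cannot be dropped into another page as-is; something has to repair or strip its tags first, which is the extra machinery the paragraph above anticipates.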
-
This is done all the time in scholarly writing and serious journalism, with phrases like "emphasis mine." It needs to be possible in transpublishing to change emphasis and other attributes by nullifying the original markup. Of course, re-emphasizing through markup is an editorial modification, subject to judgment calls and issues of academic etiquette. But the inquiring reader can always follow the bridge of transclusion to see the original as formatted by the author.
-
Alternative method 1: parallel markup
The best alternative is parallel markup. I believe that sequential formatted objects are best represented by a format in which the text and the markup are treated as separate parallel members, presumably (but not necessarily) in different files.[7]
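One way to picture parallel markup, as a hedged sketch rather than Nelson's own format: the text is a plain string, and the markup is a separate list of spans that address it by offset. The `Span` type, the attribute names, and the `excerpt` helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive offset into the text
    end: int    # exclusive offset into the text
    attr: str   # hypothetical attribute name, e.g. "emphasis"


def excerpt(text, spans, start, end):
    """Return text[start:end] plus the spans clipped and re-based to it."""
    clipped = []
    for s in spans:
        lo, hi = max(s.start, start), min(s.end, end)
        if lo < hi:  # keep only spans that overlap the requested range
            clipped.append(Span(lo - start, hi - start, s.attr))
    return text[start:end], clipped


# The two parallel members: contents and markup, kept apart.
text = "Embedded markup interferes with re-use."
markup = [Span(9, 15, "emphasis"), Span(32, 38, "bold")]

# Any range of the text can be quoted; the markup is clipped to fit.
piece, piece_markup = excerpt(text, markup, 12, 35)
print(piece)
print(piece_markup)
```

Because the text never contains tags, any range of it is a valid quotation, and the quoting party is free to discard `piece_markup` and supply spans of its own.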
Alternative method 2: tag override
Where it is inconvenient to break out the tags into a parallel stream--i.e., where they're already stuck or published in the original--we may fall back on the method of tag override. By this I mean simply treating the original tags as if they are not there; ignoring them while counting through the contents and furnishing instead a parallel tag stream, as in parallel markup. We do not dislodge the original markup, but simply ignore it.
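Tag override can be sketched roughly as follows, under loud assumptions: tags are matched with a crude regular expression (no `>` inside attribute values, no character entities), and the override stream is a hypothetical list of non-overlapping `(start, end, tag)` triples counted over the contents only. The original source string is never modified.

```python
import re

# Crude tag matcher: an assumption for this sketch, not a real HTML parser.
TAG = re.compile(r"<[^>]*>")


def content_only(source):
    """Count through the contents as if the original tags were not there."""
    return TAG.sub("", source)


def apply_overrides(source, overrides):
    """Render the contents with a parallel tag stream instead of the original tags.

    overrides: non-overlapping (start, end, tag) triples over content offsets.
    """
    content = content_only(source)
    opens, closes = {}, {}
    for start, end, tag in overrides:
        opens.setdefault(start, []).append(f"<{tag}>")
        closes.setdefault(end, []).append(f"</{tag}>")
    out = []
    for i in range(len(content) + 1):
        out.extend(closes.get(i, []))
        out.extend(opens.get(i, []))
        if i < len(content):
            out.append(content[i])
    return "".join(out)


original = "The <em>worst</em> mistake is embedded markup."

print(content_only(original))
print(apply_overrides(original, [(21, 36, "strong")]))
```

The author's emphasis on "worst" is ignored and the quoting party's emphasis lands on "embedded markup", while `original` stays intact for any reader who follows the transclusion back.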
Exactly Representing Thought and Change
My principal long-term concern is the exact representation of human thought, especially that thought put into words and writing. But the sequentiality of words and old-fashioned writing have until now compromised that representation, requiring authors to force sequence on their material, and curtail its interconnections. Designing editorial systems for exact and deep representation is therefore my objective.
-
To find the support functions really needed for creative organization by authors and editors, we must understand the exact representation and presentation of human thought, and be able to track the continuities of structure and change.
-
This means we must find a stable means of representing structure very different from the sequential and hierarchical--a representation of structure which recognizes the most anarchic and overlapping relations; and the location of identical and corresponding materials in different versions; which recognizes and maintains constancies of structure and data across successive versions, even as addresses of these materials become unpredictably fragmented by editing.
-
Thus deep version management--knowing locations of shared materials to the byte level--is a vital problem to solve in the design of editing systems. And the same location management is necessary on a much broader scale to support transpublishing.
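As a rough illustration of locating shared material at the byte level, the sketch below uses Python's standard `difflib.SequenceMatcher` to list runs of bytes common to two versions. This is a swapped-in, after-the-fact comparison, not the address-tracking design the essay calls for: it only finds matches that occur in the same order in both versions, so moved blocks are missed. The function name `shared_spans` and the `min_len` threshold are assumptions.

```python
from difflib import SequenceMatcher


def shared_spans(old, new, min_len=4):
    """List (old_offset, new_offset, length) for byte runs common to both versions."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len  # also drops the zero-length sentinel block
    ]


v1 = b"Embedded markup is a mistake."
v2 = b"I believe embedded markup is a serious mistake."

for old_at, new_at, size in shared_spans(v1, v2):
    # Each triple says: these bytes of v1 live at this address in v2.
    print(old_at, new_at, v1[old_at:old_at + size])
```

Each reported triple is a constancy across the two versions: the same bytes, at a different address after editing.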
-
-
Three Layers
-
I believe we should find a very general representational system, a reference model which breaks apart in parallel what is represented by SGML and HTML. This would make the creation of deep editing and version management methods much easier. By handling contents, structure, and special effects separately in such a reference model, the parts can be better understood and worked on, and far more general structures can be represented.
-
I would propose a three-layer model:[8]
-
- A content layer to facilitate editing, content linking, and transclusion management.
-
- A structure layer, declarable separately. Users should be able to specify entities, connections and co-presence logic, defined independently of appearance or size or contents; as well as overlay correspondence, links, transclusions, and "hoses" for movable content.
-
Finally, a special-effects-and-primping layer should allow the declaration of ever-so-many fonts, format blocks, fanfares, and whizbangs, and their assignment to what's in the content and structure layers.
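The three layers might be modeled along these lines. This is only a sketch: every class and field name here (`Content`, `Entity`, `Structure`, `Effects`, `Document`, `styled_spans`) is hypothetical, and the structure layer is reduced to named, possibly overlapping ranges plus links.

```python
from dataclasses import dataclass, field


@dataclass
class Content:
    text: str  # content layer: raw contents only, no markup


@dataclass
class Entity:
    name: str
    start: int  # offsets into the content layer
    end: int


@dataclass
class Structure:
    entities: list = field(default_factory=list)  # may overlap freely
    links: list = field(default_factory=list)     # (from_name, to_name) pairs


@dataclass
class Effects:
    styles: dict = field(default_factory=dict)  # entity name -> style name


@dataclass
class Document:
    content: Content
    structure: Structure
    effects: Effects

    def styled_spans(self):
        """Pair each entity's text with whatever effect is assigned to it."""
        return [
            (self.content.text[e.start:e.end], self.effects.styles.get(e.name))
            for e in self.structure.entities
        ]


doc = Document(
    Content("Embedded markup considered harmful"),
    Structure(
        entities=[Entity("topic", 0, 15), Entity("claim", 9, 34)],  # overlapping
        links=[("claim", "topic")],
    ),
    Effects({"topic": "bold"}),
)
print(doc.styled_spans())

# Changing the effects layer touches neither contents nor structure.
doc.effects.styles["claim"] = "italic"
print(doc.styled_spans())
```

The two entities overlap on the word "markup", which a strictly hierarchical embedded representation could not express directly.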
-
-
Theodor Holm Nelson, designer and generalist, has been a software designer and theorist since 1960 and a software consultant since 1967. His principal design work includes Project Xanadu and xanalogical systems, the transcopyright system, and the theory of virtuality design. His industry positions include Harcourt Brace & World publishers, Creative Computing Magazine, Datapoint Corporation, and Autodesk, Inc.; his university positions include Vassar College, University of Illinois, Swarthmore College, Strathclyde University, and Keio University.
-
Mr. Nelson has written several books, the most recent being The Future of Information (1997), as well as numerous articles, lectures, and presentations. He is best known for discovering the hypertext concept and for coining various words which have become popular, such as "hypertext," "hypermedia," "cybercrud," "softcopy," "electronic visualization," "dildonics," "technoid," "docuverse," and "transclusion."
-
[2] Samuel Latt Epstein of Sensemedia, Inc. has pointed out (personal communication to the author) that he learned graphics programming on the Intecolor ISC-8001, ca. 1976, a machine that had a parallel data structure for its screen. 8K of memory was devoted to the text, 8K was devoted to the corresponding bytes of attribute memory. This "made it a snap" to program the screen, he says. The two parallel banks of memory could be manipulated independently, changing the colors without touching the text and vice versa, which greatly simplified (he says) programming both the text and the various graphical effects of those days.
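The arrangement described in this note can be imitated with two byte arrays of equal size. The 8K bank size comes from the note; the attribute codes and helper names below are hypothetical and do not reflect the actual Intecolor encoding.

```python
BANK_SIZE = 8 * 1024  # 8K per bank, as described in the note

text_bank = bytearray(b" " * BANK_SIZE)  # one character per cell
attr_bank = bytearray(BANK_SIZE)         # one attribute byte per cell

# Hypothetical attribute codes, for illustration only.
RED, GREEN = 0x01, 0x02


def write_text(offset, data):
    """Change characters without touching attributes."""
    text_bank[offset:offset + len(data)] = data


def set_attr(offset, length, attr):
    """Change attributes without touching characters."""
    attr_bank[offset:offset + length] = bytes([attr]) * length


write_text(0, b"HELLO")
set_attr(0, 5, RED)
set_attr(0, 5, GREEN)  # recolor; the text bank is untouched

print(bytes(text_bank[:5]), list(attr_bank[:5]))
```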
-
-
17 May 14
-
23 May 07
-
The SGML approach is a delivery format, not a working format. Editing is outside the paradigm, happens "elsewhere."
-
Objection 2: Transpublishing a Potential Conflict
-
though they must be edited in parallel.
-
treating the original tags as if they are not there
-
This means we must find a stable means of representing structure very different from the sequential and hierarchical--a representation of structure which recognizes the most anarchic and overlapping relations; and the location of identical and corresponding materials in different versions; which recognizes and maintains constancies of structure and data across successive versions, even as addresses of these materials become unpredictably fragmented by editing.
-
three-layer model
-
-
28 Oct 06
-
Embedded Markup Considered Harmful
-
-
13 Oct 06
-
20 Mar 06
-
18 Dec 04
-
24 Aug 04