Topic Maps

 

Contents:

1. Introduction
1.1 Extended linking and information overload
1.2 Topic maps will promote a new generation of search engines
2. Excerpt from ISO/IEC 13250
2.1 Introduction
2.2 Scope
2.3 Definitions
2.4 Topic Maps architecture
3. Topic maps and XML
3.1 How are topic maps significant for XML?
3.2 XTM: XML Topic Maps
3.3 Design goals of XTM (XML Topic Maps)
3.4 Introduction to XTM Syntax
4. TMQL (Topic Map Query Language)
5. Conclusion
6. References

 

 

1. Introduction

A topic map is a document used to improve information retrieval and navigation. Topic maps are created and used to help people find information quickly and easily. They can take the form of a wide range of finding aids: a glossary, a set of online finding aids, or indexes.

 

The performance and reliability of web search engines are expected to improve dramatically once topic maps are shared publicly by potentially large user communities.

 

The topic map model was defined by ISO (ISO/IEC 13250) and was developed as an application of HyTime. HyTime (ISO/IEC 10744) is a set of standard extensions to SGML, the Standard Generalized Markup Language (ISO 8879). The W3C (World Wide Web Consortium), however, recommends the use of XML (eXtensible Markup Language).

1.1 Extended linking and information overload

In the context of the emerging "extended linking", where a link can contain references to any number of resources (including other extended links), a single web-based information object can become the recursive target of a large number of links. This can cause classic information overload, in which the sheer amount of available information prevents us from finding what we need. In extended linking the overload problem is somewhat reduced by distinguishing link types and relationships. However, many things share the same type and relationship, so the problem of information overload remains. In extended linking (as described in the HyTime standard and in the W3C XLink draft) there is a standard rule for recursive targets that have the same relationship, which is used to establish the recursive reference.

Note: With simple linking it is possible to refer to only one resource or link.

1.2 Topic maps will promote a new generation of search engines

Even without extended linking, existing web search engines provide suboptimal access to the data held on the web. A great deal of important information goes unused because search engines are not topic-oriented. Instead, a search engine typically uses character-matching algorithms and variants of character and phrase matching, observes how people search, and infers from that behaviour which data is wanted. Topic maps give web users a surprisingly powerful tool by combining search with the concentrated power of many indexing strategies and search engines.

 

2. Excerpt from ISO/IEC 13250

The Topic Maps standard: basic terms and the definitions of those terms.

2.1 Introduction

 

This International Standard provides a standardized notation for interchangeably representing information about the structure of information resources used to define topics, and the relationships between topics. A set of one or more interrelated documents that employs the notation defined by  this International Standard is called a topic map. In general, the structural information conveyed by topic maps includes:

— groupings of addressable information objects around topics (‘occurrences’), and

— relationships between topics (‘associations’).

A topic map defines a multidimensional topic space: a space in which the locations are topics, and in which the distances between topics are measurable in terms of the number of intervening topics which must be visited in order to get from one topic to another, and the kinds of relationships that define the path from one topic to another, if any, through the intervening topics, if any. In addition, information objects can have properties, as well as values for those properties, assigned to them externally. These properties are called facet types.

Several topic maps can provide topical structure information about the same information resources. The Topic Maps architecture is designed to facilitate merging topic maps without requiring the merged topic maps to be copied or modified. Because of their extrinsic character, topic maps can be thought of as overlays on, or extensions to, sets of information objects.

The base notation of Topic Maps is SGML; an interchangeable topic map always consists of at least one SGML document, and it may include and/or refer to other kinds of information resources. A set of information resources that comprise a complete interchangeable topic map can be specified using the ‘bounded object set (BOS)’ facility defined by the HyTime architecture in ISO/IEC 10744:1997. As the Extensible Markup Language (XML), a World Wide Web Consortium recommendation, is a subset of SGML, as explained in Annex K of SGML (1997), also known as WebSGML, XML can be also used as a base notation for Topic Maps.

The topic map notation is defined as an SGML Architecture, and this International Standard takes the form of an architecture definition document expressed in conformance with Normative Annex A.3 of ISO/IEC 10744:1997, the SGML Architectural Form Definition Requirements (AFDR). The formal definition of the topic map notation is expressed as a meta-DTD.
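The idea of distance in a topic space can be illustrated with a small sketch. The Python example below is not part of the standard: it treats associations as edges of an undirected graph and uses breadth-first search to count the intervening topics on a shortest path between two topics. The sample topics, the `ASSOCIATIONS` list and the `intervening_topics` helper are assumptions made purely for illustration, and the kinds of relationships along the path are ignored.

```python
from collections import deque

# Hypothetical associations between topics (illustrative data only).
ASSOCIATIONS = [
    ("Puccini", "Tosca"),
    ("Tosca", "Rome"),
    ("Rome", "Italy"),
]


def build_graph(associations):
    """Turn pairs of associated topics into an undirected adjacency map."""
    graph = {}
    for a, b in associations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def intervening_topics(graph, start, goal):
    """Number of topics strictly between start and goal on a shortest path.

    Returns None when no path of associations connects the two topics.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])  # (topic, associations traversed so far)
    while queue:
        topic, hops = queue.popleft()
        for neighbour in graph.get(topic, ()):
            if neighbour == goal:
                # `hops` associations led to `topic`; one more reaches the
                # goal, so exactly `hops` topics lie in between.
                return hops
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


GRAPH = build_graph(ASSOCIATIONS)
```

For example, `intervening_topics(GRAPH, "Puccini", "Italy")` visits "Tosca" and "Rome" on the way, while two directly associated topics have no intervening topics at all.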

2.2 Scope

 

Topic maps enable multiple, concurrent views of sets of information objects. The structural nature of these views is unconstrained; they may reflect an object oriented approach, or they may be relational, hierarchical, ordered, unordered, or any combination of the foregoing. Moreover, an unlimited number of topic maps may be overlaid on a given set of information resources. Topic maps can be used:

— To qualify the content and/or data contained in information objects as topics to enable navigational tools such as indexes, cross-references, citation systems, or glossaries.

— To link topics together in such a way as to enable navigation between them. This capability can be used for virtual document assembly, and for creating thesaurus-like interfaces to corpora, knowledge bases, etc.

— To filter an information set to create views adapted to specific users or purposes. For example, such filtering can aid in the management of multilingual documents, management of access modes depending on security criteria, delivery of partial views depending on user profiles and/or knowledge domains, etc.

— To structure unstructured information objects, or to facilitate the creation of topic-oriented user interfaces that provide the effect of merging unstructured information bases with structured ones.

The overlay mechanism of topic maps can be considered as a kind of external markup mechanism, in the sense that an arbitrary structure is imposed on the information without altering its original form.

This International Standard does not require or disallow the use of any scheme for addressing information objects. Except for the requirement that topic map documents themselves be expressed using SGML (or WebSGML) and HyTime, using the syntax described herein, neither does it require or disallow the use of any notation used to express information.

2.3 Definitions

2.3.1 added themes

Topics added to the sets of themes comprising the scopes within which topics have their topic characteristics.

2.3.2 association

a) A specific relationship among specific topics that is asserted by an association link element.

b) An association link element.

2.3.3 association link

A hyperlink element conforming to the association link architectural form defined by this International Standard.

2.3.4 association role

One of the roles that topics play in a given topic association.

2.3.5 association type

a) A subject which is a class of topic associations.

b) One of the classes of topic associations of which a particular association link is an instance. The association types of which a given association link is an instance can be specified by its optional types attribute.

2.3.6 base name

 a) A subelement (basename) of a topname subelement of a topic link.

 b) A name characteristic of a topic that is specified in the content of a basename element.

2.3.7 bounded object set (BOS)

A set of one or more documents and other information objects, all of which are known to the processing application and which are processed collectively. See ISO/IEC 10744:1997 for details.

2.3.8 display name

a) A subelement (dispname) of a topname subelement of a topic link, containing the identifying information intended to be displayed by the application to represent the subject of the topic link.

 b) A name characteristic of a topic that is specified in the content of a dispname element.

2.3.9 facet

a) The subset of information objects that share an externally-applied property.

b) The values given to a particular property externally applied to a set of information objects.

2.3.10 facet link

A hyperlink that applies values for a given property (as well as the property itself) to one or more information objects.

2.3.11 facet type

A property applied by one or more facet links to one or more objects.

2.3.12 facet value

A member of the set of all values of a particular facet type.
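The four facet definitions fit together as a small lookup structure. The following Python sketch is only an illustration under assumed names: `FacetIndex`, its methods and the sample file names are hypothetical and do not come from the standard. Calling `apply` plays the part of a facet link, the first argument is a facet type, the second a facet value, and `facet` returns the subset of objects that share the externally applied property value.

```python
from collections import defaultdict


class FacetIndex:
    """Hypothetical store of externally applied properties (facets)."""

    def __init__(self):
        # facet type -> facet value -> set of information objects
        self._values = defaultdict(lambda: defaultdict(set))

    def apply(self, facet_type, facet_value, objects):
        """Act like a facet link: give `objects` one value of a property."""
        self._values[facet_type][facet_value].update(objects)

    def facet(self, facet_type, facet_value):
        """Objects that share the given value of the given facet type."""
        return set(self._values.get(facet_type, {}).get(facet_value, set()))

    def facet_values(self, facet_type):
        """All values that have been applied for a facet type."""
        return set(self._values.get(facet_type, {}))


# Illustrative data; the information objects themselves are never modified.
index = FacetIndex()
index.apply("language", "sk", {"doc1.sgm", "doc2.sgm"})
index.apply("language", "en", {"doc3.sgm"})
index.apply("security", "public", {"doc1.sgm"})
```

Because the properties live in the index rather than in the documents, the same objects can carry any number of facets without being altered, which is the point of applying them externally.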

2.3.13 hub document

The HyTime document used to define the set of information resources (the bounded object set (BOS)) that comprise a HyTime hyperdocument. Applications may regard the HyTime document used as the entry point for a browsing session within a hyperdocument as the hub document. See ISO/IEC 10744:1997 for details. By definition, a topic map is a HyTime hyperdocument, and any topic map document can be regarded as a hub document.

2.3.14 occurrence role

The sense in which some set of occurrences is relevant to a topic. In the Topic Maps architecture, occurrence roles are specified as anchor roles (as defined in the HyTime architecture) of topic links.

2.3.15 public subject descriptor

A subject descriptor (see the definition of ‘subject descriptor’) which is used (or, especially, which  is designed to be used) as a common referent of the identity attributes of many topic links in many topic maps. The subject described by the subject descriptor is thus easily recognized as the common binding point of all the topic links that reference it, so that they will be merged.
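How a shared subject descriptor acts as a binding point can be sketched as follows. This Python example is a simplified illustration, not the merging procedure of the standard: topic links are modelled as plain dictionaries, every link is assumed to carry an `identity` value, and the URL under `example.org` is a made-up public subject descriptor.

```python
def merge_topic_links(topic_links):
    """Merge topic links whose identity refers to the same subject descriptor.

    Each topic link is a dict with an 'identity' key and optional 'names'
    and 'occurrences' collections (a hypothetical, simplified model).
    """
    merged = {}
    for link in topic_links:
        identity = link["identity"]
        target = merged.setdefault(
            identity,
            {"identity": identity, "names": set(), "occurrences": set()},
        )
        target["names"].update(link.get("names", ()))
        target["occurrences"].update(link.get("occurrences", ()))
    return merged


# Two topic links from two different topic maps (illustrative data only).
PSD = "http://example.org/psd/slovakia"
links = [
    {"identity": PSD, "names": {"Slovakia"}, "occurrences": {"atlas.sgm"}},
    {"identity": PSD, "names": {"Slovensko"}, "occurrences": {"history.sgm"}},
    {"identity": "http://example.org/psd/austria", "names": {"Austria"}},
]
merged = merge_topic_links(links)
```

After merging, the two links that point at the same descriptor form a single topic with both names and both occurrences, while the third link remains a topic of its own.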

2.3.16 scope

The extent of the validity of a topic characteristic assignment (see the definition of ‘topic characteristic assignment’): the context in which a name or an occurrence is assigned to a given topic, and the context in which topics are related through associations. This International Standard does not require that scopes be specified explicitly. If the scope of a topic characteristic assignment is not explicitly specified via one or more scope attributes, the scope within which the topic characteristic applies to the topic includes all the topics in the entire topic map; this special scope is called ‘the unconstrained scope’. If a scope is specified, the specification consists of a set of topics, which, in the context of their role as members of such a set, are called ‘themes’. Each theme contributes to the extent of the scope that the themes collectively define; a given scope is the union of the subjects of the set of themes used to specify that scope.
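An application can use scopes to decide which names to show in a given context. The sketch below shows one possible policy, chosen here as an assumption and not prescribed by the standard: a name is shown when its scope is unconstrained or when it shares at least one theme with the current context. `NameAssignment`, `names_in_context` and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameAssignment:
    """A topic name together with the themes that make up its scope."""
    topic: str
    name: str
    scope: frozenset = frozenset()  # empty set stands for the unconstrained scope


def names_in_context(assignments, topic, context_themes):
    """Names of `topic` that are valid for the given context themes."""
    context = frozenset(context_themes)
    return [
        a.name
        for a in assignments
        if a.topic == topic and (not a.scope or a.scope & context)
    ]


# Illustrative data: one topic with three scoped names and one unscoped name.
ASSIGNMENTS = [
    NameAssignment("vienna", "Vienna", frozenset({"english"})),
    NameAssignment("vienna", "Wien", frozenset({"german"})),
    NameAssignment("vienna", "Viedeň", frozenset({"slovak"})),
    NameAssignment("vienna", "VIE"),
]
```

With the context `{"slovak"}` the function returns the Slovak name and the unscoped name; with an empty context only the name in the unconstrained scope remains.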

2.3.17 sort name

a) A subelement (sortname) of a topname subelement of a topic link, containing a string that is an alternative representation of a topic name that is intended to be used for alphabetic or other ordering.

b) A name characteristic of a topic that is specified in the content of a sortname element.

2.3.18 subject

In the most generic sense, a ‘subject’ is any thing whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever.

2.3.19 subject descriptor

Information which is intended to provide a positive, unambiguous indication of the identity of a subject, and which is the referent of an identity attribute of a topic link.

2.3.20 theme

A member of the set of topics comprising a scope within which a topic characteristic assignment is valid. See also the definitions of ‘scope’ and ‘topic’.

2.3.21 topic

a) An aggregate of topic characteristics, including zero or more names, occurrences, and roles played in associations with other topics, whose organizing principle is a single subject.

b) A topic link element.

2.3.22 topic association

a) A specific relationship among specific topics that is asserted by an association link element.

b) An association link element.

2.3.23 topic characteristic

Any defining characteristic of a topic. There are three kinds of topic characteristics:

a) names,

b) occurrences, and

c) roles played in relationships (‘associations’) with other topics.

For example, a name of a topic is a ‘name characteristic’ of that topic.
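The three kinds of topic characteristics map naturally onto a small record type. The following Python sketch is an assumption-laden illustration, not a structure defined by the standard: the `Topic` class, its field names and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic as an aggregate of characteristics around a single subject."""
    subject: str
    names: list = field(default_factory=list)        # name characteristics
    occurrences: list = field(default_factory=list)  # relevant resources
    roles: list = field(default_factory=list)        # (association, role) pairs

    def characteristics(self):
        """All characteristics of the topic, tagged with their kind."""
        return (
            [("name", n) for n in self.names]
            + [("occurrence", o) for o in self.occurrences]
            + [("role", r) for r in self.roles]
        )


# Illustrative data only.
tosca = Topic(
    subject="the opera Tosca",
    names=["Tosca"],
    occurrences=["libretto.sgm"],
    roles=[("composed-by", "work")],
)
```

Each entry returned by `tosca.characteristics()` corresponds to one topic characteristic assignment in the sense of the next definition.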

2.3.24 topic characteristic assignment

a) The mechanism whereby a topic characteristic becomes a characteristic of a topic. For example, topname subelements of topic link elements are used to assign names to topics as topic characteristics, so, in topic map documents, they perform the function of assigning topic name characteristics.

b) The fact that a particular topic characteristic is a characteristic of a particular topic.

2.3.25 topic link

A hyperlink element conforming to the topic link architectural form.

2.3.26 topic map

a) A set of information resources regarded by a topic map application as a bounded object set whose hub document is a topic map document conforming to the SGML architecture defined by this International Standard.

b) Any topic map document conforming to the SGML architecture defined by this International Standard, or the document element (topicmap) of such a document.

c) The document element type (topicmap) of the topic map document architecture.

2.3.27 topic name

a) A string of characters specified as a name of a topic; a name characteristic of a topic.

 b) A topic name (topname) element, as defined by this International Standard.

c) Either a base name (basename), display name (dispname) or name to be used as sort key (sortname) element, as defined by this International Standard, and/or the information that such an element contains.

d) A combination of the foregoing definitions.

2.3.28 topic occurrence

Information that is specified as relevant to a given subject.

2.3.29 topic type

a) A subject which is a class of topics.

b) One of the classes of topics of which a particular topic link is an instance. The topic types of which a given topic link is an instance can be specified via its optional types attribute.

2.3.30 unconstrained scope

The scope comprised of all of the topics in a topic map. When no applicable scope attributes are explicitly specified as governing a topic characteristic assignment, the scope within which the topic characteristic assignment is made is the unconstrained scope.

2.4 Topic Maps architecture

 

This clause defines the syntax of topic maps. The Topic Maps syntax makes use of the base, location address, and hyperlinking modules of the HyTime architecture. When interchanged, topic maps are HyTime bounded object sets (BOSs). The hub document of such a BOS must contain a Topic Maps architectural support declaration. Only one of the hyperlink syntaxes defined by HyTime is used in the topic map syntax: variable link (varlink).

The HyTime architecture provides a comprehensive set of addressing mechanisms and a standard syntax for using them. In addition, it provides means whereby any addressing syntax can be declared and used. The topic map architecture preserves these features of HyTime. Thus, the Topic Maps architecture allows topic map authors to use any addressing scheme, including proprietary addressing mechanisms driven by expressions in any notations, provided each such notation is formally declared as a notation in the manner prescribed by the SGML and HyTime International Standards.

2.4.1 Topic Map Architectural Form

The topic map (topicmap) element form is used as the document element of all documents that conform to the Topic Maps architecture defined by this International Standard. The effect of specifying the added themes (addthems) attribute is to add the themes that it references to the scopes of all of the topic characteristic assignments made throughout the document of which the element is the root element.

The topicmap element type is derived from the document element type of the HyTime architecture (HyDoc). All of the remaining attributes (maxbos, boslevel and grovplan) are inherited from HyDoc. The optional maxbos and boslevel attributes are used in hub documents in specifying the members of the HyTime bounded object set rooted at the document. The optional grovplan attribute is used in HyTime addressing. (See ISO/IEC 10744:1997.)

 

<!entity %
       TMCFC         -- Topic map context-free content --
       "topic|assoc|facet|bosspec|addthms|TMBrid"
>

<!element
       TMBrid        -- Topic map bridge element --
       - O
       ANY
>

<!element
       topicmap      -- Topic map document element --
                     -- Clause: 5.1 --
       - O
       (%TMCFC;)*
>

<!attlist
  topicmap
   HyTime            -- HyTime architectural form name --
       NAME
       HyDoc         -- HyTime document element. (This
                     attribute definition is redundant; it
                     appears here as an aid to
                     understanding.) --
   addthems          -- Added themes --
                     -- Themes to add to all scopes that govern
                     the assignments of topic names,
                     occurrences, and roles played in
                     associations in this topic map
                     document. --
       CDATA         -- Reference --
                     -- Reftype: topic+ --
       #IMPLIED      -- Default: No themes added via this
                     attribute. --
   -- bos --         -- HyTime bounded object set --
                     -- HyTime Clause: 6.5.1 --
   maxbos            -- Maximum bounded object set level --
                     -- Bounding level of HyTime bounded object
                     set when document is a hub or
                     subhub. --
       NUMBER        -- Constraint: Depth of nested entities to
                     include in BOS (0=no limit, 1=hub only)
                     --
       0
   boslevel          -- Bounded object set level --
                     -- Default BOS level used by data entities
                     declared in hub document. --
       NUMBER        -- Constraint: Depth of nested entities to
                     include in BOS (0=no limit, 1=this
                     entity only) --
       #IMPLIED      -- Default: No HyTime BOS --
   -- bosspcat --    -- BOS exception specification attributes
                     --
                     -- HyTime Clause: 6.5.3 --
   bosspec           -- Bounded object set exception
                     specification --
                     -- Adjustments to be made to the bounded
                     object set. --
       IDREFS        -- Reference --
                     -- Reftype: bosspec+ --
                     -- Constraint: Must be internal reference
                     --
       #IMPLIED      -- Default: No BOS exception specification
                     --
   -- dgrvplan --    -- HyTime document grove plan --
                     -- HyTime Clause: 7.1.4.1 --
   grovplan          -- Grove plan --
                     -- Grove plan for HyTime extended SGML
                     document grove --
       CDATA         -- Reference --
                     -- Reftype: grovplan --
       #IMPLIED      -- Default: HyTime default grove plan --
>
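The constraint comment on maxbos (0 = no limit, 1 = hub only) can be read as a depth limit on following entity references from the hub document. The Python sketch below illustrates only that reading and is a deliberate simplification: it ignores boslevel and bosspec adjustments, and the `references` mapping, the file names and the `bounded_object_set` function are hypothetical rather than part of HyTime.

```python
def bounded_object_set(references, hub, maxbos=0):
    """Collect the entities in a BOS rooted at `hub`.

    references: dict mapping an entity to the entities it refers to
                (a hypothetical stand-in for declared entities).
    maxbos:     0 means no depth limit, 1 means the hub document only,
                2 means the hub plus the entities it references, and so on.
    """
    bos = [hub]
    frontier = [hub]
    level = 1  # the hub itself counts as level 1
    while frontier and (maxbos == 0 or level < maxbos):
        next_frontier = []
        for entity in frontier:
            for ref in references.get(entity, ()):
                if ref not in bos:  # also protects against reference cycles
                    bos.append(ref)
                    next_frontier.append(ref)
        frontier = next_frontier
        level += 1
    return bos


# Illustrative reference structure with a cycle back to the hub.
REFERENCES = {
    "map.sgm": ["a.sgm", "b.sgm"],
    "a.sgm": ["c.sgm"],
    "c.sgm": ["map.sgm"],
}
```

With this data, `maxbos=1` yields only the hub, `maxbos=2` adds the two entities the hub refers to, and the default of 0 follows references until nothing new is found.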

 

 

3. Topic maps and XML

3.1 How are topic maps significant for XML?

HTML was the first language chosen for the Web. It allows one-way hypertext links to be presented, but XML meets requirements that HTML never could. Unlike HTML, XML provides a way of marking up information so that significant semantics is associated with significant pieces of information. With suitable algorithms and a little ingenuity, topic map authors can increase their productivity considerably by taking advantage of such semantic markup. They can more readily bring additional resources into the base of data whose topics underpin the finding aids, and they can more easily identify topics, topic types, topic relationships, relationships between topic types, occurrences, occurrence types, names, and the scopes to be applied to these topics.

3.2 XTM: XML Topic Maps

XML Topic Maps (XTM) is a product of the TopicMaps.Org Authoring Group (AG), formed in 2000 by the independent consortium TopicMaps.Org.

TopicMaps.Org was founded to develop and put into practice improvements in finding and managing information on the World Wide Web.

3.3 Design goals of XTM (XML Topic Maps)

XTM shall be straightforwardly usable over the Internet.

XTM shall support a wide variety of applications.

XTM shall be compatible with XML, XLink and ISO 13250.

It shall be easy to write programs that process XTM documents.

XTM documents shall be human-legible, formal and concise.

 

This specification, together with XML 1.0 (syntax), XLink 1.0 (linking syntax), XML Base (URI resolution) and the IETF URI specification, provides all the information needed to understand XTM 1.0 and to create topic map documents.

3.4 Introduction to XTM Syntax

The syntax for serializing and interchanging topic map documents conforming to this specification is defined by the XML document type definition provided in Annex D: XTM 1.0 (http://www.topicmaps.org/xtm/1.0/) Document Type Declaration. This section provides documentation for all the element types defined in that DTD.

 

The following is a complete list of XTM element types in the order in which they are documented:

 

<topicRef>: Reference to a Topic element

<subjectIndicatorRef>: Reference to a Subject Indicator

<scope>: Reference to Topic(s) that comprise the Scope

<instanceOf>: Points to a Topic representing a class

<topicMap>: Topic Map document element

<topic>: Topic element

<subjectIdentity>: Subject reified by Topic

<baseName>: Base Name of a Topic

<baseNameString>: Base Name String container

<variant>: Alternate forms of Base Name

<variantName>: Container for Variant Name

<parameters>: Processing context for Variant

<association>: Topic Association

<member>: Member in Topic Association

<roleSpec>: Points to a Topic serving as an Association Role

<occurrence>: Resources regarded as an Occurrence

<resourceRef>: Reference to a Resource

<resourceData>: Container for Resource data

<mergeMap>: Merge with another Topic Map
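To show how a few of these element types combine, the following Python sketch reads a tiny XTM 1.0 document with the standard library module `xml.etree.ElementTree`. It is an illustration under stated assumptions: the sample topic map, the `example.org` URL and the `read_topics` helper are invented for this example, and only `<topic>`, `<instanceOf>`, `<topicRef>`, `<baseName>`, `<baseNameString>`, `<occurrence>` and `<resourceRef>` are handled.

```python
import xml.etree.ElementTree as ET

XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"xtm": XTM_NS}

# A made-up topic map with one typed topic (illustrative data only).
XTM_DOCUMENT = """\
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="country"/>
  <topic id="slovakia">
    <instanceOf><topicRef xlink:href="#country"/></instanceOf>
    <baseName><baseNameString>Slovakia</baseNameString></baseName>
    <occurrence>
      <resourceRef xlink:href="http://example.org/slovakia.html"/>
    </occurrence>
  </topic>
</topicMap>
"""


def read_topics(xtm_text):
    """Return {topic id: {'types', 'names', 'occurrences'}} for an XTM text."""
    root = ET.fromstring(xtm_text)
    href = "{%s}href" % XLINK_NS  # xlink:href in ElementTree's notation
    topics = {}
    for topic in root.findall("xtm:topic", NS):
        topics[topic.get("id")] = {
            "types": [r.get(href)
                      for r in topic.findall("xtm:instanceOf/xtm:topicRef", NS)],
            "names": [n.text
                      for n in topic.findall("xtm:baseName/xtm:baseNameString", NS)],
            "occurrences": [r.get(href)
                            for r in topic.findall("xtm:occurrence/xtm:resourceRef", NS)],
        }
    return topics


topics = read_topics(XTM_DOCUMENT)
```

The result maps each topic id to its class references, base names and occurrence addresses, which is enough to build a simple index over the map. A full processor would also have to handle scopes, variants, associations and `<mergeMap>`.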

 

4. TMQL (Topic Map Query Language)

               

The Topic Map Query Language (TMQL) is a not yet standardized grammar that offers broad possibilities for accessing topic map models. The company Empolis created the product k42, in which it used TMQL. K42 is the first implementation of TMQL that meets the user's requirement to identify a set of topic map components by means of a user-specified query.

TMQL is an effective way of establishing a standard grammar for accessing topic map objects in any topic map application.

Empolis is at the forefront of this research. With its product k42 it is working towards the standardization of TMQL.

 

5. Conclusion

With the growing volume of information on the worldwide information network (the Internet), topic maps are of great importance. This way of managing knowledge and information can be used in a truly wide range of areas. The companies working on topic maps are trying to specify a language in which topic maps can be created easily by the widest possible range of users, that is, by people who need no specific knowledge of Java or other object-oriented languages. The whole effort is meant to serve the general public: it should make finding information easier and thereby also lighten the load on the network, since superfluous data will no longer flow to end users.

 

6. References

[1] ISO/IEC 13250 - The Topic Maps specification for SGML.

[2] http://www.topicmaps.org/xtm/1.0/ - TopicMaps.Org is an independent consortium dedicated to developing the applicability of topic maps using the XML specifications. This page contains version 1.0 of the XML Topic Maps (XTM) specification.

[3] http://www.w3.org/TR/xml-infoset/ - This page contains the definition of the abstract data model XML Information Set (Infoset). Its purpose is to provide a set of definitions for use in other specifications that refer to the information in an XML 1.0 (Second Edition) document.

[4] http://www.topicmap.com/ - The name of this site speaks for itself. It offers news from the field and links to related topics.

[5] http://www.ontopia.net/topicmaps/materials/tao.html - An article by Steve Pepper. It describes the problem of information overload on the network and how topic maps can help to solve it, and it outlines the basic features of a topic map document. Ontopia is a Norwegian company engaged in solutions and research for knowledge and information management.

[6] http://k42.empolis.co.uk/home.html - Empolis is a company with a focus similar to Ontopia's, based in Germany. It built the k42 program on the basis of TMQL (Topic Map Query Language) and is working towards standardizing the language.

[7] http://xml.coverpages.org/xml.html - This page provides information about XML (eXtensible Markup Language).