
%%%
Title = "XR Fragments"
area = "Internet"
workgroup = "Internet Engineering Task Force"

[seriesInfo]
name = "XR-Fragments"
value = "draft-XRFRAGMENTS-leonvankammen-00"
stream = "IETF"
status = "informational"

date = 2023-04-12T00:00:00Z

[[author]]
initials="L.R."
surname="van Kammen"
fullname="L.R. van Kammen"

%%%

.# Abstract

This draft is a specification for 4D URLs & hypermediatic navigation, which links space, time & text together, for hypermedia browsers with or without a network-connection.
The specification promotes spatial addressability, sharing, navigation, querying and databinding of objects for (XR) Browsers.
XR Fragments allows us to better use existing metadata inside 3D scene(files), by connecting it to proven technologies like URI Fragments.

Almost every idea in this document is demonstrated at https://xrfragment.org

{mainmatter}

# Introduction

How can we add more control to existing text & 3D scenes, without introducing new dataformats?
Historically, there have been many attempts to create the ultimate markuplanguage or 3D fileformat.
The lowest common denominator is: designers describing/tagging/naming things using plain text.
XR Fragments exploits the fact that all 3D models already contain such metadata:

XR Fragments allows controlling metadata in 3D scene(files) using URLs

Or in more detail:

  1. addressability and hypermediatic navigation of 3D scenes/objects: URI Fragments + src/href spatial metadata
  2. Interlinking (text)objects by collapsing space into a Word Graph (XRWG) to show visible links
  3. unlocking the spatial potential of the (originally 2D) hashtag (which jumps to a chapter) for navigating XR documents

NOTE: The chapters in this document are ordered from highlevel to lowlevel (technical) as much as possible

# Core principle

XR Fragments allows controlling metadata in 3D scene(files) using URLs

XR Fragments seeks to connect the world of text (semantical web / RDF) and the world of pixels.
Instead of combining them (in a game-editor e.g.), XR Fragments integrates all of them, by collecting metadata into an XRWG and controlling it via URL:

| principle | XR 4D URL | HTML 2D URL |
|-----------|-----------|-------------|
| the XRWG | wordgraph (collapses 3D scene to tags) | Ctrl-F (find) |
| the hashbus | hashtags alter camera/scene/object-projections | hashtags alter document positions |
| src metadata | renders content and offers sourceportation | renders content |
| href metadata | teleports to other XR document | jumps to other HTML document |
| href metadata | triggers predefined view | Media fragments |
| href metadata | triggers camera/scene/object/projections | n/a |
| href metadata | draws visible connection(s) for XRWG 'tag' | n/a |
| href metadata | queries certain (in)visible objects | n/a |

XR Fragments does not look at XR (or the web) thru the lens of HTML,
but approaches things from a higher-level feedbackloop/hypermedia browser-perspective:

 +──────────────────────────────────────────────────────────────────────────────────────────────+
 │                                                                                              │
 │   the soul of any URL:       ://macro        /meso            ?micro      #nano              │
 │                                                                                              │
 │                2D URL:       ://library.com  /document        ?search     #chapter           │
 │                                                                                              │
 │                4D URL:       ://park.com     /4Dscene.fbx ──> ?misc  ──>  #view ───> hashbus │
 │                                                │                          #query      │      │
 │                                                │                          #tag        │      │
 │                                                │                          #material   │      │
 │                                                │                          #animation  │      │
 │                                                │                          #texture    │      │
 │                                                │                          #variable   │      │
 │                                                │                                      │      │
 │                                               XRWG <─────────────────────<────────────+      │
 │                                                │                                      │      │
 │                                                └─ objects  ──────────────>────────────+      │
 │                                                                                              │
 │                                                                                              │
 +──────────────────────────────────────────────────────────────────────────────────────────────+

Traditional webbrowsers can become 4D document-ready by:

  • hypermediatically loading 3D assets (gltf/fbx e.g.) natively (with or without using HTML).
  • allowing assets to publish hashtags to themselves (the scene) using the hashbus (like hashtags controlling the scrollbar).
  • collapsing the 3D scene to a wordgraph (for essential navigation purposes), controllable thru a hash(tag)bus

XR Fragments itself is hypermediatic and HTML-agnostic, though pseudo-XR Fragment browsers can be implemented on top of HTML/Javascript.

# Conventions and Definitions

See appendix below in case certain terms are not clear.

# XR Fragment URI Grammar

```
reserved    = gen-delims / sub-delims
gen-delims  = "#" / "&"
sub-delims  = "," / "="
```

Example: `://foo.com/my3d.gltf#pos=1,0,0&prio=-5&t=0,100`
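
A minimal javascript sketch of this grammar (the function name is illustrative, not part of the spec):

```js
// Split an XR Fragment URL into key/value arguments, following the grammar
// above: '#' starts the fragment, '&' separates arguments, '=' separates
// key/value, ',' separates vector components.
function parseXRFragment(url) {
  const hash = url.split('#')[1] || ''
  const args = {}
  for (const pair of hash.split('&')) {
    const [key, value] = pair.split('=')
    args[key] = value === undefined ? true : value.split(',')
  }
  return args
}

// parseXRFragment('://foo.com/my3d.gltf#pos=1,0,0&prio=-5&t=0,100')
// => { pos: ['1','0','0'], prio: ['-5'], t: ['0','100'] }
```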

| Demo | Explanation |
|------|-------------|
| pos=1,2,3 | vector/coordinate argument e.g. |
| pos=1,2,3&rot=0,90,0&q=foo | combinators |

this is already implemented in all browsers

# List of URI Fragments

| fragment | type | example | info |
|----------|------|---------|------|
| #pos | vector3 | #pos=0.5,0,0 | positions camera (or XR floor) at xyz-coordinate 0.5,0,0 |
| #rot | vector3 | #rot=0,90,0 | rotates camera to xyz-rotation 0,90,0 |
| #t | timevector | #t=2,2000,1 | play animation-loop range between frame 2 and 2000 at (normal) speed 1 |
| #q | query | #q=-sky -tag:hide | queries scene-graph (hides object named sky and objects with tag hide) |

# List of metadata for 3D nodes

| key | type | example (JSON) | function | existing compatibility |
|-----|------|----------------|----------|------------------------|
| href | string | "href": "b.gltf" | XR teleport | custom property in 3D fileformats |
| src | string | "src": "#cube" | XR embed / teleport | custom property in 3D fileformats |
| tag | string | "tag": "cubes geo" | tag object (for query-use / XRWG highlighting) | custom property in 3D fileformats |

Popular compatible 3D fileformats: .gltf, .obj, .fbx, .usdz, .json (THREE.js), .dae and so on.

## vector datatypes

| type | syntax | example | info |
|------|--------|---------|------|
| vector2 | x,y | 2,3.0 | 2-dimensional vector |
| vector3 | x,y,z | 2,3.0,4 | 3-dimensional vector |
| timevector | speed | 1 | 1D timeline: play |
| | | 0 | 1D timeline: stop |
| | x,speed | 1,2 | 1D timeline: play at offset 1 at (normal) speed 2 |
| | | 0,0 | 1D timeline: stop (stopoffset-startoffset == 0) |
| | | 0,1 | 1D timeline: unpause with (normal) speed 1 |
| | | 1..100,1 | 1D timeline: play (loop) between offset 1 and 100 at normal speed (1) |
| | x,y,xspeed,yspeed | 0,0.5,0,0 | 2D timeline: stop uv-coordinate at 0,0.5 |
| | | 0,0.5,0.2,0 | 2D timeline: play uv-coordinate at offset 0,0.5 and scroll x (=u) 0.2 within each second |
| | | 0,0..0.5,0.2,0 | 2D timeline: play uv-coordinate between offset 0,0 and 0,0.5 (loop) and scroll x (=u) 0.2 within each second |
| | x,y,z,xspeed,yspeed,zspeed | 0,0.5,1,0.2,0,2 | XD timeline: play uv-coordinate at 0,0.5, scroll x (=u) 0.2 within each second, and pass 1 and 2 as custom data to shader uniforms za and zb |
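
A minimal javascript sketch of parsing these timevector forms (covers the 1-, 2-, 4- and 6-component forms and `..` ranges above; the function name and result shape are illustrative):

```js
// Normalize a timevector string: the first half of the components are
// offsets, the second half speeds; offsets may be ranges like '1..100'.
function parseTimeVector(value) {
  const num = (s) => {
    if (!s.includes('..')) return parseFloat(s)
    const [from, to] = s.split('..').map(parseFloat)
    return { from, to }
  }
  const p = value.split(',').map(num)
  if (p.length === 1) return { speed: p }        // '1' → play, '0' → stop
  const half = Math.floor(p.length / 2)
  return { offset: p.slice(0, half), speed: p.slice(half) }
}

// parseTimeVector('1..100,1') => { offset: [{from:1,to:100}], speed: [1] }
// parseTimeVector('0,0.5,0.2,0') => { offset: [0,0.5], speed: [0.2,0] }
```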

NOTE: XR Fragments are optional but also file- and protocol-agnostic, which means that programmatic 3D scene(nodes) can also use the mechanism/metadata.

# Dynamic XR Fragments (+databindings)

These are automatic fragment-to-metadata mappings, which only trigger if the 3D scene metadata matches a specific identifier (aliasname e.g.)

| fragment | type | example | info |
|----------|------|---------|------|
| #<aliasname> | string | #cubes | evaluate predefined views (#cubes: #foo&bar e.g.) |
| #<tag_or_objectname> | string | #person | focus object(s) with tag person or name person by looking up the XRWG |
| #<cameraname> | string | #cam01 | set camera as active camera |
| #<objectname>=<material> | string=string | #car=metallic | set material of car to material with name metallic |
| | string=string | #product=metallic | set material of objects tagged with product to material with name metallic |
| #<objectname>=<mediafrag> | string=mediafrag | #foo=0,1 | play media src using media fragment URI |
| #<objectname>=<timevector> | string=timevector | #sky=0,0.5,0.1,0 | sets 1D/2D/3D time(line) vectors (uv-position e.g.) to 0,0.5 (and autoscroll x with max 0.1 every second) |
| | | #music=1,2 | play media of object (src: podcast.mp3 e.g.) from beginning (1) at double speed (2) |

# Spatial Referencing 3D

XR Fragments assumes the following objectname-to-URIFragment mapping:


  my.io/scene.fbx
  +─────────────────────────────+
  │ sky                         │  src: http://my.io/scene.fbx#sky          (includes building,mainobject,floor)
  │ +─────────────────────────+ │ 
  │ │ building                │ │  src: http://my.io/scene.fbx#building     (includes mainobject,floor)
  │ │ +─────────────────────+ │ │
  │ │ │ mainobject          │ │ │  src: http://my.io/scene.fbx#mainobject   (includes floor)
  │ │ │ +─────────────────+ │ │ │
  │ │ │ │ floor           │ │ │ │  src: http://my.io/scene.fbx#floor        (just floor object)
  │ │ │ │                 │ │ │ │
  │ │ │ +─────────────────+ │ │ │
  │ │ +─────────────────────+ │ │
  │ +─────────────────────────+ │
  +─────────────────────────────+

Every 3D fileformat supports named 3D objects, and these names allow URLs (fragments) to reference them (and their child objects).

Clever nested design of 3D scenes allows great ways of re-using content and/or previewing scenes.
For example, to render a portal with a preview-version of the scene, create a 3D object with:

  • href: https://scene.fbx
  • src: https://otherworld.gltf#mainobject

It also allows sourceportation, which basically means the enduser can teleport to the original XR Document of an src-embedded object, and see a visible connection to that particular embedded object. In essence, activating an embedded link turns it into an outbound link.

# Navigating 3D

| fragment | type | functionality |
|----------|------|---------------|
| #pos=0,0,0 | vector3 | (re)position camera |
| #t=0,100 | timevector | set playback speed, and (re)position looprange of scene-animation or src-mediacontent |
| #rot=0,90,0 | vector3 | rotate camera |

» example implementation
» discussion

  1. the Y-coordinate of pos identifies the floorposition. This means that desktop-projections usually need to add 1.5m (average person height) on top (which is done automatically by VR/AR headsets).
  2. set the position of the camera according to the vector3 values of #pos (see the sketch below)
  3. rot sets the rotation of the camera (only for non-VR/AR headsets)
  4. t sets the playbackspeed and animation-range of the current scene animation(s) or src-mediacontent (video/audioframes e.g.; use t=0,7,7 to 'STOP' at frame 7 e.g.)
  5. in case an href does not mention any pos-coordinate, pos=0,0,0 will be assumed
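
A minimal javascript sketch of rules 1-3, assuming a THREE.js-style camera object (the function and parameter names are illustrative):

```js
// Apply #pos / #rot from a fragment string to the camera.
function applyNavigationFragment(camera, frag, isHMD) {
  const args = Object.fromEntries(
    frag.replace(/^#/, '').split('&').map((kv) => kv.split('='))
  )
  if (args.pos) {
    const [x, y, z] = args.pos.split(',').map(parseFloat)
    // rule 1: pos marks the floor; desktop-projections add ~1.5m person height
    camera.position.set(x, y + (isHMD ? 0 : 1.5), z)
  }
  if (args.rot && !isHMD) {
    // rule 3: rot only applies to non-VR/AR headsets (degrees → radians)
    const [rx, ry, rz] = args.rot.split(',').map((v) => parseFloat(v) * Math.PI / 180)
    camera.rotation.set(rx, ry, rz)
  }
}
```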

Here's an ascii representation of a 3D scene-graph which contains 3D objects and their metadata:

  +────────────────────────────────────────────────────────+ 
  │                                                        │
  │  index.gltf                                            │
  │    │                                                   │
  │    ├── ◻ buttonA                                       │
  │    │      └ href: #pos=1,0,1&t=100,200                 │
  │    │                                                   │
  │    └── ◻ buttonB                                       │
  │           └ href: other.fbx                            │   <── file─agnostic (can be .gltf .obj etc)
  │                                                        │
  +────────────────────────────────────────────────────────+

An XR Fragment-compatible browser viewing this scene allows the end-user to interact with buttonA and buttonB.
In case of buttonA, the end-user will be teleported to another location and time in the currently loaded scene; buttonB, however, will replace the current scene with a new one (other.fbx) and assume pos=0,0,0.

# Top-level URL processing

Example URL: ://foo/world.gltf#cube&pos=0,0,0

The URL-processing-flow for hypermedia browsers goes like this:

  1. IF a #cube matches a custom property-key (of an object) in the 3D file/scene (#cube: #......), THEN execute that predefined_view.
  2. IF scene operators (pos) and/or animation operators (t) are present in the URL, THEN (re)position the camera and/or animation-range accordingly.
  3. IF no camera-position has been set in step 1 or 2, THEN update the top-level URL with #pos=0,0,0 (example)
  4. IF a #cube matches the name (of an object) in the 3D file/scene, THEN draw a line from the enduser('s heart) to that object (to highlight it).
  5. IF a #cube matches anything else in the XR Word Graph (XRWG), THEN draw wires to the matches (text or related objects).
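
A minimal javascript sketch of this flow (all helper names are illustrative; parseXRFragment is the grammar sketch from earlier):

```js
// Walk the five steps above for a freshly loaded top-level URL.
function processTopLevelURL(scene, url) {
  const frag = parseXRFragment(url)
  for (const key of Object.keys(frag)) {
    // step 1: fragment matches a custom property-key → predefined view
    if (hasCustomProperty(scene, '#' + key)) return evaluatePredefinedView(scene, '#' + key)
  }
  if (frag.pos) positionCamera(frag.pos)          // step 2: scene operators
  if (frag.t)   setAnimationRange(frag.t)
  if (!frag.pos) updateTopLevelURL('#pos=0,0,0')  // step 3: default position
  for (const key of Object.keys(frag)) {
    if (findObjectByName(scene, key)) drawWireTo(scene, key)  // step 4: name match
    else drawWiresToXRWGMatches(scene, key)                   // step 5: XRWG match
  }
}
```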

# Embedding XR content (src-instancing)

src is the 3D version of the iframe.
It instances content (in objects) in the current scene/asset.

| fragment | type | example value |
|----------|------|---------------|
| src | string (uri, hashtag/query) | #cube |
| | | #sometag |
| | | #q=-ball_inside_cube |
| | | #q=-/sky -rain |
| | | #q=-.language .english |
| | | #q=price:>2 price:<5 |
| | | https://linux.org/penguin.png |
| | | https://linux.world/distrowatch.gltf#t=1,100 |
| | | linuxapp://conference/nixworkshop/apply.gltf#q=flyer |
| | | androidapp://page1?tutorial#pos=0,0,1&t=1,100 |
| | | foo.mp3#0,0,0 |

Here's an ascii representation of a 3D scene-graph with 3D objects which embeds remote & local 3D objects with/out using queries:

  +────────────────────────────────────────────────────────+  +─────────────────────────+ 
  │                                                        │  │                         │
  │  index.gltf                                            │  │ ocean.com/aquarium.fbx  │
  │    │                                                   │  │   │                     │
  │    ├── ◻ canvas                                        │  │   └── ◻ fishbowl        │
  │    │      └ src: painting.png                          │  │         ├─ ◻ bass       │
  │    │                                                   │  │         └─ ◻ tuna       │
  │    ├── ◻ aquariumcube                                  │  │                         │       
  │    │      └ src: ://rescue.com/fish.gltf#bass%20tuna   │  +─────────────────────────+
  │    │                                                   │    
  │    ├── ◻ bedroom                                       │   
  │    │      └ src: #canvas                               │
  │    │                                                   │   
  │    └── ◻ livingroom                                    │      
  │           └ src: #canvas                               │
  │                                                        │
  +────────────────────────────────────────────────────────+

An XR Fragment-compatible browser viewing this scene lazy-loads and projects painting.png onto the (plane) object called canvas (which is copy-instanced in the bedroom and livingroom).
Also, after lazy-loading ocean.com/aquarium.fbx, only the queried objects bass and tuna will be instanced inside aquariumcube.
Resizing will happen according to its placeholder object aquariumcube; see chapter Scaling.

Instead of cherrypicking objects with #bass&tuna thru src, queries can be used to import the whole scene (and filter out certain objects); see the next chapter.

Specification (a sketch of rules 1-3 and 6 follows below):

  1. local/remote content is instanced by the src (query) value (and attached to the placeholder mesh containing the src property)
  2. local src values (URLs starting with #, like #cube&foo) mean that only the mentioned objectnames will be copied to the instanced scene (from the current scene), while preserving their names (to support recursive selectors). (example code)
  3. local src values indicating a query (#q=) mean that all included objects (from the current scene) will be copied to the instanced scene (before applying the query), while preserving their names (to support recursive selectors). (example code)
  4. the instanced scene (from a src value) should be scaled according to its placeholder object, or scaled relatively based on the scale-property (of a geometry-less placeholder, an 'empty'-object in blender e.g.). For more info see Chapter Scaling.
  5. external src values should be served with an appropriate mimetype (so the XR Fragment-compatible browser knows how to render it). The bare minimum supported mimetypes are:
     • model/gltf+json
     • image/png
     • image/jpg
     • text/plain;charset=utf-8;bib=^@
  6. src values should make their placeholder object invisible, and only flush its children when the resolved content can successfully be retrieved (see broken links)
  7. external src values should respect the fallback link mechanism (see broken links)
  8. when the placeholder object is a 2D plane, but the mimetype is 3D, then render the spatial content on that plane via a stencil buffer.
  9. src-values are non-recursive: when linking to an external object (src: foo.fbx#bar), src-metadata on object bar should be ignored.
  10. clicking on external src-values always allows sourceportation: teleporting to the origin URI to which the object belongs.
  11. when only one object was cherrypicked (#cube e.g.), set its position to 0,0,0
  12. equirectangular detection: when the width of an image is twice the height (aspect 2:1), an equirectangular projection is assumed.
  13. when the enduser clicks an href with #t=1,0,0 (play), it will be applied to all src mediacontent with a timeline (mp4/mp3 e.g.)
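
A minimal javascript sketch of rules 1-3 and 6, assuming a THREE.js-style scene-graph (fetchAndParse and applyQuery are illustrative helpers):

```js
// Resolve a src value and attach the result to its placeholder object.
function resolveSrc(scene, placeholder, src) {
  placeholder.visible = false                      // rule 6: hide the placeholder
  if (src.startsWith('#q=')) {
    // rule 3: copy the whole current scene, then apply the query
    placeholder.add(applyQuery(scene.clone(), src.slice(3)))
  } else if (src.startsWith('#')) {
    // rule 2: copy only the mentioned objectnames, preserving their names
    for (const name of src.slice(1).split('&')) {
      const obj = scene.getObjectByName(name)
      if (obj) placeholder.add(obj.clone())
    }
  } else {
    // external src: fetch, verify mimetype, only attach children on success
    fetchAndParse(src).then((content) => placeholder.add(content))
  }
}
```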

» example implementation
» example 3D asset
» discussion

# Navigating content (internal/outbound href portals)

navigation, portals & mutations

| fragment | type | example value |
|----------|------|---------------|
| href | string (uri or predefined view) | #pos=1,1,0 |
| | | #pos=1,1,0&rot=90,0,0 |
| | | ://somefile.gltf#pos=1,1,0 |
  1. clicking an outbound *external*- or *file URI* fully replaces the current scene and assumes pos=0,0,0&rot=0,0,0 by default (unless specified)

  2. relocation/reorientation should happen locally for local URIs (#pos=... e.g.)

  3. navigation should not happen *immediately* when the user is more than 2 meters away from the portal/object containing the href (to prevent accidental navigation e.g.)

  4. URL navigation should always be reflected in the client (in case of javascript: see here for an example navigator).

  5. in XR mode, the navigator back/forward-buttons should always be visible (using a wearable e.g.; see here for an example wearable)

  6. in case of navigating to a new pos(ition), *first* navigate to the *current position*, so that the *back-button* of the *browser-history* always refers to the previous position (see here, and the sketch below)

  7. portal-rendering: a 2:1 ratio texture-material indicates an equirectangular projection
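
A minimal javascript sketch of rules 1, 2, 4 and 6 (camera, isHMD and the helpers are illustrative; applyNavigationFragment is the sketch from Navigating 3D):

```js
// Navigate to an href value, keeping the browser-history usable.
function navigateTo(href) {
  const p = camera.position
  // rule 6: push the *current* position first, so the back-button returns here
  history.pushState({}, '', `#pos=${p.x},${p.y},${p.z}`)
  history.pushState({}, '', href)                 // rule 4: reflect navigation in the URL
  if (href.startsWith('#')) {
    applyNavigationFragment(camera, href, isHMD)  // rule 2: local relocation
  } else {
    loadScene(href)                               // rule 1: external/file URI replaces scene
  }
}
```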

» example implementation
» example 3D asset
» discussion

# UX spec

End-users should always have read/write access to:

  1. the current (toplevel) URL (a URL-bar e.g.)
  2. URL-history (a back/forward button e.g.)
  3. Clicking/Touching an href navigates (and updates the URL) to another scene/file (and coordinate e.g., in case the URL contains XR Fragments).

# Scaling instanced content

Sometimes embedded properties (like src) instance new objects.
But what about their scale?
How does the scale of the object (with the embedded properties) impact the scale of the referenced content?

Rule of thumb: visible placeholder objects act as a '3D canvas' for the referenced scene (a plane acts like a 2D canvas for images e.g., a cube as a 3D canvas).

  1. IF an embedded property (src e.g.) is set on a non-empty placeholder object (geometry of >2 vertices):

     • calculate the bounding box of the *placeholder* object (maxsize=1.4 e.g.)
     • hide the *placeholder* object (its material e.g.)
     • instance the src scene as a child of the existing object
     • calculate the bounding box of the instanced scene, and scale it accordingly (to 1.4 e.g.)

     REASON: a non-empty placeholder object can act as a protective bounding-box (for remote content which might grow over time e.g.)

  2. ELSE multiply the scale-vector of the instanced scene with the scale-vector (a common property of a 3D node) of the placeholder object.

TODO: needs intermediate visuals to make things more obvious; a code sketch follows below.
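
A minimal javascript sketch of both rules, assuming a THREE.js-style API:

```js
import * as THREE from 'three'

// Scale instanced src content relative to its placeholder object.
function scaleInstancedContent(placeholder, instanced) {
  const nonEmpty = placeholder.geometry &&
                   placeholder.geometry.attributes.position.count > 2
  if (nonEmpty) {
    // rule 1: fit the instanced scene into the placeholder's bounding box
    const max  = new THREE.Box3().setFromObject(placeholder).getSize(new THREE.Vector3())
    const size = new THREE.Box3().setFromObject(instanced).getSize(new THREE.Vector3())
    instanced.scale.multiplyScalar(Math.min(max.x / size.x, max.y / size.y, max.z / size.z))
    placeholder.material.visible = false   // hide the placeholder itself
  } else {
    // rule 2: multiply the scale-vectors
    instanced.scale.multiply(placeholder.scale)
  }
  placeholder.add(instanced)
}
```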

# XR Fragment: pos

# XR Fragment: rot

# XR Fragment: t

controls the animation(s) of the scene (or src resource which contains a timeline)

| fragment | type | functionality |
|----------|------|---------------|
| #t=1,1,100 | vector3 (default: #t=1,0,0) | speed, framestart, framestop |

  • playposition is reset to framestart, when framestart or framestop is greater than 0

| Example Value | Explanation |
|---------------|-------------|
| 1,1,100 | play loop between frame 1 and 100 |
| 1,1,0 | play once from frame 1 (oneshot) |
| 1,0,0 | play (previously set looprange if any) |
| 0,0,0 | pause |
| 1,1,1 | play and auto-loop between begin and end of duration |
| -1,0,0 | reverse playback speed |
| 2.3,0,0 | set (forward) playback speed to 2.3 (no restart) |
| -2.3,0,0 | set (reverse) playback speed to -2.3 (no restart) |
| -2.3,100,0 | set (reverse) playback speed to -2.3, restarting from frame 100 |
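
A minimal javascript sketch of applying #t=speed,framestart,framestop, assuming a THREE.js AnimationAction-style object (setLoopRange is an illustrative helper):

```js
// Apply a #t value to a scene animation or src mediacontent timeline.
function applyT(action, value, fps) {
  const [speed, start, stop] = value.split(',').map(parseFloat)
  action.timeScale = speed                  // 0 pauses, negative values reverse
  action.paused    = speed === 0
  if (start > 0 || stop > 0) {
    action.time = start / fps               // playposition resets to framestart
    if (stop > 0) setLoopRange(action, start / fps, stop / fps)  // loop range
  }
}
```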

» example implementation
» discussion

# XR audio/video integration

To play global audio/video items:

  1. add a src: foo.mp3 or src: bar.mp4 metadata to a 3D object (cube e.g.)
  2. to disable auto-play and global timeline (t) control: hardcode a t XR Fragment: (src: bar.mp3#t=0,0,0 e.g.)
  3. to play it, add href: #cube somewhere else
  4. when the enduser clicks the href, #t=1,0,0 (play) will be applied to the src value
  5. to play a single animation, add href: #animationname=1,0,0 somewhere else

NOTE: hardcoded framestart/framestop uses sampleRate/fps of embedded audio/video, otherwise the global fps applies. For more info see #t.

# XR Fragment queries

Include, exclude, or hide/show objects using space-separated strings:

| example | outcome |
|---------|---------|
| #q=-sky | show everything except the object named sky |
| #q=-tag:language tag:english | hide everything tagged language, but show all objects tagged english |
| #q=price:>2 price:<5 | of all objects with a price property, show only those with a value between 2 and 5 |

It's a simple but powerful syntax, which allows filtering the scene with a searchengine-prompt-style feeling:

  1. queries are a way to traverse a scene, and filter objects based on their tag- or property-values.
  2. words like german match tag-metadata of 3D objects like "tag":"german"
  3. words like german match (XR Text) objects with (Bib(s)TeX) tags like #KarlHeinz@german or @german{KarlHeinz, ... e.g.

## including/excluding

| operator | info |
|----------|------|
| - | removes/hides object(s) |
| : | indicates an object-embedded custom property key/value |
| > < | compare float or int number |
| / | reference to the root-scene. Useful in case of (preventing) showing/hiding objects in nested scenes (instanced by src) (*) |

(*) #q=-/cube hides object cube only in the root-scene (not nested cube objects); #q=-cube hides both object cube in the root-scene AND nested cube objects

» example implementation » example 3D asset » discussion

## Query Parser

Here's how to write a query parser:

  1. create an associative array/object to store query-arguments as objects
  2. detect object id's & properties (foo:1 and foo; reference regex: /^.*:[><=!]?/ )
  3. detect excluders like -foo,-foo:1,-.foo,-/foo (reference regex: /^-/ )
  4. detect root selectors like /foo (reference regex: /^[-]?\// )
  5. detect number values like foo:1 (reference regex: /^[0-9\.]+$/ )
  6. for every query token, split the string on :
  7. create an empty array rules
  8. then strip the key-operator: convert "-foo" into "foo"
  9. add operator and value to the rule-array
  10. therefore we set id to true or false (false = excluder -)
  11. and we set root to true or false (true = / root selector is present)
  12. we convert key '/foo' into 'foo'
  13. finally we add the key/value to the store (store.foo = {id:false,root:true} e.g.)
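
A minimal javascript sketch of these steps (regexes taken from the list above; the exact rule-object shape is illustrative):

```js
// Parse a query string like '-sky price:>2' into a rule store.
function parseQuery(query) {
  const store = {}
  for (let token of query.split(/\s+/).filter((t) => t)) {
    const rule = {}
    rule.id   = !/^-/.test(token)           // step 10: excluder '-' → id = false
    token     = token.replace(/^-/, '')     // step 8: strip the '-' operator
    rule.root = /^\//.test(token)           // step 11: root selector present?
    token     = token.replace(/^\//, '')    // step 12: convert '/foo' into 'foo'
    let [key, value] = token.split(':')     // step 6: split token on ':'
    if (value !== undefined) {
      rule.operator = (value.match(/^[><=!]/) || [''])[0]   // step 2
      value = value.replace(/^[><=!]/, '')
      rule.value = /^[0-9\.]+$/.test(value) ? parseFloat(value) : value  // step 5
    }
    store[key] = rule                       // step 13: add the rule to the store
  }
  return store
}

// parseQuery('-sky price:>2')
// => { sky: { id:false, root:false }, price: { id:true, root:false, operator:'>', value:2 } }
```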

An example query-parser (which compiles to many languages) can be found here

# Visible links

When predefined views, XRWG fragments and ID fragments (#cube or #mytag e.g.) are triggered by the enduser (via toplevel URL or clicking href):

  1. draw a wire from the enduser (preferably a bit below the camera, heartposition) to object(s) matching that ID (objectname)
  2. draw a wire from the enduser (preferably a bit below the camera, heartposition) to object(s) matching that tag value
  3. draw a wire from the enduser (preferably a bit below the camera, heartposition) to object(s) containing that value in their src or href

The obvious approach for this is to consult the XRWG (example), which basically has all these things already collected/organized for you during scene-load.
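
A minimal javascript sketch, assuming a THREE.js scene and an XRWG that maps each word to its matching scene objects (the xrwg shape and function names are illustrative):

```js
import * as THREE from 'three'

// Draw a line between two points in the scene.
function drawWire(scene, from, to) {
  const geo = new THREE.BufferGeometry().setFromPoints([from, to])
  scene.add(new THREE.Line(geo, new THREE.LineBasicMaterial()))
}

// Draw wires from the enduser's heartposition to all XRWG matches of a word.
function drawVisibleLinks(scene, camera, xrwg, word) {
  const heart = camera.position.clone()
  heart.y -= 0.3                          // a bit below the camera
  for (const obj of xrwg[word] || [])     // name/tag/src/href matches
    drawWire(scene, heart, obj.position)
}
```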

## UX

  1. do not update the wires when the enduser moves; leave them as is
  2. offer a control near the back/forward button which allows the user to control (or turn off) the correlation-intensity of the XRWG

# Text in XR (tagging, linking to spatial objects)

How do XR Fragments interlink text with objects?

XR Fragments does this by collapsing space into a Word Graph (the XRWG example), augmented by Bib(s)TeX.

Instead of just throwing together all kinds of media types into one experience (games), what about their tagged/semantical relationships?
Perhaps a related question: why is HTML adopted less in games outside the browser? Through the lens of constructive lazy game-developers, metadata should ideally come with the text, but not obfuscate the text, nor spawn another request to fetch it.
XR Fragments does this by detecting Bib(s)TeX, without introducing a new language or fileformat.

Why Bib(s)TeX? Because it seems to be the lowest common denominator for a human-curated XRWG (extendable by speech/scanner/writing/typing e.g.; see further motivation here).

Hence:

  1. XR Fragments promotes (de)serializing a scene to the XRWG (example)
  2. XR Fragments primes the XRWG, by collecting words from the tag and name-property of 3D objects.
  3. XR Fragments primes the XRWG, by collecting words from optional metadata at the end of content of text (see default mimetype & Data URI)
  4. Bib's and BibTex are first tag citizens for priming the XRWG with words (from XR text)
  5. Like Bibs, XR Fragments generalizes the BibTex author/title-semantics (author{title}) into this points to that (this{that})
  6. The XRWG should be recalculated when textvalues (in src) change
  7. HTML/RDF/JSON is still great, but is beyond the XRWG-scope (they fit better in the application-layer)
  8. Applications don't have to be able to access the XRWG programmatically, as they can easily generate one themselves by traversing the scene-nodes.
  9. The XR Fragment focuses on fast and easy-to-generate end-user controllable word graphs (instead of complex implementations that try to defeat word ambiguity)
  10. Tags are the scope for now (supporting https://github.com/WICG/scroll-to-text-fragment will be considered)

Example:

  http://y.io/z.fbx                                                           | Derived XRWG (expressed as BibTex)
  ----------------------------------------------------------------------------+--------------------------------------
                                                                              | @house{castle,
  +-[src: data:.....]----------------------+   +-[3D mesh]-+                  |   url = {https://y.io/z.fbx#castle}
  | Chapter one                            |   |    / \    |                  | }
  |                                        |   |   /   \   |                  | @baroque{castle,
  | John built houses in baroque style.    |   |  /     \  |                  |   url = {https://y.io/z.fbx#castle}
  |                                        |   |  |_____|  |                  | }
  | #john@baroque                          |   +-----│-----+                  | @baroque{john}
  |                                        |         │                        |
  |                                        |         ├─ name: castle          | 
  |                                        |         └─ tag: house baroque    | 
  +----------------------------------------+                                  |
                                               [3D mesh ]                     |
                                               |    O   ├─ name: john         |                           
                                               |   /|\  |                     |
                                               |   / \  |                     |
                                               +--------+                     |

the #john@baroque-bib associates both text John and objectname john with tag baroque

Another example:

  http://y.io/z.fbx                                                           | Derived XRWG (expressed as BibTex)
  ----------------------------------------------------------------------------+--------------------------------------
                                                                              | 
  +-[src: data:.....]----------------------+   +-[3D mesh]-+                  | @house{castle,
  | Chapter one                            |   |    / \    |                  |   url = {https://y.io/z.fbx#castle}
  |                                        |   |   /   \   |                  | }
  | John built houses in baroque style.    |   |  /     \  |                  | @baroque{castle,
  |                                        |   |  |_____|  |                  |   url = {https://y.io/z.fbx#castle}
  | #john@baroque                          |   +-----│-----+                  | }
  | @baroque{john}                         |         │                        | @baroque{john}
  |                                        |         ├─ name: castle          | 
  |                                        |         └─ tag: house baroque    | 
  +----------------------------------------+                                  | @house{baroque}
                                               [3D mesh ]                     | @todo{baroque}
  +-[remotestorage.io / localstorage]------+   |    O   + name: john          | 
  | #baroque@todo@house                    |   |   /|\  |                     | 
  | ...                                    |   |   / \  |                     | 
  +----------------------------------------+   +--------+                     | 

both the #john@baroque-bib and the BibTeX @baroque{john} result in the same XRWG; however, on top of that, two tags (house and todo) are now associated with text/objectname/tag 'baroque'.

As seen above, the XRWG can expand bibs (and the whole scene) to BibTeX.
This allows hasslefree authoring and copy-paste of associations for and by humans, but also makes these URLs possible:

| URL example | Result |
|-------------|--------|
| https://my.com/foo.gltf#baroque | draws lines between mesh john, 3D mesh castle, and text John built (..) |
| https://my.com/foo.gltf#john | draws lines between mesh john and the text John built (..) |
| https://my.com/foo.gltf#house | draws lines between mesh castle and other objects with tag house or todo |

hashtagbibs potentially allow the enduser to annotate text/objects by speaking/typing/scanning associations, which the XR Browser saves to remotestorage (or localStorage, per toplevel URL), as well as to reference BibTags per URI later on (https://y.io/z.fbx#@baroque@todo e.g.).

The XRWG allows XR Browsers to show/hide relationships in realtime at various levels:

  • wordmatch inside src text
  • wordmatch inside href text
  • wordmatch object-names
  • wordmatch object-tagnames

Spatial wires can be rendered between words/objects etc.
Some pointers for good UX (but not necessary to be XR Fragment compatible):

  1. The XR Browser needs to adjust tag-scope based on the enduser's needs/focus (infinite tagging only makes sense when the environment is scaled down significantly)
  2. The XR Browser should always allow the human to view/edit the metadata, by clicking 'toggle metadata' on the 'back' (contextmenu e.g.) of any XR text, anywhere, anytime.
  3. respect multi-line BibTeX metadata in text, because of the core principle
  4. the default font (unless specified otherwise) is a modern monospace font, for maximized tabular expressiveness (see the core principle).
  5. anti-pattern: hardcoupling an XR Browser with a mandatory markup/scripting-language which departs from unobtrusive plain text (HTML/VRML/Javascript) (see the core principle)
  6. anti-pattern: limiting human introspection, by abandoning plain text as first tag citizen.

The simplicity of appending metadata (and leveling the metadata-playfield between humans and machines) is also demonstrated by visual-meta in greater detail.

Fictional chat:

```
<John> Hey what about this: https://my.com/station.gltf#pos=0,0,1&rot=90,2,0&t=500,1000
<Sarah> I'm checking it right now
<Sarah> I don't see everything..where's our text from yesterday?
<John> Ah wait, that's tagged with tag 'draft' (and hidden)..hold on, try this:
<John> https://my.com/station.gltf#.draft&pos=0,0,1&rot=90,2,0&t=500,1000
<Sarah> how about we link the draft to the upcoming YELLO-event?
<John> ok I'm adding #draft@YELLO
<Sarah> Yesterday I also came up with other useful associations between other texts in the scene:
#event#YELLO
#2025@YELLO
<John> thanks, added.
<Sarah> Btw. I stumbled upon this spatial book which references station.gltf in some chapters:
<Sarah> https://thecommunity.org/forum/foo/mytrainstory.txt
<John> interesting, I'm importing mytrainstory.txt into station.gltf
<John> ah yes, chapter three points to trainterminal_2A in the scene, cool
```

# Default Data URI mimetype

The src-values work as expected (respecting mime-types), however:

The XR Fragment specification bumps the traditional default browser-mimetype

`text/plain;charset=US-ASCII`

to a hashtagbib(tex)-friendly one:

`text/plain;charset=utf-8;bib=^@`

This indicates that:

  • utf-8 is supported by default
  • lines beginning with @ will not be rendered verbatim by default (read more)
  • the XRWG should expand bibs to BibTex occurring in text (#contactjohn@todo@important e.g.)

By doing so, the XR Browser (applications-layer) can interpret microformats (visual-meta e.g.) to connect text further with its environment (to set up links between textual/spatial objects automatically e.g.).

For more info on this mimetype, see bibs.

Advantages:

  • auto-expanding of hashtagbibs associations
  • out-of-the-box (de)multiplex human text and metadata in one go (see the core principle)
  • no network-overhead for metadata (see the core principle)
  • ensuring high FPS: HTML/RDF historically is too 'requesty'/'parsy' for game studios
  • rich send/receive/copy-paste everywhere by default, metadata being retained (see the core principle)
  • net result: fewer webservices, therefore fewer servers, and overall better FPS in XR

This significantly expands the expressiveness and portability of human-tagged text, by postponing machine-concerns to the end of the human text, in contrast to literal interweaving of content and markup symbols (or extra network requests, webservices e.g.).

For all other purposes, regular mimetypes can be used (but are not required by the spec).

# URL and Data URI

  +--------------------------------------------------------------+  +------------------------+
  |                                                              |  | author.com/article.txt |
  |  index.gltf                                                  |  +------------------------+
  |    │                                                         |  |                        |
  |    ├── ◻ article_canvas                                      |  | Hello friends.         |
  |    │    └ src: ://author.com/article.txt                     |  |                        |
  |    │                                                         |  | @book{greatgatsby      |
  |    └── ◻ note_canvas                                         |  |   ...                  |
  |           └ src:`data:welcome human\n@book{sunday...}`       |  | }                      | 
  |                                                              |  +------------------------+
  |                                                              |
  +--------------------------------------------------------------+

The enduser will only see welcome human and Hello friends rendered verbatim (see mimetype). The beauty is that text in a Data URI automatically promotes rich copy-paste (retaining metadata). In both cases, the text gets rendered immediately (onto a plane geometry, hence the name '_canvas'). The XR Fragment-compatible browser can let the enduser access visual-meta(data)-fields after interacting with the object (contextmenu e.g.).

additional tagging using bibs: to tag spatial object note_canvas with 'todo', the enduser can type or speak #note_canvas@todo

# XR Text example parser

To prime the XRWG with text from plain text src-values, here's an example XR Text (de)multiplexer in javascript (which supports inline bibs & bibtex):

```js
xrtext = {

  expandBibs: (text) => {
    let bibs   = { regex: /(#[a-zA-Z0-9_+@\-]+(#)?)/g, tags: {}}
    text.replace( bibs.regex , (m,k,v) => {
       let tok   = m.substr(1).split("@")
       let match = tok.shift()
       if( tok.length ) tok.map( (t) => bibs.tags[t] = `@${t}{${match},\n}` )
       else if( match.substr(-1) == '#' )
          bibs.tags[match] = `@{${match.replace(/#/,'')}}`
       else bibs.tags[match] = `@${match}{${match},\n}`
    })
    return text.replace( bibs.regex, '') + Object.values(bibs.tags).join('\n')
  },

  decode: (str) => {
    // bibtex:     ↓@   ↓<tag|tag{phrase,|{ruler}>  ↓property  ↓end
    let pat    = [ /@/, /^\S+[,{}]/,                /},/,      /}/ ]
    let tags   = [], text=''
    let lines  = xrtext.expandBibs(str).replace(/\r?\n/g,'\n').split(/\n/)
    for( let i = 0; i < lines.length && !String(lines[i]).match( /^@/ ); i++ )
        text += lines[i]+'\n'

    let bibtex = lines.join('\n').substr( text.length )
    bibtex.split( pat[0] ).map( (t) => {
        try{
           let v = {}, tag
           if( !(t = t.trim())         ) return
           if( tag = t.match( pat[1] ) ) tag = tag[0]
           if( tag.match( /^{.*}$/ )   ) return tags.push({ruler:tag})
           if( tag.match( /}$/ )       ) return tags.push({k: tag.replace(/}$/,''), v: {}})
           t = t.substr( tag.length )
           t.split( pat[2] )
           .map( kv => {
             if( !(kv = kv.trim()) || kv == "}" ) return
             v[ kv.match(/\s?(\S+)\s?=/)[1] ] = kv.substr( kv.indexOf("{")+1 )
           })
           tags.push( { k:tag, v } )
        }catch(e){ console.error(e) }
    })
    return {text, tags}
  },

  encode: (text,tags) => {
    let str = text+"\n"
    for( let i in tags ){
      let item = tags[i]
      if( item.ruler ){
          str += `@${item.ruler}\n`
          continue;
      }
      str += `@${item.k}\n`
      for( let j in item.v ) str += `  ${j} = {${item.v[j]}}\n`
      str += `}\n`
    }
    return str
  }
}
```

The above functions (de)multiplex text/metadata, expand bibs, and (de)serialize bibtex (and vice versa).

The above can be used as a starting point for LLMs to translate/steelman it into a more formal form/language.

```js
str = `
hello world
here are some hashtagbibs followed by bibtex:

#world
#hello@greeting
#another-section#

@{some-section}
@flap{
  asdf = {23423}
}`

var {tags,text} = xrtext.decode(str)          // demultiplex text & bibtex
tags.find( (t) => t.k == 'flap{' ).v.asdf = 1 // edit tag
tags.push({ k:'bar{', v:{abc:123} })          // add tag
console.log( xrtext.encode(text,tags) )       // multiplex text & bibtex back together
```

This expands to the following (hidden by default) BibTex appendix:

```
hello world
here are some hashtagbibs followed by bibtex:

@{some-section}
@flap{
  asdf = {1}
}
@world{world,
}
@greeting{hello,
}
@{another-section}
@bar{
  abc = {123}
}
```

when an XR browser updates the human text, a quick scan for nonmatching tags (@book{nonmatchingbook e.g.) should be performed, prompting the enduser to delete them.
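
A minimal javascript sketch of such a scan, operating on the tags produced by xrtext.decode above (the phrase-extraction heuristic is illustrative):

```js
// List BibTags whose phrase no longer occurs in the human text.
function findOrphanTags(text, tags) {
  return tags.filter((t) => {
    if (!t.k || !t.k.includes('{')) return false             // skip rulers like @{...}
    const phrase = t.k.split('{')[1].replace(/[,}].*/, '')   // 'book{nonmatchingbook,' → 'nonmatchingbook'
    return phrase.length > 0 && !text.includes(phrase)
  })
}

// findOrphanTags(text, tags) → candidates to prompt the enduser about
```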

# Transclusion (broken link) resolution

In the spirit of Ted Nelson's 'transclusion resolution', there's a soft-mechanism to harden links & minimize broken links in various ways:

  1. defining a different transport protocol (https vs ipfs or DAT) in src or href values can make a difference
  2. mirroring files on another protocol using (HTTP) errorcode tags in src or href properties
  3. in case of src: a nested copy of the embedded object in the placeholder object (embeddedObject) will not be replaced when the request fails

Due to the popularity, maturity and extensiveness of HTTP codes for client/server communication, non-HTTP protocols easily map to HTTP codes (ipfs ERR_NOT_FOUND maps to 404 e.g.)

For example:

  +────────────────────────────────────────────────────────+ 
  │                                                        │
  │  index.gltf                                            │
  │    │                                                   │
  │    │ #: #q=-offlinetext                                │
  │    │                                                   │
  │    ├── ◻ buttonA                                       │
  │    │      └ href:     http://foo.io/campagne.fbx       │
  │    │      └ href@404: ipfs://foo.io/campagne.fbx       │
  │    │      └ href@400: #q=clienterrortext               │
  │    │      └ ◻ offlinetext                              │
  │    │                                                   │
  │    └── ◻ embeddedObject                          <--------- the meshdata inside embeddedObject will (not)
  │           └ src: https://foo.io/bar.gltf               │    be flushed when the request (does not) succeed.
  │           └ src@404: http://foo.io/bar.gltf            │    So worstcase the 3D data (of the time of publishing index.gltf)
  │           └ src@400: https://archive.org/l2kj43.gltf   │    will be displayed.
  │                                                        │
  +────────────────────────────────────────────────────────+
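
A minimal javascript sketch of this fallback-walk, assuming custom properties live in a THREE.js-style userData object (the function name is illustrative):

```js
// Try src, then src@<errorcode> fallbacks, until one resolves.
async function resolveWithFallback(node, key = 'src', tried = new Set()) {
  const url = node.userData[key]
  if (!url || tried.has(key)) return null   // nothing left: keep the nested embedded copy
  tried.add(key)
  try {
    const res = await fetch(url)
    if (res.ok) return res                  // success: flush the placeholder children
    return resolveWithFallback(node, `src@${res.status}`, tried)  // src@404 e.g.
  } catch (e) {
    // protocol errors map to HTTP-style codes (ipfs ERR_NOT_FOUND → 404 e.g.)
    return resolveWithFallback(node, 'src@404', tried)
  }
}
```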

# Topic-based index-less Webrings

As hashtags in URLs map to the XRWG, href-values can be used to promote topic-based index-less webrings.
Consider 3D scenes linking to each other using these href values:

  • href: schoolA.edu/projects.gltf#math
  • href: schoolB.edu/projects.gltf#math
  • href: university.edu/projects.gltf#math

These links would all show visible links to math-tagged objects in the scene.
To filter out non-related objects one could take it a step further using queries:

  • href: schoolA.edu/projects.gltf#math&q=-topics math
  • href: schoolB.edu/projects.gltf#math&q=-courses math
  • href: university.edu/projects.gltf#math&q=-theme math

This would hide all objects tagged with topics, courses or theme (including math), so that only objects tagged with math remain visible

This makes spatial content multi-purpose, without the need to separate content into separate files, or show/hide things using a complex logiclayer like javascript.

# Security Considerations

Since XR Text contains metadata too, the user should be able to set up tagging-rules, so the copy-paste feature can:

  • filter out sensitive data when copy/pasting (XR text with tag:secret e.g.)

# FAQ

Q: Why is everything HTTP GET-based; what about POST/PUT/DELETE HATEOAS?
A: Because it's out of scope: XR Fragments specifies a read-only way to surf XR documents. These things belong in the application layer (for example, an XR Hypermedia browser can decide to support POST/PUT/DELETE requests for embedded HTML thru src values).

Q: Why isn't there support for scripting, while we have things like WASM?
A: This is out of scope, as it unhyperifies hypermedia; it is up to XR hypermedia browser-extensions.
Historically, scripting/Javascript seems to have been able to turn webpages from hypermedia documents into their opposite (hyperscripted nonhypermedia documents).
In order to prevent this backward movement (hypermedia tends to liberate people from finicky scripting), XR Fragments should never unhyperify itself by hardcoupling to a particular markup or scripting language. XR Macros are an example of something which is probably smarter and safer for hypermedia browsers to implement, instead of going all-in with a turing-complete scripting language (and suffering the security consequences later).
XR Fragments supports filtering objects in a scene only, because in the history of the javascript-powered web, showing/hiding document-entities seems to be one of the most popular basic usecases.
Doing advanced scripting & networkrequests under the hood is obviously an interesting endeavour, but it should not be hardcoupled with hypermedia.
This belongs to browser extensions.
Non-HTML hypermedia browsers should make browser extensions the right place to 'extend' experiences, in contrast to code/javascript inside hypermedia documents (which turned out to be a hypermedia antipattern).

# IANA Considerations

This document has no IANA actions.

# Acknowledgments

# Appendix: Definitions

| definition | explanation |
|------------|-------------|
| human | a sentient being who thinks fuzzy, absorbs, and shares thought (by plain text, not markuplanguage) |
| scene | a (local/remote) 3D scene or 3D file (index.gltf e.g.) |
| 3D object | an object inside a scene characterized by vertex-, face- and customproperty data |
| metadata | custom properties of text, 3D Scene or Object(nodes), relevant to machines and a human minority (academics/developers) |
| XR fragment | URI Fragment with spatial hints like #pos=0,0,0&t=1,100 e.g. |
| the XRWG | wordgraph (collapses 3D scene to tags) |
| the hashbus | hashtags map to camera/scene-projections |
| spacetime hashtags | positions camera, triggers scene-preset/time |
| teleportation | repositioning the enduser to a different position (or 3D scene/file) |
| sourceportation | teleporting the enduser to the original XR Document of an src-embedded object |
| placeholder object | a 3D object with src-metadata (which will be replaced by the src-data) |
| src | (HTML-piggybacked) metadata of a 3D object which instances content |
| href | (HTML-piggybacked) metadata of a 3D object which links to content |
| query | an URI Fragment-operator which queries object(s) from a scene (like #q=cube) |
| visual-meta | visual-meta data appended to text/books/papers, indirectly visible/editable in XR |
| requestless metadata | metadata which never spawns new requests (unlike RDF/HTML, which can cause framerate-dropping, hence not used a lot in games) |
| FPS | frames per second in spatial experiences (games, VR, AR e.g.); should be as high as possible |
| introspective | inward sensemaking ("I feel this belongs to that") |
| extrospective | outward sensemaking ("I'm fairly sure John is a person who lives in Oklahoma") |
| ◻ | ascii representation of a 3D object/mesh |
| (un)obtrusive | obtrusive: wrapping human text/thought in XML/HTML/JSON obfuscates human text into a salad of machine-symbols and words |
| BibTeX | simple tagging/citing/referencing standard for plaintext |
| BibTag | a BibTeX tag |
| (hashtag)bibs | an easy-to-speak/type/scan tagging SDL (see here) which expands to BibTeX/JSON/XML |