xrfragment/doc/RFC_XR_Fragments.md

34 KiB

%%% Title = "XR Fragments" area = "Internet" workgroup = "Internet Engineering Task Force"

[seriesInfo] name = "XR-Fragments" value = "draft-XRFRAGMENTS-leonvankammen-00" stream = "IETF" status = "informational"

date = 2023-04-12T00:00:00Z

author initials="L.R." surname="van Kammen" fullname="L.R. van Kammen"

%%%

.# Abstract

This draft offers a specification for 4D URLs & navigation, to link 3D scenes and text together with- or without a network-connection.
The specification promotes spatial addressibility, sharing, navigation, query-ing and tagging interactive (text)objects across for (XR) Browsers.
XR Fragments allows us to enrich existing dataformats, by recursive use of existing proven technologies like URI Fragments and BibTags notation.

Almost every idea in this document is demonstrated at https://xrfragment.org

{mainmatter}

Introduction

How can we add more features to existing text & 3D scenes, without introducing new dataformats?
Historically, there's many attempts to create the ultimate markuplanguage or 3D fileformat.
Their lowest common denominator is: (co)authoring using plain text.
XR Fragments allows us to enrich/connect existing dataformats, by recursive use of existing technologies:

  1. addressibility and navigation of 3D scenes/objects: URI Fragments + src/href spatial metadata
  2. hasslefree tagging across text and spatial objects using bibs / BibTags appendices (see visual-meta e.g.)

NOTE: The chapters in this document are ordered from highlevel to lowlevel (technical) as much as possible

Core principle

XR Fragments strives to serve (nontechnical/fuzzy) humans first, and machine(implementations) later, by ensuring hasslefree text-vs-thought feedback loops.
This also means that the repair-ability of machine-matters should be human friendly too (not too complex).

"When a car breaks down, the ones without turbosupercharger are easier to fix"

Let's always focus on average humans: our fuzzy symbolical mind must be served first, before serving a greater categorized typesafe RDF hive mind).

Humans first, machines (AI) later.

Thererfore, XR Fragments does not look at XR (or the web) thru the lens of HTML.
XR Fragments itself is HTML-agnostic, though pseudo-XR Fragment browsers can be implemented on top of HTML/Javascript.

Conventions and Definitions

See appendix below in case certain terms are not clear.

List of URI Fragments

fragment type example info
#pos vector3 #pos=0.5,0,0 positions camera to xyz-coord 0.5,0,0
#rot vector3 #rot=0,90,0 rotates camera to xyz-coord 0.5,0,0
#t vector2 #t=500,1000 sets animation-loop range between frame 500 and 1000
#...... string #.cubes #cube object(s) of interest (fragment to object name or class mapping)

xyz coordinates are similar to ones found in SVG Media Fragments

List of metadata for 3D nodes

key type example (JSON) info
name string "name": "cube" available in all 3D fileformats & scenes
class string "class": "cubes" available through custom property in 3D fileformats
href string "href": "b.gltf" available through custom property in 3D fileformats
src string "src": "#q=cube" available through custom property in 3D fileformats

Popular compatible 3D fileformats: .gltf, .obj, .fbx, .usdz, .json (THREE.js), .dae and so on.

NOTE: XR Fragments are file-agnostic, which means that the metadata exist in programmatic 3D scene(nodes) too.

Navigating 3D

Here's an ascii representation of a 3D scene-graph which contains 3D objects and their metadata:

  +--------------------------------------------------------+ 
  |                                                        |
  |  index.gltf                                            |
  |    │                                                   |
  |    ├── ◻ buttonA                                       |
  |    │      └ href: #pos=1,0,1&t=100,200                 |
  |    │                                                   |
  |    └── ◻ buttonB                                       |
  |           └ href: other.fbx                            |   <-- file-agnostic (can be .gltf .obj etc)
  |                                                        |
  +--------------------------------------------------------+

An XR Fragment-compatible browser viewing this scene, allows the end-user to interact with the buttonA and buttonB.
In case of buttonA the end-user will be teleported to another location and time in the current loaded scene, but buttonB will replace the current scene with a new one, like other.fbx.

Embedding 3D content

Here's an ascii representation of a 3D scene-graph with 3D objects which embeds remote & local 3D objects (without) using queries:

  +--------------------------------------------------------+  +-------------------------+ 
  |                                                        |  |                         |
  |  index.gltf                                            |  | ocean.com/aquarium.fbx  |
  |    │                                                   |  |   │                     |
  |    ├── ◻ canvas                                        |  |   └── ◻ fishbowl        |
  |    │      └ src: painting.png                          |  |         ├─ ◻ bass       |
  |    │                                                   |  |         └─ ◻ tuna       |
  |    ├── ◻ aquariumcube                                  |  |                         |       
  |    │      └ src: ://rescue.com/fish.gltf#q=bass%20tuna |  +-------------------------+
  |    │                                                   |    
  |    ├── ◻ bedroom                                       |   
  |    │      └ src: #q=canvas                             |
  |    │                                                   |   
  |    └── ◻ livingroom                                    |      
  |           └ src: #q=canvas                             |
  |                                                        |
  +--------------------------------------------------------+

An XR Fragment-compatible browser viewing this scene, lazy-loads and projects painting.png onto the (plane) object called canvas (which is copy-instanced in the bed and livingroom).
Also, after lazy-loading ocean.com/aquarium.gltf, only the queried objects bass and tuna will be instanced inside aquariumcube.
Resizing will be happen accordingly to its placeholder object aquariumcube, see chapter Scaling.

XR Fragment queries

Include, exclude, hide/shows objects using space-separated strings:

  • #q=cube
  • #q=cube -ball_inside_cube
  • #q=* -sky
  • #q=-.language .english
  • #q=cube&rot=0,90,0
  • #q=price:>2 price:<5

It's simple but powerful syntax which allows css-like class/id-selectors with a searchengine prompt-style feeling:

  1. queries are showing/hiding objects only when defined as src value (prevents sharing of scene-tampered URL's).
  2. queries are highlighting objects when defined in the top-Level (browser) URL (bar).
  3. search words like cube and foo in #q=cube foo are matched against 3D object names or custom metadata-key(values)
  4. search words like cube and foo in #q=cube foo are matched against tags (BibTeX) inside plaintext src values like @cube{redcube, ... e.g.
  5. # equals #q=*
  6. words starting with . like .german match class-metadata of 3D objects like "class":"german"
  7. words starting with . like .german match class-metadata of (BibTeX) tags in XR Text objects like @german{KarlHeinz, ... e.g.

For example: #q=.foo is a shorthand for #q=class:foo, which will select objects with custom property class:foo. Just a simple #q=cube will simply select an object named cube.

including/excluding

operator info
* select all objects (only useful in src custom property)
- removes/hides object(s)
: indicates an object-embedded custom property key/value
. alias for "class" :".foo" equals class:foo
> < compare float or int number
/ reference to root-scene.
Useful in case of (preventing) showing/hiding objects in nested scenes (instanced by src) (*)

* = #q=-/cube hides object cube only in the root-scene (not nested cube objects)
#q=-cube hides both object cube in the root-scene AND nested skybox objects |

» example implementation » example 3D asset » discussion

Query Parser

Here's how to write a query parser:

  1. create an associative array/object to store query-arguments as objects
  2. detect object id's & properties foo:1 and foo (reference regex: /^.*:[><=!]?/ )
  3. detect excluders like -foo,-foo:1,-.foo,-/foo (reference regex: /^-/ )
  4. detect root selectors like /foo (reference regex: /^[-]?\// )
  5. detect class selectors like .foo (reference regex: /^[-]?class$/ )
  6. detect number values like foo:1 (reference regex: /^[0-9\.]+$/ )
  7. expand aliases like .foo into class:foo
  8. for every query token split string on :
  9. create an empty array rules
  10. then strip key-operator: convert "-foo" into "foo"
  11. add operator and value to rule-array
  12. therefore we we set id to true or false (false=excluder -)
  13. and we set root to true or false (true=/ root selector is present)
  14. we convert key '/foo' into 'foo'
  15. finally we add the key/value to the store like store.foo = {id:false,root:true} e.g.

An example query-parser (which compiles to many languages) can be found here

XR Fragment URI Grammar

reserved    = gen-delims / sub-delims
gen-delims  = "#" / "&"
sub-delims  = "," / "="

Example: ://foo.com/my3d.gltf#pos=1,0,0&prio=-5&t=0,100

Demo Explanation
pos=1,2,3 vector/coordinate argument e.g.
pos=1,2,3&rot=0,90,0&q=.foo combinators

Text in XR (tagging,linking to spatial objects)

We still think and speak in simple text, not in HTML or RDF.
The most advanced human will probably not shout <h1>FIRE!</h1> in case of emergency.
Given the new dawn of (non-keyboard) XR interfaces, keeping text as is (not obscuring with markup) is preferred.
Ideally metadata must come with text, but not obfuscate the text, or in another file.

This way:

  1. XR Fragments allows hasslefree XR text tagging, using BibTeX metadata at the end of content (like visual-meta).
  2. XR Fragments allows hasslefree textual tagging, spatial tagging, and supra tagging, by mapping 3D/text object (class)names using BibTeX 'tags'
  3. Bibs/BibTeX-appendices is first-choice requestless metadata-layer for XR text, HTML/RDF/JSON is great (but fits better in the application-layer)
  4. Default font (unless specified otherwise) is a modern monospace font, for maximized tabular expressiveness (see the core principle).
  5. anti-pattern: hardcoupling a mandatory obtrusive markuplanguage or framework with an XR browsers (HTML/VRML/Javascript) (see the core principle)
  6. anti-pattern: limiting human introspection, by immediately funneling human thought into typesafe, precise, pre-categorized metadata like RDF (see the core principle)

This allows recursive connections between text itself, as well as 3D objects and vice versa, using BibTags :

  +---------------------------------------------+         +------------------+
  | My Notes                                    |         |        / \       |
  |                                             |         |       /   \      |
  | The houses here are built in baroque style. |         |      /house\     |
  |                                             |         |      |_____|     |
  |                                             |         +---------|--------+ 
  | @house{houses,                              >----'house'--------|    class/name match?
  |   url  = {#.house}                          >----'houses'-------`    class/name match? 
  | }                                           |
  +---------------------------------------------+

The enduser can add connections by speaking/typing/scanning hashtagbibs which the XR Browser can expand to (hidden) BibTags.

This allows instant realtime tagging of objects at various scopes:

scope matching algo
textual text containing 'houses' is now automatically tagged with 'house' (incl. plaintext src child nodes)
spatial spatial object(s) with "class":"house" (because of {#.house}) are now automatically tagged with 'house' (incl. child nodes)
supra text- or spatial-object(s) (non-descendant nodes) elsewhere, named 'house', are automatically tagged with 'house' (current node to root node)
omni text- or spatial-object(s) (non-descendant nodes) elsewhere, containing class/name 'house', are automatically tagged with 'house' (too node to all nodes)
infinite text- or spatial-object(s) (non-descendant nodes) elsewhere, containing class/name 'house' or 'houses', are automatically tagged with 'house' (too node to all nodes)

This empowers the enduser spatial expressiveness (see the core principle): spatial wires can be rendered, words can be highlighted, spatial objects can be highlighted/moved/scaled, links can be manipulated by the user.
The simplicity of appending BibTeX 'tags' (humans first, machines later) is also demonstrated by visual-meta in greater detail.

  1. The XR Browser needs to adjust tag-scope based on the endusers needs/focus (infinite tagging only makes sense when environment is scaled down significantly)
  2. The XR Browser should always allow the human to view/edit the metadata, by clicking 'toggle metadata' on the 'back' (contextmenu e.g.) of any XR text, anywhere anytime.

NOTE: infinite matches both 'house' and 'houses' in text, as well as spatial objects with "class":"house" or name "house". This multiplexing of id/category is deliberate because of the core principle.

Default Data URI mimetype

The src-values work as expected (respecting mime-types), however:

The XR Fragment specification bumps the traditional default browser-mimetype

text/plain;charset=US-ASCII

to a hashtagbib(tex)-friendly one:

text/plain;charset=utf-8;bib=^@

This indicates that:

  • utf-8 is supported by default
  • hashtagbibs are expanded to bibtags
  • lines matching regex ^@ will automatically get filtered out, in order to:
  • links between textual/spatial objects can automatically be detected
  • bibtag appendices (visual-meta can be interpreted e.g.

for more info on this mimetype see bibs

Advantages:

  • out-of-the-box (de)multiplex human text and metadata in one go (see the core principle)
  • no network-overhead for metadata (see the core principle)
  • ensuring high FPS: HTML/RDF historically is too 'requesty'/'parsy' for game studios
  • rich send/receive/copy-paste everywhere by default, metadata being retained (see the core principle)
  • netto result: less webservices, therefore less servers, and overall better FPS in XR

This significantly expands expressiveness and portability of human tagged text, by postponing machine-concerns to the end of the human text in contrast to literal interweaving of content and markupsymbols (or extra network requests, webservices e.g.).

For all other purposes, regular mimetypes can be used (but are not required by the spec).

URL and Data URI

  +--------------------------------------------------------------+  +------------------------+
  |                                                              |  | author.com/article.txt |
  |  index.gltf                                                  |  +------------------------+
  |    │                                                         |  |                        |
  |    ├── ◻ article_canvas                                      |  | Hello friends.         |
  |    │    └ src: ://author.com/article.txt                     |  |                        |
  |    │                                                         |  | @friend{friends        |
  |    └── ◻ note_canvas                                         |  |   ...                  |
  |           └ src:`data:welcome human\n@...`                   |  | }                      | 
  |                                                              |  +------------------------+
  |                                                              |
  +--------------------------------------------------------------+

The enduser will only see welcome human and Hello friends rendered spatially. The beauty is that text (AND visual-meta) in Data URI promotes rich copy-paste. In both cases, the text gets rendered immediately (onto a plane geometry, hence the name '_canvas'). The XR Fragment-compatible browser can let the enduser access visual-meta(data)-fields after interacting with the object (contextmenu e.g.).

additional tagging using bibs: to tag spatial object note_canvas with 'todo', the enduser can type or speak @note_canvas@todo

The mapping between 3D objects and text (src-data) is simple (the :

Example:

  +------------------------------------------------+ 
  |                                                | 
  |  index.gltf                                    | 
  |    │                                           | 
  |    └── ◻ rentalhouse                           | 
  |           └ class: house              <----------------- matches -------+
  |           └ ◻ note                             |                        |
  |                 └ src:`data: todo: call owner  |       hashtagbib       |
  |                              #owner@house@todo | ----> expands to     @house{owner,
  |                                                |          bibtex:     }
  |                        `                       |                      @contact{
  +------------------------------------------------+                      }

Bi-directional mapping between 3D object names and/or classnames and text using bibs,BibTags & XR Fragments, allows for rich interlinking between text and 3D objects:

  1. When the user surfs to https://.../index.gltf#rentalhouse the XR Fragments-parser points the enduser to the rentalhouse object, and can show contextual info about it.
  2. When (partial) remote content is embedded thru XR Fragment queries (see XR Fragment queries), indirectly related metadata can be embedded along.

Bibs & BibTeX: lowest common denominator for linking data

"When a car breaks down, the ones without turbosupercharger are easier to fix"

Unlike XML or JSON, BibTex is typeless, unnested, and uncomplicated, hence a great advantage for introspection.
It's a missing sensemaking precursor to extrospective RDF.
BibTeX-appendices are already used in the digital AND physical world (academic books, visual-meta), perhaps due to its terseness & simplicity.
In that sense, it's one step up from the .ini fileformat (which has never leaked into the physical world like BibTex):

  1. frictionless copy/pasting (by humans) of (unobtrusive) content AND metadata
  2. an introspective 'sketchpad' for metadata, which can (optionally) mature into RDF later
characteristic UTF8 Plain Text (with BibTeX) RDF
perspective introspective extrospective
structure fuzzy (sensemaking) precise
space/scope local world
everything is text (string) yes no
voice/paper-friendly bibs no
leaves (dictated) text intact yes no
markup language just an appendix ~4 different
polyglot format no yes
easy to copy/paste content+metadata yes up to application
easy to write/repair for layman yes depends
easy to (de)serialize yes (fits on A4 paper) depends
infrastructure selfcontained (plain text) (semi)networked
freeform tagging/annotation yes, terse yes, verbose
can be appended to text-content yes up to application
copy-paste text preserves metadata yes up to application
emoji yes depends on encoding
predicates free semi pre-determined
implementation/network overhead no depends
used in (physical) books/PDF yes (visual-meta) no
terse non-verb predicates yes no
nested structures no (but: BibTex rulers) yes

To keep XR Fragments a lightweight spec, BibTeX is used for rudimentary text/spatial tagging (not JSON, RDF or a scripting language because they're harder to write/speak/repair.).

Applications are also free to attach any JSON(LD / RDF) to spatial objects using custom properties (but is not interpreted by this spec).

XR Text example parser

  1. The XR Fragments spec does not aim to harden the BiBTeX format
  2. respect multi-line BibTex values because of the core principle
  3. Expand hashtag(bibs) and rulers (like ${visual-meta-start}) according to the hashtagbibs spec
  4. BibTeX snippets should always start in the beginning of a line (regex: ^@), hence mimetype text/plain;charset=utf-8;bib=^@

Here's an XR Text (de)multiplexer in javascript, which ticks all the above boxes:

xrtext = {

  expandBibs: (text) => { 
    let bibs   = { regex: /(#[a-zA-Z0-9_+@\-]+(#)?)/g, tags: {}}
    text.replace( bibs.regex , (m,k,v) => {
       tok   = m.substr(1).split("@")
       match = tok.shift()
       if( tok.length ) tok.map( (t) => bibs.tags[t] = `@${t}{${match},\n}` )
       else if( match.substr(-1) == '#' ) 
          bibs.tags[match] = `@{${match.replace(/#/,'')}}`
       else bibs.tags[match] = `@${match}{${match},\n}`
    })
    return text.replace( bibs.regex, '') + Object.values(bibs.tags).join('\n')
  },
    
  decode: (str) => {
    // bibtex:     ↓@   ↓<tag|tag{phrase,|{ruler}>  ↓property  ↓end
    let pat    = [ /@/, /^\S+[,{}]/,                /},/,      /}/ ]
    let tags   = [], text='', i=0, prop=''
    let lines  = xrtext.expandBibs(str).replace(/\r?\n/g,'\n').split(/\n/)
    for( let i = 0; i < lines.length && !String(lines[i]).match( /^@/ ); i++ ) 
        text += lines[i]+'\n'

    bibtex = lines.join('\n').substr( text.length )
    bibtex.split( pat[0] ).map( (t) => {
        try{
           let v = {}
           if( !(t = t.trim())         ) return
           if( tag = t.match( pat[1] ) ) tag = tag[0]
           if( tag.match( /^{.*}$/ )   ) return tags.push({ruler:tag})
           t = t.substr( tag.length )
           t.split( pat[2] )
           .map( kv => {
             if( !(kv = kv.trim()) || kv == "}" ) return
             v[ kv.match(/\s?(\S+)\s?=/)[1] ] = kv.substr( kv.indexOf("{")+1 )
           })
           tags.push( { k:tag, v } )
        }catch(e){ console.error(e) }
    })
    return {text, tags}
  },

  encode: (text,tags) => {
    let str = text+"\n"
    for( let i in tags ){
      let item = tags[i]
      if( item.ruler ){
          str += `@${item.ruler}\n`
          continue;
      }
      str += `@${item.k}\n`
      for( let j in item.v ) str += `  ${j} = {${item.v[j]}}\n`
      str += `}\n`
    }
    return str
  }
}

The above functions (de)multiplexe text/metadata, expands bibs, (de)serialize bibtex (and all fits more or less on one A4 paper)

above can be used as a startingpoint for LLVM's to translate/steelman to a more formal form/language.

str = `
hello world
here are some hashtagbibs followed by bibtex:

#world
#hello@greeting
#another-section#

@{some-section}
@flap{
  asdf = {23423}
}`

var {tags,text} = xrtext.decode(str)          // demultiplex text & bibtex
tags.find( (t) => t.k == 'flap{' ).v.asdf = 1 // edit tag
tags.push({ k:'bar{', v:{abc:123} })          // add tag
console.log( xrtext.encode(text,tags) )       // multiplex text & bibtex back together 

This expands to the following (hidden by default) BibTex appendix:

hello world
here are some hashtagbibs followed by bibtex:

@{some-section}
@flap{
  asdf = {1}
}
@world{world,
}
@greeting{hello,
}
@{another-section}
@bar{
  abc = {123}
}

when an XR browser updates the human text, a quick scan for nonmatching tags (@book{nonmatchingbook e.g.) should be performed and prompt the enduser for deleting them.

HYPER copy/paste

The previous example, offers something exciting compared to simple copy/paste of 3D objects or text. XR Text according to the XR Fragment spec, allows HYPER-copy/paste: time, space and text interlinked. Therefore, the enduser in an XR Fragment-compatible browser can copy/paste/share data in these ways:

  1. time/space: 3D object (current animation-loop)
  2. text: TeXt object (including BibTeX/visual-meta if any)
  3. interlinked: Collected objects by visual-meta tag

Security Considerations

Since XR Text contains metadata too, the user should be able to set up tagging-rules, so the copy-paste feature can :

  • filter out sensitive data when copy/pasting (XR text with class:secret e.g.)

IANA Considerations

This document has no IANA actions.

Acknowledgments

Appendix: Definitions

definition explanation
human a sentient being who thinks fuzzy, absorbs, and shares thought (by plain text, not markuplanguage)
scene a (local/remote) 3D scene or 3D file (index.gltf e.g.)
3D object an object inside a scene characterized by vertex-, face- and customproperty data.
metadata custom properties of text, 3D Scene or Object(nodes), relevant to machines and a human minority (academics/developers)
XR fragment URI Fragment with spatial hints like #pos=0,0,0&t=1,100 e.g.
src (HTML-piggybacked) metadata of a 3D object which instances content
href (HTML-piggybacked) metadata of a 3D object which links to content
query an URI Fragment-operator which queries object(s) from a scene like #q=cube
visual-meta visual-meta data appended to text/books/papers which is indirectly visible/editable in XR.
requestless metadata metadata which never spawns new requests (unlike RDF/HTML, which can cause framerate-dropping, hence not used a lot in games)
FPS frames per second in spatial experiences (games,VR,AR e.g.), should be as high as possible
introspective inward sensemaking ("I feel this belongs to that")
extrospective outward sensemaking ("I'm fairly sure John is a person who lives in oklahoma")
ascii representation of an 3D object/mesh
(un)obtrusive obtrusive: wrapping human text/thought in XML/HTML/JSON obfuscates human text into a salad of machine-symbols and words
BibTeX simple tagging/citing/referencing standard for plaintext
BibTag a BibTeX tag
(hashtag)bibs an easy to speak/type/scan tagging SDL (see here