<!-- for annotated version see: https://raw.githubusercontent.com/ietf-tools/rfcxml-templates-and-schemas/main/draft-rfcxml-general-template-annotated-00.xml -->
This draft is a specification for interactive URI-controllable 3D files, enabling [hypermediatic](https://github.com/coderofsalvation/hypermediatic) navigation, to enable a spatial web for hypermedia browsers with- or without a network-connection.<br>
The specification uses [W3C Media Fragments](https://www.w3.org/TR/media-frags/) and [URI Templates (RFC6570)](https://www.rfc-editor.org/rfc/rfc6570) to promote spatial addressibility, sharing, navigation, filtering and databinding objects for (XR) Browsers.<br>
XR Fragments allows us to better use existing metadata inside 3D scene(files), by connecting it to proven technologies like [URI Fragments](https://en.wikipedia.org/wiki/URI_fragment).<br>
XR Fragments views spatial webs thru the lens of 3D scene URI's, rather than thru code(frameworks) or protocol-specific browsers (webbrowser e.g.).
1. addressibility and [hypermediatic](https://github.com/coderofsalvation/hypermediatic) navigation of 3D scenes/objects: [URI Fragments](https://en.wikipedia.org/wiki/URI_fragment) using src/href spatial metadata
Instead of forcing authors to combine 3D/2D objects programmatically (publishing thru a game-editor e.g.), XR Fragments **integrates all** which allows a universal viewing experience.<br>
Fact: our typical browser URL's are just **a possible implementation** of URI's (for untapped humancentric potential of URI's [see interpeer.io](https://interpeer.io))
> XR Fragments does not look at XR (or the web) thru the lens of HTML or URLs.<br>But approaches things from a higherlevel feedbackloop/hypermedia browser-perspective.
> ?-linked and #-linked navigation are JUST one possible way to implement XR Fragments: the essential goal is to allow a Hypermediatic FeedbackLoop (HFL) between external and internal 4D navigation.
> Being able to use the same URI Fragment DSL for navigation (`href: #foo`) as well as interactions (`href: xrf://#bar`) greatly simplifies implementation, increases HFL, and reduces need for scripting languages.
This opens up the following benefits for traditional & future webbrowsers:
* [hypermediatic](https://github.com/coderofsalvation/hypermediatic) loading/clicking 3D assets (gltf/fbx e.g.) natively (with or without using HTML).
* allowing 3D assets/nodes to publish XR Fragments to themselves/eachother using the `xrf://` hashbus
XR Fragments itself are [hypermediatic](https://github.com/coderofsalvation/hypermediatic) and HTML-agnostic, though pseudo-XR Fragment browsers **can** be implemented on top of HTML/Javascript.
| | (this does not update topLevel URI) | (this is non-standard, non-hypermediatic) |
> An important aspect of HFL is that URI Fragments can be triggered without updating the top-level URI (default href-behaviour) thru their own 'bus' (`xrf://#.....`). This decoupling between navigation and interaction prevents non-standard things like (`href`:`javascript:dosomething()`).
Pseudo (non-native) browser-implementations (supporting XR Fragments using HTML+JS e.g.) can use the `?` search-operator to address outbound content.<br>
In other words, the URL updates to: `https://me.com?https://me.com/other.glb` when navigating to `https://me.com/other.glb` from inside a `https://me.com` WebXR experience e.g.<br>
That way, if the link gets shared, the XR Fragments implementation at `https://me.com` can load the latter (and still indicates which XR Fragments entrypoint-experience/client was used).
> Every 3D fileformat supports named 3D object, and this name allows URLs (fragments) to reference them (and their children objects).
Clever nested design of 3D scenes allow great ways for re-using content, and/or previewing scenes.<br>
For example, to render a portal with a preview-version of the scene, create an 3D object with:
* href: `https://scene.fbx`
* src: `https://otherworld.gltf#mainobject`
> It also allows **sourceportation**, which basically means the enduser can teleport to the original XR Document of an `src` embedded object, and see a visible connection to the particular embedded object. Basically an embedded link becoming an outbound link by activating it.
| [Media Fragments](https://www.w3.org/TR/media-frags/) | [media fragment](#media%20fragments%20and%20datatypes) | `#t=0,2&loop` | play (and loop) 3D animation from 0 seconds till 2 seconds|
| **PRESET** | `#<preset>` | string | `#cubes` | evaluates preset (`#foo&bar`) defined in 3D Object metadata (`#cubes: #foo&bar` e.g.) while URL-browserbar reflects `#cubes`. Only works when metadata-key starts with `#` |
| **FOCUS** | `#<tag_or_objectname>` | string | `#person` | (and show) object(s) with `tag: person` or name `person` (XRWG lookup) |
| **FILTERS** | `#[!][-]<tag_or_objectname>[*]` | string | `#person` (`#-person`) | will reset (`!`), show/focus or hide (`-`) focus object(s) with `tag: person` or name `person` by looking up XRWG (`*`=including children) |
| **CAMERASWITCH** | `#<cameraname>` | string | `#cam01` | sets camera with name `cam01` as active camera |
| **MATERIALUPDATE** | `#<tag_or_objectname>[*]=<materialname>` | string=string | `#car=metallic`| sets material of car to material with name `metallic` (`*`=including children)|
| **VARIABLE UPDATE** | `#<variable>=<metadata-key>` | string=string | `#foo=bar` | sets [URI Template](https://www.rfc-editor.org/rfc/rfc6570) variable `foo` to the value `#t=0` from **existing** object metadata (`bar`:`#t=0` e.g.), This allows for reactive [URI Template](https://www.rfc-editor.org/rfc/rfc6570) defined in object metadata elsewhere (`src`:`://m.com/cat.mp4#{foo}` e.g., to play media using [media fragment URI](https://www.w3.org/TR/media-frags/#valid-uri)). NOTE: metadata-key should not start with `#` |
> NOTE: below the word 'play' applies to 3D animations embedded in the 3D scene(file) **but also** media defined in `src`-metadata like audio/video-files (mp3/mp4 e.g.)
> \* = this is extending the [W3C media fragments](https://www.w3.org/TR/media-frags/#mf-advanced) with (missing) playback/viewport-control. Normally `#t=0,2` implies setting start/stop-values AND starting playback, whereas `#s=0&loop` allows pausing a video, speeding up/slowing down media, as well as enabling/disabling looping.
> The rationale for `uv` is that the `xywh` Media Fragment deals with rectangular media, which does not translate well to 3D models (which use triangular polygons, not rectangular) positioned by uv-coordinates. This also explains the absense of a `scale` or `rotate` primitive, which is challenged by this, as well as multiple origins (mesh- or texture).
> NOTE: URI Template variables are immutable and respect scope: in other words, the end-user cannot modify `blue` by entering an URL like `#blue=.....` in the browser URL, and `blue` is not accessible by the plane/media-object (however `{play}` would work).
1. the Y-coordinate of `pos` identifies the floorposition. This means that desktop-projections usually need to add 1.5m (average person height) on top (which is done automatically by VR/AR headsets).
2. set the position of the camera accordingly to the vector3 values of `#pos`
3.`rot` sets the rotation of the camera (only for non-VR/AR headsets)
4. mediafragment `t` in the top-URL sets the playbackspeed and animation-range of the global scene animation
5. before scene load: the scene is cleared
6. after scene load: in case the scene (rootnode) contains an `#` default view with a fragment value: execute non-positional fragments via the hashbus (no top-level URL change)
7. after scene load: in case the scene (rootnode) contains an `#` default view with a fragment value: execute positional fragment via the hashbus + update top-level URL
8. in case of no default `#` view on the scene (rootnode), default player(rig) position `0,0,0` is assumed.
9. in case a `href` does not mention any `pos`-coordinate, the current position will be assumed
An XR Fragment-compatible browser viewing this scene, allows the end-user to interact with the `buttonA` and `buttonB`.<br>
In case of `buttonA` the end-user will be teleported to another location and time in the **current loaded scene**, but `buttonB` will **replace the current scene** with a new one, like `other.fbx`, and assume `pos=0,0,0`.
1. IF a `#cube` matches a custom property-key (of an object) in the 3D file/scene (`#cube`: `#......`) <b>THEN</b> execute that predefined_view.
2. IF scene operators (`pos`) and/or animation operator (`t`) are present in the URL then (re)position the camera and/or animation-range accordingly.
3. IF no camera-position has been set in <b>step 1 or 2</b> update the top-level URL with `#pos=0,0,0` ([example](https://github.com/coderofsalvation/xrfragment/blob/main/src/3rd/js/three/navigator.js#L31]]))
4. IF a `#cube` matches the name (of an object) in the 3D file/scene then draw a line from the enduser('s heart) to that object (to highlight it).
5. IF a `#cube` matches anything else in the XR Word Graph (XRWG) draw wires to them (text or related objects).
It instances content (in objects) in the current scene/asset, and follows similar logic like the previous chapter, except that it does not modify the camera.
An XR Fragment-compatible browser viewing this scene, lazy-loads and projects `painting.png` onto the (plane) object called `canvas` (which is copy-instanced in the bed and livingroom).<br>
Also, after lazy-loading `ocean.com/aquarium.gltf`, only the queried objects `fishbowl` (and `bass` and `tuna`) will be instanced inside `aquariumcube`.<br>
> Instead of cherrypicking a rootobject `#fishbowl` with `src`, additional filters can be used to include/exclude certain objects. See next chapter on filtering below.
2.<b>local</b>`src` values (`#...` e.g.) starting with a non-negating filter (`#cube` e.g.) will (deep)reparent that object (with name `cube`) as the new root of the scene at position 0,0,0
3.<b>local</b>`src` values should respect (negative) filters (`#-foo&price=>3`)
4. the instanced scene (from a `src` value) should be <b>scaled accordingly</b> to its placeholder object or <b>scaled relatively</b> based on the scale-property (of a geometry-less placeholder, an 'empty'-object in blender e.g.). For more info see Chapter Scaling.
5.<b>external</b>`src` values should be served with appropriate mimetype (so the XR Fragment-compatible browser will now how to render it). The bare minimum supported mimetypes are:
6.`src` values should make its placeholder object invisible, and only flush its children when the resolved content can succesfully be retrieved (see [broken links](#links))
7.<b>external</b>`src` values should respect the fallback link mechanism (see [broken links](#broken-links)
8. when the placeholder object is a 2D plane, but the mimetype is 3D, then render the spatial content on that plane via a stencil buffer.
9. src-values are non-recursive: when linking to an external object (`src: foo.fbx#bar`), then `src`-metadata on object `bar` should be ignored.
12. when the enduser clicks an href with `#t=1,0,0` (play) will be applied to all src mediacontent with a timeline (mp4/mp3 e.g.)
13. a non-euclidian portal can be rendered for flat 3D objects (using stencil buffer e.g.) in case ofspatial `src`-values (an object `#world3` or URL `world3.fbx` e.g.).
3. navigation should not happen ''immediately'' when user is more than 5 meter away from the portal/object containing the href (to prevent accidental navigation e.g.)
4. URL navigation should always be reflected in the client URL-bar (in case of javascript: see [[here](https://github.com/coderofsalvation/xrfragment/blob/dev/src/3rd/js/three/navigator.js) for an example navigator), and only update the URL-bar after the scene (default fragment `#`) has been loaded.
5. In immersive XR mode, the navigator back/forward-buttons should be always visible (using a wearable e.g., see [[here](https://github.com/coderofsalvation/xrfragment/blob/dev/example/aframe/sandbox/index.html#L26-L29) for an example wearable)
6. make sure that the ''back-button'' of the ''browser-history'' always refers to the previous position (see [[here](https://github.com/coderofsalvation/xrfragment/blob/main/src/3rd/js/three/xrf/href.js#L97))
7. ignore previous rule in special cases, like clicking an `href` using camera-portal collision (the back-button could cause a teleport-loop if the previous position is too close)
8. href-events should bubble upward the node-tree (from children to ancestors, so that ancestors can also contain an href), however only 1 href can be executed at the same time.
9. the end-user navigator back/forward buttons should repeat a back/forward action until a `pos=...` primitive is found (the stateless xrf:// href-values should not be pushed to the url-history)
How does the scale of the object (with the embedded properties) impact the scale of the referenced content?<br>
> Rule of thumb: visible placeholder objects act as a '3D canvas' for the referenced scene (a plane acts like a 2D canvas for images e, a cube as a 3D canvas e.g.).
1.<b>IF</b> an embedded property (`src` e.g.) is set on an non-empty placeholder object (geometry of >2 vertices):
* calculate the <b>bounding box</b> of the ''placeholder'' object (maxsize=1.4 e.g.)
* hide the ''placeholder'' object (material e.g.)
* instance the `src` scene as a child of the existing object
* calculate the <b>bounding box</b> of the instanced scene, and scale it accordingly (to 1.4 e.g.)
> REASON: non-empty placeholder object can act as a protective bounding-box (for remote content of which might grow over time e.g.)
1. to enable enduser-triggered play, use a [[URI Template]] XR Fragment: (`src: bar.mp3#{player}` and `play: t=0&loop` and `href: xrf://#player=play` e.g.)
1. when the enduser clicks the `href`, `#t=0&loop` (play) will be applied to the `src` value
* see [an (outdated) example video here](https://coderofsalvation.github.io/xrfragment.media/queries.mp4) which used a dedicated `q=` variable (now deprecated and usable directly)
> NOTE 1: after an external embedded object has been instanced (`src: https://y.com/bar.fbx#room` e.g.), filters do not affect them anymore (reason: local tag/name collisions can be mitigated easily, but not in case of remote content).
> NOTE 2: depending on the used 3D framework, toggling objects (in)visible should happen by enabling/disableing writing to the colorbuffer (to allow children being still visible while their parents are invisible).
> An example filter-parser (which compiles to many languages) can be [found here](https://github.com/coderofsalvation/xrfragment/blob/main/src/xrfragment/Filter.hx)
The obvious approach for this, is to consult the XRWG ([example](https://github.com/coderofsalvation/xrfragment/blob/feat/macros/src/3rd/js/XRWG.js)), which basically has all these things already collected/organized for you during scene-load.
> The XR Fragments does this by collapsing space into a **Word Graph** (the **XRWG** [example](https://github.com/coderofsalvation/xrfragment/blob/feat/macros/src/3rd/js/XRWG.js)), augmented by Bib(s)Tex.
1. XR Fragments promotes (de)serializing a scene to a (lowercase) XRWG ([example](https://github.com/coderofsalvation/xrfragment/blob/feat/macros/src/3rd/js/XRWG.js))
4. XR Fragments primes the XRWG, by collecting tags/id's from linked hypermedia (URI fragments for HTML e.g.)
5. The XRWG should be recalculated when textvalues (in `src`) change
6. HTML/RDF/JSON is still great, but is beyond the XRWG-scope (they fit better in the application-layer, or as embedded src content)
7. Applications don't have to be able to access the XRWG programmatically, as they can easily generate one themselves by traversing the scene-nodes.
8. The XR Fragment focuses on fast and easy-to-generate end-user controllable word graphs (instead of complex implementations that try to defeat word ambiguity)
9. Instead of exact lowercase word-matching, levensteihn-distance-based matching is preferred
9. The XR Browser needs to adjust tag-scope based on the endusers needs/focus (infinite tagging only makes sense when environment is scaled down significantly)
10. The XR Browser should always allow the human to view/edit the metadata, by clicking 'toggle metadata' on the 'back' (contextmenu e.g.) of any XR text, anywhere anytime.
13. Default font (unless specified otherwise) is a modern monospace font, for maximized tabular expressiveness (see [the core principle](#core-principle)).
14. anti-pattern: hardcoupling an XR Browser with a mandatory **markup/scripting-language** which departs from onubtrusive plain text (HTML/VRML/Javascript) (see [the core principle](#core-principle))
* words beginning with `#` (hashtags) will prime the XRWG by adding the hashtag to the XRWG, linking to the current sentence/paragraph/alltext (depending on '.') to the XRWG
> This significantly expands expressiveness and portability of human tagged text, by **postponing machine-concerns to the end of the human text** in contrast to literal interweaving of content and markupsymbols (or extra network requests, webservices e.g.).
> additional tagging using [bibs](https://github.com/coderofsalvation/hashtagbibs): to tag spatial object `note_canvas` with 'todo', the enduser can type or speak `#note_canvas@todo`
Environment mapping is crucial for creating realistic reflections and lighting effects on 3D objects.
To apply environment mapping efficiently in a 3D scene, traverse the scene graph and assign each object's environment map based on the nearest ancestor's texture map. This ensures that objects inherit the correct environment mapping from their closest parent with a texture, enhancing the visual consistency and realism.
> due to the popularity, maturity and extensiveness of HTTP codes for client/server communication, non-HTTP protocols easily map to HTTP codes (ipfs ERR_NOT_FOUND maps to 404 e.g.)
> This would hide all object tagged with `topic`, `courses` or `theme` (including math) so that later only objects tagged with `math` will be visible
This makes spatial content multi-purpose, without the need to separate content into separate files, or show/hide things using a complex logiclayer like javascript.
**ARIA** (`aria-description`) is the most important to support, as it promotes accessibility and allows scene transcripts. Please start `aria-description` with a verb to aid transcripts.
> Example: object 'tryceratops' with `aria-description: is a huge dinosaurus standing on a #mountain` generates transcript `#tryceratops is a huge dinosaurus standing on a #mountain`, where the hashtags are clickable XR Fragments (activating the visible-links in the XR browser).
4. the textlog contains `aria-descriptions`, and its narration (Screenreader e.g.) can be skipped (via 2-button navigation)
5. The `back` command should navigate back to the previous URL (alias for browser-backbutton)
6. The `forward` command should navigate back to the next URL (alias for browser-nextbutton)
7. A destination is a 3D node containing an `href` with a `pos=` XR fragment
8. The `go` command should list all possible destinations
9. The `go left` command should move the camera around 0.3 meters to the left
10. The `go right` command should move the camera around 0.3 meters to the right
11. The `go forward` command should move the camera 0.3 meters forward (direction of current rotation).
12. The `rotate left` command should rotate the camera 0.3 to the left
13. The `rotate left` command should rotate the camera 0.3 to the right
14. The (dynamic) `go abc` command should navigate to `#pos=scene2` in case there's a 3D node with name `abc` and `href` value `#pos=scene2`
15. The `look` command should give an (contextual) 3D-to-text transcript, by scanning the `aria-description` values of the current `pos=` value (including its children)
16. The `do` command should list all possible `href` values which don't contain an `pos=` XR Fragment
17. The (dynamic) `do abc` command should navigate/execute `https://.../...` in case a 3D node exist with name `abc` and `href` value `https://.../...`
## Two-button navigation
For specific user-profiles, gyroscope/mouse/keyboard/audio/visuals will not be available.<br>
Therefore a 2-button navigation-interface is the bare minimum interface:
1. objects with href metadata can be cycled via a key (tab on a keyboard)
2. objects with href metadata can be activated via a key (enter on a keyboard)
3. the TTS reads the href-value (and/or aria-description if available)
If a glTF implementation does not support a particular extension, the (XRF) extras field can serve as a fallback. This way, metadata provided in extras can still be useful for applications that don't handle certain extensions.
> **Example 1** In case of the OMI_LINK glTF extension (`href: https://nlnet.nl`) and an XR Fragment (`href: #pos=otherroom` or `href: otherplanet.glb`), it is clear that `https://nlnet.nl` should open in a browsertab, whereas the XR Fragment links should teleport the user. If the OMI_LINK contains an XR Fragment (`#pos=` e.g.) a teleport should be performed only (and other [overlapping] metadata should be ignored).
> **Example 2** If an Extensions uses XR Fragments in URI's (`href: #pos=otherroom` or `href: xrf://-walls` in OMI_LINK e.g.), then perform them according to XR Fragment spec (teleport user). But only once: ignore further overlapping metadata for that usecase.
Vendor-specific metadata in a 3D scenefiles, are similar to vendor-specific [CSS-prefixes](https://en.wikipedia.org/wiki/CSS#Vendor_prefixes) (`-moz-opacity: 0.2` e.g.).
This allows popular 3D engines/frameworks, to initialize specific features when loading a scene/object, in a progressive enhanced way.
Vendor Prefixes allows embedding 3D engines/framework-specific features a 3D file via metadata:
> Why? Because not all XR interactions can/should be solved/standardized by embedding XR Fragments into any 3D file.
The lowest common denominator between 3D engines is the 'entity'-part of their entity-component-system (ECS). The 'component'-part can be progressively enhanced via vendor prefixes.
String-templatevalues are evaluated as per [URI Templates (RFC6570)](https://www.rfc-editor.org/rfc/rfc6570) Level 1.
> This 'separating of mechanism from policy' (unix rule) does **somewhat** break portability of an XR experience, but still prevents (E-waste of) handcoded virtual worlds. It allows for (XR experience) metadata to survive in future 3D engines and scene-fileformats.
The only dynamic parts are [W3C Media Fragments](https://www.w3.org/TR/media-frags/) and [URI Templates (RFC6570)](https://www.rfc-editor.org/rfc/rfc6570).<br>
The use of URI Templates is limited to pre-defined variables and Level0 fragments-expansion only, which makes it quite safe.<br>
**Q:** Why is everything HTTP GET-based, what about POST/PUT/DELETE HATEOS<br>
**A:** Because it's out of scope: XR Fragment specifies a read-only way to surf XR documents. These things belong in the application layer (for example, an XR Hypermedia browser can decide to support POST/PUT/DELETE requests for embedded HTML thru `src` values)
**Q:** Why isn't there support for scripting, URI Template Fragments are so limited compared to WASM & javascript
**A:** This is out of scope as it unhyperifies hypermedia, and this is up to XR hypermedia browser-extensions.<br> Historically scripting/Javascript seems to been able to turn webpages from hypermedia documents into its opposite (hyperscripted nonhypermedia documents).<br>In order to prevent this backward-movement (hypermedia tends to liberate people from finnicky scripting) XR Fragment uses [W3C Media Fragments](https://www.w3.org/TR/media-frags/) and [URI Templates (RFC6570)](https://www.rfc-editor.org/rfc/rfc6570), to prevent unhyperifying itself by hardcoupling to a particular markup or scripting language. <br>
XR Fragments supports filtering objects in a scene only, because in the history of the javascript-powered web, showing/hiding document-entities seems to be one of the most popular basic usecases.<br>
Doing advanced scripting & networkrequests under the hood are obviously interesting endavours, but this is something which should not be hardcoupled with XR Fragments or hypermedia.<br>This perhaps belongs more to browser extensions.<br>
Non-HTML Hypermedia browsers should make browser extensions the right place, to 'extend' experiences, in contrast to code/javascript inside hypermedia documents (this turned out as a hypermedia antipattern).
|URI | some resource at something somewhere via someprotocol (`http://me.com/foo.glb#foo` or `e76f8efec8efce98e6f` [see interpeer.io](https://interpeer.io))|
|URL | something somewhere via someprotocol (`http://me.com/foo.glb`) |
|visual-meta | [visual-meta](https://visual.meta.info) data appended to text/books/papers which is indirectly visible/editable in XR. |
|requestless metadata | metadata which never spawns new requests (unlike RDF/HTML, which can cause framerate-dropping, hence not used a lot in games) |
|FPS | frames per second in spatial experiences (games,VR,AR e.g.), should be as high as possible |
|introspective | inward sensemaking ("I feel this belongs to that") |
|extrospective | outward sensemaking ("I'm fairly sure John is a person who lives in oklahoma") |
|`◻` | ascii representation of an 3D object/mesh |
|(un)obtrusive | obtrusive: wrapping human text/thought in XML/HTML/JSON obfuscates human text into a salad of machine-symbols and words |
|BibTeX | simple tagging/citing/referencing standard for plaintext |
|BibTag | a BibTeX tag |
|(hashtag)bibs | an easy to speak/type/scan tagging SDL ([see here](https://github.com/coderofsalvation/hashtagbibs) which expands to BibTex/JSON/XML |