
Resources and subjects, JSON Home and JSON-LD

10 min read

This blog post traces how the idea evolved rather than presenting only the final result. The latest draft of JSON Home, published on 24 November, is not covered. Note: I’m still on my way to understanding REST.

As HATEOAS is a constraint in REST and not an option [1], designers and implementors of RESTful APIs should have a look at JSON Home [2]. JSON Home is, in its primary design goal, a resource-centric draft standard.

In a current project we have established a RESTful API for cross-system data replication and, consequently, implementing JSON Home as a lookup mechanism for resources was on our roadmap.

So far, so good. However, Topic Mappers like me and the Semantic Web people in general are always asking for the subject(s) indicated / represented / identified by a web resource’s representation (see the difference between resource and representation in RFC 7231). I will use represented in the following to keep things easy.
A generic definition of subject can be found in the Topic Maps Data Model:

"A subject can be anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever. In particular, it is anything about which the creator of a topic map chooses to discourse." [3]

The basic idea I would like to sketch out in the following is how to use JSON Home to enable clients to get an idea about the subjects represented by the resources’ representations in question.

In my opinion, this would add substantial benefit to JSON Home, as "follow your nose" [2] could be improved or even replaced by an "I know why I follow" approach.

JSON-LD is a serialization of RDF. RDF enables authors to identify subjects (by using resources and representations as intermediates!) and make statements about them using triples constituted by subject-predicate-object. However, by subject I still refer to the definition given above.
With RDF at hand, a connection to the Semantic Web stack is established.

The simple and obvious question is: how do we connect JSON Home and JSON-LD?

The latest draft of JSON Home provides two hints sections to add further information:
    1.    Resource Hints
    2.    Representation Hints

In order to be able to disclose information about the represented subjects we could add a third section:
    3.    Subjects Hints


The internet media type identifier, used as a key, tells clients which model is used to disclose the subjects' semantics. Example:
{
   "...": "...",
   "subjects-hints": {
      "application/ld+json": {
          "...": "..."
      }
   }
}

A more sophisticated approach could provide more information about the chosen representation. Example:
{
   "...": "...",
   "subjects-hints": {
      "models": [{
          "media-type": "application/ld+json",
          "docs": "http://json-ld.org",
          "body": {
              "...": "..."
          }
      }]
   }
}

The key body holds the actual statements / data.

Both approaches allow the integration of n models.
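
To illustrate how a client might consume either layout, here is a minimal Python sketch. It is not part of JSON Home: the key spelling subjects-hints, the helper name subject_models and the placeholder body {"@type": "Person"} are assumptions made only for this illustration.

```python
import json


def subject_models(home_doc):
    """Return {media_type: body} for either of the two layouts sketched above."""
    hints = home_doc.get("subjects-hints", {})
    if "models" in hints:
        # Second layout: a list of model objects with explicit keys.
        return {model["media-type"]: model["body"] for model in hints["models"]}
    # First layout: media type identifiers are used directly as keys.
    return dict(hints)


# Placeholder documents; the bodies stand in for the elided "..." content.
simple = json.loads(
    '{"subjects-hints": {"application/ld+json": {"@type": "Person"}}}'
)
detailed = json.loads("""
{
  "subjects-hints": {
    "models": [{
      "media-type": "application/ld+json",
      "docs": "http://json-ld.org",
      "body": {"@type": "Person"}
    }]
  }
}
""")

print(subject_models(simple))
print(subject_models(detailed))
```

Both calls yield the same mapping from media type to statements, which is why a client would not need to care which of the two layouts an author has chosen.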

But wait, let’s examine what has happened so far: I have mixed two media types, i.e. application/json-home encloses application/ld+json. The resource’s representation in question is in a media type dilemma. Declaring both media types is not an option, as the Content-Type header defined in RFC 7231 carries exactly one media type. Goodbye, RESTfulness.
Last but not least, the bridge property - subjects-hints - introduces vocabulary which isn’t fixed in any global registry and / or in the standard (yet).

In a quick thought I wondered if JSON namespaces would help (and stumbled upon https://www.mnot.net/blog/2011/10/12/thinking_about_namespaces_in_json). But this doesn’t solve the media type dilemma, an aspect that I miss in the discussion in Mnot’s blog post (or I missed the point).

The new simple and obvious task is to separate JSON Home and JSON-LD and make the JSON-LD accessible via JSON Home.

Basically, what I really want to do is to establish a lookup mechanism for subjects while JSON Home is a lookup mechanism for resources. Consequently, the JSON-LD should reside at the base URI - as a "Home JSON-LD" - side by side with the JSON Home document. I.e. I want the resource identified by the base URI to represent (at least) these two desired states [4].

This is easily done by content negotiation which realizes the separation:

GET / HTTP/1.1
Host: example.org
Accept: application/json-home

GET / HTTP/1.1
Host: example.org
Accept: application/ld+json
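
On the server side, the selection between the two representations could look roughly like the following Python sketch. It is a deliberately simplified take on Accept handling: it honours q values and the */* wildcard but ignores partial wildcards such as application/* and the specificity rules of RFC 7231. The function name pick_representation and the AVAILABLE list are assumptions for illustration only.

```python
AVAILABLE = ["application/json-home", "application/ld+json"]


def pick_representation(accept_header, available=AVAILABLE):
    """Return the first available media type the client accepts, or None."""
    ranges = []
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranges.append((media_type, quality))
    # Highest quality first; the sort is stable, so ties keep header order.
    ranges.sort(key=lambda item: item[1], reverse=True)
    for media_type, quality in ranges:
        if quality <= 0:
            continue
        for candidate in available:
            if media_type in (candidate, "*/*"):
                return candidate
    return None


print(pick_representation("application/json-home"))
print(pick_representation("application/ld+json"))
```

A result of None would correspond to answering with 406 Not Acceptable, or to falling back to a default representation, whichever the API prefers.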

The second part of the task is to make the JSON-LD accessible via JSON Home as the lookup of resources is the starting point for clients.

The initial approach sketched out above introduces new vocabulary in JSON Home: subjects-hints.
The idea was to attach pieces of JSON-LD to the particular resources, which in turn would require JSON-LD to specify JSON Pointer [5] as its fragment identifier syntax.
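
To make the fragment idea concrete, the following Python sketch resolves a JSON Pointer as described in RFC 6901 against a small JSON-LD document. It only shows what such addressing would look like under the hypothetical assumption that JSON-LD accepted JSON Pointer fragments; the function name resolve_pointer and the sample document are illustrative.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer addresses the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Illustrative JSON-LD fragment; names and URLs are placeholders.
json_ld = {
    "@context": "http://schema.org",
    "@type": "CollegeOrUniversity",
    "employees": {
        "@type": "Person",
        "url": "http://example.org/indexes/foobarbaz",
    },
}

print(resolve_pointer(json_ld, "/employees/url"))
```

A JSON Home resource could then, in principle, reference "its" statements with a fragment such as #/employees, which is exactly the extension to both standards that the following paragraphs argue against.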

Extending the standard this way feels somewhat cumbersome. Looking deeper at the conceptual basis, the attempt is to
    •    separate the subjects space from the resources space,
    •    as a meta layer,
    •    and make the subjects space accessible from the resources space.


I.e. the separation is still more conceptual than technical. As the subjects represented in the JSON-LD are about the domain space, the resources published in JSON Home will also occur in the JSON-LD, but with a specific purpose related to the subject in question.
E.g. in Topic Maps there is a precise classification / distinction of purposes:
"subject identifier
locator that refers to a subject indicator

subject indicator
information resource that is referred to from a topic map in an attempt to unambiguously identify the subject represented by a topic to a human being

subject locator
locator that refers to the information resource that is the subject of a topic" [6]

Taken by itself, the URI http://example.org/people/John-Doe is just a resource in a JSON Home document, enriched by technically motivated hints. However, from the subjects perspective, a certain real Mr. John Doe may be a subject of interest and http://example.org/people/John-Doe may act as a subject identifier referring to a subject indicator. This is all well defined in Topic Maps, which serves as a reference implementation for subject identification here. The JSON-LD document then contains further statements about the subject Mr. John Doe using RDF triples, which constitute the subjects space. This is semantically motivated information.

Let’s talk about an example.
A very popular vocabulary, widespread on the web and therefore (to keep it short) assumed to carry a certain degree of common understanding, is schema.org. Please refer to http://schema.org for further information.
We find some inspiring examples in the schemas section, e.g. for the type “Person”:
Jane Doe
Photo of Jane Joe
Professor
20341 Whitworth Institute
405 Whitworth
Seattle WA 98052
(425) 123-4567
jane-doe@illinois.edu
Jane's home page:
janedoe.com
Graduate students:
Alice Jones
Bob Smith [7]

Let’s imagine we want to build a RESTful API to access public data about the staff and students of a university where Prof. Jane Doe is a guest lecturer. The basic design of the API consists of indexes (of the university people, i.e. staff / employees and students) represented as lists of hyperlinks to the people’s homepages, and items which are the people’s homepages.
Since REST is not about nice URIs [8], parsing the URIs and / or the URI patterns in our API’s JSON Home for meaningful strings fails; we have chosen arbitrarily generated URIs. E.g. the URI http://example.org/indexes/foobarbaz identifies the index of the employees.

I guess you are getting the idea. We would establish a JSON-LD representation which discloses information about the involved subjects, the Whitworth Institute and its employees:

{
   "@context": "http://schema.org",
   "@type": "CollegeOrUniversity",
   "name": "Whitworth Institute",
   "url": "http://example.org/about",
   "address": {
      "@type": "PostalAddress",
      "addressLocality": "Seattle",
      "addressRegion": "WA",
      "postalCode": "98052",
      "streetAddress": "20341 Whitworth Institute 405 N. Whitworth"
   },
   "employees": {
      "@type": "Person",
      "url": "http://example.org/indexes/foobarbaz"
   }
}

The property url is specified as "URL of the item" [9]. Unfortunately, this is just vague semantics.

There is no guarantee that all resources from JSON Home reappear in the "Home JSON-LD". It is the author’s choice to determine the coverage of resources as well as the richness and number of statements about the subjects. However, "Home JSON-LD" would require authors to set the url property for each subject; this property acts as a wormhole between JSON Home and "Home JSON-LD".

This is the subjects space. Based on shared semantics (in / via the established properties), clients can pick resources of interest from "Home JSON-LD" and retrieve access information from JSON Home afterwards (or vice versa). Little effort is required to map the string property url to the corresponding resource object.
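
The mapping from the url property to the corresponding resource object could be sketched in Python as follows. The JSON Home document below is a hypothetical example: the relation name tag:me@example.com,2016:employees, the hints and the helper resources_for_subject are assumptions, and only resources with a plain href (no URI template) are considered.

```python
from urllib.parse import urljoin


def resources_for_subject(subject, home_doc, base_uri):
    """Return the JSON Home resources whose href matches the subject's url."""
    target = subject.get("url")
    matches = {}
    for relation, resource in home_doc.get("resources", {}).items():
        href = resource.get("href")
        # JSON Home hrefs may be relative, so resolve them against the base URI.
        if href is not None and urljoin(base_uri, href) == target:
            matches[relation] = resource
    return matches


# Hypothetical JSON Home document for the API described above.
home = {
    "resources": {
        "tag:me@example.com,2016:employees": {
            "href": "/indexes/foobarbaz",
            "hints": {"allow": ["GET"]},
        },
        "tag:me@example.com,2016:students": {
            "href": "/indexes/quxquux",
            "hints": {"allow": ["GET"]},
        },
    }
}

# A subject as it appears in the "Home JSON-LD".
employees = {"@type": "Person", "url": "http://example.org/indexes/foobarbaz"}

print(resources_for_subject(employees, home, "http://example.org/"))
```

The lookup is a plain string comparison after URI resolution, which also makes its weakness visible: as soon as the URI changes in only one of the two documents, the match silently disappears.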

I know that this is a naive approach; the connection between "Home JSON-LD" and JSON Home is not a sound one. URIs could change, and authors might forget to update the JSON-LD. Using the link relation types in JSON Home which allow custom relations like “tag:me@example.com,2016:widgets” [2] would be a far more stable mechanism.

Finally, I like the idea of a "Home JSON-LD" which is retrieved via the base URI using content negotiation. Using a special resource in JSON Home requires a corresponding special link relation which discloses the specific "Home JSON-LD" semantics. This relation has to be registered and established (or the other way round) first. However, we should use the full range of existing standards before extending them. This is always a useful design goal.

Musing about “Home JSON-LD” is another story.

References

[1] https://www.infoq.com/articles/roy-fielding-on-versioning
[2] https://mnot.github.io/I-D/json-home/
[3] http://www.isotopicmaps.org/sam/sam-model/#d0e746
[4] https://tools.ietf.org/html/rfc7231#section-3
[5] https://tools.ietf.org/html/rfc6901
[6] http://www.isotopicmaps.org/sam/sam-model/#terms-and-definitions
[7] https://schema.org/Person, Example 8
[8] https://speakerdeck.com/stilkov/rest-not-an-intro-1?slide=14
[9] https://schema.org/CollegeOrUniversity

The Coherent Business to SCS Model

3 min read

At first glance the self-contained system (SCS) approach [1] is a native software architecture topic. However, there is a tight correlation to the organisational, or rather the business, part of the story. Here's the hook: "Each SCS is owned by one team." [2]. Further: "The manageable domain specific scope enables the development, operation and maintenance of an SCS by a single team." [3]

What's the consequence? Let's change perspective and have a view from the business side. A plausible correlation between an SCS architecture and the business layer is constituted by a mapping from the core business processes to their enabling (or maybe just facilitating) software systems. The mapping is as follows:

  1. Identify a value creating business process P.
  2. Divide P into logical steps P1 - Px whereby "logical" means that each step represents disjoint, well-defined business logic.
  3. Each step is cast to a business domain or - for short - domain.
  4. Finally, each domain is mapped to a self-contained system. Shared business objects such as customer or order are exchanged via RESTful HTTP or lightweight messaging as defined in the SCS approach.
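
The four steps above can be sketched as a small Python data model. This is only an illustration of the mapping, not an implementation of the SCS approach; the process steps, team names and the helper map_process are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfContainedSystem:
    name: str
    team: str  # "Each SCS is owned by one team."


@dataclass(frozen=True)
class Domain:
    name: str
    system: SelfContainedSystem


def map_process(steps, team_for_step):
    """Cast each logical step to a domain and map each domain to one SCS."""
    if len(set(steps)) != len(steps):
        raise ValueError("steps must be disjoint")
    domains = []
    for step in steps:
        system = SelfContainedSystem(name=f"{step}-scs", team=team_for_step[step])
        domains.append(Domain(name=step, system=system))
    return domains


# Hypothetical value-creating process P divided into steps P1 - P3.
steps = ["search", "checkout", "shipping"]
teams = {"search": "team-a", "checkout": "team-b", "shipping": "team-c"}

domains = map_process(steps, teams)
for domain in domains:
    print(domain.name, "->", domain.system.name, "owned by", domain.system.team)
```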

The Coherent Business to SCS Model

The mapping of the business part to its technical counterpart is coherent. Therefore, this is called the "Coherent Business to SCS Model" (CBSM).

A product organization has product managers who manage their domain products powered by SCSs. There's a good chance that they do it the agile way having teams who develop, operate and maintain their SCSs.

On the business layer the domains are tied together by the company vision, which enables the product managers to derive their product visions.

The core value of the CBSM is that it enables the domains to develop with maximum speed due to minimum dependencies at the system level. To be precise, the only dependency is the API. If a domain does not change its API, there will be (theoretically) no limit to development within this domain. This even allows replacement of the underlying SCS once it is at the end of its lifecycle. Equivalent lightweight communication on the business layer makes this a success story.

A congruent approach has been implemented at GALERIA Kaufhof (a member of Hudson's Bay Company) [4]. It further shows an evolution of the model: A domain may be powered by more than one SCS for technical reasons.

Well, I think this kind of interaction of business and technology is not really a new idea. Have a look at The Open Group's definition of a service in SOA which embraces some of the core ideas [5]. My argument is that the Coherent Business to SCS Model is a more lightweight approach (buzzword!). I just wonder if SCS is a consequence or the driver?

[1] http://scs-architecture.org
[2] http://scs-architecture.org/#one-team
[3] https://speakerdeck.com/rstrangh/self-contained-systems-1
[4] https://galeria-kaufhof.github.io/general/2015/12/15/architektur-und-organisation-im-galeria-de-produktmanagement/ (in German)
[5] https://en.wikipedia.org/wiki/Service-oriented_architecture

"Let Technology Inspire You" Series

1 min read

Last Monday we started our "Let Technology Inspire You" series at DI UNTERNEHMER. Offering a forum to data people and enabling vital discussions about "data" is part of our transformation towards a (digital) data company. The first guest speaker was Tim Strehle, explaining "How the Semantic Web can change Digital Asset Management" (slides in German).

My inspirations are as follows:

  1. "Using HTML as the Media Type for your API" including schema.org RDFa markup
  2. The power of media types to describe a system's domain in an SCS environment

Thoughts on #1:

  • Is it efficient to have a hybrid resource representation for both humans and machines? Why not use content negotiation to provide a human-readable and a machine-readable representation? If a resource is data driven, this separation should not require much effort.

Besides these inspirations, it was interesting to see how the idea of Self-Contained Systems is spreading. (Tilkov, the Messiah.)

If you are interested in sharing your thoughts on data with us, please drop me a line.