The W3C JSON-LD Community Group

JSON-LD WG & CG meeting

Minutes for 2026-01-22

Wesley Smith is scribing.
Benjamin Young: Thank you everyone for being here. We are doing a couple of soft kickoff calls over the next couple of weeks
... inviting folks to come talk about what they are doing with JSON-LD.
... As we look towards publishing YAML-LD, CBOR-LD, JSON-LD 1.2, we want to make sure everyone is aware this work is going on.
Ted Thibodeau Jr.: Previous meeting: https://www.w3.org/2026/01/14-json-ld-minutes.html
... in February the meetings will be stricter WG calls, but right now they are more open.

Topic: Piotr Sowinski and the RDF Streaming WG work

Piotr Sowinski: I am the CTO and cofounder of Neverblink, a startup working with RDF. I'd like to show what we are up to in the community group.
... The reason we are doing this is because we have noticed that there are many things related to RDF that are "stream-y" in nature.
... The problem is that different vocabularies addressing this are not interoperable or compatible.
... We define streams where we carry RDF messages from producers to consumers - we intentionally keep this abstract to be portable and extensible.
... Here is an example of how we would serialize this in the Turtle format. We are working on non-breaking extensions of W3C RDF formats.
... As for JSON-LD, we want to have a format for JSON-LD streams.
... the WG looked into this previously; our plan is to use that work as a starting point.
... We want to produce a final draft of this specification and then hopefully form a working group where we could standardize this work.
... If anyone could review the specification and give feedback it would be appreciated.
... We have GitHub issue tracking you can use to provide constructive criticism.
Piotr Sowinski: Link to the RDF Messages spec draft: https://w3c-cg.github.io/rsp/spec/messages
Benjamin Young: The work you propose couldn't be added to the current WG charter, but we are trying to decide if we want to have community group calls concurrently with the WG calls.
... practically, we could look at YAML-LD as a deliverable format for that. It was originally a streamable format.
... Currently the YAML-LD spec doesn't explore this use case.
... CBOR-LD knows nothing of streams and is meant for smaller use cases.
Niklas Lindström: We (National Library of Sweden) use ndjson as a dump format (full dumps then kept in sync using https://emm-spec.org/1.0/ )
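For illustration, a newline-delimited JSON-LD dump of this kind typically puts one self-contained JSON-LD object on each line; the vocabulary and identifiers below are invented for the example, not the library's actual data:

```jsonl
{"@context": "https://schema.org", "@id": "https://example.org/item/1", "@type": "Book", "name": "Example title A"}
{"@context": "https://schema.org", "@id": "https://example.org/item/2", "@type": "Book", "name": "Example title B"}
```

Because every line is a complete JSON document, a consumer can process the dump line by line and then apply later changes from the EMM feed.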
Anatoly Scherbakov: How do RDF messages relate to JSON-RPC?
Piotr Sowinski: I haven't looked into this, I will open an issue so we can see if we can make this compatible.
Anatoly Scherbakov: JSON-RPC is very minimalist, and I always thought linked data could add value in making it more semantic.
Ruben Taelman: I'm sure you have seen the JSON-LD 1.1 streaming work from the previous WG. It's essentially a profile of JSON-LD to make it easier to process in a streaming manner. The main difference I see is that this is not meant for messages, so your requirements for RDF messages don't match with the streaming JSON-LD note.
Roger French: We use JSON-LD for FAIRifying scientific data for acquisition, analysis, and modeling
Roger French: And we have a Materials Data Science Ontology (MDS-Onto, https://cwrusdle.bitbucket.io/ )
Piotr Sowinski: I think we should add a note to clarify this, you can have statement level streaming, triple by triple, or you can go message by message. Jelly distinguishes between these. I think it's a good idea to make this explicit.
Niklas Lindström: See also ND-delimited JSON-LD in RDF4J: https://github.com/eclipse-rdf4j/rdf4j/issues/2840
... Your work is complementary to what we do.
Benjamin Young: I would love to see some comparative stuff - they all say "streaming" or similar, and it would be helpful to know what the key distinctions are.

Topic: Ruben Taelman and Comunica

Piotr Sowinski: Thank you for all the pointers to implementations and your use cases! This is very useful for our CG.
Ruben Taelman: Hello everyone - I am currently a postdoc at Ghent University where I lead a team of researchers to investigate algorithms over decentralized knowledge graphs.
... our engine sits between developers and decentralized data on the Web.
... We want to build query execution algorithms to abstract the complexities in decentralized data.
... We implement our research in the Comunica framework.
... You can run queries in the browser, and Comunica will figure out how to decompose the query into HTTP requests in the background.
... I am also involved with SPARQL and RDF work as well as the creation of specification tests.
... I was also involved in the JSON-LD 1.1 WG. I joined the group because I started working on a JSON-LD parser for TypeScript, because I needed one for Comunica.
... Comunica aims to support all the RDF formats, and the main requirement we have is to support the parsing of all RDF formats in a streaming manner.
... I initially implemented a streaming JSON-LD parser, but parsing JSON-LD in a streaming manner is a lot more complicated than Turtle or N-Triples.
... I worked on providing recommendations to producers of JSON-LD documents to order keys in a way that makes documents easier for streaming parsers to process.
... You can annotate the media type with a profile parameter that signals this key ordering, which improves parsing performance.
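As a hedged sketch of what that looks like in practice, a producer keeps @context (and @type) at the start of each object and advertises the ordering through the profile parameter on the media type; the profile IRI below is the one defined by the Streaming JSON-LD note, while the document content is invented for the example:

```http
Content-Type: application/ld+json;profile="http://www.w3.org/ns/json-ld#streaming"

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "@id": "https://example.org/datasets/1",
  "name": "An example dataset"
}
```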
... Also relevant is that the Comunica query engine is highly modular, and in order to support this architecture we worked on a dependency injection framework that is driven by JSON-LD documents.
... The Comunica framework allows you to define several different instantiations of Comunica.
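As a purely illustrative sketch of the dependency-injection idea (not the actual Components.js context or component IRIs used by Comunica), each component instance is described as a JSON-LD node whose @type names a component and whose properties become constructor arguments:

```json
{
  "@context": "https://example.org/components/context.jsonld",
  "@id": "urn:example:my-engine",
  "@type": "ex:QueryEngine",
  "ex:mediators": [
    { "@id": "urn:example:parser-mediator", "@type": "ex:ParserMediator" }
  ]
}
```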
... Happy to answer any questions.
Anatoly Scherbakov: Thank you for the presentation. I looked briefly at Comunica. Do you always have to specify the exact datasets to query, or can Comunica guess based on the SPARQL query?
Roger French: I can present for Erika and I
Roger French: But I don't know how to put something in the speaker queue
Ruben Taelman: The default version of Comunica only implements federated SPARQL queries where you need to specify the datasets.
... We have a different build of Comunica that includes more experimental components, but that is not included in default Comunica because it is unstable.
... Here is an example query that does not require you to choose data sources - Comunica will try to figure out which URLs to start from based on the query.
... It will then recurse in links found in documents.
... Link traversal works reasonably well in environments that provide rigid structure on top of linked data, but over the open web it doesn't work that well.
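For reference, a minimal sketch of the default, explicit-sources mode described above, assuming the @comunica/query-sparql package; the endpoint URL is only an example:

```typescript
import { QueryEngine } from '@comunica/query-sparql';

async function main(): Promise<void> {
  const engine = new QueryEngine();

  // Federated SPARQL query: the caller lists the data sources explicitly.
  const bindingsStream = await engine.queryBindings(
    'SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10',
    { sources: ['https://fragments.dbpedia.org/2016-04/en'] },
  );

  // Results arrive as a stream of bindings.
  bindingsStream.on('data', (binding) => console.log(binding.toString()));
}

main().catch(console.error);
```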
Anatoly Scherbakov: As an efficiency upgrade, have you considered collaboratively built databases or descriptions of relations between databases?
Ruben Taelman: Authentication layers make it difficult to create a public or shared index over some protected datasets. If the data is only open, such an index would help a lot.
Anatoly Scherbakov: Such an index would bring the vision of the semantic web a little bit closer, and I would be interested in learning about research in this area.
Benjamin Young is scribing.
Wesley Smith: I have a question about link traversal techniques
... if you had infinite processing power, would it be just as performant as an informed query?
Ruben Taelman: There are performance issues at this point that mean we haven't dug into "correctness" much yet
... it could be tens of thousands of links
... and figuring out how to prioritize them, etc.
... is mostly where we're focused
... we're also exploring trustworthiness
... if you work under normal linked data principles, your trust is rather high
... and if you hit a malicious publisher, they may try to corrupt your data in some way
... there are techniques to discover which ones are trustworthy...or at least partially trustworthy
Wesley Smith: You pointed out that correctness is hard to define here
... and that folks could pollute the data in some way
Anatoly Scherbakov: The question of trust is very important, and perhaps people could express trust or distrust with the previously discussed index.
... I think it is a wide and interesting area of research.
Ruben Taelman: My main interest in this WG is to get JSON-LD ready for RDF 1.2.
Benjamin Young: 1.2 is a point release; we are looking towards 1.3 for the future.
... potentially 1.3 would be called 2.0. First we need to finish the maintenance mode housekeeping.

Topic: Roger French: Verifying and Analyzing Scientific Data

Roger French: We focus on verifying scientific data and analyzing models, down in the details of science but then connected back up to schema.org etc.
... We are using JSON-LD to verify data. It lets us keep metadata inside JSON-LD and find it easily. Then we build knowledge graphs out of this linked data, with JSON-LD backed by Turtle.
... We have a tool for converting between draw.io diagrams and Turtle formats.
... We also have tools to drop facts from data analysis into JSON-LD.
... This allows datasets to remain self-contained without tying them to any particular database.
... Good to connect to the CS side since we are from the materials science side.
... We have about 50 people in our research group and also do a lot of work with the DOE and national labs. There is a lot of interest there in linked data to make things agnostic to implementation details.
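To give a flavor of the pattern (this is a generic sketch using schema.org plus a made-up domain prefix, not MDS-Onto's actual terms), metadata travels with the data as a small JSON-LD object:

```json
{
  "@context": {
    "schema": "https://schema.org/",
    "mds": "https://example.org/mds-onto/"
  },
  "@id": "https://example.org/samples/XYZ-001",
  "@type": "schema:Dataset",
  "schema:name": "Module degradation time series",
  "mds:instrument": "example-spectrometer",
  "mds:exposureHours": 1000
}
```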
Anatoly Scherbakov: Scientific applications of linked data are very exciting. How often do researchers or non-specialists have to write or draw linked data?
Roger French: Our model is that this should become the normal routine way, and we focus on developing data science tools for people in our fields to do these things.
... We have a lot of graduates that get hired into science labs because there is demand for this.
Anatoly Scherbakov: YAML-LD is a lightweight way to write linked data that could be useful in your tooling, and can integrate with AI agents for writing linked data.
... I would love it if you had feedback on the candidate specification.
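For comparison, the same kind of record can be written as YAML-LD, which expresses the JSON-LD data model in YAML syntax; the terms below are illustrative:

```yaml
"@context":
  schema: https://schema.org/
"@id": https://example.org/samples/XYZ-001
"@type": schema:Dataset
schema:name: Module degradation time series
```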

Topic: Evan Prodromou: Activity Streams & ActivityPub

Anatoly Scherbakov: I haven't been able to hide my motives from pchampin for too long :)
Evan Prodromou: I am one of the editors of ActivityStreams and one of the authors of ActivityPub and would love to talk about them.
... ActivityStreams is a JSON-LD vocabulary for representing activities and entities in social networks.
... It covers both actors, like people and groups, that perform activities, and content objects such as articles or images.
... There are also events, places, and relationships in this vocabulary. It is a subject-verb-object structure.
... The ActivityStreams Core document is our main structure; it defines how activities, actors, and objects work together.
... We have another standard, Activity Vocabulary, where these types are defined.
... There are a few different applications of ActivityStreams - it is used in some other linked data systems, but the primary use in 2026 is ActivityPub, which is a standard for hosting activities and ActivityStreams objects as well as delivering them across the network.
... ActivityPub is a widely implemented standard for social network interactivity - Mastodon is the best known implementer.
... There are about 100 independent implementations.
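As an illustration of the subject-verb-object structure Evan describes, a minimal ActivityStreams 2.0 activity might look like this (the actor and note content are invented for the example):

```json
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Create",
  "actor": {
    "type": "Person",
    "id": "https://example.org/users/alice",
    "name": "Alice"
  },
  "object": {
    "type": "Note",
    "content": "Hello, fediverse!"
  }
}
```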
Evan Prodromou: We're able to convey information about the object, etc.
... and apply that info in various places
Benjamin Young is scribing.
Evan Prodromou: We are at six or seven task forces within the Social Web CG.
... The documents themselves have been in existence for 8-9 years and have not been revised in that time.
... The implementation experience of those implementers has led to some understanding of clarity issues with ActivityPub/Streams.
... In order to resolve these and make development of these applications easier, we are working on a revision of these documents in a new W3C WG.
... The goal there is backwards-compatible improvements to ActivityPub/Streams, especially with respect to clarity and readability.
... Currently, we are not looking at supporting other linked data representations such as YAML-LD or CBOR-LD, but experimentation in that area is possible.
Benjamin Young: There are a lot of second to third passes on downstream specifications like these. So much of our own next chapter is helping groups understand extensibility and look towards stuff like property graphs that might be of help or simplify things. It's great that you will be here representing.

Topic: Community Meet & Greet

Benjamin Young: Thank you everybody who has come today - we are looking for warm bodies to fill chairs and type in specifications.
... We need people who are willing to contribute PRs. There are a handful of people who are visiting who didn't say anything, and I'm keen to give them time since Evan will be back.
Jason Desrosiers: I am here representing JSON Schema, but Victor invited me to talk about how these two spaces interact.
Pierre-Antoine Champin: I would love to continue the discussions about JSON-Schema vs. SHACL for JSON-LD
Aaron Coburn: We've mainly got questions around media types