I'm a member of the JSON Schema Technical Steering Committee, and I've been making a living consulting with companies that use JSON Schema at scale. Think data domains in the fintech industry, big OpenAPI specs, API governance programs, etc. The tooling to support all of these use cases was terrible (non-compliant, half-baked, lacking advanced features, etc.), and I've been trying to fix that. Some highlights include:
- An open-source JSON Schema CLI (https://github.com/sourcemeta/jsonschema) with lots of features for managing large schema ontologies (like a schema test runner, linter, etc)
- Blaze (https://github.com/sourcemeta/blaze), a high-performance JSON Schema C++ compiler/validator, proven to be on average at least 10x faster than alternatives while retaining a 100% compliance score. Built for API gateways and some high-throughput financial use cases
Right now I'm consolidating a lot of what I've built into a self-hosted "JSON Schema Registry" microservice: you provision your schemas to it (from a git repo) and it does all of the heavy lifting for you, including rich API access for a lot of schema-related operations. It's still in alpha (and largely undocumented!), but I'm working hard to transition some of the custom projects I did for various orgs to this microservice long term.
As a schema and open-source nerd, I'm working on my dream job :)
That's really neat. I've been doing some JSON Schema work on our products recently, especially trying to take a schema and generate data compliant with it, e.g. for testing accounts with clean data in non-production environments. I feel like the area's underdeveloped.
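For what it's worth, one way to prototype that kind of schema-driven data generation is with a generator library. Here's a rough sketch using the json-schema-faker npm package; its JSONSchemaFaker.generate entry point and the account schema below are my assumptions for illustration, not something from this thread:

```typescript
// Rough sketch of schema-driven test data generation.
// Assumes the json-schema-faker package and its JSONSchemaFaker.generate()
// entry point; check the package docs for the exact API of your version.
import { JSONSchemaFaker } from "json-schema-faker";

// Hypothetical account schema, invented for illustration.
const accountSchema = {
  type: "object",
  required: ["id", "email", "balance"],
  properties: {
    id: { type: "integer", minimum: 1 },
    email: { type: "string", format: "email" },
    balance: { type: "number", minimum: 0, maximum: 10000 }
  },
  additionalProperties: false
};

// Produce a random document that satisfies the schema, e.g. to seed clean
// test accounts in non-production environments.
const fakeAccount = JSONSchemaFaker.generate(accountSchema);
console.log(fakeAccount);
```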
Yeah, exactly. This is a great example. In theory schemas open up all of those use cases in an elegant manner, yet the tooling often sucks. Would love to connect and at least have your use case on my radar!
I think this is the key. It is cheaper and more convenient than ever to deploy and manage data-critical services yourself, in a self-hosted manner, protected by whatever jurisdiction you are in. What matters is not who builds it, but who has access to the data, and ideally, that's only you!
I've been to Bangalore a few times and man, the traffic is indeed the worst I have ever seen. It could take an hour just to get an Uber driver to accept your ride, and another hour to cover relatively short distances within the city. Plus there seems to be a complete lack of traffic lights. Fun city :)
+1. So many cool desktop app ideas show up on HN every now and then, yet most of them are Electron-based web stuff that just feels horrible. At the very least, Qt would be a lot more appreciated. So much missed potential
It's clearly a matter of taste. I, on the other hand, can't stand Qt or apps that try to look "native" in general. At least on Windows, it feels like going back to the previous century. The modern look, known from websites, is the only one that works for me, hence I use Electron. This way I also have full control over the UI and it looks the same on every OS.
> I, on the other hand, can't stand Qt or apps that try to look "native" in general.
100%. That's why I said "at least", and it's the feeling I have with Electron too. Neither Electron apps nor Qt ones really feel native, and in that case, isn't it better to either go fully native (so it doesn't feel like an imperfect approximation) or just deliver a web app that you can use in a browser?
The in-between ends up in a gray area that never feels quite right. But I agree it is in part a matter of style and expectations.
Though I also agree the Win32 look is terrible and outdated. GTK and Cocoa are really great, good-looking native technologies on Linux and macOS. I've seen more and more projects target GTK on Windows instead of Win32 for this reason.
I feel comfortable with web technologies, and a large part of this project was using Monaco Editor as a powerful diff viewer, so I'll stay with my current stack. I'd be happy to make a pure web app, but I don't think it's possible with something like a Git client, since it needs to call system commands. Maybe there are some hacks available for local apps, but why would that be better than just using Electron? Yeah, it takes some space, but what is that compared to, say, any modern computer game?
Can you explain why? I feel like more people are familiar with web technology, and from my own subjective experience it's much simpler than Qt, for example.
It depends on which point of view you're coming from.
A lot of people are familiar with web technologies, therefore using something like Electron is way easier for them. That makes a lot of sense.
However, from an end user's point of view, Electron (while potentially easier to develop with for a large pool of developers) doesn't feel native. You can tell you are running a web app inside: it doesn't obey the OS conventions or the standardised OS shortcuts, it looks different from the rest, etc. It's like it doesn't quite match, and all the muscle memory you have for working with other native apps (mainly for keyboard-heavy users like myself) just doesn't work, making it a frustrating experience. Plus many (not all!) Electron apps are super heavyweight and feel slow when you contrast them with truly native apps.
Overall, I think you will see a lot of people that don't really mind Electron, but many do. I think it largely comes down to whether you want to develop a desktop app faster yourself, or deliver a desktop app that would satisfy almost every user out there (which might be harder to build).
And BTW, this is coming from somebody who has worked a LOT with Electron, as the original author of Etcher (https://github.com/balena-io/etcher), plus I led the Desktop Team at Postman (https://www.postman.com, which is arguably one of the worst Electron apps out there, mostly due to really bad early architecture/business decisions) for a while. I tried everything, but I gave up on it. It can never be even a good enough approximation of the native experience many users expect.
In any case, great job with GitQuill. It does look pretty cool and I wish it was a native app!
Developing apps with Qt and QML is super easy these days. I wrote a post about my experience[1]. QML is such a joy to program the UI in, and then you can use a compiled language (C++, Rust, etc.) for amazing performance gains. Also, most Qt Quick components (those exposed via QML) are written in C++, so using these native components means that even working with these abstractions is very performant.
To the article's point, many (if not most) JavaScript projects are not optimised, so better performance can often be achieved with just JavaScript, and yes, JavaScript engines keep getting faster. However, no matter how much faster JavaScript gets, you can always go faster still with a systems language.
I work on high-performance stuff as a C++ engineer, and I'm currently working on an ultra-fast JSON Schema validator. We are benchmarking against AJV (https://ajv.js.org), a project with a TON of VERY crazy optimisations to squeeze out as much performance as possible (REALLY well optimised compared to other JavaScript libraries), and we still get numbers like 200x faster than it with C++.
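For context, the AJV side of a comparison like this is essentially compile-once, validate-many. A minimal sketch of that hot path (the schema below is made up for illustration):

```typescript
// Minimal sketch of the AJV hot path: compile the schema once up front,
// then call the generated validation function many times.
import Ajv from "ajv";

const ajv = new Ajv();

// Hypothetical schema, invented for illustration.
const schema = {
  type: "object",
  required: ["id", "amount"],
  properties: {
    id: { type: "string" },
    amount: { type: "number", minimum: 0 }
  },
  additionalProperties: false
};

// AJV compiles the schema into a specialised JavaScript validation function.
const validate = ajv.compile(schema);

const document = { id: "tx-123", amount: 42.5 };

if (!validate(document)) {
  // validate.errors holds the errors from the most recent call.
  console.error(validate.errors);
}
```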
Not building an app myself right now, but I worked a lot in the desktop app space, leading development at various companies, and I've been thinking of writing a book on the topic (I'm an O'Reilly author in another space). I tend to blog about macOS stuff every now and then, like https://www.jviotti.com/2022/02/21/emitting-signposts-to-ins....
Quick questions for everybody here:
- How do you develop your apps right now, mainly cross-platform ones? Qt? Would you enjoy a C++ cross-platform framework that binds directly (and well) to Cocoa?
- Are you using Electron? If so, would you appreciate premium modules for Electron that bring a lot more "native" Cocoa functionality instead of reinventing the wheel in JavaScript?
What do you wish a book on modern desktop app development would cover?
Great stuff! I'm starting to work a lot in the satellite space (https://www.sourcemeta.com), building a binary serialization format around JSON called JSON BinPack (https://jsonbinpack.sourcemeta.com) that is extremely space-efficient, letting you pack more documents into the same Iridium uplink/downlink operation (up to 74% more compact than Protocol Buffers; see the reproducible benchmark here: https://arxiv.org/abs/2211.12799).
It is still very much a work in progress, but if anybody here is suffering from expensive Iridium bills, I would love to connect and chat to make sure JSON BinPack is built the right way!
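To give a flavour of why sharing the schema ahead of time saves so many bytes, here is a toy illustration (my own sketch, not JSON BinPack's actual wire format): when both ends know the schema, property names and type tags never need to travel, only the values do, each squeezed into the smallest representation the schema allows.

```typescript
// Toy illustration of schema-driven encoding, NOT JSON BinPack's real format.
// Both sides share the schema, so property names and type information never
// travel over the wire; only the values do, in schema-defined order.

// Shared schema (informally): { lat: number, lon: number, status: "ok" | "warn" | "err" }
const STATUS = ["ok", "warn", "err"] as const;

function encode(msg: { lat: number; lon: number; status: typeof STATUS[number] }): Buffer {
  const buffer = Buffer.alloc(9);
  buffer.writeFloatLE(msg.lat, 0);                  // 4 bytes instead of a textual decimal
  buffer.writeFloatLE(msg.lon, 4);                  // 4 bytes
  buffer.writeUInt8(STATUS.indexOf(msg.status), 8); // enum encoded as a 1-byte index
  return buffer;
}

function decode(buffer: Buffer) {
  return {
    lat: buffer.readFloatLE(0),
    lon: buffer.readFloatLE(4),
    status: STATUS[buffer.readUInt8(8)],
  };
}

// 9 bytes on the wire versus roughly 40 bytes of textual JSON for the same message.
console.log(decode(encode({ lat: 51.75, lon: -1.25, status: "ok" })));
```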
Hey Juan, this looks promising: using a pre-shared schema would allow you to reduce payload sizes in the same way that pre-shared dictionaries make compression more efficient (as with zstd or brotli). Having worked in exactly this space in the past (sending and receiving messages from a microcontroller that eventually go through an Iridium SBD modem), some feedback:
After spending a few minutes poking around your landing page, I'm not sure how JSON BinPack works. I see a JSON Schema, a JSON document and then a payload, but I'd be interested in the low level details of how I get from a schema to a payload. When I'm on a microcontroller, I'm going to care quite a bit about the code that has to run to make that, and also the code that's receiving it. Is this something I could hand-jam on the micro, and then decode with a nice schema on the receiving side? Understanding the path from code and data to serialized payload would be important to me.
One thing that was nice about using ProtocolBuffers for this was that I had libraries at hand to encode/decode messages in the several different languages we actually used. What is the current language support, and what is the roadmap?
I can understand how ProtocolBuffers handles schema evolution, but I'm still not sure how JSON Schema evolution would work with JSON BinPack. An example would move mountains.
Finally, if I were digging into this and found it had switched from Apache to AGPL, and required a commercial license, it would be a hard sell vs all the permissively licensed alternatives. At the end of the day, Iridium SBD messages are like 270 bytes and even hand rolling a binary format is pretty manageable. I think I could swing budget for support, consulting, maintenance or some other service. But if I were evaluating this back when I needed a serialization format, and ran into AGPL, I would bounce almost immediately.
Thanks for the very valuable and honest feedback. All really good points, mainly how lacking the website is in terms of documentation at the moment. Making notes and plan to address all of it!
I'm working on JSON BinPack (https://jsonbinpack.sourcemeta.com), a binary serialization format for JSON (think of it as a Protobuf alternative) with a strong focus on space-efficiency for reducing costs when transferring structured data over 5G and satellite transceivers for robotics, IoT, automotive, etc.
If you work in any of those industries and pay a lot for data transfer, please reach out! I'm trying to talk to as many people as possible to make sure JSON BinPack fits their use case well (I'm trying to build a business around it).
It was originally designed during my (award-winning!) research at the University of Oxford (https://www.jviotti.com/dissertation.pdf), and it was proven to be more space-efficient than every tested alternative in every single tested case (https://arxiv.org/abs/2211.12799), beating Protocol Buffers by up to ~75%.
While designing it was difficult enough, implementing a production-ready C++ version has proven to be very tricky, leading me to branch off into various prerequisite projects, like an ultra-fast JSON Schema compiler for nanosecond schema validation (https://github.com/sourcemeta/jsontoolkit), for which I'm publishing a paper soon.
Don't know if you'll see this, but I really like your work. I read your dissertation at least twice, it's well written and informative.
In my day job I have multiple large JSON files, used in an internal in-house application, into which I serialize a database; working with them is much faster than using the database (for some workloads), although the main reason to do so is really to use git for source control.
In the future I plan on migrating all the code to working off the json files, but that won't happen for some time yet.
I was looking for a fast JSON library, and that's why I discovered your paper and your project, although that was some months after I had started working on that project and it was too late to change.
Currently I use the (very slow) json.net library for most serialization/deserialization; it's fast enough for the places where I use it. But for one specific workload that has to be fast I use RapidJson (the code in that part is already C++), though only its streaming parser, which is blazingly fast at the cost of some extra development work. I tried using simdjson but had trouble compiling it at the time.
By the way, in your dissertation you don't mention RapidJson, nor json.net, nor simdjson. Is there a reason why you didn't compare them?
Hey olvyo, thanks for your nice comments! We should definitely connect!
> By the way, in your dissertation you don't mention RapidJson, nor json.net, nor simdjson. Is there a reason why you didn't compare them?
My research was focused on space-efficiency more than parsing performance, so I didn't talk about that problem. However the long-term goal is that by using JSON BinPack's binary encoding, you should have a better time parsing it, as you don't have to deal with the JSON grammar.
The JSON storage we currently use is kind of "secondary" to the database, so we currently don't verify it. The relational database we use does have a schema, of course.
Schema validation for the JSON is sorely missing, since the files get corrupted from time to time and this is discovered only during deserialization. Unfortunately it's something I don't have enough time to add, as we're swamped with other work (our app is huge and we are a very small team). We just tell our internal users to re-serialize the database to fix these corruptions, which is unfortunate and costly, but it's the best we can do at the moment.
I wrote more about that aspect of my internal app here.
Like I wrote, when a future version drops the database completely and works only off those JSON files, I'll also introduce schema validation. That won't happen for some time though.
Thanks for your answer; I understand now why you didn't take the performance of simdjson etc. into account.
I have thought in the past about using BinPack for my JSON documents, but I want them to remain as human-readable as possible, since the reason to move from a DB to JSON was to turn database diffs into readable diffs, and BinPack isn't readable.
I also want users to be able to use existing tools on the json files (e.g. the jq tool), but existing tools don't understand BinPack (yet?).
I think BinPack would shine in an RPC/IPC setting. Just recently there was a big discussion here about systemd replacing D-Bus with a JSON-based IPC, and a lot of debate around the waste of using plain JSON for that.
Currently experimenting with programmatic generation of JSON Schemas via https://github.com/sinclairzx81/typebox, trying to maximize reuse of schema components.
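For anyone curious what that looks like in practice, here is a small sketch of component reuse with TypeBox (the Address/Customer shapes are made up for illustration):

```typescript
// Small sketch of reusing schema components with TypeBox.
// The Address/Customer shapes are invented for illustration.
import { Type, Static } from "@sinclair/typebox";

// A reusable schema component, defined once.
const Address = Type.Object({
  street: Type.String(),
  city: Type.String(),
  postcode: Type.String()
});

// Compose the component into larger schemas.
const Customer = Type.Object({
  id: Type.String({ format: "uuid" }),
  billing: Address,
  shipping: Type.Optional(Address)
});

// Derive the static TypeScript type directly from the schema.
type Customer = Static<typeof Customer>;

// The result is a plain JSON Schema document you can hand to any validator.
console.log(JSON.stringify(Customer, null, 2));
```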
Was wondering: is JSON BinPack a good serialization format for signing JSON documents?
I'm a member of the JSON Schema Technical Steering Committee, and I've been making a living consulting with companies that use JSON Schema at scale. Think data domains in the fintech industry, big OpenAPI specs, API governance programs, etc. The tooling to support all of these use cases was terrible (non-compliant, half-baked, lacking advanced features, etc.), and I've been trying to fix that. Some highlights include:
- An open-source JSON Schema CLI (https://github.com/sourcemeta/jsonschema) with lots of features for managing large schema ontologies (like a schema test runner, linter, etc)
- Blaze (https://github.com/sourcemeta/blaze), a high-performance JSON Schema C++ compiler/validator, proven to be on average at least 10x faster than alternatives while retaining a 100% compliance score. Built for API gateways and some high-throughput financial use cases
- Learn JSON Schema (https://www.learnjsonschema.com/2020-12/), which is becoming the de facto documentation site for JSON Schema, with >15k visits a month
Right now I'm consolidating a lot of what I've built into a self-hosted "JSON Schema Registry" microservice: you provision your schemas to it (from a git repo) and it does all of the heavy lifting for you, including rich API access for a lot of schema-related operations. It's still in alpha (and largely undocumented!), but I'm working hard to transition some of the custom projects I did for various orgs to this microservice long term.
As a schema and open-source nerd, I'm working on my dream job :)