Best way to handle json query

4/10/2023

It’s important to mention though, that in typical IoT scenarios like the one we mentioned, events and messages are typically sent to an event store like Azure IoT Hub or Kafka, and then processed by a relatively small number of event processors that will execute data loading, so concurrency can be less of an issue. Having multiple writers can potentially create some concurrency issues that varies depending on the combination of indexing and rows already present in the table: this article is describing in details impact on concurrency and performance for various options.įor Columnstore indexes, for example, each thread loads data exclusively into each rowset by taking a X lock on the rowset, allowing parallel data load with concurrent data load sessions. This topic has been extensively covered in another article comparing and contrasting all batching options. Clustered Columnstore ( trickle, or individual inserts we wrapped in a BEGIN/COMMIT )įor #3, it’s important to mention the importance of batching multiple individual insert operations and wrap them into an explicit transaction to mitigate both latency between application and database layer, and latency introduced by individual transaction log writes typical of explicit transactions.In our tests, in order to eliminate all variability related to application implementation and potential latency in database interactions, we dynamically generate various row batches to insert (1, 1000, 10000, 100000, 3M rows) on the server-side, with some random attribute values generated while inserting into our target table through an INSERT…SELECT statement like this:įor the schematized table we’re taking a similar approach to generate our synthetic rowset, but we’re using the OPENJSON function instead to automatically shred all JSON attributes into table columns while again inserting all rows with an INSERT…SELECT statement:įor each table, we then measure loading these row batches with different indexing strategies: The second approach we wanted to test is instead “shredding” all attributes of the JSON document into specific columns in a relational table:įirst scenario we want to test is how efficiently we can load one or many documents generated by an application into these two table structures.

We then identified two options for table structures where persist these JSON messages and execute various tests against.įirst one is storing plain JSON fragments in a nvarchar(max) column, which is an approach that some customers that don’t want to deal with a relational model tend to adopt: To reproduce a realistic scenario, we started from a JSON fragment representing a real telemetry message from a machine in a factory floor containing ~60 attributes, for an average 1.6KB size:

You can persist these fragments can be in a variety of data stores, from blob or file shares, to relational and non-relational databases, and there’s a long standing debate in the industry on what’s the database technology that fits “better” for this task.Īzure SQL Database offers several options for parsing, transforming and querying JSON data, and this article doesn’t pretend to provide a definitive answer to that debate, but rather to explore these options for common scenarios like data loading and retrieving, and benchmarking results to provide a clear indication of how Azure SQL Database will perform manipulating JSON data.

Storing and retrieving data from JSON fragments is a common need in many application scenarios, like IoT solutions or microservice-based architectures.

0 Comments

Best way to handle json query

Leave a Reply.

Author

Archives

Categories