SQL queries JSON modeling

In this chapter, you will understand the principles of data modeling through SQL query statements that we have introduced in this release of Gluesync. All of the operations involved in that feature are performed on the fly without involving local caches or any form of persistence inside Gluesync, this makes the overall process faster, more secure and more consistent.

Data modeling in Gluesync makes use of what we call virtual entities, if you’re not familiar with it we suggest having a look at this page, it will take less than 1 min to read.

The query virtual entity

As per the snipped here following, you can make use of the SQL queries JSON modeling declaring inside the entity object the reserved key called query. This query object takes as a parameter a minified SQL query statement string and uses this against the data source: this means that the Gluesync engine is compatible with the SQL query engine version or language that your relational databases can support. You don’t have to learn a new SQL query language nor limit you on specific platform functionalities.

Following a SQL query used for example against an MS SQL relational database:

-- orders list + customer info + order status + addresses for orders only in status == opened
select
    oh.id,
    oh.order_number,
    oh.order_date,
    c.name + ' ' + c.surname as customer,
    c.phone,
    os.status,
    d.first_name,
    ad.street + ' ' + cast(ad.street_number as varchar(max)) + ' ' + ad.city + ' ' + ad.postal_code as address,
    oh.notes as order_notes
from orders_headers oh
         inner join customers c on c.id = oh.customer_id
         inner join orders_status os on os.id = oh.status
         inner join addresses ad on ad.id = oh.address_id
         inner join drivers d on oh.driver_id = d.id
where oh.status = 1;

will result in a virtual entity map that looks like this:

"sourceEntities": {
    ...
    "orders": {
      "query" : "select oh.id, oh.order_number, oh.order_date, c.name + ' ' + c.surname as customer, c.phone, os.status, ad.street + ' ' + cast(ad.street_number as varchar(max)) + ' ' + ad.city + ' ' + ad.postal_code as address, oh.notes as order_notes from orders_headers oh inner join customers c on c.id = oh.customer_id inner join orders_status os on os.id = oh.status inner join addresses ad on ad.id = oh.address_id where oh.status = 1",
      "type": "opensea",
      "scope": "myschema"
    }
  },

as a JSON document results in your NoSQL database, you’ll then have:

{
  "id": 100,
  "order_number": "SO-71828397",
  "order_date": "2021-07-29T23:01:55Z",
  "customer": "Curtis Streets",
  "phone": "+966 443 229 2199",
  "status": "Opened",
  "address": "Rowland 107 Komsomolsk-on-Amur 681008",
  "order_notes": "Networked fault-tolerant solution",
  "type": "orders"
}

Key fields involved in Advanced data modeling

  • query: the key that tells the engine how to treat the following object, specifying that key enables SQL query data modeling over the object;

  • type: every modeled document stored on the target database will have a key called type with the given custom value;

  • scope (optional): will be used as source/target scope in NoSQL databases supporting this kind of data tenancy, like Couchbase for example.

type and scope keys have been recently introduced with version 1.5, to learn more please take a look at our version changelog by following this link: Changes in SQL query data modeling.

Supported SQL commands

SQL query statements offer DBAs and software developers a wide variety of possibilities when it comes to querying and representing data sourced from tables, columns and rows.

Since this initial release of the SQL queries JSON modeling feature, we focused on providing a robust and flexible way to represent SQL queries output into the JSON format that you expect to serve as content to your users who are consuming the APIs you’ve attached to your NoSQL database. Giving this flexibility means that there could be corner cases where the query statement you are using in Gluesync could make the replication process slower than expected or might not support all the aggregation or business logic you are used to applying while querying your data in a relational database.

Here we have collected all the current supported query statements that are under the suite of integration tests. We’d love to hear your feedback for any other not-yet-mentioned SQL command statement or function you manage to test, this will help us to improve the product for the next releases.

SQL statement Gluesync compatibility

INNER JOIN aggregations

OUTER JOIN aggregations

LEFT JOIN aggregations

WHERE clauses

string concat like foo + ' ' + bar

cast(foo as xxx(yyy)) operators

math operators like * + - /

UNION clauses

👉🏻 Support for UNION is achieved through merging results from different entities, see the following chapter about UNIONS how-to to learn how making use of them.

Subqueries are not yet supported in this version of Gluesync, we plan to add the support in the upcoming releases.

Unions how-to

With a UNION you can aggregate results from different queries that share sharing same column format, while this is usually achieved in common SQL queries by applying a UNION clause between each of your query statements in Gluesync we adopted a different approach to let you achieve the same goal providing a seamless approach for results incoming from each entity types.

To ask Gluesync to UNION results from different entities (no limitations are imposed, you can UNION as many entities as you’d like) you just need to declare two or more entities that are then sharing the same destination type and scope, like those in the example here following:

  "job_a": {
    "query": "SELECT j_a.MyNo as name,
    j_a.Active as active,
    j_a.Description as description,
    j_a.Notes as notes, 123 as dbid
    FROM dbo_a.Job j_a",
    "scope": "dbo-scope",
    "type": "job-type"
  },
  "job_b": {
    "query": "SELECT j_b.MyNo as name,
    j_b.Active as active,
    j_b.Description as description,
    j_b.Notes as notes, 123 as dbid
    FROM dbo_b.Job j_b",
    "scope": "dbo-scope",
    "type": "job-type",
  },
  ...

Incoming results will be then merged together to form the UNION you were looking to achieve, bringing all the results together inside the destination scope (if supported by your target datastore) and anyways sharing the same type key as well as the same collection.

Compatibiliy matrix

RDBMS Gluesync compatibility

DB2 for i series (AS/400)

DB2 w/ GDC (triggers)

⏱ launching soon

MariaDB w/ GDC (triggers)

Microsoft SQL Server w/ Change Tracking

Microsoft SQL Server w/ CDC

MySQL w/ GDC (triggers)

Oracle w/ Xstream

Oracle w/ GDC (triggers)

PostgreSQL w/ GDC (triggers)

PostgreSQL w/ CDC

Sybase w/ GDC (triggers)

⏱ launching soon

As per our product roadmap relational databases are being added on monthly basis so expect to have SQL queries JSON modelling support on more databases soon after the initial launch.

Nested JSON objects

SQL queries JSON modelling enables you to output your query statement into a flat JSON file that represent the actual result of a query output in a 2 dimensional rows & columns representation.

If you’re looking for nesting JSON objects in different level of deep you can opt for the Gluesync’s feature called Advanced data modelling. In order to learn more about it please visit the following link: Advanced data modelling from RDBMS.