Advanced data modeling from RDBMS

In this chapter you will understand the principles of the newly introduced feature available starting from version 1.4 specifically designed for a wide range of data modeling/data transformation applications: we named this feature Advanced Data Modeling. All of the operations involved in that feature are performed on the fly without involving local caches or any form of persistence inside Gluesync, this makes the overall process faster, more secure and consistent.

Data modeling in Gluesync makes use of what we call virtual entities, if you’re not familiar with it we suggest having a look at this page, it will take less than 1 minute to read.

Advanced data modeling

When we designed the new data modeling feature we took into consideration a few main aspects, like:

  • freedom to define relationships between tables;

  • Choose if data should be represented as nested in different level of deep or just flat;

  • compatibility across multiple sources: not only RDBMS but also NoSQL databases;

  • easiness to hand-write but also future-proof when it will be managed via an API / GUI (coming soon);

  • fast to compute;

  • no new language to learn: SQL-like statements easy to recall;

  • easy to be extended in future releases;

Taking these as main principles we came up with a meta-description language that is now the core foundation of that functionality.

See it in action here in this video:

The meta description language

The description language helps you define what will be the content of your virtual entity that will then become a JSON object.

Here following is the list of available commands and their description:

Key Description Accepted values

level

Level of deep of parenthood for JSON objects

1 or 2

name

Defines the name of the entity

string

tables

Array of tables that are involved in a JOIN-like statement, each table is described using these elements here on the left

Key Description Accepted values

table

Name of the table involved

string

fields

Object representing the list of columns that will selected and used for the output. The mapping works exactly as per the basic config of Gluesync: values are the resulting target’s names used in your JSON

{
  "id": "id",
  "order_number": "order_number",
  "notes": "notes"
}

alias

(optional) Value used by the query engine to distinguish between repeated usages of the same table in the same context

string

where

(optional) Same per standard SQL language, specifies the clause applied to the filter

string, ex. status = 1

fromTable

Represent the master table name that is involved in your query

boolean (true or false), defaults to false

join (if foreign keys are in place)

Recalls standard SQL-like JOIN statement, specifies the table that has to be joined together in this context. It automatically manages the traverse of relationships between tables' foreign keys (if any).

{
  "with": "orders_headers"
}

join (when NO foreign key is involved)

Recalls standard SQL-like JOIN statement, specifies the table that has to be joined together in this context. It makes use of "on" clauses to force the engine to aggregate data between involved tables in specified statements. This can be used when your table has NO foreign keys declared. Note: don’t forget to use defined aliases.

{
  "with": "orders_headers",
  "on": [
    "oh.id = c.id",
    ...
  ]
}

keys

Map of keys/values representing the keys and their aliases for the involved table

{
  "id_customer": "customerId"
}

Bring everything together

Following a virtual entity defined using the above keys:

"order": {
  // You need to define which feature you intend to enable when parsing the following object. Here we are declaring that our virtual entity will be treated as a "dataModeling" entity
  "dataModeling": [
      {
        "level": 1,
        "name": "orders",
        "tables":
        [
          {
            "table": "orders_headers",
            "keys":
            {
              "id": "id"
            },
            "fields":
            {
              "id": "id",
              "order_number": "order_number",
              "notes": "notes"
            },
            "alias": "oh",
            "where": "status = 1",
            "fromTable": true
          },
          {
            "table": "customers",
            "keys":
            {
              "id": "id"
            },
            "fields":
            {
              "name": "name",
              "surname": "surname",
              "phone": "phone"
            },
            "alias": "c",
            "join":
            {
              "with": "orders_headers",
              // Use only if foreign keys are not defined on table schemas
              "on": [
                "oh.id = c.id"
              ]
            }
          }
        ]
      },
      {
        "level": 1,
        "name": "address",
        "tables":
        [
          {
            "table": "addresses",
            "keys":
            {
              "id": "id"
            },
            "fields":
            {
              "street": "street",
              "street_number": "street_number",
              "city": "city",
              "postal_code": "postal_code"
            },
            "alias": "ad",
            "join":
            {
              "with": "orders_headers"
            }
          }
        ]
      },
      {
        "level": 2,
        "name": "order_rows",
        "tables":
        [
          {
            "table": "orders_rows",
            "keys":
            {
              "id": "row_id"
            },
            "fields":
            {
              "id": "row_id",
              "order_id": "order_id",
              "quantity": "quantity",
              "notes": "row_notes"
            },
            "alias": "o",
            "join":
            {
              "with": "orders_headers"
            }
          },
          {
            "table": "articles",
            "keys":
            {
              "id": "id"
            },
            "fields":
            {
              "article_name": "article_name",
              "description": "description"
            },
            "alias": "a",
            "join":
            {
              "with": "orders_rows"
            }
          }
        ]
      }
    ]
}

as a JSON document results in your NoSQL database, you’ll then have:

{
  "id": 1,
  "order_number": "SO-55052681",
  "notes": "this is a test",
  "name": "Bernard",
  "surname": "Worham",
  "phone": "+46 758 348 3574",
  "street": "Rigney",
  "street_number": 53,
  "city": "Dubai",
  "postal_code": "33170",
  "order_rows": [
    {
      "row_id": 38,
      "order_id": 1,
      "quantity": 898,
      "row_notes": "Total foreground function",
      "article_name": "Beans - Navy, Dry",
      "description": "Re-contextualized full-range service-desk"
    },
    {
      "row_id": 98,
      "order_id": 1,
      "quantity": 172,
      "row_notes": "Enterprise-wide attitude-oriented complexity",
      "article_name": "Muffin Mix - Blueberry",
      "description": "Right-sized high-level paradigm"
    },
    {
      "row_id": 230,
      "order_id": 1,
      "quantity": 810,
      "row_notes": "Horizontal non-volatile core",
      "article_name": "Papayas",
      "description": "Stand-alone responsive standardization"
    },
    {
      "row_id": 393,
      "order_id": 1,
      "quantity": 742,
      "row_notes": "Optimized executive hub",
      "article_name": "Gooseberry",
      "description": "Devolved bifurcated moratorium"
    },
    ...
  ],
  "type": "order",
  "scope": "myscope"
}

Key fields involved in Advanced data modeling

  • dataModeling: the key that tells the engine how to treat the following object, specifying that key enables Advanced data modeling over the object;

  • type: every modeled document stored on the target database will have a key called type with the given custom value;

  • scope (optional): will be used as source/target scope in NoSQL databases supporting this kind of data tenancy, like Couchbase for example.

type and scope keys have been recently introduced with version 1.5, to learn more please take a look at our version changelog by following this link: Changes in Advanced data modeling.

Compatibility matrix

RDBMS Gluesync compatibility

DB2 for i series (AS/400)

DB2 w/ GDC (triggers)

⏱ launching soon

MariaDB w/ GDC (triggers)

-

Microsoft SQL Server w/ Change Tracking

Microsoft SQL Server w/ CDC

MySQL w/ GDC (triggers)

-

Oracle w/ Xstream

⏱ launching soon

Oracle w/ GDC (triggers)

PostgreSQL w/ GDC (triggers)

-

PostgreSQL w/ CDC

Sybase w/ GDC (triggers)

-

As per our product roadmap relational databases are being added every month so expect to have Advanced data modeling support on more databases soon after the initial launch.