GlueSync in Logistics: real-time data for Business Intelligence

A GlueSync logistics warehouse

Here we are, as promised, guiding you through some real-world GlueSync use cases.

The first I want to introduce you to is a project for a client that operates in the logistics sector, covering the supply chain for food goods on a global scale.

First of all, an introduction to the food business

Goods arrive from different locations and in different forms: some by sea, others by air, and the rest, as you can imagine, overland. All the goods then travel to dispatching centers and warehouses located, in this case, all over the US. Items are then dispatched and loaded onto trucks that will soon reach the specific customer who ordered them, in the proper quantities and, more importantly, on time, all while preserving conservation temperatures.

Along that whole journey, the client wants to gather as much data as possible to better monitor the process and everything related to the supply chain: orders, dispatching, picking, transfers, quantities, locations, and IoT data from trucks and fridges (position, goods temperature).

With all of that data, they can elaborate dispatching plans and time windows, adjusting delivery strategies to preserve the goods and reach customers on time.

Overview of the current infrastructure

The old architecture

The business runs on an ERP built on top of Oracle Database that manages pretty much all of its processes. Each location (basically each client warehouse and logistics center) has its own installation of the ERP and its own Oracle Database hosted in an on-prem datacenter. Each location runs independently, and every night a batch import procedure aligns the central systems in the main headquarters, where all the business intelligence is processed.

Business intelligence at scale

Intelligence is crucial here, in a business that moves tons of data across many locations. Imagine having to manage the "Amazon.com" business: you are going to need really powerful business intelligence to take the proper decisions. And that's what they have: a BI stack built with the powerful tools offered by Tableau. This allows the business to make decisions based on data processed and displayed in the Tableau interfaces, via queries executed on top of the HQ Oracle Database cluster.

The problem

Having data available only the day after is becoming a problem: the business is growing, customers are demanding more in terms of lead time and response time, competitors large and small are offering near real-time feedback as soon as an order is placed, and food delivery startups are popping up everywhere, offering services that couldn't even be imagined 10 years ago. How can the business shift to a faster paradigm?

Moving to Couchbase

The customer looked into the NoSQL world as a possible way to evolve the current data infrastructure into a future-proof one, and found Couchbase to be a good partner: it offers an analytics technology that runs queries isolated from the rest of the operations performed inside the database, and it has mobile support thanks to Sync Gateway. That way, they can provide better mobile apps to the workforce and finally dismiss the costly, battery-draining, old-fashioned Windows PDAs used in every logistics center.

Just that? Couchbase also supports Tableau via ODBC drivers offered by CDATA (a good example can be found here). Therefore, this change does not require rewriting all the BI interfaces, and adopting this technology has a good ROI for the CEO: it will soon reduce the cost of the infrastructure resources currently allocated to the huge Oracle Database cluster, and of maintaining the custom batch procedures that keep the systems aligned each night.

Say goodbye to my nightly batch job friends!

The client is definitely happy to say goodbye to them. They served the business for decades, but it is now time to retire them: they cost money to maintain over time, and IT people were scared to touch them whenever the business required a new field or table.

A change can lead to unpredictable data loss or malfunctions that may not be noticed until the next day, or until someone raises their hand because a customer order expired after getting lost in some mysterious way… and we're not even talking about network resiliency and data consistency: how hard is it to achieve those within a batch job script? Better to just drop everything and start over, I guess.

Move away from legacy with GlueSync

GlueSync fits well in this scenario thanks to its ability to connect directly to multiple Oracle data sources and offload data from their tables, pumping it into Couchbase.

The topology adopted is the one represented in the diagram below.

The new architecture after the introduction of Couchbase and GlueSync

In the HQ datacenter we deployed one GlueSync node per Oracle data source (5 in total), next to the Couchbase cluster and on the same Kubernetes infrastructure, so that we can orchestrate the connector nodes as well.

Each connector node connects to the corresponding Oracle Database instance installed in the remote datacenter at the client's warehouse, over a site-to-site VPN tunnel running on a WAN fiber connection.

What if the network goes down?

Some may ask what happens when the VPN tunnel or the WAN goes down: no worries at all. The connector is designed to be resilient from the network point of view: to survive this kind of failure, it stores a checkpoint of the last transaction consumed and properly synced, in both directions.
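To give you an intuition of how checkpoint-based resumption works in general (this is an illustrative sketch, not GlueSync's actual implementation), the idea is to persist the position of the last synced transaction atomically, and on restart skip everything at or before it:

```python
import json
import os
import tempfile

class CheckpointStore:
    """Persists the position of the last transaction successfully synced,
    so a connector can resume from there after a network outage."""

    def __init__(self, path):
        self.path = path

    def save(self, position):
        # Write to a temp file and rename: a crash mid-write
        # never leaves a corrupted checkpoint behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"position": position}, f)
        os.replace(tmp, self.path)

    def load(self):
        # No checkpoint yet: start from the beginning.
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)["position"]

def sync(events, store):
    """Consume (position, payload) change events, checkpointing after
    each one; on restart, skip events already synced before the outage."""
    resume_from = store.load()
    applied = []
    for position, payload in events:
        if resume_from is not None and position <= resume_from:
            continue  # already synced before the outage
        applied.append(payload)
        store.save(position)
    return applied
```

If the connection drops after event 2 and the connector restarts, replaying the full stream only applies what was missed, so no transaction is lost or duplicated.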

Achieving the near real-time goal

Adopting an approach that reacts to events, instead of scheduling imports, achieves the goal of having data available on the other side as soon as a change is captured. GlueSync uses native Oracle Database functionality to get notified about changes made inside the tables, per transaction and at row level: the XStream API, part of the Oracle Database suite.
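Conceptually, each notification describes one row-level operation that gets applied to the target store. The sketch below models this with plain Python dictionaries; the event shape is loosely inspired by the logical change records that XStream emits, but the field names are hypothetical, not the real XStream types:

```python
def apply_change(store, event):
    """Apply a single row-level change event to a key-value document store.

    `event` is a hypothetical change record with the fields:
    table, row_id, op (INSERT/UPDATE/DELETE), and values (column -> value).
    """
    key = f"{event['table']}::{event['row_id']}"
    op = event["op"]
    if op == "INSERT":
        store[key] = dict(event["values"])
    elif op == "UPDATE":
        # Merge only the changed columns into the existing document.
        store.setdefault(key, {}).update(event["values"])
    elif op == "DELETE":
        store.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")
    return store
```

Because each event carries only the affected row, the target stays current without ever re-reading whole tables, which is exactly what the nightly batch jobs had to do.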

Transferring data from Oracle to Couchbase

Now that you know how it works under the hood, I'll describe one data model from this use case (of course, every sensitive piece of information has been omitted for privacy's sake).

An example of the table “Orders” converted via GlueSync to a Couchbase-friendly JSON document
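To make the table-to-document conversion concrete in text form, here is a hypothetical sketch of the kind of mapping involved: flat relational columns become JSON fields, the primary key becomes part of the document ID, and non-JSON types such as dates are serialized as strings. Column and table names here are invented for illustration, not the client's real "Orders" schema:

```python
import json
from datetime import date

def row_to_document(table, row, key_column):
    """Convert a flat relational row (column -> value) into a
    Couchbase-friendly JSON document, keyed by table and primary key."""
    doc = {
        "type": table.lower(),  # document type, handy for filtering in queries
        **{col.lower(): val for col, val in row.items()},
    }
    # Dates are not native JSON: serialize them as ISO-8601 strings.
    for k, v in doc.items():
        if isinstance(v, date):
            doc[k] = v.isoformat()
    doc_id = f"{table.lower()}::{row[key_column]}"
    return doc_id, json.dumps(doc, sort_keys=True)
```

A row like `ORDER_ID=1001, CUSTOMER='ACME', ORDER_DATE=2021-05-03` would become a document with ID `orders::1001` and a JSON body carrying the same fields in lowercase.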

Let’s build on top

A step into the future has been made: data now flows from the different Oracle Database locations and is collected in near real-time inside the central Couchbase cluster in the HQ datacenter. This way, Tableau can be consulted throughout the day, confident that fresh data is available for the business to count on.

With data available in Couchbase, many scenarios can now be pursued. We previously spoke about the app for the workforce, right? So wouldn't it be interesting to bring data to mobile devices and sync it back to the ERP? Nothing easier than that.

But that's a story for another use case.

See you on the next article

We only know a fifth of what lives in the depths. Everything else is left to explore.

coderspace is coming.