Friday, November 04, 2016

Separation of Query Server and Data Server

In MySQL we have separated the Query Server and the Data Server
functionality. The Query Server is what takes care of handling the
SQL queries and maps those to lower layers call into the Data Server.
The API to the Data Server in MySQL is the storage engine API.

Almost all DBMS have a similar separation between Data Server and Query
Server.

When I performed my research work, that later led to development of
NDB Cluster, that is now MySQL Cluster, an important part of the
research was to handle the separation of the Data Server and
the Query Server.

As part of my research we looked deeply into the next generation mobile
networks and their use of network databases. From these studies it was
obvious that the traffic part of the applications almost always made very
simple queries, mostly key lookups and in some cases slightly more complex
queries were used.

At the same time there is also management applications in the telecom
network, these applications will often use more complex queries and will
almost always use some sort of standard access method such as SQL.

Normally the traffic queries have strict requirements on response times
whereas the management applications have much less strict requirements
on response times.

So it was clear that there was a need for both a fast path to the data
as well as standard APIs used for more complex queries and for data
provisioning.

From this it was clear that it was desirable to have a clearly defined
Data Server API in a telecom DBMS to handle most of the traffic queries.

Now the next question that came up was the placement of the API. There were
a number of alternatives. One method was to have an API such that the
application can execute direct function calls into the Data Server. A
number of databases uses this approach. The reason for avoiding this
approach is the high availability requirement. If the application gets
a wrong pointer and writes some data out of place, then it can change
the data inside the DBMS. Another problem with this approach is that it
becomes very difficult to manage in a shared nothing architecture since
the application will have to be colocated with its data for the benefits
to be of any value.

So choosing a network-based protocol as the Data Server API was a choice
to ensure the highest level of reliability of the DBMS and its applications.
We have a clear place where we can check the correctness of the API calls
and only through this API can the application change the data in the Data
Server.

Another reason that made it natural to choose a network protocol as API was
that the development of technologies for low-latency and high bandwidth had
started already in the 1990s. The first versions of NDB Cluster actually had
an SCI transporter as its main transporter which ensured that communication
between computers could happen in microseconds. TCP/IP sockets have since
replaced it since also SCI and Infiniband now have support for TCP/IP sockets
that is more or less as fast as direct use of SCI and Infiniband.

One more reason for using a network protocol as API is that it enables us to
build very scalable Data Servers.

To make it easier to program applications we developed the C++ NDB API that
is used to access to the NDB Data Server protocol.

The marriage between MySQL and NDB Cluster was a natural one since NDB Cluster
had mainly focused on the Data Server parts and by connecting to the MySQL
Server we had a natural implementation of the Query Server functionality.

The requirements on extremely fast failover times in telecom DBMSs made it
necessary to implement NDB Cluster as a shared nothing DBMS. So effectively
the Data Server API has support for storing relational tables in a shared
nothing architecture. The methods available in the NDB Data Server API is
methods for key-value access for read and write, scan access using full
table scan and ordered index scans. In 7.2 we also added some functionality
to pushdown join execution into the Data Server API.

To decrease the amount of interaction we also added an interpreter to the
NDB Data Server, this can be used for simple pushdown of filters, it can
be used to perform simple update operations (such as increment a value)
and it can handle LIKE filters.

So what have been the benefits and disadvantages of these architecturial
choices been over the years.

One advantage is that MySQL Cluster can be used for many different things.

One important use case is what it was designed for, there are many good
examples of applications written against any of the direct NDB APIs to
serve telecom applications, financial applications and web applications
while still being able to access the same data through an SQL interface
in the MySQL Server. These applications takes advantages of the
performance advantage that make it possible to scale applications to
as much as hundreds of millions of operations per second.

Another category in poular use with MySQL Cluster is to implement an
LDAP server as the Query Server on top of the NDB Data Server. This would
have been very difficult using a Query Server API since the Query Server
adds a very significant overhead to simple requests.

The latest example of use cases is to use the Data Server to implement
a scalable file system. This has been implemented by HopsFS in replacing
the Name Server in Hadoop HDFS with a set of Name Servers that use a
set of NDB Data Servers to store the actual metadata. Most people that
hear that such an architecture is built on something with MySQL in the
name will immediately think of the overhead in using SQL interface to
implement a file system. But obviously it isn't the SQL interface which
is used, it is implemented directly on top of the Java implementation
of the NDB API, ClusterJ. Personally I also have a hobby project that
I play around every now and then for the last 10 years that will
implement a Linux filesystem on top of NDB Cluster using a new NDB API,
a new NDB management server and using the FUSE API to implement the
file system in userspace.

The final category obviously is the use of MySQL Cluster with the
SQL interface. There are many applications that use this to get to the
high availability of MySQL Cluster but still using the SQL interface.

The disadvantages is obviously that DBMSs that have a more direct
API between the Query Server and the Data Server will get benefits
in that they don't have to go over a network API to access its data.
With NDB Cluster you pay this extra cost to get higher availability,
more flexible access to your data and higher scalability.

We have worked very hard since the inception of NDB Cluster into MySQL
to ensure that the performance of SQL queries is as close as possible
to the colocated Query Server and Data Server APIs. We've gotten very
close and we are working on getting even closer.

At the same time by separating the Data Server and the Query Server we
have made it possible to work on parallelising some queries in an easy
manner. This makes the gap much smaller and for many complex queries
NDB will even outperform local storage engines.

No comments: