Using SQL Server PolyBase for External Data Access
By Tom Nonmacher
SQL Server PolyBase is a data virtualization feature that was introduced in SQL Server 2016 and considerably enhanced in SQL Server 2019. It allows SQL Server to run queries on external data in Hadoop or Azure Blob Storage. Since SQL Server 2019, it can also read data from Oracle, Teradata, and MongoDB through dedicated connectors, and from other systems such as IBM DB2 through a generic ODBC connector. In each case the external data is exposed in a SQL Server database by creating an external table, so it can be queried in place rather than copied first.
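Before any of the examples that follow will work, the PolyBase feature has to be installed and enabled on the instance. A minimal sketch of that check, assuming a SQL Server 2019 instance and sufficient permissions to change server configuration:

```sql
-- Returns 1 if the PolyBase feature is installed on this instance, 0 otherwise.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;

-- Enable PolyBase at the instance level (needed once per instance).
EXEC sp_configure @configname = 'polybase enabled', @configvalue = 1;
RECONFIGURE;
```

If the first query returns 0, the feature needs to be added through SQL Server setup before it can be enabled.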
PolyBase uses the T-SQL language to create an abstraction layer over the external data, allowing users to interact with it as though it were in a local SQL Server database. The external data can be queried, and the results combined with data in the SQL Server instance. To illustrate this, here's a simple T-SQL query using PolyBase:
SELECT * FROM EXTERNAL_TABLE_NAME
WHERE Column_Name = 'Your Value';
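Combining external results with local data is an ordinary join. The sketch below assumes a local table dbo.Orders and an external table dbo.ExternalCustomers; both names and all column names are hypothetical and used only for illustration.

```sql
-- dbo.Orders is assumed to be a regular local table.
-- dbo.ExternalCustomers is assumed to be an external table defined through PolyBase.
SELECT o.OrderID,
       o.OrderDate,
       c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.ExternalCustomers AS c
    ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '2024-01-01';
```

From the query's point of view the external table behaves like any other table; SQL Server decides how much of the work to send to the remote system.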
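To use PolyBase with a MySQL database, you need to create an external data source. MySQL 8.0 is a popular open-source database system, but PolyBase has no dedicated MySQL connector, so the connection goes through the generic ODBC connector available since SQL Server 2019. This requires a MySQL ODBC driver installed on the SQL Server machine; the driver name below must match the one actually installed. The external data source would look something like this:
CREATE EXTERNAL DATA SOURCE MySQLDataSource
WITH (
LOCATION = 'odbc://MySQLServer:3306',
CONNECTION_OPTIONS = 'Driver={MySQL ODBC 8.0 Unicode Driver}',
CREDENTIAL = MySQLCredential,
PUSHDOWN = ON
);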
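The CREDENTIAL option refers to a database scoped credential, which has to exist before the data source is created. A minimal sketch of creating it follows; the login name and both secrets are placeholders, not values from this article.

```sql
-- A database master key is needed once per database to protect credential secrets.
-- Skip this statement if the database already has one.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongMasterKeyPassword>';

-- IDENTITY and SECRET are the user name and password on the remote MySQL server
-- (placeholder values shown here).
CREATE DATABASE SCOPED CREDENTIAL MySQLCredential
WITH IDENTITY = 'mysql_user',
     SECRET = '<mysql_password>';
```

The same pattern applies to the other credentials named in this article, with the identity and secret of the respective remote system.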
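For IBM DB2, the process is similar. IBM DB2 11.5 can be used with SQL Server PolyBase in the same way, through the generic ODBC connector and an IBM ODBC driver installed on the SQL Server machine. Just like in the MySQL example, you would create an external data source that points to your DB2 database. The driver name shown here is an assumption and must match the driver registered on your system.
CREATE EXTERNAL DATA SOURCE Db2DataSource
WITH (
LOCATION = 'odbc://Db2Server:50000',
CONNECTION_OPTIONS = 'Driver={IBM DB2 ODBC DRIVER}',
CREDENTIAL = Db2Credential,
PUSHDOWN = ON
);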
Azure SQL and Azure Synapse are cloud-based data platforms that can also be integrated with SQL Server PolyBase. Azure SQL is the managed cloud offering of the SQL Server engine, while Azure Synapse is an analytics service that combines data warehousing with big data analytics. For Azure SQL, PolyBase uses its SQL Server connector, identified by the sqlserver:// prefix. The external data source would look like this:
CREATE EXTERNAL DATA SOURCE AzureSQLDataSource
WITH (
LOCATION = 'sqlserver://AzureSQLServer.database.windows.net:1433',
CREDENTIAL = AzureSQLCredential,
PUSHDOWN = ON
);
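With the data source in place, an external table maps a remote table into the local database. The following is a sketch only: the remote database SalesDb, the table dbo.Customers, and its columns are hypothetical, and the column types and collations must match the remote definition.

```sql
-- For the SQL Server connector, LOCATION is the three-part name of the remote object.
CREATE EXTERNAL TABLE dbo.ExternalCustomers (
    CustomerID   INT           NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL
)
WITH (
    LOCATION = 'SalesDb.dbo.Customers',   -- hypothetical remote database.schema.table
    DATA_SOURCE = AzureSQLDataSource
);

-- Query the external table; the hint asks SQL Server to evaluate the filter remotely.
SELECT CustomerID, CustomerName
FROM dbo.ExternalCustomers
WHERE CustomerID = 42
OPTION (FORCE EXTERNALPUSHDOWN);
```

With PUSHDOWN = ON the optimizer normally decides on its own whether to push computation to the remote side. The FORCE EXTERNALPUSHDOWN hint (and its counterpart DISABLE EXTERNALPUSHDOWN) is mainly useful when comparing the two behaviors.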
SQL Server PolyBase provides a unified and straightforward way to access external data with T-SQL, which makes it a valuable tool for database administrators and developers alike. By using PolyBase in SQL Server 2019, you can integrate data from various sources like MySQL 8.0, DB2 11.5, Azure SQL, and Azure Synapse into your data solutions.