Getting Started with Trino Query Engine | by Angelica Lo Duca – Towards Data Science
Trino is a distributed, open source SQL query engine for Big Data analytics. Because it splits each query into tasks that run in parallel across several machines, it can be very fast. Trino runs both on-premise and in cloud environments, such as Google, Azure, and Amazon.
In this tutorial, I describe how to install Trino locally, connect it to a MySQL database (provided by XAMPP) and connect a simple Python client to it. The official Trino documentation can be found at this link.
Typically Trino is composed of a cluster of machines, with one coordinator and many workers. All the workers connect to the coordinator, which provides the access point for the clients.
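Since the coordinator is the only access point, a Python client needs just the coordinator's address. The following is a minimal sketch, assuming the `trino` Python package is installed (`pip install trino`) and a coordinator is listening on `localhost:8080`. The user name, catalog, and schema values are hypothetical placeholders, not values taken from this tutorial.

```python
def coordinator_params(host="localhost", port=8080, user="admin",
                       catalog="mysql", schema="test"):
    """Collect the connection settings for the coordinator.

    All defaults are illustrative assumptions; adapt them to your setup.
    """
    return {
        "host": host,        # address of the coordinator, not of a worker
        "port": port,        # HTTP port the coordinator listens on (assumed)
        "user": user,        # any user name works when no auth is configured
        "catalog": catalog,  # name of a catalog file under etc/catalog
        "schema": schema,    # hypothetical schema (a MySQL database name)
    }


def run_query(sql, params=None):
    """Send one SQL statement to the coordinator and return all rows."""
    # Imported lazily so the helper above works without the package installed.
    from trino.dbapi import connect

    conn = connect(**(params or coordinator_params()))
    try:
        cur = conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.close()
```

With a running server, `run_query("SHOW CATALOGS")` should return one row per configured catalog. The client never talks to the workers directly; the coordinator plans the query and distributes the work.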
Before installing Trino, I should make sure that I am running a 64-bit machine. Then I can proceed with the installation of Python and Java.
Once I have installed the previous requirements, I can download the Trino server and unpack it. Before using Trino, I must configure it.
Within the unpacked directory, I create another directory, called etc, which will contain all the configuration files. There are three main configuration files.
The etc folder should also contain another folder, called catalog, which contains the list of all data sources (i.e. connectors). The full list of available connectors can be found at this link.
A minimal configuration for the server is the following:
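As a rough sketch, the directory layout described above could be generated with a short Python script. It assumes a single-node setup in which the coordinator also acts as a worker, and it uses the file names from the Trino documentation (`node.properties`, `jvm.config`, `config.properties`). Every value below (the port, the environment name, the data directory, the memory size, and the MySQL credentials) is an illustrative assumption and should be adapted to the local installation.

```python
import uuid
from pathlib import Path


def write_trino_config(install_dir):
    """Create etc/ and etc/catalog/ with a minimal single-node configuration.

    Returns the sorted list of files that were written.
    """
    etc = Path(install_dir) / "etc"
    catalog = etc / "catalog"
    catalog.mkdir(parents=True, exist_ok=True)

    files = {
        # Identity of this node; the data directory is an assumed path.
        etc / "node.properties": (
            "node.environment=test\n"
            f"node.id={uuid.uuid4()}\n"
            "node.data-dir=/tmp/trino/data\n"
        ),
        # JVM options, one per line; the heap size is an assumption.
        etc / "jvm.config": (
            "-server\n"
            "-Xmx4G\n"
            "-XX:+UseG1GC\n"
        ),
        # The same machine is both coordinator and worker.
        etc / "config.properties": (
            "coordinator=true\n"
            "node-scheduler.include-coordinator=true\n"
            "http-server.http.port=8080\n"
            "discovery.uri=http://localhost:8080\n"
        ),
        # One catalog file per data source; assumes a local MySQL server
        # with user root and an empty password.
        catalog / "mysql.properties": (
            "connector.name=mysql\n"
            "connection-url=jdbc:mysql://localhost:3306\n"
            "connection-user=root\n"
            "connection-password=\n"
        ),
    }
    for path, text in files.items():
        path.write_text(text)
    return sorted(files)
```

Calling `write_trino_config("path/to/unpacked/trino")` creates the files in place. The catalog file name (here `mysql`) becomes the catalog name that clients use in their queries.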