Editor's note: This is a continuation of Evgeniy Shadrin's introduction to Tarantool. The first part can be seen here, and you may need to refer back to it in order to understand the concepts presented in this part (although the main code example is duplicated further down here).
Tarantool can be run in two ways:
As an interpreter: just run tarantool and execute commands line by line. This comes in handy if you’re unfamiliar with some command and simply want to execute it to see what it does.
Via a startup script: put your commands into a file such as init.lua (the name is arbitrary, you may pick whatever you like) and run tarantool init.lua. This launches a startup script containing a series of commands.
Let’s study the startup script I provided in Part 1 in more detail.
It all starts with configuring the database via box.cfg. Here, box is a module that contains a configurable cfg table. This module’s responsibility is working directly with the database. You can run Tarantool, execute some procedures or functions, and print some messages, but you won’t be able to run the database without configuring box.cfg. In my example, I specified two important parameters: a logging level of 5 (DEBUG) and a slab_alloc_arena of 1 GB, which is the amount of RAM allocated for my data (as mentioned earlier, you may need to adjust this depending on your system).
The box module contains a lot of other useful things, such as:
box.info: A library that displays general information about Tarantool.
box.slab: An important table for monitoring memory capacity.
box.stat: A statistical library that shows the number of insert, select, and other operations you performed.
If you type box.cfg in the Tarantool interpreter after specifying all of the necessary parameters, you’ll get an object with all of the available parameters described: not only those that I specified explicitly but also the default ones.
On the slide above, you can see the two parameters I specified, slab_alloc_arena (RAM capacity of 1 GB) and log_level (5, or DEBUG), along with some other important parameters like snapshot_count, which defines how many snapshots Tarantool should store. In this case, the six latest snapshots are saved. By the way, snapshot periods are regulated by a parameter called (you guessed it!) snapshot_period. It defaults to 3,600 seconds, which means that Tarantool will take snapshots hourly. Choosing the appropriate period is up to you: you can configure Tarantool to take snapshots every minute or even every second, but this will severely affect its overall performance. As for snap_dir and wal_dir, these parameters determine where to store your snapshots and transaction logs, respectively.
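For reference, the snapshot-related parameters mentioned above can also be set explicitly in box.cfg. The fragment below is only a sketch: it uses the parameter names from this article's Tarantool version (newer releases renamed several of them, so check the documentation for your version), and the two directory paths are made-up examples that would have to exist before startup.

```lua
-- Sketch only: parameter names as used in this article's Tarantool version.
box.cfg {
    log_level = 5,             -- DEBUG
    slab_alloc_arena = 1,      -- 1 GB of RAM for data
    snapshot_period = 3600,    -- take a snapshot every hour
    snapshot_count = 6,        -- keep the six latest snapshots
    snap_dir = './snapshots',  -- hypothetical directory for snapshot files
    wal_dir = './xlogs',       -- hypothetical directory for transaction logs
}
```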
The slide above illustrates the use of the box.info module. Here, you can get general information about Tarantool. If it’s being run as a daemon, you can obtain its PID, version, uptime, and current status.
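As a small illustration, the fields mentioned above can be read directly from box.info. This sketch assumes box.cfg has already been called in the running instance; the exact set of fields may differ between Tarantool versions.

```lua
-- Assumes box.cfg{} has already been called in this instance.
local info = box.info
print('pid:     ' .. tostring(info.pid))
print('version: ' .. tostring(info.version))
print('uptime:  ' .. tostring(info.uptime) .. ' s')
print('status:  ' .. tostring(info.status))
```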
Now that the configuration is over, we can turn to creating entities, or data itself inside Tarantool.
The slide above displays the image from the official documentation that details Tarantool’s data model. All data is stored in spaces, each having entities called tuples (analogous to records in a relational database), as well as primary and secondary indexes.
So basically, I need a space to store all of my user data.
As you may have noticed, I’m creating a space inside of an if statement, and I’m doing so on purpose. Suppose your Tarantool instance was stopped for some reason. If you have snapshots and xlogs saved and you relaunch Tarantool, it will first take the latest snapshot and perform the operations contained in the latest xlog, thus restoring its state. If that’s the case, the users space already exists, so Tarantool won’t let you create it again (and you don’t need to anyway). That is why you’ll often see such if statements: they allow you to avoid unnecessary errors. So if you don’t have a users space, it will get created, along with an index. In my example, it’s a primary tree index on a single numeric field.
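An alternative to the explicit if check is the if_not_exists option, which both box.schema.space.create and create_index accept. The following is a sketch of that variant, not the script from the talk; it assumes box.cfg has already been called and keeps the 'NUM' field type used in this article.

```lua
-- Sketch: the same setup without the explicit `if`.
-- Assumes box.cfg{} has already been called.
local users = box.schema.space.create('users', { if_not_exists = true })
users:create_index('primary', {
    type = 'tree',
    parts = {1, 'NUM'},    -- field type name as used in this article
    if_not_exists = true,  -- do nothing if the index is already there
})
```

Running this twice in a row should not raise an error, because both calls skip objects that already exist.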
Further down in the script, I need to add new user records. This can be done with a regular insert operation where a key-value pair is passed, but in this case, it’s achieved more easily with auto_increment: when a new user visits the page, they’re automatically assigned a key that’s equal to the largest existing key plus one, which, as long as nothing is deleted, is the current number of database records plus one (editor’s note: sequences have replaced auto_increment as the preferred way to accomplish this). And if I need to know how many records I have in my database, I can use the built-in len() function. As you can see, the Tarantool syntax is quite simple and clear.
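To give an idea of what the sequence-based approach from the editor's note might look like, here is a minimal sketch. It assumes a newer Tarantool version with sequence support, that box.cfg has been called, and that the users space with its primary index exists and its keys are generated only by this sequence. The names user_id_seq and add_user are made up for illustration.

```lua
-- Sketch for Tarantool versions that support sequences.
-- 'user_id_seq' and add_user are hypothetical names.
local seq = box.schema.sequence.create('user_id_seq', { if_not_exists = true })

-- Insert a new user and return the generated ID.
local function add_user(ip)
    -- seq:next() returns the next value of the sequence
    local tuple = box.space.users:insert({ seq:next(), ip })
    return tuple[1]
end
```

Calling add_user('10.0.0.1') and then add_user('10.0.0.2') should return two consecutive IDs.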
As I mentioned earlier, Tarantool isn’t just a database but also a full-blown Lua application server. What the developers intended is for you to write your own modules and packages in Lua to implement any missing logic that you need. You won’t need to reinvent one large wheel but rather a few small ones if they are really necessary or if other solutions don’t have what you’re looking for.
Tarantool also supports LuaRocks, a package manager that works with its own repository and makes installing packages a breeze: it’s done with just one command. You can find details about the various packages in the GitHub repositories. The packages that are used most often are http and queue.
Let’s talk a bit more in detail about packages now. The first thing to remember about them is that they need to be loaded.
A package is just another Lua script containing some logic. By loading a package, you can use the methods, data, and variables defined in it. On the slide above, I’m loading two packages (console and log) via Lua’s require mechanism.
I’m launching the console on localhost and having it listen on port 33013. Using the log package, I can write to a log. The console in this context is an admin console or a remote control console that allows the monitoring of Tarantool’s state. It’s not that tricky to do: if you have your console running, you can use standard Unix utilities such as telnet or rlwrap. telnet is used for connecting to the port, while rlwrap comes in handy when entering commands and saving command history.
You can connect to a Tarantool instance that’s currently running and get some information from box.info or box.stat.
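For example, once connected to the console, the operation counters can be read from box.stat. The lines below are a sketch that assumes box.cfg has been called; box.stat() returns a table with one entry per operation type, each holding a running total and a requests-per-second value.

```lua
-- Assumes box.cfg{} has already been called.
local stat = box.stat()
print('inserts so far: ' .. tostring(stat.INSERT.total))
print('selects so far: ' .. tostring(stat.SELECT.total))
print('selects/second: ' .. tostring(stat.SELECT.rps))
```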
The package that I use most often is http. It’s an HTTP server with limited functionality, but it has many useful mechanisms. On the slide above, I’m loading the package, creating a server and a route, and then launching that server. After that, the handler function is returning a server response with text, and I’m assigning a cookie to a user (name = 'tarantool_id') and setting the value to ID (value = id). I’m also specifying an expiration date for cookies to get deleted; in my example, cookies are stored for one year.
http’s main mechanisms allow you to implement basic logic because the package provides both a full-fledged server and a client. http works with cookies and supports Lua as an embedded language, i.e. it can be used for variables inside of a template. This means that you can write little Lua procedures right inside of your HTML.
#!/usr/bin/tarantool
-- Tarantool init script
local log = require('log')
local console = require('console')
local server = require('http.server')
local HOST = 'localhost'
local PORT = 8008
-- Configure the database: DEBUG logging and 1 GB of RAM for data
box.cfg {
    log_level = 5,
    slab_alloc_arena = 1,
}
-- Start the admin console
console.listen('127.0.0.1:33013')
-- Create the users space and its primary index only if they don't exist yet
if not box.space.users then
    local s = box.schema.space.create('users')
    s:create_index('primary',
        {type = 'tree', parts = {1, 'NUM'}})
end
I went over the basics of my example script, so it should make more sense to you now. To make sure you have it down, let’s briefly review it once again. What we have is an executable Lua script with a comment on top. First, I’m loading packages via require. Then, I’m declaring two variables, HOST and PORT. After that, I’m configuring the Tarantool database via box.cfg, where I’m specifying two parameters: log_level (logging level) and slab_alloc_arena (RAM capacity).
I’m creating an admin console that I’ll be using further down in the script. Then, if I don’t have a users space, I’m creating it with box.schema.space.create and I’m also setting an index on it.
function handler(self)
    -- Read the visitor's cookie and IP address
    local id = self:cookie('tarantool_id')
    local ip = self.peer.host
    local data = ''
    log.info('Users id = %s', id)
    if not id then
        -- New visitor: store the IP address and set a cookie with the new ID
        data = 'Welcome to Tarantool server!'
        box.space.users:auto_increment({ip})
        id = box.space.users:len()
        return self:render({ text = data }):
            setcookie({ name = 'tarantool_id', value = id, expires = '+1y' })
    else
        -- Returning visitor: show the ID and the total number of users
        local count = box.space.users:len()
        data = 'Your id is ' .. id .. '. We have ' .. count .. ' users'
        return self:render({ text = data })
    end
end
-- Create the server, register the route, and start listening
httpd = server.new(HOST, PORT)
httpd:route({ path = '/' }, handler)
httpd:start()
In the handler function, I’m receiving cookies from visitors to my page. I’m looking up the visitor’s IP address and writing their ID to my log. If there is no tarantool_id cookie, I’m adding their IP address to my database with auto_increment, looking up their ID, and returning a welcome message; the cookie value gets set to the visitor’s ID (value = id). Otherwise, I’m counting how many records I have in my database and showing the visitor their ID along with the number of unique visitors. At the bottom of my script, after the function declaration, I’m creating the server, registering the route, and starting it.
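One possible refinement, shown here only as a sketch: auto_increment returns the tuple it has just inserted, so the new ID can be taken from that tuple instead of from a separate len() call. The helper name register_visitor is made up; the code assumes the users space from the script above.

```lua
-- Sketch: take the new ID from the tuple that auto_increment returns.
-- register_visitor is a hypothetical helper, not part of the original script.
local function register_visitor(ip)
    local tuple = box.space.users:auto_increment({ ip })
    return tuple[1]  -- the first field is the generated primary key
end
```

Inside handler, the two lines that call auto_increment and len() could then be replaced with id = register_visitor(ip).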
It’s a relatively simple example, but given all the modules and Lua’s extensibility, it can iteratively be improved upon until it’s fit to be used in real-life projects.
Tarantool has lots of different packages. For example, there’s one for working with JSON, there’s a package called fiber (more details later), a YAML package, and a cryptographic library called digest (which contains basic cryptographic mechanisms such as hashing). Tarantool also has a package for non-blocking sockets, so you can work over the network and implement various protocols. In addition, there’s a package that allows you to work with MessagePack and a library called fio (file input/output) for handling files. One particularly interesting mechanism is net.box, which enables Tarantool to communicate using a binary protocol, for example with another Tarantool instance. It’s very fast and convenient.
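To give a feel for two of these built-in packages, here is a short sketch that round-trips a Lua table through JSON and computes a hash with digest. The sample data is invented for illustration.

```lua
local json = require('json')
local digest = require('digest')

-- Encode a Lua table to a JSON string and decode it back.
local encoded = json.encode({ ip = '127.0.0.1', visits = 3 })
local decoded = json.decode(encoded)
print(decoded.ip, decoded.visits)

-- Hex-encoded MD5 hash of a string.
local hash = digest.md5_hex('tarantool')
print(hash)
```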
Fibers are lightweight threads based on the green thread model. The main difference between fibers and regular threads is that fibers are created and work inside Tarantool, so it takes very little time to create them and they also have a fairly low switch time. Fibers can come in handy if you’re implementing an asynchronous model or if you need to launch a daemon that performs some side task in parallel with the main one.
There are some basic principles to keep in mind when working with fibers: a fiber is created with fiber.create; it can be put into wait mode with fiber.sleep; and a fiber_object can always be canceled if you want to stop using it.
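These three operations can be sketched as follows. The worker function and the ticks counter are made up for illustration; the timings are arbitrary.

```lua
local fiber = require('fiber')

local ticks = 0  -- counter updated by the background fiber

-- Background task: increment the counter, then yield by sleeping.
local function worker()
    while true do
        ticks = ticks + 1
        fiber.sleep(0.01)
    end
end

local fiber_object = fiber.create(worker)  -- the worker starts right away
fiber.sleep(0.05)                          -- let the worker run a few times
fiber_object:cancel()                      -- ask the fiber to stop
fiber.sleep(0.01)                          -- give it a chance to exit
print('ticks: ' .. ticks)
```

Because fibers are cooperative, the worker only gives up control when it calls fiber.sleep, which is also the point where the cancellation takes effect.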
fiber.time is a handy function that returns the current time (as a Lua number), taken from the event loop’s clock.
A very popular library built with the fiber library is expirationd, which, based on some predefined criteria (usually it is time), deletes records from your database. For example, you could use it to remove everything older than one month.
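The idea behind such a library can be sketched with a plain fiber. To be clear, this is not expirationd's API, and the real library does considerably more; it is a hand-rolled illustration that assumes a space whose tuples look like {id, created_at}, with created_at stored as a Unix timestamp. The names remove_expired and start_cleanup are hypothetical.

```lua
local fiber = require('fiber')

local MAX_AGE = 30 * 24 * 60 * 60  -- one month, in seconds

-- Delete tuples whose second field (a creation timestamp) is too old.
-- Returns the number of deleted tuples.
local function remove_expired(space, now)
    local expired = {}
    -- Collect the keys first, then delete, to avoid modifying the
    -- space while iterating over it.
    for _, tuple in space:pairs() do
        if now - tuple[2] > MAX_AGE then
            table.insert(expired, tuple[1])
        end
    end
    for _, key in ipairs(expired) do
        space:delete(key)
    end
    return #expired
end

-- Run the cleanup once a minute in a background fiber.
local function start_cleanup(space)
    return fiber.create(function()
        while true do
            remove_expired(space, fiber.time())
            fiber.sleep(60)
        end
    end)
end
```

With a suitable space in place, the background task would be started with something like start_cleanup(box.space.sessions), where sessions is a made-up space name.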
I could go on and on about Tarantool, but I don’t know all there is to know about it. You can always check the official documentation at tarantool.io.
Tarantool supports most Unix-like systems and at Sberbank Digital Ventures, we constantly keep an eye out for new packages since we have Red Hat Enterprise Linux installed on our machines. The developers also maintain the official Tarantool package shipped with Debian.
One thing I like a lot about Tarantool is that you can contact the Tarantool dev team. I had some questions, so I simply found some members via Skype or Telegram and pinged them. And Konstantin Osipov, the principal Tarantool developer, gave a short talk on queues at this conference. Developers, especially those new to the field, find it very important to be able to ask questions and learn firsthand the best approaches to particular problems. Of course, you need to be prepared for the fact that the open-source community can be quite peculiar. Perhaps this image will tell you more than I’d ever be able to:
At the same time, interacting with community members can be an exciting experience that helps you grow and makes your projects a little better.
I’d like to wrap up this piece by sharing a few takeaways with you.
Each NoSQL solution has its own application. It’s often very difficult to say which database is better or worse, or more or less performant. They are just different and usually were created for solving different problems.
And development tools are extremely important: if chosen well, they allow you to speed up and simplify the development process and avoid lots of unnecessary problems. But you shouldn’t forget about what is most important: your ideas and end goal. After all, every developer’s objective is to solve the problem at hand, bring ideas to life and make the world a slightly better place.
Well, I hope I’ve managed to persuade you that Tarantool isn’t that complicated and that you can start using it easily. Should you have any questions, you can contact the Tarantool team directly here.