Field report on upgrading a rocket v0.4 application to v0.5
The application in question is my first foray into the Rust language: a very simple, CRUD-style web application. It consists of:
- a page with a list of entries in a database, sorted by fixed criteria for now
- a single JSON API endpoint to retrieve that same list, with optional filters
- a form to add an entry to the list
- a nearly static about page
- two tasks started via cron jobs, one running every few minutes and the other once a day
The rocket framework appealed to me because it offered semantics similar to flask for python. It comes with several options for templating and state (aka database) integrations. I had picked tera templates, as they are near identical to the jinja2 templates used all over python applications (e.g. in pelican, ansible, flask, ...), and diesel for accessing an sqlite database. The rocket framework was already at version 0.4 when I started out with the project in 2020. It used a synchronous version of hyper, a low-level HTTP library, to provide a multi-threaded HTTP server. It lets you compile the entire application into a single statically linked binary, excellent for running in a small "FROM scratch" container image.
In June 2021, the rocket 0.5 release candidate was published, and over the quieter days at the end of 2021 / beginning of 2022, I dug into upgrading the app. Let's start with the best bits about this process and the end result:
- I didn't need to make any changes to the templates, the CSS, or the database structures and queries. The rocket routing mechanism worked the same way as before.
- rocket 0.5 now compiles on the stable Rust compiler, where 0.4 required the nightly compiler (meaning it used some unstable language features). It was rewritten to use asynchronous functions, which allows it to use the latest hyper library, which in turn leverages that mechanism on top of the multi-threaded runtime provided by tokio, an implementation of the green-threads concept in Rust (a minimal route is sketched below).
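To give an idea of what that looks like, here is a minimal sketch of a rocket 0.5 route rendering a tera template. It assumes the rocket_dyn_templates crate with its tera feature enabled; the route and template names are illustrative rather than taken from my application:

```rust
#[macro_use]
extern crate rocket;

use rocket_dyn_templates::{context, Template};

// Handlers can now be plain async fns on stable Rust; rocket awaits them
// on the tokio runtime it manages internally.
#[get("/about")]
async fn about() -> Template {
    Template::render("about", context! { title: "About" })
}

#[launch]
fn rocket() -> _ {
    rocket::build()
        .mount("/", routes![about])
        .attach(Template::fairing())
}
```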
For this particular application, the average latency was reduced from 67ms to 60ms (as monitored by a zabbix proxy running on a host attached to the same subnet) and the daily cron job duration dropped from 52s, using multi-threading, to 22s, using async and the tokio runtime (the earliest, fully linear implementation of that task took 20 minutes).
While I am happy with these clear improvements, they did come at some cost:
- In the end I spent about 7 workdays on this upgrade, spread over 2 weekends and several evenings - git tells me that this consisted of 6188 additions and 4231 deletions, including all the license changes.
- The switch to async/await in both rocket and hyper meant that many formerly synchronous functions had to be made async as well, as they now needed to call async functions themselves. There are block_on constructs to call async functions from sync ones, but using them from within the tokio runtime conflicts with it and makes it fail. The same is true for spawning threads, which I had done extensively in parts of the code that run multiple HTTP(S) requests in parallel. Letting tokio handle the async calls automatically did pay off and is indeed faster than my manual optimizations, but the rewrite into async required pretty big code changes (the pattern is sketched after this list).
- The newer hyper version lost some features I had been relying on, like shorter connection timeouts (I replaced these with a timeout wrapper on the returned future and an ugly workaround to return an error) and IDN domain names (I now use the URL parser from Mozilla's servo project).
- The unit tests also needed to be made async-aware, and tokio provides a dedicated macro for wrapping tests (a minimal example follows after this list). Unfortunately I had to give up on one unit test that I just couldn't get to work under a tokio runtime, while the error couldn't be reproduced on the running service, either manually or scripted. It seems the tokio runtime acts differently when inside a unit test, where it kept telling me it can't launch a runtime from within a runtime. The stack traces seem to remain outside any of my code, so it doesn't even seem to enter any part of the unit test itself.
- Due to all these new and upgraded libraries, the statically linked binary ended up growing (after stripping and upx-ing) from 2.16 MiB to 2.83 MiB. Sure, compared to a full PHP or Python stack it is still at least an order of magnitude smaller, but it's also an increase of about a third.
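Roughly, the pattern for the parallel requests looks like the sketch below: tasks spawned on the tokio runtime instead of OS threads, with tokio::time::timeout standing in for the lost hyper connection timeouts. The fetch function is a hypothetical placeholder for the real hyper client call, and the 5 second timeout is just an example value:

```rust
use std::time::Duration;
use tokio::time::timeout;

// Hypothetical stand-in for the real request made with the async hyper
// client; only the tokio plumbing around it is the point of this sketch.
async fn fetch(url: String) -> Result<String, String> {
    Ok(format!("response from {}", url))
}

// Run the requests concurrently as tokio tasks instead of OS threads, and
// emulate the old per-connection timeout by wrapping each future in
// tokio::time::timeout, turning an elapsed timer into an error.
async fn fetch_all(urls: Vec<String>) -> Vec<Result<String, String>> {
    let handles: Vec<_> = urls
        .into_iter()
        .map(|url| {
            tokio::spawn(async move {
                match timeout(Duration::from_secs(5), fetch(url)).await {
                    Ok(result) => result,
                    Err(_elapsed) => Err("request timed out".to_string()),
                }
            })
        })
        .collect();

    let mut results = Vec::new();
    for handle in handles {
        // a panicked task (JoinError) is surfaced as a plain error string too
        results.push(handle.await.unwrap_or_else(|e| Err(e.to_string())));
    }
    results
}
```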
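The test macro mentioned above is a small thing, but for completeness, a minimal (hypothetical) example of an async unit test wrapped by it:

```rust
#[cfg(test)]
mod tests {
    // The function under test is hypothetical; the point is the macro.
    async fn compute_answer() -> u32 {
        42
    }

    // #[tokio::test] expands into a regular #[test] that builds a tokio
    // runtime and blocks on the async test body (requires tokio's
    // "macros" and "rt" features).
    #[tokio::test]
    async fn answer_is_42() {
        assert_eq!(compute_answer().await, 42);
    }
}
```

Since the macro builds a fresh runtime per test and blocks on the test body, anything inside the test that then tries to start or block on yet another runtime is presumably what triggers the "cannot start a runtime from within a runtime" panic.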
I would still recommend rocket to anyone familiar with flask. Creating dynamic web pages or APIs is very straightforward. The issues I encountered with the switch to async would not apply to new projects anyway. For existing rocket v0.4 applications, the upgrade is worthwhile if you can use the extra performance or aren't doing weird multi-threaded things in the background, like I was. None of the core rocket features broke; only the database abstraction required modest changes. My struggle was mostly with the use of hyper as a client.
On the database side, diesel isn't quite as elegant as ORMs in other languages. It makes it simple to retrieve (lists of) elements from a table and to do simple joins. For more complex things you can fall back to writing manual SQL queries and mapping the result onto your structs for convenience (you can mark properties that aren't tied to specific table columns, so you can populate them from calculations on joined tables or such). The database migration mechanism also requires the use of plain SQL.
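A sketch of both styles, using diesel 2-style syntax and a made-up schema that does not match my actual tables:

```rust
use diesel::prelude::*;
use diesel::sql_types::{BigInt, Text};
use diesel::sqlite::SqliteConnection;

// Hypothetical table definition, normally generated into schema.rs.
diesel::table! {
    entries (id) {
        id -> Integer,
        title -> Text,
        created_at -> Text,
    }
}

// Simple case: struct fields map 1:1 onto the table columns.
#[derive(Queryable)]
struct Entry {
    id: i32,
    title: String,
    created_at: String,
}

// Fallback case: raw SQL, with the computed column mapped via QueryableByName.
#[derive(QueryableByName)]
struct EntryVisits {
    #[diesel(sql_type = Text)]
    title: String,
    #[diesel(sql_type = BigInt)]
    visits: i64,
}

fn list_entries(conn: &mut SqliteConnection) -> QueryResult<Vec<Entry>> {
    use self::entries::dsl::*;
    entries.order(created_at.desc()).load::<Entry>(conn)
}

fn entries_with_visits(conn: &mut SqliteConnection) -> QueryResult<Vec<EntryVisits>> {
    diesel::sql_query(
        "SELECT e.title, COUNT(v.id) AS visits \
         FROM entries e LEFT JOIN visits v ON v.entry_id = e.id \
         GROUP BY e.title",
    )
    .load(conn)
}
```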
As a brief follow-up, I did eventually manage to fix that last unit test. It was a pretty large end-to-end affair, in which I used both rocket's Client to send form data and also triggered the cron jobs, which use their own rocket instances, but only to extract the database connection. This seems to have been the cause of the tokio-within-tokio runtime issue. I simply split this into two tests, one that uses the form and another that tests all the cron job functions.
The binary acts as a multi-call one: when started with the regular rocket environment parameters, it launches the web service, but if the CRON environment variable is detected, it only executes the requested cron mode and exits (a sketch of this dispatch follows below). This is not a normal use case for a rocket application, so I can only really blame myself for this quirk.
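In code, that dispatch boils down to something like the following sketch; the CRON variable is the real one from above, while the mode names and job functions are placeholders:

```rust
use rocket::routes;

// Placeholders for the real cron tasks described in the post.
async fn run_frequent_job() { /* the every-few-minutes task */ }
async fn run_daily_job() { /* the once-a-day task */ }

#[rocket::main]
async fn main() -> Result<(), rocket::Error> {
    // CRON set: execute the requested task and exit without binding the
    // HTTP server; the mode names here are made up.
    match std::env::var("CRON").ok().as_deref() {
        Some("frequent") => run_frequent_job().await,
        Some("daily") => run_daily_job().await,
        // no CRON set: launch the regular web service
        _ => {
            let _ = rocket::build()
                .mount("/", routes![/* ...the application's routes... */])
                .launch()
                .await?;
        }
    }
    Ok(())
}
```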
One thing I also found out is that my test code had side effects, because most tests use a shared (sqlite) database, so I had to switch to running all tests single-threaded (which is slower, taking up to 10s), as otherwise I got random failures when some tests deleted others' data or couldn't add new entries when these were already present. This could obviously be fixed by either providing each test with a mock database or by designing the tests more carefully so that their order doesn't matter (e.g. by using unique keys to add to the database for each test).
Boring details start around this point in the commit.