
Posts

How to store last n-days of historical data in a partitioned table - second attempt

In the previous post, I tried to reproduce the thinking process behind how I redesigned the way historical data is stored in the relational database. The goal was to speed up the daily maintenance tasks that age data. Dropping the oldest portion of the data would now take only milliseconds to execute. That was a huge success, since with the previous solution the ageing task was running for minutes, causing table-level locks and leading to drastic underperformance of the whole system. So why was that not the best idea? It's true that the maintenance improvement was a success. It wasn't long before I realized, though, that introducing an additional column with rolling integer values, by which the table had been partitioned, may be a good idea for easing the daily move of the sliding window of data, but writing ad-hoc queries in a way that takes advantage of partitioning is now far from straightforward. One would need to add an additional condition on the partition index column to each query. Furtherm…
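To make that trade-off concrete, here is a minimal T-SQL sketch, assuming a hypothetical table dbo.Events partitioned on a rolling integer column PartitionIndex; the table, column names and the index formula are illustrative, not taken from the post:

    -- Without a predicate on the partitioning column there is no partition
    -- elimination, so all partitions may have to be touched:
    SELECT COUNT(*)
    FROM dbo.Events
    WHERE EventDate >= '2020-01-10' AND EventDate < '2020-01-11';

    -- With the rolling-index predicate added, partition elimination kicks in,
    -- but every ad-hoc query now has to know how dates map to partition indexes:
    SELECT COUNT(*)
    FROM dbo.Events
    WHERE EventDate >= '2020-01-10' AND EventDate < '2020-01-11'
      AND PartitionIndex = DATEDIFF(DAY, '2000-01-01', '2020-01-10') % 37;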
Recent posts

How (not) to store last n-days of historical data in a partitioned table

Photo by Jan Antonin Kolar on Unsplash

Some time ago I was asked to enhance the way historical data is kept and maintained in an MS SQL Server DB. I was told that it was required to keep 35 full days of data; every day after midnight the data from the oldest day should be removed. I was asked for help because the data ageing was taking too much time and was causing locks on the problematic table, which made the whole system underperform drastically during that time. At that time, the data was being stored in a single SQL table. The volume was around 160 million records. Data ageing was being done by a simple DELETE FROM statement with a condition on the indexed date column. The solution seemed pretty obvious to me - I could partition the table by days and then, during the daily maintenance, simply truncate the partition containing the oldest data. Since it's a metadata operation, it should take only milliseconds. Because the whole operation had to be done witho…
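The excerpt above only describes the approach at a high level; below is a minimal T-SQL sketch of the idea for SQL Server 2016 or later, with purely illustrative names (pf_EventDay, ps_EventDay, dbo.Events) and only three boundary values instead of one per day:

    -- One boundary value per day; RANGE RIGHT puts each day in its own partition.
    CREATE PARTITION FUNCTION pf_EventDay (date)
    AS RANGE RIGHT FOR VALUES ('2020-01-01', '2020-01-02', '2020-01-03');

    CREATE PARTITION SCHEME ps_EventDay
    AS PARTITION pf_EventDay ALL TO ([PRIMARY]);

    CREATE TABLE dbo.Events
    (
        EventDate date         NOT NULL,
        Payload   varchar(100) NULL
    ) ON ps_EventDay (EventDate);

    -- Daily maintenance: a metadata-only operation instead of a long DELETE.
    -- Find which partition currently holds the oldest day...
    SELECT $PARTITION.pf_EventDay('2020-01-01') AS oldest_partition;

    -- ...and truncate just that partition. The partition number is a literal here;
    -- in practice it would be looked up and the statement built dynamically.
    TRUNCATE TABLE dbo.Events WITH (PARTITIONS (2));

On versions older than SQL Server 2016, a similar effect can be achieved by switching the oldest partition out to a staging table and truncating that table.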

System.exit() not working?

Until recently, I was living in a world where calling the System.exit() method was a not very elegant, but 100% effective, bulletproof and ultimate way to shut down a multi-threaded Java application. Over the years, in many Java projects, I've seen quite a similar approach to shutting down the application. First, we try to shut down gracefully by finishing or interrupting all the threads, which is by the way not a trivial task in a large application; then, after a timeout is reached, we give up and call the System.exit() method - we want to be sure that the JVM is stopped. The exact same approach was applied in the application I was recently refactoring. I had been doing a massive redesign, touching about three hundred classes and changing many of them drastically. A day or two after one of the huge merges, a bug was discovered - shutting down the application was taking ages. I checked the logs, compared them against the source code; the whole flow was pretty straightforward and I was…
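The shutdown pattern described above, sketched in Java with illustrative names (this is not the code from the application in question): try to stop the workers gracefully, and only when the timeout is reached fall back to System.exit().

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ShutdownExample {

        private static final ExecutorService WORKERS = Executors.newFixedThreadPool(4);

        public static void shutdown() {
            // First, try to stop gracefully: reject new tasks and interrupt running ones.
            WORKERS.shutdownNow();
            try {
                // Give the worker threads a bounded amount of time to finish.
                if (!WORKERS.awaitTermination(30, TimeUnit.SECONDS)) {
                    // Timeout reached - give up and force the JVM down.
                    System.exit(1);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                System.exit(1);
            }
        }
    }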