(This is a multi-part series about the proper care & feeding of a Vertica cluster. There’s simply too much content for one post, so I’ll be splitting this up into a three-part series. Check back in a few days for the next installment.)
Vertica is a unique database platform, with an equally unique set of challenges. Often, this means that a proper care & feeding routine for a Vertica cluster is incomplete or missing entirely. To get the best possible performance and stability out of your Vertica cluster, there are a few periodic maintenance tasks that are critical to perform.
In this post, I’ll detail a few of the top items I recommend to clients that they include in their periodic maintenance routines. Some of these items should be performed as often as nightly or weekly, and others can be performed quarterly. I suggest adjusting the intervals as your organization requires.
Backup and Restore
First on the checklist is perhaps the most obvious, but it’s also something that’s commonly overlooked or trivialized. If you’re not already regularly backing up your database, then stop reading this post and come back when you are. It’s vitally important to back up your database regularly, because it’s not a matter of if something unpredictable will happen; it’s a matter of when.
I strongly recommend that every maintenance checklist include periodic checks that backups are working properly, and that every organization test their disaster recovery plan at least twice a year, including restoring a backup. If you’ve never tested restoring your backups, then your backup isn’t a backup; it’s a wish.
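As part of that periodic check, you can confirm that scheduled backups are actually completing by querying Vertica’s backup history (a sketch; this assumes backups are taken with the vbr tool, which records completed backups in the v_monitor.database_backups system table):

```sql
-- List the most recent completed backups.
-- If nothing recent appears here, your scheduled backup job
-- isn't completing successfully.
SELECT snapshot_name, backup_timestamp
FROM v_monitor.database_backups
ORDER BY backup_timestamp DESC
LIMIT 10;
```

An alert that fires when the newest backup_timestamp is older than your backup interval makes a simple, automated safety net.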
Data in many Vertica clusters is derived from a separate transactional database, a system of record. Because of this, many organizations opt not to back up their Vertica clusters, since they could technically regenerate their Vertica database by re-importing data from their source database. While true, you still must ensure that you’re regularly backing up your catalog and Vertica configuration files. You must also account for how long it will take to regenerate this data, and you should regularly test this process. In some cases you may still need to back up the full Vertica database to meet your SLAs, since restoring a backup will likely be much faster than regenerating the data.
Check Disk Space
It’s important for your data volumes to always have at least 40% disk space free. Vertica may need this free space for important operations, such as rebuilding projections or rebalancing after resizing your cluster. You should have alerts set up to warn you when you approach 45% free or less so that you can clean up unused projections/tables, or consider adding more nodes. You can easily check your data volumes with:
SELECT * FROM disk_storage WHERE storage_usage ILIKE '%data%' AND disk_space_free_mb / (disk_space_free_mb + disk_space_used_mb) < 0.45;
It’s also a good idea to check your catalog volumes to make sure your vertica.log file is rotating properly, and that your CopyErrorLogs directory is cleaned up.
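The same disk_storage system table can keep an eye on those catalog volumes (a sketch; the ILIKE pattern assumes your catalog storage locations contain “catalog” in their usage label):

```sql
-- Free-space ratio on catalog volumes, per node.
-- A steadily shrinking ratio often means vertica.log isn't rotating
-- or the CopyErrorLogs directory is accumulating files.
SELECT node_name,
       disk_space_free_mb,
       disk_space_free_mb / (disk_space_free_mb + disk_space_used_mb) AS free_ratio
FROM disk_storage
WHERE storage_usage ILIKE '%catalog%';
```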
Quick Tip: If your vertica.log file is junked up with messages about incomplete startup packets, and you use a load balancer, it’s likely a harmless message about your load balancer checking to make sure a node is still reachable. You can disable these messages by running:
SELECT SET_CONFIG_PARAMETER('WarnOnIncompleteStartupPacket', 0);
Clean Up Unused Projections
One quick win for performance is to drop any unused projections that may be lying around. Unused projections slow down data loads, since Vertica must write an additional copy of the data for each projection that exists on a table. If a projection isn’t being used by any queries, then it’s simply slowing down your data loads and needlessly consuming disk space. To find unused projections, you can query the v_monitor.projection_usage system table, or run the Workload Analyzer (which we’ll detail in a future installment of this series).
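For example, one way to surface candidates is to anti-join the projection catalog against recent usage (a sketch; v_monitor.projection_usage only covers whatever query history the Data Collector has retained, so treat the results as candidates to investigate, not an automatic drop list):

```sql
-- Projections with no recorded usage in the retained query history.
SELECT p.projection_schema, p.projection_name
FROM v_catalog.projections p
LEFT JOIN v_monitor.projection_usage u
       ON u.projection_id = p.projection_id
WHERE u.projection_id IS NULL
ORDER BY 1, 2;
```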
Update Statistics
Statistics are important for Vertica’s optimizer to make the right choices when selecting a query plan. Statistics are collected per column, per projection, and may be missing if they’ve never been generated or if you’ve loaded a large amount of new data. You can quickly see which projections are missing statistics by looking at the has_statistics column of the v_catalog.projections table:
SELECT has_statistics, COUNT(*) FROM v_catalog.projections GROUP BY 1;
You can generate statistics for a given table by running
SELECT ANALYZE_STATISTICS('[table]');
or generate statistics for all projections with
SELECT ANALYZE_STATISTICS('');
If you’re loading a table for the first time, or the data in the table changes significantly (the number of rows changes by 50%, or the min/max values of a column change by 50%), generate statistics at the end of your ETL job. Then, periodically (nightly, if possible) generate all statistics to make sure none are missed.
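To go beyond a summary count and see exactly which projections still need attention, you can list them by name (a sketch against v_catalog.projections):

```sql
-- Projections that currently have no statistics.
SELECT projection_schema, projection_name
FROM v_catalog.projections
WHERE NOT has_statistics
ORDER BY 1, 2;
```

Feeding the anchor tables of these projections into ANALYZE_STATISTICS is a straightforward way to close the gap.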