Redshift STL Views for Logging
Introduction
If you work with Amazon Redshift, it’s important to monitor query execution and system activity. This will help you improve performance, troubleshoot issues, and ensure smooth operations. This will help you optimize performance, troubleshoot issues, and ensure smooth operations. Fortunately, Redshift provides a powerful set of System Tables and Views (STL) that allow you to access detailed information about queries, connections, and various system metrics.
In this article, we will dive into the basics of Redshift STL views, focusing on stl_query, stl_connection_log, and other essential views for logging and monitoring.
What are Redshift STL Views?
Redshift System Tables and Views (STL) are a collection of built-in views that store information about the internal operations of the Redshift cluster. These views provide valuable insights into query execution, system performance, resource utilization, and more. By querying the STL views, you can gain a deep understanding of how your queries are running, identify bottlenecks, and make informed decisions to optimize your Redshift workloads. Redshift generates STL system views from log files.
Exploring stl_query View
One of the most commonly used STL views is stl_query. This view contains a record of every query executed on the Redshift cluster, along with detailed information about each query. Let’s take a closer look at how to use stl_query for logging and analysis.
Querying stl_query
To access the information stored in stl_query, you can run a simple SELECT statement. Here’s an example:
SELECT query, starttime, endtime, elapsed, aborted FROM stl_query ORDER BY starttime DESC LIMIT 10;
This query retrieves the last 10 queries executed on the Redshift cluster, along with their start time, end time, elapsed time, and whether they were aborted. The result will give you a quick overview of recent query activity.
Analyzing Query Performance
By digging deeper into the stl_query view, you can gain valuable insights into query performance. For example, you can identify long-running queries by filtering on the elapsed column:
SELECT query, elapsed, substring(querytxt, 1, 50) as query_snippet FROM stl_query WHERE elapsed > 60000 -- Query duration in milliseconds ORDER BY elapsed DESC;
This query retrieves queries that took longer than 60 seconds to execute, along with a snippet of the query text. By analyzing these slow queries, you can identify performance bottlenecks and optimize your SQL statements accordingly.
Monitoring Connections with stl_connection_log view
Another important STL view for logging is stl_connection_log. This view records information about client connections to the Redshift cluster, including connection start and end times, user names, and client IP addresses. Let’s explore how to use stl_connection_log for monitoring connections.
Querying stl_connection_log
To view the connection log, you can run a simple SELECT query on stl_connection_log:
SELECT username, starttime, endtime, duration, remotehost, remoteport FROM stl_connection_log ORDER BY starttime DESC LIMIT 10;
This query retrieves the last 10 connections to the Redshift cluster, including the username, start time, end time, duration, remote host, and remote port. By monitoring the connection log, you can track user activity and identify any suspicious or unauthorized connections.
Analyzing Connection Patterns
You can also use stl_connection_log to analyze connection patterns and identify potential issues. For example, you can count the number of connections per user:
SELECT username, COUNT(*) as connection_count FROM stl_connection_log GROUP BY username ORDER BY connection_count DESC;
This query gives you an overview of the number of connections made by each user. If you notice an unusually high number of connections from a particular user, it could indicate a problem or an inefficient application design that needs attention.
Other Essential STL Views for Logging
In addition to stl_query and stl_connection_log, there are several other STL views that provide valuable information for logging and monitoring. Let’s briefly explore a few of them:
stl_error
The stl_error view records information about errors that occur during query execution. By querying this view, you can identify and troubleshoot issues in your queries. Here’s an example:
SELECT username, query, substring(error, 1, 50) as error_snippet, starttime FROM stl_error ORDER BY starttime DESC LIMIT 10;
This query retrieves the last 10 errors, including the username, query ID, a snippet of the error message, and the start time. By analyzing the errors, you can identify and fix issues in your SQL statements.
stl_wlm_query
The stl_wlm_query view provides information about workload management (WLM) query queues and slot usage. By monitoring this view, you can optimize your WLM configuration and ensure efficient resource allocation. Here’s an example:
SELECT queue_start_time, total_queue_time, total_exec_time, slot_count FROM stl_wlm_query ORDER BY queue_start_time DESC LIMIT 10;
This query retrieves the last 10 queries processed by WLM, including the queue start time, total queue time, total execution time, and the number of slots used. By analyzing this information, you can identify queries that are waiting in the queue for too long or consuming excessive slots.
stl_load_commits
The stl_load_commits view records information about COPY commands used to load data into Redshift tables. By querying this view, you can monitor the progress and performance of your data loads. Here’s an example:
SELECT query, table_name, lines_scanned, num_files, start_time, end_time, datediff(seconds, start_time, end_time) as duration FROM stl_load_commits ORDER BY start_time DESC LIMIT 10;
This query retrieves the last 10 COPY commands, including the query ID, target table name, number of lines scanned, number of files loaded, start time, end time, and duration. By monitoring the load commits, you can track the efficiency of your data loading processes.
System Views for Monitoring
In addition to the STL views, Redshift provides a set of system views (STV) that offer real-time information about the current state of the cluster. These views can be particularly useful for monitoring and troubleshooting. Here are a few notable system views:
- stv_recents: Provides information about the most recently run queries, including query ID, query text, and execution time.
- stv_inflight: Shows currently executing queries, along with their query ID, user name, and execution time.
- stv_wlm_service_class_config: Displays the configuration of WLM service classes, including concurrency and memory limits.
- stv_blocklist: Provides information about disk space utilization and block metadata.
By querying these system views, you can gain real-time insights into the current state of your Redshift cluster and quickly identify any issues or performance bottlenecks.
Conclusion
Redshift STL views offer a wealth of information for logging, monitoring, and optimizing your cluster’s performance. By leveraging views like stl_query, stl_connection_log, and others, you can gain deep insights into query execution, system activity, and resource utilization.
Remember to regularly monitor and analyze the information provided by these views to proactively identify and address any issues, optimize query performance, and ensure the smooth operation of your Redshift cluster. With the power of STL views at your fingertips, you can take your Redshift logging and monitoring to the next level.
If you’re looking for a comprehensive solution to secure your Redshift cluster and ensure compliance with data protection regulations, consider exploring the user-friendly and flexible tools offered by DataSunrise. Our team would be happy to provide you with an online demo and showcase how our database security, audit, and compliance features can benefit your organization.