Understanding Data Ingress and Egress
Data is constantly flowing in and out of systems in today’s interconnected digital world. Two key concepts related to this data flow are data ingress and data egress. In this article, we’ll dive into what data ingress and egress mean, look at some examples, discuss how they relate to Kubernetes network policies, and examine data security considerations. By the end, you’ll have a solid grasp of these important data concepts.
What is Data Ingress?
Data ingress refers to data entering a system or network from an external source. When a database, application, or other IT system ingests data, it constitutes data ingress. Some common examples of data ingress include:
- Querying a database to read data
- Receiving user input through a web form
- Importing a data file into an application
- Sensor devices sending telemetry to a central system
- Clients sending requests to an API endpoint
So when you query a database to retrieve information, is that data ingress or egress? It depends on the vantage point. From the perspective of the application issuing the query, reading data is ingress: the result set enters your system from the external database. From the database's own perspective, the same result set is egress. This article takes the application's point of view, so SQL statements like SELECT count as ingress.
What is Data Egress?
In contrast to ingress, data egress is data leaving a system or network to an external destination. Whenever an IT system sends, transmits, or exports data out, it constitutes data egress. Here are some typical examples of data egress:
- Writing or updating data in a database
- Exporting a file from an application
- Sending an email with an attachment
- Uploading data to a remote server or cloud storage
- Responding to client requests from an API
Writes work the other way. DML statements such as INSERT, UPDATE, and DELETE send new or changed data from the application out to the database, so from the application's perspective they are egress. In short, querying a database is ingress, while writing to a database is egress.
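The two directions can be seen side by side in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for an external database server; the `readings` table and the `demo_ingress_and_egress` function are hypothetical names chosen for illustration.

```python
import sqlite3


def demo_ingress_and_egress():
    # Assumption: an in-memory SQLite database stands in for an external server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")

    # Egress from the application: the row leaves the app and lands in the database.
    conn.execute(
        "INSERT INTO readings (sensor, value) VALUES (?, ?)", ("s1", 21.5)
    )
    conn.commit()

    # Ingress to the application: the result set flows from the database into the app.
    rows = conn.execute("SELECT sensor, value FROM readings").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_ingress_and_egress())
```

With a real networked database, both statements would cross the network, which is why both directions need the protections discussed later in this article.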
Data Ingress and Egress in Kubernetes
In Kubernetes, a popular container orchestration platform, the concepts of data ingress and egress come into play with network policies. Kubernetes network policies allow you to restrict and control network traffic flow between pods and with external systems.
Network policy ingress rules specify which sources can send inbound traffic to selected pods on specific ports. Egress rules determine the destinations that selected pods can send outbound traffic to on specific ports.
For example, you could create an ingress rule allowing traffic from a front-end web pod to an API pod on port 80, and an egress rule allowing the API pod to connect to a database pod on port 5432. In this way, Kubernetes network policies let you tightly control data ingress and egress between pods and external systems.
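As a rough sketch of that example, the policy can be built as a Python dictionary and serialized to JSON, which kubectl accepts alongside YAML. The policy name, the namespace, and the `app: frontend`, `app: api`, and `app: database` labels are assumptions made for illustration; real deployments will use their own labels.

```python
import json


def build_api_network_policy(namespace="default"):
    """Return a NetworkPolicy manifest for the (hypothetical) API pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "api-allow-frontend-and-db", "namespace": namespace},
        "spec": {
            # The policy applies to pods labeled app=api (assumed label).
            "podSelector": {"matchLabels": {"app": "api"}},
            "policyTypes": ["Ingress", "Egress"],
            # Ingress: only front-end pods may reach the API pods on TCP port 80.
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
                    "ports": [{"protocol": "TCP", "port": 80}],
                }
            ],
            # Egress: the API pods may only connect to database pods on TCP port 5432.
            "egress": [
                {
                    "to": [{"podSelector": {"matchLabels": {"app": "database"}}}],
                    "ports": [{"protocol": "TCP", "port": 5432}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # The printed JSON could be saved to a file and applied with kubectl.
    print(json.dumps(build_api_network_policy(), indent=2))
```

Two caveats apply to a policy like this. Once egress is restricted, the selected pods can no longer reach anything that is not listed, including the cluster DNS service, so a real policy usually needs an additional egress rule for DNS. Network policies are also only enforced when the cluster's network plugin supports them.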
Securing Data in Transit
Whenever data is moving in or out of a system, it’s crucial to keep that data secure. Data is often most vulnerable when it’s in transit. Key measures for securing data ingress and egress include:
- Encrypting data in transit with TLS (the successor to the now-deprecated SSL)
- Authenticating and authorizing clients accessing APIs and databases
- Validating and sanitizing all user inputs to prevent injection attacks
- Monitoring networks for unusual data ingress/egress patterns that could indicate a breach
- Tightly restricting egress destinations and ports with firewalls and Kubernetes policies
For example, when exposing a REST API, use HTTPS with TLS certificates to encrypt all traffic between clients and the API server. Require API keys, OAuth tokens, or other authentication credentials. Validate request parameters and request bodies, whether JSON or XML, to guard against attacks like SQL injection and, for XML payloads, XML external entities (XXE).
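On the client side of such an API, encrypted transport mostly comes down to not weakening the defaults. The sketch below, using Python's standard ssl module, builds a TLS context that verifies the server certificate and refuses legacy protocol versions; the function name `build_client_tls_context` is a hypothetical helper, and the TLS 1.2 floor is an assumed policy rather than a universal requirement.

```python
import ssl


def build_client_tls_context():
    # create_default_context() turns on certificate verification and
    # hostname checking using the system's trusted CA certificates.
    ctx = ssl.create_default_context()
    # Assumed policy: refuse SSLv3, TLS 1.0, and TLS 1.1.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


if __name__ == "__main__":
    ctx = build_client_tls_context()
    print(ctx.verify_mode, ctx.check_hostname, ctx.minimum_version)
```

The resulting context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=ctx)` or `http.client.HTTPSConnection(host, context=ctx)`.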
For databases, use encrypted connections, strong authentication, and SQL prepared statements. Audit and monitor database activity logs for suspicious queries. And employ network policies or access control lists to limit database access to only authorized application servers.
Data Ingress and Egress Examples
Let’s walk through a concrete example to tie these concepts together. Imagine a simple web application that allows users to create accounts, log in, and save their favorite colors. Here’s the data flow:
- User creates an account by submitting a form (data ingress via HTTP POST).
- Application inserts account details into a MySQL database (data egress to database).
- User logs in by submitting credentials (data ingress via HTTP POST).
- Application queries database to validate credentials (data ingress from database).
- User saves their favorite color (data ingress via HTTP POST).
- Application updates favorite color in database (data egress to database).
To secure this data flow, the application should:
- Serve all HTTP traffic over SSL/TLS
- Validate and sanitize user inputs like username, password, color
- Use SQL prepared statements for querying and updating the database
- Configure the database to use encrypted connections and strong credentials
- If running on Kubernetes, set up network policies to restrict database access to just the application pods
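The input validation step in the list above might look something like the following sketch for the "save favorite color" request. The username pattern, the set of allowed colors, and the function name are all assumptions made for illustration; a real application would define its own rules.

```python
import re

# Assumed rule: usernames are 3 to 32 letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")
# Assumed allowlist of colors the application accepts.
ALLOWED_COLORS = {"red", "green", "blue", "yellow", "purple"}


def validate_favorite_color_request(username, color):
    """Return cleaned (username, color) or raise ValueError."""
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        raise ValueError("invalid username")
    if not isinstance(color, str):
        raise ValueError("invalid color")
    # Normalize before checking against the allowlist.
    normalized = color.strip().lower()
    if normalized not in ALLOWED_COLORS:
        raise ValueError("invalid color")
    return username, normalized


if __name__ == "__main__":
    print(validate_favorite_color_request("alice_01", " Blue "))
```

Checking against an allowlist, rather than trying to strip out dangerous characters, means unexpected input is rejected outright. Validation complements prepared statements and does not replace them.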
As an example of a data leak via egress, imagine an application bug where user-provided data is passed unsanitized into an SQL query. An attacker could inject malicious SQL to exfiltrate sensitive data from the database out to a remote host. Proper input validation and parameterized queries prevent this.
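To make the difference concrete, here is a minimal sketch contrasting a query built by string concatenation with a parameterized one. It uses Python's sqlite3 module and an in-memory database in place of the MySQL database from the example; the `users` table, its rows, and the function names are hypothetical.

```python
import sqlite3

# A classic injection payload that turns the WHERE clause into a tautology.
INJECTION_PAYLOAD = "' OR '1'='1"


def setup():
    # Assumption: in-memory SQLite stands in for the application's database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, favorite_color TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "blue"), ("bob", "green")],
    )
    return conn


def lookup_unsafe(conn, username):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = (
        "SELECT username, favorite_color FROM users "
        "WHERE username = '" + username + "'"
    )
    return conn.execute(query).fetchall()


def lookup_safe(conn, username):
    # Parameterized: the driver passes the input as data, never as SQL.
    return conn.execute(
        "SELECT username, favorite_color FROM users WHERE username = ?",
        (username,),
    ).fetchall()


if __name__ == "__main__":
    conn = setup()
    print("unsafe:", lookup_unsafe(conn, INJECTION_PAYLOAD))
    print("safe:  ", lookup_safe(conn, INJECTION_PAYLOAD))
```

With the payload, the unsafe lookup returns every row in the table, while the parameterized lookup returns nothing because no user has that literal name.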
Summary and Conclusion
In summary, data ingress is data entering a system while data egress is data leaving a system. Securing data ingress and egress is critical for protecting sensitive information. Key measures include encrypting data in transit, validating user inputs, using strong authentication and authorization, monitoring for breaches, and tightly controlling egress.
Kubernetes network policies help control ingress and egress between pods and external systems via ingress/egress rules. Following security best practices for handling data in transit can significantly reduce the risk of data leakage and breaches.
For additional guidance on securing your databases and ensuring compliance with data regulations, consider the user-friendly and flexible tools from DataSunrise. Our solutions provide robust security, data discovery, OCR, and compliance capabilities. Contact the DataSunrise team today to schedule an online demo and see how we can help you stay on top of data security.