Application log analytics pipeline
Centralized Logging with OpenSearch supports log analysis for application logs, such as Nginx/Apache HTTP Server logs or custom application logs.
Note
Centralized Logging with OpenSearch supports cross-account log ingestion. If you want to ingest logs from the same account, the resources in the Sources group will be in the same account as your Centralized Logging with OpenSearch account. Otherwise, they will be in the member account.
Logs from Amazon EC2 / Amazon EKS
Centralized Logging with OpenSearch supports collecting logs from Amazon EC2 instances or Amazon EKS clusters. The workflow supports two scenarios.
Scenario 1: Using OpenSearch Engine
Application log pipeline architecture for EC2/EKS
The log pipeline runs the following workflow:
- Fluent Bit works as the underlying log agent to collect logs from application servers and send them to an optional Log Buffer, or ingest them into the OpenSearch domain directly.
- The Log Buffer triggers the Lambda (Log Processor) to run.
- The Log Processor reads and processes the log records and ingests the logs into the OpenSearch domain.
- Logs that fail to be processed are exported to an Amazon S3 bucket (Backup Bucket).
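To make the Log Processor step concrete, the sketch below builds an OpenSearch `_bulk` API request body from parsed log records. The index name and field names are illustrative assumptions, not the solution's actual schema.

```python
import json

def build_bulk_payload(log_records, index_name="app-logs"):
    """Build an OpenSearch _bulk API body from parsed log records.

    Each record becomes an index action line followed by its document,
    newline-delimited as the bulk API requires.
    """
    lines = []
    for record in log_records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"

# Example: two Nginx-style access log records (hypothetical fields)
records = [
    {"remote_addr": "10.0.0.1", "status": 200, "request": "GET /"},
    {"remote_addr": "10.0.0.2", "status": 404, "request": "GET /missing"},
]
payload = build_bulk_payload(records)
```

In the real pipeline, records that OpenSearch rejects from this bulk request would be written to the Backup Bucket rather than dropped.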
Scenario 2: Using Light Engine
Application log pipeline architecture for EC2/EKS
The log pipeline runs the following workflow:
- Fluent Bit works as the underlying log agent to collect logs from application servers and send them to an optional Log Buffer.
- The Log Buffer triggers the Lambda to copy objects from the log bucket to the staging bucket.
- The Log Processor (AWS Step Functions) processes raw log files stored in the staging bucket in batches, converts them to Apache Parquet, and automatically partitions all incoming data by criteria including time and region.
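The partitioning step above can be sketched as a function that derives a Hive-style key prefix for each Parquet object from its event time and region. The layout below (`year=`/`month=`/... path segments and the table name) is illustrative, not Light Engine's exact scheme.

```python
from datetime import datetime, timezone

def partition_prefix(event_time, region, table="app_log"):
    """Derive a Hive-style partition prefix for a Parquet object,
    partitioned by time and region. Segment names are illustrative."""
    t = event_time.astimezone(timezone.utc)
    return (
        f"{table}/year={t:%Y}/month={t:%m}/day={t:%d}/"
        f"hour={t:%H}/region={region}/"
    )

prefix = partition_prefix(
    datetime(2024, 5, 1, 13, 30, tzinfo=timezone.utc), "us-east-1"
)
```

Partitioning this way lets query engines such as Amazon Athena prune whole time ranges and regions instead of scanning every object.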
Logs from Amazon S3
Centralized Logging with OpenSearch supports collecting logs from Amazon S3 buckets. The workflow supports three scenarios.
Scenario 1: Using OpenSearch Engine (Ongoing)
Application log pipeline architecture for S3
The log pipeline runs the following workflow:
- User uploads logs to an Amazon S3 bucket (Log Bucket).
- An event notification is sent to Amazon SQS using S3 Event Notifications when a new log file is created.
- Amazon SQS initiates AWS Lambda.
- AWS Lambda copies objects from the log bucket to the staging bucket.
- The Log Processor reads and processes the log records and ingests the logs into the OpenSearch domain.
- Logs that fail to be processed are exported to an Amazon S3 bucket (Backup Bucket).
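The event-notification wiring in the first two steps amounts to an S3 notification configuration that targets an SQS queue. The sketch below shows its shape; the queue ARN, prefix, and suffix values are placeholders, not the solution's actual names.

```python
# S3 Event Notification configuration targeting SQS (values are placeholders).
notification_config = {
    "QueueConfigurations": [
        {
            # Hypothetical queue ARN for the log ingestion queue
            "QueueArn": "arn:aws:sqs:us-east-1:111122223333:log-ingestion-queue",
            # Fire on any object-creation event (Put, Post, Copy, multipart)
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "prefix", "Value": "AppLogs/"},
                        {"Name": "suffix", "Value": ".gz"},
                    ]
                }
            },
        }
    ]
}

# With boto3 this configuration would be applied to the Log Bucket via:
# s3 = boto3.client("s3")
# s3.put_bucket_notification_configuration(
#     Bucket="log-bucket",
#     NotificationConfiguration=notification_config,
# )
```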
Scenario 2: Using OpenSearch Engine (One-time)
Application log pipeline architecture for S3
The log pipeline runs the following workflow:
- User uploads logs to an Amazon S3 bucket (Log Bucket).
- An Amazon ECS task iterates through the logs in the log bucket.
- The Amazon ECS task sends the log locations to an Amazon SQS queue.
- Amazon SQS initiates AWS Lambda.
- AWS Lambda copies objects from the log bucket to the staging bucket.
- The Log Processor reads and processes the log records and ingests the logs into the OpenSearch domain.
- Logs that fail to be processed are exported to an Amazon S3 bucket (Backup Bucket).
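The ECS task's fan-out in this one-time scenario can be sketched as grouping object locations into SQS batch entries; SQS's `SendMessageBatch` accepts at most 10 messages per request. The message body layout and names here are illustrative assumptions.

```python
import json

def to_sqs_batches(object_keys, bucket, batch_size=10):
    """Group S3 object locations into SendMessageBatch entry lists.

    SQS allows at most 10 messages per batch request, so the keys are
    chunked before being handed to sqs.send_message_batch(...).
    """
    batches = []
    for start in range(0, len(object_keys), batch_size):
        chunk = object_keys[start:start + batch_size]
        entries = [
            {"Id": str(i), "MessageBody": json.dumps({"bucket": bucket, "key": key})}
            for i, key in enumerate(chunk)
        ]
        batches.append(entries)
    return batches

# Example: 23 historical log files become 3 batch requests (10 + 10 + 3)
batches = to_sqs_batches([f"AppLogs/file-{n}.gz" for n in range(23)], "log-bucket")
```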
Scenario 3: Using Light Engine (Ongoing)
Application log pipeline architecture for S3
The log pipeline runs the following workflow:
- User uploads logs to an Amazon S3 bucket (Log Bucket).
- An event notification is sent to Amazon SQS using S3 Event Notifications when a new log file is created.
- Amazon SQS initiates AWS Lambda.
- AWS Lambda copies objects from the log bucket to the staging bucket.
- The Log Processor (AWS Step Functions) processes raw log files stored in the staging bucket in batches. It converts them into Apache Parquet format and automatically partitions all incoming data based on criteria including time and region.
Logs from Syslog Client
Important
- Make sure your Syslog generator/sender's subnet is connected to the two private subnets of Centralized Logging with OpenSearch. You need to use a VPC Peering Connection or Transit Gateway to connect these VPCs.
- The NLB and the ECS containers in the architecture diagram are provisioned only when you create a Syslog ingestion, and are automatically deleted when there is no Syslog ingestion.
Scenario 1: Using OpenSearch Engine
Application log pipeline architecture for Syslog
The log pipeline runs the following workflow:
- A Syslog client (such as Rsyslog) sends logs to a Network Load Balancer (NLB) in the private subnets of Centralized Logging with OpenSearch, and the NLB routes them to the ECS containers running Syslog servers.
- Fluent Bit works as the underlying log agent in the ECS Service to parse logs, and sends them to an optional Log Buffer or ingests them into the OpenSearch domain directly.
- The Log Buffer triggers the Lambda (Log Processor) to run.
- The log processor reads and processes the log records and ingests the logs into the OpenSearch domain.
- Logs that fail to be processed are exported to an Amazon S3 bucket (Backup Bucket).
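The parsing step Fluent Bit performs on incoming syslog traffic can be illustrated with a minimal RFC 3164 parser. The regular expression and field names below are a simplified assumption, not Fluent Bit's actual parser configuration.

```python
import re

# Minimal RFC 3164 shape: <PRI>MMM dd HH:MM:SS host tag[pid]: message
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)

def parse_syslog(line):
    """Parse one RFC 3164 line into a dict, or return None on mismatch.

    The PRI value encodes facility and severity: pri = facility * 8 + severity.
    """
    m = SYSLOG_RE.match(line)
    if not m:
        return None
    fields = m.groupdict()
    pri = int(fields.pop("pri"))
    fields["facility"], fields["severity"] = divmod(pri, 8)
    return fields

event = parse_syslog("<13>May  1 13:30:01 web-01 nginx[812]: GET / 200")
```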
Scenario 2: Using Light Engine
Application log pipeline architecture for Syslog
The log pipeline runs the following workflow:
- A Syslog client (such as Rsyslog) sends logs to a Network Load Balancer (NLB) in the private subnets of Centralized Logging with OpenSearch, and the NLB routes them to the ECS containers running Syslog servers.
- Fluent Bit works as the underlying log agent in the ECS Service to parse logs and send them to a Log Buffer.
- An event notification is sent to Amazon SQS using S3 Event Notifications when a new log file is created.
- Amazon SQS initiates AWS Lambda.
- AWS Lambda copies objects from the log bucket to the staging bucket.
- The Log Processor (AWS Step Functions) processes raw log files stored in the staging bucket in batches. It converts them into Apache Parquet format and automatically partitions all incoming data based on criteria including time and region.