11. Monitor / Log Analytics
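Azure Monitor is the hub of Azure operations. Logs are aggregated in a Log Analytics Workspace, APM is handled by Application Insights, and alert notifications are delivered through Action Groups. You combine these pieces.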
Key components
| Term | Role | AWS counterpart |
|---|---|---|
| Log Analytics Workspace (LAW) | Storage for logs and metrics | CloudWatch Logs |
| Diagnostic Setting | The "send this resource's logs to the LAW" setting | Per-service logging settings |
| Application Insights | Application APM (requests, exceptions, dependency tracking) | X-Ray + CloudWatch Application Insights |
| Metric Alert | Threshold alert on a metric | CloudWatch Alarm |
| Action Group | Definition of notification targets (email, webhook, Logic App) | SNS Topic |
Log Analytics Workspace
resource "azurerm_log_analytics_workspace" "main" {
  name                = "law-myapp-prd"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  sku                 = "PerGB2018" # pay-as-you-go
  retention_in_days   = 30          # 30-730 days
  daily_quota_gb      = 10          # cap ingestion at 10 GB per day (runaway protection)
  tags                = local.common_tags
}
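Retention and pricing
Log Analytics bills on ingestion volume plus retention period. Retention beyond the included period (roughly the first 30 days) is charged separately. An unexpected flood of logs can easily run up a bill of tens of thousands of yen per month or more.
Guard against it with daily_quota_gb. Keep in mind that once the cap is reached, ingestion stops for the rest of that day, so leave some headroom.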
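One way to keep these two cost levers from drifting is to expose them as validated input variables. The following is a minimal sketch; the variable names `law_retention_in_days` and `law_daily_quota_gb` are hypothetical and do not appear in the original configuration.

```hcl
# Hypothetical input variables; names and defaults are illustrative only.
variable "law_retention_in_days" {
  description = "Log Analytics retention in days."
  type        = number
  default     = 30

  validation {
    # Mirrors the 30-730 day range noted on the workspace resource.
    condition     = var.law_retention_in_days >= 30 && var.law_retention_in_days <= 730
    error_message = "Retention must be between 30 and 730 days."
  }
}

variable "law_daily_quota_gb" {
  description = "Daily ingestion cap in GB."
  type        = number
  default     = 10

  validation {
    # Rejects zero and negative values so the cap is always an explicit limit.
    condition     = var.law_daily_quota_gb > 0
    error_message = "Set a positive daily cap so that a log flood cannot run unbounded."
  }
}
```

The workspace would then use `retention_in_days = var.law_retention_in_days` and `daily_quota_gb = var.law_daily_quota_gb`, so an out-of-range value fails at plan time rather than showing up on the invoice.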
Diagnostic Settings
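A Diagnostic Setting routes a resource's resource logs and platform metrics to the LAW (the subscription-wide Activity Log is routed by a Diagnostic Setting on the subscription itself). Without one, nothing from that resource accumulates in the workspace.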
# Send Key Vault access logs to the LAW
resource "azurerm_monitor_diagnostic_setting" "kv" {
  name                       = "kv-to-law"
  target_resource_id         = azurerm_key_vault.main.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id

  enabled_log {
    category = "AuditEvent"
  }

  enabled_log {
    category = "AzurePolicyEvaluationDetails"
  }

  # Recent azurerm releases deprecate this block in favor of enabled_metric;
  # check the provider version in use.
  metric {
    category = "AllMetrics"
    enabled  = true
  }
}
# Storage Account (Blob) logs
resource "azurerm_monitor_diagnostic_setting" "storage_blob" {
  name                       = "blob-to-law"
  target_resource_id         = "${azurerm_storage_account.data.id}/blobServices/default/"
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id

  enabled_log {
    category_group = "audit"
  }
}
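The Activity Log is a subscription-level log, so one way to send it to the workspace is a Diagnostic Setting whose target is the subscription itself. The sketch below assumes the provider is authenticated against the subscription you want to monitor; the resource name `activity` and the choice of three categories are illustrative, not a complete list.

```hcl
# Look up the subscription the provider is currently using.
data "azurerm_subscription" "current" {}

# Hypothetical resource name; the categories are a partial selection.
resource "azurerm_monitor_diagnostic_setting" "activity" {
  name                       = "activity-to-law"
  target_resource_id         = data.azurerm_subscription.current.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id

  enabled_log {
    category = "Administrative"
  }

  enabled_log {
    category = "Security"
  }

  enabled_log {
    category = "ServiceHealth"
  }
}
```

Once applied, control-plane operations (who created, changed, or deleted what) should become queryable in the same workspace as the resource logs.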
Application Insights
resource "azurerm_application_insights" "main" {
  name                = "appi-myapp-prd"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  application_type    = "web"
  workspace_id        = azurerm_log_analytics_workspace.main.id # workspace-based (recommended)
  retention_in_days   = 90
  tags                = local.common_tags
}

# Pass the connection string to a Container App or Function App
resource "azurerm_container_app" "api" {
  # ...
  template {
    container {
      env {
        name  = "APPLICATIONINSIGHTS_CONNECTION_STRING"
        value = azurerm_application_insights.main.connection_string
      }
    }
  }
}
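For the Function App case, a rough equivalent is to set the connection string in `site_config`. This is a sketch under assumptions: `azurerm_service_plan.main`, `azurerm_storage_account.func`, and the app name `func-myapp-prd` are hypothetical and are not defined anywhere in the original.

```hcl
# Assumed to exist elsewhere: azurerm_service_plan.main and azurerm_storage_account.func.
resource "azurerm_linux_function_app" "worker" {
  name                       = "func-myapp-prd" # hypothetical; must be globally unique
  resource_group_name        = azurerm_resource_group.main.name
  location                   = azurerm_resource_group.main.location
  service_plan_id            = azurerm_service_plan.main.id
  storage_account_name       = azurerm_storage_account.func.name
  storage_account_access_key = azurerm_storage_account.func.primary_access_key

  site_config {
    # Points the Functions runtime at the workspace-based Application Insights resource.
    application_insights_connection_string = azurerm_application_insights.main.connection_string
  }
}
```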
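Alerts and Action Groups
# Notification group
resource "azurerm_monitor_action_group" "ops" {
  name                = "ag-ops"
  resource_group_name = azurerm_resource_group.main.name
  short_name          = "ops"

  email_receiver {
    name          = "oncall"
    email_address = "oncall@example.com"
  }

  # Azure Monitor posts its own alert payload; Slack incoming webhooks expect
  # Slack's message format, so a relay (e.g. a Logic App) may be needed in between.
  webhook_receiver {
    name        = "slack"
    service_uri = var.slack_webhook_url
  }
}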
# Metric alert (5xx response count on the Container App)
resource "azurerm_monitor_metric_alert" "api_5xx" {
  name                = "api-5xx-high"
  resource_group_name = azurerm_resource_group.main.name
  scopes              = [azurerm_container_app.api.id]
  description         = "More than 100 5xx responses in 5 minutes"
  severity            = 2

  criteria {
    metric_namespace = "Microsoft.App/containerApps"
    metric_name      = "Requests"
    aggregation      = "Total"
    operator         = "GreaterThan"
    threshold        = 100 # an absolute count over the window, not a percentage

    dimension {
      name     = "statusCodeCategory"
      operator = "Include"
      values   = ["5xx"]
    }
  }

  window_size = "PT5M"
  frequency   = "PT1M"

  action {
    action_group_id = azurerm_monitor_action_group.ops.id
  }
}
# Log-query alert (KQL)
resource "azurerm_monitor_scheduled_query_rules_alert_v2" "errors" {
  name                 = "log-errors-spike"
  resource_group_name  = azurerm_resource_group.main.name
  location             = azurerm_resource_group.main.location
  evaluation_frequency = "PT5M"
  window_duration      = "PT15M"
  scopes               = [azurerm_log_analytics_workspace.main.id]
  severity             = 2

  criteria {
    # The scope is the workspace, so use the workspace table and column names
    # (AppTraces / SeverityLevel / TimeGenerated), not traces / severityLevel / timestamp.
    query                   = "AppTraces | where SeverityLevel == 3 | summarize ErrorCount = count() by bin(TimeGenerated, 5m)"
    time_aggregation_method = "Total"
    metric_measure_column   = "ErrorCount" # required for any aggregation other than Count
    threshold               = 50
    operator                = "GreaterThan"

    failing_periods {
      minimum_failing_periods_to_trigger_alert = 1
      number_of_evaluation_periods             = 1
    }
  }

  action {
    action_groups = [azurerm_monitor_action_group.ops.id]
  }
}
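Minimum viable set
① A Log Analytics Workspace (30-day retention, capped with daily_quota_gb) / ② Diagnostic Settings on the important resources / ③ one Application Insights instance / ④ an Action Group for ops plus two or three alerts covering 5xx and failed requests. That alone gets you to a practical baseline.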
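The failed-request alert in item ④ is not shown above. A minimal sketch, assuming the Application Insights standard metric `requests/failed` and a placeholder threshold, might look like this; the resource name `failed_requests` and the value 20 are illustrative and should be tuned to real traffic.

```hcl
# Hypothetical alert on the Application Insights failed-request count.
resource "azurerm_monitor_metric_alert" "failed_requests" {
  name                = "appi-failed-requests-high"
  resource_group_name = azurerm_resource_group.main.name
  scopes              = [azurerm_application_insights.main.id]
  description         = "More than 20 failed requests in 5 minutes"
  severity            = 2

  criteria {
    metric_namespace = "Microsoft.Insights/components"
    metric_name      = "requests/failed"
    aggregation      = "Count"
    operator         = "GreaterThan"
    threshold        = 20 # placeholder; adjust to the service's normal error volume
  }

  window_size = "PT5M"
  frequency   = "PT1M"

  action {
    action_group_id = azurerm_monitor_action_group.ops.id
  }
}
```

Because it reuses the same `ops` Action Group, this alert adds no new notification plumbing.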