11. Cloud Logging / Monitoring
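This is the operational foundation of GCP: Cloud Logging aggregates logs, Cloud Monitoring handles metrics and alerting, and Cloud Trace provides distributed tracing. All of them are enabled by default in every project, so most of the work is per-service configuration.
The four operations siblings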
| Service | Role | AWS equivalent |
|---|---|---|
| Cloud Logging | Log aggregation and search (stored automatically in the _Default bucket) | CloudWatch Logs |
| Cloud Monitoring | Metrics, dashboards, alerts | CloudWatch Metrics / Alarms |
| Cloud Trace | Distributed tracing | X-Ray |
| Error Reporting | Exception aggregation | (no direct equivalent) |
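What sets GCP apart is that simply using a service sends its logs and metrics to Cloud Logging / Monitoring automatically. Unlike AWS, you rarely find that no logs exist because something was left unconfigured. The main exception is Data Access audit logs, which are disabled by default for most services.
Log Sink (exporting logs)
A sink routes log entries, as they arrive at the Log Router, to a destination other than the _Default bucket: another log bucket, Cloud Storage, BigQuery, or Pub/Sub. A sink only applies to entries received after it is created.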
# Dedicated log bucket (retained for 365 days)
resource "google_logging_project_bucket_config" "audit" {
  project        = "myapp-prd"
  location       = "asia-northeast1"
  retention_days = 365
  bucket_id      = "audit-logs"
}

# Route only audit logs to the audit bucket
resource "google_logging_project_sink" "audit" {
  name                   = "audit-to-bucket"
  destination            = "logging.googleapis.com/projects/myapp-prd/locations/asia-northeast1/buckets/${google_logging_project_bucket_config.audit.bucket_id}"
  filter                 = "logName:\"cloudaudit.googleapis.com\""
  unique_writer_identity = true
}

# Grant the sink's writer identity permission to write to log buckets
resource "google_project_iam_member" "audit_sink_writer" {
  project = "myapp-prd"
  role    = "roles/logging.bucketWriter"
  member  = google_logging_project_sink.audit.writer_identity
}

# Export to Cloud Storage for long-term archival
# (google_storage_bucket.logs is assumed to be defined elsewhere)
resource "google_logging_project_sink" "archive" {
  name                   = "archive-to-gcs"
  destination            = "storage.googleapis.com/${google_storage_bucket.logs.name}"
  filter                 = "severity >= WARNING"
  unique_writer_identity = true
}
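The archive sink writes to Cloud Storage under its own writer identity, and that identity needs permission on the destination before any objects appear. A minimal sketch of that grant, together with a sink for the BigQuery destination type, might look like the following. The dataset `google_bigquery_dataset.logs` and the sink name `run-to-bigquery` are hypothetical and not defined in this chapter; `google_storage_bucket.logs` is assumed to exist elsewhere, as in the archive sink.

```hcl
# Let the archive sink's service account create objects in the bucket.
resource "google_storage_bucket_iam_member" "archive_sink_writer" {
  bucket = google_storage_bucket.logs.name
  role   = "roles/storage.objectCreator"
  member = google_logging_project_sink.archive.writer_identity
}

# Hypothetical example: route Cloud Run logs to BigQuery for analysis.
resource "google_logging_project_sink" "analytics" {
  name                   = "run-to-bigquery"
  destination            = "bigquery.googleapis.com/projects/myapp-prd/datasets/${google_bigquery_dataset.logs.dataset_id}"
  filter                 = "resource.type=\"cloud_run_revision\""
  unique_writer_identity = true

  # Write into date-partitioned tables instead of one table per day.
  bigquery_options {
    use_partitioned_tables = true
  }
}

# The BigQuery sink's identity needs write access to the dataset.
resource "google_bigquery_dataset_iam_member" "analytics_sink_writer" {
  dataset_id = google_bigquery_dataset.logs.dataset_id
  role       = "roles/bigquery.dataEditor"
  member     = google_logging_project_sink.analytics.writer_identity
}
```

Each destination type needs its own role on the destination resource, so the grant is usually kept next to the sink it belongs to.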
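Log retention and cost
The _Default bucket in Cloud Logging retains logs for 30 days by default. Extending the period is billed by stored volume multiplied by the additional months of retention.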
resource "google_logging_project_bucket_config" "default" {
  project        = "myapp-prd"
  location       = "global"
  retention_days = 30 # adjust as needed, 1-3650 days
  bucket_id      = "_Default"
}
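_Required cannot be changed
The _Required bucket, which stores Cloud Audit Logs such as Admin Activity and System Event logs, has a fixed 400-day retention that cannot be changed. Cloud Logging does not charge for ingesting or storing the logs in this bucket.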
Alerts (AlertPolicy)
resource "google_monitoring_alert_policy" "api_5xx" {
  display_name = "API 5xx error rate"
  combiner     = "OR"

  conditions {
    # ALIGN_RATE yields requests per second, so the threshold is an
    # absolute 5xx rate per time series, not a percentage of traffic.
    display_name = "5xx > 10 req/s for 5m"

    condition_threshold {
      filter          = "metric.type=\"run.googleapis.com/request_count\" resource.type=\"cloud_run_revision\" metric.label.response_code_class=\"5xx\""
      duration        = "300s"
      comparison      = "COMPARISON_GT"
      threshold_value = 10

      aggregations {
        alignment_period   = "60s"
        per_series_aligner = "ALIGN_RATE"
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.id]

  alert_strategy {
    auto_close = "1800s" # close the incident after 30 minutes without data
  }

  documentation {
    content   = "The API 5xx error rate is elevated. Check the logs."
    mime_type = "text/markdown"
  }
}
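The policy above thresholds on `request_count` with `ALIGN_RATE`, which compares an absolute request rate rather than a share of traffic. If a true percentage is wanted, one option is a ratio condition using `denominator_filter` and `denominator_aggregations`. The following is a sketch under assumptions: the resource name `api_5xx_ratio`, the 5-minute window, and grouping by `service_name` are illustrative choices, not settings from this chapter.

```hcl
resource "google_monitoring_alert_policy" "api_5xx_ratio" {
  display_name = "API 5xx error ratio"
  combiner     = "OR"

  conditions {
    display_name = "5xx ratio > 5% over 5m"

    condition_threshold {
      # Numerator: 5xx responses, summed per Cloud Run service.
      filter = "metric.type=\"run.googleapis.com/request_count\" AND resource.type=\"cloud_run_revision\" AND metric.label.response_code_class=\"5xx\""

      aggregations {
        alignment_period     = "300s"
        per_series_aligner   = "ALIGN_RATE"
        cross_series_reducer = "REDUCE_SUM"
        group_by_fields      = ["resource.label.service_name"]
      }

      # Denominator: all responses, aggregated the same way.
      denominator_filter = "metric.type=\"run.googleapis.com/request_count\" AND resource.type=\"cloud_run_revision\""

      denominator_aggregations {
        alignment_period     = "300s"
        per_series_aligner   = "ALIGN_RATE"
        cross_series_reducer = "REDUCE_SUM"
        group_by_fields      = ["resource.label.service_name"]
      }

      comparison      = "COMPARISON_GT"
      threshold_value = 0.05 # 5% of requests
      duration        = "0s" # the 300s alignment window already smooths spikes
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.id]
}
```

The numerator and denominator use the same alignment period and grouping so that each service's 5xx rate is divided by that same service's total rate.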
# Log-based alert: count ERROR entries with a user-defined metric
resource "google_logging_metric" "errors" {
  name   = "app-errors"
  filter = "resource.type=\"cloud_run_revision\" severity=\"ERROR\""

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}

resource "google_monitoring_alert_policy" "error_spike" {
  display_name = "Error log spike"
  combiner     = "OR"

  conditions {
    display_name = "errors > 50 in 5m"

    condition_threshold {
      # An alert filter needs a resource type as well as the metric type.
      filter          = "metric.type=\"logging.googleapis.com/user/${google_logging_metric.errors.name}\" AND resource.type=\"cloud_run_revision\""
      duration        = "0s"
      comparison      = "COMPARISON_GT"
      threshold_value = 50

      aggregations {
        # Sum the counter over each 5-minute window so the threshold
        # is a count of errors, not a per-second rate.
        alignment_period   = "300s"
        per_series_aligner = "ALIGN_SUM"
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.email.id]
}
Notification channels
resource "google_monitoring_notification_channel" "email" {
  display_name = "Ops Email"
  type         = "email"

  labels = {
    email_address = "ops@example.com"
  }
}

resource "google_monitoring_notification_channel" "slack" {
  display_name = "Ops Slack"
  type         = "slack"

  labels = {
    channel_name = "#alerts"
  }

  sensitive_labels {
    auth_token = var.slack_token
  }
}

resource "google_monitoring_notification_channel" "pagerduty" {
  display_name = "Oncall PagerDuty"
  type         = "pagerduty"

  sensitive_labels {
    service_key = var.pd_integration_key
  }
}
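The Slack and PagerDuty channels read their secrets from `var.slack_token` and `var.pd_integration_key`, which are not declared in this section. A minimal sketch of those declarations, assuming both are plain strings, could be:

```hcl
variable "slack_token" {
  description = "Slack auth token used by the Ops Slack notification channel"
  type        = string
  sensitive   = true # redacted in plan/apply output
}

variable "pd_integration_key" {
  description = "PagerDuty integration key used by the on-call channel"
  type        = string
  sensitive   = true
}
```

The values can be supplied outside version control, for example through the `TF_VAR_slack_token` and `TF_VAR_pd_integration_key` environment variables. Note that `sensitive = true` only hides the values in CLI output; they are still written to the Terraform state.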
Uptime Check
Checks liveness by sending periodic HTTP requests to a URL from outside your infrastructure. Roughly equivalent to AWS Route 53 health checks.
resource "google_monitoring_uptime_check_config" "api" {
  display_name = "api-uptime"
  timeout      = "10s"
  period       = "60s"

  http_check {
    path         = "/health"
    port         = 443
    use_ssl      = true
    validate_ssl = true

    accepted_response_status_codes {
      status_class = "STATUS_CLASS_2XX"
    }
  }

  monitored_resource {
    type = "uptime_url"
    labels = {
      project_id = "myapp-prd"
      host       = "api.myapp.com"
    }
  }

  selected_regions = ["ASIA_PACIFIC", "USA"]
}
resource "google_monitoring_alert_policy" "api_uptime" {
  display_name = "API down"
  combiner     = "OR"

  conditions {
    display_name = "Uptime check failing"

    condition_threshold {
      filter   = "metric.type=\"monitoring.googleapis.com/uptime_check/check_passed\" resource.type=\"uptime_url\" metric.label.check_id=\"${google_monitoring_uptime_check_config.api.uptime_check_id}\""
      duration = "60s"

      # REDUCE_COUNT_FALSE counts the checker locations that failed,
      # so fire when more than one location reports a failure.
      comparison      = "COMPARISON_GT"
      threshold_value = 1

      aggregations {
        alignment_period     = "60s"
        per_series_aligner   = "ALIGN_NEXT_OLDER"
        cross_series_reducer = "REDUCE_COUNT_FALSE"
        group_by_fields      = ["resource.label.host"]
      }
    }
  }

  notification_channels = [google_monitoring_notification_channel.pagerduty.id]
}
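The minimum set
① Set the _Default bucket retention to 30-90 days.
② Create 2-3 alerts on key metrics for Cloud Run, Cloud SQL, and similar services.
③ Send notifications to Email and Slack.
④ Add an Uptime Check for each public API.
With these in place you have the visibility needed for basic operations.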
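As one illustration of a key-metric alert for Cloud SQL, the sketch below fires when CPU utilization stays above 80% for five minutes and notifies both Email and Slack. The resource name `sql_cpu`, the 80% threshold, and the 5-minute duration are assumptions chosen for illustration; tune them to your workload.

```hcl
resource "google_monitoring_alert_policy" "sql_cpu" {
  display_name = "Cloud SQL CPU high"
  combiner     = "OR"

  conditions {
    display_name = "CPU utilization > 80% for 5m"

    condition_threshold {
      filter          = "metric.type=\"cloudsql.googleapis.com/database/cpu/utilization\" AND resource.type=\"cloudsql_database\""
      duration        = "300s"
      comparison      = "COMPARISON_GT"
      threshold_value = 0.8 # the metric is reported as a 0.0-1.0 fraction

      aggregations {
        alignment_period   = "60s"
        per_series_aligner = "ALIGN_MEAN"
      }
    }
  }

  # Reuses the email and slack channels defined in this chapter.
  notification_channels = [
    google_monitoring_notification_channel.email.id,
    google_monitoring_notification_channel.slack.id,
  ]
}
```

Without a cross-series reducer the condition is evaluated per database instance, so each instance opens its own incident.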