1. What is the K8s Scheduler?
2. Installing Terraform for the lab environment
3. What is Fargate?
- Amazon EKS Blueprints for Terraform
4. What is Auto Mode?
- sample-aws-eks-auto-mode
K8S Scheduler
The scheduler is the component that places newly created pods onto suitable nodes in a Kubernetes cluster.
Role:
It assigns pods to appropriate nodes so that cluster resources (CPU, memory, etc.) are used efficiently.
How it works:
The scheduler goes through several steps to decide which node a pod will run on,
taking the pod's requirements, the state of each node, and any constraints into account.
Its operation is divided into two main phases.
① Filtering
Selects the candidate nodes on which the pod is able to run.
Nodes are filtered based on the pod's resource requests, volume types, labels/affinity, and similar criteria.
Typical filtering checks:
Does the node have enough free resources (CPU, memory, etc.)?
Does the node satisfy the pod's affinity/anti-affinity rules?
Are the node's taints covered by the pod's tolerations?
Does the node support the required volume types?
② Scoring
Each node that passed filtering is assigned a priority score,
and the pod is placed on the node with the highest score.
Typical scoring criteria:
Balanced resource usage
The node's current load
How well node affinity/anti-affinity preferences are satisfied
Image locality (whether the container image is already present on the node)
| Concept | Description | Example |
| --- | --- | --- |
| Affinity | Makes a pod prefer certain nodes or other pods, so related pods are co-located or pinned to specific nodes | Place pods of the same application together |
| Anti-Affinity | Makes a pod avoid certain pods or nodes | Spread pods across different nodes for HA |
| Taints | A condition set on a node so that it repels pods | Restrict GPU nodes to GPU-using pods only |
| Tolerations | Allows a pod to be scheduled onto a node that carries a matching taint | A GPU-requesting pod tolerates the taint set on GPU nodes |

(A minimal taint/toleration example follows the table.)
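For example, the GPU scenario from the table can be sketched as follows. The node name, the `gpu=true` taint key, and the `nvidia.com/gpu` resource (which requires the NVIDIA device plugin) are illustrative values, not part of the lab:

```yaml
# Taint a GPU node so ordinary pods are repelled (hypothetical node name and key):
#   kubectl taint nodes gpu-node-1 gpu=true:NoSchedule
# A pod that needs the GPU carries a matching toleration:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    command: ["sleep", "infinity"]
    resources:
      limits:
        nvidia.com/gpu: 1      # requires the NVIDIA device plugin on the node
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```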
Resources and factors the scheduler considers (see the example manifest after this list)
CPU and memory resource requests
Whether the required disk/volume types are supported
NodeSelector and node labels
The node's current resource usage and state
Pod affinity and anti-affinity rules
Taints and tolerations
Installing Terraform for the lab environment
- Terraform is an infrastructure-as-code (IaC) tool.
Windows environment (WSL Ubuntu)
```bash
# Add the HashiCorp APT repository and install Terraform
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform

# Check the Terraform version
terraform version

# Each subcommand supports -help
terraform console -help
terraform init -help

# Basic workflow
terraform init        # initialize the working directory
terraform plan        # preview the resources to be created
terraform apply       # create the resources
terraform state list  # list resources tracked in the state
terraform destroy     # delete the resources
terraform fmt         # fix indentation/formatting
```
Fargate
- EKS (control plane) + Fargate (data plane): a fully serverless, AWS-managed stack
Service architecture diagram
Characteristics
| Item | Description |
| --- | --- |
| Node management | Not required (managed by AWS) |
| Auto scaling | Supported (per pod) |
| Container isolation | Independent environment per pod |
| Volume type restrictions | EFS and emptyDir are available |
| Billing | Based on the CPU/memory used |
| Maximum resources | 16 vCPU, 120 GiB memory |
| DaemonSet support | Not supported |
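The lab below creates its Fargate profiles with Terraform; purely as a reference sketch, the same idea expressed as an eksctl ClusterConfig (cluster name, region, and selectors are illustrative):

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: fargate-demo          # illustrative cluster name
  region: ap-northeast-2
fargateProfiles:
  - name: fp-default
    selectors:
      # Pods created in these namespaces (optionally matching the labels) run on Fargate
      - namespace: default
      - namespace: kube-system
        labels:
          k8s-app: kube-dns
```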
Fargate charges are based on the vCPU and memory consumed by the running containers:
you pay for the resources (CPU, memory) used and for how long they run.
| Resource | Price (per hour) |
| --- | --- |
| vCPU | $0.04956 per vCPU |
| Memory | $0.00544 per GB |
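Assuming the figures above are hourly rates, a rough sizing example: a pod provisioned at 0.5 vCPU / 1 GB that runs for 24 hours costs about 0.5 × $0.04956 × 24 + 1 × $0.00544 × 24 ≈ $0.59 + $0.13 ≈ $0.73 (actual rates vary by region).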
| Layer | Technology / Component | Managed by |
| --- | --- | --- |
| Container runtime | Containers (Docker/containerd) | AWS |
| Virtualization | Firecracker MicroVM | AWS |
| Hypervisor | AWS Nitro system | AWS |
| Physical server | AWS dedicated hardware (EC2 Nitro-based) | AWS |
Amazon EKS Blueprints for Terraform
Amazon EKS Blueprints for Terraform is an open-source framework designed to simplify and automate the configuration and operation of Amazon Elastic Kubernetes Service (EKS). It provides Terraform modules and example configurations so that users can easily deploy and manage Kubernetes clusters following standardized best practices.
- Standardized Kubernetes environments
Terraform modules pre-configured with best practices and proven patterns enable consistent environments.
- Modular architecture
Essential building blocks such as networking, security, logging, monitoring, CI/CD, and GitOps are delivered as independent modules that can be combined freely as needed.
- Extensibility and customization
Beyond the pre-configured add-ons, users can easily add or modify open-source tools (e.g., ArgoCD, Prometheus, Grafana, Fluent Bit).
- Stronger security and governance
Clearly defined AWS IAM roles and policies make cluster access control and RBAC (Role-Based Access Control) easy to manage.
- GitOps and CI/CD integration
Integration with GitOps tools such as ArgoCD and with CI/CD pipelines enables code-driven continuous deployment and management.
Deploying EKS and a Fargate profile with Amazon EKS Blueprints for Terraform
```bash
# Get the code
git clone https://github.com/aws-ia/terraform-aws-eks-blueprints
tree terraform-aws-eks-blueprints/patterns
cd terraform-aws-eks-blueprints/patterns/fargate-serverless

# Initialize
terraform init
tree .terraform
cat .terraform/modules/modules.json | jq
tree .terraform/providers/registry.terraform.io/hashicorp -L 2

# Plan
terraform plan

# Deploy: EKS, add-ons, Fargate profile - takes about 13 minutes
terraform apply -auto-approve

# Check after deployment completes
terraform state list

# EKS credentials
$(terraform output -raw configure_kubectl)
# aws eks --region ap-northeast-2 update-kubeconfig --name fargate-serverless
cat ~/.kube/config

# Change the kubectl context
kubectl ctx
kubectl config rename-context "arn:aws:eks:ap-northeast-2:$(aws sts get-caller-identity --query 'Account' --output text):cluster/fargate-serverless" "fargate-lab"

# Check k8s nodes and pods
kubectl ns default
kubectl cluster-info
kubectl get node
kubectl get pod -A

# Detailed state
terraform show
```
Lab basics
# k8s api service 확인 : ENDPOINTS 의 IP는 EKS Owned-ENI 2개 root@DESKTOP-1BA59FT:~/terraform-aws-eks-blueprints/patterns/fargate-serverless# kubectl get svc,ep NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 17m NAME ENDPOINTS AGE endpoints/kubernetes 10.0.18.202:443,10.0.35.56:443 17m # node 확인 : 노드(Micro VM) 4대 ( v1.30.8-eks-2d5f260 ) kubectl get csr kubectl get node -owide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME fargate-ip-10-0-24-240.us-west-2.compute.internal Ready <none> 8m47s v1.30.8-eks-2d5f260 10.0.24.240 <none> Amazon Linux 2 5.10.234-225.910.amzn2.x86_64 containerd://1.7.25 fargate-ip-10-0-34-95.us-west-2.compute.internal Ready <none> 8m52s v1.30.8-eks-2d5f260 10.0.34.95 <none> Amazon Linux 2 5.10.234-225.910.amzn2.x86_64 containerd://1.7.25 fargate-ip-10-0-37-91.us-west-2.compute.internal Ready <none> 9m9s v1.30.8-eks-2d5f260 10.0.37.91 <none> Amazon Linux 2 5.10.234-225.910.amzn2.x86_64 containerd://1.7.25 fargate-ip-10-0-4-32.us-west-2.compute.internal Ready <none> 9m13s v1.30.8-eks-2d5f260 10.0.4.32 <none> Amazon Linux 2 5.10.234-225.910.amzn2.x86_64 containerd://1.7.25 kubectl describe node | grep eks.amazonaws.com/compute-type Labels: eks.amazonaws.com/compute-type=fargate Taints: eks.amazonaws.com/compute-type=fargate:NoSchedule ... # 파드 확인 : 파드의 IP와 노드의 IP가 같다! kubectl get pdb -n kube-system NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE aws-load-balancer-controller N/A 1 1 10m coredns N/A 1 1 16m kubectl get pod -A -owide NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES app-2048 app-2048-6c45d649c8-2c2x4 0/1 Pending 0 14m <none> <none> <none> <none> app-2048 app-2048-6c45d649c8-84p4s 0/1 Pending 0 14m <none> <none> <none> <none> app-2048 app-2048-6c45d649c8-cvgw9 0/1 Pending 0 14m <none> <none> <none> <none> kube-system aws-load-balancer-controller-7cc6cd8ddd-29lc5 1/1 Running 0 10m 10.0.24.240 fargate-ip-10-0-24-240.us-west-2.compute.internal <none> <none> kube-system aws-load-balancer-controller-7cc6cd8ddd-r57wj 1/1 Running 0 10m 10.0.34.95 fargate-ip-10-0-34-95.us-west-2.compute.internal <none> <none> kube-system coredns-69fd949db7-pjp9h 1/1 Running 0 10m 10.0.37.91 fargate-ip-10-0-37-91.us-west-2.compute.internal <none> <none> kube-system coredns-69fd949db7-szr79 1/1 Running 0 10m 10.0.4.32 fargate-ip-10-0-4-32.us-west-2.compute.internal <none> <none> # aws-load-balancer-webhook-service , eks-extension-metrics-api? kubectl get svc,ep -n kube-system NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/aws-load-balancer-webhook-service ClusterIP 172.20.72.191 <none> 443/TCP 34m service/eks-extension-metrics-api ClusterIP 172.20.173.28 <none> 443/TCP 42m # eks-extension-metrics-api? kubectl get apiservices.apiregistration.k8s.io | grep eks v1.metrics.eks.amazonaws.com kube-system/eks-extension-metrics-api True 53m kubectl get --raw "/apis/metrics.eks.amazonaws.com" | jq kubectl get --raw "/apis/metrics.eks.amazonaws.com/v1" | jq # configmap 확인 kubectl get cm -n kube-system ... # aws-auth 보다 우선해서 IAM access entry 가 있음을 참고. # 기본 관리노드 보다 system:node-proxier 그룹이 추가되어 있음. # fargate profile 이 2개인데, 그 profile 갯수만큼 있음. kubectl get cm -n kube-system aws-auth -o yaml ... 
data: mapRoles: | - groups: - system:bootstrappers - system:nodes - system:node-proxier rolearn: arn:aws:iam::824766816795:role/kube-system-20250320115108272100000010 username: system:node:{{SessionName}} - groups: - system:bootstrappers - system:nodes - system:node-proxier rolearn: arn:aws:iam::824766816795:role/app_wildcard-2025032011510827100000000f username: system:node:{{SessionName}} kind: ConfigMap ... # kubectl rbac-tool lookup system:node-proxier SUBJECT | SUBJECT TYPE | SCOPE | NAMESPACE | ROLE | BINDING ----------------------+--------------+-------------+-----------+---------------------+------------------------- system:node-proxier | Group | ClusterRole | | system:node-proxier | eks:kube-proxy-fargate kubectl rolesum -k Group system:node-proxier ... Policies: • [CRB] */eks:kube-proxy-fargate ⟶ [CR] */system:node-proxier Resource Name Exclude Verbs G L W C U P D DC endpoints [*] [-] [-] ✖ ✔ ✔ ✖ ✖ ✖ ✖ ✖ endpointslices.discovery.k8s.io [*] [-] [-] ✖ ✔ ✔ ✖ ✖ ✖ ✖ ✖ events.[,events.k8s.io] [*] [-] [-] ✖ ✖ ✖ ✔ ✔ ✔ ✖ ✖ nodes [*] [-] [-] ✔ ✔ ✔ ✖ ✖ ✖ ✖ ✖ services [*] [-] [-] ✖ ✔ ✔ ✖ ✖ ✖ ✖ ✖ # kubectl get cm -n kube-system amazon-vpc-cni -o yaml apiVersion: v1 data: branch-eni-cooldown: "60" minimum-ip-target: "3" warm-ip-target: "1" warm-prefix-target: "0" ... # coredns 설정 내용 kubectl get cm -n kube-system coredns -o yaml # 인증서 작성되어 있음 : client-ca-file , requestheader-client-ca-file kubectl get cm -n kube-system extension-apiserver-authentication -o yaml # kubectl get cm -n kube-system kube-proxy -o yaml kubectl get cm -n kube-system kube-proxy-config -o yaml apiVersion: v1 data: config: |- apiVersion: kubeproxy.config.k8s.io/v1alpha1 bindAddress: 0.0.0.0 clientConnection: acceptContentTypes: "" burst: 10 contentType: application/vnd.kubernetes.protobuf kubeconfig: /var/lib/kube-proxy/kubeconfig qps: 5 clusterCIDR: "" configSyncPeriod: 15m0s conntrack: maxPerCore: 32768 min: 131072 tcpCloseWaitTimeout: 1h0m0s tcpEstablishedTimeout: 24h0m0s enableProfiling: false healthzBindAddress: 0.0.0.0:10256 hostnameOverride: "" iptables: masqueradeAll: false masqueradeBit: 14 minSyncPeriod: 0s syncPeriod: 30s ipvs: excludeCIDRs: null minSyncPeriod: 0s scheduler: "" syncPeriod: 30s kind: KubeProxyConfiguration metricsBindAddress: 0.0.0.0:10249 mode: "iptables" nodePortAddresses: null oomScoreAdj: -998 portRange: "" |
Check the coredns pod details: schedulerName: fargate-scheduler
# coredns 파드 상세 정보 확인 kubectl get pod -n kube-system -l k8s-app=kube-dns -o yaml ... spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: kubernetes.io/os operator: In values: - linux - key: kubernetes.io/arch operator: In values: - amd64 - arm64 podAntiAffinity: preferredDuringSchedulingIgnoredDuringExecution: - podAffinityTerm: labelSelector: matchExpressions: - key: k8s-app operator: In values: - kube-dns topologyKey: kubernetes.io/hostname weight: 100 ... resources: limits: cpu: 250m memory: 256M requests: cpu: 250m memory: 256M ... securityContext: allowPrivilegeEscalation: false capabilities: add: - NET_BIND_SERVICE drop: - ALL readOnlyRootFilesystem: true ... dnsPolicy: Default enableServiceLinks: true nodeName: fargate-ip-10-10-34-186.ap-northeast-2.compute.internal preemptionPolicy: PreemptLowerPriority priority: 2000001000 priorityClassName: system-node-critical restartPolicy: Always schedulerName: fargate-scheduler securityContext: {} serviceAccount: coredns serviceAccountName: coredns terminationGracePeriodSeconds: 30 tolerations: - effect: NoSchedule key: node-role.kubernetes.io/control-plane - key: CriticalAddonsOnly operator: Exists - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 topologySpreadConstraints: - labelSelector: matchLabels: k8s-app: kube-dns maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: ScheduleAnyway ... qosClass: Guaranteed |
Check EC2 instances (none exist)
Installing kube-ops-view on Fargate
```bash
# Deploy with Helm
helm repo add geek-cookbook https://geek-cookbook.github.io/charts/
helm install kube-ops-view geek-cookbook/kube-ops-view --version 1.2.2 --set env.TZ="Asia/Seoul" --namespace kube-system

# Port forwarding
kubectl port-forward deployment/kube-ops-view -n kube-system 8080:8080 &

# Access URLs: 1x, 1.5x, and 3x scale
echo -e "KUBE-OPS-VIEW URL = http://localhost:8080"
echo -e "KUBE-OPS-VIEW URL = http://localhost:8080/#scale=1.5"
echo -e "KUBE-OPS-VIEW URL = http://localhost:8080/#scale=3"
```
# node 확인 : 노드(Micro VM) kubectl get csr kubectl get node -owide kubectl describe node | grep eks.amazonaws.com/compute-type # kube-ops-view 디플로이먼트/파드 상세 정보 확인 kubectl get pod -n kube-system kubectl get pod -n kube-system -o jsonpath='{.items[0].metadata.annotations.CapacityProvisioned}' kubectl get pod -n kube-system -l app.kubernetes.io/instance=kube-ops-view -o jsonpath='{.items[0].metadata.annotations.CapacityProvisioned}' 0.25vCPU 0.5GB # 디플로이먼트 상세 정보 kubectl get deploy -n kube-system kube-ops-view -o yaml ... template: ... spec: automountServiceAccountToken: true containers: - env: - name: TZ value: Asia/Seoul image: hjacobs/kube-ops-view:20.4.0 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 periodSeconds: 10 successThreshold: 1 tcpSocket: port: 8080 timeoutSeconds: 1 name: kube-ops-view ports: - containerPort: 8080 name: http protocol: TCP readinessProbe: failureThreshold: 3 periodSeconds: 10 successThreshold: 1 tcpSocket: port: 8080 timeoutSeconds: 1 resources: {} securityContext: readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 startupProbe: failureThreshold: 30 periodSeconds: 5 successThreshold: 1 tcpSocket: port: 8080 timeoutSeconds: 1 terminationMessagePath: /dev/termination-log terminationMessagePolicy: File dnsPolicy: ClusterFirst enableServiceLinks: true restartPolicy: Always schedulerName: default-scheduler securityContext: {} serviceAccount: kube-ops-view serviceAccountName: kube-ops-view terminationGracePeriodSeconds: 30 ... # 파드 상세 정보 : admission control 이 동작했음을 알 수 있음 kubectl get pod -n kube-system -l app.kubernetes.io/instance=kube-ops-view -o yaml ... ![]() # kubectl describe pod -n kube-system -l app.kubernetes.io/instance=kube-ops-view | grep Events: -A10 Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal LoggingEnabled 3m22s fargate-scheduler Successfully enabled logging for pod Normal Scheduled 2m40s fargate-scheduler Successfully assigned kube-system/kube-ops-view-796947d6dc-75qwb to fargate-ip-10-0-35-75.us-west-2.compute.internal Normal Pulling 2m39s kubelet Pulling image "hjacobs/kube-ops-view:20.4.0" Normal Pulled 2m31s kubelet Successfully pulled image "hjacobs/kube-ops-view:20.4.0" in 8.103s (8.103s including waiting). Image size: 81086356 bytes. Normal Created 2m31s kubelet Created container kube-ops-view Normal Started 2m31s kubelet Started container kube-ops-viewl |
Deploying netshoot (pod) on Fargate
| vCPU value | Memory value |
| --- | --- |
| .25 vCPU | 0.5 GB, 1 GB, 2 GB |
| .5 vCPU | 1 GB, 2 GB, 3 GB, 4 GB |
| 1 vCPU | 2 GB, 3 GB, 4 GB, 5 GB, 6 GB, 7 GB, 8 GB |
| 2 vCPU | Between 4 GB and 16 GB in 1-GB increments |
| 4 vCPU | Between 8 GB and 30 GB in 1-GB increments |
| 8 vCPU | Between 16 GB and 60 GB in 4-GB increments |
| 16 vCPU | Between 32 GB and 120 GB in 8-GB increments |
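Per the AWS documentation, Fargate sums the resource requests of all containers in a pod, adds 256 MB of memory for the required Kubernetes components (kubelet, kube-proxy, containerd), and then rounds up to the smallest matching vCPU/memory combination in the table above. That is why the netshoot deployment below, which requests 500m CPU / 500Mi memory, ends up provisioned as 0.5 vCPU / 1 GB.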
# 네임스페이스 생성 kubectl create ns study-aews # 테스트용 파드 netshoot 디플로이먼트 생성 : 0.5vCPU 1GB 할당되어, 아래 Limit 값은 의미가 없음. 배포 시 대략 시간 측정해보자! cat <<EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: name: netshoot namespace: study-aews spec: replicas: 1 selector: matchLabels: app: netshoot template: metadata: labels: app: netshoot spec: containers: - name: netshoot image: nicolaka/netshoot command: ["tail"] args: ["-f", "/dev/null"] resources: requests: cpu: 500m memory: 500Mi limits: cpu: 2 memory: 2Gi terminationGracePeriodSeconds: 0 EOF kubectl get events -w --sort-by '.lastTimestamp' # 확인 : 메모리 할당 측정은 어떻게 되었는지? kubectl get pod -n study-aews -o wide kubectl get pod -n study-aews -o jsonpath='{.items[0].metadata.annotations.CapacityProvisioned}' 0.5vCPU 1GB # 디플로이먼트 상세 정보 kubectl get deploy -n study-aews netshoot -o yaml ... template: ... spec: ... schedulerName: default-scheduler securityContext: {} terminationGracePeriodSeconds: 0 ... # 파드 상세 정보 : admission control 이 동작했음을 알 수 있음 kubectl get pod -n study-aews -l app=netshoot -o yaml ... metadata: annotations: CapacityProvisioned: 0.5vCPU 1GB Logging: LoggingEnabled ... preemptionPolicy: PreemptLowerPriority priority: 2000001000 priorityClassName: system-node-critical restartPolicy: Always schedulerName: fargate-scheduler ... qosClass: Burstable # kubectl describe pod -n study-aews -l app=netshoot | grep Events: -A10 # kubectl get mutatingwebhookconfigurations.admissionregistration.k8s.io kubectl describe mutatingwebhookconfigurations 0500-amazon-eks-fargate-mutation.amazonaws.com kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io # 파드 내부에 zsh 접속 후 확인 kubectl exec -it deploy/netshoot -n study-aews -- zsh ----------------------------------------------------- ip -c a cat /etc/resolv.conf curl ipinfo.io/ip # 출력되는 IP는 어떤것? , 어떤 경로를 통해서 인터넷이 되는 걸까? ping -c 1 <다른 파드 IP ex. coredns pod ip> lsblk df -hT / cat /etc/fstab exit ![]() ----------------------------------------------------- |
kubectl apply -f - <<EOF apiVersion: v1 kind: Pod metadata: name: root-shell namespace: study-aews spec: containers: - command: - /bin/cat image: alpine:3 name: root-shell securityContext: privileged: true tty: true stdin: true volumeMounts: - mountPath: /host name: hostroot hostNetwork: true hostPID: true hostIPC: true tolerations: - effect: NoSchedule operator: Exists - effect: NoExecute operator: Exists volumes: - hostPath: path: / name: hostroot EOF # kubectl get pod -n study-aews root-shell kubectl describe pod -n study-aews root-shell | grep Events: -A 10 ![]() # 출력 메시지 # Pod not supported on Fargate: fields not supported: # HostNetwork, HostPID, HostIPC, volumes not supported: # hostroot is of an unsupported volume Type, invalid SecurityContext fields: Privileged # 삭제 kubectl delete pod -n study-aews root-shell # (참고) fargate가 아닌 권한이 충분한 곳에서 실행 시 : 아래 처럼 호스트 네임스페이스로 진입 가능! kubectl -n kube-system exec -it root-shell -- chroot /host /bin/bash root@myk8s-control-plane:/# id uid=0(root) gid=0(root) groups=0(root),1(daemon),2(bin),3(sys),4(adm),6(disk),10(uucp),11,20(dialout),26(tape),27(sudo) |
AWS ALB(Ingress)
# 게임 디플로이먼트와 Service, Ingress 배포 cat <<EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: namespace: study-aews name: deployment-2048 spec: selector: matchLabels: app.kubernetes.io/name: app-2048 replicas: 2 template: metadata: labels: app.kubernetes.io/name: app-2048 spec: containers: - image: public.ecr.aws/l6m2t8p7/docker-2048:latest imagePullPolicy: Always name: app-2048 ports: - containerPort: 80 --- apiVersion: v1 kind: Service metadata: namespace: study-aews name: service-2048 spec: ports: - port: 80 targetPort: 80 protocol: TCP type: ClusterIP selector: app.kubernetes.io/name: app-2048 --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: namespace: study-aews name: ingress-2048 annotations: alb.ingress.kubernetes.io/scheme: internet-facing alb.ingress.kubernetes.io/target-type: ip spec: ingressClassName: alb rules: - http: paths: - path: / pathType: Prefix backend: service: name: service-2048 port: number: 80 EOF # 모니터링 watch -d kubectl get pod,ingress,svc,ep,endpointslices -n study-aews # 생성 확인 kubectl get-all -n study-aews kubectl get ingress,svc,ep,pod -n study-aews kubectl get targetgroupbindings -n study-aews # Ingress 확인 kubectl describe ingress -n study-aews ingress-2048 kubectl get ingress -n study-aews ingress-2048 -o jsonpath="{.status.loadBalancer.ingress[*].hostname}{'\n'}" # 게임 접속 : ALB 주소로 웹 접속 kubectl get ingress -n study-aews ingress-2048 -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' | awk '{ print "Game URL = http://"$1 }' ![]() # 파드 IP 확인 kubectl get pod -n study-aews -owide # 파드 증가 kubectl scale deployment -n study-aews deployment-2048 --replicas 4 # 게임 실습 리소스 삭제 kubectl delete ingress ingress-2048 -n study-aews kubectl delete svc service-2048 -n study-aews && kubectl delete deploy deployment-2048 -n study-aews |
fargate job
```bash
# Create two jobs; busybox1 is cleaned up by the TTL controller, busybox2 is not
cat <<EOF | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: busybox1
  namespace: study-aews
spec:
  template:
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sh", "-c", "sleep 10"]
      restartPolicy: Never
  ttlSecondsAfterFinished: 60  # <-- TTL controller
---
apiVersion: batch/v1
kind: Job
metadata:
  name: busybox2
  namespace: study-aews
spec:
  template:
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["/bin/sh", "-c", "sleep 10"]
      restartPolicy: Never
EOF

# Watch the jobs and pods
kubectl get job,pod -n study-aews
kubectl get job -n study-aews -w
kubectl get pod -n study-aews -w
kubectl get job,pod -n study-aews

# Delete the job without a TTL
kubectl delete job busybox2 -n study-aews

# kubectl create ns study-aews
kubectl get job,pod -n study-aews
```
fargate logging
Amazon EKS on Fargate offers a built-in log router based on Fluent Bit. In other words, you don't explicitly run a Fluent Bit container as a sidecar; Amazon runs it for you. All you have to do is configure the log router.
- Configuration is done through a dedicated ConfigMap that must meet the following criteria:
- Named aws-logging
- Created in a dedicated namespace called aws-observability
- Must not exceed 5300 characters
- Once the ConfigMap is created, Amazon EKS on Fargate detects it automatically and configures the log router. Fargate uses AWS for Fluent Bit, an upstream-compliant distribution of Fluent Bit managed by AWS. For details, see AWS for Fluent Bit on GitHub - Docs
- With the log router you can use a variety of AWS services for log analytics and storage. You can stream logs from Fargate directly to Amazon CloudWatch or Amazon OpenSearch Service, and via Amazon Data Firehose to destinations such as Amazon S3, Amazon Kinesis Data Streams, and partner tools.
- Prerequisite: an existing Fargate profile that specifies an existing Kubernetes namespace to which you deploy Fargate pods.
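A minimal sketch of such a ConfigMap that ships container logs to CloudWatch (the region and log group name are placeholders; the ConfigMap actually generated by the lab's Terraform appears later in this section):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-logging            # must be exactly this name
  namespace: aws-observability # must be this dedicated namespace
data:
  output.conf: |
    [OUTPUT]
        Name cloudwatch_logs
        Match *
        region ap-northeast-2
        log_group_name /eks/fargate-demo/logs    # placeholder log group
        log_stream_prefix fargate-logs-
        auto_create_group true
```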
GitHub - aws/aws-for-fluent-bit: The source of the amazon/aws-for-fluent-bit container image
```bash
# Deploy an nginx sample app and Service
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: study-aews
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: nginx:latest
          name: nginx
          ports:
            - containerPort: 80
              name: http
          resources:
            requests:
              cpu: 500m
              memory: 500Mi
            limits:
              cpu: 2
              memory: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  namespace: study-aews
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: ClusterIP
EOF

# Check
kubectl get pod -n study-aews -l app=nginx
kubectl describe pod -n study-aews -l app=nginx

# Repeated requests
kubectl exec -it deploy/netshoot -n study-aews -- curl sample-app | grep title
while true; do kubectl exec -it deploy/netshoot -n study-aews -- curl sample-app | grep title; sleep 1; echo ; date; done;

# Check the logs
kubectl stern -n study-aews -l app=nginx
```
# main.tf ... # Enable Fargate logging this may generate a large ammount of logs, disable it if not explicitly required enable_fargate_fluentbit = true fargate_fluentbit = { flb_log_cw = true } ... # aws-observability라는 이름의 전용 네임스페이스 확인 kubectl get ns --show-labels ![]() # Fluent Conf 데이터 값이 포함된 ConfigMap : 컨테이너 로그를 목적지로 배송 설정 ## Amazon EKS Fargate 로깅은 ConfigMap의 동적 구성을 지원하지 않습니다. ## ConfigMap에 대한 모든 변경 사항은 새 포드에만 적용됩니다. 기존 포드에는 변경 사항이 적용되지 않습니다. kubectl get cm -n aws-observability kubectl get cm -n aws-observability aws-logging -o yaml data: filters.conf: | [FILTER] Name parser Match * Key_name log Parser crio [FILTER] Name kubernetes Match kube.* Merge_Log On Keep_Log Off Buffer_Size 0 Kube_Meta_Cache_TTL 300s flb_log_cw: "true" # Ships Fluent Bit process logs to CloudWatch. output.conf: |+ [OUTPUT] Name cloudwatch Match kube.* region ap-northeast-2 log_group_name /fargate-serverless/fargate-fluentbit-logs2025031600585521800000000c log_stream_prefix fargate-logs- auto_create_group true [OUTPUT] Name cloudwatch_logs Match * region ap-northeast-2 log_group_name /fargate-serverless/fargate-fluentbit-logs2025031600585521800000000c log_stream_prefix fargate-logs-fluent-bit- auto_create_group true parsers.conf: | [PARSER] Name crio Format Regex Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$ Time_Key time Time_Format %Y-%m-%dT%H:%M:%S.%L%z Time_Keep On |
```bash
# Destroy with Terraform. If the VPC fails to delete, remove it manually in the AWS console
# (force-delete any leftover ENIs first).
terraform destroy -auto-approve

# Confirm the VPC is gone
aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml

# Remove kubeconfig
rm -rf ~/.kube/config
```
Auto-mode
| Item | Description |
| --- | --- |
| Definition | A fully automated compute-management mode for EKS in which AWS provisions and operates nodes for you (built on Karpenter-based node pools rather than Managed Node Groups and Cluster Autoscaler) |
| Features | - Automatic node provisioning - Automatic node upgrades and patching - Automatic scale-out/in - Automatic Kubernetes version updates |
| Node management | AWS manages nodes automatically, from creation to termination |
| Configuration | - Can be enabled with the eksctl CLI or the AWS Management Console - Detailed settings via YAML |
| Autoscaling | Node count is adjusted automatically as pod demand grows or shrinks |
| Pros | - Lower operational burden - Cost optimization (run only the nodes you need) - Automatic updates improve security posture |
| Cons / caveats | - Strong dependency on AWS, so fine-grained node control is limited - Self-managed nodes fit better where detailed control is required |
| Recommended use cases | - When operational efficiency is the priority - When sharp workload fluctuations are expected |
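In practice, a workload opts into Auto Mode compute with the eks.amazonaws.com/compute-type: auto node selector, which the sample repository used below also relies on. A minimal sketch (the Deployment name and request size are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: automode-demo            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: automode-demo
  template:
    metadata:
      labels:
        app: automode-demo
    spec:
      nodeSelector:
        eks.amazonaws.com/compute-type: auto   # schedule only onto EKS Auto Mode nodes
      containers:
      - name: app
        image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
        resources:
          requests:
            cpu: 250m
```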
Hands-on (sample-aws-eks-auto-mode)
https://github.com/aws-samples/sample-aws-eks-auto-mode
# Get the code : 배포 코드에 addon 내용이 없음 git clone https://github.com/aws-samples/sample-aws-eks-auto-mode.git cd sample-aws-eks-auto-mode/terraform module "eks" { source = "terraform-aws-modules/eks/aws" version = "~> 20.24" cluster_name = var.name cluster_version = var.eks_cluster_version # Give the Terraform identity admin access to the cluster # which will allow it to deploy resources into the cluster enable_cluster_creator_admin_permissions = true cluster_endpoint_public_access = true vpc_id = module.vpc.vpc_id subnet_ids = module.vpc.private_subnets cluster_compute_config = { enabled = true node_pools = ["general-purpose"] } tags = local.tags # Initialize and apply Terraform : 9:19~ terraform init terraform plan terraform apply -auto-approve # Configure kubectl cat setup.tf ls -l ../nodepools $(terraform output -raw configure_kubectl) # kubectl context 변경 kubectl ctx aws eks update-kubeconfig --name automode-cluster --region us-west-2 kubectl ns default # 아래 IP의 ENI 찾아보자 kubectl get svc,ep NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 15m NAME ENDPOINTS AGE endpoints/kubernetes 10.0.2.0:443,10.0.35.204:443 15m # terraform state list ![]() terraform show terraform state show 'module.eks.aws_eks_cluster.this[0]' ... compute_config { enabled = true node_pools = [ "general-purpose", ] node_role_arn = "arn:aws:iam::911283464785:role/automode-cluster-eks-auto-20250316042752605600000003" } ... |
# kubectl get crd NAME CREATED AT cninodes.eks.amazonaws.com 2025-03-21T12:55:32Z cninodes.vpcresources.k8s.aws 2025-03-21T12:52:04Z ingressclassparams.eks.amazonaws.com 2025-03-21T12:55:33Z nodeclaims.karpenter.sh 2025-03-21T12:55:54Z nodeclasses.eks.amazonaws.com 2025-03-21T12:55:54Z nodediagnostics.eks.amazonaws.com 2025-03-21T12:55:54Z nodepools.karpenter.sh 2025-03-21T12:55:54Z policyendpoints.networking.k8s.aws 2025-03-21T12:52:04Z securitygrouppolicies.vpcresources.k8s.aws 2025-03-21T12:52:03Z targetgroupbindings.eks.amazonaws.com 2025-03-21T12:55:34Z kubectl api-resources | grep -i node nodes no v1 false Node cninodes cni,cnis eks.amazonaws.com/v1alpha1 false CNINode nodeclasses eks.amazonaws.com/v1 false NodeClass nodediagnostics eks.amazonaws.com/v1alpha1 false NodeDiagnostic nodeclaims karpenter.sh/v1 false NodeClaim nodepools karpenter.sh/v1 false NodePool runtimeclasses node.k8s.io/v1 false RuntimeClass csinodes storage.k8s.io/v1 false CSINode cninodes cnd vpcresources.k8s.aws/v1alpha1 false CNINode # 노드에 Access가 불가능하니, 분석 지원(CRD)제공 kubectl explain nodediagnostics GROUP: eks.amazonaws.com KIND: NodeDiagnostic VERSION: v1alpha1 DESCRIPTION: The name of the NodeDiagnostic resource is meant to match the name of the node which should perform the diagnostic tasks # kubectl get nodeclasses.eks.amazonaws.com NAME ROLE READY AGE default automode-cluster-eks-auto-20250314121820950800000003 True 29m kubectl get nodeclasses.eks.amazonaws.com -o yaml ... spec: ephemeralStorage: iops: 3000 size: 80Gi throughput: 125 networkPolicy: DefaultAllow networkPolicyEventLogs: Disabled role: automode-cluster-eks-auto-20250314121820950800000003 securityGroupSelectorTerms: - id: sg-05d210218e5817fa1 snatPolicy: Random # ??? subnetSelectorTerms: - id: subnet-0539269140458ced5 - id: subnet-055dc112cdd434066 - id: subnet-0865f60e4a6d8ad5c status: ... instanceProfile: eks-ap-northeast-2-automode-cluster-4905473370491687283 securityGroups: - id: sg-05d210218e5817fa1 name: eks-cluster-sg-automode-cluster-2065126657 subnets: - id: subnet-0539269140458ced5 zone: ap-northeast-2a zoneID: apne2-az1 - id: subnet-055dc112cdd434066 zone: ap-northeast-2b zoneID: apne2-az2 - id: subnet-0865f60e4a6d8ad5c zone: ap-northeast-2c zoneID: apne2-az3 # kubectl get nodepools NAME NODECLASS NODES READY AGE general-purpose default 0 True 13m kubectl get nodepools -o yaml ... spec: disruption: budgets: - nodes: 10% consolidateAfter: 30s consolidationPolicy: WhenEmptyOrUnderutilized template: metadata: {} spec: expireAfter: 336h # 14일 nodeClassRef: group: eks.amazonaws.com kind: NodeClass name: default requirements: - key: karpenter.sh/capacity-type operator: In values: - on-demand - key: eks.amazonaws.com/instance-category operator: In values: - c - m - r - key: eks.amazonaws.com/instance-generation operator: Gt values: - "4" - key: kubernetes.io/arch operator: In values: - amd64 - key: kubernetes.io/os operator: In values: - linux terminationGracePeriod: 24h0m0s ... # kubectl get mutatingwebhookconfiguration NAME WEBHOOKS AGE eks-load-balancing-webhook 2 14m pod-identity-webhook 1 17m vpc-resource-mutating-webhook 1 17m kubectl get validatingwebhookconfiguration NAME WEBHOOKS AGE vpc-resource-validating-webhook 2 17m |
Installing kube-ops-view
# 모니터링 eks-node-viewer --node-sort=eks-node-viewer/node-cpu-usage=dsc --extra-labels eks-node-viewer/node-age watch -d kubectl get node,pod -A ![]() # helm 배포 helm repo add geek-cookbook https://geek-cookbook.github.io/charts/ helm install kube-ops-view geek-cookbook/kube-ops-view --version 1.2.2 --set env.TZ="Asia/Seoul" --namespace kube-system kubectl get events -w --sort-by '.lastTimestamp' # 출력 이벤트 로그 분석해보자 # 확인 kubectl get nodeclaims NAME TYPE CAPACITY ZONE NODE READY AGE general-purpose-9jqb9 c6a.large on-demand us-west-2b Unknown 17s # OS, KERNEL, CRI 확인 NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME i-0aa251960a87d2c37 Ready <none> 18s v1.31.4-eks-0f56d01 10.0.23.147 <none> Bottlerocket (EKS Auto) 2025.3.14 (aws-k8s-1.31) 6.1.129 containerd://1.7.25+bottlerocket # CNI 노드 확인 kubectl get cninodes.eks.amazonaws.com NAME AGE i-0aa251960a87d2c37 4s #[신규 터미널] 포트 포워딩 kubectl port-forward deployment/kube-ops-view -n kube-system 8080:8080 & # 접속 주소 확인 : 각각 1배, 1.5배, 3배 크기 echo -e "KUBE-OPS-VIEW URL = http://localhost:8080" echo -e "KUBE-OPS-VIEW URL = http://localhost:8080/#scale=1.5" echo -e "KUBE-OPS-VIEW URL = http://localhost:8080/#scale=3" open "http://127.0.0.1:8080/#scale=1.5" # macOS |
Verifying Karpenter behavior
# Step 1: Review existing compute resources (optional) kubectl get nodepools NAME NODECLASS NODES READY AGE general-purpose default 1 True 17m # Step 2: Deploy a sample application to the cluster # eks.amazonaws.com/compute-type: auto selector requires the workload be deployed on an Amazon EKS Auto Mode node. cat <<EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: name: inflate spec: replicas: 1 selector: matchLabels: app: inflate template: metadata: labels: app: inflate spec: terminationGracePeriodSeconds: 0 nodeSelector: eks.amazonaws.com/compute-type: auto securityContext: runAsUser: 1000 runAsGroup: 3000 fsGroup: 2000 containers: - name: inflate image: public.ecr.aws/eks-distro/kubernetes/pause:3.7 resources: requests: cpu: 1 securityContext: allowPrivilegeEscalation: false EOF # Step 3: Watch Kubernetes Events kubectl get events -w --sort-by '.lastTimestamp' 2m12s Normal Synced node/i-0aa251960a87d2c37 Node synced successfully 2m8s Normal DisruptionBlocked node/i-0aa251960a87d2c37 Node is nominated for a pending pod 2m8s Normal RegisteredNode node/i-0aa251960a87d2c37 Node i-0aa251960a87d2c37 event: Registered Node i-0aa251960a87d2c37 in Controller 88s Normal Unconsolidatable nodeclaim/general-purpose-9jqb9 Can't replace with a cheaper node 88s Normal Unconsolidatable node/i-0aa251960a87d2c37 Can't replace with a cheaper node 19s Normal Scheduled pod/inflate-b6b45f8d4-6w75w Successfully assigned default/inflate-b6b45f8d4-6w75w to i-0aa251960a87d2c37 19s Normal SuccessfulCreate replicaset/inflate-b6b45f8d4 Created pod: inflate-b6b45f8d4-6w75w 19s Normal ScalingReplicaSet deployment/inflate Scaled up replica set inflate-b6b45f8d4 to 1 18s Normal Pulling pod/inflate-b6b45f8d4-6w75w Pulling image "public.ecr.aws/eks-distro/kubernetes/pause:3.7" 17s Normal Pulled pod/inflate-b6b45f8d4-6w75w Successfully pulled image "public.ecr.aws/eks-distro/kubernetes/pause:3.7" in 1.211s (1.211s including waiting). Image size: 2002080 bytes. 17s Normal Created pod/inflate-b6b45f8d4-6w75w Created container inflate 17s Normal Started pod/inflate-b6b45f8d4-6w75w Started container inflate kubectl get nodes NAME STATUS ROLES AGE VERSION i-0aa251960a87d2c37 Ready <none> 3m15s v1.31.4-eks-0f56d01 |
# custom node pool 생성 : 고객 NodePool : Karpenter 와 키가 다르니 주의! ls ../nodepools cat ../nodepools/graviton-nodepool.yaml kubectl apply -f ../nodepools/graviton-nodepool.yaml --- kind: NodeClass metadata: name: graviton-nodeclass spec: role: automode-cluster-eks-auto-20250321124641255100000002 subnetSelectorTerms: - tags: karpenter.sh/discovery: "automode-demo" securityGroupSelectorTerms: - tags: kubernetes.io/cluster/automode-cluster: owned tags: karpenter.sh/discovery: "automode-demo" --- apiVersion: karpenter.sh/v1 kind: NodePool metadata: name: graviton-nodepool spec: template: spec: nodeClassRef: group: eks.amazonaws.com kind: NodeClass name: graviton-nodeclass requirements: - key: "eks.amazonaws.com/instance-category" operator: In values: ["c", "m", "r"] - key: "eks.amazonaws.com/instance-cpu" operator: In values: ["4", "8", "16", "32"] - key: "kubernetes.io/arch" operator: In values: ["arm64"] taints: - key: "arm64" value: "true" effect: "NoSchedule" # Prevents non-ARM64 pods from scheduling limits: cpu: 1000 disruption: consolidationPolicy: WhenEmpty consolidateAfter: 30s # kubectl get NodeClass NAME ROLE READY AGE default automode-cluster-eks-auto-20250321124641255100000002 True 21m graviton-nodeclass automode-cluster-eks-auto-20250321124641255100000002 True 8s kubectl get NodePool NAME NODECLASS NODES READY AGE general-purpose default 0 True 64m graviton-nodepool graviton-nodeclass 0 True 3m32s # ls ../examples/graviton cat ../examples/graviton/game-2048.yaml ... resources: requests: cpu: "100m" memory: "128Mi" limits: cpu: "200m" memory: "256Mi" automountServiceAccountToken: false tolerations: - key: "arm64" value: "true" effect: "NoSchedule" nodeSelector: kubernetes.io/arch: arm64 ... kubectl apply -f ../examples/graviton/game-2048.yaml # c6g.xlarge : vCPU 4, 8 GiB RAM > 스팟 선택됨! kubectl get nodeclaims NAME TYPE CAPACITY ZONE NODE READY AGE graviton-nodepool-ngp42 c6g.xlarge spot ap-northeast-2b i-0b7ca5072ebf3c969 True 9m48s kubectl get nodeclaims -o yaml ... spec: expireAfter: 336h ... kubectl get cninodes.eks.amazonaws.com kubectl get cninodes.eks.amazonaws.com -o yaml eks-node-viewer --resources cpu,memory kubectl get node -owide kubectl describe node ... Taints: arm64=true:NoSchedule ... Conditions: Type Status LastHeartbeatTime LastTransitionTime Reason Message ---- ------ ----------------- ------------------ ------ ------- MemoryPressure False Fri, 14 Mar 2025 22:53:54 +0900 Fri, 14 Mar 2025 22:37:35 +0900 KubeletHasSufficientMemory kubelet has sufficient memory available DiskPressure False Fri, 14 Mar 2025 22:53:54 +0900 Fri, 14 Mar 2025 22:37:35 +0900 KubeletHasNoDiskPressure kubelet has no disk pressure PIDPressure False Fri, 14 Mar 2025 22:53:54 +0900 Fri, 14 Mar 2025 22:37:35 +0900 KubeletHasSufficientPID kubelet has sufficient PID available Ready True Fri, 14 Mar 2025 22:53:54 +0900 Fri, 14 Mar 2025 22:37:35 +0900 KubeletReady kubelet is posting ready status KernelReady True Fri, 14 Mar 2025 22:57:39 +0900 Fri, 14 Mar 2025 22:37:39 +0900 KernelIsReady Monitoring for the Kernel system is active ContainerRuntimeReady True Fri, 14 Mar 2025 22:57:39 +0900 Fri, 14 Mar 2025 22:37:39 +0900 ContainerRuntimeIsReady Monitoring for the ContainerRuntime system is active StorageReady True Fri, 14 Mar 2025 22:57:39 +0900 Fri, 14 Mar 2025 22:37:39 +0900 DiskIsReady Monitoring for the Disk system is active NetworkingReady True Fri, 14 Mar 2025 22:57:39 +0900 Fri, 14 Mar 2025 22:37:39 +0900 NetworkingIsReady Monitoring for the Networking system is active ... 
System Info: Machine ID: ec272ed9293b6501bd9f665eed7e1627 System UUID: ec272ed9-293b-6501-bd9f-665eed7e1627 Boot ID: 97c24ba6-d319-4686-abf8-bb62c4f22888 Kernel Version: 6.1.129 OS Image: Bottlerocket (EKS Auto) 2025.3.9 (aws-k8s-1.31) Operating System: linux Architecture: arm64 Container Runtime Version: containerd://1.7.25+bottlerocket Kubelet Version: v1.31.4-eks-0f56d01 Kube-Proxy Version: v1.31.4-eks-0f56d01 # kubectl get deploy,pod -n game-2048 -owide ![]() ![]() |
Ingress configuration
# cat ../examples/graviton/2048-ingress.yaml ... apiVersion: eks.amazonaws.com/v1 kind: IngressClassParams metadata: namespace: game-2048 name: params spec: scheme: internet-facing --- apiVersion: networking.k8s.io/v1 kind: IngressClass metadata: namespace: game-2048 labels: app.kubernetes.io/name: LoadBalancerController name: alb spec: controller: eks.amazonaws.com/alb parameters: apiGroup: eks.amazonaws.com kind: IngressClassParams name: params --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: namespace: game-2048 name: ingress-2048 spec: ingressClassName: alb rules: - http: paths: - path: / pathType: Prefix backend: service: name: service-2048 port: number: 80 kubectl apply -f ../examples/graviton/2048-ingress.yaml # kubectl get ingressclass,ingressclassparams,ingress,svc,ep -n game-2048 NAME CONTROLLER PARAMETERS AGE ingressclass.networking.k8s.io/alb eks.amazonaws.com/alb IngressClassParams.eks.amazonaws.com/params 105s NAME GROUP-NAME SCHEME IP-ADDRESS-TYPE AGE ingressclassparams.eks.amazonaws.com/params internet-facing 105s NAME CLASS HOSTS ADDRESS PORTS AGE ingress.networking.k8s.io/ingress-2048 alb * k8s-game2048-ingress2-db993ba6ac-782663732.ap-northeast-2.elb.amazonaws.com 80 105s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/service-2048 NodePort 172.20.194.11 <none> 80:30280/TCP 105s NAME ENDPOINTS AGE endpoints/service-2048 10.20.30.64:80 105s |
```bash
# Get security group IDs
ALB_SG=$(aws elbv2 describe-load-balancers \
  --query 'LoadBalancers[?contains(DNSName, `game2048`)].SecurityGroups[0]' \
  --output text)

EKS_SG=$(aws eks describe-cluster \
  --name automode-cluster \
  --query 'cluster.resourcesVpcConfig.clusterSecurityGroupId' \
  --output text)

echo $ALB_SG $EKS_SG
# Check the current rules of these security groups in the management console first

# Allow the ALB to communicate with the EKS cluster.
# When tearing down the lab, remove only the rule added to $EKS_SG here beforehand.
aws ec2 authorize-security-group-ingress \
  --group-id $EKS_SG \
  --source-group $ALB_SG \
  --protocol tcp \
  --port 80

# Access the following address over HTTP
kubectl get ingress ingress-2048 \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' \
  -n game-2048
k8s-game2048-ingress2-db993ba6ac-782663732.ap-northeast-2.elb.amazonaws.com
```
# 노드 인스턴스 ID 확인 kubectl get node NODEID=<각자 자신의 노드ID> NODEID=i-0b12733f9b75cd835 # 디버그 컨테이너를 실행합니다. 다음 명령어는 노드의 인스턴스 ID에 i-01234567890123456을 사용하며, # 대화형 사용을 위해 tty와 stdin을 할당하고 kubeconfig 파일의 sysadmin 프로필을 사용합니다. kubectl debug node/$NODEID -it --profile=sysadmin --image=public.ecr.aws/amazonlinux/amazonlinux:2023 ------------------------------------------------- bash-5.2# # 셸에서 이제 nsenter 명령을 제공하는 util-linux-core를 설치할 수 있습니다. # nsenter를 사용하여 호스트에서 PID 1의 마운트 네임스페이스(init)를 입력하고 journalctl 명령을 실행하여 큐블릿에서 로그를 스트리밍합니다: yum install -y util-linux-core htop nsenter -t 1 -m journalctl -f -u kubelet htop # 해당 노드(인스턴스) CPU,Memory 크기 확인 ![]() # 정보 확인 nsenter -t 1 -m ip addr nsenter -t 1 -m ps -ef nsenter -t 1 -m ls -l /proc nsenter -t 1 -m df -hT ![]() nsenter -t 1 -m ctr nsenter -t 1 -m ctr ns ls nsenter -t 1 -m ctr -n k8s.io containers ls CONTAINER IMAGE RUNTIME 09e8f837f54d66305f3994afeee44d800971d7a921c06720382948dbdd9c6fab localhost/kubernetes/pause:0.1.0 io.containerd.runc.v2 2e1af1a9ff996505c0de0ee6b55bd8b3fefdaf6579fd6af46e978ad6e2096bae public.ecr.aws/amazonlinux/amazonlinux:2023 io.containerd.runc.v2 ebd4b8805ed0338a0bc6a54625c51f3a4c202641f0d492409d974871db49c476 localhost/kubernetes/pause:0.1.0 io.containerd.runc.v2 f4924417248c24b7eaac2498a11c912b575fc03cef1b8e55cae66772bdb36af5 docker.io/hjacobs/kube-ops-view:20.4.0 io.containerd.runc.v2 ... # (참고) 보안을 위해 Amazon Linux 컨테이너 이미지는 기본적으로 많은 바이너리를 설치하지 않습니다. # yum whatproved 명령을 사용하여 특정 바이너리를 제공하기 위해 설치해야 하는 패키지를 식별할 수 있습니다. yum whatprovides ps ------------------------------------------------- |
kubectl apply -f - <<EOF apiVersion: v1 kind: Pod metadata: name: root-shell namespace: kube-system spec: containers: - command: - /bin/cat image: alpine:3 name: root-shell securityContext: privileged: true tty: true stdin: true volumeMounts: - mountPath: /host name: hostroot hostNetwork: true hostPID: true hostIPC: true tolerations: - effect: NoSchedule operator: Exists - effect: NoExecute operator: Exists volumes: - hostPath: path: / name: hostroot EOF # 파드 확인 : 파드와 노드의 IP가 같다 (hostNetwork: true) kubectl get pod -n kube-system root-shell NAME READY STATUS RESTARTS AGE root-shell 1/1 Running 0 19s kubectl get node,pod -A -owide NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME node/i-0b12733f9b75cd835 Ready <none> 62m v1.31.4-eks-0f56d01 10.20.1.198 <none> Bottlerocket (EKS Auto) 2025.3.9 (aws-k8s-1.31) 6.1.129 containerd://1.7.25+bottlerocket NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES kube-system pod/kube-ops-view-657dbc6cd8-2lx4d 1/1 Running 0 62m 10.20.12.0 i-0b12733f9b75cd835 <none> <none> kube-system pod/root-shell 1/1 Running 0 5m29s 10.20.1.198 i-0b12733f9b75cd835 <none> <none> # 호스트패스 : 파드에 /host 경로에 rw로 마운트 확인 kubectl describe pod -n kube-system root-shell ... Mounts: /host from hostroot (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hj5c6 (ro) Conditions: Type Status PodReadyToStartContainers True Initialized True Ready True ContainersReady True PodScheduled True Volumes: hostroot: Type: HostPath (bare host directory volume) Path: / HostPathType: ... # 탈취 시도 kubectl -n kube-system exec -it root-shell -- chroot /host /bin/sh ![]() |