Businesses have recently developed great interest in container-based architectures because containers are better suited to agile architectures.
The following chart from Grand View Research shows the growth of the application
container market size in the United States.
In this guide, you, as an architect or Ops engineer, will find your way through migrating to an AWS container service.
- Agility and product improvement over regular instances
- Cost Savings by improving resource utilization and flexible scaling up/down
- Elasticity of scaling up/down according to the business needs.
- Faster Innovation by moving the hosting of the control plane to the cloud, which frees up resources for innovation
- Deploy globally in minutes via 77 AZs across 24 geographic Regions worldwide
- DevOps driver using managed container technology to support DevOps on the cloud platform, which further improves efficiency and increases the pace of customer migration to containers
- Platform as a Service (PaaS) building: build your own containerized platform on the cloud and combine it with your DevOps operations for improved efficiency and flexibility, and reduce complexity by unifying DevOps and production environments.
- Operations simplification using managed container services, which relieve the management and operations burden and improve efficiency through deep integration with other AWS services
- Expectation of the same user experience as native Kubernetes with services like EKS which provide the convenience of hosted services plus the freedom of open source.
- Digital transformation which involves the development of digital technologies and support capabilities to create a vibrant digital business model.
- IoT/ML innovation through model training and deployment in the cloud
- Deep integration with AWS, which lets you leverage the breadth and depth of AWS cloud technologies, including networking, security, and monitoring
- Security and compliance — AWS offers 210 security, compliance, and governance-related services and key features, as well as isolation between containers and granular access permissions per container
- For serverless compute for containers, use AWS Fargate
- If you want to manage your own compute environment for containers, use Amazon EC2 (Elastic Compute Cloud)
- For deeply-integrated AWS container orchestration, use Amazon ECS (Elastic Container Service)
- For managed Kubernetes-in-the-cloud service with zero-refactoring migration, use Amazon EKS (Elastic Kubernetes Service)
- For fully-managed container registry (container images library), use Amazon ECR (Elastic Container Registry)
- For managing microservice architectures across multiple compute infrastructure services (EC2 - ECS - EKS - Fargate), use AWS App Mesh.
Before formulating the migration plan, architects should evaluate the customer's readiness for the migration process. This confirms that the migration will solve the customer's problems and gives the customer a basis for decisions throughout the plan. The following are the aspects to evaluate:
- Business Target — Targeted benefits from the migration process. Involved roles include business managers, financial managers, budget owners, migration decision makers, and stakeholders.
- People — Which of the customer's IT personnel will be involved in the migration process, the staffing the process demands in terms of technical skills, and the training the technical staff needs for the new technology. Involved roles include HR, staffing specialists, and people managers.
- Governance — Evaluate the teams involved in the process and the people in charge of them, and identify the final decision maker in the decision chain. Also evaluate the effectiveness of PM tools and communication. Provide the customer with a way to measure project results (e.g. cost reduction ratio). Involved roles include CIOs, PMs, and Enterprise Architects.
- Cloud Platform — Customer's familiarity with the AWS cloud platform and basic container services within it.
- Container Platform — Customer's familiarity with the AWS cloud container platform and their related skill set.
- Assessment Method — Consider the AWS APN Immersion Day tool.
- Proof of Concept (PoC) — Understand and assess customer's familiarity with the AWS platform and services through a simple PoC. After the demo, the customer needs to know the basics of EKS and the difference between it and other services/platforms, also the best method to understand EKS.
- Security Compliance — Evaluate the customer's requirements for AWS infrastructure security, IAM roles, and RBAC, such as the node IAM role, cluster IAM role, and Pod execution role. Also evaluate which service accounts the customer needs and their minimum permissions.
- Monitoring — Metrics, tooling and methodology to build the monitoring system.
- Alarm — Customer's requirements for alarms, alarm indicators and impact of alarms on business.
- Analysis — Customer's analysis requirements in the container operations field, such as attacks and error root cause analysis.
- Release Management — Tools and process used by the customer for release management and the migration method or optimization plan for release through the evaluation.
- Disaster Tolerance — Customer's disaster tolerance requirements and recovery plan.
This graph shows how difficult the migration project will be depending on two factors: platform operation capabilities (which affect the customer's use of the new container platform during and after migration) and the source tech stack (which affects the difficulty and workload of the migration). The capabilities and functions of the operations team also determine whether the customer can achieve the migration goal.
- Monitoring — Monitoring methods change from typical servers to containers and services.
- Logging — Log collection requires knowing which container on which host to collect from, which becomes difficult as containers are added, removed, and moved.
- Troubleshooting — Container failures are difficult to analyze with practices carried over from server-based operations.
- Security — The impact of rapid development of the community version on security guarantees. Permission management poses new challenges to operations.
- Network — Increased difficulty of network planning and design.
- Compatible K8s — Migrating to Amazon EKS from a K8s cluster built on AWS or another cloud provider with tools like kubeadm. This is the easiest case because of infrastructure consistency and similar tooling. Network plugins, Ingress, and image repositories still need some consideration.
- Variant of K8s — A K8s cluster provided by a third-party platform such as Red Hat OpenShift. The ways its deployment differs from a typical K8s cluster add to the difficulty of the migration process.
- Heterogeneous container orchestration engine — Migrating from an orchestration stack other than Kubernetes. Large differences in design concepts and implementation add to the difficulty of the migration process.
- Containerization — Migrating from server deployment to a containerized application. This introduces three main risks: lack of developer support in the customer's IT team, lack of deployment instruction documents, and the customer's microservice requirements. It is therefore considered the most technically difficult.
This stage has 5 main goals:
- Investigate the goals of migration
- Build the migration team
- Assign roles and responsibilities
- Evaluate migration methods
- Formulate migration project plan
Understand the customer's technical and business goals and migration targets through questionnaires and interviews.
Discover Business Information (DBI)
- What is the target application system for migration? Which business unit does it belong to? Do they have any important business activities recently?
- Migration cycle — When does it start? What is the time span? Is there a clear deadline?
- Migration expectations — What is the goal to achieve? Does the customer have any clear metrics?
- Personnel — What is the number of personnel responsible for the migration? What kind of skills do they possess? Which modules are they responsible for?
- Cost — What is the labor cost? Migration cost? Dual environment parallel run cost? Target cluster planning cost?
Discover Technical Information (DTI)
- Where is the platform of the source cluster?
- What is the source cluster’s actual usage of computing, storage, and network resources?
- Is the application stateful or stateless?
- What are the dependencies among applications?
- What is the technology stack used by the platform where the source cluster is located?
- Is there any source container cluster-specific information?
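Some of these questions can be answered with a first automated pass. The sketch below assumes the source cluster is Kubernetes-compatible and that its workloads were exported with `kubectl get deployments,statefulsets,daemonsets -A -o json`. The `classify_workloads` helper, the sample data, and the heuristic it applies (a StatefulSet, or any workload that mounts a PersistentVolumeClaim, counts as stateful) are illustrative assumptions, not part of any AWS tool.

```python
import json


def classify_workloads(kubectl_json: str) -> dict:
    """Split exported workloads into 'stateful' and 'stateless' name lists.

    Heuristic (an assumption of this sketch): a workload is stateful if it is
    a StatefulSet, declares volumeClaimTemplates, or mounts a PVC.
    """
    result = {"stateful": [], "stateless": []}
    for item in json.loads(kubectl_json).get("items", []):
        meta = item.get("metadata", {})
        name = f'{meta.get("namespace", "default")}/{meta.get("name", "")}'
        spec = item.get("spec", {})
        pod_spec = spec.get("template", {}).get("spec", {})
        uses_pvc = any(
            "persistentVolumeClaim" in volume
            for volume in pod_spec.get("volumes", [])
        )
        stateful = (
            item.get("kind") == "StatefulSet"
            or bool(spec.get("volumeClaimTemplates"))
            or uses_pvc
        )
        result["stateful" if stateful else "stateless"].append(name)
    return result


# Hypothetical export with three workloads.
sample = json.dumps({
    "kind": "List",
    "items": [
        {"kind": "Deployment",
         "metadata": {"namespace": "shop", "name": "web"},
         "spec": {"template": {"spec": {"volumes": [{"name": "tmp", "emptyDir": {}}]}}}},
        {"kind": "Deployment",
         "metadata": {"namespace": "shop", "name": "reports"},
         "spec": {"template": {"spec": {"volumes": [
             {"name": "data", "persistentVolumeClaim": {"claimName": "reports-pvc"}}]}}}},
        {"kind": "StatefulSet",
         "metadata": {"namespace": "shop", "name": "db"},
         "spec": {"template": {"spec": {}}}},
    ],
})

inventory = classify_workloads(sample)
print(inventory)
```

The result is only a starting point for the interview: a Deployment without a PVC can still hold state in an external database or on local disk, so the classification should be confirmed with the application owners.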
Prepare a research report that covers, at a minimum, the following information to support selecting migration methods and producing solutions:
- Cluster information
- Image repositories
- Log collection subsystem
- Monitoring subsystem
- CI/CD subsystem
- Business impact
Depending on the container migration maturity model (People, Tooling, Source platform), you can recommend a suitable migration method to the customer. Here we discuss the migration methods for the different source platforms mentioned before:
- Compatible K8s — If the source cluster is on AWS, it's easy to migrate with AWS tools, whether the workload is stateless or stateful. If it is on another cloud platform, a stateless workload is still easy to migrate with AWS tools; a stateful one requires third-party partner software.
- Variant Kubernetes — Depends on source platform
- Heterogeneous container orchestration engine — Container orchestration engines differ a lot, from design concepts down to specific deployment, so this type of migration project becomes very complicated.
- Containerization — The most complex, requires refactoring of the whole containerized architecture.
- With CI/CD system — Configuring and labeling network and working nodes enables you to automate release processing targeted to EKS.
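The recommendations in the list above can be written down as a small decision function, which is handy when several clusters have to be assessed the same way. This is only a sketch: the `SourcePlatform` enum, the `recommend_method` name, and the wording of the returned strings are assumptions made for illustration and simply restate the list.

```python
from enum import Enum


class SourcePlatform(Enum):
    COMPATIBLE_K8S = "compatible K8s"
    VARIANT_K8S = "variant of K8s"
    HETEROGENEOUS = "heterogeneous orchestration engine"
    NON_CONTAINERIZED = "server deployment"


def recommend_method(source: SourcePlatform,
                     on_aws: bool = False,
                     stateful: bool = False) -> str:
    """Return the migration recommendation from the list for one workload."""
    if source is SourcePlatform.COMPATIBLE_K8S:
        # On AWS, or stateless on another cloud: AWS tools are enough.
        if on_aws or not stateful:
            return "migrate with AWS tools"
        return "migrate with third-party partner software"
    if source is SourcePlatform.VARIANT_K8S:
        return "depends on the source platform"
    if source is SourcePlatform.HETEROGENEOUS:
        return "very complicated; engines differ in design and deployment"
    return "most complex; requires refactoring into a containerized architecture"


print(recommend_method(SourcePlatform.COMPATIBLE_K8S, on_aws=False, stateful=True))
```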
Formulate a guidance plan for the migration process. Using project management best practices and agile delivery, include the following in your plan:
- Review project management methods, tools, and capabilities for gap analysis.
- Define project management methods and tools, and how to use them in the project.
- Define project communication methods and problem escalation mechanisms.
- Develop a project task scheduling table, and clarify project risks and solutions.
- Decide the composition of the migration team, and clarify the responsibilities of the team.
- Outline the resources and costs required to migrate the target environment to AWS.
For technical planning, include the following
- Discover the application dependency, which is critical for project prioritization and planning.
- Clarify the migration priority of the applications, and select the appropriate application system for migration verification.
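One way to turn discovered dependencies into a migration order is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The application names and the `migration_waves` helper are hypothetical, and migrating dependencies before their dependents is an assumed policy, not one the plan prescribes.

```python
from graphlib import TopologicalSorter

# Hypothetical applications: each key depends on the services in its set.
dependencies = {
    "web-frontend": {"orders-api", "auth-service"},
    "orders-api": {"auth-service", "inventory-db"},
    "auth-service": set(),
    "inventory-db": set(),
}


def migration_waves(deps: dict) -> list:
    """Group applications into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = migration_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Applications in the first wave have no dependencies inside the cluster, which makes them natural candidates for migration verification. A `CycleError` is useful information in itself: it points at applications that have to move together.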
The following configuration is predefined for you when you use AWS Landing Zone:
- Account structure — Initial multi-account structure with baseline security
- Network structure — Basic network configurations for network isolation, connection between AWS and the local network, and user-configurable network access and management options. You should still plan the EKS Pod IP pool based on the characteristics of the CNI network plug-in.
- Account security baseline — Settings for AWS CloudTrail, AWS Config, IAM, cross-account access, Amazon VPC, and Amazon GuardDuty
- AWS user access management — Provides a framework for cross-account user IAM based on Microsoft Active Directory, and centralized cost management and reporting. It also covers the creation and management of users and permissions for the Amazon EKS cluster.
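Planning the Pod IP pool is mostly arithmetic. The sketch below assumes the Amazon VPC CNI plug-in in its default secondary-IP mode (no prefix delegation or custom networking), where each Pod takes an address from the node's subnet. The helper names are hypothetical, and the ENI and per-ENI address limits used in the example are those published for `m5.large`; check the EC2 documentation for your own instance types.

```python
import ipaddress

AWS_RESERVED_IPS_PER_SUBNET = 5  # AWS reserves five addresses in every subnet


def max_pods_per_node(enis: int, ips_per_eni: int) -> int:
    """Pod limit per node for the VPC CNI in default secondary-IP mode.

    Each ENI uses one address for itself; the +2 accounts for Pods that run
    on the host network and need no secondary address.
    """
    return enis * (ips_per_eni - 1) + 2


def usable_ips(cidr: str) -> int:
    """Addresses in the subnet that can actually be assigned."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_IPS_PER_SUBNET


def nodes_supported(cidr: str, enis: int, ips_per_eni: int) -> int:
    """Worst case: every node has all ENIs attached and fully populated."""
    return usable_ips(cidr) // (enis * ips_per_eni)


# Example with m5.large limits (3 ENIs, 10 IPv4 addresses per ENI).
pods = max_pods_per_node(enis=3, ips_per_eni=10)
capacity = usable_ips("10.0.1.0/24")
nodes = nodes_supported("10.0.1.0/24", enis=3, ips_per_eni=10)
print(pods, capacity, nodes)
```

With these inputs a /24 subnet holds only a handful of fully loaded nodes, which is why Pod subnets are usually sized much larger than the node count alone would suggest.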
This is a group of people with AWS and Amazon EKS experience whom you should train to lead the migration process. You should also design how the CCoE (Cloud Center of Excellence) team will lead and perform the migration task.
- AWS Basic Environment management — Customers must operate and maintain computing, storage, network, and permissions with managed services to reduce the workload.
- Container cluster operations — Worker node management (managed and unmanaged), worker node upgrade methods, dynamic scaling of worker nodes, Pod capacity management, application deployment, and so on.
- Monitoring — Monitoring the status of hosts, pods and application servers
- Logging — Collecting and processing logs of hosts and pods.
- Release management — Version control, CI/CD (DevOps)
- Change management — Deployment and process description of the change management tools to manage changes in the original process throughout the migration process
Plan according to the customer's needs, following these best practices:
- For cloud infrastructure security, see Secure Cloud Computing Architecture (SCCA) on AWS GovCloud (US)
- IAM Roles — User roles, resource roles, and Pod roles
- Managed or unmanaged Node Groups
- Control SSH login
- EC2 security group — Security group reference and port opening between services
- Network isolation — VPC, subnet, AWS PrivateLink, VPC peering
- Restrict network access to API server endpoint
- Open the private endpoint of the API server
- Protection service load balancers — Network Load Balancer (NLB) or Application Load Balancer (ALB)? ALB ingress or Nginx ingress?
- Build secure images — Content addressable image identifier (CAIID)
- Use vulnerability scanning — Images scanner tools
- Image Repository — Use private image repository
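The two API server endpoint items in the list above correspond to the `resourcesVpcConfig` settings of an EKS cluster. The sketch below only builds and validates that payload; `endpoint_access_config` is a hypothetical helper, the CIDR is a documentation-range placeholder, and the actual `update_cluster_config` call is left commented out because it needs `boto3` and AWS credentials.

```python
import ipaddress


def endpoint_access_config(allowed_public_cidrs=()) -> dict:
    """Build a resourcesVpcConfig payload for an EKS cluster.

    The private endpoint is always opened. Public access is enabled only
    when at least one CIDR block is allowed, and is restricted to those blocks.
    """
    # ip_network() raises ValueError on malformed or non-network CIDRs.
    cidrs = [str(ipaddress.ip_network(cidr)) for cidr in allowed_public_cidrs]
    config = {
        "endpointPrivateAccess": True,
        "endpointPublicAccess": bool(cidrs),
    }
    if cidrs:
        config["publicAccessCidrs"] = cidrs
    return config


config = endpoint_access_config(["203.0.113.0/24"])  # placeholder office range
print(config)

# Applying the settings (assumes boto3 is installed and credentials are set;
# "my-cluster" is a placeholder cluster name):
# import boto3
# boto3.client("eks").update_cluster_config(
#     name="my-cluster", resourcesVpcConfig=config
# )
```

Calling the helper with no CIDR blocks yields a private-only endpoint, which suits clusters that are reached exclusively from inside the VPC or over a private connection.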
Runtime Security — Restricting Pod permissions
- Namespace — Provide scoping for cluster objects; allow fine-grained cluster object management
- RBAC — Manage the authorization of the cluster based on the least privilege principle with periodic audits to protect customers from external threats and internal misconfiguration or accidents.
- Restrict the runtime permissions — Minimizing capabilities of the running containers to protect from malicious and misbehaving containers
- Pod security strategy — Enforcing K8s and EKS security best practices (e.g. not running as root, not sharing the host node's process or network namespace, enforcing SELinux)
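Several of the runtime rules above can be checked mechanically before a manifest reaches the cluster. The sketch below inspects a Pod spec that has already been loaded into a Python dict (for example from YAML). `pod_security_findings` and the sample specs are hypothetical, and the selection of checks is an assumption of this sketch; it does not replace admission control in the cluster.

```python
def pod_security_findings(pod_spec: dict) -> list:
    """Return human-readable findings for a Pod spec; empty list means clean."""
    findings = []

    # Sharing host namespaces breaks isolation from the worker node.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(field):
            findings.append(f"pod: {field} is enabled")

    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})

        # Container-level runAsNonRoot overrides the Pod-level setting.
        if not ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False)):
            findings.append(f"{name}: runAsNonRoot is not set")
        if ctx.get("privileged"):
            findings.append(f"{name}: runs privileged")
        # Treat an unset value as a finding so the choice is made explicitly.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"{name}: allowPrivilegeEscalation is not disabled")
        if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
            findings.append(f"{name}: capabilities are not dropped")
    return findings


hardened = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [{
        "name": "app",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
    }],
}
risky = {"hostNetwork": True, "containers": [{"name": "app"}]}

print(pod_security_findings(hardened))
print(pod_security_findings(risky))
```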
Implement your migration plan, starting with a simple application that the customer chooses, so that the customer can experience the migration process. By this stage, four plans should have been designed: a migration plan (AWS architecture, application architecture, and operations process), a testing plan, a cutover plan, and a rollback plan in case the cutover is unsuccessful. The CCoE team should lead the migration, and you can also use automated tooling. It is good practice to prepare a customer-specific checklist to confirm that the migration is complete before cutover.
Test migration before cutover
- Functional verification
- Performance verification
- Disaster Recovery
Switch the transaction flow to the new system and watch it closely. If any abnormal behavior is detected, run the rollback plan. This process requires the playbook and the runbook to be produced beforehand. It also needs to be rehearsed before it is performed, and it needs extensive support from the CCoE team, which has wide experience with diverse migration teams.
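The cutover logic described here can be rehearsed as a small control loop. The sketch below assumes traffic is shifted in stages, which is one possible approach and not something the plan prescribes. `run_cutover` and its three callbacks are hypothetical stand-ins for whatever the runbook actually uses (DNS weights, load balancer rules, monitoring checks), and the demo replaces them with in-memory fakes.

```python
from typing import Callable, Sequence


def run_cutover(steps: Sequence[int],
                shift_traffic: Callable[[int], None],
                is_healthy: Callable[[], bool],
                rollback: Callable[[], None]) -> bool:
    """Shift traffic step by step; roll back on the first unhealthy check.

    Returns True if every step completed, False if the rollback was run.
    """
    for percent in steps:
        shift_traffic(percent)
        if not is_healthy():
            rollback()
            return False
    return True


# Rehearsal with fakes: the "new system" misbehaves above 50% traffic.
events = []
state = {"percent": 0}


def fake_shift(percent: int) -> None:
    state["percent"] = percent
    events.append(f"shift to {percent}%")


def fake_health_check() -> bool:
    return state["percent"] <= 50


def fake_rollback() -> None:
    state["percent"] = 0
    events.append("rollback")


succeeded = run_cutover([10, 50, 100], fake_shift, fake_health_check, fake_rollback)
print(succeeded, events)
```

Running the loop against fakes like these is a cheap way to exercise the playbook before the real cutover, since the rollback path is executed at least once.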