DEV Community

Cover image for aws app mesh with cdk in 5 steps
Constantin
Constantin

Posted on • Updated on

aws app mesh with cdk in 5 steps

note:

if you want to skip reading and just want to see the finished code visit my GitHub

prerequisites:

  • docker installed
  • node and npm installed
  • AWS account
  • a few spare bucks (AWS is not for free)

Introduction:

what's fargate

I'm not copy-pasting the marketing description here. What is interesting in the end is how it affects your day-to-day job.
The most important features are:

  • host your Docker images at ease
  • utilize AWS cloud-native features
  • don't waste time in managing EC2 instances
  • quick deployment and advanced rollout features when using aws-code-deploy

what does a service mesh for you:

A service mesh helps you with:

  • service discovery (routing between Microservices),
  • streamlined logging and tracing between Microservices
  • help with resilience by routing traffic away from failed instances (usually you'd implement this with a load balancer).
  • enables you to easily implement features like canary deployments and blue/green testing

how does it do that?

These features are archived by deploying a managed proxy alongside your application which intercepts traffic from and to your application. The proxy takes care of fetching the mesh configuration, routing the requests, and registering your instance to the mesh. The deployment is often referred to as 'sidecar' since you have one proxy per service instance.

how is it supposed to work in amazon

Amazon supports Service-Meshes by giving you access to a managed envoy-proxy hosting.

What you need to do:

  • you deploy an envoy proxy as a side-car image next to your application container.
  • you utilize the fargate-proxy config and reroute traffic to the envoy-proxy
  • you define virtual services/gateways/routes to stitch your mesh together

terminology in AWS

  • virtual node: represents a compute instance like a fargate container
  • virtual service: represents a logical group (1..n) of nodes
  • virtual route: uses parameter (HTTP/GRPC) to route traffic to specific nodes/services
  • virtual gateway: specifies an incoming or outgoing gateway from app-mesh

AWS Showcase:
Aws showcase

resources

some great resources I used to build my service mesh are:

Step 1, preparation.

I've prepared two microservices with a small react app that polls two endpoints.

BFF Architecture

let's clone the GitHub repository

git clone git@github.com:jaecktec/aws-cdk-appmesh-example.git
cd aws-cdk-appmesh-example
Enter fullscreen mode Exit fullscreen mode

Quick walkthrough:

infrastructure folder

contains AWS CDK infrastructure such as:

  • fargate cluster, services and task-defs
  • private DNS namespace
  • ALB used to host a small web app that accesses a non-exposed microservice through a gateway.

services

contains some microservices alongside Dockerfiles

services/color-teller-backend

rudimentary expressjs service which returns the environment variable 'COLOR' on [GET]/color and 200 on [GET]/health

services/color-teller-client

small react-app whith polls /gateway/color/color and /version and a expressjs services that exposes two endpoints, one /gateway/color/* which is a http-proxy to an endpoint defined in an env-variable called COLOR_BACKEND + a version endpoint which returns the env-variable called VERSION

test your prerequisites

open the infrastructure folder:

cd infrastructure
Enter fullscreen mode Exit fullscreen mode

and run

npm i
npx cdk deploy
Enter fullscreen mode Exit fullscreen mode

this might take a few seconds, once you're asked to confirm, confirm by entering y and hit enter.

once everything got deployed you should see an HTTP address, open this in the browser. If everything worked you should see a Color Teller showing vanilla red

Sample image of deployed app

to not hit conflicts, let's clean up:

npx cdk destroy
Enter fullscreen mode Exit fullscreen mode

Feel free to browse through the code but please don't judge me on the microservices, they were hacked together in a local coffee shop (not ⭐️🪣, I'm not that wealthy)

Step 1: create mesh

In our first step, we will create our Mesh and modify createService to accept and use the mesh.

before the first createService call add

const mesh = new Mesh(this, 'mesh');
Enter fullscreen mode Exit fullscreen mode

this will create an app-mesh for us

Step 2: create a virtual node

now modify the signature of createService to accept a Mesh and return a VirtualNode:

// ...
private createService(
  id: string,
  cluster: Cluster,
  image: ContainerImage,
  namespace: PrivateDnsNamespace,
  mesh: Mesh,
  envOverwrite: { [key: string]: string } = {},
): {
  service: FargateService,
  port: number,
  privateDnsName: string,
  node: VirtualNode,
} {
// ...
Enter fullscreen mode Exit fullscreen mode

now inside the createService function we can create our VirtualNode and return it

const bgNode = new VirtualNode(this, `${id}-virtual-node`, {
  mesh,
  virtualNodeName: `${id}-virtual-node`,
  accessLog: AccessLog.fromFilePath('/dev/stdout'),
  serviceDiscovery: ServiceDiscovery.cloudMap(service.cloudMapService!),
  listeners: [VirtualNodeListener.http({
    port: appPort,
    healthCheck: HealthCheck.http({
      path: 'health', // no forward slash, this makes them appear in x-ray later on... this took me 3 hours to figure out (╯°□°)╯︵ ┻━┻
    }),
  })],
});

return {
  service,
  port: appPort,
  privateDnsName: `${id}.${namespace.namespaceName}`,
  node: bgNode
}
Enter fullscreen mode Exit fullscreen mode

Step 3: attach an envoy-sidecar image

in addition we need to add a new containerDefinition for the envoy-proxy. AWS provides the envoi-proxy image under region-specific paths, however, I chose to use the global image.
A thing that is easy to miss: We need to add the managed policy AWSAppMeshEnvoyAccess to the taskRole

const virtualNodeName = `${id}-virtual-node`;
const envoyContainer = taskDef.addContainer('envoy', {
  image: ContainerImage.fromRegistry('public.ecr.aws/appmesh/aws-appmesh-envoy:v1.19.1.0-prod'),
  essential: true,
  environment: {
    'APPMESH_RESOURCE_ARN': `arn:aws:appmesh:${Aws.REGION}:${Aws.ACCOUNT_ID}:mesh/${mesh.meshName}/virtualNode/${virtualNodeName}`,
  },
  healthCheck: {
    command: [
      'CMD-SHELL',
      'curl -s http://localhost:9901/server_info | grep state | grep -q LIVE',
    ],
    interval: Duration.seconds(5),
    timeout: Duration.seconds(2),
    startPeriod: Duration.seconds(10),
    retries: 3,
  },
  user: '1337', // important later for the proxy config
  logging: new AwsLogDriver({
    streamPrefix: `${id}/envoy/`,
  }),
});
taskDef.taskRole.addManagedPolicy(
  ManagedPolicy.fromAwsManagedPolicyName('AWSAppMeshEnvoyAccess'),
);
envoyContainer.addUlimits({ name: UlimitName.NOFILE, hardLimit: 15000, softLimit: 15000 });
appContainer.addContainerDependencies({
  container: envoyContainer,
});
Enter fullscreen mode Exit fullscreen mode

now we have our envoy-sidecar deployed. However, the traffic needs to be routed through the proxy so it can do its job. For that, we will add a proxyConfiguration to the TaskDef

proxyConfiguration: new AppMeshProxyConfiguration({
  containerName: 'envoy',
  properties: {
    ignoredUID: 1337, // user from envoy-container
    appPorts: [ appPort ],
    proxyIngressPort: 15000,
    proxyEgressPort: 15001,
    egressIgnoredIPs: [ '169.254.170.2', '169.254.169.254' ],
  },
}),
Enter fullscreen mode Exit fullscreen mode

Step 4: create virtual service

In this step, we will create our virtual service and add some routing. we could not have the router however, it enables you easily modify it to tinker around with routing

private createVirtualService(
  serviceName: string,
  mesh: Mesh,
  backendNode: VirtualNode,
  namespace: PrivateDnsNamespace,
): VirtualService {
  const router = new VirtualRouter(this, `${serviceName}-virtual-router`, {
    mesh,
    listeners: [VirtualRouterListener.http(3000)],
  });

  router.addRoute('default', {
    routeSpec: RouteSpec.http({
      weightedTargets: [{
        weight: 1,
        virtualNode: backendNode,
      }],
    }),
  });

  const service = new VirtualService(this, serviceName, {
    virtualServiceProvider: VirtualServiceProvider.virtualRouter(router),
    virtualServiceName: `${serviceName}.${namespace.namespaceName}`,
  });
  // https://docs.aws.amazon.com/app-mesh/latest/userguide/troubleshoot-connectivity.html#ts-connectivity-dns-resolution-virtual-service
  new Service(this, `${serviceName}-dummy-service`, {
    namespace,
    name: serviceName,
    dnsRecordType: DnsRecordType.A,
    description: 'The dummy for App Mesh',
  }).registerIpInstance('dummy-instance', { ipv4: '10.10.10.10' });

  return service;
}
Enter fullscreen mode Exit fullscreen mode

and call the method right after creating the createService invocation.

// [...]
const backendVService = this.createVirtualService('color-service', mesh, backendNode, namespace);
// [...]
Enter fullscreen mode Exit fullscreen mode

now our service will be available (from a meshified node) under the DNS name color-service.service.local

next step we also need to update the environment variable in our gateway service so it knows where to find our color-backend:

COLOR_BACKEND: `http://${backendVService.virtualServiceName}:${backendPort}`,
Enter fullscreen mode Exit fullscreen mode

and we need to tell the virtual node of the gateway service that we require to access the service
(you also need to name the node attribute from the creteService function)

// [...]
const { service: clientFGService, port: appPort, node: clientNode } = this.createService(
      'client-1',
// [...]
clientNode.addBackend(Backend.virtualService(backendVService));
// [...]
Enter fullscreen mode Exit fullscreen mode

now let's deploy the stack and check our result:

npx cdk deploy
Enter fullscreen mode Exit fullscreen mode

The routing will be taken care of by app-mesh. Theoretically, we're done however, the ALB still directly routs the traffic to one of the microservices, so technically it is exposed (and we don't get normalized metrics for it).

Step 5: create an ingress gateway

In this step, we will deploy an envoy-proxy as the ingress gateway. We just need to deploy envoy-proxy as the main container, expose the application port (8080) and configure it as a VirtualGateway

These are a lot of steps that are just a tutorial of deploying an app as a fargate-container.
Important here is:

  • taskRole needs managed policy AWSAppMeshEnvoyAccess
  • app port is 8080 however, the health check port is 9901. That means, we need to explicitly allow communication on 9901 between ALB and the APP
  • health url is /server_info (for ALB)
private createIngressService(
  cluster: Cluster,
  listener: ApplicationListener,
  mesh: Mesh,
): {
  gateway: VirtualGateway,
  fgService: FargateService,
} {
  const port = 8080;
  const gateway = new VirtualGateway(this, 'virtual-gateway', {
    mesh,
    listeners: [VirtualGatewayListener.http({ port })],
  });

  const taskDef = new FargateTaskDefinition(this, 'ingress-fargate-task-def', {
    memoryLimitMiB: 512,
  });

  const container = taskDef.addContainer('ingress-app', {
    // most up-to-date envoy image at the point of writing the article
    image: ContainerImage.fromRegistry('public.ecr.aws/appmesh/aws-appmesh-envoy:v1.19.1.0-prod'),
    logging: new AwsLogDriver({ streamPrefix: 'ingress-app-' }),
    portMappings: [
      { containerPort: port },
      { containerPort: 9901 }, // for health check
    ],
    healthCheck: {
      // health check from Documentation
      command: ['CMD-SHELL', 'curl -s http://localhost:9901/server_info | grep state | grep -q LIVE || exit 1'],
      interval: Duration.seconds(5),
      timeout: Duration.seconds(2),
      startPeriod: Duration.seconds(10),
      retries: 3,
    },
    environment: {
      APPMESH_VIRTUAL_NODE_NAME: `mesh/${mesh.meshName}/virtualGateway/${gateway.virtualGatewayName}`, // important, otherwhise envoiy can't fetch the config
      AWS_REGION: Aws.REGION,
    },
    memoryLimitMiB: 320, // limit examples from the official docs
    cpu: 208, // limit examples from the official docs
  });
  // limit examples from the official docs
  container.addUlimits({
    name: UlimitName.NOFILE,
    hardLimit: 1024000,
    softLimit: 1024000,
  });
  const service = new FargateService(this, 'ingress-service', {
    cluster,
    assignPublicIp: true, // for public vpc
    minHealthyPercent: 0, // for zero downtime rolling deployment set desiredcount=2 and minHealty = 50
    desiredCount: 1,
    taskDefinition: taskDef,
    vpcSubnets: { subnetType: SubnetType.PUBLIC },
    securityGroups: [new SecurityGroup(this, 'ingress-default-sg', {
      securityGroupName: 'ingress-fargate-service',
      vpc: cluster.vpc,
    })],
  });
  listener.addTargets('ingress-gateway-target', {
    port,
    protocol: ApplicationProtocol.HTTP,
    healthCheck: {
      protocol: ELBProtocol.HTTP,
      interval: Duration.seconds(10),
      // health port from aws-envoy docs
      port: '9901',
      // health check path from aws-envoy docs
      path: '/server_info',
    },
    targets: [service],
    deregistrationDelay: Duration.seconds(0), // not needed, just speeds up the deployment for this example
  });

  // required so the ALB can reach the health-check endpoint
  service.connections.allowFrom(listener, Port.tcp(9901));
  // required so the service can fetch the documentation
  taskDef.taskRole.addManagedPolicy(ManagedPolicy.fromAwsManagedPolicyName('AWSAppMeshEnvoyAccess'));

  return {gateway, fgService: service};
}
Enter fullscreen mode Exit fullscreen mode

Lastly, we just need to remove the ALB ListenerTarget pointing to our clientService. The call of the function createIngressService will attach a ListenerTarget pointing to our gateway.

for this we quickly create a virtual service for our clientService as well:

const clientVService = this.createVirtualService('client-service', mesh, clientNode, namespace);
Enter fullscreen mode Exit fullscreen mode

create the ingress service with our previously created method:

const {gateway: virtualGateway, fgService: virtualGatewayFGService} = this.createIngressService(cluster, listener, mesh);
Enter fullscreen mode Exit fullscreen mode

and point the VirtualGateway to it:

const clientVRouter = new VirtualRouter(this, 'client-virtual-router', {
  mesh,
  listeners: [VirtualRouterListener.http(3000)],
});

clientVRouter.addRoute('default', {
  routeSpec: RouteSpec.http({
    weightedTargets: [{
      weight: 1,
      virtualNode: clientNode,
    }],
  }),
});

virtualGateway.addGatewayRoute('web-route', {
  routeSpec: GatewayRouteSpec.http({
    routeTarget: clientVService,
    match: {
      path: HttpGatewayRoutePathMatch.startsWith('/'),
    },
  }),
});
Enter fullscreen mode Exit fullscreen mode

Bonus - AWS-X-Ray:

using x-ray is very easy, we only need to deploy an x-ray sidecar image and enable envoy x-ray:

First, let's modify both createService and createIngressService to deploy the x-ray sidecar image and grant XRay policies to the taskRole

taskDef.addContainer(`${id}-xray`, {
  image: ContainerImage.fromRegistry('public.ecr.aws/xray/aws-xray-daemon:latest'),
  essential: false,
  memoryReservationMiB: 256,
  environment: {
    AWS_REGION: Aws.REGION,
  },
  user: '1337', // X-Ray traffic should not go through Envoy proxy
  logging: new AwsLogDriver({
    streamPrefix: id + '-xray-',
  }),
  portMappings: [ {
    containerPort: 2000,
    protocol: Protocol.UDP,
  } ],
});
Enter fullscreen mode Exit fullscreen mode

and

taskDef.addContainer(`ingress-xray`, {
  image: ContainerImage.fromRegistry('public.ecr.aws/xray/aws-xray-daemon:latest'),
  essential: false,
  memoryReservationMiB: 256,
  environment: {
    AWS_REGION: Aws.REGION,
  },
  user: '1337', // X-Ray traffic should not go through Envoy proxy
  logging: new AwsLogDriver({
    streamPrefix: 'ingress-xray-',
  }),
  portMappings: [ {
    containerPort: 2000,
    protocol: Protocol.UDP,
  } ],
});
taskDef.taskRole.addManagedPolicy(
  ManagedPolicy.fromAwsManagedPolicyName('AWSXRayDaemonWriteAccess'),
);
Enter fullscreen mode Exit fullscreen mode

(note: only that the id property got replaced by a constant)

now we need to enable x-ray on all the envoy containers by adding the env var 'ENABLE_ENVOY_XRAY_TRACING': '1' to the envoy-sidecar image (right underneath APPMESH_RESOURCE_ARN)

Finally, let's run

npx cdk deploy
Enter fullscreen mode Exit fullscreen mode

Open the website and wait a minute.
Now let's go to the x-ray console and check the results:

Alt Text

What's next

Now that you have your mesh configured: What can you do with it?

First:

You could deploy multiple instances (virtual nodes / fg services) and play around with the routing.
For example, you can route by cookie or play around with the weights and see what happens.

second: I intentionally put everything into one file, so it's easy to browse and to understand, but it screams for a refactoring :)

third:
if you want to access a 3rd party API you can also add monitoring for that by providing a virtual service. I haven't tested this, however the docs mention it.

Discussion (6)

Collapse
flameoftheforest profile image
Bryan Chong

Hi,

I can't do git clone git@github.com:jaecktec/aws-cdk-appmesh-example.git.
It seems to me the repo isn't public?

I nav to https://github.com/search?q=jaecktec and there's only 1 AndrXsExample available.

Can you please make the repo public?

Thanks!

Collapse
jaecktec profile image
Constantin Author

github.com/jaecktec/aws-cdk-appmes...

The repo is public, I just checked..
Not sure why you can't clone it

Collapse
flameoftheforest profile image
Bryan Chong

Many thanks!

Collapse
jaecktec profile image
Constantin Author

BTW the search only looks in descriptions, you can easily find my repos by entering github.com/jaecktec

Collapse
mmuller88 profile image
Martin Muller

Cool stuff. You might want to put the post under cdk tag as well :)

Collapse
jaecktec profile image
Constantin Author

Done :)