DEV Community

Anna

I used Cypress as an Xbox web scraper and I regret nothing

Like many people, I would like to get my hands on the new Xbox. And like everyone but the most diligent online shoppers, I have so far failed in my efforts to do so, and have instead been relentlessly greeted by images such as this one:

costco

So what does an enterprising/desperate web developer do? Build their own alert system, of course!

Now, a web scraper is a pretty simple application and generally the ideal use case for this sort of thing. But I wanted to add a visual element to it, to make sure I wasn't getting false positives, and because I tend to prefer user interfaces over bare code (I do work at Stackery, after all). Also, I've been playing with the Cypress test suite for the past month or so, and absolutely love it for frontend testing, so I've been looking for more ways to implement it in my projects.

Now, I should say: I'm guessing this is not exactly the use case the devs at Cypress.io had in mind when they built the browser-based testing library, but as the famous saying goes, "You can invent a hammer, but you can't stop the first user from using it to hit themselves in the head¹".

So without further ado, let's hit ourselves in the proverbial head and get that Xbox!

Setup: get yourself a Cypress account

Cypress has a very neat feature that allows you to view videos from your automated test runs in their web app. In order to do so, you'll need a free developer account:

  1. Go to the Cypress sign-up page and create an account
  2. Once you're in their dashboard, go ahead and create a new project. Name it "Xbox stock scraper", "testing abomination", or whatever you'd like. I generally name my projects the same as my repo, because that's how my brain works
  3. Now, you'll want to take note of the projectId as well as the record key, as you'll need these later

Create a serverless stack for your scraper

Because store inventories change frequently, we'll want to run our scraper regularly - every hour to start, though it's easy to adjust that up or down as you see fit. Of course, we want to automate these runs, because the whole point is that you have a life and are trying to avoid refreshing web pages on the reg. Is it me, or is this starting to sound like an ideal serverless use case? Not just me? Thought so!

I originally wanted to run the whole thing in a Lambda, but after an hours-long rabbit-hole, I found out that's really, really hard, and ultimately not worth it when a CodeBuild job will do the trick just fine.

I'll be using Stackery to build my stack, so these instructions go through that workflow. This part is optional, as you can also do this in the AWS Console, but I like doing things the easy way, and Stackery is serverless on easy mode².

  1. If you don't already have one, create a free Stackery account
  2. Navigate to /stacks, and click the Add a Stack dropdown arrow to select With a new repo. Here's what that looks like for me:
    xbox-1

  3. Normally, you'd add resources one by one in the Design Canvas, but as this stack is mainly based on a CodeBuild job and related roles, it's easier to copy-pasta an AWS SAM template like so:

xbox-compressed

Under Edit Mode, click Template, clear out the existing template, and paste the following:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  SendMessage:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub ${AWS::StackName}-SendMessage
      Description: !Sub
        - Stack ${StackTagName} Environment ${EnvironmentTagName} Function ${ResourceName}
        - ResourceName: SendMessage
      CodeUri: src/SendMessage
      Handler: index.handler
      Runtime: nodejs12.x
      MemorySize: 3008
      Timeout: 30
      Tracing: Active
      Policies:
        - AWSXrayWriteOnlyAccess
        - SNSPublishMessagePolicy:
            TopicName: !GetAtt XboxAlert.TopicName
      Events:
        EventRule:
          Type: EventBridgeRule
          Properties:
            Pattern:
              source:
                - aws.codebuild
              detail-type:
                - CodeBuild Build State Change
              detail:
                build-status:
                  - SUCCEEDED
                  - FAILED
                project-name:
                  - cypress-xbox-scraper
          Metadata:
            StackeryName: TriggerMessage
      Environment:
        Variables:
          TOPIC_NAME: !GetAtt XboxAlert.TopicName
          TOPIC_ARN: !Ref XboxAlert
  CodeBuildIAMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          Effect: Allow
          Principal:
            Service: codebuild.amazonaws.com
          Action: sts:AssumeRole
      RoleName: !Sub ${AWS::StackName}-CodeBuildIAMRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AdministratorAccess
  CypressScraper:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Description: Cypress Xbox Scraper
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:2.0
        Type: LINUX_CONTAINER
        PrivilegedMode: true
      Name: cypress-xbox-scraper
      ServiceRole: !Ref CodeBuildIAMRole
      Source:
        BuildSpec: buildspec.yml
        Location: https://github.com/<github-user>/<repo-name>.git
        SourceIdentifier: BUILD_SCRIPTS_SRC
        Type: GITHUB
        Auth:
          Type: OAUTH
  CypressScraperTriggerIAMRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          Effect: Allow
          Principal:
            Service:
              - events.amazonaws.com
          Action: sts:AssumeRole
      Policies:
        - PolicyName: TriggerCypressScraperCodeBuild
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - codebuild:StartBuild
                  - codebuild:BatchGetBuilds
                Resource:
                  - !GetAtt CypressScraper.Arn
      RoleName: !Sub ${AWS::StackName}-CypressScraperTriggerRole
  TriggerScraper:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: rate(1 hour)
      State: ENABLED
      RoleArn: !GetAtt CypressScraperTriggerIAMRole.Arn
      Targets:
        - Arn: !GetAtt CypressScraper.Arn
          Id: cypress-xbox-scraper
          RoleArn: !GetAtt CypressScraperTriggerIAMRole.Arn
  XboxAlert:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: !Sub ${AWS::StackName}-XboxAlert
Parameters:
  StackTagName:
    Type: String
    Description: Stack Name (injected by Stackery at deployment time)
  EnvironmentTagName:
    Type: String
    Description: Environment Name (injected by Stackery at deployment time)

Let's break this down a bit. For those new to serverless, this is an AWS SAM template. While using Stackery means you generally can avoid writing template files, there are a few things worth noting, and one line you'll need to input your own data into.

We'll start with lines 55-74:

  CypressScraper:
    Type: AWS::CodeBuild::Project
    Properties:
      Artifacts:
        Type: NO_ARTIFACTS
      Description: Cypress Xbox Scraper
      Environment:
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:2.0
        Type: LINUX_CONTAINER
        PrivilegedMode: true
      Name: cypress-xbox-scraper
      ServiceRole: !Ref CodeBuildIAMRole
      Source:
        BuildSpec: buildspec.yml
        Location: https://github.com/<github-user>/<repo-name>.git
        SourceIdentifier: BUILD_SCRIPTS_SRC
        Type: GITHUB
        Auth:
          Type: OAUTH

This is the CodeBuild project that will be created to run Cypress in a Linux container in one of AWS's magical server estates. You'll need to replace line 70 with the Git repo you just created. This also means you may need to authenticate your Git provider with AWS, but I'll walk you through that a bit later.

Line 101 is where you can change how often the scraper runs. Learn more about AWS schedule expressions here.
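If you plan to fiddle with that schedule, the one gotcha worth knowing is that rate expressions want a singular unit for a value of 1 (rate(1 hour)) and a plural unit for anything larger (rate(30 minutes)). Here's a minimal sketch of a helper that builds the string for you. The function name rateExpression and the UNITS list are my own inventions for illustration, not part of any AWS SDK:

```javascript
// Hypothetical helper (not an AWS API): builds a rate() schedule expression.
const UNITS = ['minute', 'hour', 'day'];

function rateExpression(value, unit) {
  if (!Number.isInteger(value) || value < 1) {
    throw new RangeError('value must be a positive integer');
  }
  if (!UNITS.includes(unit)) {
    throw new RangeError(`unit must be one of: ${UNITS.join(', ')}`);
  }
  // A value of 1 takes the singular unit; anything larger takes the plural
  return `rate(${value} ${value === 1 ? unit : unit + 's'})`;
}

console.log(rateExpression(1, 'hour'));    // rate(1 hour)
console.log(rateExpression(30, 'minute')); // rate(30 minutes)
```

Whatever string you end up with goes in the ScheduleExpression property of the TriggerScraper rule.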

Now, if you switch back to Visual mode, you'll see that several resources were just auto-magically populated from the template:

xbox-3

They include:

  • TriggerScraper: The CloudWatch event rule that triggers the Cypress CodeBuild job every hour
  • TriggerMessage: The EventBridge Rule that triggers the SendMessage function once the CodeBuild job succeeds or fails
  • SendMessage: The Lambda function that sends the SNS message if Xboxes are back in stock
  • XboxAlert: The SNS topic for sending SMS messages

You can double-click each resource to see its individual settings.
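If the TriggerMessage rule feels a bit magical, it may help to see roughly what its event pattern is doing. The sketch below is a deliberately simplified matcher that I wrote for illustration: it only handles exact-value lists and nested objects, which is all the pattern in this template uses, and it is not how EventBridge is actually implemented. The matches function and the sample events are assumptions of mine:

```javascript
// Simplified illustration of exact-value pattern matching; not EventBridge's real implementation.
// The pattern is copied from the SendMessage EventRule in the template.
const pattern = {
  source: ['aws.codebuild'],
  'detail-type': ['CodeBuild Build State Change'],
  detail: {
    'build-status': ['SUCCEEDED', 'FAILED'],
    'project-name': ['cypress-xbox-scraper'],
  },
};

function matches(pattern, event) {
  return Object.entries(pattern).every(([key, expected]) => {
    const actual = event == null ? undefined : event[key];
    // An array lists the accepted values for a field
    if (Array.isArray(expected)) return expected.includes(actual);
    // A nested object is matched field by field
    return matches(expected, actual);
  });
}

// Trimmed-down sample events; real ones carry many more fields
const finished = {
  source: 'aws.codebuild',
  'detail-type': 'CodeBuild Build State Change',
  detail: { 'build-status': 'FAILED', 'project-name': 'cypress-xbox-scraper' },
};
const inProgress = {
  ...finished,
  detail: { ...finished.detail, 'build-status': 'IN_PROGRESS' },
};

console.log(matches(pattern, finished));   // true
console.log(matches(pattern, inProgress)); // false
```

In other words, the function only wakes up when our specific CodeBuild project finishes, one way or the other.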

Look at that: a whole backend, and you didn't even have to open the AWS Console!

  4. Hit the Commit... button to commit this to your Git repo, then follow the link below the stack name to your new repo URL, clone the stack locally, and open it in your favorite VSCode (or other text editor, if you must)

xbox-2

To the code!

As you can see, Stackery created some directories for your function, as well as an AWS SAM template you'll be able to deploy. Thanks, Stackery!

First we'll want to add Cypress:

  1. From the root of your repo, run npm install cypress --save
  2. Once it's installed, run ./node_modules/.bin/cypress open.

Cypress will create its own directory, with a bunch of example code. You can go ahead and delete cypress/integration/examples and create cypress/integration/scraper.spec.js. Here's what will go in there:

// xbox-stock-alert/cypress/integration/scraper.spec.js

describe('Xbox out-of-stock scraper', () => {
  it('Checks to see if Xboxes are out of stock at Microsoft', () => {
    cy.visit('https://www.xbox.com/en-us/configure/8WJ714N3RBTL', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.get('[aria-label="Checkout bundle"]')
      .should('be.disabled')
  });
});

Let's break that down:

  1. Cypress will visit a specific URL - in this case, it's the product page of the Xbox Series X console
  2. The added headers allow the page to actually load without the dreaded ESOCKETTIMEDOUT error (I found this out the hard way, so you don't have to!)
  3. Cypress looks for an element with the aria-label "Checkout bundle" and checks if it's disabled. If it is, the test ends and it is considered successful. If it isn't, the test ends as a failure (but we all know it tried really, really hard)

Now, why the specific "Checkout bundle" element? Well, if you go to the Xbox page in your browser and inspect it, you'll see that it's actually the checkout button that would be enabled were the Xbox in stock:

Checkout button

Let's automate this sh*t!

Ok, we've got our test, and we've got a cron timer set to run once an hour. Now we need to add the CodeBuild job that actually runs this test. We also need to add code to our SendMessage function that notifies us if the test failed, meaning the checkout button is enabled and we're one step closer to new Xbox bliss.

Remember that Cypress projectId and record key you noted forever ago? Here's where those come in.

Create a new file in the root directory called buildspec.yml and add the following and save³:

version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 10
  build:
    commands:
      - npm install && npx cypress run --headless --browser electron --record --key <your-record-key>

Open up cypress.json and replace it with the following and save:

{
  "baseUrl": "https://www.xbox.com/en-us/configure/8WJ714N3RBTL",
  "defaultCommandTimeout": 30000,
  "chromeWebSecurity": false,
  "projectId": "<your-projectId>"
}

Next, we'll add the function code that sends an alert should the test fail. Open up src/SendMessage/index.js and replace it with the following:

// xbox-stock-alert/src/SendMessage/index.js

const AWS = require('aws-sdk');
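// Use the region the function is deployed in (Lambda sets AWS_REGION), falling back to us-west-2
const sns = new AWS.SNS({region: process.env.AWS_REGION || 'us-west-2'});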

const message = 'Xbox alert! Click me now: https://www.xbox.com/en-us/configure/8WJ714N3RBTL';
const defaultMessage = 'No Xboxes available, try again later';

exports.handler = async (event) => {
  // Log the event argument for debugging and for use in local development
  console.log(JSON.stringify(event, undefined, 2));
  // If the CodeBuild job was successful, that means Xboxes are not in stock and no message needs to be sent
  if (event.detail['build-status'] === 'SUCCEEDED') {
    console.log(defaultMessage)
    return {
      statusCode: 200,
      body: defaultMessage
    };
  } else if (event.detail['build-status'] === 'FAILED') {
    // If the CodeBuild job failed, that means Xboxes are back in stock!
    console.log('Sending message: ', message);

    // Create SNS parameters
    const params = {
      Message: message, /* required */
      TopicArn: process.env.TOPIC_ARN,
      MessageAttributes: {
        'AWS.SNS.SMS.SMSType': {
          DataType: 'String',
          StringValue: 'Promotional'
        },
        'AWS.SNS.SMS.SenderID': {
          DataType: 'String',
          StringValue: 'XboxAlert'
        },
      },
    };

    try {
      let data = await sns.publish(params).promise();
      console.log('Message sent! Xbox purchase, commence!');
      return { 
        statusCode: 200,
        body: data
      };
    } catch (err) {
      console.log('Sending failed', err);
      throw err;
    }
  }
  return {};
};
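Since the handler logs its event "for use in local development", here's a minimal sketch of how you might poke at the branching logic on your own machine, no AWS required. The shouldSendAlert and sampleEvent helpers are hypothetical names of mine and are not part of the function above; they just mirror its SUCCEEDED/FAILED check against trimmed-down sample events:

```javascript
// Hypothetical helper mirroring the handler's branching: alert only on FAILED builds.
function shouldSendAlert(event) {
  const status = event && event.detail && event.detail['build-status'];
  return status === 'FAILED';
}

// Trimmed-down sample events; real CodeBuild events carry many more fields.
const sampleEvent = (status) => ({
  source: 'aws.codebuild',
  'detail-type': 'CodeBuild Build State Change',
  detail: { 'build-status': status, 'project-name': 'cypress-xbox-scraper' },
});

console.log(shouldSendAlert(sampleEvent('SUCCEEDED'))); // false
console.log(shouldSendAlert(sampleEvent('FAILED')));    // true
console.log(shouldSendAlert({}));                       // false
```

One thing this makes obvious: any FAILED build triggers the text, so a broken npm install or a page that times out will look exactly like "Xboxes are in stock". That's part of why the Cypress video is handy for double-checking.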

Oh, and while you're at it, you may want to add node_modules and package-lock.json to your .gitignore, unless polluting Git repos is your thing.

Time to deploy this bad boy

Be sure to git add, commit, and push your changes. When deploying, AWS will need access to your Git provider. Follow these instructions to set up access tokens in your account if you've never done that before. (This doc might also come in handy for noobs like me).

If you're using Stackery to deploy, like the smart and also good-looking developer you are, all you need to do is run the following command in the root of your repo:

stackery deploy

This will take a few minutes, during which time you can daydream about how awesome that new Xbox is going to be once it's hooked up to your 4K TV.

waiting gif

Done? Ok! Next step: adding your phone number for text alerts.

Can I get your digits?

As I mentioned above, one of the resources created in your stack was the XboxAlert SNS topic. It was created during the deployment, but right now it's not doing anything. Let's change that.

  1. Open the AWS Console, and navigate to the SNS Dashboard
  2. Under Topics, you should see your freshly-minted topic, called something like xbox-stock-alert-<env>-XboxAlert. Click its name
  3. Click the big orange Create subscription button
  4. Fill out the form like so with your mobile number, and click Create subscription again:

subscription

You'll need to verify your phone number if you haven't used it with SNS before, and then you're good to go!
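If clicking through the console isn't your thing, the same subscription can in principle be scripted. Below is a minimal sketch that only builds the parameters and checks that the phone number looks like E.164 format (a plus sign, country code, then digits), which is what SNS expects for SMS endpoints. The buildSmsSubscription name, the topic ARN, and the phone number are all placeholders I made up:

```javascript
// Hypothetical helper: builds the parameters for an SNS SMS subscription.
function buildSmsSubscription(topicArn, phoneNumber) {
  // E.164: "+" followed by up to 15 digits, with no leading zero
  if (!/^\+[1-9]\d{1,14}$/.test(phoneNumber)) {
    throw new Error(`Not an E.164 phone number: ${phoneNumber}`);
  }
  return { TopicArn: topicArn, Protocol: 'sms', Endpoint: phoneNumber };
}

// Placeholder values; swap in your own topic ARN and mobile number
const params = buildSmsSubscription(
  'arn:aws:sns:us-west-2:123456789012:xbox-stock-alert-dev-XboxAlert',
  '+15555550100'
);
console.log(params.Protocol, params.Endpoint); // sms +15555550100
```

With the aws-sdk v2 client used in SendMessage, you would then pass these to sns.subscribe(params).promise() from a script that has credentials for your account.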

Testing time

Still in AWS, you should now be able to open up the CodeBuild console and see a new project in there:

xbox-4

You'll want to run it manually to make sure everything works before setting and forgetting it, so go ahead and select your project and hit that Start build button. This will take some time as well, but you can tail the CloudWatch logs by clicking the project name and selecting the most recent build run.

Vids or it didn't happen

Hopefully, your build was a success (and if it wasn't, hit me up - I think I hit all the errors while building this out and may be able to help).

But how can you make sure? Well, you can go back to your project in Cypress.io, and see if there's anything in your latest runs. If all went well, you'll be able to watch a video of the headless browser running your spec!

xbox-5

And, should one day that test fail 🤞, you'll get a notification straight to your phone letting you know that Xbox is right there for the taking. Good luck!

Notes

¹ I actually just made that up, but I imagine the inventor of the hammer said that at some point.
² I also just made that up, but that doesn't make it any less true.
³ A much better way to do this is to keep your record key in AWS Systems Manager Parameter Store and pass it to the build as an environment variable, but for the sake of brevity my example hard-codes the key. Just make sure your repo is private if you follow my bad example 🙏

Postscript

It's possible to extend the scraper spec to add more retailers, though I ran into issues with a few, such as Walmart's bot detector:

walmart

I wasn't able to get these running without errors, but maybe someone else will have more luck and can comment with their solutions:

// xbox-stock-alert/cypress/integration/scraper.spec.js

describe('Xbox out-of-stock scraper - more retailers', () => {
  it('Checks to see if Xboxes are out of stock at GameStop', () => {
    cy.visit('https://www.gamestop.com/accessories/xbox-series-x/products/xbox-series-x/11108371.html?condition=New', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.get('span.delivery-out-of-stock')
    cy.get('span.store-unavailable')
  });
  it('Checks to see if Xboxes are out of stock at Best Buy', () => {
    cy.visit('https://www.bestbuy.com/site/microsoft-xbox-series-x-1tb-console-black/6428324.p?skuId=6428324', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.get('[data-sku-id="6428324"]')
      .should('be.disabled')
  });
  it('Checks to see if Xboxes are out of stock at Walmart', () => {
    cy.visit('https://www.walmart.com/ip/Xbox-Series-X/443574645', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      }
    });
    cy.get('.spin-button-children')
      .contains('Get in-stock alert');
  });
  it('Checks to see if Xboxes are out of stock at Costco', () => {
    cy.visit('https://www.costco.com/xbox-series-x-1tb-console-with-additional-controller.product.100691493.html', {
      headers: {
        "Accept-Encoding": "gzip, deflate",
        "keepAlive": true
      },
      pageLoadTimeout: 60000
    });
    cy.get('.oos-overlay')
  });
});

Top comments (14)

Dan Silcox

This is awesome! I’ve been wanting to use cypress for something for ages and also how did I not know about stackery!! We were talking at work recently about being able to convert XML based draw.io diagrams into SAM templates and stackery is basically a more mature version of that functionality by the looks - truly awesome :)

Anna

Yay! Give it a try and feel free to ask questions via the chat - one of us is on the other end (during PST biz hours, anyway) and will be happy to help if anything is confusing.

Gracie Gregory (she/her)

👋 ❤️👋 ❤️

Damn, I can see y'all have been hard at work over there. Stackery is looking 🔥

Gracie Gregory (she/her) • Edited

... and so is the website 🤩

Anna

Thanks, miss you friend!

And clearly I need more work if I have the time to create testing suite abominations 😂

Ryan Coleman

I think this counts as R&D, right?

right?

rolldy

Who doesn't love a shrugging white guys gif?

Anna

I myself like a good shrugging classic movie gif

breakfast club

Farrah

Great post!!!

Adam Parker

I'm having a hard time finding the cypress.json file

Also the integration folder that is supposed to be in the cypress folder

It's probably operator error, but I'm lost.

Anna

Hey, you need to run these two steps in the root of your repo:

  1. npm install cypress --save
  2. Once it's installed, run ./node_modules/.bin/cypress open.
Anna

That will initialize cypress in that project, and create all of those files. Let me know if that still doesn't work for you!

Nikos Kanakis

Can i use stackery without an AWS account?

Anna

You can use it to architect a serverless app, but you can't deploy without an AWS account, as that's what we deploy into.

You can try playing around on a canvas or the free VS Code extension without an account:
app.stackery.io/editor/design?owne...