DEV Community

Discussion on: Pwned Together: Hacking dev.to

 
joshcheek profile image
Josh Cheek • Edited

How about something like this?

require 'uri'

def valid?(link)
  uri = URI.parse link
  return false if uri.scheme != 'https'
  return false if uri.userinfo
  return false if uri.host != 'gist.github.com'
  return false if uri.port != 443 # I think it has to be this if its https?
  return false if uri.fragment
  return false if uri.query

  # idk if old gist ids could be arbitrary chars,
  # but the 10 or so I looked at all seemed to be hex
  path, gist = File.split uri.path
  return false unless gist && gist.match?(/\A[0-9a-f]{32}\z/)

  path, user = File.split path
  return false unless user && user.match?(/\A[a-zA-Z0-9\-]{1,39}\z/)

  path == '/'
end

Plus, a bunch of test cases:

valid = [
  # suggested one
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # max name size (based on the existing regex, I didn't verify that it matches Github's rules)
  "https://gist.github.com/aaaaaaaaaabbbbbbbbbbccccccccccddddddddd/4bb1682ce590dc42402b2edddbca7aaa",
  # min name size
  "https://gist.github.com/a/4bb1682ce590dc42402b2edddbca7aaa",
  # each of the gist id char values (seems to be hex)
  "https://gist.github.com/a/0123456789abcdef0123456789abcdef",
  # user can have a dash in their name
  "https://gist.github.com/quincy-larson/4bb1682ce590dc42402b2edddbca7aaa",
  # just for contrast with some of the tests below
  "https://gist.github.com/a/0000000000111111111122222222223a",
]

invalid = [
  # the same as the valid one, except it's http. I actually thought browsers
  # would refuse to make http requests from https sites, but either way, it
  # should be disallowed as it's MITMable
  "http://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # host
  "https://evil.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # auth credentials
  "https://user:pass@gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # http port (not totally sure this matters)
  "https://gist.github.com:80/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # query
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa?q",
  # fragment
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa#f",
  # this one exceeds the max name size
  "https://gist.github.com/aaaaaaaaaabbbbbbbbbbccccccccccddddddddde/4bb1682ce590dc42402b2edddbca7aaa",
  # less than min name size
  "https://gist.github.com//4bb1682ce590dc42402b2edddbca7aaa",
  # invalid gist id chars (outside the hex range)
  "https://gist.github.com/a/0000000000111111111122222222223x",
  # quick test showed that GH was case sensitive about the hex in the gist id
  "https://gist.github.com/a/0000000000111111111122222222223A",
  # too few path segments
  "https://gist.github.com",
  "https://gist.github.com/",
  "https://gist.github.com/a",
  # too many path segments
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa/4bb1682ce590dc42402b2edddbca7aaa",
  # these ones were used to bypass validity in the blog
  "//evil.com/script#gist.github.com",
  "https://gist.github.com@evil.com/",
  "https://gist.github.com/AntonyGarand/a8a0b4a36a040edc6051e888afce8fab/raw/4deb366ddaf0597e82fea808f7f4cb3ad763d98f/poc.js",
]

valid.all?    { |link| valid? link } # => true
invalid.none? { |link| valid? link } # => true
Thread Thread
 
joshcheek profile image
Josh Cheek

Although, we should probably check whether you're allowed to have unicode in your username 🤔

Thread Thread
 
andy profile image
Andy Zhao (he/him)

@joshcheek gonna take that valid? method for checking giphy links. :)

Thread Thread
 
andy profile image
Andy Zhao (he/him)

Also that's pretty great. Might end up implementing it for the gist Liquid tag.