DEV Community

Checking whether a number sequence has two or four digits in Python

Robert Rees (rrees) ・2 min read

I had an interesting problem the other day which I thought would be pretty trivial but actually turned out to be a bit of a pain.

UK dates are often written in the following formats:

  • dd/mm/yy
  • dd/mm/yyyy

The script I was writing needed to do a quick and dirty validation that the string input was in one of the valid forms before using a proper library to actually read the date.

The first regular expression I tried turned out to not be valid in Python so I started using the excellent Pythex online tool to help craft something that would work.

In the examples that follow, add the start and end anchors mentally; I've omitted them.

My first naive attempt didn't work at all:

\d{2}/\d{2}/[\d{4}|\d{2}]

This matched if the year was one digit long. I thought maybe it was because I had the longer sequence at the start, but swapping the two sides of the or (the pipe symbol) made no difference.

The first issue here is the square brackets, which actually define a character class: a set of characters, any one of which can match. I'd simply forgotten what they meant. The entire square bracket evaluates to a single character, so all it was matching was a single digit (or, as a side effect, a literal brace or pipe, since those sit inside the brackets too).
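A quick check of that behaviour, as a minimal sketch: the anchors are written out explicitly and the sample strings are made up for illustration.

```python
import re

# Everything between [ and ] is one character class, so it matches exactly
# ONE character: a digit, '{', '}' or '|'.
pattern = re.compile(r'\A\d{2}/\d{2}/[\d{4}|\d{2}]\Z')

one_digit_year = pattern.match('01/02/3') is not None
two_digit_year = pattern.match('01/02/03') is not None
pipe_as_year = pattern.match('01/02/|') is not None

print(one_digit_year, two_digit_year, pipe_as_year)  # True False True
```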

Okay...

\d{2}/\d{2}/\d{2}|\d{4}

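This kind of works, except that the pipe splits the whole pattern rather than just the year: the left-hand side matches a complete dd/mm/yy date and the right-hand side matches any four digits on their own. So it doesn't really match four-digit years; it matches anything that starts with a two-digit-year date, which means three-digit years are fine if you don't have the end anchor.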
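Here is a small sketch of how that pattern is actually grouped, compared with a version that wraps the year in parentheses. The anchors are written out and the sample strings are my own.

```python
import re

# Without a group, '|' splits the WHOLE pattern into two alternatives:
#   \A\d{2}/\d{2}/\d{2}     or     \d{4}\Z
ungrouped = re.compile(r'\A\d{2}/\d{2}/\d{2}|\d{4}\Z')
# With a group, the '|' only applies to the year.
grouped = re.compile(r'\A\d{2}/\d{2}/(\d{4}|\d{2})\Z')

# Left alternative has no end anchor, so a three-digit year slips through.
ungrouped_three_digit = ungrouped.search('01/02/123') is not None
# Right alternative has no start anchor, so any trailing four digits match.
ungrouped_no_date = ungrouped.search('nonsense 2020') is not None

grouped_three_digit = grouped.search('01/02/123') is not None
grouped_no_date = grouped.search('nonsense 2020') is not None

print(ungrouped_three_digit, ungrouped_no_date)  # True True
print(grouped_three_digit, grouped_no_date)      # False False
```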

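As the alternatives of the or are tried left to right, my first ordering, with the longer sequence first, was actually the right one. (Strictly speaking, once the end anchor is in place either order works, because the engine backtracks and tries the other alternative.)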

\d{2}/\d{2}/\d{4}|\d{2}

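And for the tl;dr cut and paste version, with the anchors restored and the two year forms wrapped in a group so that the or only applies to the year: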

r'\A\d{2}/\d{2}/(\d{4}|\d{2})\Z'
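As a rough sketch of how this might be used for the quick validation step before handing over to a real date parser: the helper name `parse_uk_date` is hypothetical, and the choice of `datetime.strptime` as the "proper library" is an assumption on my part rather than what the original script used.

```python
import re
from datetime import datetime

DATE_SHAPE = re.compile(r'\A\d{2}/\d{2}/(\d{4}|\d{2})\Z')

def parse_uk_date(text):
    """Return a date for dd/mm/yy or dd/mm/yyyy input, or None if the shape is wrong."""
    match = DATE_SHAPE.match(text)
    if match is None:
        return None
    # The captured year tells us which strptime format applies.
    fmt = '%d/%m/%Y' if len(match.group(1)) == 4 else '%d/%m/%y'
    # strptime still raises ValueError for impossible dates such as 31/02/2020.
    return datetime.strptime(text, fmt).date()

print(parse_uk_date('25/12/2020'))  # 2020-12-25
print(parse_uk_date('25/12/20'))    # 2020-12-25
print(parse_uk_date('25/12/202'))   # None
```

The regular expression only checks the shape of the string; whether the day and month make sense is left to the parser.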

I didn't expect this to be such a struggle. I was kind of reminded of the old adage about trying to fix a problem with a regular expression and ending up with two problems...
