cu_0xff 🇪🇺 for DeepCode.AI

Posted on • Originally published at Medium

DeepCode’s Top Findings #1: Java Date (This one made me dizzy)

Hey,
DeepCode offers AI-based static program analysis for Java, JavaScript, TypeScript, and Python. As you might know, we use thousands of open source repos to train our engine. We asked the engine team to provide some stats on the findings, and in this series of blog articles we want to introduce the top suggestions from our engine and give some background on each. This one made me dizzy: we are looking back on some 70 years of software engineering and still have problems with (drumroll) date formats…

Language: Java
Defect: Date Data Format (Category General 1)
Diagnosis: Ambiguous date-time output formats (e.g. 12h output without am/pm suffix)

Background:

Quick, what is the difference between the date format strings MM-DD-YYYY, mm-dd-yyyy, and MM-dd-yyyy? I can tell you, they all produce the same result if you call them at just the right moment. But obviously, they are fundamentally different. Well, let us shed some light…

Java had a bad start regarding its date classes. java.util.Date had serious design flaws when it was introduced, which led to lots of confusion. A Date instance in Java is actually not a date but a moment in time; therefore it has (1) no time zone, (2) no format, and (3) no calendar system. I suggest this valuable blog post for the full story. After years, the Java community acted and introduced new classes (the java.time package in Java 8). Still, there are lots of traps (for example, java.text.SimpleDateFormat is not thread-safe while java.time.format.DateTimeFormatter is), but for now, let us focus on the most common mistake, which is the one flagged by DeepCode:
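
To see what "a moment in time" means in practice, here is a minimal sketch. The class name DateIsAnInstant and the helper render are made up for this illustration; the point is only that one and the same Date prints differently depending on the time zone the formatter is given.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateIsAnInstant {
    // Hypothetical helper: formats the same instant for a given time zone ID.
    // A new SimpleDateFormat is created per call because the class is not thread-safe.
    static String render(Date moment, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(moment);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01T00:00:00Z, stored as milliseconds only

        System.out.println(render(epoch, "UTC"));                 // 1970-01-01 00:00
        System.out.println(render(epoch, "Asia/Tokyo"));          // 1970-01-01 09:00
        System.out.println(render(epoch, "America/Los_Angeles")); // 1969-12-31 16:00
    }
}
```

The Date itself never changes; the time zone, the format, and the calendar all live in the formatter.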

  • Using mm for months and/or MM for minutes (wrong!). MM is the month in the year; mm is the minute in the hour.
  • Using hh for “hour of the day” when HH was really intended. HH ranges from 0 to 23, while hh ranges from 1 to 12 and is mostly used as a single character (h). It needs the AM/PM marker (a), or the output is ambiguous.
  • Using YYYY for year. YYYY is the week year, meant to be used in conjunction with “week of the year”, and it can lead to unexpected results in the first and last week of a year. Normally, you want to use yyyy.
  • Using DD for “day of the month” when in reality it means “day of the year”.

Make sure to use the correct pattern characters. As a reference, the following applies to java.text.SimpleDateFormat:
Pattern Character | Date or Time Component | Example Result
----------------- | ---------------------- | --------------
G | Era designator | AD
y | Year | 2020 (yyyy), 20 (yy)
Y | Week year (may give unexpected results in the first and last week of the year) | 2020 (YYYY), 20 (YY)
M | Month in year | July (MMMM), Jul (MMM), 07 (MM)
w | Week in year | 16
W | Week in month | 3
D | Day in year | 266
d | Day in month | 09 (dd), 9 (d)
F | Day of week in month | 4
E | Day name in week | Tuesday (EEEE), Tue (E)
u | Day number of week (1 = Monday, 2 = Tuesday, ..., 7 = Sunday) | 2
a | AM/PM marker | AM
H | Hour in day (0-23) | 12
k | Hour in day (1-24) | 23
K | Hour in am/pm, 12-hour format (0-11) | 0
h | Hour in am/pm, 12-hour format (1-12) | 12
m | Minute in hour | 59
s | Second in minute | 35
S | Millisecond | 978
z | Time zone (general) | Pacific Standard Time; PST; GMT-08:00
Z | Time zone offset (RFC 822) | -0800
X | Time zone offset (ISO 8601) | -08; -0800; -08:00

Note: This is Java. Do not simply expect this to be the same elsewhere. Always check the documentation.
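
To make the four mistakes from the list above tangible, here is a small sketch that formats one fixed moment with a correct pattern and with each of the wrong ones. The class name DatePatternPitfalls, the helper format, and the sample date are my own choices for illustration; the sample (30 December 2019, 15:07 UTC) was picked because it falls into the first week of week year 2020.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DatePatternPitfalls {
    // Fixed sample moment (an assumption for this demo): Monday, 30 December 2019, 15:07 UTC.
    static final Date SAMPLE = Date.from(Instant.parse("2019-12-30T15:07:00Z"));

    // Hypothetical helper: formats SAMPLE with the given pattern in UTC.
    static String format(String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(SAMPLE);
    }

    public static void main(String[] args) {
        // Correct: month, day of month, year, 24-hour clock.
        System.out.println(format("MM-dd-yyyy HH:mm"));   // 12-30-2019 15:07

        // mm and MM swapped: minutes show up as the month and vice versa.
        System.out.println(format("mm-dd-yyyy HH:MM"));   // 07-30-2019 15:12

        // hh without the AM/PM marker: 3 in the morning or in the afternoon?
        System.out.println(format("MM-dd-yyyy hh:mm"));   // 12-30-2019 03:07
        System.out.println(format("MM-dd-yyyy hh:mm a")); // 12-30-2019 03:07 PM

        // YYYY is the week year: this Monday already belongs to week 1 of 2020.
        System.out.println(format("MM-dd-YYYY"));         // 12-30-2020

        // DD is the day in the year, not the day in the month.
        System.out.println(format("MM-DD-yyyy"));         // 12-364-2019
    }
}
```

None of the wrong patterns throws an exception, which is exactly why this kind of bug tends to survive until the last days of December.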

This is not an exhaustive list of the problems around dates and times. We could talk about the difference between UTC offsets and time zones, the problems around time zone abbreviations (is BST British Summer Time, British Standard Time, or rather Bougainville Standard Time? No, I did not make this up. Well, who knows), or different calendars in different locales.
To provide the answer to our little puzzle above: you almost always want MM-dd-yyyy.
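
As a quick illustration of the offset versus time zone point, here is a sketch using java.time. The class name ZoneVersusOffset and both helpers are invented for this example. It shows that a named zone has different UTC offsets over the year, and that Java's legacy alias map (ZoneId.SHORT_IDS) adds yet another reading of BST.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

class ZoneVersusOffset {
    // Hypothetical helper: the UTC offset a named time zone uses at a given instant.
    static ZoneOffset offsetAt(String zoneId, String isoInstant) {
        return ZoneId.of(zoneId).getRules().getOffset(Instant.parse(isoInstant));
    }

    // Hypothetical helper: resolves an abbreviation through Java's legacy alias map.
    static String resolveShortId(String abbreviation) {
        return ZoneId.of(abbreviation, ZoneId.SHORT_IDS).getId();
    }

    public static void main(String[] args) {
        // One time zone, two offsets, depending on the season.
        System.out.println(offsetAt("Europe/London", "2020-01-15T12:00:00Z")); // Z
        System.out.println(offsetAt("Europe/London", "2020-07-15T12:00:00Z")); // +01:00

        // In the legacy map, BST is neither British nor Bougainville.
        System.out.println(resolveShortId("BST")); // Asia/Dhaka
    }
}
```

If you need to store or exchange a zone, a full region ID such as Europe/London is the safer choice over a three-letter abbreviation.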
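
If you want to try the puzzle yourself, here is a rough sketch. DatePuzzle and its helper format are names I made up, and both sample instants are assumptions chosen for the demo: the first is a "lucky" moment in January where the minute happens to equal the month, the second is one month later.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DatePuzzle {
    // Hypothetical helper: formats an ISO-8601 instant with the given pattern in UTC.
    static String format(String pattern, String isoInstant) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(Date.from(Instant.parse(isoInstant)));
    }

    public static void main(String[] args) {
        String[] patterns = {"MM-DD-YYYY", "mm-dd-yyyy", "MM-dd-yyyy"};

        // Lucky moment: 15 January 2020, 10:01 UTC. All three patterns agree.
        for (String p : patterns) {
            System.out.println(p + " -> " + format(p, "2020-01-15T10:01:00Z"));
        }
        // MM-DD-YYYY -> 01-15-2020
        // mm-dd-yyyy -> 01-15-2020
        // MM-dd-yyyy -> 01-15-2020

        // One month later, same time of day: the three patterns drift apart.
        for (String p : patterns) {
            System.out.println(p + " -> " + format(p, "2020-02-15T10:01:00Z"));
        }
        // MM-DD-YYYY -> 02-46-2020
        // mm-dd-yyyy -> 01-15-2020
        // MM-dd-yyyy -> 02-15-2020
    }
}
```

A unit test written on the lucky moment would pass for all three patterns, so it is worth testing date formatting with dates outside January and with a minute value above 12.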

CU

0xff
