SQL Timestamp Database information

SQL is the first programming language to have explicit temporal datatypes. I have had the theory that if Cobol had been designed with a TIMESTAMP datatype, we would have avoided all that Y2K trouble. At least now, more people are aware of the ISO 8601 time and date display standards. Who knows? Maybe people will start to use them.
The temporal support in each SQL product can be classified as either a "Unix-style" or "Cobol-style" internal representation.
In the Unix-style representation, each point in time is shown as a very large integer number that represents the number of clock ticks from a base date. This is how the Unix operating system handles its temporal data. The use of clock ticks makes calculations very easy — it becomes simple integer math. However, it is hard to convert the clock ticks into a yearmonth-day-hour-minute-second format.
In the Cobol-style representation, the database has a separate internal field for the year, month, day, hour, minute, and seconds. This is great for displaying the information, but not for calculations.

SQL Timestamp Database information:

One of the debates in the SQL Standards Committee was how to handle intervals of time. The reason that time is tricky is that it is continuous. The defining mathematical property of a continuum is that any part of it can be further sub-divided forever. Give me any line segment and I can cut it into smaller segments endlessly. But we run into the problem that the defining property of a point is that it cannot be further subdivided. So how can there be points in a continuum?
sql timestamp data type

When you give a year, say 2000, you are really giving me an interval of 365 days. Give me a date, say 2000-01-01, you are not giving me a point; you are identifying an interval of 24 hours. Give me the date and time 2000-01-01 00:00:00 and you are giving me an interval of 60 seconds. It never stops!!
The decision in SQL was to view time as a series of open ended intervals. That is, the segment includes the starting point in time, but never gets to the end point of the interval. This has some nice properties. It prevents you from counting the end of one event and the start of another event as identical moments in time. An open interval minus an open interval gives open intervals as a result and all points are accounted for.
But intervals are hard to work with conceptually. Let me give you an actual example that was posted in a newsgroup. We have a table that catches information about the user activity on a system. It is a very simple "log file" that shows when someone starts and ends a session with the system. We do not even care who the user was, since I am assuming that user_activity_id is a unique number that identifies a session, without identifying individual users. The table looks like this:
CREATE TABLE User_Activity (user_activity_id INTEGER NOT NULL PRIMARY KEY, login TIMESTAMP NOT NULL, logout TIMESTAMP, -- null means session is still active CHECK (login < logout), ...);

Using a NULL in the logout column to mean that the session is still active adds a little complexity to the problem. I decided to use the current timestamp at the time the query is executed as the logout time.
I would like to be able to report the number of user sessions logged on during each hour of the day. So, if someone began a session at 03:12 Hrs and ended it at 06:45 Hrs, I would like them to be counted as being logged on the system for 03:00 Hrs, 04:00 Hrs, 05:00 Hrs and 06:00 Hrs. This report should work all the hours in several years of data.
One solution proposed in the newsgroup involved using CASE expressions to classify each time extracted from the TIMESTAMP values as to what hourly interval it belongs. The logic got worse from there.
Conclusion:
Instead of trying to put the session into the periods, this query puts the starts and stops of the periods into the session interval. A period can have a start time, a stop time or both inside the session; this case is why you need to remove the duplicate period numbers.

No comments

Post Top Ad