This question has been asked before but I am facing a slightly different problem.


I have a table which logs events and stores their timestamps (as datetime). I need to break time up into chunks and get the number of events that occurred in each interval. The interval length is configurable (say, from 5 minutes to 1 hour and beyond).


The obvious solution is to convert the datetime to a Unix timestamp with unix_timestamp, divide it by the number of seconds in the interval, take the floor, and multiply it back by the number of seconds. Finally, convert the result back to datetime format with from_unixtime.


This works fine for small intervals.


select from_unixtime(floor(unix_timestamp(event.timestamp)/300)*300) as start_time,
count(*) as total 
from event 
where timestamp>='2012-08-03 00:00:00' 
group by start_time;

This gives the correct output


| start_time          | total |
| 2012-08-03 00:00:00 |    11 |
| 2012-08-03 00:05:00 |     4 |
| 2012-08-03 00:10:00 |     4 |
| 2012-08-03 00:15:00 |     7 |
| 2012-08-03 00:20:00 |     8 |
| 2012-08-03 00:25:00 |     1 |
| 2012-08-03 00:30:00 |     1 |
| 2012-08-03 00:35:00 |     3 |
| 2012-08-03 00:40:00 |     3 |
| 2012-08-03 00:45:00 |     5 |
~~~~~OUTPUT SNIPPED~~~~~~~~~~~~

But if I increase the interval to, say, 1 hour (3600 seconds), the buckets start at 30 minutes past the hour:


mysql> select from_unixtime(floor(unix_timestamp(event.timestamp)/3600)*3600) as start_time, count(*) as total from event where timestamp>='2012-08-03 00:00:00' group by start_time;
| start_time          | total |
| 2012-08-02 23:30:00 |    35 |
| 2012-08-03 00:30:00 |    30 |
| 2012-08-03 01:30:00 |    12 |
| 2012-08-03 02:30:00 |    18 |
| 2012-08-03 03:30:00 |    12 |
| 2012-08-03 04:30:00 |     4 |
| 2012-08-03 05:30:00 |     3 |
| 2012-08-03 06:30:00 |    13 |
| 2012-08-03 07:30:00 |   269 |
| 2012-08-03 08:30:00 |   681 |
| 2012-08-03 09:30:00 |  1523 |
| 2012-08-03 10:30:00 |   911 |

The reason, as far as I could gauge, for the boundaries not being set properly is that unix_timestamp will convert time from my local timezone (GMT + 0530) to UTC and then output the numerical value.

So a value like 2012-08-03 00:00:00 will actually be treated as 2012-08-02 18:30:00 UTC. Dividing and using floor will set the minutes part to 00 in UTC. But when I use from_unixtime, it converts the result back to GMT + 0530 and hence gives me intervals that begin at 30 minutes past the hour.
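
The same arithmetic can be reproduced outside MySQL. This is only a sketch in Python, not part of the query: IST, OFFSET_S and epoch_bucket_start are names made up for illustration, and it assumes a fixed GMT+05:30 offset with no DST. It also suggests why the 5-minute buckets looked fine: the 19800-second offset is a multiple of 300, but leaves a remainder of 1800 when divided by 3600.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed GMT+05:30 offset, standing in for the server's local timezone.
IST = timezone(timedelta(hours=5, minutes=30))
OFFSET_S = int(IST.utcoffset(None).total_seconds())  # 19800 seconds


def epoch_bucket_start(local_dt, interval_s):
    """Mimic from_unixtime(floor(unix_timestamp(ts) / n) * n) for an aware datetime."""
    epoch = int(local_dt.timestamp())            # seconds since 1970-01-01 UTC
    floored = epoch // interval_s * interval_s   # floor to an interval boundary in UTC
    return datetime.fromtimestamp(floored, tz=IST)  # convert back to local time


event = datetime(2012, 8, 3, 0, 10, 0, tzinfo=IST)

print(OFFSET_S % 300, OFFSET_S % 3600)   # 0 1800
print(epoch_bucket_start(event, 300))    # 2012-08-03 00:10:00+05:30
print(epoch_bucket_start(event, 3600))   # 2012-08-02 23:30:00+05:30
```

The last line matches the first row of the hourly output above: the bucket boundary is a whole hour in UTC, which is half past the hour in GMT + 0530.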

How do I ensure the query works correctly irrespective of the timezone? I use MySQL 5.1.52, so to_seconds() is not available.

EDIT: The query should also fire correctly irrespective of the interval (can be hours, minutes, days). A generic solution would be appreciated


2 Answers



You can use TIMESTAMPDIFF to group by intervals of time:


For a specified interval of hours, you can use:


-- n is a placeholder for your interval length in hours
SELECT   '2012-08-03 00:00:00' + 
         INTERVAL FLOOR(TIMESTAMPDIFF(HOUR, '2012-08-03 00:00:00', timestamp) / n) * n HOUR AS start_time,
         COUNT(*) AS total 
FROM     event 
WHERE    timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time

Replace the occurrences of 2012-08-03 00:00:00 with your minimum input date.

Here n is your specified interval in hours (every 2 hours, 3 hours, etc.), and you can do the same for minutes:


-- n is a placeholder for your interval length in minutes
SELECT   '2012-08-03 00:00:00' + 
         INTERVAL FLOOR(TIMESTAMPDIFF(MINUTE, '2012-08-03 00:00:00', timestamp) / n) * n MINUTE AS start_time,
         COUNT(*) AS total 
FROM     event 
WHERE    timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time

Here n is your specified interval in minutes (every 45 minutes, 90 minutes, etc.).

Be sure you're passing in your minimum input date (in this example 2012-08-03 00:00:00) as the second parameter to TIMESTAMPDIFF.

EDIT: If you don't want to worry about which interval unit to pick in the TIMESTAMPDIFF function, then of course just do the interval by seconds (300 = 5 minutes, 3600 = 1 hour, 7200 = 2 hours, etc.)

-- n is a placeholder for your interval length in seconds (300, 3600, 7200, ...)
SELECT   '2012-08-03 00:00:00' + 
         INTERVAL FLOOR(TIMESTAMPDIFF(SECOND, '2012-08-03 00:00:00', timestamp) / n) * n SECOND AS start_time,
         COUNT(*) AS total 
FROM     event 
WHERE    timestamp >= '2012-08-03 00:00:00'
GROUP BY start_time
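
To see why this is not affected by the timezone, here is roughly what the seconds-based expression computes, sketched in Python rather than SQL. The bucket_start helper and the sample timestamps are made up for illustration. The point is that the bucket is measured from your minimum date instead of from the Unix epoch, so no UTC conversion is involved.

```python
from datetime import datetime, timedelta


def bucket_start(ts, anchor, interval_s):
    """Roughly: anchor + INTERVAL FLOOR(TIMESTAMPDIFF(SECOND, anchor, ts) / n) * n SECOND."""
    elapsed = int((ts - anchor).total_seconds())  # whole seconds since the anchor
    return anchor + timedelta(seconds=elapsed // interval_s * interval_s)


# Naive datetimes, like a DATETIME column: no timezone is attached or converted.
anchor = datetime(2012, 8, 3, 0, 0, 0)

print(bucket_start(datetime(2012, 8, 3, 0, 47, 12), anchor, 3600))  # 2012-08-03 00:00:00
print(bucket_start(datetime(2012, 8, 3, 1, 5, 0), anchor, 3600))    # 2012-08-03 01:00:00
print(bucket_start(datetime(2012, 8, 3, 1, 5, 0), anchor, 2700))    # 2012-08-03 00:45:00
```

Every bucket boundary is the anchor plus a whole number of intervals, so the boundaries land wherever the minimum date puts them, whatever the interval length.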

EDIT2: To address your comment about reducing the number of places in the statement where you have to pass in your minimum date parameter, you can use:

-- n is a placeholder for your interval length in seconds
SELECT   b.mindate + 
         INTERVAL FLOOR(TIMESTAMPDIFF(SECOND, b.mindate, timestamp) / n) * n SECOND AS start_time,
         COUNT(*) AS total 
FROM     event 
JOIN     (SELECT '2012-08-03 00:00:00' AS mindate) b ON timestamp >= b.mindate
GROUP BY start_time

Then simply pass your minimum datetime parameter into the join subselect once.

You can even add a second column to the join subselect for your interval in seconds (e.g. 3600) and name it something like secinterval, then change each n to b.secinterval, so that you only have to pass in your minimum date parameter and your interval once each.




An easier method for hourly buckets is to group by date and hour:



select date(timestamp) as date_timestamp, hour(timestamp) as hour_timestamp, count(*) as total 
from event
where timestamp>='2012-08-03 00:00:00' 
group by date_timestamp, hour_timestamp

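If you would rather keep your original approach, shift the epoch value by 1800 seconds before flooring and shift it back afterwards. Note that the subtraction has to be applied to the result of unix_timestamp, not to the datetime column itself, and that 1800 is specific to a timezone offset ending in :30, such as GMT + 0530.

select from_unixtime(floor((unix_timestamp(event.timestamp)-1800)/3600)*3600+1800) as start_time, 
count(*) as total 
from event 
where timestamp>='2012-08-03 00:00:00' 
group by start_time;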

The first method also lets you choose a different interval. For example, to group the log into 15-minute buckets:


select date(timestamp) as date_timestamp, 
    hour(timestamp) as hour_timestamp,  
    floor(minute(timestamp) / 15) * 15 as minute_timestamp,
    count(*) as total
from event
group by date_timestamp, hour_timestamp, minute_timestamp
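
One caveat with this approach, sketched in Python for illustration (minute_bucket is a made-up helper that mirrors the three grouping columns): the minute part is floored within each hour, so it only gives equal-sized buckets when the step divides 60 evenly.

```python
from datetime import date, datetime


def minute_bucket(ts, step_min):
    """Return (date, hour, minute floored to step_min), like the GROUP BY columns."""
    return (ts.date(), ts.hour, ts.minute // step_min * step_min)


print(minute_bucket(datetime(2012, 8, 3, 0, 47, 12), 15))
# (datetime.date(2012, 8, 3), 0, 45)

# With a step of 45, each hour splits into :00-:44 and :45-:59,
# so the second bucket covers only 15 minutes.
print(minute_bucket(datetime(2012, 8, 3, 0, 50, 0), 45))
# (datetime.date(2012, 8, 3), 0, 45)
print(minute_bucket(datetime(2012, 8, 3, 1, 10, 0), 45))
# (datetime.date(2012, 8, 3), 1, 0)
```

So steps such as 5, 10, 15, 20 or 30 minutes work as expected, while 45 or 90 minutes need one of the other approaches.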

