jeudi 21 avril 2016

Grouping a set of log entries into visits in SQL based on time since last entry

I have a table of log entries with an id, timestamp, source_ip (for the IP address) and some other data. I want to group this into "visits", where a visit is all log entries from one IP address where there were < X seconds since last log entry. i.e. for every log entry in a visit, there must be at least one other entry in that visit whose timstamp was < X seconds before or after this one.

If X = 10 minutes IP A has the following requests: 12:00, 12:05, 12:11, 12:40, 12:42, 12:50, 12:52, 14:01, then there are 3 visit groups: [12:00, 12:05, 12:11], [12:40, 12:42, 12:50, 12:52], [14:01].

I would like to do this entirely in SQL, but I'm not sure how. I'm guessing it a form of group by, perhaps with Common Table Expressions (WITH clause)? Can anyone tell me how to generate this? I'd know how to do it in Python (say), but I'd like to have it done in SQL

I'm currently trying this with SQLite 3, but I'm willing to change to PostgreSQL (even to postgresql 9.5).

Aucun commentaire:

Enregistrer un commentaire