Postgres “fuzzy” join / dedupe

I have an event log that occasionally ends up with duplicated events (due to “at least once” delivery). Events are practically unique over a short time frame, but the same text might repeat on a time scale of days or weeks. I’d like to create a view of some kind where N events with the same text that occurred within 10 seconds of each other are reduced to 1. How can I do this in Postgres? Perhaps with a window function or a partition?

This is what I currently have, which works fine for toy examples, but explodes for anything nontrivial in size:

select * from events where (
    select count(*)
    from events as nearby
    where
        nearby.data=events.data and
        nearby.time>events.time and
        nearby.time<events.time+10
) = 0;
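
For comparison, here is a rough sketch of how the same filter might look with a window function, so the table is sorted once per data value instead of being re-scanned for every row. It assumes, as the query above implies, that events has a data column and a time column holding a numeric count of seconds; with a timestamp column the comparison would use interval '10 seconds' instead of 10. The view name events_deduped and the alias next_time are made up for illustration, and only data and time are selected, so any other columns would need to be added by hand.

```sql
-- Sketch only: keep an event when the next event with the same data
-- is at least 10 seconds later, or when there is no next event.
-- Assumes events(data, time) with time as numeric seconds.
create view events_deduped as
select data, time
from (
    select
        data,
        time,
        -- time of the following event that has the same data, if any
        lead(time) over (partition by data order by time) as next_time
    from events
) as e
where next_time is null
   or next_time >= time + 10;
```

Like the query above, this keeps the last event of each burst. One difference: two rows with identical data and identical time are both kept by the query above, while this version keeps only one of them. A btree index on (data, time) may also help either version, though I have not measured that.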

Author: lurf jurv