Question :
I have a postgresql database with a table like this:
id_usuario | dt_atualizacao | atualizacao | outros_dados
123 | 01-01-2001 | 0 | abc
123 | 01-01-2005 | 1 | abe
123 | 01-01-2012 | 1 | acd
123 | 01-01-2018 | 1 | acc
124 | 01-01-2017 | 0 | acj
124 | 01-01-2018 | 1 | agy
125 | 01-01-2018 | 0 | awe
What happens is this: In this application, when a user is registered for the first time, he leaves the field atualizacao = 0
and sets the current date.
But when an update is made to this register (say I changed the field outros_dados
), instead of updating the current record, it creates a new record with a new date and field atualização = 1
.
What I need is a select that takes all user ids, without repeating it and being the most current one. In the example above, I would need the following result:
id_usuario | dt_atualizacao | atualizacao | outros_dados
123 | 01-01-2018 | 1 | acc
124 | 01-01-2018 | 1 | agy
125 | 01-01-2018 | 0 | awe
Any tips on how to do this?
Answer :
The table structure is not exactly “great”.
If you want to have an update history then you should, as well as @Dudaskank , is to change the data type from DATE
to TIMESTAMP
to be able to store the time, minute and second of the update and thus be able to pick up the last update in a simpler and cleaner way.
The best way to get the latest updates, given the change in the dt_atualizacao
column type, is to use the DISTINCT ON
:
SELECT DISTINCT ON (id_usuario) id_usuario
, dt_atualizacao
, outros_dados
FROM historico
ORDER BY id_usuario
, dt_atualizacao DESC
The Historico
table will be your table with the updates.
Some points to note:
date
field, which seems to save only the date in this example, use timestamp
or similar, so if there is more than one update on the same date, you can differentiate between them which is the most recent. atualizacao
column. Just get the highest date for given id
to know the most recent. Update 8/29/2018 1:49 PM
As the author has reported that only one update per day is what happens in practice, even using date
will work without problems.
I’ve also created a SQL Fiddle with the example data, both with my solution and colleague’s solution @ João Martins . For some reason, which I do not understand much to speak the truth, my query ended up being faster, so the tip is (but his is much easier to read by a human, if you do not need more speed is also another tip) .
For the query, try something like this, where historico
is the name of the table and dt_atualizacao
the field with the date and time of that record:
SELECT * FROM historico a WHERE dt_atualizacao=(SELECT MAX(dt_atualizacao) FROM historico b WHERE b.id_usuario=a.id_usuario);