The task is to get each user's most recent login time of the day, and then send the results to an API.
The log file (file.log) looks like the following (only the last two days):
2021-05-26 09:28:40.720+0000 INFO [alice]: logged in
2021-05-26 09:47:44.714+0000 INFO [alice]: logged in
2021-05-26 09:48:47.379+0000 INFO [frank]: logged in
2021-05-26 09:50:30.582+0000 INFO [bob]: logged in
2021-05-26 09:51:57.903+0000 INFO [alice]: logged in
2021-05-26 09:51:58.590+0000 INFO [alice]: logged in
2021-05-26 09:51:58.608+0000 INFO [alice]: logged in
2021-05-26 09:58:03.701+0000 INFO [bob]: logged in
2021-05-26 10:00:30.295+0000 INFO [alice]: logged in
2021-05-26 10:07:19.646+0000 INFO [frank]: logged in
2021-05-26 10:30:57.741+0000 INFO [alice]: logged in
2021-05-26 10:32:18.680+0000 INFO [alice]: logged in
2021-05-26 11:49:15.756+0000 INFO [bob]: logged in
2021-05-26 11:49:16.112+0000 INFO [alice]: logged in
2021-05-26 11:49:45.783+0000 INFO [frank]: logged in
2021-05-26 11:50:01.289+0000 INFO [susan]: logged in
2021-05-26 11:50:18.526+0000 INFO [frank]: logged in
2021-05-26 12:46:51.695+0000 INFO [bob]: logged in
2021-05-26 12:49:22.957+0000 INFO [alice]: logged in
2021-05-26 12:49:32.019+0000 INFO [frank]: logged in
2021-05-26 13:27:59.130+0000 INFO [alice]: logged in
2021-05-26 13:27:59.131+0000 INFO [alice]: logged in
2021-05-26 13:28:05.917+0000 INFO [bob]: logged in
2021-05-26 13:28:37.896+0000 INFO [frank]: logged in
2021-05-26 13:29:03.567+0000 INFO [bob]: logged in
2021-05-26 13:33:13.660+0000 INFO [frank]: logged in
2021-05-26 13:33:18.855+0000 INFO [alice]: logged in
2021-05-26 14:08:33.071+0000 INFO [alice]: logged in
2021-05-27 01:00:00.060+0000 INFO [alice]: logged in
2021-05-27 02:14:16.376+0000 INFO [alice]: logged in
2021-05-27 02:14:31.096+0000 INFO [alice]: logged in
2021-05-27 02:14:38.673+0000 INFO [bob]: logged in
2021-05-27 02:17:04.743+0000 INFO [bob]: logged in
2021-05-27 02:17:04.953+0000 INFO [alice]: logged in
2021-05-27 02:17:10.777+0000 INFO [alice]: logged in
2021-05-27 02:17:10.778+0000 INFO [alice]: logged in
2021-05-27 02:26:33.354+0000 INFO [bob]: logged in
2021-05-27 03:16:03.776+0000 INFO [alice]: logged in
2021-05-27 03:16:03.776+0000 INFO [alice]: logged in
2021-05-27 03:16:03.777+0000 INFO [alice]: logged in
2021-05-27 03:17:24.907+0000 INFO [bob]: logged in
2021-05-27 03:23:40.098+0000 INFO [frank]: logged in
2021-05-27 03:55:54.217+0000 INFO [alice]: logged in
2021-05-27 03:55:55.706+0000 INFO [alice]: logged in
2021-05-27 03:56:55.150+0000 INFO [alice]: logged in
2021-05-27 04:00:41.350+0000 INFO [alice]: logged in
2021-05-27 04:02:10.483+0000 INFO [bob]: logged in
2021-05-27 04:04:22.981+0000 INFO [bob]: logged in
2021-05-27 04:19:04.411+0000 INFO [alice]: logged in
2021-05-27 04:27:20.947+0000 INFO [bob]: logged in
2021-05-27 04:27:21.308+0000 INFO [alice]: logged in
2021-05-27 05:48:13.161+0000 INFO [alice]: logged in
2021-05-27 05:48:37.195+0000 INFO [alice]: logged in
2021-05-27 06:04:32.551+0000 INFO [bob]: logged in
2021-05-27 06:04:39.121+0000 INFO [alice]: logged in
2021-05-27 06:16:48.495+0000 INFO [bob]: logged in
2021-05-27 06:35:02.143+0000 INFO [alice]: logged in
2021-05-27 06:35:41.609+0000 INFO [bob]: logged in
2021-05-27 06:36:04.664+0000 INFO [bob]: logged in
2021-05-27 06:37:36.787+0000 INFO [frank]: logged in
2021-05-27 06:38:00.993+0000 INFO [alice]: logged in
2021-05-27 06:39:15.904+0000 INFO [alice]: logged in
2021-05-27 06:40:45.971+0000 INFO [bob]: logged in
2021-05-27 06:40:51.106+0000 INFO [alice]: logged in
2021-05-27 06:40:52.237+0000 INFO [alice]: logged in
2021-05-27 06:40:52.361+0000 INFO [alice]: logged in
2021-05-27 06:41:06.290+0000 INFO [frank]: logged in
2021-05-27 06:41:12.399+0000 INFO [alice]: logged in
2021-05-27 06:47:18.085+0000 INFO [frank]: logged in
2021-05-27 06:47:21.375+0000 INFO [alice]: logged in
2021-05-27 06:49:59.740+0000 INFO [frank]: logged in
2021-05-27 06:50:23.645+0000 INFO [alice]: logged in
2021-05-27 06:50:23.646+0000 INFO [alice]: logged in
2021-05-27 06:51:28.829+0000 INFO [frank]: logged in
2021-05-27 06:51:29.224+0000 INFO [alice]: logged in
2021-05-27 06:52:39.460+0000 INFO [bob]: logged in
2021-05-27 06:54:55.778+0000 INFO [alice]: logged in
2021-05-27 06:54:55.792+0000 INFO [alice]: logged in
2021-05-27 06:54:59.776+0000 INFO [alice]: logged in
2021-05-27 07:04:18.643+0000 INFO [bob]: logged in
2021-05-27 07:04:48.062+0000 INFO [frank]: logged in
2021-05-27 07:11:06.814+0000 INFO [alice]: logged in
2021-05-27 07:11:59.307+0000 INFO [frank]: logged in
2021-05-27 07:12:09.189+0000 INFO [bob]: logged in
2021-05-27 07:12:46.338+0000 INFO [martin]: logged in
2021-05-27 07:14:14.124+0000 INFO [martin]: logged in
2021-05-27 07:32:59.817+0000 INFO [alice]: logged in
2021-05-27 07:33:01.126+0000 INFO [alice]: logged in
2021-05-27 07:36:52.810+0000 INFO [frank]: logged in
2021-05-27 07:39:17.658+0000 INFO [alice]: logged in
2021-05-27 08:01:49.556+0000 INFO [alice]: logged in
2021-05-27 08:10:08.179+0000 INFO [frank]: logged in
2021-05-27 08:14:37.349+0000 INFO [alice]: logged in
2021-05-27 08:15:41.975+0000 INFO [bob]: logged in
2021-05-27 08:18:41.127+0000 INFO [admin]: logged in
2021-05-27 08:19:12.261+0000 INFO [admin]: logged in
2021-05-27 08:48:26.673+0000 INFO [bob]: logged in
2021-05-27 08:48:27.030+0000 INFO [alice]: logged in
2021-05-27 08:49:20.622+0000 INFO [alice]: logged in
2021-05-27 09:24:28.605+0000 INFO [alice]: logged in
2021-05-27 09:27:46.069+0000 INFO [alice]: logged in
2021-05-27 09:29:16.216+0000 INFO [bob]: logged in
2021-05-27 09:29:16.464+0000 INFO [bob]: logged in
2021-05-27 09:45:54.497+0000 INFO [alice]: logged in
Note: these are only the lines that end with logged in. The original file contains 24K different logs and could be much bigger.
Parsing the file should give this result:
2021-05-27T09:45:54.497Z alice
2021-05-27T09:29:16.464Z bob
2021-05-27T08:19:12.261Z admin
2021-05-27T08:10:08.179Z frank
2021-05-27T07:14:14.124Z martin
Basically, the most recent login time of today (in this case 2021-05-27) for each user. Users that haven't logged in today shouldn't be included.
Each result is then sent to an API, one at a time.
To do that I implemented the following bash script:
#!/bin/bash
SERVER_API_URL="http://localhost:8080/api/user"
TODAY=$(date +'%Y-%m-%d')
while read login_time user_id; do
# following line is wrapped in echo for testing
echo "curl -X PUT $SERVER_API_URL/$user_id/lastLoginTime/$login_time"
done < <(cat file.log | grep "$TODAY.*logged in" | sort -r | awk -F' ' '!seen[$4]++' | awk '{p=index($2,"+"); print $1"T"substr($2,1,p-1)"Z",$4 }' | awk -F'[' '{ print $1,$2}' | cut -d']' -f1)
Running this script on file.log gives this result:
curl -X PUT http://localhost:8080/api/user/alice/lastLoginTime/2021-05-27T09:45:54.497Z
curl -X PUT http://localhost:8080/api/user/bob/lastLoginTime/2021-05-27T09:29:16.464Z
curl -X PUT http://localhost:8080/api/user/admin/lastLoginTime/2021-05-27T08:19:12.261Z
curl -X PUT http://localhost:8080/api/user/frank/lastLoginTime/2021-05-27T08:10:08.179Z
curl -X PUT http://localhost:8080/api/user/martin/lastLoginTime/2021-05-27T07:14:14.124Z
For testing purposes, the curl command is printed to the console instead of being run.
It does the job, but I am not satisfied with the super long last command. How can it be improved? Any suggestions are welcome.
cat is useless and can be replaced by giving the filename as an argument to grep.
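One way the pipeline might be shortened, offered as a minimal sketch rather than a definitive rewrite: the filename is passed directly to the tool instead of going through cat, and the grep, sort, awk and cut stages are folded into a single awk program. The function name latest_logins and its two parameters are hypothetical, introduced only for illustration. The sketch assumes every login line has the layout shown in the sample (user in the fourth space-separated field, written as [name]:) and that the offset is always +0000.

```shell
#!/bin/bash

# latest_logins DAY FILE  (hypothetical helper, not part of the original script)
# Prints "<ISO timestamp> <user>" for each user's latest login on DAY,
# newest first. Assumes the line layout of the sample log and a +0000 offset.
latest_logins() {
    local day=$1 file=$2
    awk -v day="$day" '
        $1 == day && /logged in$/ {
            user = substr($4, 2, length($4) - 3)   # "[alice]:" -> "alice"
            clock = $2
            sub(/[+].*$/, "", clock)               # drop the "+0000" offset
            stamp = $1 "T" clock "Z"
            # Timestamps of a single day compare correctly as strings,
            # so the input does not need to be sorted beforehand.
            if (stamp > last[user]) last[user] = stamp
        }
        END { for (user in last) print last[user], user }
    ' "$file" | sort -r
}
```

With this in place, the while loop from the script could read from < <(latest_logins "$TODAY" file.log) and stay otherwise unchanged. The trailing sort -r is only there to reproduce the newest-first order of the original output; it can be dropped if the order of the API calls does not matter.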