Conversation
When we need to reduce the used amount of disk space, we should drop some tables or partitions. In this case, we need to find out tables or partitions which are not read from. This extension logs when and which user initializes Seq Scan on which partition or table. The log file name is pg_log/access.log. Add a new hook - init_scan_hook which is called from InitScanRelation.
| gettimeofday(&tv, NULL); | ||
| pg_strftime(buf, sizeof(buf), | ||
| "%Y-%m-%d %H:%M:%S %Z,", | ||
| pg_localtime((pg_time_t*)&tv.tv_sec, log_timezone)); |
There was a problem hiding this comment.
That's a tricky one, I checked that tv.tv_sec might be long int and pg_time_t is always int64. In the line (pg_time_t*)&tv.tv_sec we might interpret 32-bit variable as a 64-bit one, isn't it an UB?
Check the elog.c:setup_formatted_log_time(void) out, they do something similar.
| * the end of the file before writing. The adjustment of the file offset and | ||
| * the write operation are performed as an atomic step." | ||
| */ | ||
| int f = open(LOG_FILE_NAME, O_WRONLY | O_APPEND | O_CREAT, S_IRUSR | S_IWUSR); |
There was a problem hiding this comment.
A bit general proposition, in PG code the special functions are used that replace sys calls related to files. (Check out OpenTransientFile() or CloseTransientFile())
These special funcs do something tricky with FDs, I don't quite get it for now, but I think we should use it for the safety purposes.
| if (f < 0 ) | ||
| { | ||
| ereport(WARNING, (errcode_for_file_access(), | ||
| errmsg("could not open file " LOG_FILE_NAME))); |
There was a problem hiding this comment.
I don't think the file name is actually logged :)
| static void | ||
| access_log_init_scan_hook(Relation currentRelation) | ||
| { | ||
| char buf[512]; |
There was a problem hiding this comment.
Do you think we can use StringInfoData instead of char buf[512]?
The line buf[19 + 1 + 6] = ' '; or sprintf(buf + 19, ".%06d", (int) (tv.tv_usec)); looks quite unhealthy for me.
|
|
||
| PG_MODULE_MAGIC; | ||
|
|
||
| #define LOG_FILE_NAME "pg_log/access.log" |
There was a problem hiding this comment.
Are you sure about current dir for the process being datadir? It seems for me that we can be hacked with this, can we use snprintf(path, MAXPGPATH, "%s/pg_log/access.log", DataDir);?
| static init_scan_hook_type next_init_scan_hook = NULL; | ||
|
|
||
| static void | ||
| write_to_log(const char* str) |
There was a problem hiding this comment.
We call write_to_log() func for every scan operation. I think we call this function once per partition scan too.
It does not seem right for me to call open/write/close once per write_to_log() invocation, what If a table has a trillion of partitions.
I would open the file lazily and keep a track of its openness, or we can do it in PG_init.
When we need to reduce the used amount of disk space, we should drop some tables
or partitions. In this case, we need to find out tables or partitions which are
not read from. This extension logs when and which user initializes Seq Scan on
which partition or table. The log file name is pg_log/access.log.
Add a new hook - init_scan_hook which is called from InitScanRelation.