
How can I list the top IO consuming files?

My document management software is doing a lot of IO and I would like to know which files it is accessing the most. Is there a Linux tool that would give me the list of the top IO consuming files, like `iotop` but for files, every few seconds? That could look like:

    $ thetool
    THRPUT   R/W/SWP  FILE
    40MB/s   write    /usr/alfresco/repo/1283421/1324928.doc
    12MB/s   read     /usr/alfresco/cache/3928dh29f8if
    11MB/s   read     /tmp/239398hf2f024f472.tmp

I looked in the man pages of `iotop`, `lsof` and `strace`, and they do not seem to offer such a feature.

I think your "number of bytes" metric is the wrong one. Consider two accesses. One reads 10MB from a file. The other reads every 512th byte of the file for the first 10MB. The "number of bytes" will be 512 times higher for the first access than for the second. Yet the disk transfers whole sectors, so every one-byte read still pulls in a full 512-byte sector, and both accesses put essentially the same load on the I/O subsystem.
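To make the arithmetic concrete, here is a rough Python sketch of the two access patterns. It assumes 512-byte sectors, treats "10MB" as 10 MiB, and ignores the page cache and readahead. The names `SECTOR`, `SPAN` and `sectors_touched` are made up for this illustration.

```python
SECTOR = 512                # assumed sector size in bytes
SPAN = 10 * 1024 * 1024     # assumed meaning of "the first 10MB" (10 MiB)


def sectors_touched(reads):
    """Count the distinct sectors covered by a list of (offset, length) reads."""
    touched = set()
    for offset, length in reads:
        first = offset // SECTOR
        last = (offset + length - 1) // SECTOR
        touched.update(range(first, last + 1))
    return len(touched)


# Access 1: one sequential read of the whole span.
sequential = [(0, SPAN)]
# Access 2: a single byte at the start of every 512-byte sector.
strided = [(offset, 1) for offset in range(0, SPAN, SECTOR)]

bytes_sequential = sum(length for _, length in sequential)
bytes_strided = sum(length for _, length in strided)

print("bytes requested:", bytes_sequential, "vs", bytes_strided)
print("sectors touched:", sectors_touched(sequential), "vs", sectors_touched(strided))
```

Under these assumptions the bytes requested differ by a factor of 512, while the number of sectors touched is identical.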

If you can accept "number of operations", which is just about as good or as bad as "number of bytes", then you have something you can actually measure. The `inotifywatch` program does this, and it's likely part of your distribution's `inotify-tools` package.

It will immediately tell you which files account for the bulk of the accesses, and it will likely allow you to solve your actual problem.
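If you want to repeat the measurement every few seconds, a small wrapper around `inotifywatch` could look like the sketch below. It relies only on the `-r` (recursive), `-t` (seconds to collect) and `-e` (event) options shown in the inotifywatch man page. The function names and the `/usr/alfresco` path are placeholders taken from the question, not anything the tool defines. One caveat, as far as I know: with `-r` the summary is broken down per watched directory, so pass individual files if you need per-file counts.

```python
import shutil
import subprocess


def inotifywatch_cmd(root, seconds=10, events=("access", "modify")):
    """Build an inotifywatch command line that tallies events under root."""
    cmd = ["inotifywatch", "-r", "-t", str(seconds)]
    for event in events:
        # -e may be given more than once, one event name each time.
        cmd += ["-e", event]
    cmd.append(root)
    return cmd


def sample(root, seconds=10):
    """Run one sampling window and return inotifywatch's summary as text."""
    if shutil.which("inotifywatch") is None:
        raise RuntimeError("inotifywatch not found; install inotify-tools")
    result = subprocess.run(
        inotifywatch_cmd(root, seconds),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical usage: print a fresh summary for each 10-second window.
#   while True:
#       print(sample("/usr/alfresco", seconds=10))
```

The summary lists a total and a per-event count for each watched path, so the busiest paths can be read straight off each window's output.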
