Recursive statistics on file types in directory?

I did a website scrape for a conversion project. I'd like to gather some statistics on the types of files in there, for instance 400 `.html` files, 100 `.gif` files, etc. What's an easy way to do this? It has to be recursive.

Edit: With the script that maxschelpzig posted, I'm having some problems due to the architecture of the site I've scraped. Some of the files have names like `.php?blah=blah&foo=bar` with various arguments, so it counts each of them as a unique type. The solution needs to treat `.php*` as a single type.

You could use `find` and `uniq` for this, e.g.:

    $ find . -type f | sed 's/.*\.//' | sort | uniq -c
         16 avi
         29 jpg
        136 mp3
          3 mp4

Command explanation

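* `find . -type f` recursively prints the path of every regular file
* `sed 's/.*\.//'` deletes everything up to and including the last dot, leaving only the file extension
* `sort` groups identical extensions together, which `uniq` needs because it only collapses adjacent duplicate lines
* `uniq -c` prefixes each distinct extension with the number of times it occurs (like a histogram).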
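
For the `.php?blah=blah&foo=bar` case from the edit, one possible variation is to cut off everything from the first `?` before extracting the extension. The sketch below is only an illustration under a few assumptions: the function name `count_extensions` is made up, files without any dot are lumped together as `(none)`, and filenames are assumed not to contain newlines.

```shell
# Hypothetical helper: count file types below a directory (default: .),
# treating "index.php?x=1" the same as "index.php".
count_extensions() {
    find "${1:-.}" -type f |
        sed -e 's|.*/||' \
            -e 's/?.*$//' \
            -e '/\./!s/.*/(none)/' \
            -e 's/.*\.//' |
        sort | uniq -c | sort -rn
}
# sed steps, in order:
#   1. drop the directory part, so dots in directory names are ignored
#   2. drop a query string, i.e. everything from the first "?"
#   3. label names that contain no dot as "(none)"
#   4. strip everything up to the last dot, leaving the extension
```

Calling it as `count_extensions ./scrape` (where `./scrape` stands in for wherever the site was saved) should print one line per extension, most frequent first. Stripping the directory part first also avoids miscounting files that have no extension but live in a directory whose name contains a dot.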