Would the performance of this `grep` or `zgrep` command benefit from more memory, or from a faster CPU?
Sophia Terry
I have the following commands:
time grep -F -f 'in2.txt' test.fastq
time zgrep -F -f 'in2.txt' test.fastq.gz
There are about 30 search terms, and the files are ~5 GB. However, I notice that on one computer, an Amazon spin-up, the search takes 3-5x as long to finish. What is limiting the speed? Should I spin up an ECS that has more memory or a faster CPU?
1 Answer
CPU and I/O. With a small set of search terms (30 is quite small), you are most likely I/O bound, and conceivably CPU bound. You will not be memory bound.
[IMHO]
The right answer, of course, is to test it. There are a few ways to do this, such as opening two terminals and running `dstat` in one while the command in question runs in the other. As long as the command runs for at least a couple of seconds, you should get an idea of which resources are maxed out (at 100% or at some steady-state value) and which are not.
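As a rough complement to watching `dstat`, you can also compare the CPU time a run used against its wall-clock time. The sketch below is only an illustration: `classify_bound` is a hypothetical helper name, the 0.8 threshold is an arbitrary assumption, and the sample numbers in the usage comment are made up. It takes the `real`, `user`, and `sys` seconds that `time -p` prints and reports the ratio (user + sys) / real.

```shell
# classify_bound REAL USER SYS
# Hypothetical helper: prints the CPU-to-wall-clock ratio and a rough verdict.
# The 0.8 cut-off is an assumed rule of thumb, not a standard value.
classify_bound() {
    LC_ALL=C awk -v real="$1" -v user="$2" -v sys="$3" 'BEGIN {
        if (real <= 0) { print "no measurable run time"; exit 1 }
        ratio = (user + sys) / real
        verdict = (ratio >= 0.8) ? "likely CPU bound" : "likely I/O bound"
        printf "%.2f %s\n", ratio, verdict
    }'
}

# Usage, with made-up timings for illustration:
#   time -p grep -F -f 'in2.txt' test.fastq > /dev/null
#   classify_bound 12.40 3.10 1.25
```

A ratio near or above 1 suggests the CPU was busy for most of the run, while a low ratio suggests the process spent most of its time waiting, usually on disk or network. For the `zgrep` case, the decompression runs as a separate process, so the combined CPU time can exceed the wall-clock time. Run each command twice when you measure, since a second run may be served from the page cache and look much less I/O bound than the first.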