









[NMLUG] Python tight loop causing massive CPU barfage
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Paul Tietjens wrote:
| Dan Parrish wrote:
|
|> -----BEGIN PGP SIGNED MESSAGE-----
|> Hash: SHA1
|>
|> Paul Tietjens wrote:
|> | I have a Python script that essentially opens a large number of files
|> | (between 70,000 and 230,000 or so), reads the first 1024 bytes of each,
|> | and looks for a string match.
|> |
|> | The goal is to search an entire partition full of Maildirs for specific
|> | emails.
|> |
|> | I want the process to happen as fast as possible. So far, it takes
|> | around 21 minutes - but there's a snag. While this script is running,
|> | every other process on the machine becomes sluggish to the point of
|> | nonresponsiveness.
|> |
|> | No amount of playing with nice and priority levels seems to help.
|> |
|> | What has helped, is a small sleep() in the loop - but that raises the
|> | amount of time taken to complete the tasks fairly rapidly (from 21
|> | minutes to over an hour).
|> |
|> | In the end, I set up a goofy sort of throttling that alters the amount
|> | of time sleep()ing by the average load.
|> |
|> | Is there a better way to do this? I'm not much of a coder, and I know
|> | there are a couple on this list - so any tips offered, no matter how
|> | nebulous, would be great.
|> |
|> | Thanks in advance!
|> | _______________________________________________
|> | NMLUG mailing list
|> | NMLUG@nmlug.org
|> | http://www.nmlug.org/mailman/listinfo/nmlug
|>
|> This sounds like a pretty disk-intensive action. You'll wanna check
|> your DMA settings first, and see what you can do to tweak the HD
|> performance, particularly for reads.
|>
|> This won't be the whole enchilada, but it might help.
|>
|> - -Dan
|
|
|
| Thanks Dan! The hardware in question is a SCSI RAID with some rather
| overblown stats, so I hadn't thought about the read speeds, but that
| might actually be worth looking into. I think it's running straight
| mirroring now, but there might be some tweaking I can do somewhere.
OK, yeah... never mind about the DMA then. I had another idea, though.
Rather than running through the entire directory tree again and again,
why not just scan each email as it enters the system? That would
spread the load out over time rather than crunching thousands of
files at once. Just another thought. ;-)
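
Something like the following is roughly what I have in mind. It is only a sketch, not your script: the function name scan_new_messages and the "seen" set are made up for illustration, and it assumes the usual Maildir layout (new/ and cur/ folders, with flags appended after a colon in the filename). Each call only opens messages it has not looked at before, and it still reads just the first 1024 bytes.

```python
import os

HEAD_BYTES = 1024  # same prefix size the original script reads


def scan_new_messages(maildir_root, needle, seen, head_bytes=HEAD_BYTES):
    """Return paths of not-yet-seen messages whose first bytes contain needle.

    `needle` must be bytes. `seen` is a set that is updated in place, so
    repeated calls only open files that arrived since the previous call.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(maildir_root):
        # Only look inside Maildir message folders; skip tmp/ and the rest.
        if os.path.basename(dirpath) not in ("new", "cur"):
            continue
        for name in filenames:
            # Drop the ":2,FLAGS" suffix so a message that moves from new/
            # to cur/ keeps the same key and is not scanned a second time.
            key = (os.path.dirname(dirpath), name.split(":", 1)[0])
            if key in seen:
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    head = fh.read(head_bytes)
            except OSError:
                # Moved or deleted between listing and opening; retry later.
                continue
            seen.add(key)
            if needle in head:
                matches.append(path)
    return matches
```

The first call is still a full pass over the partition, but if you keep the "seen" set around (or save it to disk between runs, which this sketch does not do) and call the function every few minutes, each later pass only has to open whatever mail arrived in between.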
- -Dan
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFCCb1RnURHNoE9YE4RAoAHAJ9aL+YCU2/m//YtqnAollT0K8jKDQCgkSic
JzoaFuds1tAtGUbrYTqhXp0=
=QbQZ
-----END PGP SIGNATURE-----