









[NMLUG] Python tight loop causing massive CPU barfage
Dan Parrish wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Paul Tietjens wrote:
> | I have a python script that essentially opens a few thousand (between
> | 70,000 and 230,000 or so) files, reads the first 1024 bytes and looks
> | for a string match.
> |
> | The goal is to search an entire partition full of Maildirs for specific
> | emails.
> |
> | I want the process to happen as fast as possible. So far, it takes
> | around 21 minutes - but there's a snag. While this script is running,
> | every other process on the machine becomes sluggish to the point of
> | nonresponsiveness.
> |
> | No amount of playing with nice and priority levels seems to help.
> |
> | What has helped, is a small sleep() in the loop - but that raises the
> | amount of time taken to complete the tasks fairly rapidly (from 21
> | minutes to over an hour).
> |
> | In the end, I set up a goofy sort of throttling that alters the amount
> | of time sleep()ing by the average load.
> |
> | Is there a better way to do this? I'm not much of a coder, and I know
> | there are a couple on this list - so any tips offered, no matter how
> | nebulous, would be great.
> |
> | Thanks in advance!
> | _______________________________________________
> | NMLUG mailing list
> | NMLUG@nmlug.org
> | http://www.nmlug.org/mailman/listinfo/nmlug
>
> This sounds like a pretty disk-intensive action. You'll wanna check
> your DMA settings first, and see what you can do to tweak the HD
> performance...Particularly in read actions.
>
> This won't be the whole enchilada, but it might help.
>
> - -Dan
Thanks, Dan! The hardware in question is a SCSI RAID with some rather
overblown stats, so I hadn't thought about looking into the read speeds,
but that might actually be worth checking. I think it's running straight
mirroring right now, but there might be some tweaking I can do
somewhere.
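
For reference, here is a minimal sketch of the kind of loop described in
the original question: walk a directory tree, read the first 1024 bytes
of each file, look for a byte string, and sleep only while the 1-minute
load average is above a threshold. It is not the actual script from this
thread. The names find_matches, MAX_LOAD and PAUSE, and their values,
are assumptions made up for illustration, and os.getloadavg() is only
available on Unix-like systems.

```python
import os
import time

HEADER_BYTES = 1024  # only the first 1024 bytes of each file are read
MAX_LOAD = 4.0       # hypothetical 1-minute load average threshold
PAUSE = 0.05         # hypothetical back-off, in seconds, while above MAX_LOAD


def find_matches(root, needle, max_load=MAX_LOAD, pause=PAUSE):
    """Yield paths under root whose first HEADER_BYTES contain needle (bytes)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    head = f.read(HEADER_BYTES)
            except OSError:
                # The file was moved, deleted, or is unreadable; skip it.
                continue
            if needle in head:
                yield path
            # Sleep only when the machine is already busy, rather than
            # paying a fixed sleep() on every iteration.
            if os.getloadavg()[0] > max_load:
                time.sleep(pause)
```

Usage would look something like this, with a made-up Maildir path:

    for path in find_matches("/srv/mail", b"Subject: quarterly report"):
        print(path)

A match that starts inside the first 1024 bytes but runs past them is
missed, which is the same limitation as reading only the first 1024
bytes in the first place. Since the load is almost certainly disk I/O
rather than CPU, this kind of throttling only trades run time for
responsiveness; it does not make the reads themselves any cheaper.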