clone() / fork() / process creation is slow on some machines

 2 years ago
source link: https://www.codesd.com/item/clone-fourk-process-creation-is-slow-on-some-machines.html


Creating new processes is very slow on some of my machines, and not others.

The machines are all similar, and some of the slow machines are running the exact same workloads on the same hardware and kernel (2.6.32-26, Ubuntu 10.04) as some of the fast machines. Tasks that do not involve process creation are the same speeds on all machines.

For example, this program executes ~50 times slower on the affected machines:

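#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int i;
    for (i = 0; i < 10000; i++)
    {
        pid_t p = fork();
        if (p < 0) return 1;      /* fork failed */
        if (p == 0) exit(0);      /* child exits immediately */
        waitpid(p, NULL, 0);      /* parent reaps the child */
    }
    return 0;
}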

What could be causing task creation to be much slower, and what other differences could I look for in the machines?

Edit1: Running bash scripts (as they spawn a lot of subprocesses) is also very slow on these machines, and strace on the slow scripts shows the slowdown in the clone() kernel call.

Edit2: vmstat doesn't show any significant differences on the fast vs slow machines. They all have more than enough RAM for their workloads and don't go to swap.

Edit3: I don't see anything suspicious in dmesg.

Edit4: I'm not sure why this is on Stack Overflow now. I'm not asking about the example program above (I'm just using it to demonstrate the problem) but about Linux administration/tuning. If people think it belongs here, cool.


We experienced the same issue with our application stack: massive degradation in application performance, and longer clone times visible in strace. Using your test program across 18 nodes, I reproduced your results on the same 3 nodes where we had been seeing slow clone times. All nodes were provisioned the same way, but the hardware differed slightly. We checked the BIOS, vmstat, and vm.overcommit_memory, and replaced the RAM, with no improvement. We then moved our drives to updated hardware and the issue was resolved.

CentOS 5.9 2.6.18-348.1.1.el5 #1 SMP Tue Jan 22 16:19:19 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

"bad" and "good" lspci:

$ diff ../bad_lspci_sort ../good_lspci_sort
< Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 05)
> Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

< Host bridge: Intel Corporation Xeon E3-1200 Processor Family DRAM Controller (rev 09)
> Host bridge: Intel Corporation Xeon E3-1200 v2/Ivy Bridge DRAM Controller (rev 09)

< ISA bridge: Intel Corporation C204 Chipset Family LPC Controller (rev 05)
> ISA bridge: Intel Corporation C202 Chipset Family LPC Controller (rev 05)

< PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 (rev b5)
> PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 (rev b5)

< VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) (rev 04)
> VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)



