Thursday, January 1, 2015

stdout redirect fails with "sh: Resource temporarily unavailable"


I have large batches of bash processes. Each bash script invokes executables whose stdout is redirected to a distinct log file. About 5% of the runs end with:

sh: [name of log]: Resource temporarily unavailable

I tried reducing the number of jobs running in parallel, but the error still occurs in some of the bash scripts.
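To make the pattern concrete, here is a minimal sketch of the kind of job script described above. The names `run_step` and `LOG_DIR`, the `./logs` default, and the `echo` commands are hypothetical stand-ins for the real executables and paths, not the actual scripts.

```shell
#!/bin/bash
# Hypothetical job script: all names and paths below are placeholders.
log_dir="${LOG_DIR:-./logs}"
mkdir -p "$log_dir"

run_step() {
    # $1 = step name (used for the log file name); remaining args = command to run
    local name="$1"; shift
    local log="$log_dir/$name.log"

    # stdout goes to a log file unique to this step. Opening this file is
    # the point at which a failing run reports the error.
    "$@" > "$log"
    local status=$?

    if [ "$status" -ne 0 ]; then
        echo "step '$name' exited with status $status (log: $log)" >&2
    fi
    return "$status"
}

run_step step1 echo "output of first executable"
run_step step2 echo "output of second executable"
```

Each step writes to its own file, so no two processes in a batch are expected to open the same log.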


Additional info:



  • Ubuntu 14.04 LTS running in a VM on ESXi

  • Happens on a new partition, allocated with gparted and LVM (new logical volume consisting of the entire partition)

  • The LV is exported using nfs-kernel-server

  • The LV is also shared with Windows clients using Samba

  • The LV is formatted using ext4

  • I have admin rights on this machine


More detailed info



  • Everything runs in a cluster managed by Sun Grid Engine (SGE)

  • There are 4 virtual machines: m1, m2, m3, m4

  • m1 runs sge master, sge exec, and ldap server

  • m2, m3, m4 run sge exec

  • m3 runs nfs-kernel-server and exports a home folder to m1, m2, m4; the folder sits in a logical volume (LVM) backed by a partition on a local disk

  • m3 has a soft link to the home folder

  • m1, m2, m4 mount the home folder through fstab, so all machines end up pointing to the same home folder

  • m3, m2, m4 run ldap clients, connecting to m1

  • All jobs are submitted to the cluster through m1 (configured as a submission host)

  • Jobs fail only on m3 (the machine that exports the disk), although most jobs on m3 pass. The failures are random but occur on m3 alone.

  • m3 also shares the home folder via Samba with Windows clients


Any help would be greatly appreciated :) (how to debug, which logs are relevant, how to get more information out of the system, etc.)


Thank you in advance!


