[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
Marco Lehnort created JENKINS-13735:
---------------------------------------

             Summary: Jenkins starts wrong slave for job restricted to specific one
                 Key: JENKINS-13735
                 URL: https://issues.jenkins-ci.org/browse/JENKINS-13735
             Project: Jenkins
          Issue Type: Bug
          Components: slave-setup, vsphere-cloud
    Affects Versions: current
         Environment: Jenkins 1.463 under Tomcat6 on Linux (SLES 11), Windows XP slave VMs controlled via vSphere Cloud plugin
            Reporter: Marco Lehnort
            Assignee: Kohsuke Kawaguchi


I'm using the following setup:
- WinXP slaves A,B,C
- jobs jA, jB, jC, tied to slaves A,B,C respectively using "Restrict where this job can run"

Assume all slaves are disconnected and powered off, and no builds are queued.
When I start a build manually, say jC, the following happens:
- job jC is scheduled and shown in the build queue
- the tooltip says it is waiting because slave C is offline
- next, Jenkins powers on slave A and a connection is established
- jC is not started; Jenkins seems to honor the restriction correctly
- after some idle time, Jenkins notices that slave A is idle and shuts it down
- then the same procedure happens with slave B
- occasionally the next one is slave A again
- finally (by luck?) slave C happens to be started
- jC is executed

jC can wait for hours (indefinitely?) because the required slave is not
powered on. I also observed this behaviour with a time trigger instead of a
manual trigger, so I assume it is independent of the trigger type.
Occasionally the correct slave is powered up right away, but that seems to
happen by chance. The exact pattern is not obvious to me.

Note that the component selection above is just my best guess.

Cheers, Marco


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jenkins-ci.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
JIRA noreply@jenkins-ci.org

     [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on JENKINS-13735 started by Kohsuke Kawaguchi.

> Jenkins starts wrong slave for job restricted to specific one
> -------------------------------------------------------------
>
>                 Key: JENKINS-13735
>                 URL: https://issues.jenkins-ci.org/browse/JENKINS-13735
>             Project: Jenkins
>          Issue Type: Bug
>          Components: slave-setup, vsphere-cloud
>    Affects Versions: current
>         Environment: Jenkins 1.463 under Tomcat6 on Linux (SLES 11), Windows XP slave VMs controlled via vSphere Cloud plugin
>            Reporter: Marco Lehnort
>            Assignee: Kohsuke Kawaguchi
>              Labels: slave

JIRA noreply@jenkins-ci.org

     [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on JENKINS-13735 stopped by Marco Lehnort.

JIRA noreply@jenkins-ci.org

     [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marco Lehnort updated JENKINS-13735:
------------------------------------

    Component/s: master-slave
   

JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=162845#comment-162845 ]

Jason Swager commented on JENKINS-13735:
----------------------------------------

I've been encountering the same problem.  I thought it was in the code of the vSphere Plugin, but it turns out that it's not.  As far as I can see, Jenkins is issuing a connect() call on slaves that have no reason to start up given the queued jobs.

Part of the problem IS the vSphere Plugin itself.  Originally, when a job was fired up, any slave that was down and could run the job would be started by the vSphere Plugin, because the connect() method would get called on all those slaves; this resulted in a large number of VMs being powered on for a single job.  I added code to the plugin to throttle that behavior.  Unfortunately, the throttling is making this problem worse.  Whereas originally slaves A, B, and C might all have been started up, slave C now only MIGHT get started, because the vSphere plugin throttles the VM startups.

Initial investigation seems to indicate that the Slave.canTake() function might not be working as expected. If I find anything further during my investigation, I'll post here.
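
To make the throttling effect described in this comment concrete, here is a minimal Java sketch. It is not the vSphere plugin's code: the class `ThrottleSketch`, the method `poweredOn`, the fixed request order A, B, C, and the limit of one VM start per pass are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: connect() is requested for every offline slave,
// and a throttle caps how many VMs may actually be powered on per pass.
class ThrottleSketch {

    /** Returns the slaves that are powered on, in request order, up to maxStarts. */
    static List<String> poweredOn(List<String> connectRequests, int maxStarts) {
        List<String> started = new ArrayList<>();
        for (String slave : connectRequests) {
            if (started.size() >= maxStarts) {
                break; // throttle reached: remaining requests are ignored in this pass
            }
            started.add(slave);
        }
        return started;
    }

    public static void main(String[] args) {
        // Job jC is queued, but connect() is requested for A, B and C alike.
        List<String> requests = Arrays.asList("A", "B", "C");

        // Without throttling every VM starts, so C is among them.
        System.out.println(poweredOn(requests, Integer.MAX_VALUE)); // [A, B, C]

        // With an assumed limit of one start, only A comes up and jC keeps waiting.
        System.out.println(poweredOn(requests, 1)); // [A]
    }
}
```

Under these assumptions the throttle does not cause the wrong connect() requests; it only turns "too many VMs started" into "possibly the wrong VM started".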
               

JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=162892#comment-162892 ]

jsiirola commented on JENKINS-13735:
------------------------------------

I am also seeing this problem after upgrading from 1.459 -> 1.464 (running Winstone under Linux).  I do not have the vSphere plugin installed.  In my case, the problem is being exacerbated by one of the build slaves being down for maintenance.  This has led to jobs stacking up in the queue, which in turn has led to Jenkins starting every slave in the farm.
               

JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163033#comment-163033 ]

Jason Swager commented on JENKINS-13735:
----------------------------------------

I believe that I have a fix for this, but being new to git and even newer to Jenkins core programming, I'll just submit the patch (hopefully I did that right) as part of this comment.  The patch should address a flaw in the code logic where a slave that cannot handle a build request is started anyway.  The very minor change adds one additional check to make sure that the slave CAN handle the request before flagging it as startable.


 core/src/main/java/hudson/slaves/RetentionStrategy.java |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/java/hudson/slaves/RetentionStrategy.java b/core/src/main/java/hudson/slaves/RetentionStrategy.java
index 02611e5..f007ac6 100644
--- a/core/src/main/java/hudson/slaves/RetentionStrategy.java
+++ b/core/src/main/java/hudson/slaves/RetentionStrategy.java
@@ -218,7 +218,7 @@ public abstract class RetentionStrategy<T extends Computer> extends AbstractDesc
                         }
                     }
 
-                    if (needExecutor) {
+                    if (needExecutor && (c.getNode().canTake(item) == null)) {
                         demandMilliseconds = System.currentTimeMillis() - item.buildableStartMilliseconds;
                         needComputer = demandMilliseconds > inDemandDelay * 1000 * 60 /*MINS->MILLIS*/;
                         break;
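
The effect of the one-line change can be sketched with a small stand-alone Java model. This is a simplification under stated assumptions, not Jenkins core: `DemandCheckSketch`, its nested `Node`, and the string-based `canTake` are hypothetical stand-ins. The only thing taken from the patch is the convention that `canTake(...)` returning null means the node can take the item.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the Jenkins types touched by the patch.
class DemandCheckSketch {

    /** A node that accepts only jobs restricted to its own name. */
    static final class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }

        /** Returns null if the job may run here, otherwise a reason for the blockage. */
        String canTake(String requiredNode) {
            return name.equals(requiredNode) ? null : "job is restricted to " + requiredNode;
        }
    }

    /** Unpatched logic: the restriction is ignored, so every offline node is flagged. */
    static List<String> flaggedBeforePatch(List<Node> offline, String requiredNode) {
        List<String> flagged = new ArrayList<>();
        boolean needExecutor = true; // assumption: one buildable item is waiting
        for (Node node : offline) {
            if (needExecutor) {
                flagged.add(node.name);
            }
        }
        return flagged;
    }

    /** Patched logic: a node is flagged only if it can actually take the item. */
    static List<String> flaggedAfterPatch(List<Node> offline, String requiredNode) {
        List<String> flagged = new ArrayList<>();
        boolean needExecutor = true; // assumption: one buildable item is waiting
        for (Node node : offline) {
            if (needExecutor && node.canTake(requiredNode) == null) {
                flagged.add(node.name);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<Node> offline = Arrays.asList(new Node("A"), new Node("B"), new Node("C"));
        System.out.println(flaggedBeforePatch(offline, "C")); // [A, B, C]
        System.out.println(flaggedAfterPatch(offline, "C"));  // [C]
    }
}
```

In this model, the queued job jC makes all three slaves startable before the change and only slave C afterwards, which matches the behaviour reported in the issue.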

               

JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163033#comment-163033 ]

Jason Swager edited comment on JENKINS-13735 at 5/21/12 4:07 PM:
-----------------------------------------------------------------

I believe that I have a fix for this, but being new to git and even newer to Jenkins core programming, I'll just submit the patch (hopefully I did that right) as part of this comment.  The patch should address a flaw in the code logic where a slave that cannot handle a build request is started anyway.  The very minor change adds one additional check to make sure that the slave CAN handle the request before flagging it as startable.

{noformat}
 core/src/main/java/hudson/slaves/RetentionStrategy.java |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/java/hudson/slaves/RetentionStrategy.java b/core/src/main/java/hudson/slaves/RetentionStrategy.java
index 02611e5..f007ac6 100644
--- a/core/src/main/java/hudson/slaves/RetentionStrategy.java
+++ b/core/src/main/java/hudson/slaves/RetentionStrategy.java
@@ -218,7 +218,7 @@ public abstract class RetentionStrategy<T extends Computer> extends AbstractDesc
                         }
                     }
 
-                    if (needExecutor) {
+                    if (needExecutor && (c.getNode().canTake(item) == null)) {
                         demandMilliseconds = System.currentTimeMillis() - item.buildableStartMilliseconds;
                         needComputer = demandMilliseconds > inDemandDelay * 1000 * 60 /*MINS->MILLIS*/;
                         break;
{noformat}
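
The delay comparison inside the patched block is left unchanged by the patch; it can be illustrated in isolation. The following Java sketch assumes, as the `/*MINS->MILLIS*/` comment suggests, that `inDemandDelay` is expressed in minutes. The class `DemandDelaySketch` and the method `delayElapsed` are hypothetical names, and `TimeUnit` replaces the manual `* 1000 * 60` conversion purely for readability.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper mirroring the "needComputer" comparison from the diff.
class DemandDelaySketch {

    /**
     * Returns true once the item has been buildable for strictly longer
     * than the configured in-demand delay (given in minutes).
     */
    static boolean delayElapsed(long nowMillis, long buildableStartMillis, long inDemandDelayMinutes) {
        long demandMillis = nowMillis - buildableStartMillis;
        return demandMillis > TimeUnit.MINUTES.toMillis(inDemandDelayMinutes);
    }

    public static void main(String[] args) {
        // With a one-minute delay, exactly 60000 ms of waiting is not yet enough...
        System.out.println(delayElapsed(60_000L, 0L, 1L)); // false
        // ...but one millisecond more is.
        System.out.println(delayElapsed(60_001L, 0L, 1L)); // true
    }
}
```

With the patch applied, this comparison is reached only for a slave whose `canTake(item)` returned null, so the waiting time of a job no longer counts towards starting slaves that could not run it.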
               
JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163103#comment-163103 ]

Marco Lehnort commented on JENKINS-13735:
-----------------------------------------

I forked the repository, applied the change, and opened a pull request:
https://github.com/jenkinsci/jenkins/pull/481.
               

JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163111#comment-163111 ]

Jason Swager commented on JENKINS-13735:
----------------------------------------

Thank you!  I've really got to learn how to do this myself...
               

> Jenkins starts wrong slave for job restricted to specific one
> -------------------------------------------------------------
>
>                 Key: JENKINS-13735
>                 URL: https://issues.jenkins-ci.org/browse/JENKINS-13735
>             Project: Jenkins
>          Issue Type: Bug
>          Components: master-slave, slave-setup, vsphere-cloud
>    Affects Versions: current
>         Environment: Jenkins 1.463 under Tomcat6 on Linux (SLES 11), Windows XP slave VMs controlled via vSphere Cloud plugin
>            Reporter: Marco Lehnort
>            Assignee: Kohsuke Kawaguchi
>              Labels: slave
>
> I'm using the following setup:
> - WinXP slaves A,B,C
> - jobs jA, jB, jC, tied to slaves A,B,C respectively using "Restrict where this job can run"
> Assume all slaves are disconnected and powered off, no builds are queued.
> When starting a build manually, say jC, the following will happen:
> - job jC will be scheduled and also displayed accordingly in the build queue
> - tooltip will say it's waiting because slave C is offline
> - next, slave A is powered on by Jenkins and connection is established
> - jC will not be started, Jenkins seems to honor the restriction correctly
> - after some idle time, Jenkins realizes the slave is idle and causes shut down
> - then, same procedure happens with slave B
> - on occasion, next one is slave A again
> - finally (on good luck?) slave C happens to be started
> - jC is executed
> It is possible that jC is waiting for hours (indefinitely?), because the required
> slave is not powered on. I also observed this behaviour using a time-trigger
> instead of manual trigger, so I assume it is independent of the type of trigger.
> Occasionally it also happens that the correct slave is powered up right away,
> but that seems to happen by chance. The concrete pattern is not obvious to me.
> Note that the component selection above is just my best guess.
> Cheers, Marco


[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
In reply to this post by JIRA noreply@jenkins-ci.org

     [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SCM/JIRA link daemon resolved JENKINS-13735.
--------------------------------------------

    Resolution: Fixed
   



[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
In reply to this post by JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163275#comment-163275 ]

SCM/JIRA link daemon commented on JENKINS-13735:
------------------------------------------------

Code changed in jenkins
User: fma1977
Path:
 changelog.html
 core/src/main/java/hudson/slaves/RetentionStrategy.java
http://jenkins-ci.org/commit/jenkins/71ad43a141ddeb62d6df4a13b8513c42d73c0b82
Log:
  [FIXED JENKINS-13735]

Added test whether the currently checked slave computer actually can take the buildable item before flagging it as needed (avoids powering up and connecting to slaves for jobs they can't build).
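The idea in the log message above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the commit: `DemandCheckSketch`, `QueuedItem`, `OfflineNode`, and both `nodesToStart...` methods are hypothetical names, and the "naive" variant is a guess at a check that would produce the reported symptom (every offline slave looks needed as soon as anything is queued).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Jenkins classes.
final class DemandCheckSketch {

    // A queued build; requiredNode == null means it may run on any node.
    static final class QueuedItem {
        final String job;
        final String requiredNode;

        QueuedItem(String job, String requiredNode) {
            this.job = job;
            this.requiredNode = requiredNode;
        }
    }

    // An offline node that is powered on only when it is needed.
    static final class OfflineNode {
        final String name;

        OfflineNode(String name) {
            this.name = name;
        }

        // True if this node satisfies the item's restriction.
        boolean canTake(QueuedItem item) {
            return item.requiredNode == null || item.requiredNode.equals(name);
        }
    }

    // Assumed naive check: any queued item makes every offline node look needed.
    static List<String> nodesToStartNaive(List<OfflineNode> nodes, List<QueuedItem> queue) {
        List<String> result = new ArrayList<>();
        for (OfflineNode node : nodes) {
            if (!queue.isEmpty()) {
                result.add(node.name);
            }
        }
        return result;
    }

    // Check in the spirit of the log message: a node is needed only if it can take a queued item.
    static List<String> nodesToStartChecked(List<OfflineNode> nodes, List<QueuedItem> queue) {
        List<String> result = new ArrayList<>();
        for (OfflineNode node : nodes) {
            for (QueuedItem item : queue) {
                if (node.canTake(item)) {
                    result.add(node.name);
                    break; // one matching item is enough to flag the node
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<OfflineNode> nodes = Arrays.asList(
                new OfflineNode("A"), new OfflineNode("B"), new OfflineNode("C"));
        List<QueuedItem> queue = Arrays.asList(new QueuedItem("jC", "C"));

        System.out.println("naive:   " + nodesToStartNaive(nodes, queue));
        System.out.println("checked: " + nodesToStartChecked(nodes, queue));
    }
}
```

With the reporter's setup (slaves A, B, C offline and only jC queued, restricted to C), the naive variant flags all three slaves, while the checked variant flags only C.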





               



[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
In reply to this post by JIRA noreply@jenkins-ci.org

    [ https://issues.jenkins-ci.org/browse/JENKINS-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=163626#comment-163626 ]

dogfood commented on JENKINS-13735:
-----------------------------------

Integrated in jenkins_ui-changes_branch #30 (http://ci.jenkins-ci.org/job/jenkins_ui-changes_branch/30/)
     [FIXED JENKINS-13735] (Revision 71ad43a141ddeb62d6df4a13b8513c42d73c0b82)

     Result = SUCCESS
Kohsuke Kawaguchi: 71ad43a141ddeb62d6df4a13b8513c42d73c0b82 (https://github.com/jenkinsci/jenkins/commit/71ad43a141ddeb62d6df4a13b8513c42d73c0b82)
Files :
* core/src/main/java/hudson/slaves/RetentionStrategy.java
* changelog.html

               



[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
In reply to this post by JIRA noreply@jenkins-ci.org

I deployed Jenkins v1.470 today and tested the fix. Works like a charm!!!
No irrelevant slaves are powered up; only the slave required to execute a job is started.

Thanks to everyone for the fast responses to my problem!
@Jason: thanks for your analysis and fix!

Cheers, Marco.


[JIRA] (JENKINS-13735) Jenkins starts wrong slave for job restricted to specific one

JIRA noreply@jenkins-ci.org
In reply to this post by JIRA noreply@jenkins-ci.org
Marco Lehnort closed Bug JENKINS-13735 as Fixed

Closing as everything seems to work as expected.

Change By: Marco Lehnort (14/Jun/12 7:32 AM)
Status: Resolved -> Closed