(new here) applying Jenkins on our complex code base

Bram de Jong
Hello all,


I've just read the "official" jenkins book which gave me some ideas,
but I wanted to flesh out my ideas before I start configuring.

We have a relatively complex system which looks like this:

// old code bases, actively used elsewhere, but not actively changed
A <- svn repo, updated infrequently, updates take up to 30 minutes due
to bad layout
B <- svn repo, updated fairly frequently, uses things from A
// new code bases
C <- git repo, updated frequently, no dependencies
D <- svn repo, updated frequently, uses things from A, B and E
E <- git repo, updated frequently, uses things from A, B, C and D
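
The dependency list above can be written down as a small graph to see which code bases have to sit next to each other for a given build. This is a minimal sketch, assuming the `DEPENDS_ON` table is simply the list above transcribed; the function name is made up for illustration. Since D and E refer to each other, both end up needing all five.

```python
# Direct dependencies, transcribed from the list above.
DEPENDS_ON = {
    "A": set(),
    "B": {"A"},
    "C": set(),
    "D": {"A", "B", "E"},
    "E": {"A", "B", "C", "D"},
}


def required_checkouts(repo):
    """Return every code base that must be present to build apps in `repo`."""
    needed = set()
    todo = [repo]
    while todo:
        current = todo.pop()
        if current in needed:
            continue  # already visited; this also copes with the D <-> E cycle
        needed.add(current)
        todo.extend(DEPENDS_ON[current])
    return needed


if __name__ == "__main__":
    for name in sorted(DEPENDS_ON):
        print(name, "needs", sorted(required_checkouts(name)))
```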

Some facts about these repositories:

* For A, B, D and E, the same structure holds:
        A/Codebase <- shared code
        A/Applications/X <- an application
        A/Applications/Y <- an application
        A/Applications/Z <- an application
* All applications are C++.
* Most of the apps have configurations for building in MS Visual
Studio 2008 and/or 2010
* Many of the apps have configurations for building in XCode
* A few of the apps have configurations for building with linux makefiles.
* All the applications assume that all the A, B, C, D and E
repositories are in the same subfolder. For example, code in B
uses code from A by referring to it as
../../../........./C/Codebase/something.h
* We need at least nightly builds for all the applications (-> at
least 50 apps), CI would be even better

Now, this definitely does not translate easily to a simple one
project, one repo kind of thing, so while the Jenkins book was
interesting, it did leave me wanting a bit! :-)

My initial guess on how to solve this:
* put each of the repo's in a job that *only fetches* the repo into a
shared directory and doesn't do anything else.
* use the join plugin to create different pipelines (apps in A only
need A, apps in B need A+B, apps in C need A+B+D+C)
* use a mac with windows and linux VM, set up windows and linux as slaves
* some apps have unit tests using the splendid
http://code.google.com/p/googletest/ which I suppose is compatible
with Jenkins

Some questions:
* I would love some feedback on this whole configuration and my plan of attack
* Does it make sense to try to use parameterised builds for
building the apps on various compiler/platforms? I'm worrying that
each of the jobs that do this will need something special/unique for
each platform.
* Would it be best to put the master on osx/windows/linux? With the VM
system I propose it doesn't really matter who the master is..
* Do you have any hints of things I should read?
* All of the apps share/assume the same file structure. Will this get
me into as much trouble as I think it will - jenkins being built
around the idea of each app having a separate workspace?

A lot of questions, ...

 - Bram

--
Bram de Jong - CTO
SampleSumo BVBA, Wiedauwkaai 23 G, B-9000 Ghent, Belgium
Web: http://www.samplesumo.com
Twitter: http://twitter.com/SampleSumo
Facebook: http://facebook.com/SampleSumo
Phone: +32 9 3355925 - Mobile: +32 484 154730

Re: (new here) applying Jenkins on our complex code base

Les Mikesell
On Mon, Jun 4, 2012 at 6:23 AM, Bram de Jong <[hidden email]> wrote:
>>
> My initial guess on how to solve this:
> * put each of the repo's in a job that *only fetches* the repo into a
> shared directory and doesn't do anything else.

I'd think in terms of jobs that build components and applications, not
so much in relationships to repositories.

> * use the join plugin to create different pipelines (apps in A only
> need A, apps in B need A+B, apps in C need A+B+D+C)

Where the code is in subversion, you might use svn externals to pull
the components in instead of anything special in jenkins.

> * use a mac with windows and linux VM, set up windows and linux as slaves

The VM's act the same as real machines, so you can use whatever is
convenient with sufficient performance.  It is a good idea to give the
slaves labels that reflect the build capabilities  and restrict the
jobs to labels rather than specific nodes so it is easy to expand the
pool for more capacity.
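
To illustrate the label idea: a job names a capability, and any slave carrying that label may take it. This is a rough sketch in plain Python, not Jenkins code; the node names and labels are invented for the example.

```python
# Hypothetical slaves and the labels assigned to them.
NODE_LABELS = {
    "mac-host": {"osx", "xcode"},
    "win-vm": {"windows", "vs2008", "vs2010"},
    "linux-vm": {"linux", "make"},
}


def eligible_nodes(required_label, node_labels=NODE_LABELS):
    """Return the nodes that may run a job restricted to `required_label`."""
    return sorted(
        node for node, labels in node_labels.items() if required_label in labels
    )
```

Adding a second Windows slave with the `vs2010` label enlarges the pool without touching any job configuration.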

> Some questions:
> * I would love some feedback on this whole configuration and my plan of attack

I'd start with jobs that approximate whatever you are doing manually.
Aside from having fewer surprises, you want developers to be able to
do their own test builds before committing and have the same thing
happen in the integration run.

> * Does it make sense to try to use parameterised builds for
> building the apps on various compiler/platforms? I'm worrying that
> each of the jobs that do this will need something special/unique for
> each platform.

That's easy enough to change if you see you need it (you notice you
are creating many copies of a job for a single project with small
differences).   You may be more interested in the matrix build
capability if you build the same thing on several platforms, and for
that you need to execute the same command on each so you may need the
Xshell or groovy plugins to handle the variations.
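
One way to give a matrix build the same command on every platform is a small wrapper script, kept in version control, that picks the native tool. The following is only a sketch: it assumes the wrapper is written in Python, and the project file names, paths and function name are hypothetical. The tool invocations are the usual `devenv`, `xcodebuild` and `make` command lines.

```python
import platform


def build_command(app_dir, system=None, configuration="Release"):
    """Return the native build command for one application as an argument list."""
    system = system or platform.system()
    if system == "Windows":
        # Hypothetical solution name; devenv can build a .sln from the command line.
        return ["devenv", app_dir + "\\App.sln", "/Build", configuration]
    if system == "Darwin":
        # Hypothetical Xcode project name.
        return ["xcodebuild", "-project", app_dir + "/App.xcodeproj",
                "-configuration", configuration]
    if system == "Linux":
        return ["make", "-C", app_dir]
    raise ValueError("no build configuration for " + system)
```

The job itself would then run the same script on every axis of the matrix, for example by passing the result of `build_command("B/Applications/X")` to `subprocess.check_call`.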

> * Would it be best to put the master on osx/windows/linux? With the VM
> system I propose it doesn't really matter who the master is..

I'd run it on linux since it is probably the best-tested platform.

> * Do you have any hints of things I should read?

It's not that complicated, at least to get started.

> * All of the apps share/assume the same file structure. Will this get
> me into as much trouble as I think it will - jenkins being built
> around the idea of each app having a separate workspace?

Yes, with the possible exception of svn externals, you'll need to
arrange for each job to archive its build results and for anything
else that needs them to copy them wherever they need to be.  It is
probably possible to make jenkins leave the file structure alone and
do builds that just happen to work because needed components are
already there, but it seems like a really bad idea and doesn't mesh
well with distributed build farms.

--
   Les Mikesell
     [hidden email]

Re: (new here) applying Jenkins on our complex code base

Bram de Jong
>> My initial guess on how to solve this:
>> * put each of the repo's in a job that *only fetches* the repo into a
>> shared directory and doesn't do anything else.
>
> I'd think in terms of jobs that build components and applications, not
> so much in relationships to repositories.

The problem is that this would mean 100 different jobs that all do the
same thing (i.e. update 5 repositories - one of which is SUPER slow).
Each job will need approx. 10 GB of HD space just to hold the
repositories.
I.e. the overhead of having each of the 5 repos reproduced for each of
the 100+ jobs would be immense.
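
As a back-of-the-envelope check of that overhead, using the rough figures above (about 100 jobs, about 10 GB for one set of the five checkouts):

```python
JOBS = 100            # approximate number of application jobs
CHECKOUT_GB = 10      # approximate size of one set of the 5 checkouts


def disk_needed_gb(jobs, checkout_gb, shared):
    """Disk space for the checkouts, with one shared copy or one copy per job."""
    return checkout_gb if shared else jobs * checkout_gb


print(disk_needed_gb(JOBS, CHECKOUT_GB, shared=False))  # 1000
print(disk_needed_gb(JOBS, CHECKOUT_GB, shared=True))   # 10
```

So per-job workspaces come to roughly 1 TB on each build machine, against 10 GB for a single shared folder.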

> Where the code is in subversion, you might use svn externals to pull
> the components in instead of anything special in jenkins.

But the code is spread in Git and Subversion, not just Subversion.
It's a bit of a mixture.
Also, there is no way I can reorganize things:
reorganizing the structure would mean fixing the build
scripts/projects for at least 100 applications on 5 different
platforms (if you count various versions of Visual Studio as
platforms)!

> It is a good idea to give the
> slaves labels that reflect the build capabilities  and restrict the
> jobs to labels rather than specific nodes so it is easy to expand the
> pool for more capacity.

Duly noted!

> I'd start with jobs that approximate whatever you are doing manually.
> Aside from having fewer surprises, you want developers to be able to
> do their own test builds before committing and have the same thing
> happen in the integration run.

What I described in my email is exactly what we do:
1. make a super folder
2. check out the 5 repos in there
3. all the apps in those 5 repos now build

We don't create a new super folder for each application.
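
Those three steps could be scripted roughly as follows. This is a sketch under assumptions: the URLs are placeholders rather than real servers, and the function only returns the `svn` and `git` commands instead of running them.

```python
import os

# Placeholder locations; the real trunk/master URLs are not given in this thread.
REPOS = {
    "A": ("svn", "https://svn.example.com/A/trunk"),
    "B": ("svn", "https://svn.example.com/B/trunk"),
    "C": ("git", "https://git.example.com/C.git"),
    "D": ("svn", "https://svn.example.com/D/trunk"),
    "E": ("git", "https://git.example.com/E.git"),
}


def sync_commands(super_folder, repos=REPOS):
    """Return the commands that check out or update every repo side by side."""
    commands = []
    for name, (scm, url) in sorted(repos.items()):
        target = os.path.join(super_folder, name)
        exists = os.path.isdir(target)
        if scm == "svn":
            commands.append(["svn", "update", target] if exists
                            else ["svn", "checkout", url, target])
        else:
            commands.append(["git", "-C", target, "pull"] if exists
                            else ["git", "clone", url, target])
    return commands
```

A single fetch-only job could run these commands once, so that the slow update of A happens in one place only.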

>> * Would it be best to put the master on osx/windows/linux? With the VM
>> system I propose it doesn't really matter who the master is..
>
> I'd run it on linux since it is probably the best-tested platform.

Duly noted!

>> * All of the apps share/assume the same file structure. Will this get
>> me into as much trouble as I think it will - jenkins being built
>> around the idea of each app having a separate workspace?
>
> Yes, with the possible exception of svn externals, you'll need to
> arrange for each job to archive its build results and for anything
> else that needs them to copy them wherever they need to be.  It is
> probably possible to make jenkins leave the file structure alone and
> do builds that just happen to work because needed components are
> already there, but it seems like a really bad idea and doesn't mesh
> well with distributed build farms.

Hmm, so this is *definitely* going to get me into trouble...


 - bram

Re: (new here) applying Jenkins on our complex code base

Bram de Jong
On Mon, Jun 4, 2012 at 5:27 PM, Bram de Jong <[hidden email]> wrote:

>>> My initial guess on how to solve this:
>>> * put each of the repo's in a job that *only fetches* the repo into a
>>> shared directory and doesn't do anything else.
>>
>> I'd think in terms of jobs that build components and applications, not
>> so much in relationships to repositories.
>
> The problem is that this would mean 100 different jobs that all do the
> same thing (i.e. update 5 repositories - one of which is SUPER slow).
> Each job will need approx. 10 GB of HD space just to hold the
> repositories.
> I.e. the overhead of having each of the 5 repos reproduced for each of
> the 100+ jobs would be immense.
<snip>

I suppose the gist of my question is:
If I have 50 different applications that are all sitting in the same
repository, and I want each of these applications to build separately
as a job, do I really need to do 50 different checkouts of the
repository?

cheers,

 - bram

Re: (new here) applying Jenkins on our complex code base

slide
On Mon, Jun 4, 2012 at 9:13 AM, Bram de Jong <[hidden email]> wrote:

> On Mon, Jun 4, 2012 at 5:27 PM, Bram de Jong <[hidden email]> wrote:
>>>> My initial guess on how to solve this:
>>>> * put each of the repo's in a job that *only fetches* the repo into a
>>>> shared directory and doesn't do anything else.
>>>
>>> I'd think in terms of jobs that build components and applications, not
>>> so much in relationships to repositories.
>>
>> The problem is that this would mean 100 different jobs that all do the
>> same thing (i.e. update 5 repositories - one of which is SUPER slow).
>> Each job will need approx. 10 GB of HD space just to hold the
>> repositories.
>> I.e. the overhead of having each of the 5 repos reproduced for each of
>> the 100+ jobs would be immense.
> <snip>
>
> I suppose the gist of my question is:
> If I have 50 different applications that are all sitting in the same
> repository, and I want each of these applications to build separately
> as a job, do I really need to do 50 different checkouts of the
> repository?
>
> cheers,
>
>  - bram

No, we have a similar situation. I have a job that does the checkout
and then set the workspace for each of the other jobs to the place
where the SCM job checked the code out to and do the actual builds.

--
Website: http://earl-of-code.com

Re: (new here) applying Jenkins on our complex code base

Les Mikesell
In reply to this post by Bram de Jong
On Mon, Jun 4, 2012 at 10:27 AM, Bram de Jong
<[hidden email]> wrote:
>>
>> I'd think in terms of jobs that build components and applications, not
>> so much in relationships to repositories.
>
> The problem is that this would mean 100 different jobs that all do the
> same thing (i.e. update 5 repositories - one of which is SUPER slow).
> Each job will need approx. 10 GB of HD space just to hold the
> repositories.

I don't understand 'update repositories' in the context of a build.
Normally your repositories are full of branches/tags and maybe even
other unrelated projects, and a build just checks out exactly the
version it needs (generally the latest trunk rev for CI work).   And
aside from that, jenkins will keep your last workspace and do an
update to pull only the changes in the next revision once you get
started (optionally, of course), and will send the jobs back to the
same slave as long as it is available.   If you need 100 x 10 GB to do
the work you need to do, that doesn't seem like a big problem these
days especially since you can easily spread it over several slaves.

> I.e. the overhead of having each of the 5 repos reproduced for each of
> the 100+ jobs would be immense.

Again, I don't understand "reproducing" a repo.  Why does the build
server ever care about anything except a checked-out workspace of
exactly the project/revision it is building?  Aren't you doing
checkouts over a network already?

>> Where the code is in subversion, you might use svn externals to pull
>> the components in instead of anything special in jenkins.
>
> But the code is spread in Git and Subversion, not just Subversion.
> It's a bit of a mixture.

If you typically pull all the source together for a single compile and
you want to trigger a new build when any code is touched you may have
to have separate jobs polling each repo for changes, then triggering
the downstream run.  To get started, though, I'd just do scheduled
nightly builds and start them manually in the web interface if you
know something is changed and want a run earlier.
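
The polling idea amounts to remembering the last revision seen for each repo and triggering the combined build when any of them moves. Below is a sketch of just that decision, with made-up revision values; how the current revisions are obtained (for example from `svn info` or `git rev-parse HEAD`) is left out.

```python
def changed_repos(last_seen, current):
    """Return the repos whose current revision differs from the last one built."""
    return sorted(
        name for name, revision in current.items()
        if last_seen.get(name) != revision
    )


def should_trigger(last_seen, current):
    """A combined build is due as soon as any single repo has changed."""
    return bool(changed_repos(last_seen, current))
```

For instance, with `last_seen = {"A": "r10", "C": "abc"}` and `current = {"A": "r10", "C": "def"}`, only C has changed, and that alone is enough to start the run.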

If you build binary component objects separately, then pull them
together in the final applications, you can have jenkins archive the
build artifacts from one job, then use the copy artifact plugin to
install them where a downstream job needs them.
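
In outline, the archive-then-copy arrangement has the following effect. This is not how the copy artifact plugin is implemented, only an illustration with the standard library; the flat directory layout and the function names are assumptions.

```python
import os
import shutil


def archive_artifacts(workspace, archive_dir, names):
    """Copy the named build results out of a job's workspace into its archive."""
    os.makedirs(archive_dir, exist_ok=True)
    for name in names:
        shutil.copy2(os.path.join(workspace, name), archive_dir)


def copy_artifacts(archive_dir, target_dir):
    """Install every archived file where the consuming job expects it."""
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(archive_dir)):
        shutil.copy2(os.path.join(archive_dir, name), target_dir)
        copied.append(name)
    return copied
```

The consuming job then builds against the copied files in its own workspace, so it no longer depends on another job's directory happening to be in place.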

Normally you will need to manage versioning of each component too, so
it is hard to generalize about the best way to do it.   With
subversion, you can point your svn externals at specific tags to
control the component revisions that are included but you'll want
similar control from other sources so a library can be changed for one
application's needs while still having access to an older rev for
others that use it.

> Also, there is no way I can reorganize things:
> reorganizing the structure would mean fixing the build
> scripts/projects for at least 100 applications on 5 different
> platforms (if you count various versions of Visual Studio as
> platforms)!

If they are already broken (in the sense of just accidentally working
because needed components happen to be in the right place already),
you should probably fix them so they work predictably when automated,
at least to the point where either jenkins or the build scripts
retrieve everything that is needed, one way or another.  One approach
would be to add a 'top-level' application build script that assembles
everything you need and runs the commands for the build so you don't
have to change the existing scripts.  Or mix/match this with
shell/batch/groovy scripts embedded in the jenkins job.

If the builds already work such that someone with a suitable toolset
installed can check out the source and run a command to build, then a
jenkins job can do that for you.   Jenkins can supply more logic than
that, but normally you want the build scripts at the top level and for
each component that is built separately to do the work and to be
included in the source version control so that the same thing happens
in a developer's local test run and in the CI version.

> What I described in my email is exactly what we do:
> 1. make a super folder
> 2. check out the 5 repos in there
> 3. all the apps in those 5 repos now build

Is all of this somehow connected inseparably?   Normally, builds would
be oriented towards versions of specific components and then
applications containing them, and the versions of different parts
would be allowed to advance at different rates.

> We don't create a new super folder for each application.

Super folders and whole repositories don't relate very well to
specific build jobs.

--
   Les Mikesell
     [hidden email]

Re: (new here) applying Jenkins on our complex code base

Bram de Jong
In reply to this post by slide
On Mon, Jun 4, 2012 at 6:16 PM, Slide <[hidden email]> wrote:
> No, we have a similar situation. I have a job that does the checkout
> and then set the workspace for each of the other jobs to the place
> where the SCM job checked the code out to and do the actual builds.

Aha, and could you perhaps give me an idea of how you set this up precisely?


 - bram

--
Bram de Jong - CTO
SampleSumo BVBA, Wiedauwkaai 23 G, B-9000 Ghent, Belgium
Web: http://www.samplesumo.com
Twitter: http://twitter.com/SampleSumo
Facebook: http://facebook.com/SampleSumo
Phone: +32 9 3355925 - Mobile: +32 484 154730

Re: (new here) applying Jenkins on our complex code base

Les Mikesell
In reply to this post by Bram de Jong
On Mon, Jun 4, 2012 at 11:13 AM, Bram de Jong
<[hidden email]> wrote:
> >
> I suppose the gist of my question is:
> If I have 50 different applications that are all sitting in the same
> repository, and I want each of these applications to build separately
> as a job, do I really need to do 50 different checkouts of the
> repository?

You don't _have_ to, but normally, the point of a build is to build a
component or application, not a bunch of unrelated stuff.  So for most
people it doesn't make any sense to check out anything but exactly the
revision you want at exactly the top-level directory containing the
code you want to build.   And Jenkins' repository pollers will then
correctly track new commits and build the right things automatically
when you do it that way.

It rarely makes any sense to check out a whole repository that is
likely to contain tags/branches, etc. that are unrelated to your
current build.

--
   Les Mikesell
      [hidden email]

Re: (new here) applying Jenkins on our complex code base

slide
In reply to this post by Bram de Jong
On Mon, Jun 4, 2012 at 9:22 AM, Bram de Jong <[hidden email]> wrote:

> On Mon, Jun 4, 2012 at 6:16 PM, Slide <[hidden email]> wrote:
>> No, we have a similar situation. I have a job that does the checkout
>> and then set the workspace for each of the other jobs to the place
>> where the SCM job checked the code out to and do the actual builds.
>
> Aha, and could you perhaps give me an idea of how you set this up precisely?
>
>
>  - bram
>
> --
> Bram de Jong - CTO
> SampleSumo BVBA, Wiedauwkaai 23 G, B-9000 Ghent, Belgium
> Web: http://www.samplesumo.com
> Twitter: http://twitter.com/SampleSumo
> Facebook: http://facebook.com/SampleSumo
> Phone: +32 9 3355925 - Mobile: +32 484 154730

We only have one SCM to deal with (ClearCase) so it will be slightly
different, but you may want to look into the multi-SCM plugin which
may allow polling of multiple SCMs (but I digress). The way I have
things set up is that a job polls my ClearCase for changes; when
it finds changes, it updates the local working copy in a specific
directory. Then it kicks off other jobs that are the actual other
builds. These jobs use a custom workspace that is the directory that
was updated in the upstream job. The tricky part is when you want to
do distributed builds, you either have to copy the source to the
slaves (which is what I do) or use a network share for the source
code.

Let me know if that helps, or if you need some more information.

Thanks,

slide



--
Website: http://earl-of-code.com

Re: (new here) applying Jenkins on our complex code base

Bram de Jong
In reply to this post by Les Mikesell
On Mon, Jun 4, 2012 at 6:18 PM, Les Mikesell <[hidden email]> wrote:

>> I suppose the gist of my question is:
>> If I have 50 different applications that are all sitting in the same
>> repository, and I want each of these applications to build separately
>> as a job, do I really need to do 50 different checkouts of the
>> repository?
>
> You don't _have_ to, but normally, the point of a build is to build a
> component or application, not a bunch of unrelated stuff.  So for most
> people it doesn't make any sense to check out anything but exactly the
> revision you want at exactly the top-level directory containing the
> code you want to build.   And Jenkins' repository pollers will then
> correctly track new commits and build the right things automatically
> when you do it that way.

When I was talking about repositories and checking them out, I did
mean only the trunks/master branches, never the whole repo.

> It rarely makes any sense to check out a whole repository that is
> likely to contain tags/branches, etc. that are unrelated to your
> current build.

If you'd go back to my first email and substitute "repository" by
"trunk" maybe my emails make more sense?
Our apps don't need the full repositories, they just need the trunk/master.
But still, the rest of my email(s) stay the same: we have many apps
sitting in the same repository.

On Mon, Jun 4, 2012 at 6:18 PM, Les Mikesell <[hidden email]> wrote:
>> I.e. the overhead of having each of the 5 repos reproduced for each of
>> the 100+ jobs would be immense.
>
> Again, I don't understand "reproducing" a repo.  Why does the build
> server ever care about anything except a checked-out workspace of
> exactly the project/revision it is building?  Aren't you doing
> checkouts over a network already?

Trunk. We'd need a trunk/master checkout of the 5 repos for each job.

>> What I described in my email is exactly what we do:
>> 1. make a super folder
>> 2. check out the 5 repos in there
>> 3. all the apps in those 5 repos now build
>
> Is all of this somehow connected inseparably?   Normally, builds would
> be oriented towards versions of specific components and then
> applications containing them, and the versions of different parts
> would be allowed to advance at different rates.

Yes, inseparable. Each of the repositories holds part of a huge
codebase that is used in many of the applications all over the various
repositories. Sure, we don't really like it this way, but we have no
say in the matter, we can't change the way it is set up.

I definitely understand that -as Slide and you both say- this makes it
rather hard (impossible?) to make easy distributed builds, but for now
we're looking at a single computer with 2 VM's on it, that might help
for now...


 - bram

Re: (new here) applying Jenkins on our complex code base

Les Mikesell
On Mon, Jun 4, 2012 at 11:48 AM, Bram de Jong
<[hidden email]> wrote:
>
> If you'd go back to my first email and substitute "repository" by
> "trunk" maybe my emails make more sense?
> Our apps don't need the full repositories, they just need the trunk/master.
> But still, the rest of my email(s) stay the same: we have many apps
> sitting in the same repository.

At least in subversion, if you put multiple projects in the same repo
you would normally put the project directories at the top level, each
with its own trunk/branches/tags fanout, so what you check out (and
what jenkins watches) is project/trunk - and you don't waste/duplicate
any space when each job does its own checkout.   If you have a top
level trunk with multiple projects under that, things can get ugly.
But you should still be able to point to trunk/project for a build
job.

>> Is all of this somehow connected inseparably?   Normally, builds would
>> be oriented towards versions of specific components and then
>> applications containing them, and the versions of different parts
>> would be allowed to advance at different rates.
>
> Yes, inseparable. Each of the repositories holds part of a huge
> codebase that is used in many of the applications all over the various
> repositories. Sure, we don't really like it this way, but we have no
> say in the matter, we can't change the way it is set up.

Everybody has components/libraries that are included in multiple
higher level applications.  There are rare situations where you might
want to tie all of the trunk/HEAD versions together,  but it is much
more common that you would want to tag specific versions of every
component so each has its own development cycle that won't break the
dependent applications that build against tested/tagged component
releases, advancing those tags independently.  So at some point in
your application level builds you must have some way to pick the
component versions.  The one I am most familiar with is svn externals,
but you can do it other ways, and they don't have that much to do with
repository layouts.  The catch is that the stock jenkins poller is
only going to know about one version control system per job.

> I definitely understand that -as Slide and you both say- this makes it
> rather hard (impossible?) to make easy distributed builds, but for now
> we're looking at a single computer with 2 VM's on it, that might help
> for now...

Your quick-fix here might be a network file share mounted/mapped into
all the slaves, but then you can only run one build at a time of
anything with dependencies.   That approach does work for
seldom-changed or release-versioned components like boost library
versions, etc.

--
   Les Mikesell
      [hidden email]