[dundee] VMWare Server 2.0

Thu Oct 2 23:59:13 UTC 2008

2008/10/2 Lee Hughes <toxicnaan at yahoo.co.uk>:
> All good stuff. What I was trying to say, and failing miserably, was that
> applications need to be better at reclaiming resources that are no longer
> needed. I think as programmers, it's easy to keep a connection to
> files/databases/objects open for longer than is needed.

Your point was clear :-) and you're absolutely correct, it's WAY
easier to do the wrong thing than the right one!

The trick is balancing all the functional requirements with the
non-functional ones, like resource usage, code clarity, elegance,
speed/efficiency, extensibility/reuse, error handling, time to
market...  If you're merely a mortal, then pick two!  Oh, and the two
you pick may or may not be easily achievable with the
language/environment/tools you're using!  Worse still, you'll probably
only find out you picked the wrong two (or the wrong
language/environment) when it's too late ;-)  And then even when you
choose wisely, the non-functional requirements might change!! :-\

> Apache for instance..
> ----------------------------------snip--
>
> Recycle Apache Processes
>
> If you noticed, I changed the MaxRequestsPerChild variable to 500, from 0.
> This variable tells Apache how many requests a given child process can
> handle before it should be killed. You want to kill processes, because
> different page requests will allocate more memory. If a script allocates a
> lot of memory, the Apache process under which it runs will allocate that
> memory, and it won't let it go. If you're bumping up against the memory
> limit of your system, this could cause you to have unnecessary swapping.
> Different people use different settings here. How to set this is probably a
> function of the traffic you receive and the nature of your site. Use your
> brain on this one.
>
> --snip---------------------------
>
>
> okay, that's all well and good, but constantly asking unix to create
> processes is going to slow the whole show down, fork is expensive!! :-). The
> ideal thing for apache to do is this reclaim dynamically when it's less busy
> (i.e. doing 10 requests a second, rather than 1000). This is just one app; I
> think the whole way processes allocate memory has to be rethought. It's
> probably easy to do with the object way of doing things, because it's fairly
> easy to tell if an object is in use and if it's not then goodbye... having
> looked at java's memory management stuff, it's rather nice. This stuff has to
> be dynamic and self-learning,

Sadly even in RE's with fantastic garbage collection we can leak
memory (as programmers can still do things wrong), e.g. you might add
some Objects to a Map as a form of caching but forget to remove them
when you should.  Thankfully the JVM provides different
reference types which are treated differently by the garbage
collector, but these features aren't widely understood, and are
arguably too complex:

http://www-128.ibm.com/developerworks/library/j-refs/
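
To make that concrete, here is a minimal sketch (mine, not taken from the
article above) of the caching leak and of one way a weak-keyed map can avoid
it.  The class name CacheSketch and the describe() helper are made up for
illustration; HashMap and WeakHashMap are the standard java.util classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration class; not part of any library.
class CacheSketch {

    // Look up a cached description for key, computing and storing it on a miss.
    static String describe(Map<Object, String> cache, Object key) {
        String value = cache.get(key);
        if (value == null) {
            // The value deliberately holds no reference back to the key.
            value = "cached:" + key.getClass().getSimpleName();
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Strong keys: the entry stays reachable until someone calls remove().
        Map<Object, String> strongCache = new HashMap<>();
        // Weak keys: the entry becomes collectable once the key is otherwise unreachable.
        Map<Object, String> weakCache = new WeakHashMap<>();

        Object session = new Object();
        describe(strongCache, session);
        describe(weakCache, session);

        // Drop our only strong reference and ask (not force) the JVM to collect.
        session = null;
        System.gc();

        // The HashMap still pins its key; the WeakHashMap entry may have been dropped.
        System.out.println("strong entries: " + strongCache.size());
        System.out.println("weak entries:   " + weakCache.size());
    }
}
```

System.gc() is only a request, so the weak entry isn't guaranteed to have
gone by the time the last line prints.  Also, a WeakHashMap only helps if the
cached values don't themselves hold a strong reference back to the key.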

> networks are variable beasts, and load + traffic patterns can vary
> considerably in an instant; having to manually tune applications for 'what
> may happen, or is happening' is so 1960's. ;-)
>
> Memory allocation today in unix? It's basic and it's ugly. If VMs are going
> to be optimised and run at their full potential then the applications
> running on top of them will need to be redesigned. Redesign applications? I
> doubt it....
> The hope we have is running RE's that do good memory management on behalf
> of running applications.

Personally I wouldn't restrict the blame to the applications.
Applications have to try and cover up for mistakes, shortcomings and
leaky abstractions further down the stack and tool-chain; whilst they
themselves introduce a whole new class of problems!  These problems are
rife and run right down to the hardware architectures themselves.

You're right to say it's unlikely applications will be redesigned, but
then neither will the RE's, Operating Systems or Hardware...  I mean
how long are hardware vendors going to keep pushing us von Neumann
shared memory machines with multi-core CPUs?  We really need to
reboot from the bottom (hardware) up and reset our broken abstractions
which aren't just leaking but causing floods!

Until then, we're stuck with the industry standard "best-practice"
approach of brushing it all under the carpet!  Which is probably the
only thing we can all rely on!

> Perhaps someone can tell me otherwise.
>
> and don't even mention memory leaks....
>
> anyway It's all good ammunition for my thesis project. ;-)
>
>
> Cheers,
> Lee
>
> --- On Thu, 2/10/08, Rick Moynihan <rick.moynihan at gmail.com> wrote:
>
> From: Rick Moynihan <rick.moynihan at gmail.com>
> Subject: Re: [dundee] VMWare Server 2.0
> To: toxicnaan at yahoo.co.uk, "Tayside Linux User Group"
> <dundee at lists.lug.org.uk>
> Date: Thursday, 2 October, 2008, 3:36 PM
>
> 2008/10/2 Lee Hughes <toxicnaan at yahoo.co.uk>:
>> The thing that gets me about virtual machines is memory usage. It's all
>> well and good on a single machine (non-VM): you can have, say, 1GB of RAM
>> and set up 2GB of swap space, but linux seems rather lazy about claiming
>> and reclaiming memory. If I look at my own machine, I have 1GB of RAM and
>> am currently using 787MB of swap! This would cause havoc in a virtual
>> environment.... okay, I need a memory upgrade.
>>
>> Until linux applications constrain their memory use, or they can be given
>> hints on maximum or minimum memory use, then using any virtual machine
>> technology that supports paging to disk is a no-no.
>>
>> Paging to disk is not usually a big problem, as only one machine is
>> affected, and that machine has already exhausted its memory, so a slowdown
>> is expected. Misbehaving VMs that are paging will affect the performance
>> of all VMs on that system.
>>
>> Take Apache for example: this always seems to grow in size. It will fork()
>> more depending on its load, using more memory in the process, and I've
>> never seen it release memory unless you restart the entire process. :-(
>> Obviously you can tune it, but it would be great if this, and other apps,
>> were aware they were being virtualised, and tuned their memory allocation
>> accordingly.
>>
>> So, perhaps applications should become more VM-aware? Programmers should
>> stop thinking that memory is an infinite resource, and stop assuming that
>> if they allocate more memory than is available then the kernel/libc will
>> just 'sort it out for them'. Memory leaks in one VM's apps could
>> potentially affect others.
>>
>> Linux still suffers from memory leaks; they get fixed. I was told once
>> that the unix mount command leaks lots of memory. Sure, you only ever run
>> it, it does its job and it quits (linux then reclaims the memory), but
>> that's no excuse for sloppy code.
>>
>> Java VM seems a bit more promising, at least you can force garbage
>> collection in low memory situations.
>>
>> But what is the solution to this? Large disk administrators set up temp
>> areas, where users can create very large working files but for a limited
>> amount of time. Perhaps this needs to be implemented in memory management
>> too? Okay, Mr Apache, you can double your memory size, but only for x
>> amount of time.
>>
>> OpenVZ seems to stay away from virtual memory, and allows you to allocate
>> min and max pages, but it's rather black magic, to do with wetting your
>> finger and putting it in the air.
>>
>> For my installs I've just packed in as much RAM as possible, to avoid
>> unnecessary swapping.
>>
>> I'd be interested to hear about commercial VM solutions: do they have a
>> magic bullet for memory management?
>
> Some interesting points, but I'm not convinced that it was
> virtualisation that triggered the need for applications to be more
> memory aware to avoid interference with other processes.  Rather I
> think we've had this very same problem since we implemented
> virtual-memory and time-sharing, which must date it to around 1960!
> :-)
>
> You're right that language Runtime Environments (also called VM's but
> I'll call them RE's to avoid confusion) do a lot to help improve
> memory management, and Java's garbage collector is *VERY* highly
> regarded.  However even the JVM (and CLR) have historically had issues
> here, as garbage collection could momentarily freeze the RE.
>
> The JVM used a "generational collection" strategy which mitigated this
> for a long time, but still proved problematic in environments which
> required 'soft realtime response'.  Java 5 however saw an awesome
> improvement here with the implementation of Parallel & Concurrent
> Garbage Collection which kicks many of these issues firmly into touch.
>  Indeed, the JVM is even supposed to be able to dynamically select a
> garbage collection strategy to suit the application/environment:
>
> http://chaoticjava.com/posts/parallel-and-concurrent-garbage-collectors/
>
> I guess it might be nice to see this kind of approach adopted into VM
> hypervisors.
>
> The Erlang Runtime Environment has another approach, supported thanks
> to its light-weight process model.  Here processes within the RE have
> separate heaps that are GC'd separately, drastically minimizing the
> time a process can freeze for garbage collection:
>
> http://prog21.dadgum.com/16.html
>
> Speaking of Erlang, you might find this lecture interesting, where Joe
> Armstrong points to many of these memory issues (at least as they
> relate to forking/threading) being due to the granularity of page
> table sizes in the O/S:
>
> http://www.infoq.com/presentations/erlang-software-for-a-concurrent-world
>
> By the way...  If you're worried about run-away processes leaking
> memory then you might want to look into the process/task monitors God,
> or monit:
>
> - http://god.rubyforge.org/
> - http://www.tildeslash.com/monit/doc/examples.php
>
> They allow you to easily set up rules to restart a process if its
> memory goes over a specified value for too long.  They also do a lot
> more besides!
>
> R.
>
> _______________________________________________
> dundee GNU/Linux Users Group mailing list
> dundee at lists.lug.org.uk
>  http://dundee.lug.org.uk
> https://mailman.lug.org.uk/mailman/listinfo/dundee
> Chat on IRC, #tlug on dundee.lug.org.uk

-- 
Rick Moynihan
rick.moynihan at gmail.com
http://sourcesmouth.co.uk/blog/