Discussion:
Memory Pools
Jim Starkey
2005-02-15 17:58:03 UTC
The single most contentious issue in the merge debate will be memory
pools. Here is some background.

The History

My use of memory pools goes back to PDP-11 Datatrieve. The PDP-11 was a
16 bit machine, but unlike the 286, didn't have segment registers. A 16
bit address space was it. Most of the too-many PDP-11 operating systems
gave user code the full 64K. An exception was the most popular one,
RSTS, which took 8K to map the Basic interpreter (or something). To
give decent functionality in limited address space, density was
everything. Datatrieve used three pools. The permanent pool (metadata)
started at the end of code and grew up. The execution (temporary) pool
started at the top of the address space and grew down. What was left over
in the middle was available for sort space. At the end of request
execution, the lower limit of the execution pool was moved back to the
top of the address space.
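
The three-pool layout described above can be sketched roughly as follows. This is a minimal illustration, not Datatrieve code: the class name TwoEndedArena and its member names are hypothetical, and alignment is ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-ended arena modelled on the layout described above.
class TwoEndedArena {
public:
    explicit TwoEndedArena(std::size_t bytes)
        : space_(bytes), low_(0), high_(bytes) {}

    // Permanent pool: starts at the bottom and grows up.
    void* allocPermanent(std::size_t n) {
        if (n > high_ - low_) return nullptr;   // would collide with the execution pool
        void* p = space_.data() + low_;
        low_ += n;
        return p;
    }

    // Execution (temporary) pool: starts at the top and grows down.
    void* allocExecution(std::size_t n) {
        if (n > high_ - low_) return nullptr;   // would collide with the permanent pool
        high_ -= n;
        return space_.data() + high_;
    }

    // End of request: the whole execution pool is released in one step
    // by moving its lower limit back to the top of the space.
    void endRequest() { high_ = space_.size(); }

    // Whatever lies between the two pools is available as sort space.
    std::size_t sortSpace() const { return high_ - low_; }

private:
    std::vector<unsigned char> space_;
    std::size_t low_;    // first free byte above the permanent pool
    std::size_t high_;   // lowest byte used by the execution pool
};
```

Releasing the execution pool costs one assignment, which is why the scheme was attractive when density was everything.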

VAX Datatrieve had more address space but more challenges. The product
was architected as a callable facility front-ended by a "terminal
server". An interactive user could have only one active request, but
the Datatrieve server (the product did automatic, network wide query
decomposition), also a front end to the callable facility, could and did
support multiple active requests. Rather than write request memory
garbage collect code, I kept the remnants of the Datatrieve-11 pool
architecture to support a per-request compilation pool (purged at the
end of request compilation) and a per-request runtime pool (purged at
the end of request execution).

Interbase, originally gds/Galaxy, was targeted at the Unix workstation
market, effectively split between 100% 68000 and the rest MicroVAX.
Workstations in that era were 3 MB to 4 MB machines, max.
Galaxy/Interbase had to share address space and physical memory with the
client program and OS runtime services. Compared to the PDP-11, it was
generous, but if the virtual space exceeded physical memory, it flogged
itself to death. I used pools again, but for different reasons. The
primary reasons were that the code to intelligently delete the more
complex execution structures would have been quite large in code size,
fragile, and prone to memory leakage and corruption. Allocating request
specific memory from a request pool allows the entire request to vanish
with a pool deletion.

The memory pool allocator was rewritten for Firebird 1.5, but the
semantics were unchanged.

Objects and Complexity

C++ has two huge advantages over C. One is polymorphism (Dimitry will
be happy to explain), generally referred to as inheritance in
object-speak. Where C would have a single structure that had to
represent many variations of a theme, C++ supports a type hierarchy of
like-minded objects with different implementations. This allows, for
example, a whole hierarchy of runtime nodes with different internal
representations but sharing a common interface. The second huge advance
was destructors -- code called by the infrastructure when an object was
deleted to clean up after itself.
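
As a small illustration of both points, here is a sketch of a node hierarchy with a common interface and destructors that clean up automatically. The names Node, Literal and Add are hypothetical, not Firebird or Vulcan classes, and the live counter exists only to make destruction observable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Common interface shared by every runtime node.
class Node {
public:
    Node() { ++live; }
    virtual ~Node() { --live; }           // virtual, so deleting via Node* runs the derived destructor
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;
    virtual long evaluate() const = 0;
    static int live;                      // number of nodes currently alive
};
int Node::live = 0;

// One implementation: a constant value.
class Literal : public Node {
public:
    explicit Literal(long v) : value_(v) {}
    long evaluate() const override { return value_; }
private:
    long value_;
};

// Another implementation with a different internal representation.
class Add : public Node {
public:
    Add(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : left_(std::move(l)), right_(std::move(r)) {}
    long evaluate() const override { return left_->evaluate() + right_->evaluate(); }
private:
    std::unique_ptr<Node> left_;    // children are destroyed together with this node
    std::unique_ptr<Node> right_;
};
```

Destroying the root of such a tree destroys every node below it, with no caller needing to know what the tree contains.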

At least two big things have changed since the original release of
Interbase in 1985. One is that virtual address space and physical
memory configurations are huge and growing. The other, not unrelated,
is that the complexity of software has mushroomed. The answer to almost
unbounded complexity is object technology that allows complex
implementation to be encapsulated into externally simple objects.
Object technology demands a price, however, and that price shouldn't be
a surprise to database engineers. That price is referential integrity.
Objects are created, establish relationships to other objects, and are
destroyed. For this to work, the integrity of object relationships must
be guaranteed. Garbage collected languages like Java and Lisp never
reclaim objects for which a pointer exists. C++, at least the way
we use it, is not garbage collected, so the responsibility for maintaining
the integrity of inter-object relationships falls on the programmer.
There are many ways to do this (one size does NOT fit all), but it must
be done. C++ provides the hook -- the class destructor, a method called
by the language infrastructure to break its relationship to other
objects, preserving the integrity of the data structures.
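
One of the many ways to do this might look like the sketch below, where an object registers itself with its owner on construction and breaks the relationship in its destructor. The class names Attachment and Request are borrowed from the discussion, but the code is a hypothetical illustration, not engine code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class Request;

// Hypothetical owner that keeps raw pointers to its active requests.
class Attachment {
public:
    std::size_t activeRequests() const { return requests_.size(); }
private:
    friend class Request;
    std::vector<Request*> requests_;
};

class Request {
public:
    explicit Request(Attachment& owner) : owner_(owner) {
        owner_.requests_.push_back(this);      // establish the relationship
    }
    ~Request() {
        // Break the relationship so the owner never holds a dead pointer.
        auto& v = owner_.requests_;
        v.erase(std::remove(v.begin(), v.end(), this), v.end());
    }
    Request(const Request&) = delete;
    Request& operator=(const Request&) = delete;
private:
    Attachment& owner_;
};
```

The integrity of the Attachment's list depends entirely on ~Request() actually being run.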

The memory pool architecture inherited by Firebird from Datatrieve-11 is
based on the assumption that all pointers in the execution structure are
either to permanent things or to other things in the same pool. At one
point, this was a simplifying assumption that significantly reduced the
code size and complexity. Interbase/Firebird has grown dramatically
since then, and the code to manage pools now dwarfs the code necessary
to clean up individual objects. At the same time, the fact that the
destructor mechanism is not respected by Firebird memory pools means
that resource management of individual objects cannot be handled by the
object but must be managed globally across the entire code base. It is
not sufficient for an engineer to understand the implementation of an
object and its immediate clients to work on an object; it is necessary
for him or her to understand the entire system to know when that object
can disappear without notice, leaving related objects with broken pointers.
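
The failure mode can be reproduced in a few lines. The sketch below uses a hypothetical bump allocator (Pool) that releases everything at once without running destructors; it is not the Firebird allocator, and Registry and Tracked are made-up names standing in for a permanent structure and a pooled object.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bump allocator with delete-by-pool semantics.
class Pool {
public:
    explicit Pool(std::size_t bytes) : buf_(bytes), used_(0) {}
    void* allocate(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        const std::size_t start = (used_ + a - 1) / a * a;   // round up for alignment
        if (start + n > buf_.size()) throw std::bad_alloc();
        used_ = start + n;
        return buf_.data() + start;
    }
    void releaseAll() { used_ = 0; }   // no destructor is run for anything in the pool
private:
    std::vector<unsigned char> buf_;
    std::size_t used_;
};

// A longer-lived structure that lives outside the pool.
struct Registry { std::vector<const void*> entries; };

// A pooled object that relies on its destructor to unregister itself.
struct Tracked {
    explicit Tracked(Registry& r) : reg(r) { reg.entries.push_back(this); }
    ~Tracked() {
        auto& e = reg.entries;
        e.erase(std::remove(e.begin(), e.end(), static_cast<const void*>(this)), e.end());
    }
    Registry& reg;
};

// Returns how many stale pointers the registry holds after the pool is released.
std::size_t stalePointersAfterRelease() {
    Registry reg;
    Pool pool(1024);
    new (pool.allocate(sizeof(Tracked))) Tracked(reg);   // placement new into the pool
    new (pool.allocate(sizeof(Tracked))) Tracked(reg);
    pool.releaseAll();           // storage is recycled, ~Tracked() never ran
    return reg.entries.size();   // both entries now refer to released storage
}
```

Nothing in Tracked is wrong in isolation; the registry is corrupted only because the pool bypassed the destructor.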

Object oriented technology, in short, is incompatible with
delete-by-pool. We can choose objects, we can choose memory pools, but
we can't choose both.

Peripheral Issues

A variety of defenses of memory pools have been raised. Here are some
answers.

One argument is locality of reference -- that objects within the same
pool are physically close, reducing page faults. Answer: a full 32 bit
address space worth of physical memory is about $600. If machines page
fault, locality won't help.

Another argument is that pool induced locality improves processor cache
efficiency when multiple threads are scheduled on different processors.
Answer: A running request makes memory references to code (shared),
metadata (shared), lock tables (shared), and execution objects (shared
after compiled request caching is enabled). The only code that is
thread specific is the request impure area which probably accounts for
less than 10% of memory references. A 10% theoretical improvement is
overwhelmed by the more practical efficiencies of object encapsulation.

NUMA (non-uniform memory access). One hardware architecture for very
large numbers of processors in a single physical address space involves
clusters of processors sharing a memory controller. Local memory
references are fast. Memory references outside the cluster invoke a
message exchange to the processor cluster controlling the memory. The
argument is that on a NUMA machine, pools can take advantage of cluster
aware memory allocators to ensure that allocated memory is local to the
processor cluster. Answer: First, NUMA machines don't exist in our
space. Second, the same arguments against process/thread/cache affinity
apply to NUMA, but even more so. Third, classic, where all memory
references are local, is a more appropriate architecture for a NUMA
machine. Fourth, a NUMA machine is a dog for database management, which
is disk bound, not cpu bound. The way to make a NUMA machine do fast
database access is to put the database in a different cabinet connected
with a big pipe.

Bottom Line

Memory pools as used by Firebird are incompatible with object oriented
programming. You can pick objects or you can pick pools. Yes, classes
can be made pool-aware, but that has nothing to do with the problem of
referential integrity between objects. Simply put, ripping an object
out of a complex structure when its parent memory pool is deleted
destroys the integrity of that data structure.

There is a compromise position that works but is of questionable value,
which is to support memory pools with the restriction that deleting a
pool containing an active object is a fatal error. It is consistent
with most of the pro-pool arguments as well as object referential
integrity. The benefit is problematic, at best.
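
For concreteness, the compromise might be sketched like this: the pool counts live allocations and treats deletion with anything still live as fatal. CheckedPool is a hypothetical name, and std::malloc stands in for whatever allocator a real pool would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical pool that refuses to die while it still contains live objects.
class CheckedPool {
public:
    CheckedPool() = default;
    CheckedPool(const CheckedPool&) = delete;
    CheckedPool& operator=(const CheckedPool&) = delete;

    ~CheckedPool() {
        if (live_ != 0) {
            // Deleting a pool with active objects is a fatal error.
            std::fprintf(stderr, "pool deleted with %zu live objects\n", live_);
            std::abort();
        }
    }

    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (p) ++live_;
        return p;
    }

    void release(void* p) {
        if (!p) return;
        std::free(p);
        --live_;
    }

    std::size_t liveObjects() const { return live_; }

private:
    std::size_t live_ = 0;   // blocks allocated and not yet released
};
```

With this rule the pool keeps its accounting role but no longer deletes anything on its own.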
Samofatov, Nickolay
2005-02-15 19:57:16 UTC
Hi, All!

Let me summarize the logic behind the use of functional memory pools in modern
server systems.
Apache, PostgreSQL, Oracle and basically every serious server system
use them and I can explain why.
I have explained this more than once, but I can write it here one last
time.

Modern systems associate memory pools with objects with defined
lifetime.

Let me quote a piece of Apache documentation which gives some insight
about the pools:
---
3.3.7.1 Memory management with Pools

Apache offers functions for a variety of tasks. One major service apache
offers to its modules is memory management. Since memory management is a
complex task in C/C++ and memory holes are the hardest to find bugs in a
server, Apache takes care of freeing all used memory after a module has
finished its tasks. To accomplish that, all memory has to be allocated
by the apache core. Therefore memory is organized in pools. Each pool is
associated with a task and has a corresponding lifetime. The main pools
are the server, connection and request pool. The server pool lives until
the server is shut down or restarted, the connection pool lives until
the corresponding connection is closed and the request pool is created
upon arrival and destroyed after finishing a request. Any module can
request any type of memory from a pool. That way the core knows about
all used memory. Once the pool has reached the end of its lifetime, the
core deallocates all memory managed by the pool. If a module needs
memory that should even have a shorter lifetime than any of the
available pools a module can ask Apache to create a sub pool. The module
can then use that pool like any other. After the pool has served its
purpose, the module can ask Apache to destroy the pool. The advantage is
that if a module forgets to destroy the sub pool, the core only has to
destroy the parent pool to destroy all sub pools.
---

In database systems pools are commonly associated with such entities as
Server, Database, Transaction, Attachment, Statement and Request.

The single most important thing about pools is that when you
allocate from a particular pool you BIND the LIFETIME of the memory block or
object to the object which owns the pool.

This property creates several IMPORTANT outcomes:
1. Monitoring. Looking at pool sizes for various objects, a DBA can say
which god damned request consumed all the memory of the server and kill the
offending session/request/transaction.
2. Memory leak debugging. To debug a server which consumed too much
memory at the customer site, we take a snapshot of the process and look up
which pool is too big. Knowing the functional purpose of the pool and
having pool dump information, the memory leak location is usually obvious
immediately, especially if line number information for allocations is
stored.
3. Failure isolation. Shit happens. Sometimes people forget to call
destructors and deallocate memory blocks. This is not a normal situation,
but again, shit happens. And the server needs to continue working without
consuming all the memory in the world even if some bad things happen. Or
stop immediately, producing a developer report after a memory leak is
detected, encouraging developers to fix the leak.
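
A rough sketch of the accounting behind points 1 and 3 is shown below. AccountedPool and largest() are hypothetical names used for illustration; this is not the Firebird pool implementation, and std::malloc stands in for the real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pool that tracks how much memory its owner has consumed.
class AccountedPool {
public:
    explicit AccountedPool(std::string owner) : owner_(std::move(owner)) {}
    // Failure isolation: anything still allocated is reclaimed with the pool.
    ~AccountedPool() { for (void* p : blocks_) std::free(p); }
    AccountedPool(const AccountedPool&) = delete;
    AccountedPool& operator=(const AccountedPool&) = delete;

    void* allocate(std::size_t n) {
        blocks_.push_back(nullptr);            // reserve the slot first
        void* p = std::malloc(n);
        if (!p) { blocks_.pop_back(); return nullptr; }
        blocks_.back() = p;
        bytes_ += n;
        return p;
    }

    const std::string& owner() const { return owner_; }
    std::size_t bytesInUse() const { return bytes_; }

private:
    std::string owner_;            // e.g. the request or transaction that owns the pool
    std::vector<void*> blocks_;
    std::size_t bytes_ = 0;
};

// Monitoring: returns the pool holding the most memory, or nullptr if there are none.
const AccountedPool* largest(const std::vector<const AccountedPool*>& pools) {
    const AccountedPool* best = nullptr;
    for (const AccountedPool* p : pools)
        if (!best || p->bytesInUse() > best->bytesInUse()) best = p;
    return best;
}
```

A monitoring tool would walk the list of pools and report the owner of the largest one.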

The things below this point are not as important and are more sidenotes
than architectural concepts.

For 99% of allocations delete-by-pool does not pose any
problems. As long as an object doesn't own external things like file handles
or synchronization objects it may disappear magically in bulk.
This comes from the fact that if somebody stores a pointer to some object
it should know very well the lifetime policy for the object it points to.
Firebird strings and container classes are safe to delete by pool.
And if an object points to an external thing and is deleted by pool you can
usually notice the loss from OS monitoring tools. Sometimes, if you
store millions of small structures, such as data records in a savepoint
backout journal, it is faster to delete them by pool, not one-by-one.

Anyway, whether we allow deleting objects by pool or print out a
developer report and dump core when such a thing happens IS NOT A BIG
DEAL. The important things are written above.

Also, having a memory pool infrastructure in place allows some neat
performance tricks such as using a NUMA API to allocate memory with CPU
locality, but this is not a big deal either.


Nickolay Samofatov
Jim Starkey
2005-02-15 20:47:03 UTC
Post by Samofatov, Nickolay
Let me quote a piece of Apache documentation which gives some insight
I have used Apache extensively. It is a disaster. It has the worst
system design and the worst memory management of any product I have
used since IBSYS on the IBM 7040.

Like Firebird pools, Apache disappears memory when it feels like it. It
is not possible to maintain state in Apache without resorting to shared
memory and semaphores to manage the shared memory. It is particularly
difficult because Apache doesn't tell you when you're going away -- it
just destroys your memory.

It is impossible to put an application that maintains state in Apache.
The best you can do is open a socket to another process that can at
least detect a socket that has been closed.

You should compare the module environment between Apache and IIS, which
does support embedded applications. I have a deeper grudge against
Microsoft than anyone else on the project, but the application
environment on IIS is usable, and Apache is not.

Using C++ has the same problem as Firebird -- destructors are not
called, nor architecturally could they, since the Apache allocation call
hasn't the slightest idea of what it's allocating.
Post by Samofatov, Nickolay
---
3.3.7.1 Memory management with Pools
Apache offers functions for a variety of tasks. One major service apache
offers to its modules is memory management. Since memory management is a
complex task in C/C++ and memory holes are the hardest to find bugs in a
server, Apache takes care of freeing all used memory after a module has
finished its tasks.
And this makes Apache incompatible with objects. Apache is openly
hostile to C++ -- it is (or was) impossible to even link a C++ program
using Apache supplied tools. I know of no module in Apache (other than
my own) that is implemented in C++.
Post by Samofatov, Nickolay
For 99% percent of allocations delete-by-pool does not impose any
problems. As long object doesn't own external things like file handles
or synchronization objects it may disappear magically in bulk.
This comes from the fact that if somebody stores pointer to some object
it should know very well the lifetime policy for object it points to.
Firebird strings and container classes are safe to delete by pool.
This is unspeakable crap. No object should have to understand the
system in which it operates. To do so is a gross violation of the
concept of encapsulation. An object should understand its internals,
the resources it allocates, and the interfaces of related objects.
Requiring an object to understand the entire system negates all benefits
of object oriented programming. And, like Firebird, it makes it
impossible to reuse code not written for the system or to reuse code
written for the system in other contexts. In short, it destroys the
foundation of code re-usability.

Firebird is an excellent case study. Over the last year Vulcan has
developed dozens of useful classes -- a string class, a set of
configuration management, status vector handling, exception classes,
message file handling, internal event handling. None of these are in
bed with Vulcan internals, yet Firebird has been unable to re-use any of
the code because of its 20 year old dependency on pools to manage object
lifetime.

Firebird cannot live indefinitely in the dark ages.
Post by Samofatov, Nickolay
And if object points to external thing and is deleted by pool you can
usually notice the loss from OS monitoring tools. Sometimes, if you
store millions of small structures, such as data records in savepoint
backout journal it is faster to delete them by pool, not one-by-one.
Nickolay, I showed you performance numbers, which you ignored, and
continue to ignore. You had access to the test bed, the sample data,
your memory manager, and my memory manager. What you say is half true,
half untrue. It may be faster to delete them as a pool if the objects
are allocated once, never deleted, and never re-used. If you add object
re-use into the equation, you get quite a different answer.

But performance isn't the issue, as you well know, or you would have
adopted the Vulcan memory manager that benchmarks five times the
performance of the Firebird memory manager. The primary issue is object
referential integrity, without which the benefits of object programming
are lost.
Post by Samofatov, Nickolay
Also, having memory pools infrastructure in place allows to do some neat
performance tricks such as using NUMA API to allocate memory with CPU
locality, but this is not a big deal either.
The last time you raised the issue of NUMA I offered you my dual
Opteron. Given the opportunity to demonstrate your argument, you
dropped the subject like a hot potato. The machine is still available.
Are you prepared to produce a test that demonstrates a performance gain?

But again, the issue isn't performance. The issue is object referential
integrity. But I repeat myself.
Brad Pepers
2005-02-15 20:53:03 UTC
Post by Samofatov, Nickolay
Hi, All!
Let me summarize logic behind use of functional memory pools in modern
server systems.
Apache, PostgreSQL, Oracle and basically every serious server system
uses them and I can explain why.
I explained this more than once, but can write it now here the last
time.
Just an outsider's opinion but while I hear your arguments, I disagree
with them. I think memory pools just move the problem around and
instead of having memory leaks you have allocation and object lifetime
problems which I think are harder to solve especially when C++ gives you
a good way to handle memory allocation/deallocation using the
constructor and destructor. I think that when shit happens you don't
try to merrily go along and try to pretend it didn't but instead should
log the problem and die immediately. I think pools lead to more sloppy
programming of just letting the pool handle mistakes rather than
requiring some responsibility on the developer. All the benefits of
memory leak tracking and monitoring can happen without pools. In fact
there are nice tools already built to do this and work with C++ rather
than having to build new tools to work with the memory pools. And as
for failure isolation I say again that when things fail, it should die
so people fix it rather than hiding behind something that tries to
pretend things aren't messed up and carrying on.

So to me your argument comes down to other people do it which isn't an
argument at all and then some technical points that are all very open to
question. I don't see a good argument why memory leak tracking is
easier with pools. In fact I think you will need to add extra tools to
your environment rather than using valgrind and others freely available
to track memory access and leaking problems. Monitoring is also
something that can be done just as well I think without pools so I don't
see the argument there. And then it finally comes down to whether its
better to carry on after things screw up or to log it and die (which is
perhaps more of a philosophical question), I would argue using the
features built in to the language and making developers responsible for
their code is better than code to work around the way the language works
and letting programmers just be lazy.

Just my un-asked for 10-bits worth!
--
Brad Pepers
***@linuxcanada.com
Paulo Gaspar
2005-02-17 16:24:21 UTC
Hi Nickolay,


First I will rant a bit, but rest assured that I will then get constructive.

(The rant:)

Let me tell you one thing, your repetitive references to the word MODERN
(also in other posts) impress me as little as Jim's obscure references to
arcane systems belonging to the realms of archeology. Those are NOT
technical arguments. Saying something is MODERN or that something else
is an ANCIENT proven practice has no technical value for me.

I mention this because I suppose many others might be feeling just the same
frustration about this kind of argument.

Also, I couldn't care less for what Apache, PostgreSQL, Oracle or any
other "serious server system" do, UNLESS you explain exactly WHY they
do it - or point to adequate documentation doing so.

(Getting constructive:)

There is only ONE real advantage I see in memory pools as they are
applied in FB: they are lifecycle related. If you have one memory pool
per each cycle of allocation/deallocation of objects (like a pool per
Post by Samofatov, Nickolay
The single and most important thing about pools is that when you
allocate from particular pool you BIND LIFETIME of memory block or
object to the object which owns the pool.
BUT then you do not mention memory fragmentation. Keeping memory
fragmentation low might be especially useful if you need to allocate
large blocks of memory.

Now, I will directly jump to the advantages you point out for memory pool
usage, since all the initial talk about Apache's model essentially adds
nothing.
Post by Samofatov, Nickolay
...
1. Monitoring. Looking at pool sizes for various objects DBA can say
which god damned request consumed all memory of the server and kill the
offending session/request/transaction.
It is easy to provide alternative mechanisms to do monitoring, just by
keeping counters per thread.
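
A minimal sketch of what such per-thread counters could look like follows. The names bytesInUse, countedAlloc and countedFree are hypothetical, and the sketch assumes a block is freed on the same thread that allocated it; otherwise the counters drift.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Bytes currently allocated by the calling thread through the wrappers below.
thread_local std::size_t bytesInUse = 0;

// Header stored in front of each block so the size is known at release time.
// The max_align_t member keeps the user block suitably aligned.
union Header {
    std::size_t size;
    std::max_align_t align;
};

void* countedAlloc(std::size_t n) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
    if (!h) return nullptr;
    h->size = n;
    bytesInUse += n;
    return h + 1;            // the caller sees only the memory after the header
}

void countedFree(void* p) {
    if (!p) return;
    Header* h = static_cast<Header*>(p) - 1;
    bytesInUse -= h->size;
    std::free(h);
}
```

A monitor thread would need some way to read each thread's counter, which this sketch leaves out.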

On the killing side, is it not problematic to assume you will free the
whole block without calling destructors? You are then missing the "++"
advantage, especially when managing objects that might allocate other
resources besides memory.
Post by Samofatov, Nickolay
2. Memory leaks debugging. To debug server which consumed too much
memory on the customer site we take the snapshot of process, look up
which pool is too big and knowing functional purpose of the pool and
having pool dump information memory leak location is usually obvious
immediately, especially if line number information for allocations is
stored.
This is just the same as 1., isn't it?
Post by Samofatov, Nickolay
3. Failure isolation. Shit happens. Sometimes people forget to call
destructors and deallocate memory blocks. This is not normal situation,
but again, shit happens. And server needs to continue working without
consuming all memory in the world even if some bad things happen. Or
stop immediately producing developer report after memory leak is
detected encouraging developers to fix the leak.
Shit must NOT happen on production software, and when it can not be avoided,
it must be handled graciously.

Actually, having a policy of releasing memory in blocks encourages shit
to happen, since programmers will be tempted to rely on the final block
release.
Post by Samofatov, Nickolay
For 99% percent of allocations delete-by-pool does not impose any
problems. As long object doesn't own external things like file handles
or synchronization objects it may disappear magically in bulk.
This comes from the fact that if somebody stores pointer to some object
it should know very well the lifetime policy for object it points to.
Firebird strings and container classes are safe to delete by pool.
And if object points to external thing and is deleted by pool you can
usually notice the loss from OS monitoring tools. Sometimes, if you
store millions of small structures, such as data records in savepoint
backout journal it is faster to delete them by pool, not one-by-one.
This smells. Does this mean that I must think, object by object, where I am
going to place it? (Does it have file handles? Does it have references to
other resources? etc.)

Having "millions of small structures" and just releasing them in a block just
to save the time of deallocating them one by one???

Do you know what destructors are? Do you know how much safer your
programming becomes when each object cleans up its own mess? Do
you know you are throwing that away?

You also throw away code reuse, because all your classes must really
be specific to this situation. You also make life harder to any newcomer,
since this is so non-standard.

Finally, you don't gain that much because you don't have so many
millions of objects, CPUs are really fast, and since you have to read and
write things to disk all this deallocation crap takes only some hundredths
of one percent of the time spent on a given transaction.

Bigger gain I see: less memory fragmentation for large memory blocks.

Other ways to deal with it: allocate LARGE memory blocks (buffers,
pages, etc.) to per-lifecycle memory allocation zones and customize the
default allocation - which will be used by the other "smaller" objects -
to a separated smaller-object memory zone.

Anyway, to further discredit the importance of some custom memory
allocation practices, there is an interesting paper at:
http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf

On the side of USING custom allocation:

http://www.camtp.uni-mb.si/books/Thinking-in-C++/TIC2Vone-distribution/html/Chapter13.html

In the above text, search for "Overloading new & delete",
"Overloading global new & delete", "Overloading new & delete for a class"
and ESPECIALLY "placement new & delete".

It shows (you probably know this, but can be interesting for the pure
C guys) how it is possible to build custom allocators for C++. You
can even have special allocator operators per class and operators that
accept "placement", like in:
ClassX* xp = new(memZone) ClassX;
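
To make that line concrete, here is a minimal sketch of a placement form of operator new that takes an allocation zone. MemZone, its members, and the destroy() helper are hypothetical names invented for this example; they are not from the book or from Firebird, and std::malloc stands in for the zone's real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocation zone.
class MemZone {
public:
    void* allocate(std::size_t n) {
        void* p = std::malloc(n);
        if (p) ++live_;
        return p;
    }
    void release(void* p) {
        if (!p) return;
        std::free(p);
        --live_;
    }
    std::size_t liveBlocks() const { return live_; }
private:
    std::size_t live_ = 0;
};

// Placement form selected by the expression: new(zone) T(args)
void* operator new(std::size_t n, MemZone& zone) {
    void* p = zone.allocate(n);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching placement delete; the compiler calls it only if T's constructor throws.
void operator delete(void* p, MemZone& zone) noexcept { zone.release(p); }

// Runs the destructor and then hands the memory back to the zone.
template <class T>
void destroy(T* obj, MemZone& zone) {
    if (!obj) return;
    obj->~T();
    zone.release(obj);
}

struct ClassX { int value = 7; };
```

Because destroy() runs the destructor before returning the memory, a custom allocator of this kind does not have to give up the cleanup that destructors provide.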

You can also find this excellent book ("Thinking in C++" by Bruce Eckel)
for download here:
http://www.mindview.net/Books/DownloadSites/
(Use the "Master Download Site")

Regards,
Paulo Gaspar
Jim Starkey
2005-02-17 19:18:21 UTC
Post by Paulo Gaspar
Let me tell you one thing, your repetitive references to the word MODERN
(also in other posts) impress me as little as Jim's obscure references to
arcane systems belonging to the realms of archeology. Those are NOT
technical arguments. Saying something is MODERN or that something else
is an ANCIENT proven practice has no technical value for me.
They are indications that something has been learned from experience.
We don't exactly start from the beginning on every question, you know.

I bring up "arcane systems" because these were the systems that drove
the current architecture. I've tried to explain that decisions I made
twenty years ago had a basis in the systems of that era, but that that era
has passed.

I'm sorry if I bore you with the history of Interbase. I believe that
no technical questions are absolute, but can only be considered in
context of the platforms at the time. I think it does make a difference
in how a database management system is implemented if the target system
has 3 MB or 4 GB of memory.

Many of the issues we've been discussing are decisions that I originally made
and now want to change. I think that people might find the history
interesting and useful, but if not, well, the delete button is at your
command.

Modern, however, does mean something. Naming conventions, for example,
have changed quite a bit. When I wrote Interbase, most platforms
supported global names of 8 characters or less. And, at the time, the
industry leaders in both mini-computers and workstations required dollar
signs in global names. In today's world, the dollar is proscribed, not
required.

The older practice was name segments delimited by underscore. Current
practice uses capitalization to indicate name segments. Java has taken
naming conventions a step further, and suggests that class names begin
with a capital letter and member names begin with a lowercase letter. I
found this convention useful, and have adopted it. I believe there is
merit to understanding both the reasons for transition from a namespace
restricted by an 8 character length and contemporary usage. If you wish
to believe that experience and history has no bearing on contemporary
decisions, that is your right.
Post by Paulo Gaspar
There is only ONE real advantage I see in memory pools as they are
applied in FB: they are lifecycle related. If you have one memory pool
per each cycle of allocation/deallocation of objects (like a pool per
Post by Samofatov, Nickolay
The single and most important thing about pools is that when you
allocate from particular pool you BIND LIFETIME of memory block or
object to the object which owns the pool.
This was, in fact, once the case.
Paulo Gaspar
2005-02-18 02:25:19 UTC
Post by Jim Starkey
Post by Paulo Gaspar
Let me tell you one thing, your repetitive references to the word MODERN
(also in other posts) impress me as little as Jim's obscure references to
arcane systems belonging to the realms of archeology. Those are NOT
technical arguments. Saying something is MODERN or that something else
is an ANCIENT proven practice has no technical value for me.
Jim, let's remember that my answer was not to your post but to the
"MODERN" post, ok?

To emphasize that "MODERN" is no technical argument I also said that
"ANCIENT" is no argument either.

This does not mean that I dislike the historical references. Actually, I
quite like to follow those parts of your posts. I just mean that they
are not technical arguments and I sustain that.
Post by Jim Starkey
They are indications that something has been learned from
experience. We don't exactly start from the beginning on every
question, you know.

Of course. And when you say: "I already tried this and it does not work
because then X and Y happen" it IS a valid technical argument.

(Obviously, if you just say "I tried it and I don't like it", that
does not stick, does it?)
Post by Jim Starkey
I bring up "arcane systems" because these were the systems that drove
the current architecture. I've tried to explain that decisions I made
twenty years ago had basis of the systems of that era, but that that era
has passed.

Yes, that was quite clear in this thread. You were saying: "I know I did
it this way because of context A but that does not apply anymore".
Post by Jim Starkey
I'm sorry if I bore you with the history of Interbase. I believe
that no technical questions are absolute, but can only be considered in
context of the platforms at the time. I think it does make a difference
in how a database management system is implemented if the target system
has 3 MB or 4 GB of memory.

You do not bore me. If you did, I would just skip reading your posts or
its historical bits.

I am the one who must apologize for not making clear that, while I think
that MODERN or OLD are not technical arguments in themselves, I did not want
to attack the sharing of experience.

Experience is clearly a big part of the value that each of us brings to
this group.
Post by Jim Starkey
Many the issues we've been discussing are decisions that I original
made and now want to change. I think that people find mind the history
interesting and useful, but if not, well, the delete button is at your
command.
Post by Jim Starkey
Modern, however, does mean something. Naming conventions, for
example, have changed quite a bit. When I wrote Interbase, most
platforms supported global names of 8 characters or less. And, at the
time, the industry leaders in both mini-computers and workstations
required dollar signs in global names. In today's world, the dollar is
proscribed, not required.

Yeah, but saying "this is good because it is modern" and add no other
value to it like on the post I was replying to just sounds like CRAP.
Post by Jim Starkey
The older practice was name segments delimited by underscores.
Current practice uses capitalization to indicate name segments. Java
has taken naming conventions a step further, and suggests that class
names begin with a capital letter and member names begin with a lower case
letter. I found this convention useful, and have adopted it. I believe
there is merit to understanding both the reasons for transition from a
namespace restricted by an 8 character length and contemporary usage.
If you wish to believe that experience and history has no bearing on
contemporary decisions, that is your right.

Jim, believe me, I am not that young and I don't believe you (or I) have
stopped learning and adapting just because of age.

BTW, I think the rest of your post was really juicy. IMO it puts the
right level of detail supporting your previous arguments.

Best regards,
Paulo Gaspar
Jim Starkey
2005-02-18 07:36:19 UTC
Permalink
All that said, the current numbers for a small block (usually < 4K)
allocation and release is around 600 picoseconds on a cheap P4, not
something to lose a lot of sleep over.
600 picoseconds? I wish. Please make that 600 *nanoseconds*, which
still isn't that shabby.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Paulo Gaspar
2005-02-18 07:45:26 UTC
Permalink
Hi Nickolay,


First I will rant a bit, but rest assured that I will then get constructive.

(The rant:)

Let me tell you one thing, your repetitive references to the word MODERN
(also in other posts) impress me as little as Jim's obscure references to
arcane systems belonging to the realms of archeology. Those are NOT
technical arguments. Saying something is MODERN or that something else
is an ANCIENT proven practice has no technical value for me.

I mention this because I suppose many others might be feeling just the same
frustration about this kind of argument.

Also, I couldn't care less for what Apache, PostgreSQL, Oracle or any
other "serious server system" do, UNLESS you explain exactly WHY they
do it - or point to adequate documentation doing so.

(Getting constructive:)

There is only ONE real advantage I see in memory pools as they are
applied in FB: they are lifecycle related. If you have one memory pool
per each cycle of allocation/deallocation of objects (like a pool per
Post by Samofatov, Nickolay
The single and most important thing about pools is that when you
allocate from particular pool you BIND LIFETIME of memory block or
object to the object which owns the pool.
BUT then you do not mention memory fragmentation. Keeping memory
fragmentation low might be especially useful if you need to allocate
large blocks of memory.

Now, I will directly jump to the advantages you point out for memory pool
usage, since all the initial talk about Apache's model essentially adds
nothing
Post by Samofatov, Nickolay
...
1. Monitoring. Looking at pool sizes for various objects DBA can say
which god damned request consumed all memory of the server and kill the
offending session/request/transaction.
It is easy to provide alternative mechanisms for monitoring, simply by
keeping counters per thread.
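
A minimal sketch of what per-thread counters could look like, assuming a small
size header in front of every block and assuming each block is released on the
thread that allocated it. The names (ThreadMemStats, countedAlloc, countedFree)
are hypothetical and do not come from Firebird or Vulcan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical per-thread accounting record.
struct ThreadMemStats {
    std::size_t bytesInUse = 0;
    std::size_t peakBytes = 0;
};

// One independent set of counters per thread.
inline ThreadMemStats& threadStats() {
    thread_local ThreadMemStats stats;
    return stats;
}

// Header stored in front of each block so release can find the size.
// The alignment keeps the user memory that follows suitably aligned.
struct alignas(std::max_align_t) BlockHeader {
    std::size_t size;
};

inline void* countedAlloc(std::size_t size) {
    void* raw = std::malloc(sizeof(BlockHeader) + size);
    if (!raw) return nullptr;
    BlockHeader* header = new (raw) BlockHeader{size};
    ThreadMemStats& stats = threadStats();
    stats.bytesInUse += size;
    if (stats.bytesInUse > stats.peakBytes) stats.peakBytes = stats.bytesInUse;
    return header + 1;  // user memory starts right after the header
}

// Assumption: called on the same thread that allocated the block.
inline void countedFree(void* p) {
    if (!p) return;
    BlockHeader* header = static_cast<BlockHeader*>(p) - 1;
    ThreadMemStats& stats = threadStats();
    assert(stats.bytesInUse >= header->size);
    stats.bytesInUse -= header->size;
    std::free(header);
}
```

A monitoring tool could then read threadStats() for the thread serving a
request to see how much memory that request currently holds.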

On the killing side, is it not problematic to assume you will free the
whole block without calling destructors? You are then missing the "++"
advantage, especially when managing objects that might allocate other
resources besides memory.
Post by Samofatov, Nickolay
2. Memory leaks debugging. To debug server which consumed too much
memory on the customer site we take the snapshot of process, look up
which pool is too big and knowing functional purpose of the pool and
having pool dump information memory leak location is usually obvious
immediately, especially if line number information for allocations is
stored.
This is just the same as 1., isn't it?
Post by Samofatov, Nickolay
3. Failure isolation. Shit happens. Sometimes people forget to call
destructors and deallocate memory blocks. This is not normal situation,
but again, shit happens. And server needs to continue working without
consuming all memory in the world even if some bad things happen. Or
stop immediately producing developer report after memory leak is
detected encouraging developers to fix the leak.
Shit must NOT happen in production software, and when it cannot be avoided,
it must be handled gracefully.

Actually, having a policy of releasing memory in blocks encourages shit
to happen, since programmers will be tempted to rely on the final block
release.
Post by Samofatov, Nickolay
For 99% percent of allocations delete-by-pool does not impose any
problems. As long object doesn't own external things like file handles
or synchronization objects it may disappear magically in bulk.
This comes from the fact that if somebody stores pointer to some object
it should know very well the lifetime policy for object it points to.
Firebird strings and container classes are safe to delete by pool.
And if object points to external thing and is deleted by pool you can
usually notice the loss from OS monitoring tools. Sometimes, if you
store millions of small structures, such as data records in savepoint
backout journal it is faster to delete them by pool, not one-by-one.
This smells. Does this mean that I must think, object by object, about where
I am going to place it? (Does it have file handles? Does it have references
to other resources? etc.)

Having "millions of small structures" and just releasing them in a block just
to save the time of deallocating them one by one???

Do you know what destructors are? Do you know how much safer your
programming becomes when each object cleans up its own mess? Do
you know you are throwing that away?

You also throw away code reuse, because all your classes must really
be specific to this situation. You also make life harder to any newcomer,
since this is so non-standard.

Finally, you don't gain that much because you don't have so many
millions of objects, CPUs are really fast, and since you have to read and
write things to disk, all this deallocation crap takes only some hundredths
of one percent of the time spent on a given transaction.

Bigger gain I see: less memory fragmentation for large memory blocks.

Other ways to deal with it: allocate LARGE memory blocks (buffers,
pages, etc.) to per-lifecycle memory allocation zones and customize the
default allocation - which will be used by the other "smaller" objects -
to a separated smaller-object memory zone.

Anyway, to further discredit the importance of some custom memory
allocation practices, there is an interesting paper at:
http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf

On the side of USING custom allocation:

http://www.camtp.uni-mb.si/books/Thinking-in-C++/TIC2Vone-distribution/html/Chapter13.html

In the above text, search for "Overloading new & delete",
"Overloading global new & delete", "Overloading new & delete for a class"
and ESPECIALLY "placement new & delete".

It shows (you probably know this, but can be interesting for the pure
C guys) how it is possible to build custom allocators for C++. You
can even have special allocator operators per class and operators that
accept "placement", like in:
ClassX* xp = new(memZone) ClassX;
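
A minimal sketch of how the line above can be made to compile, assuming a
simple fixed-size bump allocator as the "zone". MemZone and its members are
hypothetical names chosen for illustration; only the new(memZone) ClassX form
comes from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size zone that hands out memory front to back.
class MemZone {
public:
    void* allocate(std::size_t size) {
        assert(size > 0);
        // Round up so every block stays suitably aligned.
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;
        if (used_ + size > sizeof(buffer_)) throw std::bad_alloc();
        void* p = buffer_ + used_;
        used_ += size;
        return p;
    }
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[4096];
    std::size_t used_ = 0;
};

class ClassX {
public:
    explicit ClassX(int v = 0) : value(v) {}
    int value;

    // Class-specific placement form, selected by: new(memZone) ClassX
    static void* operator new(std::size_t size, MemZone& zone) {
        return zone.allocate(size);
    }
    // Matching placement delete, used only if the constructor throws.
    static void operator delete(void*, MemZone&) noexcept {}
    // Ordinary delete: the destructor still runs, but the zone keeps the
    // storage, so there is nothing to give back here.
    static void operator delete(void*) noexcept {}
};
```

With these declarations, "MemZone memZone; ClassX* xp = new(memZone) ClassX;"
takes its storage from the zone, and "delete xp;" still runs the destructor.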

You can also find this excellent book ("Thinking in C++" by Bruce Eckel)
for download here:
http://www.mindview.net/Books/DownloadSites/
(Use the "Master Download Site")

Regards,
Paulo Gaspar
Jim Starkey
2005-02-18 09:26:31 UTC
Permalink
Post by Samofatov, Nickolay
Hi, All!
Let me summarize logic behind use of functional memory pools in modern
server systems.
Apache, PostgreSQL, Oracle and basically every serious server system
uses them and I can explain why.
I explained this more than once, but can write it now here the last
time.
Uh, Nickolay, might it be because Apache, PostgreSQL, and Oracle are
all written in C? And that if/when they cut over to C++ for the same
reasons that Firebird has, they will learn about object technology, and
change their methodology?
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Thomas Miller
2005-02-18 10:09:23 UTC
Permalink
Post by Jim Starkey
Post by Samofatov, Nickolay
Hi, All!
Let me summarize logic behind use of functional memory pools in modern
server systems. Apache, PostgreSQL, Oracle and basically every
serious server system
uses them and I can explain why.
I explained this more than once, but can write it now here the last
time.
Uh, Nickolay, might it be because Apache, PostgreSQL, and Oracle are
all written in C? And that if/when they cut over to C++ for the same
reasons that Firebird has, they will learn about object technology,
and change their methodology?
We used to be an Oracle partner, and Oracle always made a big thing about
their memory pools reducing memory thrashing (reallocating memory and
defragmentation). Same with IO stuff. There was never any mention of memory
leak management. Maybe Oracle does use pools to help with memory leaks,
but their main reason for using pools is to minimize memory thrashing.

One thing is for sure: the longer I program, the more I learn. Jim's
review of history is good for looking at the way it was done, and we now
need to look at the way it should be done from lessons learned over the
years. In 10 years' time, we will be having a discussion of FB7 and how
stupid we were to do the memory management the way we did for FB3. But that
is a discussion for many years down the road. Today, we need to use today's
best practices.
Paulo Gaspar
2005-02-18 14:39:10 UTC
Permalink
Wait, wait, wait....

Let's avoid further misunderstandings. Jim said in his previous post of
Post by Jim Starkey
Nobody has suggested that pools be eliminated. The argument is over
delete-by-pool. Pools are good and useful things as long as they respect
the integrity of objects.

So, the question is whether to always call the destructors or to allow
exceptions, right???

Now I am going to have dinner (just arrived in Lisbon from Amsterdam) and
then I will say something more about the FlyweightPattern:
http://c2.com/cgi/wiki?FlyweightPattern

...which applies exactly to the case where you would have a million
small objects.


Regards,
Paulo Gaspar
Post by Jim Starkey
Post by Samofatov, Nickolay
Hi, All!
Let me summarize logic behind use of functional memory pools in modern
server systems. Apache, PostgreSQL, Oracle and basically every
serious server system
uses them and I can explain why.
I explained this more than once, but can write it now here the last
time.
Uh, Nickolay, might it be because Apache, PostgreSQL, and Oracle are
all written in C? And that if/when they cut over to C++ for the same
reasons that Firebird has, they will learn about object technology,
and change their methodology?
Paulo Gaspar
2005-02-18 16:03:24 UTC
Permalink
OOPS!

The FlyweightPattern is not at all what I meant. Too many years without
reviewing the book! (the Gang of Four's Patterns)

Anyway, what I used for such situations (zillions of small objects) was,
instead of creating zillions of independent objects, to create a container
class implementing the object's logic and holding a simple array (or list,
hashtable, whatever) of structs with the object's properties.

You can then act on each of those objects via its parent container,
either by passing an index or id as an extra parameter to identify the
instance to manipulate:

zillionManager.switchState(myObjectId, CRAZY, VERY);
zillionManager.jumpOfTheBridge(myObjectId);
zillionManager.openParachute(myObjectId, REALY_FAST);

or by using a proxy obtained through the container:

RadicalSportsman& radMan = zillionManager.getProxy(myObjectId);
...
radMan.switchState(CRAZY, VERY);
radMan.jumpOfTheBridge();
radMan.openParachute(REALY_FAST);
...
radMan.release(); // ...'cause I got it from a proxy pool???

You can then have multiple implementations depending on the expected use
cases and performance needs. For instance, if using the proxy method:
 - The proxy might be an object created every time it is needed or taken
   from a pool;
 - The logic might all be in the container, with the proxy just piping
   the calls (efficient with inline methods), or the proxy might work in
   tandem with the container, with the container managing allocation and
   deallocation logic and the proxy implementing the other methods.

Anyway, there is always an option to make this kind of thing efficient,
and the C++ inline qualifier might keep the code quite efficient.
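
A minimal sketch of the container-plus-proxy idea described above, assuming a
std::vector of plain structs as storage and a proxy that is just an id plus a
pointer back to the container. The method names follow the calls shown
earlier in this post; SportsmanData, Mood, create() and peek() are
hypothetical additions, and the proxy is returned by value rather than taken
from a proxy pool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Mood { CALM, CRAZY };

// Plain record: only the per-instance properties, no logic.
struct SportsmanData {
    Mood mood = CALM;
    bool jumped = false;
    bool parachuteOpen = false;
};

class ZillionManager;

// Lightweight proxy: forwards every call to the container.
class RadicalSportsman {
public:
    RadicalSportsman(ZillionManager& owner, std::size_t id)
        : owner_(&owner), id_(id) {}
    void switchState(Mood mood);
    void jumpOfTheBridge();
    void openParachute();

private:
    ZillionManager* owner_;
    std::size_t id_;
};

// The container owns all records and implements the object logic.
class ZillionManager {
public:
    std::size_t create() {
        items_.emplace_back();
        return items_.size() - 1;
    }
    void switchState(std::size_t id, Mood mood) { at(id).mood = mood; }
    void jumpOfTheBridge(std::size_t id) { at(id).jumped = true; }
    void openParachute(std::size_t id) {
        SportsmanData& d = at(id);
        if (d.jumped) d.parachuteOpen = true;  // only opens after a jump
    }
    const SportsmanData& peek(std::size_t id) const { return items_.at(id); }
    RadicalSportsman getProxy(std::size_t id) {
        return RadicalSportsman(*this, id);
    }

private:
    SportsmanData& at(std::size_t id) {
        assert(id < items_.size());
        return items_[id];
    }
    std::vector<SportsmanData> items_;
};

inline void RadicalSportsman::switchState(Mood mood) { owner_->switchState(id_, mood); }
inline void RadicalSportsman::jumpOfTheBridge() { owner_->jumpOfTheBridge(id_); }
inline void RadicalSportsman::openParachute() { owner_->openParachute(id_); }
```

All records live in one contiguous array, so releasing them is a single
vector destruction, and no per-object allocation or deallocation is needed.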

Regards,
Paulo Gaspar

P.S.: Do you know that Eclipse (http://www.eclipse.org/) is starting to
have rather nice support for working in C++??? Nice editing features, code
browsing, etc.
Paulo Gaspar
2005-02-18 22:43:17 UTC
Permalink
Post by Jim Starkey
Post by Samofatov, Nickolay
Hi, All!
Let me summarize logic behind use of functional memory pools in modern
server systems. Apache, PostgreSQL, Oracle and basically every
serious server system
Post by Jim Starkey
Post by Samofatov, Nickolay
uses them and I can explain why.
...
Uh, Nickolay, might it be because Apache, PostgreSQL, and Oracle are
all written in C? And that if/when they cut over to C++ for the same
reasons that Firebird has, they will learn about object technology, and
change their methodology?
I do not really understand WHY you are fighting, because:
- As far as I understand, EVERYBODY (among the most active players)
agrees that destructors must always be called and that pools might be
"good and useful things".

- The only difference I see is about whether or not there should be any
exception to the destructor call rule. Since the case made for such
exceptions is based on performance concerns, I believe it will be easy to
dismiss such exceptions one by one as they pop up, just by presenting a
design which respects the destructor call rule and has low performance
impact.

And now for the quotes that support the fact that everybody basically
Post by Jim Starkey
As I have said, I don't have any problem with alternative memory managers
that disappear along with their allocations. But no object with a
destructor (or that needs a destructor) should be allocated from one.
...
Nobody has suggested that pools be eliminated. The argument is over
delete-by-pool. Pools are good and useful things as long as they respect
the integrity of objects.
...
Jim Starkey
a) pools are useful for bugchecks and memory tracing (other possible
benefits are left to Nickolay)
b) destructors must be called
c) exceptions from (b) are discussed one-by-one and implemented after a
performance research
Dmitry
Currently Firebird uses approach 1. I talked with Dmitry Emanov and he
agrees that long-term goal is to move to approach 3. In short, in new
code destructors must always be called and memory deallocated explicitly
unless you really, really understand what you are doing.
...
Nickolay
Regards,
Paulo Gaspar
Jim Starkey
2005-02-19 08:03:23 UTC
Permalink
Post by Paulo Gaspar
- As far as I understand, EVERYBODY (among the most active players)
agrees that destructors must always be called and that pools might be
"good and useful things".
Object lifetime is one of the two or three major strategic questions in
the merge between Firebird 2 and Vulcan. Almost all objects in Firebird
2 are deleted by pool deletion. To make this work, every resource
controlled by an object subject to delete-by-pool must also be pool
aware and allocated from the same pool. This makes writing object code
in Firebird difficult, time consuming, and fragile. It also makes
adoption of non-pool aware classes problematical at best. Vulcan still
has much of this code, but all new classes respect standard object
lifetime conventions, and none are "pool aware".

A question facing the project is whether or not we should accept object
oriented technology into our database implementation. If the answer is
yes, then I believe we must respect the integrity of objects and observe
individual object lifetime controls. This necessarily involves
rejection of the concept of delete-by-pool for purposes other than
exceptional, localized mechanisms.
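
A minimal sketch of the difference between delete-by-pool and individual
object lifetime control, assuming a toy pool that simply remembers its
blocks. Pool, Journal, destroy() and the openHandles counter are hypothetical
and stand in for real Firebird classes and external resources such as file
handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Stand-in for an external resource count (file handles, mutexes, ...).
int openHandles = 0;

struct Journal {
    Journal() { ++openHandles; }   // acquires the "handle"
    ~Journal() { --openHandles; }  // releases it
};

// Hypothetical pool: remembers every block so it can free them in bulk.
class Pool {
public:
    ~Pool() { releaseAll(); }
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        blocks_.push_back(p);
        return p;
    }
    // Return a single block; the caller has already run the destructor.
    void release(void* p) {
        for (std::size_t i = 0; i < blocks_.size(); ++i) {
            if (blocks_[i] == p) {
                blocks_[i] = blocks_.back();
                blocks_.pop_back();
                std::free(p);
                return;
            }
        }
        assert(false && "block does not belong to this pool");
    }
    // Delete-by-pool: the memory goes away, but no destructor is run.
    void releaseAll() {
        for (void* p : blocks_) std::free(p);
        blocks_.clear();
    }
    std::size_t liveBlocks() const { return blocks_.size(); }

private:
    std::vector<void*> blocks_;
};

// Individual lifetime control: run the destructor, then return the memory.
template <typename T>
void destroy(Pool& pool, T* obj) {
    obj->~T();
    pool.release(obj);
}
```

An object released through destroy() gives its handle back, while an object
that is still in the pool when releaseAll() runs loses its memory without its
destructor being called, so its handle stays open.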

So where you and many other developers may say "of course, that's the
way objects work", in fact, that isn't the way Firebird 2 and
predecessors work.

I am arguing that post-merge object integrity be respected, a
pre-condition for adaptation of object technology. I am also arguing
that it is no longer necessary for all Firebird classes to be pool aware
and, in fact, most classes will not be. These questions seep into
almost every line of code in the product. I believe it better that we
resolve this question abstractly than as a debating point in the
general merge debate.

In my mind, the key technical questions around the merge are:

1. Object integrity (also known as delete-by-pool), including
obligatory pool awareness
2. The provider architecture
3. Object structured vs. flat configuration files
4. Thread synchronization primitives
5. The division of work between the Y-valve and the providers
6. Exception handling
Alex Peshkov
2005-02-21 09:14:04 UTC
Permalink
Post by Jim Starkey
Object lifetime is one of the two or three major strategic questions in
the merge between Firebird 2 and Vulcan. Almost all objects in Firebird
2 are deleted by pool deletion. To make this work, every resource
controlled by an object subject to delete-by-pool must also be pool
aware and allocated from the same pool.
If an object was allocated from a pool, allocating all resources
controlled by it from the same pool (or a pool with a lifetime shorter than
the original) is a normal requirement. Only when this condition is satisfied
do pools work normally. I don't mean deletion by pool - that's bad
practice, and it had better be gone as soon as possible (except maybe some
specially discussed exceptions). But pools would not solve the problems of
memory fragmentation, memory usage control and all the other related ones
(for which they were accepted to be useful) if we don't follow this rule.
Moreover - if some resource is controlled by the object, for what reason
should it not be released when that object itself dies?
Post by Jim Starkey
This makes writing object code
in Firebird difficult, time consuming, and fragile.
That's true as long as an object doesn't know from which pool it was
allocated. But the new Firebird objects (like string and the whole family of
dynamic arrays) all know it. It leads to memory losses - sizeof
(pointer) per object - but makes working with pooled memory as easy as
possible. An object allocates all its internals from that pool without any
problems. As a side effect it makes it possible to delete-by-pool such an
object, but for me this is nothing more than a side effect (in 99.9%) - I
leave not more than 0.1% for special cases.
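
A minimal sketch of a pool-aware object of the kind described above, assuming
a toy pool interface that only counts outstanding blocks. MemoryPool and
PoolString are hypothetical names and are not the actual Firebird string or
pool classes; the point is the single stored pool pointer and the destructor
that returns memory to the same pool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical pool that just tracks how many blocks are outstanding.
class MemoryPool {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++outstanding_;
        return p;
    }
    void deallocate(void* p) {
        if (!p) return;
        assert(outstanding_ > 0);
        --outstanding_;
        std::free(p);
    }
    std::size_t outstanding() const { return outstanding_; }

private:
    std::size_t outstanding_ = 0;
};

// Pool-aware string: costs one extra pointer, and in exchange always knows
// where to allocate and release its internals.
class PoolString {
public:
    PoolString(MemoryPool& pool, const char* text)
        : pool_(&pool),
          length_(std::strlen(text)),
          data_(static_cast<char*>(pool.allocate(length_ + 1))) {
        std::memcpy(data_, text, length_ + 1);
    }
    ~PoolString() { pool_->deallocate(data_); }
    PoolString(const PoolString&) = delete;
    PoolString& operator=(const PoolString&) = delete;

    void append(const char* text) {
        const std::size_t extra = std::strlen(text);
        // Growth uses the remembered pool; the caller never names it.
        char* bigger = static_cast<char*>(pool_->allocate(length_ + extra + 1));
        std::memcpy(bigger, data_, length_);
        std::memcpy(bigger + length_, text, extra + 1);
        pool_->deallocate(data_);
        data_ = bigger;
        length_ += extra;
    }
    const char* c_str() const { return data_; }

private:
    MemoryPool* pool_;    // the sizeof(pointer) overhead per object
    std::size_t length_;
    char* data_;
};
```

Because the destructor returns the buffer itself, the object works correctly
whether or not its pool is later deleted in bulk.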
Post by Jim Starkey
It also makes
adoption of non-pool aware classes problematical at best.
Not very hard - make it allocate memory from the same pool from which
it was allocated itself. Exception handling with non-Firebird rules
seems to be a much worse problem when trying to adopt foreign classes.
Post by Jim Starkey
Vulcan still
has much of this code, but all new classes respect standard object
lifetime conventions, and none are "pool aware".
A question facing the project is whether or not we should accept object
oriented technology into our database implementation. If the answer is
yes, then I believe we must respect the integrity of objects and observe
individual object lifetime controls. This necessarily involves
rejection of the concept of delete-by-pool for purposes other than
exceptional, localized mechanisms.
Completely agreed.
Post by Jim Starkey
So where you and many other developers may say "of course, that's the
way objects work", in fact, that isn't the way Firebird 2 and
predecessors work.
I am arguing that post-merge object integrity be respected, a
pre-condition for adaptation of object technology. I am also arguing
that it is no longer necessary for all Firebird classes to be pool aware
and, in fact, most classes will not be.
This is the worst thing we can do. If a non-pool-aware class needs to use
a pool-aware one, how will it determine which pool to use? We will have
to invent a special solution for any such case. And this will really
become a waste of time (like now, when every function which has not
received a tdbb parameter needs to make some bad calls to get it from TLS,
with the same lines of code repeated more, and more, and more ...).
Post by Jim Starkey
These questions seep into
almost every line of code in the product. I believe it better that we
resolve this question abstractly than as a debating point in the
general merge debate.
1. Object integrity (also known as delete-by-pool), including
obligatory pool awareness
I suggest dividing it into 2 parts - delete-by-pool and pool awareness
are two different things. I say "no" to delete-by-pool but "yes" to
pool-awareness.
** ** ** **
BTW - Nickolay does a lot of work fixing memory corruption and memory
leaks. If he says that pools help him to do _real_ things, why should we
break his working environment?
** ** ** **
Post by Jim Starkey
2. The provider architecture
No architectural problems, but one small practical one. Why not link the
default (current) database provider with the y-valve and remote listener
statically? This gives slightly better results on Windows than separate DLLs.
Post by Jim Starkey
3. Object structured vs. flat configuration files
Firebird's configuration system is a toy compared with Vulcan's...

Sorry, not ready to comment on the rest now.

Alex.
Jim Starkey
2005-02-21 10:11:02 UTC
Permalink
Post by Alex Peshkov
If an object was allocated from a pool, allocating all resources
controlled by it from the same pool (or a pool with a lifetime shorter
than the original) is a normal requirement. Only when this condition is
satisfied do pools work normally. I don't mean deletion by pool - that's
bad practice, and it had better be gone as soon as possible (except
maybe some specially discussed exceptions). But pools would not
solve the problems of memory fragmentation, memory usage control and all
the other related ones (for which they were accepted to be useful) if we
don't follow this rule.
Moreover - if some resource is controlled by the object, for what
reason should it not be released when that object itself dies?
If we are agreed that we can not tolerate delete-by-pool, then the
primary issue is settled.

The secondary issue is whether the engineering costs of pool awareness are
justified by an offsetting gain. However, since pool affinity is no
longer required for correct operation of the server, this can be
considered on a case by case basis.

All shared objects are allocated from the permanent pool, which is also
the pool used by the "new" operator. There is no need for these
objects to be pool aware since, by definition, they are shared, and
anything they reference is shared, too.

In Vulcan, compiled statements are designed to be shared as well. At the
moment, a CStatement (compiled statement) has its own pool, as do
private compiled statements in Firebird. I don't know how to charge an
attachment for the memory cost of a shared compiled statement. Do we
charge only the unlucky guy who gets there first, and give everyone else
a free ride? Do we charge everyone who uses a shared statement the full
cost? What about internal compiled statements?

The Firebird runtime is designed to run without additional memory
allocation (or "was" designed that way), so it isn't much of an issue.
So in the final analysis, everything but the per-instance statement
"impure area" is shared.

The memory fragmentation issue needs a fresh look. The Vulcan memory
manager handles small blocks and large blocks differently. Small blocks
are allocated from a sub-pool and are re-used. Large blocks are
allocated from a different sub-pool and recombined on release.
All objects and metadata strings qualify as small objects and can
be allocated and released without memory fragmentation. Large
objects, on the other hand, come in two flavors, persistent and
transient. Persistent large objects tend to stay around until server
shutdown. Transient large objects tend to be allocated in a runtime
context, used, and released. Intermixed persistent and transient large
objects will impede recombination and result in memory fragmentation.
The solution is a number of "permanent" pools, for example, one pool for
allocating record buffers and a second for everything else. There may
be a need for a third or fourth, but I haven't seen it.
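
A minimal sketch of the small-block half of such a scheme, assuming blocks
are rounded up to a fixed granule and that released blocks are kept on a free
list per size class for re-use. This is not the Vulcan memory manager: the
class name, the 16-byte granule, the 4096-byte small/large boundary and the
need to pass the size to release() are all assumptions, and the large-block
path with recombination is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

class SmallBlockPool {
public:
    static constexpr std::size_t kGranule = 16;       // assumed rounding unit
    static constexpr std::size_t kSmallLimit = 4096;  // assumed boundary

    ~SmallBlockPool() {
        for (auto& list : free_)
            for (void* p : list) std::free(p);
    }

    void* allocate(std::size_t size) {
        if (size == 0) size = 1;
        // Large blocks would go to a separate sub-pool; not modelled here.
        if (size > kSmallLimit) return std::malloc(size);
        std::vector<void*>& list = free_[slot(size)];
        if (!list.empty()) {
            // Re-use a released block of the same size class.
            void* p = list.back();
            list.pop_back();
            ++reused_;
            return p;
        }
        return std::malloc(slot(size) * kGranule);
    }

    // Simplification: the caller passes the size it asked for.
    void release(void* p, std::size_t size) {
        if (!p) return;
        if (size == 0) size = 1;
        if (size > kSmallLimit) {
            std::free(p);
            return;
        }
        assert(slot(size) <= kSmallLimit / kGranule);
        free_[slot(size)].push_back(p);
    }

    std::size_t reused() const { return reused_; }

private:
    // Size class index: number of granules needed for the request.
    static std::size_t slot(std::size_t size) {
        return (size + kGranule - 1) / kGranule;
    }
    std::vector<void*> free_[kSmallLimit / kGranule + 1];
    std::size_t reused_ = 0;
};
```

Because every block in a size class has the same rounded size, a released
small block always fits the next request of that class, which is why the
small-block side does not fragment.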

The issue of delete-by-pool is life or death for objects within
Firebird. Pool awareness is just an engineering tradeoff.
Post by Alex Peshkov
Post by Jim Starkey
I am arguing that post-merge object integrity be respected, a
pre-condition for adaptation of object technology. I am also arguing
that it is no longer necessary for all Firebird classes to be pool
aware and, in fact, most classes will not be.
This is the worst thing we can do. If non-pool-aware class needs to
use pool-aware one, how will it determine, which pool to use. We will
have to invent a special solution for any such case. And this will
really become waste of time (like now when every function which had
not received tdbb parameter needs to do some bad calls to get it from
TLS, with the same lines of code repeated more, and more, and more ...).
Oh, no. We can do things that are much worse. There is no reason that
any object *has* to allocate other objects from a particular pool, just
that there *may* be some advantages to doing so under certain circumstances
that can be considered on a case by case basis.
Post by Alex Peshkov
BTW - Nickolay does a lot of work fixing memory corruption and memory
leaks. If he says that pools help him to do _real_ things, why should
we break his working environment?
So did I. I wrote the Vulcan memory manager because the Firebird memory
manager didn't catch multiple releases, bad releases, or corruption.

If Nickolay has an argument to make, let's let him make it. So far he's
chosen to sit this one out, as is his right. If he has something to
say, he'll say it.
Post by Alex Peshkov
Post by Jim Starkey
2. The provider architecture
No architectural problems, but one small practical one. Why not link the
default (current) database provider with the y-valve and remote listener
statically? This gives slightly better results on Windows than separate
DLLs.
Because there will be more than one engine and possibly more than one
remote. If Firebird 3 has one engine and Firebird 4 has another (an
absolute certainty!) and the engines are released as separate shared
libraries, the installation of Firebird 4 won't have any impact on
applications running on Firebird 3. If we bundle the engine with the
Y-valve, however, replacing the Y-valve eliminates the old engine.

A provider exports exactly one symbol, an entrypoint that returns a
vector of pointers to the library's "SubSystem" objects. This means, if
you wish, that the remote interface and engine could share a library.
It also means that if you do so, the configuration system can't
differentiate them by name. It also means that the cost of activating
an engine provider is essentially the same as the delta cost in
activating a combined Y-valve/engine library.
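
A minimal sketch of the single-entrypoint idea, assuming a null-terminated
table of subsystem pointers. The real Vulcan SubSystem interface and symbol
name are not shown in this thread, so the SubSystem members, getSubSystems()
and findSubSystem() below are hypothetical, and loading the library (dlopen
or LoadLibrary) is left out.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical interface every subsystem in a provider library implements.
class SubSystem {
public:
    virtual ~SubSystem() {}
    virtual const char* name() const = 0;
};

class EngineSubSystem : public SubSystem {
public:
    const char* name() const override { return "engine"; }
};

class RemoteSubSystem : public SubSystem {
public:
    const char* name() const override { return "remote"; }
};

// The one exported symbol: a null-terminated vector of subsystem pointers.
// In this sketch one library carries both the engine and the remote.
extern "C" SubSystem** getSubSystems() {
    static EngineSubSystem engine;
    static RemoteSubSystem remote;
    static SubSystem* table[] = { &engine, &remote, nullptr };
    return table;
}

// What a caller such as the Y-valve might do with the returned vector.
inline SubSystem* findSubSystem(SubSystem** table, const char* wanted) {
    assert(table != nullptr && wanted != nullptr);
    for (SubSystem** p = table; *p != nullptr; ++p) {
        if (std::strcmp((*p)->name(), wanted) == 0) return *p;
    }
    return nullptr;
}
```

Since only the entrypoint is looked up by symbol name, adding or removing a
subsystem changes the table contents but not the library's exported
interface.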
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Alex Peshkov
2005-02-24 00:22:32 UTC
Permalink
Post by Jim Starkey
If we are agreed that we can not tolerate delete-by-pool, then the
primary issue is settled.
The secondary issue is whether the engineering costs of pool awareness are
justified by an offsetting gain. However, since pool affinity is no
longer required for correct operation of the server, this can be
considered on a case by case basis.
But it would be useful to have some default rule - or we will have to
discuss each "new" in the program.
Post by Jim Starkey
All shared objects are allocated from the permanent pool, which is also
the pool used by the "new" operator. There is no need for these
objects to be pool aware since, by definition, they are shared, and
anything they reference is shared, too.
In Vulcan, compiled statements are designed to be shared as well.
That's strange - they should be specific for Database object. How can 2
different databases share statements?
Post by Jim Starkey
At the
moment, a CStatement (compiled statement) has its own pool, as do
private compiled statements in Firebird. I don't know how to charge an
attachment for the memory cost of a shared compiled statement. Do we
charge only the unlucky guy who gets there first, and give everyone else
a free ride? Do we charge everyone who uses a shared statement the full
cost? What about internal compiled statements?
As soon as we start to place CStatement into the Database's pool, all these
problems should go away naturally - the database pool is a common
pool and no user should be charged for it.
Post by Jim Starkey
The Firebird runtime is designed to run without additional memory
allocation (or "was" designed that way), so it isn't much of an issue.
So in the final analysis, everything but the per-instance statement
"impure area" is shared.
I'm afraid it "was". At least in the case of deep trigger recursion, dynamic
allocation will happen - I've not found another way to effectively solve the
problem of two (or more) updates of the same record in such a recursion.

The user should be charged for memory in the Attachment, Transaction and
Impure pools. Any of these pools may be used for dynamic memory allocation
at runtime, depending on which data structure the reference to it should be
stored in.
Post by Jim Starkey
The memory fragmentation issue needs a fresh look. The Vulcan memory
manager handles small blocks and large blocks differently. Small blocks
are allocated from a sub-pool and are re-used. Large blocks are
allocated from a different sub-pool and recombined on release.
All objects and metadata strings qualify as small objects and can
be allocated and released without memory fragmentation. Large
objects, on the other hand, come in two flavors, persistent and
transient. Persistent large objects tend to stay around until server
shutdown. Transient large objects tend to be allocated in a runtime
context, used, and released. Intermixed persistent and transient large
objects will impede recombination and result in memory fragmentation.
The solution is a number of "permanent" pools, for example, one pool for
allocating record buffers and a second for everything else. There may
be a need for a third or fourth, but I haven't seen it.
The issue of delete-by-pool is life or death for objects within
Firebird. Pool awareness is just an engineering tradeoff.
Post by Alex Peshkov
Post by Jim Starkey
I am arguing that post-merge object integrity be respected, a
pre-condition for adaptation of object technology. I am also arguing
that it is no longer necessary for all Firebird classes to be pool
aware and, in fact, most classes will not be.
This is the worst thing we can do. If a non-pool-aware class needs to
use a pool-aware one, how will it determine which pool to use? We will
have to invent a special solution for each such case. And this will
really become a waste of time (like now, when every function which has
not received a tdbb parameter needs to do some bad calls to get it from
TLS, with the same lines of code repeated more, and more, and more ...).
Oh, no. We can do things that are much worse. There is no reason that
any object *has* to allocate other objects from a particular pool, just
there *may* be some advantages to doing so under certain circumstances
that can be considered on a case by case basis.
I didn't want to say that an object must always allocate data from the
same pool from which it was allocated itself. I meant that if there are
no special (considered on a case by case basis) requirements for memory
allocation by the object, this is the simplest way for it to find a place
for its internals.
The wrong solution, which I was talking about, is to mix pool-aware and
non-pool-aware objects in the same design. When a non-pool-aware object
sits in between and needs to create a pool-aware object, it has to make a
decision - what pool to use? Certainly, in some special cases, this
decision may be made naturally (place a record buffer in a pool of record
buffers, for example), but in general it will mean some special
procedure to determine the correct pool. I suggest having a standard
for such a procedure - every object knows the pool from which it was
allocated (default 'new' also allocates memory from some pool, doesn't
it?) and when we have no special requirements (like a record buffer) all
internal allocations are done from that pool.
Post by Jim Starkey
Post by Alex Peshkov
Post by Jim Starkey
2. The provider architecture
No architectural problems, but one small practical one. Why not link the
default (current) database provider with the y-valve and remote listener
statically? This gives slightly better results on Windows than separate
DLLs.
Because there will be more than one engine and possibly more than one
remote. If Firebird 3 has one engine and Firebird 4 has another (an
absolute certainty!) and the engines are released as separate shared
libraries, the installation of Firebird 4 won't have any impact on
applications running on Firebird 3. If we bundle the engine with the
Y-valve, however, replacing the Y-valve eliminates the old engine.
A provider exports exactly one symbol, an entrypoint that returns a
vector of pointers to the library's "SubSystem" objects. This means, if
you wish, that the remote interface and engine could share a library.
It also means that if you do so, the configuration system can't
differentiate them by name. It also means that the cost of activating
an engine provider is essentially the same as the delta cost in
activating a combined Y-valve/engine library.
I didn't suggest building the current provider *only* as a static
executable. As far as I understand Vulcan, it can run many processes,
each of them multithreaded, accessing the same database. But why should
all these processes use the same library? A dynamic library is mostly
effective when we need a kind of embedded engine (though nothing prevents
us from having not only the dll/so form of the library but also lib/a).
The small inefficiencies of a dll compared with an exe on Windows are
nothing relative to having the engine in the user address space. For
remote connections those inefficiencies are not compensated by anything.
I don't suggest breaking OSRI - just a way to build the server with the
current provider so that it gets the most out of the host OS. I think
this will not be needed for *nix systems; this is just BG's wonderful OS
problem.
Jim Starkey
2005-02-24 04:40:02 UTC
Permalink
Post by Alex Peshkov
Post by Jim Starkey
The secondary issue is whether the engineering costs of pool awareness
are justified by an offsetting gain. However, since pool affinity is
no longer required for correct operation of the server, this can be
considered on a case by case basis.
But it would be useful to have some default rule - or we will have to
discuss each "new" in the program.
The default is that a class is not pool aware. If you think it should
be, be prepared to make a case for it and/or extend someone else's
implementation.

I think the case for pools has been greatly overblown, but it's
something we need to discuss in context. Iff there's a benefit, we
should do it.
Post by Alex Peshkov
Post by Jim Starkey
In Vulcan, compiled statements are designed to be shared as well.
That's strange - they should be specific for Database object. How can
2 different databases share statements?
Yes, compiled statements are database specific. But they can be shared
across attachments if security checks are made and any session specific
name resolution can be confirmed consistent (Firebird doesn't need this,
yet).
Post by Alex Peshkov
As soon as we start to place CStatement into the Database's pool, all these
problems should go away naturally - the database pool is a common
pool and no user should be charged for it.
I see no reason to put CStatement into a database specific pool. Do
you? If so, what is it?
Post by Alex Peshkov
I'm afraid it "was". At least in case of deep triggers recursion
dynamic allocation will happen - I've not found another way to
effectively solve problem with twice (or more) updates of the same
record in such recursion.
Keep looking. It's possible that this is the first instance in over 20
years that requires dynamic allocation, but it isn't the first time that
a developer thought it was the case.
Post by Alex Peshkov
Post by Jim Starkey
Oh, no. We can do things that are much worse. There is no reason
that any object *has* to allocate other objects from a particular
pool, just there *may* be some advantages to doing so under certain
circumstances that can be considered on a case by case basis.
I didn't want to say that the object must always allocate data from
that same pool from which it was allocated itself. I meant that if
there are no special (considered on a case by case basis) requirements
for memory allocation by the object this is the simplest way for it to
find a place for placing his internals.
It is clearly not the simplest. In most cases, it is the most complex
and difficult to implement. A strong case *must* be made to justify the
development and runtime cost.
Post by Alex Peshkov
The wrong solution, which I was talking about, is to mix pool-aware
and non-pool-aware objects in the same design. When non-pool-aware
object is inserted between needs to create pool-aware object, it has
to make a decision - waht pool to use? Certainly, in some special
cases, this decision may be done naturally (place record buffer in a
pool of record buffers, for example), but in general it will mean some
special procedure to determine the correct pool. I suggest to have a
standard for such procedure - every object knows the pool from which
it was allocated (default 'new' also allocates memory from some pool,
is not it?) and when we have no special requirements (like record
buffer) all internal allocations are done from that pool.
My guess is that less than 5% of the object allocations have any benefit
from special pools. Convince me that I'm wrong.
Post by Alex Peshkov
Post by Jim Starkey
Post by Jim Starkey
2. The provider architecture
I didn't suggest to build current provider *only* as static
executable. As far as I understand vulcan, it can run many processes
each being multithreaded accessing the same database. But why all this
processes should use the same library? Use of dynamic library is
mostly effective when we need a kind of embedded engine (though
nothing prevents from having not only dll/so form of library, also
lib/a). Small inefficiencies of dll compared with exe in windows are
nothing relative to having engine in user address space. For remote
connections that inefficiencies are not compensated by something.
I don't follow. An engine provider is loaded only if a database
resolves through the configuration file systems to the engine provider.
Unless a user alters the default, pre-defined configuration, a client
process is going to see only the remote and possibly gateway providers.
Post by Alex Peshkov
I don't suggest to break OSRI - just a way to build server with
current provider to make it take most of host OS. I think this will
not be needed for *nix*s, this is just BG's wonderful OS problem.
A client program links to the Y-valve. The Y-valve resolves the given
database name string through the configuration file system, which gives
it a set of providers. The Y-valve loads these providers in the given
order. The first one to load successfully and return success from an
attachment is the designated winner. It really doesn't have anything to
do with the operating system or even Bill Gates.
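The resolution order described above could be sketched roughly as follows. This is only an illustration, not Vulcan code: `Provider`, `Config` and `resolve` are hypothetical names, and a provider that fails to load is modelled simply as one missing from the `loaded` map.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a provider: a name plus an attach function
// that reports whether the provider accepted the database name.
struct Provider {
    std::string name;
    std::function<bool(const std::string&)> attach;
};

// Hypothetical configuration: database name -> provider names, in the
// order in which they should be tried.
typedef std::map<std::string, std::vector<std::string> > Config;

// Try each configured provider in order; the first one that is loaded and
// attaches successfully wins. Returns an empty string if none accepts.
std::string resolve(const Config& config,
                    const std::map<std::string, Provider>& loaded,
                    const std::string& database) {
    Config::const_iterator entry = config.find(database);
    if (entry == config.end()) return std::string();
    for (const std::string& providerName : entry->second) {
        std::map<std::string, Provider>::const_iterator p = loaded.find(providerName);
        if (p == loaded.end()) continue;  // provider failed to load, try the next
        if (p->second.attach(database)) return providerName;
    }
    return std::string();
}
```

With a configuration that lists the remote provider first and the engine second, a name the remote provider rejects simply falls through to the engine.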
Alex Peshkov
2005-02-25 09:11:38 UTC
Permalink
Post by Jim Starkey
Post by Alex Peshkov
Post by Jim Starkey
The secondary issue is whether the engineering costs of pool awareness
are justified by an offsetting gain. However, since pool affinity is
no longer required for correct operation of the server, this can be
considered on a case by case basis.
But it would be useful to have some default rule - or we will have to
discuss each "new" in the program.
The default is that a class is not pool aware. If you think it should
be, be prepared to make a case for it and/or extend someone else's
implementation.
I think the case for pools has been greatly overblown, but it's
something we need to discuss in context. Iff there's a benefit, we
should do it.
I think the only thing we are discussing is whether a class should be
pool-aware by default or not. We have agreed that pools are needed to
reduce fragmentation, to monitor memory use per Attachment (and maybe
kill an Attachment which overuses memory, in order to save the server as
a whole), and to be able to monitor memory leaks at pool level. I think
that you overestimate the engineering effort required to make objects
pool-aware. As soon as we have default rules for memory allocation in a
pool-aware object, this effort becomes not too big (in other words - very
small).
Post by Jim Starkey
Post by Alex Peshkov
Post by Jim Starkey
In Vulcan, compiled statements are designed to be shared as well.
That's strange - they should be specific for Database object. How can
2 different databases share statements?
Yes, compiled statements are database specific. But they can be shared
across attachments if security checks are made and any session specific
name resolution can be confirmed consistent (Firebird doesn't need this,
yet).
Post by Alex Peshkov
As soon as we start to place CStatement into Database's pool, all this
problems should be gone in the natural way - database pool is common
pool and no user should be charged for it.
I see no reason to put CStatement into a database specific pool. Do
you? If so, what is it?
If we talk about a cache, then we must be ready to remove old CStatements
according to some criteria. Small memory blocks (there are a lot of small
blocks in a CStatement, aren't there?), freed when a CStatement is deleted,
will be reused in the database-specific pool. As far as I remember your
previous letters, reuse of small blocks by the Vulcan memory manager is very
efficient, even more efficient than cutting them again from an empty pool.
Post by Jim Starkey
Post by Alex Peshkov
I'm afraid it "was". At least in case of deep triggers recursion
dynamic allocation will happen - I've not found another way to
effectively solve problem with twice (or more) updates of the same
record in such recursion.
Keep looking. It's possible that this is the first instance in over 20
years that requires dynamic allocation, but it isn't the first time that
a developer thought it was the case.
I will ask you a question in a separate thread.
Post by Jim Starkey
Post by Alex Peshkov
I didn't want to say that the object must always allocate data from
that same pool from which it was allocated itself. I meant that if
there are no special (considered on a case by case basis) requirements
for memory allocation by the object this is the simplest way for it to
find a place for placing his internals.
It is clearly not the simplest. In most cases, it is the most complex
and difficult to implement. A strong case *must* be made to justify the
development and runtime cost.
It is implemented in the firebird2 class library and I see absolutely no
problems using it. All objects are derived from class PermanentStorage,
and the only thing it does is keep and provide information about
the pool from which memory for the object was allocated. All the
required effort is:
1. Write in object constructor:
Object(MemoryPool &p, ...) : PermanentStorage(p), ....
2. When allocating memory in any method of pool aware object, write:
Object* o = FB_NEW(getPool()) Object(getPool(), ...);
Even if it helps only to monitor memory usage by Attachment, reduce
fragmentation, monitor leaks and help in debugging, that seems not
too high a price to pay.
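To make the two steps concrete, here is a minimal self-contained sketch of the pattern. It is not the firebird2 implementation: the `MemoryPool` below is a hypothetical malloc-backed stand-in that only counts outstanding blocks, plain placement syntax `new (pool)` stands in for the `FB_NEW` macro, and `Node` and `Leaf` are made-up example classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical pool: forwards to malloc/free but counts outstanding blocks,
// which is enough to show per-pool accounting.
class MemoryPool {
public:
    void* allocate(std::size_t size) {
        // Prefix each block with its owning pool so release needs no lookup.
        void* raw = std::malloc(sizeof(Header) + size);
        if (!raw) throw std::bad_alloc();
        Header* h = static_cast<Header*>(raw);
        h->pool = this;
        ++blocks_;
        return h + 1;
    }
    static void release(void* p) {
        if (!p) return;
        Header* h = static_cast<Header*>(p) - 1;
        --h->pool->blocks_;
        std::free(h);
    }
    std::size_t blocksInUse() const { return blocks_; }
private:
    union Header { MemoryPool* pool; std::max_align_t align; };
    std::size_t blocks_ = 0;
};

// Base class that remembers the pool an object was allocated from.
class PermanentStorage {
public:
    explicit PermanentStorage(MemoryPool& p) : pool_(p) {}
    virtual ~PermanentStorage() {}
    MemoryPool& getPool() const { return pool_; }
    // Pool-aware allocation: new (pool) T(pool, ...).
    static void* operator new(std::size_t size, MemoryPool& p) { return p.allocate(size); }
    // Matching placement delete, used only if a constructor throws.
    static void operator delete(void* mem, MemoryPool&) { MemoryPool::release(mem); }
    // Ordinary delete returns the block to its owning pool.
    static void operator delete(void* mem) { MemoryPool::release(mem); }
private:
    MemoryPool& pool_;
};

class Leaf : public PermanentStorage {
public:
    Leaf(MemoryPool& p, int v) : PermanentStorage(p), value(v) {}
    int value;
};

class Node : public PermanentStorage {
public:
    explicit Node(MemoryPool& p) : PermanentStorage(p), child(nullptr) {}
    ~Node() { delete child; }
    // Default rule: internal allocations come from the object's own pool.
    void addChild(int v) {
        delete child;
        child = new (getPool()) Leaf(getPool(), v);
    }
    Leaf* child;
};
```

Because the base class declares only the pool-taking form of operator new, a plain `new Node(...)` does not compile in this sketch, so every allocation has to name a pool.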
Post by Jim Starkey
Post by Alex Peshkov
The wrong solution, which I was talking about, is to mix pool-aware
and non-pool-aware objects in the same design. When non-pool-aware
object is inserted between needs to create pool-aware object, it has
to make a decision - waht pool to use? Certainly, in some special
cases, this decision may be done naturally (place record buffer in a
pool of record buffers, for example), but in general it will mean some
special procedure to determine the correct pool. I suggest to have a
standard for such procedure - every object knows the pool from which
it was allocated (default 'new' also allocates memory from some pool,
is not it?) and when we have no special requirements (like record
buffer) all internal allocations are done from that pool.
My guess is that less than 5% of the object allocations have any benefit
from special pools. Convince me that I'm wrong.
Jim Starkey
2005-02-25 10:10:10 UTC
Permalink
Post by Alex Peshkov
I think that we only discuss, should class be pool-aware by default or
not. We have agreed, that pools are needed to reduce fragmentation, to
monitor use of memory per Attachment (and may be kill Attachment,
which overuses memory in order to save server as whole), to be able to
monitor memory leaks at pool level. I think that you overestimate the
engineering efforts, required to achieve pool awareness of objects. As
soon as we have default rules for memory allocation in pool aware
object, this efforts become not too big (another words - very small).
No, I don't think we are in agreement on these things. Memory
fragmentation is important, but is primarily addressed by the memory
manager. Pools are only needed if the patterns of memory usage defeat
the fragmentation control in the memory manager. I haven't seen any
evidence that this is the case.

Per attachment memory is quite simple. If we don't charge an attachment
for compiled objects, which are shared among attachments, attachment
specific memory is pretty much limited to request impure areas, sort
work space, and savepoint space. Unless I missed something significant,
which is certainly possible, these can be easily managed without
resorting to pools.

It isn't necessary to have task specific pools to do leak detection.
When an engine is shut down, any memory outstanding is a leak. Nothing
additional is required.
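A rough sketch of that idea, assuming a hypothetical `TrackingManager` that records every live block (the real memory manager is not shown in this thread and surely differs): whatever is still recorded at shutdown was never released.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <set>

// Hypothetical memory manager that records every live block.
class TrackingManager {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p) live_.insert(p);
        return p;
    }
    void release(void* p) {
        // Only free blocks this manager actually handed out.
        if (live_.erase(p)) std::free(p);
    }
    // At shutdown, anything still recorded was never released: a leak.
    // Returns the number of leaked blocks and frees them.
    std::size_t reportLeaksAndShutdown() {
        std::size_t leaks = live_.size();
        for (void* p : live_) std::free(p);
        live_.clear();
        return leaks;
    }
private:
    std::set<void*> live_;
};
```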

My objection to pool awareness of general purpose classes is that it
limits the re-usability of those classes. I would like to see us
develop a nice collection of reusable classes. If every class
requires the Firebird memory manager as part of its environment, these
classes aren't going to be very useful.

I believe in simplicity. Classes are simpler, more general, and more
useful if they aren't pool aware. Before we accept the complexity of
pool awareness, I want to be convinced that there is some benefit. At
the moment, I don't see it.
Post by Alex Peshkov
Post by Jim Starkey
I see no reason to put CStatement into a database specific pool. Do
you? If so, what is it?
If we talk about cache, then we must be ready to remove old CStatement
according to some criteria. Small memory blocks (there is a lot of
small blocks in CStatement, is not it?), freed when deleting
CStatement, will be reused in database specific pool. As far as I
remember your previous letters, reusing of small blocks by vulcan
memory manager is very efficient, even more efficient than cutting
them once again from empty pool.
A CStatement could be dropped out of the compiled statement cache by
the cache manager releasing its reference count on the CStatement. When the
last instance is deleted, the CStatement will go away, releasing all of
its friends and relations, including a whole bunch of small-ish execution
nodes. At the moment, CStatement has the same statement pool as
Firebird, and it is released en masse. When we turn the execution nodes
into type hierarchies (statements, expressions, and RSBs), I think we
should get rid of the pool affiliation code.

In the current Vulcan implementation, each provider has an instance of
the memory manager that overloads the operator "new". What this boils
down to is that each provider has a dedicated default memory pool. I
think it is better for objects freed by a release of CStatement to go
back into a provider pool than a database specific pool, which would
reduce memory usage on a server handling many databases.
Post by Alex Peshkov
Post by Jim Starkey
Post by Alex Peshkov
I didn't want to say that the object must always allocate data from
that same pool from which it was allocated itself. I meant that if
there are no special (considered on a case by case basis)
requirements for memory allocation by the object this is the
simplest way for it to find a place for placing his internals.
It is clearly not the simplest. In most cases, it is the most
complex and difficult to implement. A strong case *must* be made to
justify the development and runtime cost.
It is implemented in firebird2 classes library and I see absolutely no
problems using it. All objects are derived from class
PermanentStorage, and the only thing it does is keeping and providing
information about the pool, from which memory for this object was
allocated. All required efforts are:
1. Write in object constructor:
Object(MemoryPool &p, ...) : PermanentStorage(p), ....
2. When allocating memory in any method of pool aware object, write:
Object* o = FB_NEW(getPool()) Object(getPool(), ...);
Even if it helps only to monitor memory usage by Attachment, reduce
fragmentation, monitor leaks and help in debugging, it seems to be not
too high price for this.
This complexity, like any complexity, requires justification. It is
simpler to write:

Object *o = new Object (...);
Post by Alex Peshkov
Post by Jim Starkey
My guess is that less than 5% of the object allocations have any
benefit from special pools. Convince me that I'm wrong.
Dan Wilson
2005-02-15 20:54:01 UTC
Permalink
Post by Jim Starkey
Bottom Line
Memory pools as used by Firebird are incompatible with object oriented
programming. You can pick objects or you can pick pools. Yes, classes
can be made pool-aware, but has nothing to do with the problem of
referential integrity between objects. Simply put, ripping an object
out of a complex structure when its parent memory pool is deleted
destroys the integrity of that data structure.
I am not a Firebird developer. As such, I probably should not even put my oar in this water.

Both Nickolay's and Jim's explanations were so clear, however, that I'd like to give it a shot.

It seems to me that one could have both objects and pools, if one designed the pools and objects appropriately. If, for example, all objects used in the system shared a common ancestor class, called FBObject or VObject or what have you, and that ancestor class overloaded the new and delete operators, one could design a memory pool architecture that knew about every single object allocated from that pool, and could thus call the destructor method for each of these objects whenever the destructor method for the entire pool was called. Thus, objects could be allocated and deallocated at will, but once the pool's lifetime had expired, a single call to the pool destructor would clean up all the objects, including honoring each object's destructor method.
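A small sketch of what such a design might look like. It simplifies the proposal in one respect: objects register with their pool in the `FBObject` constructor rather than through overloaded new and delete operators, and they are assumed to be created with plain `new`. `Pool` and `Counted` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class Pool;

// Hypothetical common ancestor: every object registers with its pool.
class FBObject {
public:
    explicit FBObject(Pool& pool);
    virtual ~FBObject();
private:
    Pool& pool_;
};

class Pool {
public:
    ~Pool() {
        // Destroy the survivors. Each destructor unregisters itself, so
        // re-read the back of the list on every iteration.
        while (!objects_.empty()) delete objects_.back();
    }
    std::size_t liveObjects() const { return objects_.size(); }
private:
    friend class FBObject;
    void add(FBObject* o) { objects_.push_back(o); }
    void remove(FBObject* o) {
        objects_.erase(std::remove(objects_.begin(), objects_.end(), o),
                       objects_.end());
    }
    std::vector<FBObject*> objects_;
};

inline FBObject::FBObject(Pool& pool) : pool_(pool) { pool_.add(this); }
inline FBObject::~FBObject() { pool_.remove(this); }

// Example object that records its destruction in a counter.
class Counted : public FBObject {
public:
    Counted(Pool& pool, int& destroyed) : FBObject(pool), destroyed_(destroyed) {}
    ~Counted() override { ++destroyed_; }
private:
    int& destroyed_;
};
```

Objects can still be deleted individually before the pool goes away; the pool destructor only runs the destructors of whatever is left.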

I look forward to learning more about how Firebird works through the answers to this post, and will endeavour to keep a thick skin.

Best regards,

Dan Wilson.
Jim Starkey
2005-02-16 13:54:07 UTC
Permalink
Post by Dan Wilson
It seems to me that one could have both objects and pools, if one designed the pools and objects appropriately. If, for example, all objects used in the system shared a common ancestor class, called FBObject or VObject or what have you, and that ancestor class overloaded the new and delete operators, one could design a memory pool architecture that knew about every single object allocated from that pool, and could thus call the destructor method for each of these objects whenever the destructor method for the entire pool was called. Thus, objects could be allocated and deallocated at will, but once the pool's lifetime had expired, a single call to the pool destructor would clean up all the objects, including honoring each object's destructor method.
That addresses the problem of getting the destructors called for
objects deleted-by-pool, but it doesn't do anything for other objects
that might have pointers to any of those objects. A common mechanism
for managing object lifetimes is reference counting -- an object does an
addRef() to another object to keep it from being deleted, maintaining
the integrity of its pointer. If the pool is deleted, the object
goes away whether its reference count was zero or positive.
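The conflict can be shown with a toy model. `RefObject` and `Pool` below are hypothetical; the point is only that a delete-by-pool purge never consults the reference count, so any holder of a counted reference is left with a dangling pointer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reference-counted object.
class RefObject {
public:
    void addRef() { ++useCount_; }
    int useCount() const { return useCount_; }
private:
    int useCount_ = 0;
};

// Hypothetical pool that owns its objects and frees them all at once.
class Pool {
public:
    ~Pool() { purge(); }
    RefObject* create() {
        objects_.push_back(new RefObject);
        return objects_.back();
    }
    // Delete-by-pool. Returns how many objects were destroyed while some
    // other object still held a counted reference to them.
    int purge() {
        int stillReferenced = 0;
        for (RefObject* o : objects_) {
            if (o->useCount() > 0) ++stillReferenced;  // someone still points here
            delete o;                                  // but it goes away anyway
        }
        objects_.clear();
        return stillReferenced;
    }
private:
    std::vector<RefObject*> objects_;
};
```

After `purge()` returns a non-zero value, every pointer obtained through `addRef()` on those objects must not be dereferenced again, which is exactly the integrity problem described above.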
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Olivier Mascia
2005-02-17 01:00:12 UTC
Permalink
Post by Jim Starkey
Post by Dan Wilson
It seems to me that one could have both objects and pools, if one
designed the pools and objects appropriately. If, for example, all
objects used in the system shared a common ancestor class, called
FBObject or VObject or what have you, and that ancestor class
overloaded the new and delete operators, one could design a memory
pool architecture that knew about every single object allocated from
that pool, and could thus call the destructor method for each of
these objects whenever the destructor method for the entire pool was
called. Thus, objects could be allocated and deallocated at will,
but once the pool's lifetime had expired, a single call to the pool
destructor would clean up all the objects, including honoring that's
objects destructor method.
That addresses the problem of getting the destructors called for
objected deleted-by-pool, but it doesn't do anything for other objects
that might have pointers to any of those objects. A common mechanism
for managing object lifetimes is reference counting -- an object does
an addRef() to another object to keep it from being deleted,
maintaining the integrity of the its pointer. If the pool is deleted,
the object goes away whether its reference count was zero or positive.
What's more, the number one argument for pools, and that is what Apache
Group says, is that memory allocation / free in C++ is difficult, that
people can forget to call destructors and so on... Hence pools are a
dream come true. False. Plain false. I have spent the last 20 years
architecting and helping code large technical software projects (though
not database engines per se): there are well-known, easy techniques
with good performance that free you from worrying about memory
allocation in C++.

The rule of thumb is : use the stack. For every object requiring a
'dynamic' memory allocation, use a stack based smart pointer and design
your objects so that they *can't* be used *without* the smart pointer.
The compiler, unless buggy but that's another story, will take care of
all memory management for you.
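As an illustration of the rule of thumb, here is a sketch that uses today's `std::unique_ptr` as a stand-in for the kind of stack-based smart pointer meant here. `Buffer` and `useBuffer` are hypothetical; the private constructor is what keeps the object from being used without its smart pointer.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

class Buffer {
public:
    // The only way to obtain a Buffer is through an owning smart pointer.
    static std::unique_ptr<Buffer> create(int& liveCount) {
        return std::unique_ptr<Buffer>(new Buffer(liveCount));
    }
    ~Buffer() { --live_; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
private:
    explicit Buffer(int& liveCount) : live_(liveCount) { ++live_; }
    int& live_;  // external counter of live Buffers, for demonstration only
};

// The stack-based pointer releases the Buffer on every exit path,
// including when an exception propagates out of the function.
void useBuffer(int& liveCount, bool fail) {
    std::unique_ptr<Buffer> buf = Buffer::create(liveCount);
    if (fail) throw std::runtime_error("simulated failure");
}
```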

Want something a bit more sophisticated? Okay, add in specialized
allocators per class or per family of classes to gain a kind of
locality for objects with close relationships or that benefit from a
specialized allocator. In that way this is similar to pools. But those
'pools' in my talk have no way to be globally unallocated. So no object
will ever be deleted by a pool. Objects will have their intended
lifetime. C++ compiler will take care of proper destructor calls and
memory free, including in heavily C++ exception-based code. And you'll
at the same time gain specialized and localized allocation of some
objects if you like it or want it.
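One possible shape for such a per-class allocator, as a sketch only: `ExecNode` is a hypothetical class that keeps its freed blocks on a private free list and reuses them. There is deliberately no operation that frees all live objects at once.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical small node type with its own allocator. Freed blocks go on
// a per-class free list and are reused by later allocations.
class ExecNode {
public:
    explicit ExecNode(int v) : value(v) {}
    int value;

    static void* operator new(std::size_t size) {
        if (size == sizeof(ExecNode) && freeList_) {
            FreeBlock* block = freeList_;
            freeList_ = block->next;
            ++reused_;
            return block;
        }
        // Never hand out less room than a free-list link needs.
        return ::operator new(size < sizeof(FreeBlock) ? sizeof(FreeBlock) : size);
    }
    static void operator delete(void* p, std::size_t size) {
        if (!p) return;
        if (size == sizeof(ExecNode)) {
            // Keep the block for reuse instead of returning it to the heap.
            FreeBlock* block = static_cast<FreeBlock*>(p);
            block->next = freeList_;
            freeList_ = block;
        } else {
            ::operator delete(p);
        }
    }
    static int reusedCount() { return reused_; }
private:
    struct FreeBlock { FreeBlock* next; };
    static FreeBlock* freeList_;
    static int reused_;
};

ExecNode::FreeBlock* ExecNode::freeList_ = nullptr;
int ExecNode::reused_ = 0;
```

Callers keep writing ordinary `new ExecNode(...)` and `delete`; the locality and reuse are a private detail of the class, and object lifetimes are unaffected.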

The runtime cost of the indirection implied by a smart-pointer is
peanuts today, compared to the benefits.
--
Olivier Mascia
Paulo Gaspar
2005-02-17 16:24:20 UTC
Permalink
Post by Jim Starkey
Post by Dan Wilson
It seems to me that one could have both objects and pools, if one
designed the pools and objects appropriately. If, for example, all
objects used in the system shared a common ancestor class, called
FBObject or VObject or what have you, and that ancestor class overloaded
the new and delete operators...
Post by Jim Starkey
Post by Dan Wilson
...
That addresses the problem of getting the destructors called for
objected deleted-by-pool, but it doesn't do anything for other objects
that might have pointers to any of those objects. A common mechanism
for managing object lifetimes is reference counting -- an object does an
addRef() to another object to keep it from being deleted, maintaining
the integrity of the its pointer. If the pool is deleted, the object
goes away whether its reference count was zero or positive.
Actually, as I mentioned in my previous post, C++ even allows
overloading the new/delete operators both globally, per class, and even
allows placement allocation, as in
ClassX* xp = new(memZone) ClassX;
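For that one-liner to compile, something has to supply an operator new that accepts the zone. A minimal sketch, assuming `memZone` is an instance of a hypothetical fixed-size `MemZone` arena (the post does not say what it is):

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size arena that hands out memory by bumping an offset.
class MemZone {
public:
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > sizeof(storage_)) throw std::bad_alloc();
        used_ = start + size;
        return storage_ + start;
    }
    // True if p points into this zone's storage.
    bool owns(const void* p) const {
        const unsigned char* c = static_cast<const unsigned char*>(p);
        return c >= storage_ && c < storage_ + sizeof(storage_);
    }
private:
    alignas(std::max_align_t) unsigned char storage_[1024];
    std::size_t used_ = 0;
};

// Placement form of operator new: new (zone) T takes memory from the zone.
inline void* operator new(std::size_t size, MemZone& zone) {
    return zone.allocate(size);
}
// Matching placement delete, called only if T's constructor throws.
inline void operator delete(void*, MemZone&) noexcept {}

struct ClassX {
    int id = 42;
};
```

An object created this way is not released with `delete`; its destructor is called explicitly (`xp->~ClassX()`) and the memory goes away with the zone.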

Anyway, I see an advantage in allocating large objects from memory pools
shared by other large objects with the same lifecycle (e.g. depending
on the same request), since it reduces fragmentation. And low
fragmentation is important to keep allocating large objects without
excessive memory consumption.

So, for me, per-lifecycle pools for large objects and just one more
for all the smaller allocations smell good.

On reference counting, I have mixed feelings...

Actually, reference counting is a dangerous practice in some situations.
Even with the help of smart pointers, it does not do better than the
similar mechanisms in Visual Basic, Delphi, ActiveX, etc. My experience
with those mechanisms is that shit still happens, a lot.

Personally, I'd rather not rely on an unreliable mechanism like reference
counting or block deallocation, since that might give me a false sense
of security. I'd rather stay alert and pay full attention to what I do.

Since C++ smart pointers (an overloading of the "->" operator) make the
magic behind the scenes so "invisible", people tend to ignore what is
really happening until memory starts vanishing or objects suddenly get
deallocated earlier than expected.

A couple of common problems with reference counting:
- Memory can still leak when you have cyclic references (e.g. when you
have A pointing to B that points to C which also points to B, as in
A->B<->C, and then remove A's reference to B, you still have B and C
referencing each other and they are not released). I know this one from
when I used VB;
- Objects getting suddenly deallocated happens in languages where
reference counting is not the native/single reference mechanism (like
C++ and Delphi). It happens when an object has used some other
mechanism to acquire a reference to another but then releases the
reference using the reference counting mechanism, subtracting a unit from
a counter that it had not incremented, thereby causing the destruction
of the object while another object was still referencing it.
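The first problem is easy to reproduce with today's `std::shared_ptr`, used here only as a convenient stand-in for any counted reference. `Node` and the two functions are hypothetical; the second function shows the usual remedy, a non-counting `std::weak_ptr` for the back reference.

```cpp
#include <cassert>
#include <memory>

struct Node {
    explicit Node(int& live) : live_(live) { ++live_; }
    ~Node() { --live_; }
    std::shared_ptr<Node> strongPeer;  // counted reference
    std::weak_ptr<Node> weakPeer;      // non-counting back reference
private:
    int& live_;  // external counter of live Nodes, for demonstration only
};

// B <-> C with counted references in both directions. Returns how many
// nodes are still alive after the last outside reference is dropped.
int liveAfterStrongCycle() {
    int live = 0;
    std::weak_ptr<Node> observer;
    {
        auto b = std::make_shared<Node>(live);
        auto c = std::make_shared<Node>(live);
        b->strongPeer = c;
        c->strongPeer = b;
        observer = b;
    }
    int leaked = live;  // the cycle keeps both counts above zero
    // Break the cycle by hand so this sketch itself does not leak.
    if (auto b = observer.lock()) b->strongPeer.reset();
    return leaked;
}

// Same shape, but the back reference does not count.
int liveAfterWeakBackReference() {
    int live = 0;
    {
        auto b = std::make_shared<Node>(live);
        auto c = std::make_shared<Node>(live);
        b->strongPeer = c;
        c->weakPeer = b;
    }
    return live;
}
```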

For additional discussion of this technique:
http://agora.cubik.org/wiki/view/Main/ReferenceCounting
Olivier Mascia
2005-02-18 00:33:18 UTC
Permalink
Post by Paulo Gaspar
On reference counting, I have mixed feelings...
...
- Memory can still leak when you have cyclic references (e.g. when you
have A pointing to B that points to C which also points to B, as in
A->B<->C, and then remove A's reference to B, you still have B and C
referencing each other and they are not released). I know this one from
when I used VB;
This is because VB does not use class-specific smart-pointers but overly
generic ones. A correctly designed C++ object, if it has
relationships to other objects, knows (read: the programmer coded it as
such) what to do with these relations when it is freed. It is just a
matter of design.
Post by Paulo Gaspar
- Objects getting suddenly deallocated happens in languages where
reference counting is not the native/single reference mechanism (like
C++ and Delphi). It happens when an object has used some other
mechanism to acquire a reference to another but then releases the
reference using the reference counting mechanism, subtracting a unit from
a counter that it had not incremented, thereby causing the destruction
of the object while another object was still referencing it.
Just a matter of correct or incorrect design. You can perfectly well
forbid such a thing. Objects meant to be accessed through smart-pointers
can be designed so that you can't get a reference to them without
having their reference count bumped. It is easy to use templates to
produce a class-specific smart-pointer for each underlying object
class. The specific smart-pointer class can be the one and only way to
create the underlying object, with no exposed interface allowing you to
get a reference other than one encapsulated in a smart-pointer instance.
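A sketch of what that could look like. `Ref<T>` is a hypothetical intrusive handle and `Statement` a made-up class: the constructor and destructor are private and only the handle is a friend, so the object can be neither created nor deleted behind the handle's back.

```cpp
#include <cassert>
#include <utility>

// Hypothetical intrusive handle: copying bumps the count, destruction
// drops it, and the last handle to go deletes the object.
template <typename T>
class Ref {
public:
    template <typename... Args>
    static Ref create(Args&&... args) {
        return Ref(new T(std::forward<Args>(args)...));
    }
    Ref(const Ref& other) : ptr_(other.ptr_) { ++ptr_->useCount_; }
    Ref& operator=(const Ref& other) {
        ++other.ptr_->useCount_;  // bump first, so self-assignment is safe
        release();
        ptr_ = other.ptr_;
        return *this;
    }
    ~Ref() { release(); }
    T* operator->() const { return ptr_; }
    int useCount() const { return ptr_->useCount_; }
private:
    explicit Ref(T* p) : ptr_(p) { ++ptr_->useCount_; }
    void release() {
        if (--ptr_->useCount_ == 0) delete ptr_;
    }
    T* ptr_;
};

class Statement {
public:
    int id() const { return id_; }
private:
    friend class Ref<Statement>;  // only the handle may create or destroy
    explicit Statement(int id) : id_(id) {}
    ~Statement() {}
    Statement(const Statement&) = delete;
    Statement& operator=(const Statement&) = delete;
    int id_;
    int useCount_ = 0;
};
```

`operator->` still hands out a raw pointer for member access, but because the destructor is private that pointer cannot be used to delete the object or to decrement a count it never incremented.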

All in all, I agree with many things you wrote in your previous post,
and only challenge the limitations you mentioned here about
reference-counting. It can be done right. Which does not mean this is
The Right Thing(r) to do for Firebird, but as we were discussing it...
--
Olivier Mascia
Paulo Gaspar
2005-02-18 02:36:11 UTC
Permalink
I completely agree with you Olivier, but I was pointing to the bumps on
the road.

I think the page at the URL I mentioned in my post refers to papers
exposing some of the techniques you describe.

Regards,
Paulo Gaspar
Post by Olivier Mascia
Post by Paulo Gaspar
On reference counting, I have mixed feellings...
...
- Memory can still leak when you have cyclic references (e.g. when you
have A pointing to B that points to C which also points to B as in
A->B<->C, and then remove A's reference to B, you still have B and C
referencing each other and they are not released. I know this one from
when I used VB;
This is because VB does not use class specific smart-pointers but too
generic smart-pointers. A correctly design C++ object, if it has
relationship to other object, knows (read: the programmer coded as
such) what to do with these relations when it is freed. It is just a
matter of design.
Post by Paulo Gaspar
- Objects getting sudenly deallocated happens in languages where
reference counting is not the native/single reference mechanism (like
C++ and Delphi). It happens when an object might have used some other
mechanism to aquire a reference to another but then releases the
reference using the reference counting mechanism, subtracting a unit to
the counter that it had not added, this way causing its destruction
while one other object was still referencing it.
Just a matter of correct or incorrect design. You can perfectly well
forbid such a thing. Objects meant to be accessed through smart pointers
can be designed so that you can't get a reference to them without
having their reference count bumped. It is easy to use templates to
produce a class-specific smart pointer for each underlying object
class. The specific smart-pointer class can be the one and only way to
create the underlying object, with no exposed interface allowing you
to get a reference other than one encapsulated in a smart-pointer instance.
All in all, I agree with many things you wrote in your previous post,
and only challenge the limitations you mentioned here about
reference counting. It can be done right. Which does not mean this is
The Right Thing(r) to do for Firebird, but as we were discussing it...
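A minimal sketch of the design described above, under stated assumptions: Widget and WidgetRef are hypothetical names, the count is intrusive and not thread-safe, and nothing here is taken from Firebird. The point it illustrates is that when the handle class is the only way to create or reach the object, nobody can release a reference they never acquired.

```cpp
#include <cassert>
#include <utility>

class Widget;

// Class-specific smart pointer: the only handle callers ever get to a Widget.
class WidgetRef {
public:
    static WidgetRef create(int id);     // the one and only way to make a Widget
    WidgetRef(const WidgetRef& other);
    WidgetRef& operator=(WidgetRef other);
    ~WidgetRef();
    int id() const;
    int useCount() const;
private:
    explicit WidgetRef(Widget* w);
    Widget* ptr_;
};

class Widget {
    friend class WidgetRef;              // nobody else can construct or delete
    explicit Widget(int id) : id_(id) {}
    ~Widget() { ++destroyed; }
    int id_;
    int refs_ = 0;
public:
    static int destroyed;                // instrumentation for the demo only
};
int Widget::destroyed = 0;

WidgetRef::WidgetRef(Widget* w) : ptr_(w) { ++ptr_->refs_; }
WidgetRef::WidgetRef(const WidgetRef& other) : ptr_(other.ptr_) { ++ptr_->refs_; }
WidgetRef& WidgetRef::operator=(WidgetRef other) {
    std::swap(ptr_, other.ptr_);         // copy-and-swap keeps the counts balanced
    return *this;
}
WidgetRef::~WidgetRef() { if (--ptr_->refs_ == 0) delete ptr_; }
WidgetRef WidgetRef::create(int id) { return WidgetRef(new Widget(id)); }
int WidgetRef::id() const { return ptr_->id_; }
int WidgetRef::useCount() const { return ptr_->refs_; }

int smartPointerDemo() {
    Widget::destroyed = 0;
    {
        WidgetRef a = WidgetRef::create(7);
        WidgetRef b = a;                 // copying bumps the count
        assert(a.useCount() == 2);
    }                                    // last handle gone, Widget destroyed
    return Widget::destroyed;
}
```

Because Widget's constructor and destructor are private, code outside WidgetRef cannot write `new Widget` or `delete` one, which is the "no exposed interface" property the paragraph asks for.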
--
Olivier Mascia
Paulo Gaspar
2005-02-18 07:45:29 UTC
Permalink
Post by Jim Starkey
Post by Dan Wilson
It seems to me that one could have both objects and pools, if one
designed the pools and objects appropriately. If, for example, all
objects used in the system shared a common ancestor class, called
FBObject or VObject or what have you, and that ancestor class overloaded
the new and delete operators...
Post by Jim Starkey
Post by Dan Wilson
...
That addresses the problem of getting the destructors called for
objects deleted-by-pool, but it doesn't do anything for other objects
that might have pointers to any of those objects. A common mechanism
for managing object lifetimes is reference counting -- an object does an
addRef() to another object to keep it from being deleted, maintaining
the integrity of its pointer. If the pool is deleted, the object
goes away whether its reference count was zero or positive.
Actually, as I mentioned in my previous post, C++ allows overloading
the new/delete operators both globally and per class, and it even
allows placement allocation, as in
ClassX* xp = new(memZone) ClassX;
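
A minimal sketch of what that placement form can look like when spelled out. MemZone is a hypothetical fixed-size zone invented for illustration (it is not a Firebird class), and ClassX is just a stand-in type.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size zone that hands out memory by bumping an offset.
class MemZone {
public:
    void* allocate(std::size_t size) {
        // Round the start up so every block keeps the strictest fundamental alignment.
        const std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > sizeof(buffer_)) throw std::bad_alloc();
        used_ = start + size;
        return buffer_ + start;
    }
    std::size_t used() const { return used_; }
private:
    alignas(std::max_align_t) unsigned char buffer_[1024];
    std::size_t used_ = 0;
};

// Placement form of operator new: "new (zone) T" ends up here.
void* operator new(std::size_t size, MemZone& zone) { return zone.allocate(size); }
// Matching placement delete, called only if the constructor throws.
void operator delete(void*, MemZone&) noexcept {}

struct ClassX {
    int value = 42;
};

int placementDemo() {
    MemZone memZone;
    ClassX* xp = new (memZone) ClassX;   // storage comes from memZone
    int result = xp->value;
    xp->~ClassX();                       // running the destructor is still our job
    return result;
}
```

Note that the zone only supplies storage. Whether and when destructors run for objects placed in it is exactly the question the rest of this thread argues about.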

Anyway, I see an advantage in allocating large objects from memory pools
shared by other large objects with the same lifecycle (e.g. depending
on the same request), since it reduces fragmentation. And low
fragmentation is important to keep allocating large objects without
excessive memory consumption.

So, for me, per-lifecycle pools for large objects and just one other pool
for all the smaller allocations smells good.

On reference counting, I have mixed feelings...

Actually, reference counting is a dangerous practice in some situations.
Even with the help of smart pointers, it does not do better than the
similar mechanisms in Visual Basic, Delphi, ActiveX, etc. My experience
with those mechanisms is that shit still happens, a lot.

Personally, I would rather not rely on an unreliable mechanism like
reference counting or block deallocation, since that might give me a false
sense of security. I would rather stay alert and pay full attention to
what I do.

Since C++ smart pointers (an overloading of the "->" operator) make the
magic behind the scenes so "invisible", people tend to ignore what is
really happening until memory starts vanishing or objects suddenly get
deallocated earlier than expected.

A couple of common problems with reference counting:
- Memory can still leak when you have cyclic references (e.g. when you
have A pointing to B that points to C which also points back to B, as in
A->B<->C, and you then remove A's reference to B, you still have B and C
referencing each other and they are not released). I know this one from
when I used VB;
- Objects getting suddenly deallocated happens in languages where
reference counting is not the native/single reference mechanism (like
C++ and Delphi). It happens when an object has used some other
mechanism to acquire a reference to another object but then releases
that reference through the reference counting mechanism, subtracting one
from a counter it never incremented and so causing the object's
destruction while some other object was still referencing it.
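
The first problem in the list above (the A->B<->C cycle) can be reproduced in a few lines. This is only an illustration: it uses the C++ standard library's std::shared_ptr and std::weak_ptr as a stand-in for whatever counted pointer a project uses, and the Node type and its live counter are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int live;                 // number of Node objects currently alive
    std::shared_ptr<Node> strong;    // counted (owning) reference
    std::weak_ptr<Node> weak;        // non-owning back reference
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// A -> B <-> C with counted references in both directions.
int leakWithCycle() {
    Node::live = 0;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        auto c = std::make_shared<Node>();
        a->strong = b;
        b->strong = c;
        c->strong = b;               // closes the cycle B <-> C
    }                                // A is destroyed; B and C keep each other alive
    return Node::live;               // B and C are leaked
}

// Same shape, but the back reference from C to B does not count.
int noLeakWithWeakBackRef() {
    Node::live = 0;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        auto c = std::make_shared<Node>();
        a->strong = b;
        b->strong = c;
        c->weak = b;                 // breaks the ownership cycle
    }
    return Node::live;               // everything is released
}
```

The weak back reference is one conventional way out; it works only if the programmer decides which direction of the relationship owns the other, which is a design decision rather than something the counting mechanism can make.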

For additional discussion of this technique:
http://agora.cubik.org/wiki/view/Main/ReferenceCounting
Thomas Miller
2005-02-17 06:41:23 UTC
Permalink
Post by Dan Wilson
Post by Jim Starkey
Bottom Line
Memory pools as used by Firebird are incompatible with object oriented
programming. You can pick objects or you can pick pools. Yes, classes
can be made pool-aware, but that has nothing to do with the problem of
referential integrity between objects. Simply put, ripping an object
out of a complex structure when its parent memory pool is deleted
destroys the integrity of that data structure.
I am not a Firebird developer. As such, I probably should not even put my oar in this water.
Both Nickolay's and Jim's explanations were so clear, however, that I'd like to give it a shot.
It seems to me that one could have both objects and pools, if one designed the pools and objects appropriately. If, for example, all objects used in the system shared a common ancestor class, called FBObject or VObject or what have you, and that ancestor class overloaded the new and delete operators, one could design a memory pool architecture that knew about every single object allocated from that pool, and could thus call the destructor method for each of these objects whenever the destructor method for the entire pool was called. Thus, objects could be allocated and deallocated at will, but once the pool's lifetime had expired, a single call to the pool destructor would clean up all the objects, including honoring each object's destructor method.
The main problem here is that the only one that knows it is done is the
object itself. Not the memory pool. The only reason I can see to have
memory pools is to improve on memory allocation speed, period (memory
thrashing). So if a layer was built in that was a memory allocation
layer with interfaces made of proper objects, then that would allow you
to swap out memory allocation methodology. One object to allocate
directly from OS memory and another to allocate a block of memory to
prevent thrashing. Either way, the object has to be responsible for
releasing the memory.

You still would have the issue of memory leaks inside the pools. It is
nuts to have a pool call an object's destructor unless you are trying to
kill a runaway process. On the other hand, you can kill a runaway
process without memory pools.

In addition, the object itself can set up its own little memory pool to
cut down on memory thrashing. Delphi is horrible at this (memory
thrashing) when it comes to manipulating large strings (as in an ASCII
import or when creating an HTML page). The common practice is to set up a
memory buffer and read / write your changes to the buffer so Delphi
doesn't try to allocate a new memory block on every change.

While you guys are thinking about this, remember, as a user, the thing I
care most about is accuracy. Speed is #4 or #5 on my list. I can
always throw more hardware at the problem if speed is an issue. Objects
are much easier to maintain. So if it is a question of pools or
objects, then it is no contest: objects.
Post by Dan Wilson
I look forward to learning more about how Firebird works through the answers to this post, and will endeavour to keep a thick skin.
Best regards,
Dan Wilson.
Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel
--
Thomas Miller
Wash DC Delphi SIG Chairperson
Delphi Client/Server Certified Developer
BSS Accounting & Distribution Software
BSS Enterprise Accounting FrameWork

http://www.bss-software.com
http://www.cpcug.org/user/delphi/index.html
https://sourceforge.net/projects/uopl/
http://sourceforge.net/projects/dbexpressplus
Mark O'Donohue
2005-02-17 00:45:17 UTC
Permalink
Hi All
Post by Jim Starkey
The single most contentious issue in the merge debate will be memory
pools. Here is some background.
Memory pools as used by Firebird are incompatible with object
oriented programming. You can pick objects or you can pick pools.
Can we have both?

Allocation method can be specified at the class level, so is it possible
to specify, class by class, whether to:
A. use pool allocation for special classes where there is some
advantage.

B. use no pools (or perhaps pools where destructors are called) for
classes where destructor use is preferred.


I agree that destructors are a *really* important part of OO programming,
and I would think that only if a particular class can be shown to have a
*real* performance benefit from moving to a pool/no-destructor
allocation scheme should it then be dumbed down to use pool
allocation.


Cheers

Mark
Jim Starkey
2005-02-17 07:19:19 UTC
Permalink
Post by Mark O'Donohue
Hi All
Post by Jim Starkey
The single most contentious issue in the merge debate will be memory
pools. Here is some background.
Memory pools as used by Firebird are incompatible with object
oriented programming. You can pick objects or you can pick pools.
Can we have both?
Allocation method can be specified at a class level, so is it possible
A. use pool allocation for special classes where there is some
advantage.
B. no pools (or perhaps pools where destructors are called) for
classes where destructor use is preferred.
I agree that destructors are a *really* important part of OO
programming, and I would think that only if a particular class can be
shown to have a *real* performance benefit by moving it to a pool/no
destructor allocation scheme that it should then be dumbed down to use
pool allocation.
Mark, you have a good point here. There are localized needs for dumbed
down memory allocation. Ann has argued for one in the save point
mechanism. My SymbolManager class allocates large hunks of memory that
it subdivides into strings. There are undoubtedly many more.

The common denominator seems to be:

1. The objects themselves are task specific
2. These objects are almost never deleted individually, so
recombination is unnecessary
3. The process takes place within a single thread so interlocking is
unnecessary

I think the way to handle these cases is to define a SimpleMemoryManager
(or whatever) class that allocates large chunks of memory and gives out
small chunks, and to require objects that are designed to be allocated
from SimpleMemoryManager to have a class-specific "new" to do the
allocation. This ensures that ordinary objects are not allocated from the
dumbed-down pool while still allowing classes designed for delete-by-pool.
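
A rough sketch of that idea, under stated assumptions: only the name SimpleMemoryManager comes from the paragraph above; its interface, the 64 KB default chunk size, and the Symbol class are invented here for illustration. It matches the three common denominators listed earlier: task-specific objects, no individual deletes, no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical arena: grabs large chunks from malloc, hands out small pieces,
// and frees everything at once. No per-block delete, no interlocking.
class SimpleMemoryManager {
public:
    explicit SimpleMemoryManager(std::size_t chunkSize = 64 * 1024)
        : chunkSize_(chunkSize) {}
    SimpleMemoryManager(const SimpleMemoryManager&) = delete;
    SimpleMemoryManager& operator=(const SimpleMemoryManager&) = delete;
    ~SimpleMemoryManager() {
        for (void* chunk : chunks_) std::free(chunk);
    }

    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;        // keep blocks aligned
        if (chunks_.empty() || used_ + size > current_) {
            const std::size_t newSize = size > chunkSize_ ? size : chunkSize_;
            chunks_.push_back(nullptr);                   // reserve the slot first
            void* chunk = std::malloc(newSize);
            if (!chunk) { chunks_.pop_back(); throw std::bad_alloc(); }
            chunks_.back() = chunk;
            current_ = newSize;
            used_ = 0;
        }
        void* p = static_cast<unsigned char*>(chunks_.back()) + used_;
        used_ += size;
        return p;
    }
    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::size_t chunkSize_;
    std::size_t current_ = 0;     // size of the chunk being carved up
    std::size_t used_ = 0;        // bytes already handed out from that chunk
    std::vector<void*> chunks_;
};

// A task-specific class designed for delete-by-pool: no destructor, and the
// only operator new it offers takes a SimpleMemoryManager.
struct Symbol {
    int length;
    const char* text;
    static void* operator new(std::size_t size, SimpleMemoryManager& mgr) {
        return mgr.allocate(size);
    }
    // Used only if construction throws; the arena reclaims the memory later.
    static void operator delete(void*, SimpleMemoryManager&) noexcept {}
};

int arenaDemo() {
    SimpleMemoryManager mgr(4096);
    int sum = 0;
    for (int i = 0; i < 1000; ++i) {
        Symbol* s = new (mgr) Symbol{i, "sym"};
        sum += s->length;
    }
    assert(mgr.chunkCount() > 1);  // 1000 symbols do not fit in one 4 KB chunk
    return sum;                    // all chunks are freed when mgr goes out of scope
}
```

Declaring the class-specific operator new hides the global one for Symbol, so a plain `new Symbol` does not compile, and a class without such an operator cannot be written as `new (mgr) T` at all. That is one way to keep ordinary objects out of the dumbed-down allocator.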
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Samofatov, Nickolay
2005-02-17 18:42:35 UTC
Permalink
Paulo,

With my references to Apache, PostgreSQL and Oracle I encouraged people
to look into these systems to see how they use pools to achieve peace of
mind for database developers and users.
You can see how they let you look into detailed memory usage data for
each functional object such as a request, procedure or transaction. You
would see how they allow tracing pool usage to capture problems (leaks
or just big memory consumption) happening in a production environment.
Interbase 7 provides some of this information too, FWIW.
Post by Paulo Gaspar
There is only ONE real advantage I see in memory pools as
they are applied in FB: they are lifecycle related. If you
have one memory pool per each cycle of
allocation/deallocation of objects (like a pool per
Post by Samofatov, Nickolay
The single and most important thing about pools is that when you
allocate from particular pool you BIND LIFETIME of memory block or
object to the object which owns the pool.
BUT then you do not mention memory fragmentation. Keeping
memory fragmentation low might be specially useful if you
need to allocate large blocks of memory.
Yup. This is what I named "neat performance tricks" in that letter.
It is nice and was mentioned before, but not critically important,
mostly due to "best fit" memory allocation strategy used in Firebird.
Post by Paulo Gaspar
Post by Samofatov, Nickolay
...
1. Monitoring. Looking at pool sizes for various objects DBA can say
which god damned request consumed all memory of the server
and kill the
Post by Samofatov, Nickolay
offending session/request/transaction.
It is easy to provide alternative mechanisms to do monitoring
that just by keeping counters per thread.
Not really. A thread may be doing a variety of things, including
allocating data for global needs such as populating the metadata cache.
The DBA needs to know that a particular query consumes a particular number
of bytes of memory, and that the transaction started at 6:03 PM by user
PAULO from 172.20.1.15 has consumed 512,345,112 bytes of memory and is
growing.

When you have functional pools you see the breakdown of all memory used
by the server into functional pieces. This is very valuable knowledge, both
for the DBA and for the developer.

To have this you need to associate a functional context and set a lifetime
period for each memory block allocated. By allocating memory from the
right pool you do just that.
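
To make the accounting idea concrete, here is a minimal sketch of a pool that does nothing but bookkeeping. TrackedPool and its methods are hypothetical names, storage simply comes from malloc, and this is not how Firebird's pools are implemented; it only shows how tying each block to a named pool yields a per-function breakdown for free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

// Hypothetical pool that only does accounting; storage comes from malloc.
class TrackedPool {
public:
    explicit TrackedPool(std::string name) : name_(std::move(name)) {}

    void* allocate(std::size_t size) {
        void* raw = std::malloc(sizeof(Header) + size);
        if (!raw) throw std::bad_alloc();
        // A small header in front of each block remembers its owner and size.
        Header* h = new (raw) Header{this, size};
        bytes_ += size;
        ++blocks_;
        return h + 1;
    }

    // The header says which pool owns the block, so callers need not know.
    static void release(void* p) {
        if (!p) return;
        Header* h = static_cast<Header*>(p) - 1;
        h->owner->bytes_ -= h->size;
        --h->owner->blocks_;
        std::free(h);
    }

    const std::string& name() const { return name_; }
    std::size_t bytesInUse() const { return bytes_; }
    std::size_t blocksInUse() const { return blocks_; }

private:
    struct alignas(std::max_align_t) Header {
        TrackedPool* owner;
        std::size_t size;
    };
    std::string name_;
    std::size_t bytes_ = 0;
    std::size_t blocks_ = 0;
};

std::size_t monitoringDemo() {
    TrackedPool requestPool("request 42");
    TrackedPool metadataPool("metadata cache");
    void* a = requestPool.allocate(1000);
    void* b = requestPool.allocate(500);
    void* c = metadataPool.allocate(64);
    TrackedPool::release(b);
    std::size_t inUse = requestPool.bytesInUse();   // only block a remains
    assert(metadataPool.bytesInUse() == 64);        // charged to its own pool
    TrackedPool::release(a);
    TrackedPool::release(c);
    return inUse;
}
```

A monitoring interface would then walk the live pools and report name() and bytesInUse() for each, which is the kind of breakdown the paragraph describes.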
Post by Paulo Gaspar
On the killing side, is it not problematic assuming you will
free the whole block without calling destructors? You are
then missing the "++" advantage, specially when managing
objects that might allocate other resources besides memory.
I never said that destructors need never be called. I was trying to say
that in special cases they may be skipped if there are reasons for
doing so.
See more on this below.
Post by Paulo Gaspar
Post by Samofatov, Nickolay
2. Memory leaks debugging. To debug server which consumed too much
memory on the customer site we take the snapshot of process, look up
which pool is too big and knowing functional purpose of the pool and
having pool dump information memory leak location is usually obvious
immediately, especially if line number information for
allocations is
Post by Samofatov, Nickolay
stored.
This is just the same as 1., isn't it?
Not really. Point 1 is about DBA-level monitoring. The DBA sees how server
resources are consumed and can take corrective action if necessary.
But if the DBA detects something anomalous (such as a request or
transaction eating too much memory), he can capture the state of the
process and send this dump to me, and most probably I will be able to fix
the issue by reading data from a dump taken on the production system.

This is critically important for big servers which serve many
users and normally use a lot of resources. With accurate pool information
you can see that request X consumed 10 MB of RAM among thousands of
requests consuming a few GB in total. And during debugging you need to
analyze only the problematic 10 MB, not the entire process memory.
Post by Paulo Gaspar
Post by Samofatov, Nickolay
3. Failure isolation. Shit happens. Sometimes people forget to call
destructors and deallocate memory blocks. This is not normal
situation,
Post by Samofatov, Nickolay
but again, shit happens. And server needs to continue
working without
Post by Samofatov, Nickolay
consuming all memory in the world even if some bad things happen. Or
stop immediately producing developer report after memory leak is
detected encouraging developers to fix the leak.
Shit must NOT happen on production software, and when it can
not be avoided, it must be handled graciously.
Actually, having a policy of releasing memory in blocks
encourages shit to happen, since programmers will be tempted
to rely on the final block release.
You see, shit happens. It happens in application software, it happens in
server software.
By allocating memory from a pool you set a constraint on the lifetime of
the object. If it lives longer than this constraint allows, then we
obviously have a problem. Quoting myself, "We can stop immediately producing
developer report after memory leak is detected encouraging developers to
fix the leak."

Without pools you would notice it only after the server eats a large
(externally noticeable) amount of RAM or terminates. Both are
unacceptable for serious server software.
Post by Paulo Gaspar
Does this mean that I must think, object by
object, where am I going to place it? (Has it file handles?
Has it references to other resources? etc.)
You must think about an object's lifetime when you allocate it, and
allocate the object from the right pool.
If you create a transient object with its lifetime bound to the stack, you
need to use the "automatic" thread-specific pool. Strings and container
classes declared on the stack pick up this pool automatically. If you
allocate an object for processing a particular request, you need to
allocate it from the request pool; if you allocate an object global to the
database, you allocate it from the database pool, etc.

Regarding allowing delete-by-pool semantics, there was a nice talk
between developers about it, let me translate one interesting piece from
it:
--
There are 4 possible deallocation policies to some extent compatible
with "failure isolation" goal:
1. Allow deletion-by-pool for all allocations. This "hides" memory leaks
and while doing so may leave dangling references in some cases (but this
kind of code is unusual for current Firebird2).
2. Require explicit deletion of everything from the pool. Pool must be
empty on deallocation, otherwise - punt.
3. Require explicit deletion of everything from the pool, except blocks
allocated by special form of allocator.
4. Allow deletion-by-pool for everything, except blocks allocated by
special form of allocator. Presence of such blocks in pool at pool
deallocation time would punt the server.

Currently Firebird uses approach 1. I talked with Dmitry Emanov and he
agrees that long-term goal is to move to approach 3. In short, in new
code destructors must always be called and memory deallocated explicitly
unless you really, really understand what you are doing.
--
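One way to read the four policies above is as a decision table over what is still outstanding when a pool is deallocated. The sketch below encodes that reading; the enum names and the split into "ordinary" and "special" blocks are assumptions made for illustration, not Firebird code.

```cpp
#include <cassert>
#include <cstddef>

enum class Policy {
    DeleteByPoolAll,            // 1: anything may be left behind
    ExplicitAll,                // 2: pool must be empty, otherwise punt
    ExplicitExceptSpecial,      // 3: only special blocks may be left behind
    DeleteByPoolExceptSpecial   // 4: special blocks must be deleted explicitly
};

enum class Outcome { Ok, Punt };

// What is still allocated at the moment the pool is deallocated.
struct PoolState {
    std::size_t ordinaryOutstanding;  // blocks from the normal allocator
    std::size_t specialOutstanding;   // blocks from the "special form" allocator
};

// Decide what happens when a pool in the given state is deallocated.
Outcome onPoolDeallocation(Policy policy, const PoolState& s) {
    switch (policy) {
    case Policy::DeleteByPoolAll:
        return Outcome::Ok;
    case Policy::ExplicitAll:
        return (s.ordinaryOutstanding == 0 && s.specialOutstanding == 0)
                   ? Outcome::Ok : Outcome::Punt;
    case Policy::ExplicitExceptSpecial:
        return s.ordinaryOutstanding == 0 ? Outcome::Ok : Outcome::Punt;
    case Policy::DeleteByPoolExceptSpecial:
        return s.specialOutstanding == 0 ? Outcome::Ok : Outcome::Punt;
    }
    return Outcome::Punt;             // unreachable for valid policies
}
```

Seen this way, policies 3 and 4 differ only in which kind of leftover block is tolerated: under 3 the special allocator marks the rare blocks that may be dropped with the pool, under 4 it marks the rare blocks that may not.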

And finally, Paulo, rants, bad words and personal insults (saying that
core developer is dumb, fool, doesn't know C++, etc) are unnecessary
things.
The fact that such things are tolerated from Jim is only because he has
a big, big credit of respect as original author of Interbase/Firebird
which he spends like that.
Post by Paulo Gaspar
Paulo Gaspar
Nickolay
Paulo Gaspar
2005-02-18 02:05:21 UTC
Permalink
Post by Samofatov, Nickolay
Paulo,
With my references to Apache, PostgreSQL and Oracle I encouraged people
to look into these systems to see how they use pools to achieve peace of
mind for database developers and users.
You can see how they allow to look into detailed memory usage data for
each functional object such as request, procedure or transaction. You
would see how they allow tracing pools usage to capture problems (leaks
or just big memory consumption) happening in production environment.
Interbase 7 provides some of this information too, FWIW.
Using them as reference is helpful. But please be more explicit about it.

If you tell me "Apache does this and you can see how they use that
technique and the other" it is great. If you add a few URLs, it is super
great.

But if you tell me "this works because Apache does it this way"... well, it
is much less helpful because it sounds like "if it is good for them then it
is good for us" and that is not always the case. The fact that some
technique works well for them does not mean that it is the best for
Firebird.
Post by Samofatov, Nickolay
Post by Paulo Gaspar
...
...
1. Monitoring. ...
...
It is easy to provide alternative mechanisms to do monitoring
that just by keeping counters per thread.
Not really. Thread may be doing a variety of things, including
allocation of data for global needs such as populating metadata cache.
DBA needs to know that particular query consumes particular number of
bytes of memory, and that transaction started at 6:03 PM by user PAULO
from 172.20.1.15 consumed 512,345,112 bytes of memory and is growing.
When you have functional pools you see the breakdown of all memory used
by server to functional pieces. This is very valuable knowledge, both
for DBA and for developer.
To have this you need to associate functional context and set lifetime
period for each memory block allocated. By allocating memory from the
right pool you do just that.
By using the right counter, you do that too.

In Java it is EASY to associate scoped context information with a thread.
Since there are performance counters on so many operating systems
implemented in C, I have to believe you can do the same with C++.
(And yes, my low level systems programming is too rusty and obsolete at
the moment!)

Assuming you can do that, it should be easy to associate a "request by
PAULO scope counter" to a request's thread and switch to a "metadata
scope counter" when it calls the metadata fetching functionality, and back
to the original "request scope" before exiting the metadata functionality.
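
The counter scheme described above can be sketched in C++ with a thread-local "current counter" and an RAII guard that switches it. All names here (ScopeCounter, CounterScope, noteAllocation, noteRelease) are hypothetical, and the two note functions stand in for hooks that a real allocator would have to call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accounting bucket for one functional context.
struct ScopeCounter {
    explicit ScopeCounter(const char* l) : label(l), bytes(0) {}
    const char* label;
    std::size_t bytes;
};

// The counter that allocations on this thread are currently charged to.
thread_local ScopeCounter* currentCounter = nullptr;

// RAII guard: switch the thread to another counter, restore the old one on exit.
class CounterScope {
public:
    explicit CounterScope(ScopeCounter& c) : previous_(currentCounter) {
        currentCounter = &c;
    }
    ~CounterScope() { currentCounter = previous_; }
    CounterScope(const CounterScope&) = delete;
    CounterScope& operator=(const CounterScope&) = delete;
private:
    ScopeCounter* previous_;
};

// Stand-ins for allocator hooks; a real allocator would call these itself.
void noteAllocation(std::size_t n) { if (currentCounter) currentCounter->bytes += n; }
void noteRelease(std::size_t n)    { if (currentCounter) currentCounter->bytes -= n; }

void loadMetadata(ScopeCounter& metadata) {
    CounterScope scope(metadata);   // switch to the metadata counter
    noteAllocation(256);            // cache entry that outlives the request
}                                   // back to the caller's counter here

std::size_t counterDemo() {
    ScopeCounter request("request by PAULO");
    ScopeCounter metadata("metadata");
    {
        CounterScope scope(request);
        noteAllocation(100);
        loadMetadata(metadata);
        noteAllocation(50);
        noteRelease(150);           // the request cleans up after itself
    }
    assert(metadata.bytes == 256);  // charged to the metadata scope, not the request
    return request.bytes;           // back to 0 when the request leaked nothing
}
```

One caveat with this sketch: a release is charged to whichever counter is current at that moment, not to the counter that paid for the allocation. Getting that right requires remembering an owner per block, at which point the scheme starts to carry much of the same information a pool does.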
Post by Samofatov, Nickolay
Post by Paulo Gaspar
...
This is just the same as 1., isn't it?
Not really. Point 1 is about DBA-level monitoring. DBA sees how server
resources are consumed and can do corrective actions if necessary.
But if anomalous thing is detected by DBA (such as request or
transaction eating too much memory) he can capture the state of process,
send this dump to me and most probably I will be able to fix the issue
by reading data from dump taken on production system.
Now I understand.
Post by Samofatov, Nickolay
This is critically important thing for big servers which serve many
users and normally use much resources. Having accurate pools information
you can see that request X consumed 10 MB of RAM among 1000's requests
consuming a few GB in total. And during debugging you need to analyze
only problematic 10 MB, not entire process memory.
In the end the general problem is to track those unreleased objects.

I wonder, however, if so much debug baggage should be passed to the
production code. On most of the projects I worked on (in my C/C++ era) it
was quite feasible (while not trivial) to detect memory leaks during testing.

Most production problems that led to memory leaks had to do with locking
and other interaction problems between threads AND THEN the issue was
"why was that thread blocked".

In your experience, does a memory dump from a SINGLE thread help you
to diagnose that kind of issue???
Post by Samofatov, Nickolay
Post by Paulo Gaspar
Shit must NOT happen on production software, and when it can
not be avoided, it must be handled graciously.
Actually, having a policy of releasing memory in blocks
encourages shit to happen, since programmers will be tempted
to rely on the final block release.
You see, shit happens. It happens in application software, it happens in
server software.
By allocating memory from pool you set a constraint for lifetime of this
object. If it lives longer then this constraint allows then we obviously
have a problem. Quoting myself, "We can stop immediately producing
developer report after memory leak is detected encouraging developers to
fix the leak."
Without pools you would notice it only after server eats large
(noticeable externally) amount of RAM or terminates. Both is
unacceptable for serious server software.
Using the counter scheme described above you could have that too.

At the end of the request its counter should be back to 0.
Post by Samofatov, Nickolay
Post by Paulo Gaspar
Does this mean that I must think, object by
object, where am I going to place it? (Has it file handles?
Has it references to other resources? etc.)
You must think of object lifetime when you allocate it and allocate
object from the right pool.
If you create transient object with lifetime bound to stack you need to
use "automatic" thread-specific pool. Strings and container classes
declared on stack pick up this pool automatically. If you allocate
object for processing of particular request, you need to allocate it
from request pool, if you allocate object global for database then you
allocate it from database pool, etc.
Yes, I understand the concept of "object lifetime", also known as
"object lifecycle".
Post by Samofatov, Nickolay
Regarding allowing delete-by-pool semantics, there was a nice talk
between developers about it, let me translate one interesting piece from
--
There are 4 possible deallocation policies to some extent compatible
1. Allow deletion-by-pool for all allocations. This "hides" memory leaks
and while doing so may leave dangling references in some cases (but this
kind of code is unusual for current Firebird2).
2. Require explicit deletion of everything from the pool. Pool must be
empty on deallocation, otherwise - punt.
3. Require explicit deletion of everything from the pool, except blocks
allocated by special form of allocator.
4. Allow deletion-by-pool for everything, except blocks allocated by
special form of allocator. Presence of such blocks in pool at pool
deallocation time would punt the server.
Currently Firebird uses approach 1. I talked with Dmitry Emanov and he
agrees that long-term goal is to move to approach 3. In short, in new
code destructors must always be called and memory deallocated explicitly
unless you really, really understand what you are doing.
That is very interesting.

However, why 3 and not 4? And why not just 2?

Aren't the blocks allocated in a special way things like buffers and pages?
Aren't they larger in size and smaller in numbers?

If so, calling their destructors is not a big extra cost and you are sure
you are handling them one by one in a proper way (this way being able
to detect possible excessive memory consumption due to, let's say, an
unreleased and unreused buffer inside the request's lifetime).
Post by Samofatov, Nickolay
And finally, Paulo, rants, bad words and personal insults (saying that
core developer is dumb, fool, doesn't know C++, etc) are unnecessary
things.
The fact that such things are tolerated from Jim is only because he has
a big, big credit of respect as original author of Interbase/Firebird
which he spends like that.
Agreed.

However, in case those remarks target me, let's make 2 things clear:
- I already apologised for the "bad words" I used on FB-Architect because
I figure I was fighting fire with fire and that is not (at least in
this case) correct.

- But let's make it clear that I never intended to say any of you are
dumb, a fool or whatever, and I don't understand how it could come across
that way. Even in the FB-Architect post I displayed some admiration for
the technical skills and intellectual capacity of both persons involved.

And if I reference C++ documentation and books in several of my recent
posts, that has only to do with presenting my arguments to the wider range
of readers of this list, since it is common knowledge that some were more
proficient with C than with C++.

Best regards,
Paulo Gaspar
Dmitry Yemanov
2005-02-18 03:06:28 UTC
Permalink
Post by Paulo Gaspar
However, why 3 and not 4? And why not just 2?
(4) means no destructor calls unless asked for explicitly. I thought it's
being considered bad by almost everyone here. (2) is what Jim requires, but
there are cases when it's much better to remove an array of 1M objects as a
single entity instead of calling destructors 1M times. Given that an object
doesn't contain external dependencies and its destructor is quite trivial,
of course. This should be a quite rare case, but if it saves us 500% of the
CPU time, I go for it.

My personal bottom line:

a) pools are useful for bugchecks and memory tracing (other possible
benefits are left to Nickolay)
b) destructors must be called
c) exceptions from (b) are discussed one-by-one and implemented after a
performance research


Dmitry
Jim Starkey
2005-02-18 07:31:18 UTC
Permalink
Post by Dmitry Yemanov
Post by Paulo Gaspar
However, why 3 and not 4? And why not just 2?
(4) means no destructor calls unless asked for explicitly. I thought it's
being considered bad by almost everyone here. (2) is what Jim requires, but
there are cases when it's much better to remove an array of 1M objects as a
single entity instead of calling destructors 1M times.
If a destructor exists, presumably it does something important and must
be called. If it doesn't do anything, it shouldn't exist, and if it
doesn't exist, it won't be called.

The danger is when somebody says, "hey, I know what the destructor does,
and this time I can get away with not calling it." Then the class
internals change, the destructor becomes important, and way off in a
forgotten corner of the system is a piece of code that doesn't respect
the destructor. Boom.

As I have said, I don't see any problem with alternative memory managers
that disappear along with their allocations. But no object with a
destructor (or that needs a destructor) should be allocated from one.
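
The first half of that rule (no object with a destructor) can be checked by the compiler, which addresses the "class internals change" danger described above. A minimal sketch, assuming a hypothetical bump allocator called Arena; the trait std::is_trivially_destructible is standard C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical bump arena whose storage disappears in one go.
class Arena {
public:
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        // Compile-time guard: reject any type whose destructor does real work.
        static_assert(std::is_trivially_destructible<T>::value,
                      "types with destructors must not live in an Arena");
        void* p = allocate(sizeof(T), alignof(T));
        return new (p) T{std::forward<Args>(args)...};
    }
private:
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > sizeof(buffer_)) throw std::bad_alloc();
        used_ = start + size;
        return buffer_ + start;
    }
    alignas(std::max_align_t) unsigned char buffer_[4096];
    std::size_t used_ = 0;
};

struct Point { int x; int y; };        // no destructor: allowed
struct Named { std::string name; };    // has a destructor: rejected

int arenaGuardDemo() {
    Arena arena;
    Point* p = arena.create<Point>(3, 4);
    // arena.create<Named>();          // would fail to compile
    return p->x + p->y;
}
```

If someone later gives Point a member that needs cleanup, every arena.create<Point> call stops compiling instead of silently skipping the new destructor. The check cannot catch a class that needs a destructor but lacks one.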
Post by Dmitry Yemanov
a) pools are useful for bugchecks and memory tracing (other possible
benefits are left to Nickolay)
Nobody has suggested that pools be eliminated. The argument is over
delete-by-pool. Pools are good and useful things as long as they respect
the integrity of objects.
Post by Dmitry Yemanov
c) exceptions from (b) are discussed one-by-one and implemented after a
performance research
An examination of alternative strategies is also in order.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Paulo Gaspar
2005-02-18 21:10:23 UTC
Permalink
Sorry Dmitry: somehow I managed to swap 4 with 3 in my mind just after
reading it. Which was quite silly, since Nickolay even repeated what 3 was
in his conclusion.
=:o(

It seems that my bad cold ruining my nights (because of so
much coughing), while allowing me more time to participate in mailing
lists, is taking its toll on the way my brain works. =:o(

(Actually, at the moment I am just having to stand for a while after
sleeping just 4 hours, waiting for my cough to calm down.)

Thank you for clearing up my confusion.

Best regards,
Paulo
Post by Dmitry Yemanov
Post by Paulo Gaspar
However, why 3 and not 4? And why not just 2?
(4) means no destructor calls unless asked for explicitly. I thought it's
being considered bad by almost everyone here. (2) is what Jim
requires, but
Post by Dmitry Yemanov
there are cases when it's much better to remove an array of 1M objects as a
single entity instead of calling destructors 1M times. Given that an object
doesn't contain external dependencies and its destructor is quite trivial,
of course. This should be a quite rare case, but if it saves us 500% of the
CPU time, I go for it.
a) pools are useful for bugchecks and memory tracing (other possible
benefits are left to Nickolay)
b) destructors must be called
c) exceptions from (b) are discussed one-by-one and implemented after a
performance research
Dmitry
Adriano dos Santos Fernandes
2005-02-19 09:12:22 UTC
Permalink
Post by Dmitry Yemanov
(2) is what Jim requires, but
there are cases when it's much better to remove an array of 1M objects as a
single entity instead of calling destructors 1M times. Given that an object
doesn't contain external dependencies and its destructor is quite trivial,
of course. This should be a quite rare case, but if it saves us 500% of the
CPU time, I go for it.
Deleting (with delete []) an array of 1,000,000 objects 1000 times, with a destructor that does nothing, takes less than 5 seconds on my modest AMD XP 2200.
If the destructor is not defined it's instantaneous.
With a virtual destructor I didn't wait to see, but if you use virtual you should call the destructor anyway.
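For anyone who wants to repeat this kind of measurement, a rough sketch follows. The three struct names and timeArrayDelete are hypothetical, the timing covers allocation as well as deletion, and an optimizing compiler is allowed to remove some or all of this work, so the numbers are only indicative.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

struct NoDtor      { int v; };
struct EmptyDtor   { int v; ~EmptyDtor() {} };            // user-provided, does nothing
struct VirtualDtor { int v; virtual ~VirtualDtor() {} };

// Allocate and delete[] an array of count objects, reps times; return seconds.
template <typename T>
double timeArrayDelete(std::size_t count, int reps) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        T* items = new T[count];
        items[0].v = i;              // touch the array so it is not trivially dead
        delete[] items;
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling timeArrayDelete<EmptyDtor>(1000000, 1000) mirrors the experiment described above; comparing it against the NoDtor and VirtualDtor variants on the same build settings gives the three cases mentioned.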
Post by Dmitry Yemanov
a) pools are useful for bugchecks and memory tracing (other possible
benefits are left to Nickolay)
b) destructors must be called
c) exceptions from (b) are discussed one-by-one and implemented after a
performance research
I think destroying a pool that still has objects allocated should "assert" in debug build and log in release build.


Adriano
Jim Starkey
2005-02-19 09:23:25 UTC
Permalink
Post by Adriano dos Santos Fernandes
I think destroying a pool that still has objects allocated should "assert"
in debug build and log in release build.
During a transition period, pools will always have outstanding objects,
so it will be quite some time before we can enforce this. When we do,
however, I think attempting to delete a pool with objects outstanding
should throw an exception, leaving the memory pool (and objects)
intact. It is far better to have a memory leak than to turn a reference
to an "expunged" object into a certain server crash.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Adriano dos Santos Fernandes
2005-02-19 09:30:25 UTC
Permalink
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
I think destroying a pool that still has objects allocated should "assert"
in debug build and log in release build.
During a transition period, pools will always have outstanding objects,
so it will be quite some time before we can enforce this. When we do,
however, I think attempting to delete a pool with objects outstanding
should throw an exception, leaving the memory pool (and objects)
intact. It is far better to have a memory leak than to turn a reference
to an "expunged" object into a certain server crash.
But throwing exceptions in destructors (of the pool in this case) is a bad thing.


Adriano
Jim Starkey
2005-02-19 09:55:28 UTC
Permalink
Post by Adriano dos Santos Fernandes
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
I think destroy pool with have objects allocated should "assert" in
debug build and log in release build.
During a transition period, pools will always have outstanding
objects, so it will be quite some time before we an enforce this.
When we do, however, I think an attempting to delete a pool with
objects outstanding should throw an exception, leaving the memory
pool (and objects) intact. It is far better to have a memory leak
than to turn a reference to an "expunged" object into a certain
server crash.
But throwing exceptions in destructors (of the pool in this case) is a bad thing.
That's true, but the alternative is worse. It can't delete live objects
and it shouldn't quietly fail. The only other thing I can think of is
to do a quick analysis to see what object is dangling, what code
allocated it, who wrote that code, and send the miscreant a nasty email.

There are many brave souls who believe that software can recover from
unexpected bugchecks. I'm not one of them for a couple of reasons. One
reason is that no piece of code I have ever written has ever worked
first time, so I know with certainty that any code I write to handle an
unexpected bugcheck will fail. Probably other people are better at
writing perfect code, but I hope they aren't in the avionics business.
For airplanes, I want all code paths tested. The other is that in a
bugcheck there is no way to even guess what else might be corrupted.
For database systems, I think it is much better to quit with memory
corruption than to continue and corrupt the database on disk.

While we're on the subject, and since it's a weekend, here's another
boring DEC story. Many years ago I attended a computer conference with
another fellow from DEC research who had published a book on fault
tolerant programming. Just to check up, he went to his publisher's
booth and asked if they had anything on fault tolerance? No, the guy
said, we only do computer books; you need a publisher that does geology.

Fault tolerance is one of those subjects that I know enough to know that
I know very little.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
eg
2005-02-22 01:31:22 UTC
Permalink
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
I think destroy pool with have objects allocated should "assert" in
debug build and log in release build.
During a transition period, pools will always have outstanding
objects, so it will be quite some time before we an enforce this.
When we do, however, I think an attempting to delete a pool with
objects outstanding should throw an exception, leaving the memory
pool (and objects) intact. It is far better to have a memory leak
than to turn a reference to an "expunged" object into a certain
server crash.
But throwing exceptions in destructors (of the pool in this case) is a bad thing.
That's true, but the alternative is worse. It can't delete live objects
and it shouldn't quietly fail. The only other thing I can think of is
to do a quick analysis to see what object is dangling, what code
allocated it, who wrote that code, and send the miscreant a nasty email.
It is illegal to throw an exception from a destructor if the destructor
was called via an exception (it is considered a failure of the
exception-handling mechanism and leads to std::terminate() being
called). Exiting a destructor by throwing an exception is also a
violation of the standard library usage requirements. Finally, I believe
there is no way to provide any exception safety guarantee about a class
which does this. This in turn leads to what Herb Sutter refers to as the
canonical exception safety rule: Never allow an exception to escape from
a destructor or from an overloaded "operator delete()" or "operator
delete[]()".
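
A minimal sketch of the rule quoted above, assuming a hypothetical Resource class: the operation that can fail is exposed as an explicit close() that may throw, while the destructor catches and logs so that nothing escapes during stack unwinding.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource wrapper; the names are illustrative only.
class Resource
{
public:
    explicit Resource(std::vector<std::string>& log)
        : log_(log), open_(true), failOnClose_(false) {}

    void setFailOnClose(bool fail) { failOnClose_ = fail; }

    // Callers that care about failure call close() explicitly and may catch.
    void close()
    {
        if (!open_) return;
        open_ = false;
        if (failOnClose_)
            throw std::runtime_error("close failed");
    }

    // The destructor never lets an exception escape: it logs instead.
    ~Resource()
    {
        try { close(); }
        catch (const std::exception& e) { log_.push_back(e.what()); }
    }

private:
    std::vector<std::string>& log_;
    bool open_;
    bool failOnClose_;
};

// An exception is already propagating when the Resource destructor hits its
// own failure. Returns true if the original exception reached the handler,
// which it can only do because the destructor swallowed the second one.
bool survivesUnwinding(std::vector<std::string>& log)
{
    try
    {
        Resource r(log);
        r.setFailOnClose(true);
        throw std::logic_error("original failure");
    }
    catch (const std::logic_error&)
    {
        return true;
    }
}
```

Had ~Resource() rethrown instead of logging, the second exception would have escaped during unwinding and std::terminate() would have been called.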
Alex Peshkov
2005-02-22 03:06:29 UTC
Permalink
Post by eg
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
Post by Jim Starkey
Post by Adriano dos Santos Fernandes
I think destroy pool with have objects allocated should "assert" in
debug build and log in release build.
During a transition period, pools will always have outstanding
objects, so it will be quite some time before we an enforce this.
When we do, however, I think an attempting to delete a pool with
objects outstanding should throw an exception, leaving the memory
pool (and objects) intact. It is far better to have a memory leak
than to turn a reference to an "expunged" object into a certain
server crash.
But throwing exceptions in destructors (of the pool in this case) is a bad thing.
That's true, but the alternative is worse. It can't delete live
objects and it shouldn't quietly fail. The only other thing I can
think of is to do a quick analysis to see what object is dangling,
what code allocated it, who wrote that code, and send the miscreant a
nasty email.
It is illegal to throw an exception from a destructor if the destructor
was called via an exception (it is considered a failure of the
exception-handling mechanism and leads to std::terminate() being
called). Exiting a destructor by throwing an exception is also a
violation of the standard library usage requirements. Finally, I believe
there is no way to provide any exception safety guarantee about a class
which does this. This in turn leads to what Herb Sutter refers to as the
canonical exception safety rule: Never allow an exception to escape from
a destructor or from an overloaded "operator delete()" or "operator
delete[]()".
It seems that std::terminate() is really the best choice in such a case.
The server's memory is corrupted, and further execution easily leads to DB corruption.

A.
Jim Starkey
2005-02-22 04:28:08 UTC
Permalink
Post by Alex Peshkov
Post by eg
It is illegal to throw an exception from a destructor if the
destructor was called via an exception (it is considered a failure of
the exception-handling mechanism and leads to std::terminate() being
called). Exiting a destructor by throwing an exception is also a
violation of the standard library usage requirements. Finally, I
believe there is no way to provide any exception safety guarantee
about a class which does this. This in turn leads to what Herb Sutter
refers to as the canonical exception safety rule: Never allow an
exception to escape from a destructor or from an overloaded "operator
delete()" or "operator delete[]()".
It seems that std::terminate() is really best choice in such case.
Server's memory corrupted, further execution easily leads to DB corruption.
A called service should never terminate except in the most extreme of
circumstances. In my mind, detection of a memory leak does not qualify
as extreme.

There are at least two ways we can handle the situation. One is to add
a deletePool method that would throw an exception if the pool were
non-empty, otherwise perform a delete (the pool destructor would be
protected). The other is that deletePool marks the pool for deletion
when the last outstanding block is released. This has a certain amount
of elegance to it, but it can lead to unannounced memory leaks, which I
don't like. In most if not all circumstances, a developer should know
that a pool is empty before deleting it, and if it isn't, he should hear
about it sooner rather than later. There are other circumstances, though
I can't think of any in Firebird, where multiple threads are cooperating,
and marking a resource for delete is the best way to handle the problem.
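
A minimal sketch of the first option, assuming a hypothetical Pool class (this is not the Firebird or Vulcan pool): the destructor is protected, so the only way to destroy a pool is a deletePool method that throws and leaves the pool intact when blocks are still outstanding.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical pool; not the Firebird or Vulcan MemoryPool class.
class Pool
{
public:
    static Pool* create() { return new Pool; }

    // Throws and leaves the pool (and its blocks) intact if anything is
    // still outstanding; otherwise destroys the pool.
    static void deletePool(Pool* pool)
    {
        if (pool->outstanding_ != 0)
            throw std::logic_error("attempt to delete a non-empty pool");
        delete pool;
    }

    void* allocate(std::size_t size)
    {
        void* block = std::malloc(size);
        if (!block) throw std::bad_alloc();
        ++outstanding_;
        return block;
    }

    void release(void* block)
    {
        std::free(block);
        --outstanding_;
    }

    std::size_t outstanding() const { return outstanding_; }

protected:
    Pool() : outstanding_(0) {}
    ~Pool() {}   // protected: callers must go through deletePool()

private:
    std::size_t outstanding_;
};
```

Because the exception comes from an ordinary member function rather than from the destructor, it does not run into the destructor rule discussed earlier in the thread.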
eg
2005-02-23 00:33:15 UTC
Permalink
Post by Jim Starkey
Post by Alex Peshkov
Post by eg
It is illegal to throw an exception from a destructor if the
destructor was called via an exception (it is considered a failure of
the exception-handling mechanism and leads to std::terminate() being
called). Exiting a destructor by throwing an exception is also a
violation of the standard library usage requirements. Finally, I
believe there is no way to provide any exception safety guarantee
about a class which does this. This in turn leads to what Herb Sutter
refers to as the canonical exception safety rule: Never allow an
exception to escape from a destructor or from an overloaded "operator
delete()" or "operator delete[]()".
It seems that std::terminate() is really best choice in such case.
Server's memory corrupted, further execution easily leads to DB corruption.
A called service should never terminate except the most extreme of
circumstances. In my mind, detection of memory leak does not qualify as
extreme.
I agree for a called service, but before this discussion gets off track
too much, I am not sure if that is what was said.

Here is my take:
Someone suggested throwing an exception from a destructor as a potential
mechanism for handling a situation. I said, don't do that (see above)...
because if said destructor (which throws) is entered "because of an
exception" then std::terminate gets called... do not pass go, do not
collect $200, etc...

To which Alex Peshkov responded that "that may be the best choice"
because in that situation "Server's memory is corrupted" (to
paraphrase). He didn't mention a memory leak as far as I can tell.
I am not familiar enough with the internals to know that "if" the pool's
destructor is entered via an exception, what it says about the state of
the engine (it may be toast). That is an architectural question.

All I can say is, designing a system which throws exceptions out of
destructors is a recipe for trouble.
Alex Peshkov
2005-02-24 00:38:27 UTC
Permalink
Post by Jim Starkey
Post by Alex Peshkov
Post by eg
It is illegal to throw an exception from a destructor if the
destructor was called via an exception (it is considered a failure of
the exception-handling mechanism and leads to std::terminate() being
called). Exiting a destructor by throwing an exception is also a
violation of the standard library usage requirements. Finally, I
believe there is no way to provide any exception safety guarantee
about a class which does this. This in turn leads to what Herb Sutter
refers to as the canonical exception safety rule: Never allow an
exception to escape from a destructor or from an overloaded "operator
delete()" or "operator delete[]()".
It seems that std::terminate() is really best choice in such case.
Server's memory corrupted, further execution easily leads to DB corruption.
A called service should never terminate except the most extreme of
circumstances. In my mind, detection of memory leak does not qualify as
extreme.
Yes, it was too rash to say that we should terminate in case of a
detected memory leak. With the current codebase Firebird would terminate
too often :)
Post by Jim Starkey
There are at least ways we can handle the situation. One is to make add
a deletePool method that would throw an exception is the pool were
non-empty, otherwise perform a delete (the pool destructor would be
protected). The other is that deletePool makes the pool for deletion
when the last outstanding block was released. This has a certain amount
of elegance to it, but it can lead to unannounced memory leaks, which I
don't like. In most if not all circumstances, a developer should know
that a pool is empty before deleting it, and if it isn't, he should hear
about it sooner than later. There are other circumstances, though I
can't think of any in Firebird, where multiple threads are cooperating,
and marking a resource for delete is the best way to handle the problem.
If we choose the last approach (mark the pool for deletion), we may get
a very big memory leak (in one case memory leaks single bytes, in the
other, whole pools). I'm afraid that the only realistic approach for the
reasonably near future is to log the situation and delete the pool.
Taking into account that the current engine, which deletes pools bravely
and without logging, works, this should not be a great danger.
Jim Starkey
2005-02-24 04:47:23 UTC
Permalink
Post by Alex Peshkov
If we choose last approach (mark pool for deletion), we may get a very
big memory leak (one case when memory leaks single bytes, another -
whole pools). I'm afraid that the only realistic approach for the
reasonably near future if to log the situation and delete pool. Taking
into account that current engine, which follows deletes pools bravely
and without logging, works, this should not be great danger.
Absolutely not! There is no way that the system can possibly delete an
active object that it knows nothing about. This is an absolutely
guaranteed server crash!

If all attempts to delete a non-empty pool throw exceptions, memory leaks
will be detected and fixed during development. If we get leaks on a
production server, we accept the fact that there's a leak and get on
with life.
Alex Peshkov
2005-02-24 05:04:55 UTC
Permalink
Post by Jim Starkey
Absolutely not! There is no way that the system can possibly delete an
active object that it knows nothing about. This is an absolutely
guarenteed server crash!
Not always. Suppose object A created object B in the same pool but, due
to a bug in its destructor, did not release it. If there are no external
references to B, nothing bad will happen.
In general I absolutely agree with what you say, but the current state of
Firebird is not general - it is very, very specific.
Post by Jim Starkey
If all attempts to delete non-empty pool throw exceptions, memory leaks
will be detected and fixed during development.
Certainly, this is the best way to go.
Post by Jim Starkey
If we get leaks on a
production server, we accept the fact that there's a leak and get on
with life.
I don't even know which case is worse: a server that crashes once per
week, or a server that exhausts memory and takes the host's OS down with
it (for the w9x "operating system" that's quite real) once per day. I
hope this will not happen with our production version, but...
Maybe as a last resort we should forcibly restart the server when leaked
memory grows larger than normal?
Roman Rokytskyy
2005-02-24 05:27:10 UTC
Permalink
Post by Alex Peshkov
Not always. Object A created object B in the same pool, but due to the
bug in destructor had not released it. If there are no external
references to B, nothing bad will happen.
Probably an offtopic question - isn't there any chance to introduce
Java-like garbage collector to the C++ code and to forget about the memory
allocation issues at once? We just let GC (or pool) to traverse references,
build reference graphs and release eligible objects. What is wrong with this
idea?

Roman
Jim Starkey
2005-02-24 07:33:15 UTC
Permalink
Post by Roman Rokytskyy
Post by Alex Peshkov
Not always. Object A created object B in the same pool, but due to the
bug in destructor had not released it. If there are no external
references to B, nothing bad will happen.
Probably an offtopic question - isn't there any chance to introduce
Java-like garbage collector to the C++ code and to forget about the memory
allocation issues at once? We just let GC (or pool) to traverse references,
build reference graphs and release eligible objects. What is wrong with this
idea?
The short answer is no. Java garbage collection generally requires
quiescing the system to perform a garbage collect cycle (Sun abandons
garbage collection if a thread starts, Netfrastructure allocates objects
during garbage collection pre-marked, and most other JVMs freeze all
threads for the duration). None of these are possible in C++.
Something could be done by inheritance from a common GC class, but would
probably require substantial locking to solve the threading issues. But
the most serious problem is that Firebird is utterly type-unsafe.
Almost every block and piece of code casts pointers of something to
something else entirely different (Java, of course, does not permit a
cast to something an object isn't).

That's why it can't be done. The stronger argument is why it shouldn't
be done.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
marius popa
2005-02-25 12:26:12 UTC
Permalink
Post by Roman Rokytskyy
Post by Alex Peshkov
Not always. Object A created object B in the same pool, but due to the
bug in destructor had not released it. If there are no external
references to B, nothing bad will happen.
Probably an offtopic question - isn't there any chance to introduce
Java-like garbage collector to the C++ code and to forget about the memory
allocation issues at once? We just let GC (or pool) to traverse references,
build reference graphs and release eligible objects. What is wrong with this
idea?
Related to this thread i found a page with many references to gc (1875
references)

http://www.cs.kent.ac.uk/people/staff/rej/gc.html

"the Garbage Collection page is a comprehensive resource for automatic
dynamic memory management a.k.a garbage collection."

There are GCs for C++, but it has to be decided whether one is worth it.

Nickolay started using valgrind for detecting memory leaks in
firebird2.0; I will try to see what it really does.
--
Regards,

Marius - developer flamerobin.org
Jim Starkey
2005-02-25 12:50:01 UTC
Permalink
Post by marius popa
Nickolay started using valgrind for detecting memory leaks in
firebird2.0, i will try to see what it really does
I don't get it. A memory manager (aka pool manager) is a leak
detector. It has to know what memory is outstanding and what memory is
available for reuse. I can't imagine why another leak detector is
necessary other than the unfortunate fact that in Firebird 2, almost
nothing is released, so everything is a leak.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Jim Starkey
2005-02-24 07:14:12 UTC
Permalink
Post by Alex Peshkov
Post by Jim Starkey
Absolutely not! There is no way that the system can possibly delete
an active object that it knows nothing about. This is an absolutely
guarenteed server crash!
Not always. Object A created object B in the same pool, but due to the
bug in destructor had not released it. If there are no external
references to B, nothing bad will happen.
In general I'm absolutely agreed with what you say, but current state
of firebird is not general - it is very-very specific.
You can't and don't know anything about the dangling object. If it's a
live object, here is what can happen:

1. If the object owns a socket, the socket isn't closed.
2. If the object has opened a file, the file will remain open.
3. If the object has allocated system shared memory, that memory is lost.
4. If the object has allocated system semaphores, the semaphores are lost.
5. If the object has allocated memory outside the pool, that memory
is lost.
6. If the object holds reference counts to other objects, those
objects won't get deleted.
7. Any code using the object will get, best case, incorrect results,
but will probably crash.
8. Any writes to the object after release will corrupt the system
memory unless
9. The memory is reallocated, in which case write references to the
old object will corrupt the new object, leading to either
incorrect behavior or a system crash.
10. When the object is eventually deleted, the pool will be gone, and
the system will crash.

The fundamental fallacy with your argument is that you know everything
that is going on, what objects are in the pool, what objects those
objects allocate and from where, what other external resources might be
in use, etc. You can't know these things. And if you know them in the
morning, somebody else might check in perfectly valid code in the
afternoon that would invalidate your knowledge.

The way to make reliable systems is to make them simpler. The way to
make simpler systems is to move the locus of intelligence from the
"system" to a set of autonomous objects. Objects export behaviors but
their interior workings are private. This allows objects to work in
different contexts, permitting code reuse, but also reduces the level of
complexity of the entire system by orders of magnitude.

Firebird is a C program written in C++. Our next step is to reduce
its complexity with object technology. That requires that you accept
the fundamental definition of objects as opaque, encapsulated entities.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Claudio Valderrama C.
2005-02-18 21:37:22 UTC
Permalink
Post by Jim Starkey
All that said, the current numbers for a small block (usually < 4K)
allocation and release is around 600 picoseconds on a cheap P4, not
something to lose a lot of sleep over.
600 picoseconds? I wish. Please make that 600 *nanoseconds*, which
still isn't that shabby.
I thought they were 600 femtoseconds.
:-)

C.
Sergio Samayoa
2005-02-19 09:19:15 UTC
Permalink
Post by Jim Starkey
2 are deleted by pool deletion. To make this work, every resource
controlled by an object subject to delete-by-pool must also be pool
Call me ignorant (I don't know anything about FB internals), but couldn't
this be done by defining a "PooledObject" interface, implemented by all
objects that must be put in the pool?

The pool just knows that it contains "PooledObject" implementors, and
when the pool wants to destroy a contained "PooledObject" it just calls
something like canDestroy() to check whether the object can be destroyed,
then something like destroy()?
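
A minimal sketch of the proposal, using the names from the post (PooledObject, canDestroy(), destroy()); ObjectPool and Counted are hypothetical helpers added for illustration. It also makes the open question visible: something has to be decided about the objects that refuse.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface following the names used in the post.
class PooledObject
{
public:
    virtual ~PooledObject() {}
    virtual bool canDestroy() const = 0;
    virtual void destroy() = 0;
};

// Hypothetical pool that only tracks objects implementing the interface.
class ObjectPool
{
public:
    void add(PooledObject* object) { objects_.push_back(object); }

    // Destroys every object that agrees; returns the number that refused
    // and therefore remain registered in the pool.
    std::size_t destroyAll()
    {
        std::vector<PooledObject*> refused;
        for (std::size_t i = 0; i < objects_.size(); ++i)
        {
            if (objects_[i]->canDestroy())
                objects_[i]->destroy();
            else
                refused.push_back(objects_[i]);
        }
        objects_.swap(refused);
        return objects_.size();
    }

private:
    std::vector<PooledObject*> objects_;
};

// Illustrative implementor that counts destroy() calls.
class Counted : public PooledObject
{
public:
    Counted(bool willing, int& destroyed)
        : willing_(willing), destroyed_(destroyed) {}
    bool canDestroy() const { return willing_; }
    void destroy() { ++destroyed_; }

private:
    bool willing_;
    int& destroyed_;
};
```

In this sketch a refusing object simply stays in the pool, which is only one of several possible policies.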

PS: "keeping on good terms with both God and the Devil"
Jim Starkey
2005-02-19 09:30:16 UTC
Permalink
Post by Sergio Samayoa
Post by Jim Starkey
2 are deleted by pool deletion. To make this work, every resource
controlled by an object subject to delete-by-pool must also be pool
Tell me ignorant (I don't known nothing about FB internals) but this could
be done defining an "PooledObject" interface and implemented by all objects
which must be put in the pool?.
It's more or less done that way now. There are three problems with it.
First, it doesn't work with objects that don't inherit from
PooledObject, which means the project can't reuse classes developed for
other contexts. Second, it means that Firebird objects can't be used in
other contexts, such as the utilities, that don't follow "pool"
rules. Third, it does nothing about pointers to objects in a deleted
pool held by objects outside of the pool.
Post by Sergio Samayoa
Pool just known that he contains "PooledObject" implementators and when pool
want to destroy contained "PooledObject" just call something like
canDestroy() to check if object can be destroyed then something like
destroy()?
What happens if the object says no?
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Sergio Samayoa
2005-02-19 10:03:29 UTC
Permalink
Post by Jim Starkey
First, it doesn't work with objects that don't inherit from
PooledObject, which means the project can't reuse classes developed for
mmm...
I'm a Java developer, so I thought that C++ classes could implement
interfaces without inheritance. The pool could ask whether a pooled
object is an instanceof the interface, cast to the interface, and call
the methods I mentioned.
Post by Jim Starkey
What happens if the object says no?
mmm...
Good question.
It's up to the pool implementer.
If the pool is shutting down, it could ignore the answer.
Jim Starkey
2005-02-19 10:32:36 UTC
Permalink
Post by Sergio Samayoa
Post by Jim Starkey
First, it doesn't work with objects that don't inherit from
PooledObject, which means the project can't reuse classes developed for
mmm...
I'm Java developer then I though that C++ classes can implement interfaces
without inheritance. Pool can ask if pooled object is instanceof the
interfaces and cast with the interface and call the methods I said.
This doesn't work nearly as well in C++ as in Java. Runtime type
information was not part of the original C++ language. It has been
added, but introduces enough link time problems that all compilers that
I've used disable it by default. The object world has never quite
settled on whether run time type information (RTTI) is a good thing or
not. The persistent pattern seems to be that large projects usually
build it in themselves for regrettable reasons, and builtin RTTI
facilities are only marginally better than ad hoc.
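
For reference, the closest C++ analogue of Java's instanceof is dynamic_cast, which relies on builtin RTTI and therefore fails to compile where RTTI has been switched off. A minimal sketch, assuming RTTI is enabled; all class names are hypothetical.

```cpp
// Hypothetical classes for illustration; RTTI must be enabled.
class Block
{
public:
    virtual ~Block() {}
};

class PooledObject
{
public:
    virtual ~PooledObject() {}
    virtual bool canDestroy() const = 0;
};

// A block that does not implement the interface.
class PlainBlock : public Block {};

// A block that "implements the interface" the only way C++ allows:
// by inheriting from it.
class PooledBlock : public Block, public PooledObject
{
public:
    bool canDestroy() const { return true; }
};

// C++ counterpart of Java's `object instanceof PooledObject`:
// a cross-cast that yields a null pointer when the object is not one.
bool isPooled(Block* object)
{
    return dynamic_cast<PooledObject*>(object) != 0;
}
```

Note that the implementing class still has to inherit from PooledObject; C++ has no way to satisfy an interface without inheritance.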

We had a nice brawl about this about a year ago. The context then was
exception objects. One camp felt RTTI was necessary for exception
handlers to be able to distinguish between exception types. The other
camp argued that a type hierarchy of exception objects made this
unnecessary. In typical fashion, it was resolved by Firebird 2 doing it
one way and Vulcan the other. Oh, boy, another merge issue! My
favorite! (reference upon request).
Post by Sergio Samayoa
Post by Jim Starkey
What happens if the object says no?
mmm...
Good question.
It's up to the pool implementer.
No, I disagree. We should decide what behavior is correct.

Personally, I would like the policy to be "no dangling objects." If the
pool policy is a noisy, logged memory leak, programmers will be less
inclined to consider that in just their exceptional instance, a little
bit of memory is ok to lose. Let's do those guys a favor and keep them
honest.
Post by Sergio Samayoa
If pool is shutting down could ignore.
If the server is shutting down, yes. If somebody is deleting a
temporary pool, that is a different question.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Sergio Samayoa
2005-02-19 10:59:30 UTC
Permalink
Post by Jim Starkey
This doesn't work nearly as well in C++ as in Java. Runtime type
Thank you for the clarification.
Claudio Valderrama C.
2005-02-20 19:49:04 UTC
Permalink
Post by Jim Starkey
This doesn't work nearly as well in C++ as in Java. Runtime type
information was not part of the original C++ language.
Maybe you are referring to C with Classes? The typical C++ of 1990 had some
RTTI, at least as far as I can remember. Sure, the language was older, but
commercial implementations other than AT&T's own didn't appear immediately.
Post by Jim Starkey
It has been
added, but introduces enough link time problems that all compilers
that I've used disable it by default. The object world has never
quite settle on whether run time type information (RTTI) is a good
thing or not. The persistent pattern seems to be that large projects
usually build it in themselves for regrettable reasons, and builtin
RTTI facilities are only marginally better than ad hoc.
The alternative is to have a bulky RTTI like Delphi has. I'm personally not
a fan of paying that price. I would welcome some sort of extensive RTTI.
Post by Jim Starkey
We had a nice brawl about this about a year ago. The context then was
exception objects. One camp felt RTTI was necessary for exception
handlers to be able to distinquish between exception types. The other
camp argued that a type hierarchy of exception objects made this
unnecessary. In typical fashion, it was resolved by Firebire 2 do it
one way and Vulcan the other. Oh, boy, another merge issue! My
favorite! (reference upon request).
Asking for the typeid of an object is the last resort. A set of catch blocks
ordered by hierarchy always looks clearer in code. The clearest approach
cannot always be taken, of course.
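
A minimal sketch of catch blocks ordered by hierarchy, most derived first. The exception classes are hypothetical and are not the ones used by Firebird 2 or Vulcan.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception hierarchy for illustration only.
class EngineError : public std::runtime_error
{
public:
    explicit EngineError(const std::string& text) : std::runtime_error(text) {}
};

class LockConflict : public EngineError
{
public:
    explicit LockConflict(const std::string& text) : EngineError(text) {}
};

enum Thrown { ThrowLockConflict, ThrowEngineError, ThrowOther };

// Handlers are listed from most derived to most general, so each exception
// lands in the most specific handler without any explicit typeid() test.
std::string classify(Thrown what)
{
    try
    {
        if (what == ThrowLockConflict) throw LockConflict("deadlock");
        if (what == ThrowEngineError) throw EngineError("bugcheck");
        throw std::runtime_error("unrelated");
    }
    catch (const LockConflict&)   { return "retry"; }
    catch (const EngineError&)    { return "report"; }
    catch (const std::exception&) { return "unknown"; }
}
```

Reversing the order would send every LockConflict to the EngineError handler, which is why the ordering matters.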
Post by Jim Starkey
Personally, I would like the policy to be "no dangling objects." If
the pool policy is a noisy, logged memory leak, programmers will be
less inclined to consider that in just their exceptional instance, a
little bit of memory is ok to loose. Let's do those guys a favor and
keep them honest.
There are two problems. On one side, there are lazy people who don't
deallocate what they allocated dynamically. Of course, better alternatives
exist, or the same deallocation has to be done in normal code and in a
catch block. Not deallocating in an implementation without garbage
collection is bad programming to me. On the other side, since bugs exist,
leaks may happen. If the leak happens in a section called too often, we
have dramatic memory consumption.

C.
Jim Starkey
2005-02-21 04:32:00 UTC
Permalink
Post by Claudio Valderrama C.
There are two problems: for one side, lazy people that don't deallocate what
they allocated dynamically. Of course, better alternatives exists or the
same deallocation has to be done in normal code and in a catch block. Not
deallocating in an implementation without garbage collection is bad
programming to me. In the other side, since bugs exist, leaks may happen. If
the leak happens in a section called too often, we have dramatic memory
consuption.
The traditional way to handle bugs is to fix them. A policy of "no
dangling objects" means that a memory leak will throw an exception
reporting (when built for debug) the file and line number of the
allocation. That should be a pretty good indication that there is a bug
and a pretty good hint where the problem originated. The alternative,
like Apache, is to tolerate buggy code, and to make it difficult for
conscientious programmers to find their leaks.

There has been a great deal of discussion of object integrity and object
referential integrity. There seems to be a growing consensus that
deleting an object without calling its destructor and/or with
outstanding pointers to it in other objects is a bad thing. Do you have
an opinion on this question?
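
A minimal sketch of a pool that records the file and line of each allocation and reports one of them when it is checked while non-empty. TrackingPool and POOL_ALLOC are hypothetical names; the thread does not show the actual Vulcan interface.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical tracking pool for illustration only.
class TrackingPool
{
public:
    void* allocate(std::size_t size, const char* file, int line)
    {
        void* block = std::malloc(size);
        if (!block) throw std::bad_alloc();
        Origin origin = { file, line };
        live_[block] = origin;        // remember where the block came from
        return block;
    }

    void release(void* block)
    {
        live_.erase(block);
        std::free(block);
    }

    // Throws, naming the allocation site of one dangling block, if any remain.
    void checkEmpty() const
    {
        if (live_.empty()) return;
        const Origin& origin = live_.begin()->second;
        std::ostringstream text;
        text << "dangling object allocated at "
             << origin.file << ":" << origin.line;
        throw std::logic_error(text.str());
    }

private:
    struct Origin { const char* file; int line; };
    std::map<void*, Origin> live_;
};

// Records the call site automatically.
#define POOL_ALLOC(pool, size) (pool).allocate((size), __FILE__, __LINE__)
```

A pool built this way is itself the leak detector: the bookkeeping it needs in order to know what is outstanding is the same information a leak report needs.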
Rented Mule
2005-02-22 08:04:24 UTC
Permalink
You are joking, right? A memory leak by definition is
a failure of the system. The first detection of leaked
memory could be loosely related to the first coughs of
pneumonia. Something is wrong and if you don't address
it, you WILL DIE. The same goes for a system that runs
a service that leaks memory. Eventually, it WILL reach
a failed state. Especially when you consider the fact
that this is generally run as a background service.

Jason
Post by Jim Starkey
A called service should never terminate except the
most extreme of
circumstances. In my mind, detection of memory leak
does not qualify as
extreme.
Jim Starkey
2005-02-22 08:15:20 UTC
Permalink
Post by Rented Mule
You are joking, right? A memory leak by definition is
a failure of the system. The first detection of leaked
memory could be loosly related to the first coughs of
Pneumonia. Something is wrong and if you don't address
it, you WILL DIE. The same goes for a system that runs
a service that leaks memory. Eventually, it WILL reach
a failed state. Especially when you consider the fact
that this is generally ran as a background service.
No, I'm not joking. 98% of the objects in Firebird aren't released now.
It's going to be a long, hard slog to get from here to reasonably clean
object discipline. In the meantime, we're going to suffer leaks.
Committing suicide on the first one is going to slow down the process.

But, no, a memory leak is a flaw, not a failure. Remember the
alternative is to delete the object anyway, which is worse. The goal
is a system that doesn't leak. A step towards getting there is an
allocation mechanism that complains when memory is lost.
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Paulo Gaspar
2005-02-22 12:03:51 UTC
Permalink
Maybe "Rented Mule" is confusing "memory leak" with "memory corruption".

I believe Jim is absolutely right about this. Leaks should be detected as
soon as possible, but the system should handle them as gracefully as
possible.

Memory corruption (where you can have overlapping allocated blocks and
so on) is the really nasty one that should cause immediate termination
of the server. As far as I remember, Jim also defended immediate
termination of the server in case of memory corruption in the recent
past.

BTW, most production servers I know of, even in Java, have memory leaks.


Regards,
Paulo Gaspar
Leyne, Sean
2005-02-25 13:32:01 UTC
Permalink
Jim,
Post by Jim Starkey
Post by marius popa
Nickolay started using Valgrind for detecting memory leaks in
Firebird 2.0; I will try to see what it really does.
I don't get it. A memory manager (aka pool manager) is a leak
detector.
Valgrind is not a memory manager -- it is a memory debugging/monitoring
tool.


Sean
Jim Starkey
2005-02-25 13:52:59 UTC
Permalink
Post by Leyne, Sean
Valgrind is not a memory manager -- it is a memory debugging/monitoring
tool.
I thought the memory pool was supposed to be a memory
debugging/monitoring tool. Why does a Firebird2 pool manager need a
debugging/monitoring tool? And what does valgrind use for a memory
debugging/monitoring tool? And how many levels does this all go?

For the record, the Vulcan MemMgr class manages pools, detects memory
related bugs, and monitors memory usage by count and size. It also
contains the structures to report, at any time, the number and total
size of objects allocated, by source file name and line number, as well
as the number and total size of objects released pending reuse, also by
source file and line number (I didn't bring the pool analysis code
because I didn't think anyone cared). If you'd like to see what
you can do with this information, take a look at
http://www.netfrastructure.com/analysis/home.nfs?a=analysis&mem=analysis
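
For readers who want a picture of per-call-site accounting, here is a
rough sketch of counting live objects and bytes by source file and line.
It is an assumption about how such bookkeeping could look, not the Vulcan
MemMgr code; TrackingPool, SiteStats and POOL_ALLOC are hypothetical
names, and the "released pending reuse" totals are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>

// Totals kept for one allocation site (hypothetical structure).
struct SiteStats {
    std::size_t liveCount = 0;   // objects currently allocated from this site
    std::size_t liveBytes = 0;   // total size of those objects
};

class TrackingPool {
public:
    using Site = std::pair<std::string, int>;   // (source file, line number)

    void* allocate(std::size_t size, const char* file, int line) {
        void* p = std::malloc(size);
        if (!p) return nullptr;
        Site site(file, line);
        blocks_[p] = Block{size, site};
        SiteStats& stats = sites_[site];
        ++stats.liveCount;
        stats.liveBytes += size;
        return p;
    }

    void release(void* p) {
        auto it = blocks_.find(p);
        if (it == blocks_.end()) return;   // unknown block: ignored here
        SiteStats& stats = sites_[it->second.site];
        --stats.liveCount;
        stats.liveBytes -= it->second.size;
        blocks_.erase(it);
        std::free(p);
    }

    // Report the current totals for one site; zeros if the site is unknown.
    SiteStats statsFor(const char* file, int line) const {
        auto it = sites_.find(Site(file, line));
        return it == sites_.end() ? SiteStats{} : it->second;
    }

private:
    struct Block { std::size_t size; Site site; };
    std::map<void*, Block> blocks_;     // live block -> size and origin
    std::map<Site, SiteStats> sites_;   // origin -> running totals
};

// Records the caller's file and line automatically.
#define POOL_ALLOC(pool, size) (pool).allocate((size), __FILE__, __LINE__)
```

A report "at any time" is then just a walk over the per-site totals.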
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
Daniel Rail
2005-02-25 14:19:52 UTC
Permalink
Hi Jim,
Post by Jim Starkey
I thought the memory pool was supposed to be a memory
debugging/monitoring tool. Why does a Firebird2 pool manager need a
debugging/monitoring tool? And what does valgrind use for a memory
debugging/monitoring tool? And how many levels does this all go?
Before going any further, you should read the overview of Valgrind:
http://valgrind.kde.org/overview.html

It's an external debugging tool. And, it's not compiled within
Firebird.
--
Best regards,
Daniel Rail
Senior System Engineer
ACCRA Group Inc. (www.accra.ca)
ACCRA Med Software Inc. (www.filopto.com)
Leyne, Sean
2005-02-25 14:22:37 UTC
Permalink
Jim,
Post by Jim Starkey
Post by Leyne, Sean
Valgrind is not a memory manager -- it is a memory debugging/monitoring
tool.
I thought the memory pool was supposed to be a memory
debugging/monitoring tool.
Memory pools provide a better means of debugging/monitoring memory usage,
but they are not intended to provide the full scope of debugging
requirements (not without a LOT of work).
Post by Jim Starkey
Why does a Firebird2 pool manager need a debugging/monitoring tool?
How do you find memory / variable allocation errors?
Post by Jim Starkey
And what does valgrind use for a memory debugging/monitoring tool?
Valgrind is the memory debugging tool. Have a read:

http://valgrind.kde.org/
Post by Jim Starkey
For the record, the Vulcan MemMgr class manages pools, detects memory
related bugs
I doubt that it can detect all memory errors, which is what Valgrind does.

There are conditions which a memory manager can never catch since they
involve accesses to memory not allocated via the manager.


Sean
Jim Starkey
2005-02-25 14:43:50 UTC
Permalink
Post by Leyne, Sean
Memory pools provide a better means of debugging/monitoring memory usage,
but they are not intended to provide the full scope of debugging
requirements (not without a LOT of work).
It's a very good investment. Many of the internal debugging tools in
Firebird have atrophied or completely disappeared. Investment in
internal debugging tools would make it a great deal easier to track down
bugs.
Post by Leyne, Sean
How do you find memory / variable allocation errors?
First by checking the object prolog. In MemMgr, only a valid object
has a pointer to the pool. Then you can check to see if the address
falls within the general bounds of the memory belonging to the pool
(MemMgr hasn't needed to do this). Then, if compiled for debugging, it
checks the state of guard bytes.
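
The checks described above can be sketched roughly as follows. This is an
illustration under assumptions, not the MemMgr source: GuardedPool,
Prolog, Check, the 0xA5 guard pattern and the eight guard bytes are all
made up for the example. Released blocks are parked rather than handed
back to malloc so that a second release can still read the cleared
prolog.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <vector>

class GuardedPool;

// Prolog stored immediately before each user block (hypothetical layout).
struct Prolog {
    GuardedPool* pool;   // points at the owning pool only while the block is live
    std::size_t size;    // user-visible size in bytes
};

enum class Check { Ok, InvalidProlog, Overrun };

class GuardedPool {
public:
    static constexpr unsigned char kGuard = 0xA5;   // assumed guard pattern
    static constexpr std::size_t kGuardBytes = 8;   // assumed guard length

    ~GuardedPool() {
        for (void* raw : parked_) std::free(raw);
    }

    void* allocate(std::size_t size) {
        void* raw = std::malloc(sizeof(Prolog) + size + kGuardBytes);
        if (!raw) return nullptr;
        new (raw) Prolog{this, size};
        unsigned char* body = static_cast<unsigned char*>(raw) + sizeof(Prolog);
        std::memset(body + size, kGuard, kGuardBytes);   // guard bytes after the block
        return body;
    }

    // Prolog check first, then guard bytes.
    Check validate(const void* user) const {
        const unsigned char* body = static_cast<const unsigned char*>(user);
        const Prolog* prolog =
            reinterpret_cast<const Prolog*>(body - sizeof(Prolog));
        if (prolog->pool != this) return Check::InvalidProlog;
        for (std::size_t i = 0; i < kGuardBytes; ++i) {
            if (body[prolog->size + i] != kGuard) return Check::Overrun;
        }
        return Check::Ok;
    }

    Check release(void* user) {
        Check result = validate(user);
        if (result != Check::Ok) return result;   // complain, leave the block alone
        unsigned char* raw = static_cast<unsigned char*>(user) - sizeof(Prolog);
        reinterpret_cast<Prolog*>(raw)->pool = nullptr;   // a double delete now fails
        parked_.push_back(raw);                           // kept until the pool dies
        return Check::Ok;
    }

private:
    std::vector<void*> parked_;   // released blocks, freed with the pool
};
```

A double delete and a block released to the wrong pool both show up as
InvalidProlog; a write past the end of the block shows up as Overrun.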

A double delete is the most likely problem, followed closely by a block
overrun, though I have to admit that an invalid pool pointer in the
prolog rarely gets as far as the memory manager's release code. The
Firebird 2 memory manager is usually deep in the bowels of the B+
tree before it dies, having already destroyed the evidence.

Another useful trick is having separate entrypoints for debug/non-debug
allocations so a mix of modules compiled with different options doesn't
crash and burn.
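
One reading of the separate-entrypoints trick, as a sketch: the allocator
always exports both a plain and a debug entry point, and a compile option
only chooses which one a module calls. The names below (poolAllocate,
poolAllocateDebug, POOL_NEW, POOL_DEBUG) and the counters are
hypothetical and exist only to make the example observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical bookkeeping so the example can show which entry point ran.
static int g_plainCalls = 0;
static int g_debugCalls = 0;
static const char* g_lastFile = nullptr;
static int g_lastLine = 0;

// Non-debug entry point: always present in the allocator.
void* poolAllocate(std::size_t size) {
    ++g_plainCalls;
    return std::malloc(size);
}

// Debug entry point: also always present, and records the call site.
void* poolAllocateDebug(std::size_t size, const char* file, int line) {
    ++g_debugCalls;
    g_lastFile = file;
    g_lastLine = line;
    return std::malloc(size);
}

// Each module picks an entry point when it is compiled. Because both
// functions exist with fixed signatures, a module built with POOL_DEBUG
// and one built without it can share the same allocator.
#ifdef POOL_DEBUG
#define POOL_NEW(size) poolAllocateDebug((size), __FILE__, __LINE__)
#else
#define POOL_NEW(size) poolAllocate(size)
#endif
```

The design point is that the debug option never changes the signature or
meaning of an existing entry point; it only selects a different one.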
Post by Leyne, Sean
Post by Jim Starkey
For the record, the Vulcan MemMgr class manages pools, detects memory
related bugs
I doubt that it can detect all memory errors, which is what Valgrind does.
Gosh, the debugger catches the really gross ones before MemMgr (or
Valgrind) even has a chance.
Post by Leyne, Sean
There are conditions which a memory manager can never catch since they
involve accesses to memory not allocated via the manager.
Why would anyone want to bypass the memory manager?
--
Jim Starkey
Netfrastructure, Inc.
978 526-1376