Don't
Thinking Out Loud.Donald Noyes.20090224.m05
I have heard this phrase over and over again as an argument for efficiency, and it is typified by such statements as "Once And Only Once". It seems that if something is to be done or recorded, then only one process, organization, or organism should do it.
This is unnatural, being against the nature of the way things are. Why take but one picture of a family member?
We as human beings are unique, separate, and autonomous. Nature provides for duplication and multiplication as a defense against destruction and extinction.
In the computing and communicating mediums we often hear from another camp: all "x" should be handled by "y", we don't need "a" through "w" anymore, they are either "fat", "thin", "hot", or "cold", and they are not "just right", and will run in "z".
In this medium we are forever abandoning things found Useful Usable Used for other things perceived to be "better". Often they are. Who would argue that an 8-inch floppy disk is better for storage than a Flash Drive? Who would rather have a machine with a 20 Meg Hard Drive over one with a 500 Meg Hard Drive? This is progress, this is improvement, and this is change for the better. But here we are talking about New Generations of the same basic thing, "storage".
But to only have "one machine", "one hard drive", or "one user" is an obviously unseemly scenario.
I say that duplication, multiplication, increases, are healthy and natural. That people misuse and mishandle the things we have is no argument for their extinction. (the user, or the used)
The internet is a perfect example of multiplicity and an argument for duplication. While some may say "it is too much", I think that having several million blades of grass in a lawn is far better than a clump or two.
What's your take on this?
I'd rather have a lawn with a clump or two of grass, because I wouldn't have to mow it.
Suggestions to Avoid Duplication and employ Once And Only Once refer to developer-authored mechanisms. Essentially, they state that an object (in the broad sense, which could include classes, functions, templates, table definitions, etc.) should be defined in one place. Defining it in more than one place is redundant, and likely to result in anomalous behaviour when (say) definition A gets updated but definition A' does not. It requires additional, otherwise-unnecessary developer effort to maintain redundant definitions. However, this does not apply to data-preservation and validation mechanisms such as redundant storage, multiple data-entry, backups, etc., which rely on duplication to ensure reliability and/or longevity.
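A minimal sketch of that distinction, in Python. All names here (signup_accepts, MAX_USERNAME_LEN, and so on) are hypothetical and chosen only for illustration. The first pair of functions holds the same rule in two places, and one copy has drifted; the second version defines the rule once.

```python
# Hypothetical example: one validation rule defined twice (A and A').
def signup_accepts(name: str) -> bool:
    # Definition A: updated to allow up to 32 characters.
    return 0 < len(name) <= 32

def rename_accepts(name: str) -> bool:
    # Definition A': never updated, still enforces the old limit of 16.
    return 0 < len(name) <= 16

# Once And Only Once: the rule lives in a single place.
MAX_USERNAME_LEN = 32

def is_valid_username(name: str) -> bool:
    return 0 < len(name) <= MAX_USERNAME_LEN

def signup_accepts_v2(name: str) -> bool:
    return is_valid_username(name)

def rename_accepts_v2(name: str) -> bool:
    return is_valid_username(name)

name = "x" * 20
# The duplicated definitions disagree about the same input.
anomaly = signup_accepts(name) != rename_accepts(name)
# The single definition cannot disagree with itself.
consistent = signup_accepts_v2(name) == rename_accepts_v2(name)
```

None of this says anything about how many copies of the stored usernames you keep; backups of the data are a separate matter from copies of the rule.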
Wow, Donald! That straw man is burned!
Seriously, I've never once heard people suggest Once And Only Once applies to family photos, or that Duplicates Are Bad applies to hard drives. If anything, language or library support for those would probably fall under the Zero One Infinity rule.
My take on this is that you're confusing software (or any design, for that matter) with instances of a design.
Software must remain simple. Multiplicity inside software is a horrific thing, a punishment not even the folks at Gitmo considered for its inhumanity. You may think it enjoyable, perhaps even employment preserving, to maintain software rife with duplication and unnecessary lack of abstraction. I do not. I have far better things to do with my time, like actually solve customer problems.
However, instances of a design may multiply up to the limits of its ecosystem. You can have any number of threads executing the same binary image as you want, and it might even be helpful, as long as you don't run into the server's memory bottleneck. Even if you never do, you can only support so many threads per process before the kernel runs out of threads for that process.
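To make the design-versus-instance point concrete, here is a small sketch using Python's standard threading module. The function name handle_request and the count of eight threads are arbitrary choices for illustration. There is one definition of the work, and many concurrent instances of it.

```python
import threading

def handle_request(n: int, results: list, lock: threading.Lock) -> None:
    # One definition of the work (the "design").
    with lock:
        results.append(n * n)

results = []
lock = threading.Lock()

# Many instances of that single definition, limited only by what the
# host can support.
threads = [
    threading.Thread(target=handle_request, args=(i, results, lock))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Changing how a request is handled means editing handle_request once, no matter how many threads run it.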
Can you imagine how big the Linux kernel would be if we didn't have the subroutine abstraction? You think Windows was big?
Of course, ideally, you want multiple species to balance each other out. Humans, left to their own devices, are an unbelievably destructive species. But, we're not alone. Invasive species in the swamps of Florida are showing an incredible ability to decimate local fish stocks. Conservationists are powerless to stop them. Why? Because their natural predators don't exist to keep them from wildly affecting the local ecosystem. Once new predators exist, a new balance will be reached, allowing for more equitable resource sharing amongst the species.
Unfortunately, nobody in the software community has yet learned how to make programs compete for resources in a manner like plants and animals do in the real world. I'm not entirely sure I'd want that either -- think about it -- you're happily browsing your web pages when all of a sudden portions of your browser window start to wig out. Curious, you look into what's happening, and you see that your e-mail client and the desktop itself are fighting the browser for RAM. Eventually, the "pack" of software defeats the browser, and it just dies a slow, miserable death. The desktop and e-mail clients consume the image formerly held by the browser, thus assimilating its bits into themselves for their use. Sound preposterous? I sure hope so.
While people in the software community don't have much interest in writing programs that eat pieces of one program then later relieve themselves upon another, the software community has, in fact, come up with a number of both competitive and cooperative approaches to resource management. I do suggest, however, that we want the software world to be considerably less scary and more intellectually tractable than our own. Safety, security, transactions, verifiable trust, understandable economy or market-based principles for resource management, etc. are much preferable to viruses, worms, sniffers, zombies, etc.
Computers presently manage their resources far more efficiently than any market-based solution ever could. This is because computers run via dictatorship, and are generally unbiased; the kernel knows all, and therefore can make superior decisions. The whole reason dictatorships do not work in the Real World is because humans don't have all the information they need to make (good) decisions, and are easily biased to support one group over another (and often already are when they come to power).
Dictatorships don't work in computer systems, either, at least not any more generally or efficiently than they do in human societies. First, kernels do not "know all" - critically, they (a) do not at implementation time know exactly what programs will be running atop them, and (b) cannot at runtime generally confirm or deny any interesting properties with which to make better decisions (Rice's Theorem). Second, while it may be possible to effectively apply imperial resource management when dealing with very well defined systems (e.g. programming on embedded systems without any desire for runtime upgrade or ad-hoc runtime observation), competition and such get involved the moment independently developed or controlled processes with differing goals start demanding access to a common set of resources (resources including sensors, actuators, energy, CPU, memory, bandwidth, priority scheduling, etc.). We would do well to recognize this competition and tame it (e.g. with market-driven resource policies, contracts, credit, trust, accounting, auditing, etc.). I suspect doing so will allow far more efficient and effective resource management - both competitive and cooperative - than is currently achieved by modern imperative approaches.
RE: 'kernels do not "know all"': But it knows what resources are available, and when to disperse them to requesting applications. [Irrelevant. A kernel that merely services resources upon request is not 'making decisions' at all, much less intelligent decisions based on having 'all' the information.]
Then, neither does capitalism, for producers sell to consumers on a first-come, first-served basis too. Since nobody can predict the future, this is actually the only way it can work. So, we're still left with a collection of equivalent resource management techniques.
Uh, no. First-come-first-served is common only to very small businesses in capitalist systems, and is rare even among them (deadline driven supply, class-based concurrent operations, and various forms of retainership being more common). Most producer systems greatly leverage concurrent production and economies of scale. As far as 'intelligence' goes, capitalism itself isn't intelligent; rather, market-based systems put the means of resource provision into user-space, thus allowing intelligence to be developed and leveraged by consumers without any 'special' integration into a 'kernel' service.
Have you any idea how utterly retarded imposing a market system would be on any software system? Efficiency goes out the window entirely. First, you have to find a means by which programs earn an income. Then you must find a way by which they can select and choose which resources to acquire based on price/value comparisons, including the intelligence necessary to defer acquisition until such time affordability makes sense. Then you have to deal with underprivileged programs. A newly launched program will have zero in its account, so how can it earn anything by running? It can't run because it can't acquire resources! I could go on, and on, and on, and on. The market system sucks for resource management when you already have access to the knowledge you need to properly administer your own resources. The whole reason the market system works for us is because of our limited ability to process information. By turning individuals into cellular automata, the market system defines the rules by which automatons interact. In a system where any program, permissions permitting, can query any resource on its current state or availability, imposing a market system is just plain unnecessary overhead, and yields unfair policies of epic proportions.
Just to be clear, market based resource management does not necessarily involve actual exchange of money, and rarely involves giving 'accounts' to individual programs. What it does involve is adaptations of such ideas as credit, trust, bids and auctions, requests, offers, tickets, roles and responsibilities, auditing, etc. to multi-process interaction... allowing for both competitive and cooperative interactions among processes. In any case, most of your assertions are in error. E.g. a newly launched process can receive resources from whichever program launched it, and a newly launched service (loosely, any amalgamation of processes working towards a common end) would generally request and obtain resources from whomever chooses to invoke it. Systems may also 'borrow' resources combined with a promise to pay them back, and low-priority operations can simply run on overhead in the background.
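A rough sketch of two of these ideas, a launcher funding the process it launches and a simple sealed-bid auction for a contested resource. Everything here (the Process class, the credit field, run_auction) is a hypothetical illustration, not the API of any existing kernel or middleware, and "credit" is an abstract ticket count rather than money.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Process:
    name: str
    credit: int = 0  # abstract tickets, not money

    def launch(self, child_name: str, grant: int) -> "Process":
        # A new process starts with whatever its launcher hands over,
        # so it does not begin with an empty account.
        if grant > self.credit:
            raise ValueError("launcher cannot grant more than it holds")
        self.credit -= grant
        return Process(child_name, grant)

def run_auction(bids: List[Tuple[Process, int]]) -> Optional[Process]:
    """Highest affordable bid wins and pays its bid; returns the winner."""
    affordable = [(p, amt) for p, amt in bids if 0 < amt <= p.credit]
    if not affordable:
        return None
    winner, amount = max(affordable, key=lambda pair: pair[1])
    winner.credit -= amount
    return winner

shell = Process("shell", credit=100)
browser = shell.launch("browser", grant=40)
mailer = shell.launch("mailer", grant=25)

# The mailer's bid exceeds its credit and is ignored.
winner = run_auction([(browser, 30), (mailer, 50), (shell, 10)])
```

Borrowing against a promise to repay, or running low-priority work on leftover capacity, could be layered on the same bookkeeping; they are left out to keep the sketch short.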
In this case, the Ell Four Micro Kernel already implements policies along these lines, and ultimately, nearly all kernels in existence do too. However, Ell Four allows any process to manage resources in this manner. There's always an "ownership" chain, ultimately leading back to the root process.
Ownership is one of many resource management strategies, and isn't always a particularly efficient one when working with resources that need to be shared, since there is rarely a standard process for renegotiating ownership - the 'chain' really ties one's hands when it comes to sharing resources. I've a much greater fondness for Object Capability Model with its distributed ownership and built in approaches for both sharing and limiting access to resources 'sideways' across processes. Anyhow, I see a contradiction between your most recent statement (that the kernel is handing ownership and management of resources off to processes) and your earlier assertion (that the kernel is somehow taking advantage of its 'knowing all' to make wise resource management decisions). Putting resource management in the hands of process chains simply localizes the management, and the knowledge with which it is made.
Then you admit that market-based resource management isn't always a particularly efficient one when working with resources that need to be shared? Markets imply private ownership.
I don't agree that markets imply private ownership. Ownership is one option, among many, when it comes to controlling resources. Others include: contract-based access, lease, tolls, tickets and booking, borrowing, and lending. Further, there are alternatives to 'private' ownership, such as group ownership and time shares.
RE: can query any resource on its current state or availability - I'll note on this point that polling resources for availability is extremely inefficient.
Not necessarily. The hotdog stand on the street corner could be offering food at $0.50 per hotdog, but if I can visibly see a line so long that it wraps around the block, I'll take my business elsewhere. One poll, one decision, end of discussion.
You presume that there is a market of such essential resources (food, in this case) such that you are free to take your business elsewhere. But, even in that case, polling isn't very efficient if you're looking for a good deal. You 'ended the discussion' before following your scenario through the conclusion of 'taking your business elsewhere'. It can take a lot of effort to 'poll' many different food vendors once each; by the time you finish, your information on line length and relative prices can be out of date, and the latency cost of attempting to query each vendor can be sufficient to render the total effort of negative value.
Of course there is a market; you wouldn't use market-driven techniques if there weren't.
It seems you assume that markets exist prior to the systems of trade. Causality is the other direction: standardized support for market-based resource management is what allows markets to exist. There is rarely a market for such resources in 'dictatorship' based kernel designs.
The assumption is incorrect.
Which assumption? And on what grounds?
At some point, common sense enters the picture. If I stand in the line, I'm queued waiting for the resource to become available. Existing OSes already use this technique. Polling is immediately useful and of immense value, provided you don't abuse it. Attempting to discredit the entire solution, as you are attempting to do now, on the basis that it is possible that a necessary precondition just might, in some circumstance, not hold is utterly preposterous. You're driving this otherwise fruitful discussion into the ground.
Polling is inefficient, but standing in line is NOT the only other viable option (and isn't even an option I'd recommend on large-scale systems).
You have no choice if you're blocked on something.
Why would I have no choice but to wait in a line? Are you assuming something that I'm not? I'm getting the impression that you have a very narrow world view about what is possible in systems architecture based on some small, incestuous set of examples.
My world view is informed by everyday, working man's realities, and not some ivory-tower theoretical conception of resource management. If I'm blocked, meaning I cannot continue in my task's progress, because of some missing resource, then clearly if I want to finish my task, I have to wait for said resource. Why is this such a hard concept?
Needing to wait on a resource doesn't require that you wait in a line for that resource, especially not at any equivalent to a particular hot dog vendor.
You keep saying this, but you keep omitting examples to back your claim. The very definition of the term block is that I'm unable to proceed forward along that particular task. This means that I'm queued for that resource. Can I perform something else in the meantime? Sure!! Absolutely!! But that doesn't change the fact that I am queued!
So, in your world-view, 'forall X: Blocked(X) -> Queued(X)'? Sorry. It may be a terminological issue, but I don't believe it. A queue indicates that the order in which you'll receive your resource is in some manner processed first-come-first-served, potentially modified by some arbitrary notion of priority. If you'd like some counter-examples, consider Blackboard Architecture and Tuple Space approaches to concurrent behavior and fulfilling requests concurrently. Further consider cluster-analysis and processing groups of requests together (leveraging economies of scale) rather than pulling them out of a 'queue' one at a time. Also relevant is the 'closed' vs. 'open' nature between waiting in line on a particular vendor vs. advertising a need in a market of vendors that allow entries to be added or removed over time. I agree that, if one cannot proceed without a resource, then one must obtain that resource before continuing. But 'queues' are, to me, merely one strategy for sharing resources, and not at all part of "the very definition of the term 'block'".
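A toy illustration of the Tuple Space counter-example, assuming a much simplified model: requesters register a pattern (None acts as a wildcard) and are satisfied whenever a matching tuple is offered. The class and method names are hypothetical and this is not a faithful Linda implementation. The point is only that who gets served next depends on what is offered, not on who arrived first.

```python
def matches(pattern, tup):
    # None in the pattern matches any value in that position.
    return len(pattern) == len(tup) and all(
        p is None or p == v for p, v in zip(pattern, tup)
    )

class ToyTupleSpace:
    def __init__(self):
        self.waiting = []  # (requester, pattern): blocked, but not "in line"
        self.served = []   # (requester, tuple) in the order satisfied

    def request(self, requester, pattern):
        self.waiting.append((requester, pattern))

    def out(self, tup):
        # Offer a tuple; it goes to a requester whose pattern matches it.
        for i, (requester, pattern) in enumerate(self.waiting):
            if matches(pattern, tup):
                del self.waiting[i]
                self.served.append((requester, tup))
                return requester
        return None

space = ToyTupleSpace()
space.request("A", ("printer", None))  # A blocks first
space.request("B", ("ram", None))      # B blocks second

space.out(("ram", 4096))               # B is served first anyway
space.out(("printer", "lp0"))
order = [requester for requester, _ in space.served]
```

When several waiting requesters match the same tuple, this sketch happens to pick the earliest one; that tie-break is an arbitrary choice of the sketch, and a real system could just as well pick by bid, locality, or batch.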
And the "incestuous" set of examples I mention is Windows/Unix/Minix/Ell Four/Linux/Cee Language, etc., with their kernel designs and strongly 'local machine' abstractions.
So, all this time, you've been thinking about non-local machine abstractions? WHY THE HELL did you not say so? Because this WHOLE TIME I've been talking about local machine resource management!
Resources are resources, local or remote, and 'kernels' are still centralized beasts whether or not they abstract over multiple CPUs and non-uniform memory architectures. All my statements apply both for local and remote resources.
What immediately comes to mind in this case is obtaining resources based on something like a 'wanted' ad - a request for proposal in acquisition terms - where one posts to a common billboard the need for food and a temporal window, then allows some bidding to take place to acquire the contract. The other direction also works: you could request a food service to obtain a generic class of food on your behalf, in which case it could take advantage of the economies of scale by having food 'immediately' ready in a probabilistic sense based on the size of its consumer base; in this case the hot-dog vendor would be a server client of the food service. Either approach works if you aren't too picky about where your food comes from. Both these approaches require that resources be capable of being traded and managed without being bound by hierarchical 'chains of ownership'.
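The 'wanted ad' idea can be sketched roughly as follows. The Request and Offer types, the vendor functions, and the prices and times are all made up for illustration; the billboard is just a list of vendors that may or may not respond to a posted need.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Request:
    item: str
    deadline: int  # latest acceptable ready time, in abstract ticks

@dataclass
class Offer:
    vendor: str
    price: float
    ready_at: int

Vendor = Callable[[Request], Optional[Offer]]

def hotdog_stand(req: Request) -> Optional[Offer]:
    # Cheap, but the line wraps around the block.
    return Offer("hotdog stand", 0.50, ready_at=30) if req.item == "food" else None

def food_service(req: Request) -> Optional[Offer]:
    # Dearer, but keeps food ready because of its large consumer base.
    return Offer("food service", 0.75, ready_at=2) if req.item == "food" else None

def bookshop(req: Request) -> Optional[Offer]:
    return None  # does not bid on food

def award(req: Request, billboard: List[Vendor]) -> Optional[Offer]:
    # Collect bids, drop those outside the temporal window, take the cheapest.
    offers = [vendor(req) for vendor in billboard]
    in_window = [o for o in offers if o is not None and o.ready_at <= req.deadline]
    return min(in_window, key=lambda o: o.price) if in_window else None

winner = award(Request("food", deadline=10), [hotdog_stand, food_service, bookshop])
```

The requester never polls individual vendors and never stands in any one vendor's line; it states a need once and the vendors come to it.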
I see this as being absolutely no different than a call to malloc(). To apply your first example, malloc() can post the request for RAM on some common billboard, which processes can then decide what to release, if anything. malloc() blocks (waits in line) until enough contiguous memory becomes available, or can timeout after a certain period of time. In the second case, this is positively no different from how an existing kernel manages its RAM reserves. Invoking malloc() will cause the kernel to coalesce multiple pages of memory into a single, contiguous region (via page table magic). My application cares not where this memory comes from. When I said that the kernel is omniscient, I was referring specifically to the resources it specifically managed, memory and CPU time being two of them. This is a cold, hard requirement -- you cannot escape this, because the kernel has to manage each and every page the CPU can access (particularly if it intends to support virtual memory, in which case it also partially manages disk stores too).
'malloc()' can be viewed as a request for memory resources and could be well implemented by market strategies. But there exist significant 'positive' differences between posting a request to a billboard allowing multiple independently developed systems to compete or cooperate in providing you the necessary service (because that's what a block of memory really is - a service answering to address-based fetch and update requests) versus giving the request to a kernel that needs to make all the associated decisions and that might not be able to reclaim the necessary resources from other processes to which it has already provided resources. As far as your 'cold, hard requirement' goes, I see only your assumption that 'a' kernel must be responsible for managing 'pages' of memory (among other things). That's one option. Another valid option would be to write code that integrates a service registry with services potentially distributed across many machines, and to simply insert (alongside size) latency and availability and security requirements into one's demands for 'malloc()', and to truly allow multiple systems to compete or cooperate in providing the service. As an aside, there is little reason to block immediately upon a call to 'malloc()'. When it comes to making requests that require replies, such as 'malloc()' replying with a service providing a block of memory, consider Send Receive Reply Eventually.
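A small sketch of an allocation request that carries a latency requirement and does not block at the call site. The provider table, its field names, and the allocate function are hypothetical; a Future from Python's concurrent.futures stands in for the eventual reply, which is one possible reading of the idea rather than a definition of it.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical providers competing to satisfy memory requests.
PROVIDERS = [
    {"name": "local-ram", "free": 1024, "latency_us": 1},
    {"name": "remote-node", "free": 1 << 20, "latency_us": 200},
]

def allocate(size, max_latency_us):
    # Keep only providers that can meet both the size and latency demands.
    candidates = [
        p for p in PROVIDERS
        if p["free"] >= size and p["latency_us"] <= max_latency_us
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda p: p["latency_us"])
    best["free"] -= size
    return best["name"], bytearray(size)

with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(allocate, 4096, 500)  # send the request, do not wait
    other_work = sum(range(1000))               # carry on with something else
    provider, block = pending.result()          # take the reply once it is needed
```

Here the local provider is too small for the request, so the remote one answers; the caller states requirements and does not care which provider wins.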
You write words, but I still don't see your point, or how it differs in any significant capacity to what we have in service right now. Regarding Send Receive Reply Eventually, please be aware I'm quite fluent in this style of coding, as it is a mandated architecture when writing Amiga Operating System device drivers and filesystem handlers.
If you are 'fluent' in alternative styles of coding then that should be reflected in your assertions, assumptions, and examples; when you make claims such as "absolutely no different than a call to malloc() [...] malloc() blocks (waits in line)" you give me very little reason to believe you are aware of alternatives and much reason to believe your experience is incredibly limited. As far as your inability to see a difference, I feel it is because you're abstracting away the wrong details. The difference between market approaches and kernels is in how resource allocations are decided, not between which resources are allocated. Things That Are Different Are Not The Same.
They are, and I consider any variant of Send Receive Reply to be utterly orthogonal to the issues I'm talking about. You still need to allocate memory from the kernel. You still need to schedule CPU time for process execution. You still need to schedule message delivery, either via sockets, message buffers, or packets going out the wire. These are the kinds of resources I'm referring to. And these are the kinds of resources which a kernel dominates over. And, without these kinds of resources, which cannot be managed in a market-driven manner precisely because of fairness issues, you can't have Send Receive Reply Anything.
I don't assume that memory allocation needs to come "from the kernel", and I (despite considerable work in middleware and virtual machine domains) remain unaware of any 'fairness issues' that prevent market-driven approaches from working with resources such as memory, communications, processing, routing, energy and power, etc. Market systems get bootstrapped much the same as any other system.
Isn't that how the original Amiga Operating System worked? (Sorry, joke... I'll get my coat.)
Not really; it was closer to how Native American tribal life existed. As a rule, all applications respect the Great Spirit (exec.library), and all is peaceful. It usually isn't until some newly loaded application comes along which doesn't obey the Great Spirit, and fails to respect that the Earth doesn't belong to it, but in fact the opposite, that revenge is brought upon the peoples of the RAM in the form of a thunder bird (GURU Meditation Alert). OK, analogy and joke taken way too far now. Wait up for me!
This sort of thing might make for a great game of Core Wars, but it has no place when productivity is necessary.
"you're happily browsing your web pages when all of a sudden portions of your browser window start to wig out. Curious, you look into what's happening, and you see that your e-mail client and the desktop itself are fighting the browser for RAM. Eventually, the "pack" of software defeats the browser, and it just dies a slow, miserable death. [...] Sound preposterous? I sure hope so."
Nope, sounds pretty normal to me; that happens at least a few times a day. And sometimes my email client has a violent fling with the word processor that insists upon integrating with it and ultimately they both limp away and go into torpor to recover from their injuries. But then, I have to use Windows products and have to have antivirus etc. installed on my work PC.
Did I seem to say do a single thing many places in different ways, or many things in a single place? I meant to say that duplication is not a Bad Thing. It is how we work. For example, browsing is a duplicated event. Millions of people, right now, are browsing some place on the web. But they are not all using the same kind of browser, nor are they looking at the same pages. Some prefer Internet Explorer, others Firefox, etc. To have more than three applications, or processes, or organizations, do or accomplish the same task is not evil or, for that matter, inefficient. It is the differences in them that matter. The user "feels better" when using favorites. To Avoid Duplication and mandate the use of "One For All" and "All For One" is to crush Software Diversity. This can be applied to all Human Activity. Some spectators like football and call it Football, while others like soccer and call it Football. To mandate but one sport for all spectators is terribly wrong. To go further, I might even say "Avoid Buzz Words" as statements of the what, where, when and how you do anything. Buzzwords lose meaning each time they are invoked. Instead, look at the Good Things in life with an admiring but critical eye. If you are building, sculpting, painting, programming, photographing, working at your occupation, cooking, or whatever, do it well, not just according to someone else's idea of "right". Go ahead, take picture number 49,000,001 of the Eiffel Tower. It is really not duplicating, it is originating. You will then be capturing more than a view of a structure; you will also be capturing a "moment", an experience, and an impression. See the following, to see what I mean:
-- Still Thinking Out Loud.Donald Noyes
I don't think anyone's opposing you on that view. I don't recall anyone advocating Avoid Duplication, Duplicates Are Bad, or Once And Only Once as universal principles that apply to everything.
Software Engineering Is Art Of Compromise. All else being equal, Avoid Duplication. However, there are other factors and "design principles" to consider that may bump Avoid Duplication down a notch. Nature "loves" duplication of information, it's part of evolution, and so cannot always be "bad".
See also: Shared Library, Dll Hell
See original on c2.com