Collections As Grouping

Stating what may to some Obvious and to others Not so obvious - (with Examples) - Collecting and Collections are central to ModernTechnology


Examples: InformationOrganization

Locale.Space.Page

Url.Artifact.PageOrProcess PhysicalManufacturing

Product.Assembly.Component.Part ElectricalWiring

ConduitOrTray.Cable.Wire

PackageOrSuite.Program.ProcedureOrSubroutine.Expression.Component FilePackaging

ZipArchiveContainer.Directory.File.Contents

The thing each of these have in common is the action of "Grouping", "Staging", "Compartmentalization" and "Direction".

They have a flow of from the most numerous (rightmost) to the least (leftmost)

In InformationOrganization, under a certain Locale may exist Tens to Hundreds of Spaces, under a certain space may exist Tens to Hundreds to Thousands of Pages.

In PhysicalManufacturing, under a certain Product, may exist many Assemblies, made up of many Components, having many Parts

In ElectricalWiring, in an identified Tray or Conduit, may exist many Cables, made up of many Wires

In the process of coding a package or Suite, may include many Programs, each of which contains several to many ProceduresOrSubroutines, each of which will contain from a few to many Expressions

It is convenient, and lends to efficiency, that such organization allows the construction of complex structures, whether they be physical or abstract.

It is possible to construct artifacts randomly and without organization, leaving it only to chance that anything useful evolves, or trusting to efforts to rescue what has evolved, even though no predetermined centralized organization of effort or structuring and planning is involved. But the chances for success are greatly reduced, approaching zero as complexity increases.


Caveat: One has to be careful about a hierarchical view and an actual hierarchy. For example, in the physical modeling levels (car.assembly.part.etc), the accounting or inventory department may not care much about physical construction hierarchies, yet still track info about a given part. In such a case, the hierarchy should perhaps be a *view* into a "master" collection and not a hard-wired characteristic of the collection itself. Related: Limits Of Hierarchies. --top

I see this naming convention as a classification scheme, not as a hierarchy. Thus a wire.cable can refer to the same thing as a cable.wire, i.e wire #7 in Cable 27 is the same thing as cable 27's wire 7. I also see by the wiring diagram that #7 wire in #27 cable is connected at the machine end at BSPX445 Junction Box to terminal block tb7, terminal #4a and at the Control panel end to Body Shop Marshalling panel CPMP044 (Area 44) to tb12, terminal 37. The limits are physical, since cable 27 is a 24 conductor cable, and the wire numbers are sequential from #1 through #24. It may be that not all wires are used, but those that exist are still classifed according to scheme. While wire numbers may be duplicated, their classification and thus their naming is tied to the cable number, which in a given installation is unduplicated. In a large installation the name of the cable usually follows a scheme that is tied to its physical beginnings and endings. The referenced cable might be called CPMP044-027-BSPX0445. This scheme is particularly useful when the plant design utilizes an automated cable routing process which limits cable tray loading while optimizing distances. -- Donald Noyes

Nonetheless you gave these as examples of Collections As Grouping. I think multiple views (Collections As Views or Views For Grouping??) are a very common aspect. The finance and production department being good examples. car.motor.combustion.cable.wire may denote the same thing as inventory.cables.ieee-4711 or assets.inferiors.cm_inc.131072. Further complicating this by filtering onto only some aspects of these leaf elements in each case (possibly complete disjunct ones).

The question is: How is identity kept between these domains? Is the identity even relevant/necessary in all cases? Who is responsible for maintaining the 'actual' hierarchy? How are the multiple hierarchies kept in sync?

The relational answer is to not model hierarchies but only relations - which may be combined in any way. And to use views to let different departments have their own - well - views. The strange thing is that views seem to be seldom used. Are they too expensive? Unwieldy? Too low level (in the sense of not usable by end-users)?

I'd really like to put different views on my data. There are some niche products which do this. Email programs allow to filter by keys. Search folders on the Mac do this. But I#d really like to have this as a basic functionality e.g. as part of the filesystem.

-- Gunnar Zarncke


There is a physical reality in the example of the classification scheme of atControlRoom/panel/terminal/wire/cable/tray/cable/wire/terminal/panel/atEquipment which does not prevent, in fact which invites, views:

you could have a

view.trayloading which shows what cables are located in a section of tray

view.cables.entering/leaving.panel/box

view.terminalblock.connections.wires/cable

view.cables.entering/leaving.ControlRoom

etc. That views are used in this example case is true:

The cable/tray classification is shown in a Conduit/Tray/CableSchedule

The cable/wire/terminal.connections are shown in WiringLists

etc.


How is identity kept between these domains? Is the identity even relevant/necessary in all cases?

In most cases, identity isn't relevant. Object Identity is fundamentally an illusion since the concept of 'object' is itself a view of data. Modeling Object Identity where it isn't intrinsic (e.g. names in a communication structure) is a silly idea in a system where your aim is to avoid a fixed 'view' of things.

That said, it is reasonable to model 'unknown' properties by, essentially, naming them with unresolved variables. This appears in the database as surrogate keys especially if you use the unknown variables as a key (which implicitly indicates that every variable has a unique set of unknown properties... e.g. you don't know the exact location of any two vehicles, but you do know that no two vehicles occupy the same position at the same time, and so you can assert the unknown variables identifying each vehicle are unique). This allows you to store 'facts' in your database that may otherwise have seemed inconsistent, such as two distinct vehicles being marked by the same VIN (which may occur by manufacturing accident, theft, etc.).

That sort of 'identity' is really just a name for a collection of properties - a 'bundle' as described in Bundle Substance Mismatch. Use of such names fits readily into the classification schemes described by Donald... since, after all, such names 'classify' anything meeting a particular set of properties (albeit a set of properties that is rather specific to what what we 'view' as a unique individual).

That sort of 'identity' is more or less what I mean with vagueness: You do not necessarily know what these unknown variables are. I agree that one is in principle able to introduce them as need arises. But maybe it would be better to a) infer them from use or b) postpone them as long as possible or create them on demand. -- .gz

It goes against my principle of "To Describe Is Perhaps To Value". You cannot value what you do not describe. Inference and Postponement are not solutions, they are avoidances. The groupings I described above can be valued (receive identification and quantification). It follows, in lists of members of a collection, which is what I mean by Grouping, that the name you call something, and the description of what that name means, allows grouping to occur. In this way you can move forward to Valuing The Object.



See original on c2.com