Importance of Accessor/Mutator methods


humble_learner
July 11th, 2008, 07:39 AM
Hi,

We have a project which contains a number of 'Entity' classes - classes which hold data to be passed around. As we are dealing with C++, is it a rule that Get() and Set() methods are necessary for all the private data members?

It sometimes seems to be an overhead having these methods, but in the interests of OOAD, is it a rule that data should be private and hence accessed only using the Set()/Get() implementations?

Of course C++ texts would insist on the Set()/Get() implementations, but in real-life projects with a number of simple data types in a class (entity class), are the accessors and mutators necessary?

I would personally do away with Set() and Get() because the entity classes by themselves are so simple that they can be replicated in structures too.

Please let me know your thoughts.

TheCPUWizard
July 11th, 2008, 07:46 AM
ANY statement that includes "ALWAYS" or "NEVER" is ALWAYS wrong. :eek:

However, Get/Set is a good idea in 99.999% of the cases.

If they are implemented as inline methods, then there is (almost always) no overhead at runtime (the "typing" overhead does NOT EVER count).

Yes your class is simple TODAY.

But next week....

You decide to lazily evaluate a gettable piece of data....
You need to add validation to a settable piece of data...
You decide to aggressively evaluate a piece of gettable data in the set of a different piece....

All three of these patterns (and more) occur on a regular basis. If you exposed the "naked" data, then not only do you have to change your class, but you also have to change all of the places where your class is used.

See the ramifications????
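
To make the validation case concrete, here is a minimal sketch. The Employee class, its member names and the 0 to 150 age range are all invented for illustration and are not from the thread. The point is only that the check is added inside set_age(), so existing callers do not change:

```cpp
#include <stdexcept>

// Hypothetical entity class, invented for this illustration.
class Employee
{
    int m_nAge;

public:
    Employee() : m_nAge( 0 ) {}

    // Trivial inline accessor; compilers will normally inline this call away.
    int get_age() const { return m_nAge; }

    // Validation added "next week": callers of set_age() need no edits.
    void set_age( int nAge )
    {
        if( nAge < 0 || nAge > 150 )   // assumed limits, purely illustrative
            throw std::invalid_argument( "age out of range" );
        m_nAge = nAge;
    }
};
```

Had m_nAge been a public member, every assignment to it in client code would have needed the same check added by hand.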

Kheun
July 11th, 2008, 07:50 AM
It is not necessary to provide a pair of accessor and mutator functions for every private member variable. As long as there isn't any need, you don't have to provide them.

TheCPUWizard
July 11th, 2008, 07:58 AM
It is not necessary to provide a pair of accessor and mutator functions for every private member variable. As long as there isn't any need, you don't have to provide them.

One of us mis-read the original question. :eek:

Complete agreement that you should not expose every private member. That would be plain silly..

I took the question to mean..if there is a piece of data that you MUST expose, should it be implemented as "Get/Set with private data" or simply expose the data member publicly....

Kheun
July 11th, 2008, 08:27 AM
Yeah, either that or the question isn't clear. Either way, both pieces of advice are correct. :)

GNiewerth
July 11th, 2008, 08:28 AM
Accessor/Mutator functions define the interface of a class and hide the implementation. By using accessor/mutator functions the caller doesn't have to know how or where data is stored, that's left to the callee. When correctly implemented the callee code can be freely modified, as long as the callee's interface remains untouched.

Take a look at this example:

#include <vector>

class Dataset
{
    std::vector<double> m_Data;

public:
    Dataset() : Average( 0.0 )
    {
    }

    void update( double dData )
    {
        // store data in vector
        m_Data.push_back( dData );

        // update average (calculation on previous average)
        // separated into 3 steps for readability reasons
        Average = Average * ( m_Data.size() - 1 );  // sum of the previous values
        Average += dData;
        Average /= m_Data.size();
    }
    double Average;
};


Now you can use this class and call update to store data in it. Each time you update data the average will be automatically calculated and you can fetch it from the Average member.

After a while you notice you need to call update() more often than you need to access Average, so you decide you don't want to calculate the average with each update() call and you end up with this:


#include <vector>
#include <numeric>

class Dataset
{
    std::vector<double> m_Data;

public:
    Dataset() : Average( 0.0 )
    {
    }

    void update( double dData )
    {
        // store data in vector
        m_Data.push_back( dData );
    }

    void calculate_average()
    {
        if( false == m_Data.empty() )
        {
            // compute average of stored values
            Average = std::accumulate( m_Data.begin(), m_Data.end(), 0.0 );
            Average /= m_Data.size();
        }
        else
        {
            // there are no values, the average is 0
            Average = 0.0;
        }
    }
    double Average;
};


Now you can call update() as often as you want without having the average calculated with each call. The drawback, however, is that you have to call calculate_average() before you access Average, so you have to adjust your current code to do so everywhere it reads Average. Doh.

If you had used an accessor function instead of directly accessing the Average member your class would have evolved from


#include <vector>

class Dataset
{
    double m_dAverage;
    std::vector<double> m_Data;

public:
    Dataset() : m_dAverage( 0.0 )
    {
    }

    void update( double dData )
    {
        // store data in vector
        m_Data.push_back( dData );

        // update average (calculation on previous average)
        // separated into 3 steps for readability reasons
        m_dAverage = m_dAverage * ( m_Data.size() - 1 );  // sum of the previous values
        m_dAverage += dData;
        m_dAverage /= m_Data.size();
    }

    double get_average() const
    {
        return m_dAverage;
    }
};


to


#include <vector>
#include <numeric>

class Dataset
{
    std::vector<double> m_Data;

public:
    Dataset()
    {
    }

    void update( double dData )
    {
        // store data in vector
        m_Data.push_back( dData );
    }

    double get_average() const
    {
        double dAverage = 0.0;
        if( false == m_Data.empty() )
        {
            // compute average of stored values
            dAverage = std::accumulate( m_Data.begin(), m_Data.end(), 0.0 );
            dAverage /= m_Data.size();
        }
        return dAverage;
    }
};


Aaaaah, much nicer now. The class's interface has not been changed, you can still call get_average() without modifying existing code.
For some reason there was a change in the client code: it now calls get_average() significantly more often than it calls update(), so the repeated average calculations slow down the application. You may come up with this implementation to fix that issue:


#include <vector>
#include <numeric>

class Dataset
{
    // mutable: the const accessor below updates these cached values
    mutable double m_dAverage;
    mutable bool m_bAverageDirty;
    std::vector<double> m_Data;

public:
    Dataset() : m_dAverage( 0.0 ), m_bAverageDirty( true )
    {
    }

    void update( double dData )
    {
        // store data in vector
        m_Data.push_back( dData );

        // Data has been modified, Average is no longer valid
        m_bAverageDirty = true;
    }

    double get_average() const
    {
        if( true == m_bAverageDirty )
        {
            // Average needs to be calculated (0 if there are no values)
            m_dAverage = 0.0;
            if( false == m_Data.empty() )
            {
                m_dAverage = std::accumulate( m_Data.begin(), m_Data.end(), 0.0 );
                m_dAverage /= m_Data.size();
            }

            // Average is valid now
            m_bAverageDirty = false;
        }
        return m_dAverage;
    }
};


Again the interface of the class has not been changed, though there have been major changes in the implementation.

For objects with up to 10 simple members and no further functionality besides grouping data I don't use accessor/mutator functions.

TheCPUWizard
July 11th, 2008, 03:33 PM
An excellent example of aggressive and lazy evaluation. :thumb: :thumb: :thumb:

_uj
July 11th, 2008, 05:10 PM
Please let me know your thoughts.

There are different kinds of classes. Some classes are close to primitives in nature. Examples of such classes are say Complex (abstracting a complex number (two floats)), and Tuple3D (abstracting a 3D coordinate (three floats)). In those cases setters/getters is to overdo it.

At some point a class becomes a more involved data abstraction and then one can start thinking about exposing private variables using getters/setters. But note that getters/setters are a very rudimentary form of encapsulation. A much better and stronger form is to not expose the internal state of a class at all. This is accomplished by asking the class to perform stuff, rather than shuffling data in and out of it.

TheCPUWizard
July 11th, 2008, 05:47 PM
There are different kinds of classes. Some classes are close to primitives in nature. Examples of such classes are say Complex (abstracting a complex number (two floats)), and Tuple3D (abstracting a 3D coordinate (three floats)). In those cases setters/getters is to overdo it.

#1.. Many design choices are personal. What I am 100% convinced of is that a consistent approach is better than an inconsistent approach.

#2..Regarding overkill. I have to disagree (now it gets personal). Even for trivial cases, I find that the potential future problems outweigh the typing overhead (the compiler will almost certainly optimize the trivial get/set out of existence, resulting in the code actually executing direct access to the data - so there is no runtime (memory or speed) issue).

Consider what happens if you have a Complex class and you expose the two floats. Users of this class can take the address of the members. Now you decide to do all your complex math in doubles, but still expose it as floats (it is not uncommon to use a higher resolution for internal calculations). Even if your compiler supported "property syntax extensions", you are still in big trouble, as your client code will now fail horribly (i.e. not compile!).
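
A rough sketch of that scenario with accessors in place. The class layout and the method names (get_real, set_real and so on) are assumptions made for illustration. The storage is switched to double while the public interface keeps speaking float:

```cpp
// Hypothetical Complex class; names are illustrative only.
class Complex
{
    // Originally "float m_re, m_im;". Now double for internal precision.
    double m_re;
    double m_im;

public:
    Complex( float re, float im ) : m_re( re ), m_im( im ) {}

    // The interface still uses float, so existing client code compiles unchanged.
    float get_real() const { return static_cast<float>( m_re ); }
    float get_imag() const { return static_cast<float>( m_im ); }

    void set_real( float re ) { m_re = re; }
    void set_imag( float im ) { m_im = im; }

    // Internal math runs in double.
    Complex& operator*=( const Complex& rhs )
    {
        const double re = m_re * rhs.m_re - m_im * rhs.m_im;
        const double im = m_re * rhs.m_im + m_im * rhs.m_re;
        m_re = re;
        m_im = im;
        return *this;
    }
};
```

Because no caller ever held a float* into the object, the change of storage type stays invisible outside the class.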

If someone could provide a single example where there is a downside to NOT exposing the data directly (again, typing does NOT count), I would be very interested in seeing it.

In 28 years of programming in C/C++, I have NEVER had to edit a file to change a get/set to an exposed data member. However there have been many times I have had to do the reverse (in other people's code). From a code reuse (no-edits / no-recompiles) scenario, any change to a "completed" item is indicative of a bug/flaw in the original design.

btw: How many people (that know how to set a breakpoint when a statement executes) do not know how to set a data write breakpoint?? And even if you DO know, nearly every (non-ICE) debugger will impose a serious performance impact on such a breakpoint.

kempofighter
July 11th, 2008, 06:10 PM
Yeah, either that or the question isn't clear. Either way, both advices are correct. :)

In the OP, private data was mentioned. I took that to mean: is it necessary to provide the accessor/mutator for private attributes (as opposed to not providing accessors/mutators for private attributes)? It didn't seem like the OP was asking whether to expose public data. However, defining mutators for private attributes is not much better than making them public, in my opinion. In fact, mutators should not be provided unless there is some valid reason to allow a user to modify the state of the class. Many variables might represent a critical state, and allowing outside users to modify the state could be bad.

One reason that it might make sense to define the accessor is for testability. Since private data can't be accessed by test drivers, it might make sense to have them so that test drivers can always test the internal state of the class without having to mess with the friend mechanism. So you might want to think about how you are planning to unit test your code, if at all. I can't think of any other reason to provide accessors for all attributes, by default. Don't define them just for the sake of defining them. Tools can be set to not generate them.

TheCPUWizard
July 11th, 2008, 08:11 PM
One reason that it might make sense to define the accessor is for testability. Since private data can't be accessed by test drivers, it might make sense to have them so that test drivers can always test the internal state of the class without having to mess with the friend mechanism. So you might want to think about how you are planning to unit test your code, if at all. I can't think of any other reason to provide accessors for all attributes, by default. Don't define them just for the sake of defining them. Tools can be set to not generate them.

While that is certainly one way to address the testability issue, I will typically take it a step further to minimize "abuse", IF you utilize Factory-type patterns to support a DI architecture...

If you need access to private data for testing purposes, implement a PROTECTED accessor with a specific name (e.g. Get_XXX_4Test()). Then implement a derived class. By using a specialized factory, you transparently create your test instances where the test code can access the information in question, but there is no risk that the "production" code will be developed against the "Test" interface.
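
One possible reading of that pattern, as a sketch only. Counter, TestableCounter and the two factory functions are hypothetical names, and a real DI setup would select the factory through its own configuration rather than by calling a different free function:

```cpp
#include <memory>

class Counter
{
    int m_nCount;

public:
    Counter() : m_nCount( 0 ) {}
    virtual ~Counter() {}

    void increment() { ++m_nCount; }

protected:
    // Reachable only from derived classes; the name flags its purpose.
    int Get_Count_4Test() const { return m_nCount; }
};

// Test-only subclass that republishes the protected accessor as public.
class TestableCounter : public Counter
{
public:
    using Counter::Get_Count_4Test;
};

// Production factory: callers only ever see the Counter interface.
std::unique_ptr<Counter> make_counter()
{
    return std::unique_ptr<Counter>( new Counter );
}

// Test factory: hands out the subclass with the extra accessor.
std::unique_ptr<TestableCounter> make_test_counter()
{
    return std::unique_ptr<TestableCounter>( new TestableCounter );
}
```

Production code written against Counter cannot call Get_Count_4Test(), because the accessor is protected there.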

If you do NOT utilize a DI-ready architecture, then creating an INDEPENDENT class which can contain a "mirror" of the state can help. In this pattern you simply add one method that returns a snapshot of the state (i.e. an instance of the INDEPENDENT class).
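
The snapshot variant might look roughly like this. BoundedCounter and CounterState are made-up names; the only thing taken from the post is the idea of one added method returning an independent copy of the state:

```cpp
// Independent plain-data mirror of the internal state (hypothetical).
struct CounterState
{
    int count;
    bool overflowed;
};

class BoundedCounter
{
    int m_nCount;
    int m_nLimit;
    bool m_bOverflowed;

public:
    explicit BoundedCounter( int nLimit )
        : m_nCount( 0 ), m_nLimit( nLimit ), m_bOverflowed( false ) {}

    void increment()
    {
        if( m_nCount < m_nLimit )
            ++m_nCount;
        else
            m_bOverflowed = true;
    }

    // The single added method: returns a copy, so callers cannot
    // modify the internals through it.
    CounterState snapshot() const
    {
        CounterState s = { m_nCount, m_bOverflowed };
        return s;
    }
};
```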

In any case, the goal is always to expose the minimum amount of information that is absolutely required.

active2volcano
July 11th, 2008, 09:12 PM
In our <<C++ programming Criterion>>, all the data must be private!
I also accept the viewpoint of TheCPUWizard!

_uj
July 11th, 2008, 09:33 PM
#2..Regarding overkill. I have to disagree (now it gets personal). Even for trivial cases, I find that the potential future problems outweigh the typing overhead (the compiler will almost certainly optimize the trivial get/set out of existence, resulting in the code actually executing direct access to the data - so there is no runtime (memory or speed) issue).



Well, What about int?

Is it a class or is it not?

pm_kirkham
July 12th, 2008, 08:06 AM
If you are using C++ to do object-flow programming (a fudge between imperative data modelling and OOP, where you pass an object as a bag of parameters then change its state), then accessors/mutators are a mechanism to abstract away the data storage implementation from the schema.

If you're using C++ to do OOP (where you send a message to an object in order to request a behaviour), then you don't expose any state externally at all.

Usually if you find you need to access state, it's because you have an object which represents an entity rather than a set of associated behaviours. In those cases, I often find it's better to use demeter-style visitors for the non-identifying attributes, rather than getters.

Setters are usually a bad idea in any case, as you rarely want to just set a value without wanting a change in the behaviour of the object, and you often have co-constraints between values, or the ordering in which values are set, which simple setters don't enforce. Getters often result in moving brittleness from the implementation of an attribute to a data schema and encourage a procedural, navigation-oriented style. Which is OK for little scripts for automating COM components (COM and Java beans both tend to be very navigation oriented), but is brittle in larger projects.

active2volcano
July 12th, 2008, 11:04 AM
Could you explain "navigation-oriented style" in detail?

pm_kirkham
July 12th, 2008, 12:00 PM
Where you get long strings of property accessors which navigate a path through a data model implemented as objects. You end up with expressions in client code such as: EAApp.Repository.Models[0].Packages[2].Packages[0].Elements[2].Attributes

It tends to be very brittle against changes to the data model. It's the style of data programming (see CODASYL or navigational database on wikipedia) whose weaknesses both relational databases, OO and the law of Demeter http://www.ccs.neu.edu/research/demeter/papers/law-of-demeter/oopsla88-law-of-demeter.pdf evolved to attempt to mitigate.
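
As a sketch of the alternative, here is a tiny model where clients ask the top-level object a question instead of walking the chain themselves. Repository, Package, Element and has_element() are hypothetical stand-ins and not the API from the expression above:

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Element
{
    std::string name;
};

class Package
{
    std::vector<Element> m_Elements;

public:
    void add_element( const std::string& name )
    {
        Element e;
        e.name = name;
        m_Elements.push_back( e );
    }

    bool contains( const std::string& name ) const
    {
        for( std::size_t i = 0; i < m_Elements.size(); ++i )
            if( m_Elements[i].name == name )
                return true;
        return false;
    }
};

class Repository
{
    std::vector<Package> m_Packages;

public:
    // Returns the index of the newly created package.
    std::size_t add_package()
    {
        m_Packages.push_back( Package() );
        return m_Packages.size() - 1;
    }

    void add_element( std::size_t package, const std::string& name )
    {
        m_Packages.at( package ).add_element( name );
    }

    // Clients ask the repository; they never navigate
    // repository -> packages -> elements on their own.
    bool has_element( const std::string& name ) const
    {
        for( std::size_t i = 0; i < m_Packages.size(); ++i )
            if( m_Packages[i].contains( name ) )
                return true;
        return false;
    }
};
```

If the storage of packages or elements changes later, only Repository and Package need editing, not every client expression.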

Edit: http://c2.com/cgi/wiki?AccessorsAreEvil may also be of interest.

humble_learner
July 14th, 2008, 04:36 AM
Thank you all very much for that excellent discussion.
My question stemmed from the fact that I basically had a simple case like the following :



class Department
{
    int DepartmentID;
    std::string DepartmentName;
    std::list<EmployeeInfo> employees;
};


Now I wanted to basically ask if these could all be public members or should there be Getter and Setter methods for each of the above members. There seems to be a lot of reluctance among new engineers to actually write these methods, especially when there are many more such simple members in the class. I had not actually decided on their scope and hence have not mentioned it explicitly in the above illustration.
But judging from the responses that you have all provided, it seems it is always worthwhile to provide the getter and setter methods rather than directly exposing them as public data members - however simple the class contents might be.

Kheun
July 14th, 2008, 05:20 AM
If your class is only being used to group related variables and there is no need to add any member functions at all, I recommend using a struct declaration instead, as it will remind people of the intention.

Anyway, your class example does seem sufficiently complex that in the near future you will be extending it by adding member functions. If that is the case, it is better to provide the getters and setters and keep the member variables private.

TheCPUWizard
July 14th, 2008, 07:10 AM
If you are using C++ to do object-flow programming (a fudge between imperative data modelling and OOP, where you pass an object as a bag of parameters then change its state), then accessors/mutators are a mechanism to abstract away the data storage implementation from the schema.

If you're using C++ to do OOP (where you send a message to an object in order to request a behaviour), then you don't expose any state externally at all.

Even in this condition you still expose state. State is any measurable condition of an entity, regardless of means of measurement.

Usually if you find you need to access state, it's because you have an object which represents an entity rather than a set of associated behaviours. In those cases, I often find it's better to use demeter-style visitors for the non-identifying attributes, rather than getters.

Remember most design guidelines indicate that only intrinsic behaviours are included in a class. If a behaviour can be defined in terms of accessible state then it is something that you "do to" an object, not something the "object does". Every behaviour (method) should (usually) involve at least one non-public aspect.


Setters are usually a bad idea in any case, as you rarely want to just set a value without wanting a change in the behaviour of the object,

The one main point of a setter is that externally changing the state of an object causes behaviour to occur which in turn causes other state to change. In the real world, this is the most common case. [e.g. Raising the temperature [state] of most things causes them to expand (behaviour), which is represented by an increase in size (state)]

and you often have co-constraints between values, or the ordering in which values are set, which simple setters don't enforce

Simultaneity is an issue.


Getters often result in moving brittleness from the implementation of an attribute to a data schema and encourage a procedural, navigation-oriented style. Which is OK for little scripts for automating COM components (COM and Java beans both tend to be very navigation oriented), but is brittle in larger projects.

My experience (fairly large projects of 1000 or more classes) is completely different. The decomposition of properties should almost never occur in an explicit chain (a.b.c.d.e). This is usually indicative of a flaw in the overall architecture.

pm_kirkham
July 14th, 2008, 07:53 AM
You seem to be violently agreeing with me - same issues, different emphasis in the response.

Even in this condition you still expose state. State is any measurable condition of an entity, regardless of means of measurement.
The point I was trying to stress was that an object should be a set of behaviours. That some behaviours allow you to deduce state is not an issue, but exposing an interface which encourages other objects to act on the state rather than respond to the behaviour of a peer is an issue.

The one main point of a setter is that externally changing the state of an object causes behaviour to occur which in turn causes other state to change. In the real world, this is the most common case. [e.g. Raising the temperature [state] of most things causes them to expand (behaviour), which is represented by an increase in size (state)]
If you believe that the one main point of a setter is to invoke some behaviour, then I say you're better off calling the method after the behaviour you invoke, not the attribute which is associated by side-effect to the behaviour.

I'd say that the behaviour is to absorb heat, which has the side-effect of altering the temperature, not to set the temperature. You cannot, in the real world, just set the temperature of an object - there's a process, and some behaviour in response to the process, and some side effects of the process. So you extract the fundamental behaviour from the setter, and write an interface which stresses the process, not the state. Otherwise you end up with foo.setTemperature(foo.getTemperature() + heat / (foo.getMaterial().getSpecificHeatCapacity() * foo.getMass())) in the client code rather than a single foo.absorb(heat) call.
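
A minimal sketch of such an absorb(heat) interface follows. The Body class, its constructor parameters and the units are assumptions for illustration. A read-only temperature() is included only so the effect can be observed; a stricter design might drop it:

```cpp
// Hypothetical class illustrating a process-oriented interface.
class Body
{
    double m_dTemperature;   // kelvin
    double m_dMass;          // kilograms
    double m_dSpecificHeat;  // joules per kilogram-kelvin

public:
    Body( double dTemperature, double dMass, double dSpecificHeat )
        : m_dTemperature( dTemperature ),
          m_dMass( dMass ),
          m_dSpecificHeat( dSpecificHeat ) {}

    // The process is the interface; the temperature change is a side
    // effect computed inside the class: dT = Q / (c * m).
    void absorb( double dHeatJoules )
    {
        m_dTemperature += dHeatJoules / ( m_dSpecificHeat * m_dMass );
    }

    // Observation only; there is deliberately no setTemperature().
    double temperature() const { return m_dTemperature; }
};
```

The arithmetic that the client expression above spells out lives in one place, so a change to the thermal model touches only absorb().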

Remember most design guidelines indicate that only intrinsic behaviours are included in a class. If a behaviour can be defined in terms of accessible state then it is something that you "do to" an object, not something the "object does". Every behaviour (method) should (usually) involve at least one non-public aspect.
A lot of the projects I've worked on are simulation toolkits. There's a certain amount of trade-off between fully encapsulated objects, such as a model for a component, which you would heat up or send control signals to - lots of behaviour, no exposed state, and other classes, such as the syntax tree for the modelling language, which you pass through external processors - almost no intrinsic behaviour, lots of state. Even in the AST case you don't need getters and setters, but it's cleaner to use the visitor pattern and pass the properties of the AST object to the visitor. Even when you are doing something with an object, you don't need getters. Doing something to an object is almost always inverted if you are at the scale of individual member accessors - as in the case above, managing the exposed state in the client to do the heating 'to' the object is a different emphasis from saying 'there's a process in which heat is exchanged between two items - one absorbs some heat which is passed to it'. (The above code doesn't quite go as far as making the process explicit, but just has the absorb(heat) interface to the process.)

My experience ... is completely different. The decomposition of properties should almost never occur in an explicit chain (a.b.c.d.e). This is usually indicative of a flaw in the overall architecture.
Yes, it's an architectural flaw, though more of a micro-architecture one (how a few dozen classes interface to each other) rather than overall (there should be a mechanism to get a specific package starting from the application object). One mechanism for preventing such flaws is to remove the mechanisms which enable them. In my (and others') experience, getters are one such enabling mechanism. IME giving junior engineers an API full of getters and setters, and a code-completing IDE, and then hoping they won't use them because on page 257 of the coding standards it says not to, just doesn't work. Having code which doesn't let you do the wrong thing means you're more likely to fall into the pit of success. It also means you're thinking about the system, processes and interactions, rather than just about data-modelling.

TheCPUWizard
July 14th, 2008, 08:58 AM
Actually we are at diametrically opposed positions.


There's a certain amount of trade-off between fully encapsulated objects, such as a model for a component, which you would heat up or send control signals to - lots of behaviour

BOTH of the items (bolded) are changes to EXTERNAL STATE. They are NOT behaviours. "Heating Up" something is changing the state of the external temperature. "Sending Control Signals" is changing the state of an input control. The process by which other states change as a result of these changes is the behaviour.

An incandescent Light Bulb does not really have a "Turn On" or "Turn Off" behaviour. It has two external stimuli and (at least) one result. If the voltage differential between the inputs is sufficient (value dependent on the specifications of the bulb), then it will begin to emit light (unless some internal state such as a broken filament prevents it).

---------------

put another way....

Almost all behaviours are induced as a RESULT of a Change in State, and the only way to detect the behaviour is by another set of state changes.

Put another way, behaviours are VERBS. Consider a simple physical object.
(using pseudo code for brevity...)

class PhysicalObject
{
public:
    double Temperature { get; set; }
    double Pressure { get; set; }
    Dimensions Size { get; }
}

Heating or cooling the object will cause expansion/contraction. However this action (behaviour) is not explicitly induced (i.e. it is not in the public interface). Also the coefficient of expansion is not exposed, as it is inherent in the object, not external.
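
Since C++ has no property syntax, one way to render that pseudo code is with a setter that triggers the expansion internally. This is a sketch under simplifying assumptions: Size is reduced to a single length, Pressure is omitted, and the linear expansion formula and all names are illustrative choices, not part of the original post:

```cpp
// Hypothetical C++ rendering of the PhysicalObject pseudo code.
class PhysicalObject
{
    double m_dTemperature;
    double m_dLength;          // stands in for "Dimensions Size"
    double m_dExpansionCoeff;  // inherent to the object, never exposed

public:
    PhysicalObject( double dTemperature, double dLength, double dExpansionCoeff )
        : m_dTemperature( dTemperature ),
          m_dLength( dLength ),
          m_dExpansionCoeff( dExpansionCoeff ) {}

    double get_temperature() const { return m_dTemperature; }

    // Setting the temperature (state) induces expansion (behaviour),
    // which shows up as a changed length (state): dL = alpha * L * dT.
    void set_temperature( double dTemperature )
    {
        m_dLength += m_dExpansionCoeff * m_dLength * ( dTemperature - m_dTemperature );
        m_dTemperature = dTemperature;
    }

    // Read-only: there is no set_length().
    double get_length() const { return m_dLength; }
};
```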


This model does have a problem with multiple influences, but that still does not need a behaviour (method) to be exposed:


class Environment
{
public:
    double Temperature { get; set; }
    double Pressure { get; set; }
}

class PhysicalObject
{
    Environment { get; set; }
    Dimensions Size { get; }
}


I have also done many simulators (ranging from particle physics to super-scalar physical systems such as urban modeling). The majority of the modeling classes do not expose ANY behaviours.

"Action" classes are a different group entirely. These classes are responsible for applying stimuli (setting attributes) to one or more instances. These classes will typically have very little (if any) internal state.

---------------

One BIG advantage of this approach is that it results in more stable classes and significantly less coupling between the classes.

In one of my training classes, I ask students to consider the sequence of going to a sink, getting a cup, filling it with water, and then drinking the water, and returning the cup to its original location. Then I ask them to write the public definition of the "Cup" class.

The majority of them add many methods to the cup class that are actions they are performing on the cup, and are not inherent features of the cup. This results in code that is harder to understand, more likely to change, and less reliable (any change represents a chance to introduce a bug).

kempofighter
July 14th, 2008, 10:39 AM
Thank you all very much for that excellent discussion.
My question stemmed from the fact that I basically had a simple case like the following :



class Department
{
    int DepartmentID;
    std::string DepartmentName;
    std::list<EmployeeInfo> employees;
};


Now I wanted to basically ask if these could all be public members or should there be Getter and Setter methods for each of the above members. There seems to be a lot of reluctance among new engineers to actually write these methods, especially when there are many more such simple members in the class. I had not actually decided on their scope and hence have not mentioned it explicitly in the above illustration.
But judging from the responses that you have all provided, it seems it is always worthwhile to provide the getter and setter methods rather than directly exposing them as public data members - however simple the class contents might be.

Seems like this could be a struct that is encapsulated within a class, in which case leaving them public would be fine because the struct instances would be encapsulated.

The main reason I wanted to reply once more is because I am unsure if you read my previous post about mutators. Providing mutators for every attribute is not a good idea. Class interfaces should be as minimal as possible. Keep data private or protected and provide accessors and mutators on a case by case basis only. Personally, I don't like protected data very much either. That is another debate that I won't get into but that is just my personal opinion. If a derived class needs access to a base class attribute, accessors and/or mutators are better (on a case by case basis).
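
Applied to the Department example, a case-by-case interface could look something like the sketch below. Which members get accessors or mutators is a guess at plausible requirements, and EmployeeInfo is a placeholder since its definition was never shown:

```cpp
#include <cstddef>
#include <list>
#include <string>

// Placeholder; the real EmployeeInfo was not shown in the thread.
struct EmployeeInfo
{
    std::string name;
};

class Department
{
    int m_nDepartmentID;
    std::string m_DepartmentName;
    std::list<EmployeeInfo> m_Employees;

public:
    Department( int nID, const std::string& name )
        : m_nDepartmentID( nID ), m_DepartmentName( name ) {}

    // The ID identifies the department: accessor, but no mutator.
    int get_id() const { return m_nDepartmentID; }

    // Renaming is assumed to be a real use case, so a mutator exists.
    const std::string& get_name() const { return m_DepartmentName; }
    void rename( const std::string& name ) { m_DepartmentName = name; }

    // The list itself is never handed out; callers go through
    // narrow operations instead.
    void add_employee( const EmployeeInfo& employee )
    {
        m_Employees.push_back( employee );
    }
    std::size_t employee_count() const { return m_Employees.size(); }
};
```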

kempofighter
July 14th, 2008, 10:41 AM
ANY statement that includes "ALWAYS" or "NEVER" is ALWAYS wrong. :eek:


Is the quoted statement also wrong? It seems like a catch-22. :p

TheCPUWizard
July 14th, 2008, 01:28 PM
Is the quoted statement also wrong? It seems like a catch-22. :p
:D :D :D :D Now there is some understanding. :D :D :D :D

TheCPUWizard
July 14th, 2008, 01:30 PM
Seems like this could be a struct that is encapsulated within a class, in which case leaving them public would be fine because the struct instances would be encapsulated.

Making it a struct changes NOTHING. The only difference is the default visibility, NOTHING ELSE.
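
A small illustration of that point, with type names invented for the purpose. PointS and PointC2 behave the same; PointC shows what the class keyword gives you by default:

```cpp
struct PointS
{
    int x;   // public by default
};

class PointC
{
    int x;   // private by default

public:
    PointC() : x( 0 ) {}
    int get_x() const { return x; }
};

// Same as PointS apart from the keyword and the explicit "public:".
class PointC2
{
public:
    int x;
};
```

Writing `PointC p; p.x = 1;` would be rejected by the compiler, while the same assignment on PointS or PointC2 is accepted.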

Now making it encapsulated really depends on where (public/private/protected) you make the nested class..


The main reason I wanted to reply once more is because I am unsure if you read my previous post about mutators. Providing mutators for every attribute is not a good idea. Class interfaces should be as minimal as possible. Keep data private or protected and provide accessors and mutators on a case by case basis only. Personally, I don't like protected data very much either. That is another debate that I won't get into but that is just my personal opinion. If a derived class needs access to a base class attribute, accessors and/or mutators are better (on a case by case basis).

Here we are in 100% agreement. Classes should always be minimal (yet complete).

pm_kirkham
July 14th, 2008, 01:41 PM
Actually we are at diametrically opposed positions.
BOTH of the items (bolded) are changes to EXTERNAL STATE. They are NOT behaviours. "Heating Up" something is changing the state of the external temperature. "Sending Control Signals" is changing the state of an input control. The process by which other states change as a result of these changes is the behaviour.
Temperature is not an external state. If you are talking physics, then there are no 'external states' of objects, only interactions due to energy flows to measuring equipment. Every time you take a layer of ports (the skin of the object you're heating, the voltage of the pin on the input control), you can also consider a flow (the heat over the boundary, the current through the wire). If you consider the flow rather than state, in my experience, you get a simpler, loosely coupled system. So in abstract OOP systems, you send a message to invoke a behaviour - OOP style - rather than setting a value and getting a behaviour as side effect - getter/setter style.

A light bulb takes current, which is one stimulus, not two. Modelling flow rather than potential makes a simpler model. A lighting system might be switched on or off - a flow of a control signal.
Almost all behaviours are induced as a RESULT of a Change in State, and the only way to detect the behaviour is by another set of state changes.
Can you provide any justification for that statement in the real world? Although some flows are due to fields, it's almost always the flow that causes the behaviour.
I have also done many simulators (ranging from particle physics to super-scalar physical systems such as urban modeling). The majority of the modeling classes do not expose ANY behaviours.
Interesting - I was expecting more RDBMS front ends, since DAOs end up with get/set abuse.

So if you were to model an aircraft engine, you'd just expose the data tables, rather than having a mechanism to request a certain throttle setting (in which you probably would call the method 'set_throttle', since that fits with the domain use case), and have the engine request a certain fuel-flow from the fuel system, and supply a certain force to the airframe? You'd put that behaviour somewhere other than internally to the engine class?

Have you tried using Demeter? I've built large systems both ways, and you've not given me any cause to change my conclusions.
One BIG advantage of this approach is that it results in more stable classes and significantly less coupling between the classes.
Not in my experience. If you've not tried it, then there's no point arguing with you - you're welcome to either read the papers, which express it better than I can, or ignore it. I find thinking about sources and sinks reduces coupling, rather than increasing it - you could slot any engine into any airframe and adjust the throttle and it would work the same.
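
A minimal sketch of the source/sink idea, under stated assumptions: FuelSource, ForceSink, Tank and Airframe are hypothetical names invented for illustration, and the linear flow-to-thrust relation is a placeholder rather than real engine physics. The caller only sends a throttle request; the engine asks its fuel source for a flow and pushes a force into whatever sink it was connected to.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical interfaces, invented for this sketch.
struct FuelSource {
    virtual ~FuelSource() = default;
    // Returns the flow actually delivered (kg/s), possibly less than requested.
    virtual double request_flow(double kg_per_s) = 0;
};

struct ForceSink {
    virtual ~ForceSink() = default;
    virtual void apply_force(double newtons) = 0;
};

class Engine {
public:
    Engine(FuelSource& fuel, ForceSink& airframe,
           double max_flow, double newtons_per_kg_s)
        : fuel_(fuel), airframe_(airframe),
          max_flow_(max_flow), thrust_per_flow_(newtons_per_kg_s) {}

    // The only message a caller can send: a throttle fraction in [0, 1].
    void set_throttle(double fraction) {
        assert(fraction >= 0.0 && fraction <= 1.0);
        const double delivered = fuel_.request_flow(fraction * max_flow_);
        // Placeholder model: thrust proportional to delivered fuel flow.
        airframe_.apply_force(delivered * thrust_per_flow_);
    }

private:
    FuelSource& fuel_;
    ForceSink& airframe_;
    double max_flow_;
    double thrust_per_flow_;
};

// Simple stand-ins so the sketch can be exercised.
struct Tank : FuelSource {
    explicit Tank(double limit_kg_per_s) : limit(limit_kg_per_s) {}
    double request_flow(double kg_per_s) override {
        return std::min(kg_per_s, limit);  // a restricted line caps the flow
    }
    double limit;
};

struct Airframe : ForceSink {
    void apply_force(double newtons) override { last_force = newtons; }
    double last_force = 0.0;
};
```

Any Engine written against these two interfaces can be connected to any Airframe, and nothing outside the engine ever reads or writes its fuel flow.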

I ask them to write the public definition of the "Cup" class.
What you put in the model depends on the use cases - what information you actually want the system to provide. These use cases drive what are features of the model - there are no 'inherent' features, only useful ones. In the above example, you wouldn't model the colour of the cup, only the behaviour of the cup important to what you're doing with it - it holds whatever fluid is put into it. You probably wouldn't care about its orientation, unless you had a use case to tip it up. (The sink would be a location which holds items, of which a cup is one. Some mechanism for locating the cup would be provided. The person who drinks pours whatever was in the cup somewhere else, presumably their mouth. I don't find that harder to understand than externalising everything.)

Maybe that's why most of my projects don't get the 1000s of classes - I'd probably not bother with hard-coding a cup in this case - there is no 'cup', only a buffered flow, and no special 'cup' behaviour (though your example, when you present it, may involve some other cuppyness not mentioned, like a ceramics collector). It really depends what you want the model to do. What it is, I don't really care.

TheCPUWizard
July 14th, 2008, 05:08 PM
Temperature is not an external state. If you are talking physics, then there are no 'external states' of objects, only interactions due to energy flows to measuring equipment. Every time you take a layer of ports (the skin of the object you're heating, the voltage of a pin on the input control), you can also consider a flow (the heat over the boundary, the current through the wire).

Except that the current is a function of the internals of the lightbulb. What is applied to the lightbulb is a voltage differential across the TWO inputs.

If you consider the flow rather than state, in my experience, you get a simpler, loosely coupled system. So in abstract OOP systems, you send a message to invoke a behaviour - OOP style - rather than setting a value and getting a behaviour as side effect - getter/setter style.

Except if you were to apply current, then your EXTERNAL application has to take into account the initial current surge, the increase in current as the filament ages, etc.

A light bulb takes current, which is one stimulus, not two. Modelling flow rather than potential makes a simpler model. A lighting system might be switched on or off - a flow of a control signal.

Again it is TWO. I can switch the "hot" side, or I can switch the "neutral" (discounting safety regulations).

Can you provide any justification for that statement in the real world? Although some flows are due to fields, it's almost always the flow that causes the behaviour.

Staying with the "wire" concept for a minute. Sending a "message" over the wire involves changing the electrical potential. ALWAYS. It may be a sine wave to transmit power, it may be a changing pattern of two (or more) states that is created and interpreted by the devices connected to each end.

If you tried to model a wire as something that transmitted a message, then your model would change continually, and you could NEVER create a class that was stable.


Interesting - I was expecting more RDBMS front ends since DAOs end up with get/set abuse.

So if you were to model an aircraft engine, you'd just expose the data tables, rather than having a mechanism to request a certain throttle setting (in which you probably would call the method 'set_throttle', since that fits with the domain use case), and have the engine request a certain fuel-flow from the fuel system, and supply a certain force to the airframe? You'd put that behaviour somewhere other than internally to the engine class?

I would set a physical throttle POSITION. The total impact of setting that position would be internal. A user of the code could not directly control the fuel-flow (rate), and it almost certainly would not be directly controlled by the engine itself. Classes that represented the various pipes, pumps, filters, etc. would all be connected. This means that no changes would need to be made to the throttle or the engine in the event that something was to constrict the fuel flow.


Have you tried using Demeter? I've built large systems both ways, and you've not given me any cause to change my conclusions.

Yes I have. I have found it quite useful in certain circumstances, but my experience (working with experts) has been that the models are NOT stable.


Not in my experience. If you've not tried it, then there's no point arguing with you - you're welcome to either read the papers, which express it better than I can, or ignore it. I find thinking about sources and sinks reduces coupling, rather than increasing it - you could slot any engine into any airframe and adjust the throttle and it would work the same.

Except that setting the throttle to a given position (classic pure mechanical throttle) should have completely different results (RPM, thrust, etc.) which can also vary over time. By keeping it to a very simple property, all of the differential behaviours are encapsulated.

What you put in the model depends on the use cases - what information you actually want the system to provide. These use cases drive what are features of the model - there are no 'inherent' features, only useful ones.

In the above example, you wouldn't model the colour of the cup, only the behaviour of the cup important to what you're doing with it - it holds whatever fluid is put into it. You probably wouldn't care about its orientation, unless you had a use case to tip it up. (The sink would be a location which holds items, of which a cup is one. Some mechanism for locating the cup would be provided. The person who drinks pours whatever was in the cup somewhere else, presumably their mouth. I don't find that harder to understand than externalising everything.)

If you have read Scott Meyers, then you are familiar with the concept of "Minimal Yet Complete". The approach I have used with great success for decades follows this. Actions that can be performed "on" (or "to") an item are virtually boundless. The set of characteristics that define an item is well defined and usually fairly small.

The majority of my classes have not required a single edit in many years. By completely defining the concept, I can write it once, and re-use it many times.


Maybe that's why most of my projects don't get the 1000s of classes - I'd probably not bother with hard-coding a cup in this case - there is no 'cup', only a buffered flow, and no special 'cup' behaviour (though your example, when you present it, may involve some other cuppyness not mentioned, like a ceramics collector). It really depends what you want the model to do. What it is, I don't really care.

There are situations that definitely work better with that approach, but my experience (1977-1992 primarily Defense/Space systems, 1992-2008 primarily Business & Industrial Applications) has been that they are in the minority.

Consider classic circuit analysis programs (e.g. Spice). The component models (resistor, capacitor, diode, transistor, etc.) all internally contained their behaviour. These items were written once, and then used in millions of different circuits where the "flows" and "behaviours" were radically different, but the models (classes) were immune to all of this. The only thing you did to use these models was set "properties" to indicate what "node" the physical pins were connected to.
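
As a rough illustration only (this is not how Spice itself is implemented, and the Resistor class, set_nodes and current names are made up for the sketch), a component model in this style keeps its behaviour internal and exposes nothing but the "which node is each pin on" properties:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical component model: behaviour is internal, wiring is a property.
class Resistor {
public:
    explicit Resistor(double ohms) : ohms_(ohms) { assert(ohms > 0.0); }

    // Properties: which circuit node each physical pin is connected to.
    void set_nodes(std::size_t pin_a, std::size_t pin_b) {
        node_a_ = pin_a;
        node_b_ = pin_b;
    }
    std::size_t node_a() const { return node_a_; }
    std::size_t node_b() const { return node_b_; }

    // Internal behaviour (Ohm's law): current from pin a to pin b,
    // given the voltage at every node of whatever circuit it sits in.
    double current(const std::vector<double>& node_voltages) const {
        assert(node_a_ < node_voltages.size());
        assert(node_b_ < node_voltages.size());
        return (node_voltages[node_a_] - node_voltages[node_b_]) / ohms_;
    }

private:
    double ohms_;
    std::size_t node_a_ = 0;
    std::size_t node_b_ = 0;
};
```

The same class can be dropped into any circuit; only the node properties differ from one use to the next.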


Switching to a "business application" environment, the same applies. I have ONE Customer class, ONE Vendor class, ONE Invoice class, ONE Invoice Item class, etc.

When I recently had to handle a very special set of rules (Laws, Tariffs, etc.) for an international client, the behaviours were all encapsulated in a unique set of classes for that country, and then applied. This approach allowed for over 80% of the application to be built from pre-existing code, resulting in a much lower cost and faster delivery time.

If the codebase had been centered on the behaviours (Taking an Order, Getting Approval to Fulfill, Accepting the Order, Filing Shipping Paperwork, etc.) then it would have required significant re-work.
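
A hedged sketch of that split, with every name (Invoice, InvoiceItem, CountryRules, FlatTariffRules) and the flat-rate tariff invented for illustration rather than taken from the real application: the entity class stays small and unchanged, while the country-specific behaviour lives in its own class and is applied from outside.

```cpp
#include <cassert>
#include <vector>

struct InvoiceItem {
    double unit_price;
    int quantity;
};

// Stable entity class: a small, complete set of characteristics.
class Invoice {
public:
    void add_item(const InvoiceItem& item) {
        assert(item.quantity > 0);
        items_.push_back(item);
    }
    const std::vector<InvoiceItem>& items() const { return items_; }
    double net_total() const {
        double total = 0.0;
        for (const InvoiceItem& item : items_) {
            total += item.unit_price * item.quantity;
        }
        return total;
    }

private:
    std::vector<InvoiceItem> items_;
};

// Country-specific behaviour is kept out of Invoice entirely.
struct CountryRules {
    virtual ~CountryRules() = default;
    virtual double amount_due(const Invoice& invoice) const = 0;
};

// Hypothetical rule set: a single flat tariff on the net total.
class FlatTariffRules : public CountryRules {
public:
    explicit FlatTariffRules(double rate) : rate_(rate) {}
    double amount_due(const Invoice& invoice) const override {
        return invoice.net_total() * (1.0 + rate_);
    }

private:
    double rate_;
};
```

Supporting another country then means adding another CountryRules subclass, with no edit to Invoice.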

I do not think we are "arguing", but rather taking different approaches that each work for us. But I am curious...

1) Of the code you have delivered for new projects in the past six months, what percentage of it has survived without modification for over three years?

2) Between the time you start "Beta" testing until the product's end of life cycle, what percentage of the code ends up requiring modification?

pm_kirkham
July 15th, 2008, 08:11 AM
Staying with the "wire" concept for a minute. Sending a "message" over the wire involves changing the electrical potential. ALWAYS. It may be a sine wave to transmit power, it may be a changing pattern of two (or more) states that is created and interpreted by the devices connected to each end.
One of my first jobs was on a 2-wire control system, where the controller sent a constant voltage and the devices changed their resistance to signal back to the controller - so the signal from the device to the controller (which also powered it) was the current the controller had to provide to maintain the same voltage. (The controller signalled by changing voltage, which is a more common way to specify the interface, though of course to implement it it had to control the current. Eggs, chickens.) In systems analysis, I do find that looking at flows and controls yields a simpler, but still correct, model than one based on potentials - often all you care is that a bulb converts a flow of energy into light, rather than bothering with switch positions.

I have found it quite useful in certain circumstances, but my experience (working with experts) has been that the models are NOT stable.
Usually I'm working on frameworks - providing a domain-specific language interpreter/compiler - so the models are created by modellers, and the models aren't stable. But they aren't the classes in the code, just data that those classes process.
Consider classic circuit analysis programs (e.g. Spice). The component models (resistor, capacitor, diode, transistor, etc.) all internally contained their behaviour. These items were written once, and then used in millions of different circuits where the "flows" and "behaviours" were radically different, but the models (classes) were immune to all of this. The only thing you did to use these models was set "properties" to indicate what "node" the physical pins were connected to.
Very similar to the bulk of my sort of work - I build the high-level components that are configured into a model by the modellers. I don't tend to use C++ for the model itself - that changes simulation to simulation, and compiling changes takes time, and C++ is a different skill to domain knowledge. The behaviour of each component type doesn't change. The behaviour of the system changes based on the configuration - what you connect to what, but that's not normally coded as separate classes. Typically it's set on initialisation rather than using individual setter methods - each component has an initialise_properties(const map<id, value>&) and a connect(const map<id, port>&) function, so that's all encapsulated within the component. The system class would be a collection of components and external ports, and the behaviour of the system would be emergent from the combination of the components - each of which has a small, fixed behaviour. That the flows and behaviours of a different circuit vary doesn't require that you create a class for each type of circuit, only that you have a mechanism for configuring the system in a multitude of ways. I don't find that you need getters or setters for that sort of model - if you were relying on them to initialise objects, then adding a new type of component would require a change to the initialisation code, whereas by encapsulating the initialisation you don't - you can load a DLL containing a function which returns a new, uninitialised component, add that function pointer to a map of component-type name to function - and plug in new behaviours while you're still running.
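
Roughly, and only as a sketch under assumptions: the Id, Value and Port types, the Gain and Probe components, the receive() call and the registry()/create() helpers below are all invented for illustration, and the DLL loading step is left out (a lambda stands in for the function a plug-in would export). The shape is a component configured through two map-taking calls, plus a name-to-factory map that new component types can be added to at run time.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

using Id = std::string;
using Value = double;

class Component;
using Port = Component*;  // simplification: a port is just the target component

class Component {
public:
    virtual ~Component() = default;
    virtual void initialise_properties(const std::map<Id, Value>& props) = 0;
    virtual void connect(const std::map<Id, Port>& ports) = 0;
    virtual void receive(Value v) = 0;  // a flow arriving at this component
};

// Small fixed behaviour: multiply whatever arrives and pass it on.
class Gain : public Component {
public:
    void initialise_properties(const std::map<Id, Value>& props) override {
        auto it = props.find("factor");
        if (it != props.end()) factor_ = it->second;
    }
    void connect(const std::map<Id, Port>& ports) override {
        auto it = ports.find("out");
        if (it != ports.end()) out_ = it->second;
    }
    void receive(Value v) override {
        if (out_) out_->receive(v * factor_);
    }

private:
    Value factor_ = 1.0;
    Port out_ = nullptr;
};

// Sink that records the last value, standing in for an external port.
class Probe : public Component {
public:
    void initialise_properties(const std::map<Id, Value>&) override {}
    void connect(const std::map<Id, Port>&) override {}
    void receive(Value v) override { last = v; }
    Value last = 0.0;
};

using Factory = std::function<std::unique_ptr<Component>()>;

// Component-type name -> function returning a new, uninitialised component.
std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> factories;
    return factories;
}

std::unique_ptr<Component> create(const std::string& type_name) {
    auto it = registry().find(type_name);
    assert(it != registry().end() && "unknown component type");
    return it->second();
}
```

The code that builds a system only ever calls create(), initialise_properties() and connect(), so registering a new component type needs no change to it.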
Switching to a "business application" environment, the same applies. I have ONE Customer class, ONE Vendor class, ONE Invoice class, ONE Invoice Item class, etc.

When I recently had to handle a very special set of rules (Laws, Tariffs, etc.) for an international client, the behaviours were all encapsulated in a unique set of classes for that country, and then applied. This approach allowed for over 80% of the application to be built from pre-existing code, resulting in a much lower cost and faster delivery time.
That seems almost exactly the same as what I do, but I don't see how getters and setters come into it - once each component is configured to know where it has to pass its results to, there doesn't need to be anything external to it to set or get values.
If the codebase had been centered on the behaviours (Taking an Order, Getting Approval to Fulfill, Accepting the Order, Filing Shipping Paperwork, etc.) then it would have required significant re-work.
I don't think you're modelling 'behaviour' quite the way I do - you're describing business processes, which I'd probably put into an XML BPML model with a nice COTS UI for the business process modellers, and either generate code or just interpret it. The behaviours of the classes which make up the interpreter, or the Customer or OrderItem, and the framework around the model, are stable and don't change rapidly. The other stuff is just data, and you can change that all you want without re-work. (Unless you find you have something which can't be elegantly composed from your existing behaviours, such as your special set of rules, where you might well want a new type of component.)
I do not think we are "arguing", but rather taking different approaches that each work for us. But I am curious...
Yes. I've almost no non-aerospace/defence experience - almost all of my work has been producing modelling and simulation frameworks, and occasionally algorithm design. What is the reason you find a 'spice for business process flows' doesn't work? In what circumstances do you actually need getters and setters to implement the model, or do they have a different role in other types of applications?

JohnW@Wessex
July 15th, 2008, 09:21 AM
The biggest change in class design that has affected me was moving from a Data + Algorithms to a Container <-> Iterator <-> Algorithm model, influenced by exposure to the STL.

Most image libraries, including my own initial design from years ago, appear to work on the Data + Algorithms model. The image class holds the data and member functions supply the common transformations (mirror, flip, rotate). Unfortunately this model has its downside as more 'common' algorithms and transformations are identified and added. The class grows until it contains an API that is far too complicated and hard to modify. The number of transformations that can be applied to an image is effectively open-ended, and so the class can never really reach a position of stability.

With the Container <-> Iterator <-> Algorithm model the image class becomes much more stable. Its only job is to contain images and to expose iterators to allow access to the data. The image algorithms and transformations only operate through iterators and become independent of the internal structure of the image and each other. The image container's API is now relatively simple, complete and stable.
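
A small sketch of what that can look like, assuming a hypothetical 8-bit greyscale Image stored row by row (this is not my actual library, and the row_begin/row_end names are made up for the example). The container only owns pixels and hands out iterators; mirror and flip are free functions built on standard algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Container: owns the pixels and exposes iterators, nothing else.
class Image {
public:
    using Pixel = std::uint8_t;
    using iterator = std::vector<Pixel>::iterator;

    Image(std::size_t width, std::size_t height)
        : width_(width), height_(height), pixels_(width * height) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    iterator begin() { return pixels_.begin(); }
    iterator end() { return pixels_.end(); }

    // Iterators delimiting a single row of pixels.
    iterator row_begin(std::size_t y) {
        assert(y < height_);
        return pixels_.begin() + static_cast<std::ptrdiff_t>(y * width_);
    }
    iterator row_end(std::size_t y) {
        return row_begin(y) + static_cast<std::ptrdiff_t>(width_);
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<Pixel> pixels_;
};

// Algorithms: free functions that only touch the image through iterators.
void mirror(Image& img) {  // left-right
    for (std::size_t y = 0; y < img.height(); ++y) {
        std::reverse(img.row_begin(y), img.row_end(y));
    }
}

void flip(Image& img) {  // top-bottom
    std::size_t top = 0;
    std::size_t bottom = img.height();
    while (top + 1 < bottom) {
        --bottom;
        std::swap_ranges(img.row_begin(top), img.row_end(top),
                         img.row_begin(bottom));
        ++top;
    }
}
```

New transformations are added as further free functions, so the Image API itself does not grow.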