Author Archives: timoch

Annoyed with INotifyPropertyChanged?

Have you ever been annoyed with having to implement the cumbersome plumbing required for INotifyPropertyChanged? Well, I have. So I tried to find a way to make authoring bindable objects better.

The typical example:
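
A hand-rolled implementation typically looks something like this (the Person class and its properties are just an illustration):

```csharp
using System.ComponentModel;

public class Person : INotifyPropertyChanged
{
    private string _firstName;

    public event PropertyChangedEventHandler PropertyChanged;

    public string FirstName
    {
        get { return _firstName; }
        set
        {
            if (_firstName != value)
            {
                _firstName = value;
                OnPropertyChanged("FirstName");
            }
        }
    }

    protected virtual void OnPropertyChanged(string propertyName)
    {
        // Copy the handler locally to avoid a race with unsubscription.
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}
```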

As you can see, it is quite verbose. The event and OnPropertyChanged() method need to be implemented for each class, and it’s easy to get the implementation of OnPropertyChanged() wrong (typically introducing a race condition). Moreover, it’s 10 lines of code for each property. I hate that. The more you have to write, the bigger the surface for bugs to appear. A property should just boil down to public string FirstName { get; set; }.

That won’t be possible of course. The default setters and getters just handle the assignment and loading of a compiler-generated backing field. So we somehow need to add the code for each property.

Enter Bindable objects

Similar to class NotificationObject, Bindable is here to help implement INotifyPropertyChanged. The NotificationObject class only implements RaisePropertyChanged() but does not help with the implementation of the properties. Here is what you can do with Bindable:
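
A sketch of the same Person class built on Bindable:

```csharp
public class Person : Bindable
{
    public string FirstName
    {
        get { return Get<string>(); }
        set { Set(value); }
    }

    public string LastName
    {
        get { return Get<string>(); }
        set { Set(value); }
    }
}
```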

Noticeably shorter, isn’t it?

Behind the scenes, Bindable uses a dictionary to store the property values. You have probably noticed that the property name is not given to Bindable.Get() or Bindable.Set(). Bindable leverages the compiler to provide the value automatically:
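
The signatures of the two helpers look like this (a sketch):

```csharp
protected T Get<T>([CallerMemberName] string name = null);
protected void Set<T>(T value, [CallerMemberName] string name = null);
```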

CallerMemberNameAttribute, when applied to an optional parameter, instructs the compiler to pass a string whose value is the name of the calling member. So when property FirstName calls Get<string>(), the compiler generates code for Get<string>("FirstName").

Here is the actual code for class Bindable:
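
In essence, it looks like this (a minimal sketch; locking and subclass hooks omitted):

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

public class Bindable : INotifyPropertyChanged
{
    private readonly Dictionary<string, object> _properties = new Dictionary<string, object>();

    public event PropertyChangedEventHandler PropertyChanged;

    // Gets the value of a property. The compiler fills in 'name' thanks to [CallerMemberName].
    protected T Get<T>([CallerMemberName] string name = null)
    {
        object value;
        if (_properties.TryGetValue(name, out value))
        {
            return (T)value;
        }
        return default(T);
    }

    // Sets the value of a property and raises PropertyChanged only if it actually changed.
    protected void Set<T>(T value, [CallerMemberName] string name = null)
    {
        if (Equals(value, Get<T>(name)))
        {
            return;
        }
        _properties[name] = value;
        OnPropertyChanged(name);
    }

    protected void OnPropertyChanged([CallerMemberName] string name = null)
    {
        // Copy the handler locally to avoid a race with unsubscription.
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(name));
        }
    }
}
```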

Some aspects can be improved, for example enabling subclasses to provide their own backing field. This is left as an exercise for the reader ;-)

I use this class a lot when implementing MVVM, in either WPF or WinForms.

What do you think?

Edit: fixed code snippets as per Krumelur’s comments. Thanks Krumelur.

Unit testing model validation with MVC’s DataAnnotations

In a previous post, I mentioned that model validation should be tested separately from controller logic. I will demonstrate a way of unit testing the validation of models implemented with System.ComponentModel.DataAnnotations.

It is actually quite easy to unit test model validation. Models are inherently easy to test separately due to their POD (Plain Old Data) nature: we can instantiate them directly. Moreover, DataAnnotations provides us with the necessary interface to run the validation against a model object completely separately from the rest of the application.

A first model validation test class

Here is a basic model that we will unit test for the demonstration:
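
Something like the following PersonModel will do (the exact model and its rules are illustrative):

```csharp
using System.ComponentModel.DataAnnotations;

public class PersonModel
{
    [Required]
    [StringLength(50)]
    public string FirstName { get; set; }

    [Required]
    [StringLength(50)]
    public string LastName { get; set; }

    [Phone]
    public string PhoneNumber { get; set; }
}
```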

As you can see, it is quite simple. We’ll go directly to the unit test implementation.

The strategy we are going to use consists of basing each test case on a valid model instance, then modifying it in such a way that it triggers one single validation error. We end up with a skeleton unit test class like this:
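
A sketch of that skeleton, assuming NUnit (spec field names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;
using NUnit.Framework;

[TestFixture]
public class PersonModelValidationTests
{
    [TestCaseSource("ValidateRule_Source")]
    public void ValidateRule(ValidateRuleSpec spec)
    {
        // Start from a valid model, then apply the spec's modification
        // so that exactly one validation rule is violated.
        var model = CreateValidModel();
        spec.MutateModel(model);

        var results = new List<ValidationResult>();
        var context = new ValidationContext(model, null, null);
        var isValid = Validator.TryValidateObject(model, context, results, true);

        Assert.IsFalse(isValid, spec.ToString());
        Assert.AreEqual(1, results.Count);
        Assert.AreEqual(spec.ExpectedMemberName, results.Single().MemberNames.Single());
    }

    private static IEnumerable<ValidateRuleSpec> ValidateRule_Source()
    {
        // The specs are provided here (implementation shown below).
        yield break;
    }

    private static PersonModel CreateValidModel()
    {
        return new PersonModel { FirstName = "John", LastName = "Doe", PhoneNumber = "+32 2 555 12 34" };
    }

    public class ValidateRuleSpec
    {
        public string RuleName;
        public string ExpectedMemberName;
        public Action<PersonModel> MutateModel;

        public override string ToString()
        {
            // A distinct, readable name per test case in the test report.
            return string.Format("{0} should fail validation of {1}", RuleName, ExpectedMemberName);
        }
    }
}
```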

Some explanation:

  • ValidateRule() implements the test itself. However, it gets the specifications for each test via an argument. 
  • ValidateRule_Source() provides the specs for our test.
  • class ValidateRuleSpec holds the specifications. Its ToString() uses the spec values to render a distinct string per test. This makes unit test reports easy to read. In case of failure, you know exactly which spec failed.

And now the implementation of our ValidateRule_Source():
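
A sketch, continuing with the illustrative PersonModel:

```csharp
private static IEnumerable<ValidateRuleSpec> ValidateRule_Source()
{
    yield return new ValidateRuleSpec
    {
        RuleName = "Required",
        ExpectedMemberName = "FirstName",
        MutateModel = m => m.FirstName = null,
    };
    yield return new ValidateRuleSpec
    {
        RuleName = "StringLength",
        ExpectedMemberName = "FirstName",
        MutateModel = m => m.FirstName = new string('x', 51),
    };
    yield return new ValidateRuleSpec
    {
        RuleName = "Required",
        ExpectedMemberName = "LastName",
        MutateModel = m => m.LastName = null,
    };
    yield return new ValidateRuleSpec
    {
        RuleName = "Phone",
        ExpectedMemberName = "PhoneNumber",
        MutateModel = m => m.PhoneNumber = "not-a-phone-number",
    };
}
```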

Refining the solution

This works but can be improved. Most of the functionality can be abstracted. The actual test need only provide the valid model object and the specifications. A bit of refactoring yields a nicer design for our test:
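
A sketch of the refactored design, with ValidateRuleSpec made generic over the model type:

```csharp
public abstract class ModelValidationTestsBase<TModel>
{
    // Subclasses provide the valid baseline instance.
    protected abstract TModel CreateValidModel();

    // Applies the spec to a valid model and asserts that exactly
    // the expected validation error is reported.
    protected void ValidateRule(ValidateRuleSpec<TModel> spec)
    {
        var model = CreateValidModel();
        spec.MutateModel(model);

        var results = new List<ValidationResult>();
        var isValid = Validator.TryValidateObject(model, new ValidationContext(model, null, null), results, true);

        Assert.IsFalse(isValid, spec.ToString());
        Assert.AreEqual(1, results.Count);
        Assert.AreEqual(spec.ExpectedMemberName, results.Single().MemberNames.Single());
    }
}

[TestFixture]
public class CreatePersonModelValidationTests : ModelValidationTestsBase<PersonModel>
{
    protected override PersonModel CreateValidModel()
    {
        return new PersonModel { FirstName = "John", LastName = "Doe", PhoneNumber = "+32 2 555 12 34" };
    }

    [TestCaseSource("ValidateRule_Source")]
    public void ValidateRules(ValidateRuleSpec<PersonModel> spec)
    {
        ValidateRule(spec);
    }

    private static IEnumerable<ValidateRuleSpec<PersonModel>> ValidateRule_Source()
    {
        yield return new ValidateRuleSpec<PersonModel>
        {
            RuleName = "Required",
            ExpectedMemberName = "FirstName",
            MutateModel = m => m.FirstName = null,
        };
        // ... the other specs, unchanged ...
    }
}
```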

Notice how we trimmed CreatePersonModelValidationTests to a minimum. The class ModelValidationTestsBase can now be used for most of our model validation unit tests.

What do you think?

Unit test equality is not domain equality

I came across a question on Stack Overflow concerning comparing objects for equality in unit tests. The poster basically wants to get rid of a series of assertions like this:
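
Something along these lines (property names illustrative):

```csharp
Assert.AreEqual(expected.FirstName, actual.FirstName);
Assert.AreEqual(expected.LastName, actual.LastName);
Assert.AreEqual(expected.PhoneNumber, actual.PhoneNumber);
Assert.AreEqual(expected.Street, actual.Street);
Assert.AreEqual(expected.City, actual.City);
```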

I agree, this is ugly. All the more so if you have multiple tests that assert on the equality of these properties. You can always factor it out into a helper function, but you still end up actually writing the comparisons yourself.

The selected answer doesn’t feel right. The proposal is to implement Equals() in the class of your tested object. This is not always desirable or even possible. Consider the case where your use case actually makes use of Equals() in its logic: there may already exist an implementation of Equals() that satisfies different needs than those of your test. Moreover, when overriding Equals(), there is more to it than just this single function. GetHashCode() must be implemented too… and correctly! If you don’t implement GetHashCode(), you may end up with subtle or not-so-subtle bugs if your object gets stored as a dictionary key. In most cases, it will not be an issue, because only very few classes are actually used as dictionary keys. However, if you get into the habit of overriding Equals() without GetHashCode(), you can be bitten hard!

One of the most favored answers is to use reflection to discover the distinct properties. This is the way to go: code that solely exists for testing purposes should be kept away from the classes you test. However, I find the proposed solution sub-optimal. For one thing, the method is solely dedicated to testing and directly calls Assert.AreEqual(). For another, I don’t like that it automatically recurses into IList properties, but this is a question of style.

I would propose a general-purpose utility method like this:
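
A sketch of the idea: compare the public properties via reflection and report each difference as a readable message, leaving the assertion style to the caller.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ObjectComparer
{
    // Compares the public readable properties of two objects of the same type
    // and yields a human-readable message per difference. An empty result
    // means the objects are equal property-wise.
    public static IEnumerable<string> GetDifferences<T>(T expected, T actual)
    {
        if (ReferenceEquals(expected, actual))
        {
            yield break;
        }
        if (expected == null || actual == null)
        {
            yield return string.Format("expected '{0}' but was '{1}'", expected, actual);
            yield break;
        }

        foreach (var property in typeof(T).GetProperties().Where(p => p.CanRead))
        {
            var expectedValue = property.GetValue(expected, null);
            var actualValue = property.GetValue(actual, null);
            if (!Equals(expectedValue, actualValue))
            {
                yield return string.Format(
                    "{0}.{1}: expected '{2}' but was '{3}'",
                    typeof(T).Name, property.Name, expectedValue, actualValue);
            }
        }
    }
}
```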

and of course, the unit test that goes along with it:
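
For instance (NUnit, with the illustrative Person class):

```csharp
[Test]
public void GetDifferences_ReportsEachDifferingProperty()
{
    var expected = new Person { FirstName = "John", LastName = "Doe" };
    var actual = new Person { FirstName = "John", LastName = "Smith" };

    var differences = ObjectComparer.GetDifferences(expected, actual).ToList();

    Assert.AreEqual(1, differences.Count);
    StringAssert.Contains("LastName", differences[0]);
}

[Test]
public void GetDifferences_ReturnsNothingForEqualObjects()
{
    var a = new Person { FirstName = "John", LastName = "Doe" };
    var b = new Person { FirstName = "John", LastName = "Doe" };

    Assert.AreEqual(0, ObjectComparer.GetDifferences(a, b).Count());
}
```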

It can then be used in a unit test like this:
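
For example:

```csharp
[Test]
public void EditPerson_UpdatesThePerson()
{
    // ... arrange and act elided ...
    var differences = ObjectComparer.GetDifferences(expectedPerson, actualPerson).ToList();
    Assert.AreEqual(0, differences.Count, string.Join(Environment.NewLine, differences));
}
```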

You can wrap the method in a unit-test friendly static assert method or any way you like. The above test would fail in a quite explicit way. An error message looks like this:
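
With the sketch above, a failing comparison reports something like:

```
Person.LastName: expected 'Doe' but was 'Smith'
```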

What do you think?

Asp.Net MVC: Testing your controller actions

I can’t say I like every aspect of Microsoft ASP.Net MVC. But one aspect I do like is the ability to unit test most of the components of your application. Standard ASP.Net did not prevent you from testing your application; however, the framework and documentation did not encourage you to organize your application in a testable way.

ASP.Net MVC really pushes for loosely coupled components and as such, encourages unit testing. Your controllers are not much more than plain methods taking some input, executing some logic and returning an output. Your controller is not responsible for much in the end. Even though it is a central part of any functionality, its responsibility is reduced to implementing the logic for responding to a certain type of request. It relies on the framework configuration and conventions to call its methods in an appropriate way.

Understanding the extent of a controller’s responsibility is key to writing good, concise unit tests. Sometimes, it’s easier to remember what it is not responsible for:

  • Model binding: turning the encoded form data or routing information into .Net objects. A controller action does not need to know and does not care whether the id of the object you are updating is part of your action’s path (eg. /admin/tags/edit/my-tag ) or provided in the form of an encoded form field (eg. <input type="text" name="tagSlug" /> )
  • Model validation: it may seem surprising but in most cases, your controller should not implement the validation logic. From a controller action’s point of view, is there really a difference between a person’s last name missing and an unknown phone number format? As far as your hypothetical EditPerson action is concerned, there is a validation error. Model validation should be tested separately.
  • Pure business logic: a controller action is meant to handle requests. Even though it does not directly cope with http intricacies, it is still very close to the http request/response life cycle. For example, an action ComputeLoanSchedule that needs to return a loan amortization schedule should probably delegate the actual computation to a business service class (ILoanService.GetAmortizationSchedule()) whose sole purpose is to handle such computation. In the future, if you need to expose the amortization schedule feature as a web api, you will only need to implement another controller and call the same business service.

In the end, your controller is only responsible for:

  • Delegating work to domain services: as stated above, the domain logic of your application should be logically separated from your UI layer. It also gives you the flexibility to scale your web application and business domain services. 
  • Returning an appropriate response: a controller’s action responds to a request by returning a response. The standard MVC Controller class expects your actions to provide a result that drives the way the response is created, e.g. returning a ViewResult triggers view rendering while a RedirectToRouteResult may respond with an HTTP 302.

Unit testing a controller action

When it comes to unit testing, the less to test, the better. As such, the limited responsibility of controller actions is a boon when writing unit tests. As an example, I will be using a very simple controller, TagsController. It is part of a blog-like website. Its role is to allow for the management of tags applied to other components of the application, like articles.

TagsController exposes a CRUD-like set of functionality:

  • Index: provides the user with a list of existing tags
  • Create: enables the creation of new tags
  • Edit: enables editing existing tags

The simple case

We will focus on the index functionality for now. It is implemented as a single action on our TagsController:
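
A sketch of what this looks like (ITagService and Tag are simplified stand-ins):

```csharp
using System.Linq;
using System.Web.Mvc;

public class TagsController : Controller
{
    private readonly ITagService _tagService;

    public TagsController(ITagService tagService)
    {
        _tagService = tagService;
    }

    public ActionResult Index()
    {
        // 1. Ask the service for the tags to display, sorted by name.
        var tags = _tagService.GetAll().OrderBy(t => t.Name).ToList();
        // 2. Tell the framework to render the "Index" view with the tags as model.
        return View("Index", tags);
    }
}
```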

The TagsController constructor takes an ITagService, which gives it access to the tags in our database. As you can see, the Index() action method does only 2 distinct operations. First, it asks for the tags to display. Then, it tells the framework to render the “Index” view using the retrieved tags as a model.

With such a simple implementation, the unit test will be quite simple also. I’ll follow the usual AAA (Arrange-Act-Assert) pattern.
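
The test, assuming Moq as the mocking library (a sketch):

```csharp
[Test]
public void Index_RendersIndexViewWithTagsSortedByName()
{
    // Arrange
    var mockService = new Mock<ITagService>();
    mockService.Setup(s => s.GetAll()).Returns(new[]
    {
        new Tag { Name = "zebra" },
        new Tag { Name = "apple" },
    });
    var controller = new TagsController(mockService.Object);

    // Act
    var result = controller.Index() as ViewResult;

    // Assert
    Assert.IsNotNull(result);
    Assert.AreEqual("Index", result.ViewName);
    mockService.Verify(s => s.GetAll());
    var model = (IList<Tag>)result.Model;
    Assert.AreEqual(new[] { "apple", "zebra" }, model.Select(t => t.Name).ToArray());
}
```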

The first part sets up an ITagService mock for TagsController to consume. We then simply call method Index().

The assertions start with making sure we got a non-null result of the ViewResult type. Index() should ask for the “Index” view to be rendered. I prefer explicitly specifying the view name to render in my controller actions. I believe this reduces the mental gymnastics necessary when debugging action-view interactions.

mockService.Verify() ensures ITagService.GetAll() was called.

I then proceed with checking that the model provided to the view is consistent with the data from the mock service object. One of the requirements is that the tags are sorted by name.

As you can see in the Index unit test, there is a lot more code than in the method being tested. This is also one of the reasons why you should reduce the scope of your tests as much as you can.

A more complex test case

I’ll cover the “create a new tag” functionality. In this case, the functionality is implemented by a pair of actions. A parameter-less Create() action simply triggers the rendering of an empty tag editor (implemented by a view named “Save”). The other Create() action takes a SaveTagModel parameter and responds to form submissions. As such, it behaves differently when there is a validation error or when the tag already exists in the tag db.
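
In outline, the pair of actions looks like this (a sketch; GetBySlug() is an assumed ITagService member):

```csharp
public ActionResult Create()
{
    // Initial request: render an empty tag editor.
    return View("Save", new SaveTagModel());
}

[HttpPost]
public ActionResult Create(SaveTagModel tag)
{
    if (!ModelState.IsValid)
    {
        // Any model error, whatever the rule: redisplay the editor.
        return View("Save", tag);
    }
    if (_tagService.GetBySlug(tag.Slug) != null)
    {
        ModelState.AddModelError("Slug", "A tag with this slug already exists.");
        return View("Save", tag);
    }
    _tagService.Save(new Tag { Name = tag.Name, Slug = tag.Slug });
    return RedirectToAction("Index");
}
```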

I’ll show the tests I put together to cover the functionality of TagsController.Create(). The first of those tests ensures the initial request to Create() triggers the rendering of an empty tag editor. The tag editor is implemented by a view called “Save”, shared between the Create() and Edit() operations. As you can see, I make sure the view is specified by name. I also make sure the model is in a state consistent with an empty SaveTagModel.
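
The first test, as a sketch:

```csharp
[Test]
public void Create_RendersEmptySaveView()
{
    var controller = new TagsController(new Mock<ITagService>().Object);

    var result = controller.Create() as ViewResult;

    Assert.IsNotNull(result);
    Assert.AreEqual("Save", result.ViewName);
    var model = (SaveTagModel)result.Model;
    Assert.IsNull(model.Name);
    Assert.IsNull(model.Slug);
}
```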

The next tests cover the behavior of the Create(SaveTagModel tag) action. That is the action that responds to form submissions from the tag editor. These tests need to cover the following:

  • What happens when invalid input is provided?
  • What happens when a tag exists with the same ‘slug’?
  • What happens upon success?

It will not come as a surprise that I wrote 3 tests, one for each of these points. The first test, shown below, ensures Create() behaves correctly when provided with invalid data. This test demonstrates an important point: the controller is not aware of and does not care what the actual model errors are. Its ‘invalid input’ behavior is triggered by any model error. We want to reduce our controller logic as much as possible. There may be some cases where the controller’s action behaves differently based on specific errors. If you can avoid it, do so; it goes against the separation of concerns. A controller is not responsible for validation.
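
A sketch of that first test:

```csharp
[Test]
public void Create_WithInvalidModel_RedisplaysTheEditor()
{
    var mockService = new Mock<ITagService>();
    var controller = new TagsController(mockService.Object);
    var tag = new SaveTagModel();
    // Any model error triggers the 'invalid input' behavior;
    // the controller does not care which rule failed.
    controller.ModelState.AddModelError("Name", "whatever");

    var result = controller.Create(tag) as ViewResult;

    Assert.IsNotNull(result);
    Assert.AreEqual("Save", result.ViewName);
    Assert.AreSame(tag, result.Model);
    mockService.Verify(s => s.Save(It.IsAny<Tag>()), Times.Never());
}
```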

Considering what I have just said, there is an issue with the current implementation of Create(): it checks that a tag does not exist. This is a form of validation and should probably be moved either to model validation (implemented by a custom validation attribute) or to the service (by adding a specific ITagService.Create() method). On the other hand, since the validation relies on a service component, one could argue that it is part of the orchestration the controller is responsible for. I will leave it here because it is such a trivial validation. Anything more complex, I would extract and test separately. My rule of thumb is: if it takes more than one unit test to cover a piece of validation in the controller, the validation should be moved to its own class.

Here is the test that covers that part.
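
A sketch:

```csharp
[Test]
public void Create_WithExistingSlug_RedisplaysTheEditorWithAnError()
{
    var mockService = new Mock<ITagService>();
    mockService.Setup(s => s.GetBySlug("my-tag")).Returns(new Tag { Slug = "my-tag" });
    var controller = new TagsController(mockService.Object);

    var result = controller.Create(new SaveTagModel { Name = "My tag", Slug = "my-tag" }) as ViewResult;

    Assert.IsNotNull(result);
    Assert.AreEqual("Save", result.ViewName);
    Assert.IsFalse(controller.ModelState.IsValid);
    mockService.Verify(s => s.Save(It.IsAny<Tag>()), Times.Never());
}
```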

Last but not least, here is the test that covers the success path. Our action should have called ITagService.Save() with an appropriate argument, and it should redirect us to the index page for our tags. As you can see, the redirection is tested against route values, not against an actual URI. The routing configuration is covered by another set of tests; this will be the subject of another post in the next few days.
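
In outline:

```csharp
[Test]
public void Create_WithValidNewTag_SavesAndRedirectsToIndex()
{
    var mockService = new Mock<ITagService>();
    mockService.Setup(s => s.GetBySlug("my-tag")).Returns((Tag)null);
    var controller = new TagsController(mockService.Object);

    var result = controller.Create(new SaveTagModel { Name = "My tag", Slug = "my-tag" })
        as RedirectToRouteResult;

    Assert.IsNotNull(result);
    // Tested against route values, not a rendered URL.
    Assert.AreEqual("Index", result.RouteValues["action"]);
    mockService.Verify(s => s.Save(It.Is<Tag>(t => t.Slug == "my-tag" && t.Name == "My tag")));
}
```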

Conclusion

As you can see, even though our controller is quite simple (3 methods only, and simple ones at that), we had to write quite a few lines of unit test code to cover all the code paths. If you want to keep your unit tests to a minimum, make sure you respect the separation of concerns principle. Test each of the involved components separately and reduce the contract between them to a minimum. The less they know about each other, the easier it is to change one component without the change rippling through your whole application.

In the next few posts, I will cover testing model binding and validation as well as routes.

Don’t hesitate to hail me in the comments or on Twitter.

Testing XPathNavigator

In my previous post about XPathNavigator, I explained in what circumstances the default implementation of XPathNavigator is troublesome. I went over the design of the class and highlighted how that design helps us re-implement XPathNavigator to address the issue.

Testing XPathNavigator

First things first, before attacking the new implementation proper, we want to make sure our implementation is compatible with the default implementation. To do so, we will write tests that will be run both against the Microsoft implementation as well as our implementation once it exists. Our goal here is really twofold. On the one hand, we want to ensure the existing implementation actually works as documented. On the other hand, we want to check our own implementation against the specification tests.

What should we test?

XPathNavigator is a complex class, so we want to limit our tests to what actually matters for the new implementation. Otherwise, we may end up writing literally hundreds of tests.

It is obviously not necessary to test methods that will not be re-implemented. In the previous post, we identified a subset of methods that we will need to re-implement. All other methods somehow use this basic subset to implement their functionality. The subset is the list of abstract members:
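
From the .NET reference, the abstract surface looks roughly like this (signatures abbreviated):

```csharp
// Abstract properties: information about the current node.
public abstract string BaseURI { get; }
public abstract bool IsEmptyElement { get; }
public abstract string LocalName { get; }
public abstract string Name { get; }
public abstract string NamespaceURI { get; }
public abstract XmlNameTable NameTable { get; }
public abstract XPathNodeType NodeType { get; }
public abstract string Prefix { get; }
public abstract string Value { get; }   // inherited from XPathItem

// Abstract methods: moving the navigator, plus cloning and position comparison.
public abstract XPathNavigator Clone();
public abstract bool IsSamePosition(XPathNavigator other);
public abstract bool MoveTo(XPathNavigator other);
public abstract bool MoveToFirstAttribute();
public abstract bool MoveToNextAttribute();
public abstract bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope);
public abstract bool MoveToNextNamespace(XPathNamespaceScope namespaceScope);
public abstract bool MoveToFirstChild();
public abstract bool MoveToNext();
public abstract bool MoveToPrevious();
public abstract bool MoveToParent();
public abstract bool MoveToId(string id);
```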

As you can see, we have two distinct groups:

  • The abstract properties expose information about the current node. Our tests will ensure that we get consistent information for all types of node.
  • The abstract methods are all concerned with moving the navigator to another node. The tests need to check that the move operations result in the navigator pointing to the right node given a known starting position.

How should we test it?

We will test the properties by setting up an XPathNavigator that points to specific nodes of an XML document. Once set up, we simply check that the properties expose consistent values. We will test the Move() operations in a very similar way: we will set up the XPathNavigator instance on a specific node, execute the Move() operation we want to test, and then check that the XPathNavigator yields values through its properties that are consistent with the navigator’s new position.

This is actually very similar. The only difference is the Move() operation. The similarity lets us factor out most of the test code into a few utility functions.
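
CanMoveImpl() can look something like this (a sketch; MoveTestArgs and its field names are assumed):

```csharp
private void CanMoveImpl(MoveTestArgs args, Func<XPathNavigator, bool> moveOperation)
{
    // Arrange: position a navigator on the node the test case starts from.
    var navigator = CreateNavigable(args.Xml).CreateNavigator();
    if (args.StartXPath != null)
    {
        navigator = navigator.SelectSingleNode(args.StartXPath);
    }

    // Act: execute the move operation under test.
    var moved = moveOperation(navigator);

    // Assert: the outcome and the resulting node match the expectations.
    Assert.AreEqual(args.ExpectSuccess, moved, args.ToString());
    ExpectNodeProperties(navigator, args);
}
```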

CanMoveImpl() acts as a parametrized test. It takes 2 arguments:

  • args: a MoveTestArgs instance. This argument describes the test’s original state and the resulting state we should test against.
  • moveOperation: A delegate to the Move() operation to test. Passing the operation to test as a parameter lets us also write non-Move() tests by simply passing a no-op callback.

NUnit: I am using NUnit to write the unit tests. It is only a matter of preference; you can adapt the tests to work with another testing framework such as the Microsoft Unit Testing Framework. I find NUnit simple to use, non-obtrusive and very flexible.

CanMoveImpl() is called by actual test methods like the following:
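
For example:

```csharp
[TestCaseSource("CanMoveToNext_Source")]
public void CanMoveToNext(MoveTestArgs args)
{
    CanMoveImpl(args, navigator => navigator.MoveToNext());
}
```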

It is a parametrized test. The TestCaseSource attribute tells NUnit which method to call to get the MoveTestArgs instance for each test.

Method CanMoveToNext_Source() returns each test case for a given operation. In the above example, we have the test cases for “when positioned on the document root, MoveToNext() should fail”, “when positioned on an element with no next sibling, MoveToNext() should fail” and “when positioned on an element with a next sibling, MoveToNext() should succeed and point to the expected node”.

Each test case is defined by specifying values for the fields of class MoveTestArgs.

Method ExpectNodeProperties() implements the assertions depending on the configuration of its MoveTestArgs instance:
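
In outline:

```csharp
private static void ExpectNodeProperties(XPathNavigator navigator, MoveTestArgs args)
{
    // Only assert on the properties the test case actually specifies.
    if (args.ExpectedNodeType.HasValue)
    {
        Assert.AreEqual(args.ExpectedNodeType.Value, navigator.NodeType);
    }
    if (args.ExpectedLocalName != null)
    {
        Assert.AreEqual(args.ExpectedLocalName, navigator.LocalName);
    }
    if (args.ExpectedValue != null)
    {
        Assert.AreEqual(args.ExpectedValue, navigator.Value);
    }
}
```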

Executing our tests

We want our tests to be executed against the Microsoft implementation as well as our own. The most straightforward way of achieving this is to implement our tests in an abstract test fixture. The abstract fixture has a factory method to create an instance of XPathNavigator to test against. For each implementation, we create a subclass of our fixture and override the factory method.
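
A sketch of the arrangement (using System.IO, System.Xml and System.Xml.XPath):

```csharp
public abstract class XPathNavigatorTestsBase
{
    // Factory method: each implementation under test provides its own.
    protected abstract IXPathNavigable CreateNavigable(string xml);

    // CanMoveImpl(), ExpectNodeProperties() and the test methods live here.
}

[TestFixture]
public class DefaultXPathNavigatorTests : XPathNavigatorTestsBase
{
    protected override IXPathNavigable CreateNavigable(string xml)
    {
        // The Microsoft implementation, keeping whitespace nodes around.
        return new XPathDocument(XmlReader.Create(new StringReader(xml)), XmlSpace.Preserve);
    }
}
```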

CreateNavigable() returns an IXPathNavigable. In turn, IXPathNavigable lets us create a navigator positioned on the document root thanks to its CreateNavigator() method.

We’ll add the test fixture for our own implementation when we have the skeleton available. In the meantime, this lets us verify our expectations against the actual implementation of XPathNavigator.

The next post on the topic will tackle the new implementation’s design. I’ll make the implementation and test available as a source code download at the end of this series of articles.

XPathDocument and whitespaces

Writing code is fun. At least it is for me. But sometimes it gets irritating. You know, you’re busy on something, you write the code, you know it’s right but it doesn’t work… You keep your focus on that one piece of code you just wrote and it keeps on not working. Sometimes, the reason it doesn’t work is obvious, but sometimes you keep reviewing your code and its surroundings, you debug away several variants of your solution, and it keeps on not working…

I just had one of those moments…

And then, bang! The solution jumped at me, and it was so obvious I almost felt shame :-|

I was writing the unit tests in preparation for my next article on creating an XPathNavigator implementation. The code basically boils down to this:
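
Something like this (reduced to the essentials):

```csharp
using System.IO;
using System.Xml.XPath;

var document = new XPathDocument(new StringReader("<root>   </root>"));
var nav = document.CreateNavigator().SelectSingleNode("/root/text()");

// nav should point to the whitespace text node inside <root>,
// from which we want to test MoveToParent()... but nav is null.
Assert.IsNotNull(nav);
```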

I am testing MoveToParent() from a whitespace node. "/root/text()" is expected to give me an XPathNavigator located on the whitespace node inside the <root> element, yet nav just keeps on being null. Since I had been busy writing pairs of XML samples and XPath queries to put my test in each situation I needed to test, I immediately assumed my XPath query was not correct. I just kept on tweaking here and there; nav was still null…

After some time, I decided to stop putting more effort into it and come back later, once I could take the necessary step back. I posted a question on stackoverflow.com and worked on something else.

I was busy on something completely different when it struck me. One of those “aha” moments. XPathDocument has a constructor that takes an XmlSpace enum value. By default, if you don’t specify it, XPathDocument will simply skip all non-significant whitespace nodes.
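
The fix is a one-liner:

```csharp
// XmlSpace.Preserve tells XPathDocument to keep whitespace-only text nodes.
var document = new XPathDocument(XmlReader.Create(new StringReader("<root>   </root>")), XmlSpace.Preserve);
var nav = document.CreateNavigator().SelectSingleNode("/root/text()");
// nav now points to the whitespace node.
```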

That’s it… annoying.

So what’s wrong with XPathDocument?

This post is the first in a series of posts related to XPathDocument and XPathNavigator. I will highlight the qualities and drawbacks of the standard .Net implementations and go through the design and development of a new implementation that better fits my needs.

First, what is an XPathDocument?

An XPathDocument is used when you need to query XML data using XPath. For example, you can get a list of article ids and ordered quantities from the following xml file:
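
For example, an order file like this (illustrative):

```xml
<?xml version="1.0" encoding="utf-8"?>
<order>
  <article id="A-001">
    <quantity>10</quantity>
  </article>
  <article id="A-002">
    <quantity>3</quantity>
  </article>
</order>
```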

using the following code:
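
A sketch (the file is assumed to be saved as order.xml):

```csharp
using System;
using System.Xml.XPath;

var document = new XPathDocument("order.xml");
var navigator = document.CreateNavigator();

// Select every article and read its id and ordered quantity.
var articles = navigator.Select("/order/article");
while (articles.MoveNext())
{
    var article = articles.Current;
    var id = article.GetAttribute("id", "");
    var quantity = (double)article.Evaluate("number(quantity)");
    Console.WriteLine("{0}: {1}", id, quantity);
}
```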

XPath, once you get the hang of it, is very powerful and flexible for accessing XML data. It allows for complex queries and computation.

Where’s the catch?

This is great but there is a drawback. As per the documentation, XPathDocument provides a fast, read-only, in-memory representation of an XML document by using the XPath data model. This does not scale well with file size. If your files grow to tens of MB or larger, all of that data will be loaded into memory. I recently built a mapping utility based on XPath. The starting requirement was to handle lots of small files. It turned out that once in the field, clients were feeding it a small number of large files instead. Loading these large files in memory caused a lot of issues, from poor responsiveness due to excessive swapping to plain OutOfMemoryExceptions.

The good news is that we can do something about it.

How does XPath work in .Net?

In the above code snippet, you can see that the only reference to XPathDocument is to create it. We then use it only once, to create an XPathNavigator. The rest of the XPath querying involves only XPathNavigators.

XPathNavigator class

An XPathNavigator is a cursor over an XML data structure. As a cursor, it provides basic operations to move itself, to query information about the data it points to, and to clone itself. XPathNavigator can also be used to update the underlying data if the implementation supports it.

The data of the node pointed to by a navigator can be accessed using a set of properties. The most commonly used are LocalName, NamespaceURI, Prefix and, most importantly, Value and its variants. NodeType is also important: the type of a node determines which move operations are allowed.

To move a navigator around, the following methods can be used. Notice that all but one of them return a boolean indicating whether the move operation succeeded. True means the navigator now points to the new node; false means it has not moved and still points to the original node. The only method that does not return a boolean is MoveToRoot(), because it always succeeds.
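
Abbreviated, the list looks like this (bodies omitted):

```csharp
// Abstract: must be supplied by the implementation.
public abstract bool MoveTo(XPathNavigator other);
public abstract bool MoveToFirstAttribute();
public abstract bool MoveToNextAttribute();
public abstract bool MoveToFirstNamespace(XPathNamespaceScope namespaceScope);
public abstract bool MoveToNextNamespace(XPathNamespaceScope namespaceScope);
public abstract bool MoveToFirstChild();
public abstract bool MoveToNext();
public abstract bool MoveToPrevious();
public abstract bool MoveToParent();
public abstract bool MoveToId(string id);

// Virtual: default implementations built on top of the abstract ones.
public virtual bool MoveToAttribute(string localName, string namespaceURI);
public virtual bool MoveToChild(string localName, string namespaceURI);
public virtual bool MoveToChild(XPathNodeType type);
public virtual bool MoveToNext(string localName, string namespaceURI);
public virtual bool MoveToNext(XPathNodeType type);
public virtual bool MoveToFirst();
public virtual bool MoveToFollowing(string localName, string namespaceURI);
public virtual void MoveToRoot();   // the only one that does not return a bool
```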

An operation may fail for various reasons. For example, MoveToNext() will fail if the current node has no next sibling (e.g. the last element of a sequence) or if the current node is an attribute. MoveToChild() will fail if no child of the current node satisfies the conditions.

XPath queries?

That’s all very good but you might ask ‘what about XPath queries?’. XPath queries can be executed using the following functions:
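
Their signatures, abbreviated (each also has overloads taking a precompiled XPathExpression):

```csharp
public virtual object Evaluate(string xpath);
public virtual bool Matches(string xpath);
public virtual XPathNodeIterator Select(string xpath);
public virtual XPathNavigator SelectSingleNode(string xpath);
```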

Evaluate() returns a value dependent on the XPath query; the result can be an integer, a string, a node set, etc. Matches() tells you whether the current node satisfies conditions expressed as an XPath expression. The Select() functions return a node iterator over their result, that is, a set of XPathNavigators each pointing to a node in the XPath expression’s result set.

So how do we solve our problem?

The key element that will help us solve our scaling issue lies in the implementation of the XPath querying methods (Evaluate, Matches and Select). Their implementation is actually expressed in terms of Move() operations and property checks on XPathNavigators.

The following example uses, on the one hand, Select() to find the <article> nodes of root element <order> and, on the other hand, a series of Move() operations to do the same.
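
A sketch of both approaches, using the order document from earlier:

```csharp
var navigator = new XPathDocument("order.xml").CreateNavigator();

// With an XPath query:
var articles = navigator.Select("/order/article");
while (articles.MoveNext())
{
    Console.WriteLine(articles.Current.GetAttribute("id", ""));
}

// With Move() operations only:
var cursor = navigator.Clone();
cursor.MoveToRoot();
if (cursor.MoveToFirstChild() && cursor.MoveToFirstChild())   // into <order>, then its first child
{
    do
    {
        if (cursor.NodeType == XPathNodeType.Element && cursor.LocalName == "article")
        {
            Console.WriteLine(cursor.GetAttribute("id", ""));
        }
    } while (cursor.MoveToNext());   // next sibling
}
```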

All XPath queries can be expressed as a series of Move() and Clone() operations. This is exactly what Select() does behind the scenes. This is where the design of the XPathNavigator class shines. Select() is implemented exclusively in terms of Move() and Clone() operations. This means that any implementation of XPathNavigator that supports these operations can benefit from the XPath query language.

Did you notice earlier that some of the Move() operations are virtual and others abstract? In the same manner that XPath queries can be expressed as a series of Move() operations, most Move() operations can be expressed in terms of the most basic move operations. For example, the default implementation of MoveToRoot() is simply while (this.MoveToParent()) {}. The properties of XPathNavigator follow the same pattern: virtual properties have a default implementation that relies on the abstract properties.

This design helps a lot in our case. We can get rid of the default .Net-provided XPathNavigator implementation without changing our usage. We will create a new implementation that will not load all the XML data in memory; instead, it will cache this information to disk. Of course, since disk IO will occur, our implementation will probably be slower. We will see what we can do about it in a later post.

Below is the limited list of methods and properties that must be implemented in order to support XPath querying.
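
In short:

```csharp
// Properties: BaseURI, IsEmptyElement, LocalName, Name, NamespaceURI,
//             NameTable, NodeType, Prefix, Value
// Methods:    Clone(), IsSamePosition(), MoveTo(), MoveToFirstAttribute(),
//             MoveToNextAttribute(), MoveToFirstNamespace(), MoveToNextNamespace(),
//             MoveToFirstChild(), MoveToNext(), MoveToPrevious(), MoveToParent(),
//             MoveToId()
```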

As you can see, the minimum interface we need to support is not as big as one might have thought. We still have a lot to do though: we have to design our solution and implement it but, more importantly, we need to write tests for it.

Conclusion

In the next post, we will set up a series of unit tests. These tests will be run against both the standard implementation (to ensure we understand the requirements correctly) and our new implementation (to make sure we stick to the requirements).

The design of XPathNavigator is quite clever. Basing the implementation of XPath queries on the abstract implementation of primitive Move() and Clone() operations enables implementors to keep their internal representation of the data completely decoupled. An implementation could very well provide an Xml-compatible view on a data structure completely unrelated to Xml. For instance, it is quite simple to expose the information of a tree of POCOs using Reflection. Another example would be to expose other data formats such as JSON to XPath-only consumers.

Game development irony

The guys at http://www.greenheartgames.com/ pulled a nice trick yesterday. Just a day after they released their game development simulation game Game Dev Tycoon, they also uploaded a cracked version of the game for the pirates.

What the heck! What for?
Well, this cracked version is the same as the original game, except that the players are screwed: they cannot expand their in-game company above some limit due to …

Hum, well… Yes, piracy… Ironic, isn’t it?

This really made my day :-) I haven’t tested the game itself yet but it sure looks like fun.

std::unique_ptr semantics

Probably the best feature introduced by C++11 is std::unique_ptr. It will automagically make sure your dynamically allocated objects are deleted when you no longer use them.

In previous versions of C++, you needed to rely exclusively on documentation and conventions to ensure dynamically allocated memory was handled properly. With C++11, you can ask the compiler to help you out and enforce ownership semantics. Take the following code for example: you need to ensure you call delete when you are done with the object created by function factory(). The compiler will not complain if you forget, and you end up with a memory leak.
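
A sketch (factory() and the vector payload are illustrative):

```cpp
#include <vector>

std::vector<int>* factory()
{
    return new std::vector<int>(42);
}

void client()
{
    std::vector<int>* values = factory();
    // ... use values ...
    // If we forget this delete, the compiler stays silent and we leak.
    delete values;
}
```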

What can std::unique_ptr do for you?

With std::unique_ptr, you cannot casually forget to delete the pointer. std::unique_ptr owns the allocated object unless this ownership is explicitly transferred to another entity (e.g. another unique_ptr or a std::shared_ptr). When a std::unique_ptr goes out of scope, it deletes any object it owns.

The most explicit way of transferring ownership is via function std::move(). Using the same above example, we can see how it works.
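
Reworking the earlier example (a sketch):

```cpp
#include <memory>
#include <utility>
#include <vector>

std::unique_ptr<std::vector<int>> factory()
{
    return std::unique_ptr<std::vector<int>>(new std::vector<int>(42));
}

void client()
{
    std::unique_ptr<std::vector<int>> values = factory();

    std::unique_ptr<std::vector<int>> other = std::move(values);
    // 'other' now owns the vector; 'values' is empty (nullptr).

    // std::unique_ptr<std::vector<int>> copy = other;   // does not compile

    // No delete needed: 'other' deletes the vector when it goes out of scope.
}
```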

You cannot directly assign one instance of std::unique_ptr to another. This rule ensures that no two unique_ptrs can claim ownership of the same dynamically allocated object; it prevents deleting an object multiple times.
This is the magic of std::unique_ptr: you always know who owns the pointed-to value. By consistently using unique_ptr, you know at a glance when you own a heap object and when you yield ownership to another. Moreover, this rule is enforced by the compiler; the less you need to worry about, the better.

How do you use std::unique_ptr?

std::unique_ptr helps you express your intent with regard to ownership transfer. Transfer of ownership occurs in the following general cases:

  • Returning a dynamically allocated object from a function
  • Passing such an object to a function

Returning a std::unique_ptr

Returning a heap-allocated object from a function requires that the caller delete it when it is done with it.

By returning a unique_ptr, you ensure the caller takes ownership of the pointed-to object. In the example below, the signature of createVector() ensures the caller takes responsibility for releasing the created vector. The normal flow of execution guarantees the memory will be released whenever the returned unique_ptr is destroyed.
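
A sketch (createVector() is an illustrative name):

```cpp
#include <memory>
#include <vector>

std::unique_ptr<std::vector<int>> createVector()
{
    // The signature makes it explicit: the caller takes ownership.
    return std::unique_ptr<std::vector<int>>(new std::vector<int>());
}

void caller()
{
    auto values = createVector();
    values->push_back(1);
}   // 'values' goes out of scope here and deletes the vector
```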

Passing a std::unique_ptr to a function

A function taking a unique_ptr by value takes definitive ownership of the pointed-to object. The caller will not even have the ability to access the pointed-to object after the function call.
Notice the call to std::move() in the example below: it is required, and it helps you actually see that ownership is transferred to the function.
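
A sketch:

```cpp
#include <memory>
#include <utility>
#include <vector>

void consume(std::unique_ptr<std::vector<int>> values)
{
    // 'values' owns the vector; it is deleted when consume() returns.
}

void caller()
{
    auto values = std::unique_ptr<std::vector<int>>(new std::vector<int>());
    consume(std::move(values));   // std::move() is required here
    // 'values' is now empty; the vector is gone.
}
```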

Passing a const reference to a unique_ptr

The called function can use the pointed-to object but may not interfere with its lifetime. The calling function keeps ownership of the pointed-to object.
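
A sketch:

```cpp
#include <memory>
#include <vector>

void inspect(const std::unique_ptr<std::vector<int>>& values)
{
    // May use *values, but cannot reset(), release() or move from it.
    std::size_t count = values->size();
    (void)count;
}
```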

In my opinion, there is no real benefit to using a const reference to a unique_ptr over simply using a raw pointer to the object. On the contrary, it constrains the caller to manage the pointed-to object through a unique_ptr. But what if your caller context requires shared ownership (std::shared_ptr)? Since we mostly care about whether ownership is transferred (not how ownership is cared for), a raw pointer is perfect.

Passing a non-const reference to a unique_ptr

This is, in my opinion, the most complex situation. The called function may or may not take ownership of the passed pointer; “may or may not” is really part of the function’s contract. If the called function uses std::move() on the passed unique_ptr, it effectively takes ownership; once it gives control back to the caller, the caller no longer owns the pointer and doesn’t even know its value anymore. On the other hand, if the callee merely uses the provided pointer without moving it, the caller still owns the pointed-to object.
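
A sketch of both outcomes in one function:

```cpp
#include <memory>
#include <utility>
#include <vector>

void maybeTake(std::unique_ptr<std::vector<int>>& values)
{
    if (values->size() > 100)
    {
        // Taking ownership is part of the function's contract.
        std::unique_ptr<std::vector<int>> mine = std::move(values);
        // 'values' is now empty in the caller too.
    }
    // Otherwise the caller still owns the vector.
}
```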

Conclusion

C++11 provides us with a great tool for managing the lifetime of dynamically allocated objects. When used consistently, std::unique_ptr lets your code express directly when ownership of a dynamic object is transferred and in which direction. The compiler will even check at compile-time that the semantics are respected.

However, smart pointers are of no use when you don’t intend to express ownership semantics. Pass raw pointers whenever you can, but make sure you use std::unique_ptr or a similar smart pointer when you mean to transfer ownership.

boost::serialization coupling issue

I was evaluating boost::serialization today. Based on the design goals mentioned in the library’s introduction, I felt like boost::serialization would suit my needs.

An interesting point is this:

8. Orthogonal specification of class serialization and archive format. That is, any file format should be able to store serialization of any arbitrary set of C++ data structures without having to alter the serialization of any class.

At first, I interpreted it as an intent to decouple the serialization code (that knows about the object’s internals) and the archive format.
Consider this:
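
A sketch of the usual pattern:

```cpp
#include <boost/serialization/access.hpp>

class ClassA
{
public:
    ClassA() : _value(0) {}

private:
    friend class boost::serialization::access;

    // Works with any type modeling the Archive concept: text, binary, xml...
    template <class Archive>
    void serialize(Archive& archive, const unsigned int version)
    {
        archive & _value;
    }

    int _value;
};
```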

All is fine: ClassA does not know the specifics of type Archive. Any type modeling the Archive concept will do just fine.

Yet, I have a problem with this. Inline code in headers has a tendency to irritate me: it hides the structure of the classes you implement, so I often use the PImpl idiom.
I want to (or must, in the case of PImpl) move the implementation to ClassA.cpp.
But… can I?

serialize() is a template method. Can I forward-declare a template method? Well, yes.

In ClassA.hpp:
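
```cpp
#ifndef CLASSA_HPP
#define CLASSA_HPP

#include <boost/serialization/access.hpp>

class ClassA
{
public:
    ClassA() : _value(0) {}

private:
    friend class boost::serialization::access;

    // Declared here, implemented in ClassA.cpp.
    template <class Archive>
    void serialize(Archive& archive, const unsigned int version);

    int _value;
};

#endif
```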

In ClassA.cpp:
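
```cpp
#include "ClassA.hpp"

template <class Archive>
void ClassA::serialize(Archive& archive, const unsigned int /*version*/)
{
    archive & _value;
}
```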

In main.cpp, try to serialize an instance of ClassA:
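
```cpp
#include <fstream>
#include <boost/archive/text_oarchive.hpp>
#include "ClassA.hpp"

int main()
{
    std::ofstream file("classa.txt");
    boost::archive::text_oarchive archive(file);
    ClassA a;
    archive << a;
    return 0;
}
```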

Compile and link:

g++ -c main.cpp -o main.o
g++ -c ClassA.cpp -o ClassA.o
g++ main.o ClassA.o -lboost_serialization -o program
main.o: In function `void boost::serialization::access::serialize<boost::archive::text_oarchive, ClassA>(boost::archive::text_oarchive&, ClassA&, unsigned int)':
main.cpp:(.text._ZN5boost13serialization6access9serializeINS_7archive13text_oarchiveE6ClassAEEvRT_RT0_j[_ZN5boost13serialization6access9serializeINS_7archive13text_oarchiveE6ClassAEEvRT_RT0_j]+0x25): undefined reference to `void ClassA::serialize<boost::archive::text_oarchive>(boost::archive::text_oarchive&, unsigned int)'
collect2: error: ld returned 1 exit status
make: *** [program] Error 1
Does this compile? Yes. Does this link? … No!

What’s the linker trying to tell me? Undefined reference to ClassA::serialize(…). But I defined it, no?
Well, yes and no. You wrote the code, but it is a template function, so it needs to be instantiated at compile-time. When main.cpp is compiled, the compiler sees the forward declaration of ClassA::serialize() and assumes that the linker will find the implementation of void ClassA::serialize<boost::archive::text_oarchive>(boost::archive::text_oarchive&, unsigned int) somewhere.
But it does not, because ClassA.cpp, while implementing the template function, never instantiates it. ClassA::serialize() is parsed but never compiled into actual machine code.

You can check for yourself. Look at the file sizes:

-rw-rw-r-- 1 timoch timoch    940 Apr 12 12:59 ClassA.o
-rw-rw-r-- 1 timoch timoch 143088 Apr 12 12:59 main.o

940 bytes? That’s not a lot.

You can force the instantiation of ClassA::serialize() in ClassA.cpp by adding the following at the end:
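
A sketch (explicit instantiations for both the saving and loading archives):

```cpp
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>

// Explicitly instantiate serialize() for the archive types we use.
template void ClassA::serialize<boost::archive::text_oarchive>(
    boost::archive::text_oarchive&, const unsigned int);
template void ClassA::serialize<boost::archive::text_iarchive>(
    boost::archive::text_iarchive&, const unsigned int);
```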

It works, but is it good? Not to me.

I need to include a header defining the implementation of the specific format I serialize to. I also need to include text_iarchive.hpp for the loading process to work. Tomorrow, when my object needs to be serialized to another format as part of another use case, I will need to modify its implementation file to include the specifics of that other format. I will need to do this for each and every class to be serialized… not something I would enjoy.

Conclusion

Templates provide huge flexibility. Here, it is used to enable the & operator to serve as both an extract and an inject operator. However, it comes at the expense of forcing the client application to put the saving/loading implementation in the same compilation unit as the definitions of the target format. It completely defeats the effort put toward decoupling the serialized objects from the format they serialize to.

There are ways to achieve the same flexibility of ‘same operator’ saving and loading while preserving decoupling with the serialization format. I will come back to that in a later post.