Tags: , , , , , , , , , | Categories: Development Posted by bsstahl on 5/25/2017 12:21 AM | Comments (0)

I've written and spoken before about the importance of using the Strategy Pattern to create maintainable and testable systems. Strategies are even more important, almost to the level of necessity, when building AI systems.

The Strategy Pattern is an abstraction tool used to maintain loose-coupling between an application and the algorithm(s) that it uses to do its job. Since the algorithms used in AI systems have many different ways they could be implemented, it is important to abstract the implementation from the system that uses it. I tend to work with systems that use combinatorial optimization methods to solve their problems, but there are many ways for AIs to make decisions. Machine Learning is one of the hottest methods right now but AI systems can also depend on tried-and-true object-oriented logic. The ability to swap algorithms without changing the underlying system allows us the flexibility to try multiple methods before settling on a specific implementation, or even to switch-out implementations as scenarios or situations change.

When I give conference talks on building AI Systems using optimization methods, I always encourage the attendees to create a "naïve" solution first, before spending a lot of effort to build complicated logic. This allows the developer to understand the problem better than he or she did before doing any implementation. Creating this initial solution has another advantage though, it allows us to define the Strategy interface, giving us a better picture of what our application truly needs. Then, when we set-out to build a production-worthy engine, we do so with the knowledge of exactly what we need to produce.

There is also another component of many AIs that can benefit from the use of the Strategy pattern, and that is the determination of user intent. Many implementations of AI will include a user interaction, perhaps through a text-based interface as in a chatbot or a voice interface such as a personal assistant. Each cloud provider has their own set of services designed to determine the intent of the user based on the text or voice input. Each of these implementations has its own strengths and weaknesses. It is beneficial to be able to swap those mechanisms out at will, along with the ability to implement a "naïve" user intent solution during development, and the ability to mock user intent for testing. The strategy pattern is the right tool for this job as well.

As more and more of our applications depend heavily on algorithms, we will need to make a concerted effort to abstract those algorithms away from our applications to maintain loose-coupling and all of the benefits that loose-coupling provides. This is why I consider the Strategy Pattern to be a necessity when developing Artificial Intelligence solutions.

Tags: , , , , , , , , , , , , | Categories: Development Posted by bsstahl on 3/21/2017 7:41 AM | Comments (0)

It is fairly easy these days to test code in isolation if its dependencies are abstracted by a reusable interface. But what do we do if the dependency cannot easily be referenced via such an interface?  Enter Shims, from the Microsoft Fakes Framework (formerly Moles).  Shims allow us to isolate our testing from any dependent methods, including methods in assemblies we do not control, even if those methods are not exposed through a reusable interface. To see how easy it is, follow along with me through this example.

In this sample code on GitHub, we are building a repository for an application that currently gets its data from a file exported from a system that tracks scheduled meetings.  It is very likely that the system will, in the future, expose a more modern interface for that data so we have isolated the data storage using a simple Repository interface that has one method.  This method, called GetMeetings returns a collection of Meeting entities that start during the specified date range.  The method will return an empty collection if no data is found matching the specified criteria, and could throw either of 2 custom errors, a PermissionsException when the user does not have the proper permissions to access the information, and a DataUnavailableException for when the data source is unavailable for any other reason, such as a network outage or if the data file cannot be located.

It is important to point out why a custom exception should be thrown when the data file is not found, rather than allowing the FileNotFoundException to bubble-up.  If we allow the implementation-specific exception to bubble, we have exposed an implementation detail to the caller. That is, the calling code is now aware of the fact that this is a file system implementation.  If code is written in a client that traps for FileNotFoundException, then the repository implementation is swapped-out for a SQL server implementation, the client code will have to change to handle the new types of errors that could be thrown by that implementation.  This violates the Dependency Inversion principle, the “D” from the SOLID principles.  By exposing only a custom exception, we are hiding those implementation details from the caller.

Downstream clients can easily test code that uses this repository without having to actually access the repository implementation because we have exposed the IMeetingSourceRepository interface. However, it is a bit more difficult to actually test the repository implementation itself.  We have a few options here:

  • Create data files that hold known data samples and load those files during unit testing.
  • Create a wrapper around the System.IO namespace that exposes an interface, such as in the System.IO.Abstractions project.
  • Don’t test any code that requires reaching-out to the file system.

Since I am of the opinion that 100% code coverage is both reasonable, and desirable (although not a measurable goal), I will summarily dispose of option 3 for the purpose of this analysis. I have used option 2 many times in my life, and while employing wrapper code is a valid and reasonable solution, it adds additional code to my production deployments that is very limited in terms of what it adds to the loose-coupling of my solution since I already am loosely-coupled to this implementation via the IMeetingSourceRepository interface.

Even though it is far from a perfect solution (many would consider them more integration tests than unit tests), I initially selected option 1 for this implementation. That is, I created data files and deployed them along with my tests.  You can see the test files I created in the Data folder of the MeetingSystem.Data.FileSystem.Test project.  These files are deployed alongside my tests using the DeploymentItem directive that decorates the Repository_GetMeetings_Should class of the test project.  Using this method, I was able to create tests that:

  • Verify that the correct # of meetings are returned from a file
  • Verify that meetings are properly filtered by the StartDateTime of the meeting
  • Validate the data elements returned from the file
  • Validate that the proper custom exception is thrown if a FileNotFoundException is thrown by the underlying code

So we have verified nearly everything we need to test in our implementation.  We’ve verified that the data is returned properly, and that one of our custom exceptions is being returned. But what about the PermissionsException?  We were able to simulate a FileNotFoundException in our tests by just using a bad filename, but how do we test for a permissions problem?  The ReadAllText method of the File object from System.IO will throw a System.Security.SecurityException if the file cannot be read due to a permissions problem.  We need to trap this exception and throw our own exception, but how can we validate that we have successfully done so and that the functionality remains intact through future refactoring?  How can we simulate a permissions exception on a file that we have enough permission on to deploy to a test folder? Enter Shims from the Microsoft Fakes Framework.

Instead of having our tests actually reach-out to the file system and actually try to load a file, we can intercept calls to the System.IO.File.ReadAllText method and have those calls execute some delegate code instead. This code, which we write in our test methods, can be specific to each test and exist only within the context of the test. As a result, we are not deploying any additional code to production, while still thoroughly validating our code.  In fact, using this methodology, I could re-implement my previous tests, including my test data in the tests themselves, making these tests better unit tests.  I could then reserve tests that actually reach out to files for integration test libraries that are run less frequently, and perhaps even behind the scenes.

      Note: If you wish to follow-along with these instructions, you can grab the code from the DemoStart branch of the GitHub repo, rather than the Master branch where this is already done.

      To use Shims, we first have to create a Fakes Assembly.  This is done by right-clicking on the System reference in the test project from Visual Studio 2017, and selecting “Add Fakes Assembly” (full framework only – not yet available for .NET Core assemblies). Be sure to do this in the test project since we don’t want to actually deploy the Fakes assembly in our production code.  Using the add fakes assembly menu item does 2 things:

      1. Adds a reference to Microsoft.QualityTools.Testing.Fakes assembly
      2. Creates 2 .fakes XML files in the Fakes folder within the test project. These items are built into corresponding fakes dll files that are deployed with the test project and used to provide stub and shim objects that mimic the objects in the selected assemblies.  These fake objects reside in the same namespace as their “real” counterparts, except with “Fakes” on the end. Thus, our fake File object will reside in the System.IO.Fakes namespace.

      Fakes

      The next step in using shims is to create a ShimsContext within a Using statement. Any method calls that execute within this context can be intercepted and replaced by our delegates.  For example, a test that replaces the call to ReadAllText with a method that returns a single line of constant data can be seen below.

      Methods on shim objects are referenced through properties of the fake object.  These properties are of type FakesDelegate.Func and match the signature of the method being shimmed.  The return data type is also appended to the property name so that each item’s signature can be represented with a different property name.  In this case, the ReadAllText method of the File object is represented in the System.IO.Fakes.File object as a property called ReadAllTextString, of type FakesDelegate.Func<string, string>, since the method takes a string parameter (the path of the file), and returns a string (the text contents of the file).  If we assign a method delegate to this property, that method will be executed in place of the call to System.IO.File.ReadAllText whenever ReadAllText is called within the ShimContext.

      In the gist shown above, the variable p represents the input parameter and will hold the path specified in the test (in this case “April2017.abc”).  The return value for our delegate method comes from the constant string dataFile.  We can put anything we want here.  We can replace the delegate with a call to an anonymous method, or with a call to an existing method.  We can return a value gleaned from an external source, or, as is needed for our permissions test, throw an exception.

      For the purposes of our test to verify that we throw a PermissionsException when a SecurityException is thrown, we can replace the value of the ReadAllTextString property with our delegate which throws the exception we need to test for,  as seen here:

      System.IO.Fakes.ShimFile.ReadAllTextString =
            p => throw new System.Security.SecurityException("Test Exception");

      Then, we can verify in our test that our custom exception is thrown.  The full working example can be seen by grabbing the Master branch of the GitHub repo.

      What can you test with these Shim objects that you were unable to test before?  Tell me about it on Twitter @bsstahl.

      The demo code for my presentation on Testing in Visual Studio 2017 at the VS2017 Launch event can be found on GitHub.  There are 2 branches to this repository, the Main branch which holds the completed demo, and the DemoStart branch which holds the starting point of the demonstration in case you would like to implement the sample yourself.

      The demo shows how Microsoft Fakes (formerly Moles) can be used to create tests against code that does not implement a reusable interface. This can be done  without having to resort to integration style tests or writing extra wrapper code just to implement an interface.  During my launch presentation, I also use this code to demonstrate the use of Intellitest (formerly Pex) to generate exploratory tests.

      One of the techniques I recommend highly in my Simplify Your API talk is the use of extension methods to hide the complexity of lower-level API functionality.  A good example of a place to use this methodology came-up last night in a great Reflection talk by Jeremy Clark (Twitter, Blog) at the NorthWest Valley .NET User Group

      Jeremy

      Jeremy was demonstrating a method that would spin-through an assembly and load all classes within that assembly that implemented a particular interface.  The syntax to do the checks on each type were just a bit more obtuse than Jeremy would have liked them to be.  As we left that talk, I only half-jokingly told Jeremy that I was going to write him an extension method to make that activity simpler.  Being a man of my word, I present the code below to do just that.

      Tags: , , , , , | Categories: Development Posted by bsstahl on 1/26/2016 2:07 AM | Comments (0)

      Good API design requires the developer to return responses that provide useful and understandable information to the consumers of the API.  To effectively communicate with the consumers, these responses must utilize standards that are known to the developers who will be using them.  For .NET APIs, these standards include:

      • Implementing IDisposable on all objects that need disposal.
      • Throwing a NotImplementedException if a method is on the interface and is expected to be available in the future, but is not yet available for any reason.
      • Throwing an ArgumentException or ArgumentNullException as appropriate to indicate that bad input has been supplied to a method.
      • Throwing an InvalidOperationException if the use of a method is inappropriate or otherwise unavailable in the current context.

      One thing that should absolutely not be done is returning a NULL from a method call unless the NULL is a valid result of the method, based on the provided input.

      I have spent the last few weeks working with a new vendor API.  In general, the implementation of their API has been good, but it is clear that .NET is not their primary framework.  This API does 2 things that have made it more difficult than necessary for me to work with the product:

      1. Disposable objects don’t implement IDisposable. As a result, I cannot simply wrap these objects in a Using statement to handle disposal when they go out of scope.
      2. Several mathematical operators were overloaded, but some of them were implemented simply by returning a NULL. As a result:
        1. I had to decompile their API assembly to determine if I was doing something wrong.
        2. I am still unable to tell if this is a permanent thing or if the feature will be implemented in a future release.

      Please follow all API guidelines for the language or framework you are targeting whenever it is reasonable and possible to do so.

      Tags: , , , , , | Categories: Development Posted by bsstahl on 9/23/2014 3:04 AM | Comments (0)

      To allow ourselves to create the best possible services for our clients, it is important to make those services as flexible and maintainable as possible.  Building services in an agile way helps us to create better services, however it makes it more likely that our service interface will, at some point, have to change.  Changing a service interface after publication is, and should be, a well gated, well thought-out process. By changing the interface, you are changing the contract your service has with all of your clients, and you are probably requiring every one of the service consumers to change.  This should not be done lightly. However, there are a few things that can be done to minimize the impacts of these changes. Several of these things require agreements with the clients up front.  As a result, these items should be included in the Service Level Agreement (SLA) between the service providers and the consumers.

      Caveat: I am a solution architect, not an expert in creating service level agreements.  Typically, my only involvement with SLAs is to object when I can’t get what I need in one from a service provider. My intent here is to call-out a few things that all service providers should include in their SLAs to maintain the flexibility of their APIs. There are many other things that should be included in any good SLA that I will not be discussing here.

      The two items that I believe should be included in all service SLAs are the requirements that the clients support both Lax Versioning and Forward Compatible Contracts.  Each of these items is discussed in some detail below.

      Lax Versioning

      Lax Versioning allows us to add new, optional members to the data contract of the service without that change being considered a breaking change. Some modern service frameworks provide this behavior by default and many of the changes we might make to a service fall into this category.  By reducing the number of changes that are considered breaking, we can lessen the burden on our implementation teams, reducing coordination requirements with service consumers, and shortening time to market of these changes.

      One of the major impacts that Lax Versioning has is that it requires us to either avoid schema validation altogether, or to use specially designed, versionable schemas to do our validation.  I recommend avoiding schema validation wherever reasonable and possible.

      Forward Compatible Contracts

      Forward Compatible Contracts, also known as the Round-Tripping of Unknown Data, requires that the service round-trip any additional data it gets, but doesn’t understand, back to the client and that clients round-trip any additional data they get, but don’t understand, back to the server.  This behavior reduces the coupling between client and server for changes that are covered by Lax Versioning, but need to retain the additional data throughout the call life-cycle.

      For example, suppose we were version a contract such that we added an additional address type to an employee entity  (V1 only has home address, V2 has home and work addresses).  If we change the service to return the V2 employee prior to changing the client, the client will accept the additional (optional) address type because we have already required Lax Versioning, but it will not know what to do with the information.  If a V1 client without round-tripping support sends that employee back to the server, the additional address type will not be included.  If however, the V1 client supports this round-tripping behavior, it will still be unable to use the data in the additional address field, but will return it to the server if the entity is sent back in a subsequent call.  These behaviors with a V1 client and a V2 service are shown in the diagram below.

        image

      If the same practice is used on the server side, then we can decouple the client and server from many implementation changes.  Clients would be free to implement the new versions of contracts as soon as they are ready, without having to wait for the service to roll-out.  Likewise, many changes at the service side could be made knowing that data sent down to the clients will not be lost when it is returned to the server.

      Summary

      Making changes to the contract of existing services is a process that has risk, and requires quite a bit of coordination with clients. Some of the risks and difficulties involved in the process can be mitigated by including just 2 requirements in the Service Level Agreements of our services.  By requiring clients to implement Lax Versioning and making our contracts Forward Compatible, we can reduce the impact of some changes, and decouple others such that we significantly reduce the risk involved in making these changes, and improve our time-to-market for these deployments.

      Tags: , , , , , , | Categories: Development Posted by bsstahl on 4/4/2014 9:26 PM | Comments (0)

      One thing I've noticed during my 30 years in software engineering is that everything old eventually becomes new again.  If you have a particular skill or preferred methodology that seems to have become irrelevant,  just wait a while, it is likely to return in some form or another.  In this case, it seems that recent announcements by Microsoft about how developers will be able to leverage the power of Cortana, are likely to revitalize the need for text processing as an input to the apps we build.

      At one time, many years ago, we had two primary methods of letting the computer know what path we wanted to take within an application; we could select a value from a displayed (textual) menu, or, if we were getting fancy, we could provide an input box that the user could type commands into.  This latter technique was often the purview of text-only adventure games and inputs came in the form "move left" and "look east".  While neither of these input methods was particularly exciting or "natural" to use today's parlance, it was only text input that allowed the full flexibility of executing nearly any application action from any location.  Now that Microsoft has announce that developers on Windows Phone, and likely other platforms, will be able to leverage the platform's built-in digital assistant named "Cortana" and receive inputs into their applications as text input translated from the user's speech (or directly as text typed into Cortana's input box) it makes sense for us to start thinking about our application inputs in this way again. That is, we want to consider, for each action a user might take, how the user might trigger that action by voice command.

      It should be fairly easy to shift to this mindset if we simply imagine, on our user interfaces, a text box where the user could type a command to the app.  The commands that the user might type into this box are the commands we need to enable using the provided speech input APIs.  If we start thinking about inputs in this way now, it might help to shape our user interfaces in ways that make speech input more natural, and our applications more useful, in the coming years. Of course, this also gives us the added benefit of allowing us to reuse our old text parsing skills from that time when we wrote that adventure game…

       

      I previously posted the slides for my Building Enterprise Apps using Entity Framework 4 talk here.  I can now post the source code for the completed demo application.  That code, created for use in Visual Studio 2010 Ultimate, is available in zip format below.  This is the same code that was demonstrated at Desert Code Camp 2011.1 and SoCalCodeCamp 2011 as well as the New Mexico .NET User’s Group (NMUG).

      EF4EnterpriseDemoCode.zip

      Tags: , , , , , , | Categories: Development, Event Posted by bsstahl on 7/13/2009 11:51 AM | Comments (0)

      I will be speaking at the Developer Ignite event in Chandler on July 22nd.  The topic of my talk will be "Simplicity Through Abstraction" during which I will be giving a very high-level overview of using Dependency Injection as an Inversion-of-Control methodology to create simplicity in software architecture.

      While putting my presentation together I have found a number of items that I wanted to include in my presentation, but simply can't due to the obvious constraints of a 5-minute presentation.  Some of these items won't even get a mention, others will be mentioned only in passing.  I include them here as a list of topics for me to discuss in future posts to this blog.  Hopefully this will occur, at least in part, prior to the ignite event so that there will be a set of resources available to those at the event who were previously unfamiliar with these techniques and wish to explore them further.

      These topics include:

      • IoC Containers
      • Dealing with Provider-Specific requirements
      • Configuration as a dependency
      • Local providers for external dependencies
      • Providers as application tiers
      • Testing at the provider level
      • Top Down Design [Added: 7/12/2009]

      If you have a topic that you are particularly interested in, or have any questions about IoC, Dependency Injection, or Providers that you would like me to answer, please use the comments or contact me via twitter: @bsstahl.

      Tags: , , , | Categories: Development Posted by bsstahl on 6/17/2008 12:17 AM | Comments (0)

      Over the last few years I've heard a number of public statements from developers about the lack of need for multiple implementation inheritance in .NET and other modern development platforms. Their logic often seems to imply that if you need multiple implementation inheritance, you are not designing your applications properly.  While admittedly, there are usually work-arounds (such as interface inheritance) that allow us to simulate this feature, they usually require that portions of our code are duplicated, violating the Agile requirement "Don't Repeat Yourself".

      One commonly seen example of where multiple implementation inheritance would be very valuable is in multi-tiered, domain specific applications, especially in the data-tier where we may wish to have more-than-one implementation to support multiple data-stores.  Think about the typical data-tier scenario. In this scenario we have a set of domain objects, based on an inherited set of entities with common properties and methods that represent a physical object in the problem domain. These objects also have a commonality in that they are implementations of an object-type common to that data store and may have properties and methods relating specifically to the storage of data.  So, an object whose responsibility it is to persist an Employee entity to a SQL Server data store, could inherit from both our domain Employee entity, and our SQL Data Storage object.  If we also had an implementation that stored data in XML format, we might have an object that inherits both from the same Employee entity as well as from the XMLNode object. If multiple implementation inheritance were supported in our framework, we could avoid the common work-around of repeating our entity implementation by using an interface to simulate that inheritance, or by simply repeating our data persistence logic in each object.

      I certainly understand the need to ship a product.  Since I am also well aware of the added complexity that multiple implementation inheritance creates in compilers and frameworks, it is easy for me to imagine why this feature did not make it into either of the first two major revs of Microsoft's Common Language Runtime.  It is my opinion however that, with the third major release of the CLR forthcoming (Rev 3s being where Microsoft traditionally "nails it") they should strongly consider adding support for multiple implementation inheritance.