Tags: , , , , , , , , , , , , | Categories: Event Posted by bsstahl on 10/15/2017 6:37 PM | Comments (0)

Another great Desert Code Camp is in the books. A huge shout-out to all of the organizers, speakers & attendees for making the event so awesome.

I was privileged to be able to deliver two talks during this event:

    • A Developer’s Survey of AI Techniques: Artificial Intelligence is far more than just machine learning. There are a variety of tools and techniques that systems use to make rational decisions on our behalf. In this survey designed specifically for software developers, we explore a variety of these methods using demo code written in c#. You will leave with an understanding of the breadth of AI methodologies as well as when and how they might be used. You will also have a library of sample code available for reference.

      • AI that can Reason "Why": One of the big problems with Artificial Intelligences is that while they are often able to give us the best possible solution to a problem, they are rarely able to reason about why that solution is the best. For those times where it is important to understand the why as well as the what, Hybrid AI systems can be used to get the best of both worlds. In this introduction to Hybrid AI systems, we'll design and build one such system that can solve a complex problem for us, and still provide information about why each decision was made so we can evaluate those decisions and learn from our AI's insights.

      Please feel free to contact me on Twitter with any questions or comments on these or any of my presentations.

      I previously wrote about a Hybrid AI system that combined logical and optimization methods of problem solving to identify the best solution to an employee shift assignment problem. This implementation was notable in that a hybrid approach was used so that the optimal solution could be found, but the system could still indicate to the users why a particular assignment was, or wasn’t, included in the results. 

      I recently published to GitHub a demo of a similar system. I use this demo in my presentation, Building AI Solutions that can Reason Why. The code demonstrates the hybridization of multiple AI techniques by creating a solution that iteratively applies a combinatorial optimization engine. Different results are obtained by varying the methods of applying the constraints in that model. In the final (4th) demo  method, an iterative process is used to identify what the shortcomings of the final product are, and why they are necessary.

      These demos use the Conference Scheduler AI project to build a valid schedule.

      There are 4 examples, each of which reside in a separate test method:

      ScheduleWithNoRestrictions()

      The 1st method in BasicExamplesDemo.cs shows an unconstrained model where only the hardest of constraints are excluded. That is, the only features of the schedule that are considered by the scheduler are those that are absolute must-haves.  Since there are fewer hard constraints, it is relatively easy to satisfy all the requirements of this model.

      ScheduleWithHardConstraints()

      The 2nd method in BasicExamplesDemo.cs shows a fully constrained model where  all constraints are considered must-haves. That is, the only schedules that will be considered for our conference are those that meet all of the scheduling criteria. As you might imagine, this can be difficult to do, in this case resulting in No Feasible Solution being found. Because we use a combinatorial optimization model, the system gives us no clues as to  which of the constraints cause the infeasibility, or what to do that might allow it to find a solution.

      ScheduleWithTimePreferencesAsAnOptimization()

      The 3rd method in BasicExamplesDemo.cs shows the solution when the true must-haves are considered hard constraints but preferences are not. The AI attempts to optimize the solution by satisfying as many of the soft constraints (preferences) as possible. This results in an imperfect, but possibly best case schedule, but one where we have little insight as to what preferences were not satisfied, and almost no insight as to why.

      AddConstraintsDemo()

      The final demo, and the only method in AddConstraintsDemo.cs, builds on the 3rd demo, where the true must-haves are considered hard constraints but preferences are not. Here however, instead of attempting to optimize the soft constraints, the AI iteratively adds the preferences as hard constraints, one at a time, re-executing the solution after each to make sure the problem has not become infeasible. If the solution has become infeasible, that fact is recorded along with what was being attempted. Then that constraint is removed and the process continues with the remaining constraints. This Hybrid process still results in an imperfect, but best-case schedule. This time however, we not only know what preferences could not be satisfied, we have a good idea as to why.

      The Hybrid Process

      The process of iteratively executing the optimization, adding constraints one at a time, is show in the diagram below.  It is important to remember that the order in which these constraints are added here is critical since constraining the solution in one way may limit the feasibility of the solution for future constraints.  Great care must be taken in selecting the order that constraints are added in order to obtain the best possible solution.

       

      Hybrid Conference Optimization Process

      The steps are as follows:

      1. Make sure we can solve the problem without any of the soft constraints.  If the problem doesn’t have any feasible solutions at the start of the process, we are certainly not going to find any by adding constraints.
      2. Add a constraint to the solution. Do so by selecting the next most important constraint in order.  In the case of our conference schedule, we are adding in speaker preferences for when they speak. These preferences are being added in the order that they were requested (first-come first-served).
      3. Verify that there is still at least 1 feasible solution to the problem after the constraint is added.  If no feasible solutions can be found:
        1. Remove the constraint.
        2. Record the details of the constraint.
        3. Record the current state of the model.
      4. Repeat steps 2 & 3 until all constraints have been tried.
      5. Publish the solution
        1. The resulting schedule
        2. The constraints that could not be added.  This tells us what preferences could not be accommodated.
        3. The state of the model at the time the failed constraints were tried.  This give us insight as to why the constraints could not be satisfied.

      Note: The sample data in these demos is very loosely based on SoCalCodeCamp San Diego from the summer of 2017. While some of the presenters names and presentations come roughly from the publicly available schedule, pretty much everything else has been fictionalized to make for a compelling demo, including the appearances by some Microsoft rock stars, and the "requests" of the various presenters.

      If you have any questions about this code, or about how Hybrid AIs can be used to provide more information about the solutions to problems than strictly optimization or probabilistic models, please contact me on Twitter.

      Tags: , , , , , , , , , | Categories: Development Posted by bsstahl on 9/28/2017 1:53 PM | Comments (0)

       

      My presentation from the #NDCSydney conference has been published on YouTube.

      We depend on Artificial Intelligences to solve many types of problems for us. Some of these problems have more than one possible solution. Handling those problems with more than one solution while building a modern AI system is something every developer will be asked to do over the course of his or her career. Figuring out the best way to utilize the capacity of a device or machine, finding the shortest path between two points, or determining the best way to schedule people or events are all problems where mathematical optimization techniques and tooling can be used to quickly and efficiently find solutions.

      This session is a software developers introduction to using mathematical optimization in Artificial Intelligence. In it, we will explore some of the foundational techniques for solving these types of problems, and use the open-source Google OR-Tools to put them to work in our AI systems. Since this is a session for developers, we'll keep it in terms that work best for us. That is, we'll go heavy on the code and lighter on the math.

      Tags: , , , , , , , , , | Categories: Event Posted by bsstahl on 6/22/2017 4:10 PM | Comments (0)

      The slide deck for my talk “A Developer’s Survey of AI Techniques” can be found below, while the demo code can be found on GitHub.

       

       

      The talk explores some of the different techniques used to create Artificial Intelligences using the example of a Chutes & Ladders game.  Various AIs are developed using different strategies for playing a variant of the game, using different techniques for deciding where on the game board to move.

      If you would like me to deliver this talk, or any of my talks, at your User Group or Conference, please contact me.

      Tags: , , , , , , , , , , , , | Categories: Development Posted by bsstahl on 3/20/2017 3:41 PM | Comments (0)

      It is fairly easy these days to test code in isolation if its dependencies are abstracted by a reusable interface. But what do we do if the dependency cannot easily be referenced via such an interface?  Enter Shims, from the Microsoft Fakes Framework (formerly Moles).  Shims allow us to isolate our testing from any dependent methods, including methods in assemblies we do not control, even if those methods are not exposed through a reusable interface. To see how easy it is, follow along with me through this example.

      In this sample code on GitHub, we are building a repository for an application that currently gets its data from a file exported from a system that tracks scheduled meetings.  It is very likely that the system will, in the future, expose a more modern interface for that data so we have isolated the data storage using a simple Repository interface that has one method.  This method, called GetMeetings returns a collection of Meeting entities that start during the specified date range.  The method will return an empty collection if no data is found matching the specified criteria, and could throw either of 2 custom errors, a PermissionsException when the user does not have the proper permissions to access the information, and a DataUnavailableException for when the data source is unavailable for any other reason, such as a network outage or if the data file cannot be located.

      It is important to point out why a custom exception should be thrown when the data file is not found, rather than allowing the FileNotFoundException to bubble-up.  If we allow the implementation-specific exception to bubble, we have exposed an implementation detail to the caller. That is, the calling code is now aware of the fact that this is a file system implementation.  If code is written in a client that traps for FileNotFoundException, then the repository implementation is swapped-out for a SQL server implementation, the client code will have to change to handle the new types of errors that could be thrown by that implementation.  This violates the Dependency Inversion principle, the “D” from the SOLID principles.  By exposing only a custom exception, we are hiding those implementation details from the caller.

      Downstream clients can easily test code that uses this repository without having to actually access the repository implementation because we have exposed the IMeetingSourceRepository interface. However, it is a bit more difficult to actually test the repository implementation itself.  We have a few options here:

      • Create data files that hold known data samples and load those files during unit testing.
      • Create a wrapper around the System.IO namespace that exposes an interface, such as in the System.IO.Abstractions project.
      • Don’t test any code that requires reaching-out to the file system.

      Since I am of the opinion that 100% code coverage is both reasonable, and desirable (although not a measurable goal), I will summarily dispose of option 3 for the purpose of this analysis. I have used option 2 many times in my life, and while employing wrapper code is a valid and reasonable solution, it adds additional code to my production deployments that is very limited in terms of what it adds to the loose-coupling of my solution since I already am loosely-coupled to this implementation via the IMeetingSourceRepository interface.

      Even though it is far from a perfect solution (many would consider them more integration tests than unit tests), I initially selected option 1 for this implementation. That is, I created data files and deployed them along with my tests.  You can see the test files I created in the Data folder of the MeetingSystem.Data.FileSystem.Test project.  These files are deployed alongside my tests using the DeploymentItem directive that decorates the Repository_GetMeetings_Should class of the test project.  Using this method, I was able to create tests that:

      • Verify that the correct # of meetings are returned from a file
      • Verify that meetings are properly filtered by the StartDateTime of the meeting
      • Validate the data elements returned from the file
      • Validate that the proper custom exception is thrown if a FileNotFoundException is thrown by the underlying code

      So we have verified nearly everything we need to test in our implementation.  We’ve verified that the data is returned properly, and that one of our custom exceptions is being returned. But what about the PermissionsException?  We were able to simulate a FileNotFoundException in our tests by just using a bad filename, but how do we test for a permissions problem?  The ReadAllText method of the File object from System.IO will throw a System.Security.SecurityException if the file cannot be read due to a permissions problem.  We need to trap this exception and throw our own exception, but how can we validate that we have successfully done so and that the functionality remains intact through future refactoring?  How can we simulate a permissions exception on a file that we have enough permission on to deploy to a test folder? Enter Shims from the Microsoft Fakes Framework.

      Instead of having our tests actually reach-out to the file system and actually try to load a file, we can intercept calls to the System.IO.File.ReadAllText method and have those calls execute some delegate code instead. This code, which we write in our test methods, can be specific to each test and exist only within the context of the test. As a result, we are not deploying any additional code to production, while still thoroughly validating our code.  In fact, using this methodology, I could re-implement my previous tests, including my test data in the tests themselves, making these tests better unit tests.  I could then reserve tests that actually reach out to files for integration test libraries that are run less frequently, and perhaps even behind the scenes.

          Note: If you wish to follow-along with these instructions, you can grab the code from the DemoStart branch of the GitHub repo, rather than the Master branch where this is already done.

          To use Shims, we first have to create a Fakes Assembly.  This is done by right-clicking on the System reference in the test project from Visual Studio 2017, and selecting “Add Fakes Assembly” (full framework only – not yet available for .NET Core assemblies). Be sure to do this in the test project since we don’t want to actually deploy the Fakes assembly in our production code.  Using the add fakes assembly menu item does 2 things:

          1. Adds a reference to Microsoft.QualityTools.Testing.Fakes assembly
          2. Creates 2 .fakes XML files in the Fakes folder within the test project. These items are built into corresponding fakes dll files that are deployed with the test project and used to provide stub and shim objects that mimic the objects in the selected assemblies.  These fake objects reside in the same namespace as their “real” counterparts, except with “Fakes” on the end. Thus, our fake File object will reside in the System.IO.Fakes namespace.

          Fakes

          The next step in using shims is to create a ShimsContext within a Using statement. Any method calls that execute within this context can be intercepted and replaced by our delegates.  For example, a test that replaces the call to ReadAllText with a method that returns a single line of constant data can be seen below.

          Methods on shim objects are referenced through properties of the fake object.  These properties are of type FakesDelegate.Func and match the signature of the method being shimmed.  The return data type is also appended to the property name so that each item’s signature can be represented with a different property name.  In this case, the ReadAllText method of the File object is represented in the System.IO.Fakes.File object as a property called ReadAllTextString, of type FakesDelegate.Func<string, string>, since the method takes a string parameter (the path of the file), and returns a string (the text contents of the file).  If we assign a method delegate to this property, that method will be executed in place of the call to System.IO.File.ReadAllText whenever ReadAllText is called within the ShimContext.

          In the gist shown above, the variable p represents the input parameter and will hold the path specified in the test (in this case “April2017.abc”).  The return value for our delegate method comes from the constant string dataFile.  We can put anything we want here.  We can replace the delegate with a call to an anonymous method, or with a call to an existing method.  We can return a value gleaned from an external source, or, as is needed for our permissions test, throw an exception.

          For the purposes of our test to verify that we throw a PermissionsException when a SecurityException is thrown, we can replace the value of the ReadAllTextString property with our delegate which throws the exception we need to test for,  as seen here:

          System.IO.Fakes.ShimFile.ReadAllTextString =
                p => throw new System.Security.SecurityException("Test Exception");

          Then, we can verify in our test that our custom exception is thrown.  The full working example can be seen by grabbing the Master branch of the GitHub repo.

          What can you test with these Shim objects that you were unable to test before?  Tell me about it on Twitter @bsstahl.

          The demo code for my presentation on Testing in Visual Studio 2017 at the VS2017 Launch event can be found on GitHub.  There are 2 branches to this repository, the Main branch which holds the completed demo, and the DemoStart branch which holds the starting point of the demonstration in case you would like to implement the sample yourself.

          The demo shows how Microsoft Fakes (formerly Moles) can be used to create tests against code that does not implement a reusable interface. This can be done  without having to resort to integration style tests or writing extra wrapper code just to implement an interface.  During my launch presentation, I also use this code to demonstrate the use of Intellitest (formerly Pex) to generate exploratory tests.

          One of the techniques I recommend highly in my Simplify Your API talk is the use of extension methods to hide the complexity of lower-level API functionality.  A good example of a place to use this methodology came-up last night in a great Reflection talk by Jeremy Clark (Twitter, Blog) at the NorthWest Valley .NET User Group

          Jeremy

          Jeremy was demonstrating a method that would spin-through an assembly and load all classes within that assembly that implemented a particular interface.  The syntax to do the checks on each type were just a bit more obtuse than Jeremy would have liked them to be.  As we left that talk, I only half-jokingly told Jeremy that I was going to write him an extension method to make that activity simpler.  Being a man of my word, I present the code below to do just that.

          Tags: , , , , , , , , | Categories: Event Posted by bsstahl on 10/21/2015 10:19 AM | Comments (0)

          I hope you’ve had an opportunity to see my presentation, “Dynamic Optimization – One Technique all Programmers Should Know” at a Code Camp or User Group near you.  If so, and you want to have a copy of the slide deck for your very own, you can see it embedded below, or use the direct link to the Powerpoint here

          The subject of this presentation is using a technique called Dynamic Programming to solve problems that have more than one possible solution.  This technique works very well when used to solve problems that are recursive in nature.  One of the best things about this technique is that it guarantees that the solution it produces is the best possible solution.

          We look at three examples during the presentation, the first is done only “on paper” and is an example of using this technique to solve a knapsack problem.  The second example is done in pseudo-code and solves a linear best-path problem in the game of Chutes & Ladders.  Finally, we drop into Visual Studio to solve a 2-dimensional best-path problem.  Sample code for both of the last 2 examples can be found in GitHub.

          Keep an eye on my Speaking Engagements Page for opportunities to see this presentation live. If you are a user group or conference organizer, you can contact me to schedule an in-person presentation.  This presentation is a lot of fun to deliver and has been received extremely well at Code Camps and User Groups across the country.

          Tags: , , , , , , , , , , | Categories: Development Posted by bsstahl on 10/12/2015 6:15 AM | Comments (0)

          If you are building an API for other Developers to use, you will find out two things very quickly:

          1. Developers don't read documentation (you probably already know this).
          2. If your API depends on its documentation to get developers to understand and discover its features, it is likely that it will not be used.

          Fortunately, there are some simple mechanisms for wrapping complex APIs and making their functionality both easy to use, and highly discoverable. An API that uses tools like IntelliSense in Visual Studio to make its features discoverable by the downstream developer is far more likely to be adopted then one that doesn't. In recent years, additions to the C# language have made creating a Domain Specific Language that uses a fluent syntax for nearly any API into a simple process.

          Create the Context

          The 1st step in simplifying any API is to provide a single starting point for the downstream developer to interact with. In most cases, the best practice is to use the façade pattern to define a context that holds our entity collections. Each collection of entities becomes a property on the context object. These properties all return an IQueryable<Entity>. For example, in the EnumerableStack demo solution on GitHub (https://github.com/bsstahl/SimpleAPI), I created an object Bss.EnumerableStack.Data.EnumerableStack to provide this functionality. It has two properties, Posts and Questions, each of which returns an IQueryable<Post>. It is these properties that will be used to access the data from our API.

          The context object, on top of becoming the single point of entry for downstream developers, also hides any complexities in the construction logic of the underlying data source. That is, if there is any configuration or other setup required to access the upstream data provider (such as web service access or database connections), much of the complexity of that construction can be hidden from the API user. A good example of this can be seen in the FluentStack demo solution from the same GitHub repository. There, the Bss.FluentStack.Data.OData.FluentStack context object wraps the functionality of constructing the connection to the StackOverflow OData web service.

          Extend Our Language

          Now that we have data to access, it's time for us to extend our domain specific language to provide tools to make accessing this data simpler for the API caller. We can use Extension methods on IQueryable<Entity> to create custom filters for our data. By creating extension methods that accept IQueryable<Entity> as a parameter and return the same, we can create methods that can be chained together to form a fluent syntax that will perform complex filtering. For example, in the EnumerableStack solution , the Questions, WithAcceptedAnswer and TaggedWith methods found in the Bss.EnumerableStack.Data.Extensions module, can all be used to execute queries on the data exposed by the properties of our context object, as shown below:

          var results = new EnumerableStack().Posts.WithAcceptedAnswer().TaggedWith("odata");

          In this case, both the WithAcceptedAnswer and TaggedWith filters are applied to the data. The best part about these methods are that they are visible in Intellisense (once the namespace has been brought into scope with a Using statement) making the functionality easy to discover and use.

          Another big advantage of creating these extension methods is that they can hide the complexity of the lower level API. Here, the WithAcceptedAnswer method is wrapping a where clause that filters for those posts that have an AcceptedAnswerId property that is non-null. It may not be obvious to a downstream API consumer that the definition of a post with an "accepted answer" is one where the AcceptedAnswerId has a value. Our API hides that implementation detail and allows the consumer to simply request what is needed. Similarly, the TaggedWith method hides the fact that the StackOverflow API stores tags in lower-case, within angle-brackets, and with all tags on a post joined into a single string. To search for tags, the consumer would need to know this, and take all appropriate actions when searching for a tag if we didn't hide that complexity in the TaggedWith method.

          Simplify Query Predicates

          A predicate is a function that accepts an entity as a parameter, and returns a boolean value. These functions are often used in the Where clause of a query to indicate which objects should be included in the result set. For example, in the query below

          var results = new EnumerableStack().Posts.Where(p => p.Parent == null);

          the function expression p => p.Parent == null is a predicate that returns true if the Parent property of the entity is null. For each entity passed to the function, the value of that property is tested, and if null, the entity is included in the results of the query. Here we are using a Lambda Expression to provide a delegate to our function. One of the coolest things about Linq is that we can now represent this expression in a variable of type Expression<Func<Entity, bool>>, that is, a Lambda expression of a function that takes an Entity and returns a boolean. This is pretty awesome because if we can store it in a variable, we can pass it around and enable extension methods like this one, as found in the Asked class of the Bss.EnumerableStack.Data library:

          public static Expression<Func<Post, bool>> InLast(TimeSpan span)
             {
             return p => p.CreationDate > DateTime.UtcNow.Subtract(span);
             }

          This method accepts a TimeSpan object and returns the Lambda Expression type useable as a predicate. The input TimeSpan is subtracted from the current DateTime UTC value, and compared to the CreationDate property of a Post entity. If the creation date of the Post is later than 30-days prior to the current date, the function returns true. Since this InLast method is static on a class called Asked, we can use it like this:

          var results = new EnumerableStack().Questions.Where(Asked.InLast(TimeSpan.FromDays(30));

          Which will return questions that were asked in the last 30 days. This becomes even simpler to understand if we add a method extending Int called Days that returns a Timespan, like this:

          public static TimeSpan Days(this int value)
             {
             return TimeSpan.FromDays(value);
             }

          allowing our expression to become:

          var results = new EnumerableStack().Questions.Where(Asked.InLast(30.Days());

          Walking through the Process

          In my conference sessions, Simplify Your API: Creating Maintainable and Discoverable Code, I walk through this process on the FluentStack demo code. We take a query created against the StackOverflow OData API that starts off looking like this:

          var questions = new StackOverflowService.Entities(new Uri(_serviceRoot))
             .Posts.Where(p => p.Parent == null && p.AcceptedAnswerId != null
             && p.CreationDate > DateTime.UtcNow.Subtract(TimeSpan.FromDays(30))
             && p.Tags.Contains("<odata>"));

          and convert it, one step at a time, to this:

          var questions = new FluentStack().Questions.WithAcceptedAnswer().
             Where(Asked.InLast(30.Days)).TaggedWith("odata");

          a query that is much simpler, easier to understand, easier to create and easier to maintain. The sample code on GitHub, referenced above, and available at http://github.com/bsstahl/SimpleAPI, contains the FluentStack.sln example which shows how to simplify an API created with an OData source. It also contains the EnumerableStack.sln project which walks through the same process on a purely enumerable data source, that is, an implementation that will work with any collection.

          Sound Off

          Have you used these tools to simplify an API for downstream programmers? Do you have other techniques that you use to do the same, similar, or additional things to make your APIs better? If so, Tweet it to me @bsstahl and let's keep the conversation going.

          I previously posted the slides for my Building Enterprise Apps using Entity Framework 4 talk here.  I can now post the source code for the completed demo application.  That code, created for use in Visual Studio 2010 Ultimate, is available in zip format below.  This is the same code that was demonstrated at Desert Code Camp 2011.1 and SoCalCodeCamp 2011 as well as the New Mexico .NET User’s Group (NMUG).

          EF4EnterpriseDemoCode.zip