Categories: Event | Posted by bsstahl on 10/16/2017 10:37 AM

Another great Desert Code Camp is in the books. A huge shout-out to all of the organizers, speakers & attendees for making the event so awesome.

I was privileged to be able to deliver two talks during this event:

• A Developer’s Survey of AI Techniques: Artificial Intelligence is far more than just machine learning. There are a variety of tools and techniques that systems use to make rational decisions on our behalf. In this survey designed specifically for software developers, we explore a variety of these methods using demo code written in C#. You will leave with an understanding of the breadth of AI methodologies as well as when and how they might be used. You will also have a library of sample code available for reference.

      • AI that can Reason "Why": One of the big problems with Artificial Intelligences is that while they are often able to give us the best possible solution to a problem, they are rarely able to reason about why that solution is the best. For those times where it is important to understand the why as well as the what, Hybrid AI systems can be used to get the best of both worlds. In this introduction to Hybrid AI systems, we'll design and build one such system that can solve a complex problem for us, and still provide information about why each decision was made so we can evaluate those decisions and learn from our AI's insights.

      Please feel free to contact me on Twitter with any questions or comments on these or any of my presentations.

      I previously wrote about a Hybrid AI system that combined logical and optimization methods of problem solving to identify the best solution to an employee shift assignment problem. This implementation was notable in that a hybrid approach was used so that the optimal solution could be found, but the system could still indicate to the users why a particular assignment was, or wasn’t, included in the results. 

I recently published to GitHub a demo of a similar system. I use this demo in my presentation, Building AI Solutions that can Reason Why. The code demonstrates the hybridization of multiple AI techniques by creating a solution that iteratively applies a combinatorial optimization engine. Different results are obtained by varying the methods of applying the constraints in that model. In the final (4th) demo method, an iterative process is used to identify what the shortcomings of the final product are, and why they are necessary.

      These demos use the Conference Scheduler AI project to build a valid schedule.

There are 4 examples, each of which resides in a separate test method:

      ScheduleWithNoRestrictions()

The 1st method in BasicExamplesDemo.cs shows an unconstrained model where all but the hardest of constraints are excluded. That is, the only features of the schedule that are considered by the scheduler are those that are absolute must-haves.  Since there are fewer hard constraints, it is relatively easy to satisfy all the requirements of this model.

      ScheduleWithHardConstraints()

The 2nd method in BasicExamplesDemo.cs shows a fully constrained model where all constraints are considered must-haves. That is, the only schedules that will be considered for our conference are those that meet all of the scheduling criteria. As you might imagine, this can be difficult to do, in this case resulting in No Feasible Solution being found. Because we use a combinatorial optimization model, the system gives us no clues as to which of the constraints cause the infeasibility, or what to do that might allow it to find a solution.

      ScheduleWithTimePreferencesAsAnOptimization()

The 3rd method in BasicExamplesDemo.cs shows the solution when the true must-haves are considered hard constraints but preferences are not. The AI attempts to optimize the solution by satisfying as many of the soft constraints (preferences) as possible. This results in an imperfect, but possibly best-case, schedule; however, we have little insight as to which preferences were not satisfied, and almost no insight as to why.

      AddConstraintsDemo()

The final demo, and the only method in AddConstraintsDemo.cs, builds on the 3rd demo, where the true must-haves are considered hard constraints but preferences are not. Here, however, instead of attempting to optimize the soft constraints, the AI iteratively adds the preferences as hard constraints, one at a time, re-solving after each addition to make sure the problem has not become infeasible. If it has, that fact is recorded along with what was being attempted. Then that constraint is removed and the process continues with the remaining constraints. This Hybrid process still results in an imperfect, but best-case, schedule. This time, however, we not only know which preferences could not be satisfied, we also have a good idea as to why.

      The Hybrid Process

The process of iteratively executing the optimization, adding constraints one at a time, is shown in the diagram below.  It is important to remember that the order in which these constraints are added is critical, since constraining the solution in one way may limit the feasibility of the solution for future constraints.  Great care must be taken in selecting the order in which constraints are added in order to obtain the best possible solution.

       

      Hybrid Conference Optimization Process

The steps are as follows (a minimal code sketch of the loop appears after the list):

      1. Make sure we can solve the problem without any of the soft constraints.  If the problem doesn’t have any feasible solutions at the start of the process, we are certainly not going to find any by adding constraints.
      2. Add a constraint to the solution. Do so by selecting the next most important constraint in order.  In the case of our conference schedule, we are adding in speaker preferences for when they speak. These preferences are being added in the order that they were requested (first-come first-served).
      3. Verify that there is still at least 1 feasible solution to the problem after the constraint is added.  If no feasible solutions can be found:
        1. Remove the constraint.
        2. Record the details of the constraint.
        3. Record the current state of the model.
      4. Repeat steps 2 & 3 until all constraints have been tried.
      5. Publish the solution
        1. The resulting schedule
        2. The constraints that could not be added.  This tells us what preferences could not be accommodated.
3. The state of the model at the time the failed constraints were tried.  This gives us insight as to why the constraints could not be satisfied.
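The loop below is a minimal C# sketch of these steps. It is illustrative only: the Constraint and Result types, and the solve delegate that stands in for the combinatorial optimization engine, are hypothetical and are not the types used in the actual demo code or the Conference Scheduler AI project.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Constraint
    {
        public string Description { get; set; }
        public int Priority { get; set; }          // lower number = more important
    }

    public class Result
    {
        public bool IsFeasible { get; set; }
    }

    public static class HybridScheduler
    {
        // solve executes the combinatorial optimization for the supplied constraints.
        public static (Result Best, List<Constraint> Rejected) Build(
            Func<IReadOnlyList<Constraint>, Result> solve,
            IReadOnlyList<Constraint> hardConstraints,
            IEnumerable<Constraint> preferences)
        {
            var active = new List<Constraint>(hardConstraints);
            var rejected = new List<Constraint>();

            // Step 1: the hard constraints alone must be feasible or we stop here.
            var best = solve(active);
            if (!best.IsFeasible)
                throw new InvalidOperationException("Infeasible with hard constraints alone.");

            // Steps 2-4: add each preference, most important first, re-solving each time.
            foreach (var preference in preferences.OrderBy(p => p.Priority))
            {
                active.Add(preference);
                var attempt = solve(active);

                if (attempt.IsFeasible)
                {
                    best = attempt;                // keep the constraint and the new solution
                }
                else
                {
                    active.Remove(preference);     // Step 3.1: back the constraint out
                    rejected.Add(preference);      // Steps 3.2-3.3: record what failed
                                                   // (capturing the full model state is omitted here)
                }
            }

            // Step 5: publish the schedule plus the preferences that could not be honored.
            return (best, rejected);
        }
    }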

Note: The sample data in these demos is very loosely based on SoCalCodeCamp San Diego from the summer of 2017. While some of the presenters’ names and presentations come roughly from the publicly available schedule, pretty much everything else has been fictionalized to make for a compelling demo, including the appearances by some Microsoft rock stars, and the "requests" of the various presenters.

      If you have any questions about this code, or about how Hybrid AIs can be used to provide more information about the solutions to problems than strictly optimization or probabilistic models, please contact me on Twitter.

Categories: Development | Posted by bsstahl on 9/29/2017 5:53 AM

       

      My presentation from the #NDCSydney conference has been published on YouTube.

      We depend on Artificial Intelligences to solve many types of problems for us. Some of these problems have more than one possible solution. Handling those problems with more than one solution while building a modern AI system is something every developer will be asked to do over the course of his or her career. Figuring out the best way to utilize the capacity of a device or machine, finding the shortest path between two points, or determining the best way to schedule people or events are all problems where mathematical optimization techniques and tooling can be used to quickly and efficiently find solutions.

This session is a software developer’s introduction to using mathematical optimization in Artificial Intelligence. In it, we will explore some of the foundational techniques for solving these types of problems, and use the open-source Google OR-Tools to put them to work in our AI systems. Since this is a session for developers, we'll keep it in terms that work best for us. That is, we'll go heavy on the code and lighter on the math.

Categories: Event | Posted by bsstahl on 6/23/2017 8:10 AM

      The slide deck for my talk “A Developer’s Survey of AI Techniques” can be found below, while the demo code can be found on GitHub.

       

       

The talk explores some of the different techniques used to create Artificial Intelligences using the example of a Chutes & Ladders game.  Various AIs are developed to play a variant of the game, each using a different technique for deciding where on the game board to move.

      If you would like me to deliver this talk, or any of my talks, at your User Group or Conference, please contact me.

Categories: Development | Posted by bsstahl on 6/2/2017 7:27 AM

      I recently had a developer colleague return from an AI conference and tell me something along the lines of "…all they really showed were algorithms, nothing that really learned." Unfortunately, there is this common misconception, even among people in the software community, that to have an AI, you need Machine Learning. Now don't get me wrong, Machine Learning is an amazing technique and it has been used to create many real breakthroughs in Software Engineering. But, to have AI you don't need Machine Learning, you simply need a system that makes decisions that otherwise would need to be made by humans. That is, you need a machine to act rationally. There are many ways to accomplish this goal. I have explored a few methods in this forum in the past, and will explore more in the future. Today however, I want to discuss the real value proposition of AI. That is, the ability to make decisions at scale.

      The value in AI comes not from how the decisions are made, but from the ability to scale those decisions.

      I see 4 types of scale as key in evaluating the value that Artificial Intelligences may bring to a problem. They are the solution space, the data requirements, the problem space and the volume. Let's explore each of these types of scale briefly.

      Solution Space

The solution space consists of all of the possible answers to a question. It is the AI's job to evaluate the different options and determine the best decision to make under the circumstances. As the number of options increases, it becomes more and more important for the decisions to be made in an automated, scalable way. Artificial Intelligences can add real value when solving problems that have very large solution spaces. As an example, let's look at the scheduling of conference sessions. A very small conference with 3 sessions and 3 rooms during 1 timeslot is easy to schedule. Anyone can manually sort both the sessions and rooms by size (expected and actual) and assign the largest room to the session where the most people are expected to attend. 3 sessions and 3 rooms has only 6 possible answers, a very small solution space. If, on the other hand, our conference has 450 sessions spread out over 30 rooms and 15 timeslots, the number of possibilities grows astronomically. There are 450! (450 factorial) possible combinations of sessions, rooms and timeslots in that solution space, far too many for a person to evaluate even in a lifetime of trying. In fact, that solution space is so large that even a brute-force algorithm that evaluates every possible combination for fitness may never complete. We need to depend on combinatorial optimization techniques and good heuristics to manage these types of decisions, which makes problems with a large solution space excellent candidates for Artificial Intelligence solutions.
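To put rough numbers on that comparison, the snippet below (purely illustrative) uses BigInteger to compute the size of both solution spaces; the 3-session space yields 6, while the 450-session space yields a number with over 1,000 digits.

    using System;
    using System.Numerics;

    class SolutionSpaceSize
    {
        static BigInteger Factorial(int n)
        {
            BigInteger result = BigInteger.One;
            for (int i = 2; i <= n; i++)
                result *= i;
            return result;
        }

        static void Main()
        {
            // 3 sessions into 3 room/timeslot combinations: 3! = 6 possible schedules.
            Console.WriteLine(Factorial(3));                       // 6

            // 450 sessions into 450 room/timeslot combinations (30 rooms x 15 timeslots):
            // 450! possible schedules -- far beyond anything brute force can enumerate.
            Console.WriteLine(Factorial(450).ToString().Length);   // over 1,000 digits
        }
    }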

      Data Requirements

The data requirements consist of all of the different data elements needed to make the optimal decision. Decisions that require only a small number of data elements can often be evaluated manually. However, when the number of data elements to be evaluated becomes unwieldy, a problem becomes a good candidate for an Artificial Intelligence. Consider the problem of comparing two hitters from the history of baseball. Was Mark McGwire a better hitter than Mickey Mantle? We might decide to base our decision on one or two key statistics. If so, we might say that McGwire was a better hitter than Mantle because his OPS is slightly better (.982 vs .977). If, however, we want to build a model that takes many different variables into account, hopefully maximizing the likelihood of making the best determination, we may try to include many of the hundreds of different statistics that are tracked for baseball players. In this scenario, an automated process has a better chance of making an informed, rational decision.

      Problem Space

The problem space defines how general the decision being made can be. The more generalized an AI is, the more likely it is to be applicable to any given situation, and the more value it is likely to have. Building on our previous example, consider these three problems:

      • Is a particular baseball hitter better than another baseball hitter?
      • Is a particular baseball player better than another baseball player (hitter or pitcher)?
      • Is a particular baseball player a better athlete than a particular soccer player?

      It is relatively easy to compare apples to apples. I can compare one hitter to another fairly easily by simply comparing known statistics after adjusting for any inconsistencies (such as what era or league they played in). The closer the comparison and the more statistics they have in common, the more likely I am to be able to build a model that is highly predictive of the optimum answer and thus make the best decisions. Once I start comparing apples to oranges, or even cucumbers, the waters become much more muddied. How do I build a model that can make decisions when I don't have direct ways to compare the options?

      AIs today are still limited to fairly small problem spaces, and as such, they are limited in scope and value. Many breakthroughs are being made however that allow us to make more and more generalizable decisions. For example, many of the AI "personal assistants" such as Cortana and Siri use a combination of different AIs depending on the problem. This makes them something of an AI of AIs and expands their capabilities, and thus their value, considerably.

      Volume

The volume of a problem describes the way we usually think of scale in software engineering problems. That is, it is the number of times the program is used to reach a decision over a given timeframe. Even a very simple problem with small solution and problem spaces, and very simple data needs, can benefit from automation if the decision has to be made enough times in rapid succession. Let's use a round-robin load balancer on a farm of 3 servers as an example. Round Robin is a simple heuristic for load balancing that attempts to distribute the load among the servers by sending the traffic to each machine in order. The only data needed for this decision is the knowledge of which machine was selected during the last execution. There are only 3 possible answers and the problem space is very small and well understood. A person could easily make each decision as long as the volume remains low. As soon as the number of requests starts increasing, however, a person would quickly find themselves overwhelmed. Even when the other factors are small in scale, high-volume decisions make very good candidates for AI solutions.
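To show just how small that decision is, here is a minimal round-robin selector. It is a generic sketch, not the implementation of any particular load balancer, and it deliberately ignores concerns like thread-safety.

    // The only state needed is which server was selected during the last execution.
    public class RoundRobinBalancer
    {
        private readonly string[] _servers;
        private int _last = -1;

        public RoundRobinBalancer(params string[] servers)
        {
            _servers = servers;
        }

        // Not thread-safe; a production balancer would synchronize this state.
        public string Next()
        {
            _last = (_last + 1) % _servers.Length;
            return _servers[_last];
        }
    }

    // Usage: var balancer = new RoundRobinBalancer("web01", "web02", "web03");
    //        var target = balancer.Next();   // web01, then web02, web03, web01, ...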

       

These 4 factors describing the scale of a problem are important to consider when attempting to determine whether an automated Artificial Intelligence solution is a good candidate to be part of the overall solution. Once it has been decided that an AI is appropriate for a problem, we can then look at the options for implementing the solution. Machine Learning is one possible candidate for many problems, but certainly not all. Much more on that in future articles.

      Do you agree that the value in AI comes not from how the decisions are made, but from the ability to scale those decisions? Did I miss any scale factors that should be considered when determining if an AI solution might be appropriate? Sound off on Twitter.

Categories: Development | Posted by bsstahl on 5/25/2017 12:21 AM

      I've written and spoken before about the importance of using the Strategy Pattern to create maintainable and testable systems. Strategies are even more important, almost to the level of necessity, when building AI systems.

      The Strategy Pattern is an abstraction tool used to maintain loose-coupling between an application and the algorithm(s) that it uses to do its job. Since the algorithms used in AI systems have many different ways they could be implemented, it is important to abstract the implementation from the system that uses it. I tend to work with systems that use combinatorial optimization methods to solve their problems, but there are many ways for AIs to make decisions. Machine Learning is one of the hottest methods right now but AI systems can also depend on tried-and-true object-oriented logic. The ability to swap algorithms without changing the underlying system allows us the flexibility to try multiple methods before settling on a specific implementation, or even to switch-out implementations as scenarios or situations change.
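As a minimal sketch of the pattern (the type names below are illustrative and not taken from any particular project), the application depends only on a strategy interface and receives a concrete algorithm from the outside:

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical domain types used only for this illustration.
    public class Session  { public string Title { get; set; } public int ExpectedAttendance { get; set; } }
    public class Room     { public string Name { get; set; }  public int Capacity { get; set; } }
    public class Schedule { public List<(Session Session, Room Room)> Assignments { get; } = new List<(Session, Room)>(); }

    // The abstraction the application depends on.
    public interface ISchedulingStrategy
    {
        Schedule CreateSchedule(IReadOnlyList<Session> sessions, IReadOnlyList<Room> rooms);
    }

    // A deliberately naive first implementation: the biggest room gets the most popular session.
    public class NaiveSchedulingStrategy : ISchedulingStrategy
    {
        public Schedule CreateSchedule(IReadOnlyList<Session> sessions, IReadOnlyList<Room> rooms)
        {
            var schedule = new Schedule();
            var pairs = sessions.OrderByDescending(s => s.ExpectedAttendance)
                                .Zip(rooms.OrderByDescending(r => r.Capacity), (s, r) => (s, r));
            schedule.Assignments.AddRange(pairs);
            return schedule;
        }
    }

    // The application never references a concrete algorithm, so an optimization
    // engine, an ML model, or a test mock can be injected without changing this class.
    public class ConferencePlanner
    {
        private readonly ISchedulingStrategy _strategy;

        public ConferencePlanner(ISchedulingStrategy strategy)
        {
            _strategy = strategy;
        }

        public Schedule Plan(IReadOnlyList<Session> sessions, IReadOnlyList<Room> rooms)
            => _strategy.CreateSchedule(sessions, rooms);
    }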

When I give conference talks on building AI Systems using optimization methods, I always encourage the attendees to create a "naïve" solution first, before spending a lot of effort to build complicated logic. This allows the developer to understand the problem better than he or she did before doing any implementation. Creating this initial solution has another advantage though: it allows us to define the Strategy interface, giving us a better picture of what our application truly needs. Then, when we set-out to build a production-worthy engine, we do so with the knowledge of exactly what we need to produce.

      There is also another component of many AIs that can benefit from the use of the Strategy pattern, and that is the determination of user intent. Many implementations of AI will include a user interaction, perhaps through a text-based interface as in a chatbot or a voice interface such as a personal assistant. Each cloud provider has their own set of services designed to determine the intent of the user based on the text or voice input. Each of these implementations has its own strengths and weaknesses. It is beneficial to be able to swap those mechanisms out at will, along with the ability to implement a "naïve" user intent solution during development, and the ability to mock user intent for testing. The strategy pattern is the right tool for this job as well.

      As more and more of our applications depend heavily on algorithms, we will need to make a concerted effort to abstract those algorithms away from our applications to maintain loose-coupling and all of the benefits that loose-coupling provides. This is why I consider the Strategy Pattern to be a necessity when developing Artificial Intelligence solutions.

Categories: Event | Posted by bsstahl on 5/7/2017 3:40 AM

      The slide deck for my presentation “Examples of Microservice Architectures” can be found here.

      There isn't one clear answer to the question "what does a micro-service architecture look like?" so it can be very enlightening to see some existing implementations. In this presentation, we will look at 2 different applications that would not traditionally be thought of as candidates for a service-oriented approach. We'll look at how they were implemented and what benefits the micro-services architecture brought to the table for each application.

Categories: Development | Posted by bsstahl on 3/21/2017 7:41 AM

      It is fairly easy these days to test code in isolation if its dependencies are abstracted by a reusable interface. But what do we do if the dependency cannot easily be referenced via such an interface?  Enter Shims, from the Microsoft Fakes Framework (formerly Moles).  Shims allow us to isolate our testing from any dependent methods, including methods in assemblies we do not control, even if those methods are not exposed through a reusable interface. To see how easy it is, follow along with me through this example.

In this sample code on GitHub, we are building a repository for an application that currently gets its data from a file exported from a system that tracks scheduled meetings.  It is very likely that the system will, in the future, expose a more modern interface for that data, so we have isolated the data storage using a simple Repository interface that has one method.  This method, called GetMeetings, returns a collection of Meeting entities that start during the specified date range.  The method will return an empty collection if no data is found matching the specified criteria, and could throw either of 2 custom exceptions: a PermissionsException when the user does not have the proper permissions to access the information, or a DataUnavailableException when the data source is unavailable for any other reason, such as a network outage or a data file that cannot be located.
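Based on that description, the interface looks something like the sketch below. The parameter names are mine, and the exact signature in the GitHub sample may differ slightly; Meeting and the two custom exceptions are defined elsewhere in the sample.

    using System;
    using System.Collections.Generic;

    public interface IMeetingSourceRepository
    {
        // Returns the meetings that start within the specified date range, or an
        // empty collection if no meetings match.
        // Throws PermissionsException if the user lacks access to the data.
        // Throws DataUnavailableException if the data source cannot be reached
        // for any other reason (network outage, missing data file, etc.).
        IEnumerable<Meeting> GetMeetings(DateTime startOfRange, DateTime endOfRange);
    }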

It is important to point out why a custom exception should be thrown when the data file is not found, rather than allowing the FileNotFoundException to bubble-up.  If we allow the implementation-specific exception to bubble, we have exposed an implementation detail to the caller. That is, the calling code is now aware of the fact that this is a file system implementation.  If a client is written to trap FileNotFoundException, and the repository implementation is later swapped-out for a SQL Server implementation, the client code will have to change to handle the new types of errors that could be thrown by that implementation.  This violates the Dependency Inversion principle, the “D” from the SOLID principles.  By exposing only a custom exception, we are hiding those implementation details from the caller.
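In the file-system implementation, that translation is a straightforward catch-and-wrap, roughly like the sketch below. This is representative rather than the exact sample code: _filePath and ParseMeetings are hypothetical members, and the custom exceptions are assumed to expose a (message, innerException) constructor.

    public IEnumerable<Meeting> GetMeetings(DateTime startOfRange, DateTime endOfRange)
    {
        string contents;
        try
        {
            contents = System.IO.File.ReadAllText(_filePath);
        }
        catch (System.IO.FileNotFoundException ex)
        {
            // Hide the file-system detail from the caller (Dependency Inversion).
            throw new DataUnavailableException("The meeting data source is unavailable.", ex);
        }
        catch (System.Security.SecurityException ex)
        {
            throw new PermissionsException("You do not have permission to read the meeting data.", ex);
        }

        return ParseMeetings(contents, startOfRange, endOfRange);
    }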

      Downstream clients can easily test code that uses this repository without having to actually access the repository implementation because we have exposed the IMeetingSourceRepository interface. However, it is a bit more difficult to actually test the repository implementation itself.  We have a few options here:

      • Create data files that hold known data samples and load those files during unit testing.
      • Create a wrapper around the System.IO namespace that exposes an interface, such as in the System.IO.Abstractions project.
      • Don’t test any code that requires reaching-out to the file system.

Since I am of the opinion that 100% code coverage is both reasonable and desirable (although not a measurable goal), I will summarily dispose of option 3 for the purpose of this analysis. I have used option 2 many times, and while employing wrapper code is a valid and reasonable solution, it adds code to my production deployments that contributes very little additional loose-coupling, since I am already loosely-coupled to this implementation via the IMeetingSourceRepository interface.

      Even though it is far from a perfect solution (many would consider them more integration tests than unit tests), I initially selected option 1 for this implementation. That is, I created data files and deployed them along with my tests.  You can see the test files I created in the Data folder of the MeetingSystem.Data.FileSystem.Test project.  These files are deployed alongside my tests using the DeploymentItem directive that decorates the Repository_GetMeetings_Should class of the test project.  Using this method, I was able to create tests that:

• Verify that the correct number of meetings are returned from a file
      • Verify that meetings are properly filtered by the StartDateTime of the meeting
      • Validate the data elements returned from the file
      • Validate that the proper custom exception is thrown if a FileNotFoundException is thrown by the underlying code

      So we have verified nearly everything we need to test in our implementation.  We’ve verified that the data is returned properly, and that one of our custom exceptions is being returned. But what about the PermissionsException?  We were able to simulate a FileNotFoundException in our tests by just using a bad filename, but how do we test for a permissions problem?  The ReadAllText method of the File object from System.IO will throw a System.Security.SecurityException if the file cannot be read due to a permissions problem.  We need to trap this exception and throw our own exception, but how can we validate that we have successfully done so and that the functionality remains intact through future refactoring?  How can we simulate a permissions exception on a file that we have enough permission on to deploy to a test folder? Enter Shims from the Microsoft Fakes Framework.

      Instead of having our tests actually reach-out to the file system and actually try to load a file, we can intercept calls to the System.IO.File.ReadAllText method and have those calls execute some delegate code instead. This code, which we write in our test methods, can be specific to each test and exist only within the context of the test. As a result, we are not deploying any additional code to production, while still thoroughly validating our code.  In fact, using this methodology, I could re-implement my previous tests, including my test data in the tests themselves, making these tests better unit tests.  I could then reserve tests that actually reach out to files for integration test libraries that are run less frequently, and perhaps even behind the scenes.

          Note: If you wish to follow-along with these instructions, you can grab the code from the DemoStart branch of the GitHub repo, rather than the Master branch where this is already done.

To use Shims, we first have to create a Fakes Assembly.  This is done by right-clicking on the System reference in the test project from Visual Studio 2017, and selecting “Add Fakes Assembly” (full framework only – not yet available for .NET Core assemblies). Be sure to do this in the test project since we don’t want to actually deploy the Fakes assembly in our production code.  Using the Add Fakes Assembly menu item does 2 things:

          1. Adds a reference to Microsoft.QualityTools.Testing.Fakes assembly
          2. Creates 2 .fakes XML files in the Fakes folder within the test project. These items are built into corresponding fakes dll files that are deployed with the test project and used to provide stub and shim objects that mimic the objects in the selected assemblies.  These fake objects reside in the same namespace as their “real” counterparts, except with “Fakes” on the end. Thus, our fake File object will reside in the System.IO.Fakes namespace.


          The next step in using shims is to create a ShimsContext within a Using statement. Any method calls that execute within this context can be intercepted and replaced by our delegates.  For example, a test that replaces the call to ReadAllText with a method that returns a single line of constant data can be seen below.
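The original post embeds that test as a gist; since it is not reproduced here, the following sketch shows the shape of such a test. The Fakes types (ShimsContext and System.IO.Fakes.ShimFile) are real, but the repository constructor and the contents of dataFile are illustrative guesses rather than the actual sample code.

    [TestMethod]
    public void ReturnTheMeetingsHeldInTheDataFile()
    {
        // Shims are only active inside this context.
        using (Microsoft.QualityTools.Testing.Fakes.ShimsContext.Create())
        {
            // A single line of constant data stands in for the contents of the file.
            string dataFile = "1,Sprint Planning,2017-04-03T09:00:00";

            // p holds the path passed to File.ReadAllText ("April2017.abc" in this test);
            // the delegate returns our constant data instead of touching the file system.
            System.IO.Fakes.ShimFile.ReadAllTextString = p => dataFile;

            var target = new MeetingSourceRepository("April2017.abc");   // hypothetical constructor
            var actual = target.GetMeetings(new DateTime(2017, 4, 1), new DateTime(2017, 5, 1));

            Assert.AreEqual(1, actual.Count());
        }
    }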

          Methods on shim objects are referenced through properties of the fake object.  These properties are of type FakesDelegate.Func and match the signature of the method being shimmed.  The return data type is also appended to the property name so that each item’s signature can be represented with a different property name.  In this case, the ReadAllText method of the File object is represented in the System.IO.Fakes.File object as a property called ReadAllTextString, of type FakesDelegate.Func<string, string>, since the method takes a string parameter (the path of the file), and returns a string (the text contents of the file).  If we assign a method delegate to this property, that method will be executed in place of the call to System.IO.File.ReadAllText whenever ReadAllText is called within the ShimContext.

In the code shown above, the variable p represents the input parameter and will hold the path specified in the test (in this case “April2017.abc”).  The return value for our delegate method comes from the constant string dataFile.  We can put anything we want here.  We can replace the delegate with a call to an anonymous method, or with a call to an existing method.  We can return a value gleaned from an external source, or, as is needed for our permissions test, throw an exception.

For the purposes of our test to verify that we throw a PermissionsException when a SecurityException is thrown, we can replace the value of the ReadAllTextString property with our delegate which throws the exception we need to test for, as seen here:

          System.IO.Fakes.ShimFile.ReadAllTextString =
                p => throw new System.Security.SecurityException("Test Exception");

          Then, we can verify in our test that our custom exception is thrown.  The full working example can be seen by grabbing the Master branch of the GitHub repo.

          What can you test with these Shim objects that you were unable to test before?  Tell me about it on Twitter @bsstahl.

          The demo code for my presentation on Testing in Visual Studio 2017 at the VS2017 Launch event can be found on GitHub.  There are 2 branches to this repository, the Main branch which holds the completed demo, and the DemoStart branch which holds the starting point of the demonstration in case you would like to implement the sample yourself.

The demo shows how Microsoft Fakes (formerly Moles) can be used to create tests against code that does not implement a reusable interface. This can be done without having to resort to integration style tests or writing extra wrapper code just to implement an interface.  During my launch presentation, I also use this code to demonstrate the use of Intellitest (formerly Pex) to generate exploratory tests.

          I really enjoy working with .NET Core.  I like the fact that my code is portable to many platforms and that the footprint is so much smaller than with traditional .NET applications.  Unfortunately, the tooling has not quite reached the level that we expect from a Microsoft finished product (which it isn’t – yet). As a result, there are some additional actions we need to take when setting up our solutions in Visual Studio 2015 to allow us to unit test our code properly.  The following are the steps that I currently take to setup and test a .NET Core library using XUnit and Moq.  I know that a number of these steps will be done for us, or at least made much easier, by the tooling in the coming months, either by Visual Studio 2017, or by enhancements to the Visual Studio 2015 environments.

          1. Create the library to be tested in Visual Studio 2015
            1. File > New Project > .Net Core > Class Library
            2. Notice that this project is created in a solution folder called ‘src’
          2. Create a solution folder named ‘test’ to hold our test projects
            1. Right-click on the Solution > Add > New Solution Folder
          3. Add a new console application to the test folder as our test project
            1. Right-click on the ‘test’ folder > Add > New Project > .Net Core > Console Application
          4. Add a reference to the library being tested in the test project
            1. Right-click on the test project > Add > Reference > Select the library to be tested
          5. Install packages needed for unit testing from NuGet to the test project
            1. Right-click on the test project > Manage NuGet Packages > Browse
            2. Install ‘xunit’ as our unit test runner
              1. The current version for .Net Core is ‘2.2.0-beta4-build3444’
            3. Install ‘dotnet-test-xunit’ to integrate xunit with the Visual Studio test tools
              1. The current version for .Net Core is ‘2.2.0-preview2-build1029’
            4. Install ‘Moq’ as our mocking library
              1. The current version for .Net Core is ‘4.6.38-alpha’
6. Edit the project.json of the test library (a sample of the finished file appears after this list)
            1. Change the “EmitEntryPoint” option to false
            2. Add “testrunner” : “xunit” node
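For reference, the test project's project.json ends up looking roughly like the fragment below. This is a sketch only: exact property names and framework versions depend on your tooling, and MyLibrary stands in for the library under test.

    {
      "version": "1.0.0-*",
      "testRunner": "xunit",
      "buildOptions": {
        "emitEntryPoint": false
      },
      "dependencies": {
        "MyLibrary": { "target": "project" },
        "xunit": "2.2.0-beta4-build3444",
        "dotnet-test-xunit": "2.2.0-preview2-build1029",
        "Moq": "4.6.38-alpha"
      },
      "frameworks": {
        "netcoreapp1.0": {
          "dependencies": {
            "Microsoft.NETCore.App": { "type": "platform", "version": "1.0.0" }
          }
        }
      }
    }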

          Some other optional steps include:

          • Install the ‘Microsoft.CodeCoverage’ package from NuGet to enable the code coverage tooling
          • Install the ‘Microsoft.Extension.DependencyInjection’ package from NuGet to enable DI
          • Install the ‘TestHelperExtensions’ package from NuGet to add extensions that assist with writing good unit tests
          • Add any additional runtimes that might be needed. Some options are:
            • win10-x86
            • win10-x64
            • win7-x86
            • win7-x64
          • Set ‘Run tests after build’ in Visual Studio so tests run automatically
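Once the packages are installed and project.json is updated, a first test might look something like the sketch below. The GreetingService and INameProvider types are hypothetical stand-ins for whatever library is actually being tested.

    using Moq;
    using Xunit;

    public class GreetingServiceTests
    {
        [Fact]
        public void ReturnsGreetingFromTheInjectedProvider()
        {
            // Moq supplies a fake implementation of the dependency...
            var provider = new Mock<INameProvider>();
            provider.Setup(p => p.GetName()).Returns("Code Camp");

            // ...so the class under test can be exercised in isolation.
            var target = new GreetingService(provider.Object);
            var actual = target.Greet();

            Assert.Equal("Hello, Code Camp!", actual);
        }
    }

    // Hypothetical types representing the library being tested.
    public interface INameProvider { string GetName(); }

    public class GreetingService
    {
        private readonly INameProvider _provider;
        public GreetingService(INameProvider provider) => _provider = provider;
        public string Greet() => $"Hello, {_provider.GetName()}!";
    }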

          There will likely be better ways to do many of these things shortly, but if you know a better way now, please let me know via Twitter.