Categories: Development | Posted by bsstahl on 2/14/2019 9:31 AM

Have you ever experienced that feeling you get when you need to extend an existing system and there is an extension point that is exactly what you need to build on?

For example, suppose I get a request to extend a system so that an additional action is taken whenever a new user signs up.  The system already publishes an event message whenever a new user signs up, and that message contains all of the information I need for the new functionality.  All I have to do is subscribe a new microservice to this event message, and have that service take the new action whenever it receives a message. Boom! Done.

Now think about the converse: the many situations we’ve all experienced where there is no extension point. Or maybe there is an extension mechanism in place but it isn’t quite right; perhaps an event that doesn’t fire in exactly the situation you need, or one that doesn’t contain the data you require for your use case, so you have to build an entirely new data support mechanism to get access to the bits you need.

The cost to “go live” is only a small percentage of the lifetime total cost of ownership. – Andy Kyte for Gartner Research, 30 March 2010

There are some conflicting principles at work here, but for me, these situations expose the critical importance of flexibility and extensibility in our application architectures.  After all, maintenance and extension are the two greatest costs in a typical application’s life-cycle. I don’t want to build things that I don’t yet need because the likelihood is that I will never need them (see YAGNI). However, I don’t want to preclude myself from building things in the future by making decisions that cripple flexibility. I certainly don’t want to have to do a full system redesign every time I get a new requirement.

For me, this leads to a principle that I like to follow:

I value Flexibility over Optimization

As with the principles described in the Agile Manifesto that this is modeled after, this does not eliminate the item on the right in favor of the item on the left, it merely states that the item on the left is valued more highly.  This makes a ton of sense to me in this case because it is much easier to scale an application by adding instances, especially in these heady days of cloud computing, than it is to modify and extend it. I cannot add a feature by adding another instance of a service, but I can certainly overcome a minor or even moderate inefficiency by doing so. Of course, there is a cost to that as well, but typically that cost is far lower, especially in the short term, than the cost of maintenance and extension.

So, how does this manifest (see what I did there?) in practical terms?

For me, it means that I allow seams in my applications that I may not have a functional use for just yet. I may not build anything on those seams, but they exist and are available for use as needed. These include:

  • Separating the tiers of my applications for loose-coupling using the Strategy and Repository patterns
  • Publishing events in event-driven systems whenever it makes sense, regardless of the number of subscriptions to that event when it is created
  • Including all significant data in event messages rather than just keys
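As a rough illustration of the last two points, here is a minimal sketch of a sign-up event that carries its significant data and can be published whether or not anything is listening. The `UserSignedUp` record and the in-process `EventBus` class are hypothetical stand-ins invented for this example (a real system would use whatever message broker it already has), and the record syntax assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain event: it carries the significant data, not just the
// user's key, so a new subscriber does not need a lookup to do its work.
public sealed record UserSignedUp(Guid UserId, string Email, string DisplayName, DateTimeOffset SignedUpAt);

// Minimal in-process stand-in for a message bus (an assumption, not a real library).
public sealed class EventBus
{
    private readonly List<Action<UserSignedUp>> _subscribers = new();

    public void Subscribe(Action<UserSignedUp> handler) => _subscribers.Add(handler);

    // Publishing is valid even with zero subscribers; the seam simply exists.
    // Returns the number of handlers that received the message.
    public int Publish(UserSignedUp message)
    {
        foreach (var handler in _subscribers)
        {
            handler(message);
        }
        return _subscribers.Count;
    }
}

public static class Demo
{
    public static void Main()
    {
        var bus = new EventBus();
        var welcomeEmails = new List<string>();

        // The "new requirement", added later without touching the sign-up code.
        bus.Subscribe(e => welcomeEmails.Add($"Welcome, {e.DisplayName} <{e.Email}>"));

        bus.Publish(new UserSignedUp(Guid.NewGuid(), "ada@example.com", "Ada", DateTimeOffset.UtcNow));
        Console.WriteLine(welcomeEmails[0]); // prints "Welcome, Ada <ada@example.com>"
    }
}
```

The subscriber here never has to go back to the source system for the user's email or name, which is the practical payoff of including the data in the message rather than just the key.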

There are, of course, dangers here as well. It can be easy to fire events whenever we would generally issue a logging message.  Events should be limited to those in the problem domain (Domain Events), not application events. We can also reach a level of absurdity with the weight of each message. As with all things, a balance needs to be struck. In determining that balance, I value Flexibility over Optimization whenever it is reasonable and possible to do so.

Do you feel differently? If so, let me know on Twitter.

Categories: Development | Posted by bsstahl on 2/11/2019 7:55 PM

What is the result of converting a value that is close to, but not at, the maximum value of an Int64 from a double to a long (Int64)?  That is, what would be the result of an expression like:

(long)((double)(Int64.MaxValue - 1))

  1. 9223372036854775806 (2^63 - 2, the correct value numerically)
  2. -9223372036854775808 or another obviously incorrect value
  3. OverflowException
  4. Any of the above

Based on the framing of the question, it is probably clear that the correct answer is #4, "Any of the above". Depending on the hardware details and current state of your system, any of the three outcomes is possible.  Why is this, and what can we do to be sure that the results of our floating-point operations are what we expect them to be?

Before we go into the ways we can modify the behavior of our operations, let's take a look at the two data types in question, Int64 and Double.

An Int64 value, also known as a long, is a fairly straightforward storage mechanism that uses 63 bits for the value and 1 bit to represent the sign.  Negative numbers are stored in two's-complement form to make mathematical operations simpler.  The result is that the Int64 type can store, with perfect fidelity, any integral value between -9223372036854775808 and 9223372036854775807.

The Double data type, on the other hand, is far more complex. It requires storage for continuous values, not just integers. As a result, the Double data type uses 52 bits to store the mantissa (value), 11 bits to store the exponent (order of magnitude) and the remaining bit of the 64-bit structure to store the sign. The exponent is stored with a bias of 1023, and the mantissa of a normal value has an implied leading 1 bit that is not stored.  This gives an exponent range of -1022 to 1023 for normal values (the lowest and highest stored exponents are reserved for zero, subnormals, infinities and NaN) and 53 bits of fidelity in the mantissa.

It is this difference in fidelity, 63 bits for Int64 and 53 bits for Double (52 stored plus the implied leading bit), that can cause us problems when converting between the two types.  As long as the integer value fits in 53 bits (magnitude no greater than 2^53 = 9007199254740992), values can be converted back and forth between Int64 and Double without any data loss. However, as soon as the values cannot be represented completely in 53 bits, data loss is likely to occur.

To store such a value in a Double data type, the exponent is adjusted higher and the closest available value for the mantissa is used.  The rounding happens at this point, when the value is stored. When that Double is converted back to Int64, the resulting value may, or may not, be exactly the same as the original value.  To see an example of this, execute the following code in your favorite C# environment:

Console.WriteLine((long)9223372036854773765.0);

If your system is like mine, you’ll get an answer that is not the same as the original value. On my system, I get the result 9223372036854773760. It is said that this integer does not “round-trip” since it cannot be converted into a Double and then back to the same integer.
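To test whether a particular Int64 round-trips, one option is to compare the stored Double with the original value using `System.Numerics.BigInteger`, which avoids ever casting a possibly out-of-range Double back to a long. This is only a sketch; the helper name `IsExactAsDouble` is made up for illustration. Counting the implied leading bit, a Double holds 53 significant bits, so 2^53 is the last point at which every integer is still exact.

```csharp
using System;
using System.Numerics;

public static class RoundTrip
{
    // True when the long survives conversion to double unchanged.
    // The comparison is done in BigInteger so that an out-of-range double
    // (such as 2^63) is never cast back to long.
    public static bool IsExactAsDouble(long value) => new BigInteger((double)value) == value;

    public static void Main()
    {
        Console.WriteLine(IsExactAsDouble(9007199254740992L));    // 2^53: True
        Console.WriteLine(IsExactAsDouble(9007199254740993L));    // 2^53 + 1: False
        Console.WriteLine(IsExactAsDouble(9223372036854773765L)); // the value used above: False
        Console.WriteLine(IsExactAsDouble(long.MaxValue));        // 2^63 - 1: False
    }
}
```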

To make matters worse, the rounding that is required for this conversion can be unsafe under certain conditions. On my machine, if the values get within 512 of Int64.MaxValue, even though they don’t exceed it, attempting the conversion may result in an invalid result, or an OverflowException. At this magnitude adjacent Double values are 1024 apart, so any value within 512 of 2^63 is stored as 2^63 itself, which is one more than Int64.MaxValue. Even performing the operation without overflow checking using the unchecked keyword or compiler switch doesn't improve things since, if done unchecked, any overflow in the operation will result in an incorrect value rather than an exception. I prefer the exception in this kind of situation so I generally keep overflow checking on.

The key takeaway for me is that just checking to make certain that a Double value is less than Int64.MaxValue is not enough to guarantee it will convert without error, and certainly does not guarantee the accuracy of any such conversion. Only integer values that fit in 53 bits are guaranteed to convert accurately between Double and Int64.
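One way to guard the conversion, sketched here with a hypothetical `TryToInt64` helper, is to compare against 2^63 directly. Unlike Int64.MaxValue, 2^63 is exactly representable as a Double, so a strict less-than test against it means what it says. This only protects against overflow; it does nothing about precision already lost when the value was stored as a Double.

```csharp
using System;

public static class SafeConvert
{
    // 2^63 is exactly representable as a double; long.MaxValue (2^63 - 1) is not.
    private const double TwoPow63 = 9223372036854775808.0;

    // Hypothetical helper: returns false instead of attempting the cast when
    // the double is NaN or outside the Int64 range.
    public static bool TryToInt64(double value, out long result)
    {
        // Strict '<' on the upper bound. NaN fails both comparisons.
        if (value >= -TwoPow63 && value < TwoPow63)
        {
            result = (long)value; // in range, so the cast simply truncates toward zero
            return true;
        }

        result = 0;
        return false;
    }

    private static string Describe(double value) =>
        TryToInt64(value, out long converted) ? converted.ToString() : "out of range";

    public static void Main()
    {
        Console.WriteLine(Describe(9223372036854773765.0));        // 9223372036854773760
        Console.WriteLine(Describe((double)(long.MaxValue - 1)));  // out of range (stored as 2^63)
        Console.WriteLine(Describe(double.NaN));                   // out of range
    }
}
```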

It is always best to avoid type conversions if possible, but if you are in a situation where it is necessary to convert from large Double values into integers, I recommend trying some experiments in your production environment to see what range of values will convert accurately. I also highly recommend including very large integers, approaching or at Int64.MaxValue, as test data against any method that accepts Int64 values.  Values that are very large in the negative direction (nearing Int64.MinValue) are also good candidates to be used as test data in these methods.

I’ve attached a number of resources below that I used in my research to produce this article, and to fix the bug I caused doing this kind of conversion.  If you have run into this situation and come up with an interesting way of handling it, or if the results of your conversions are different than mine, please let me know about it on Twitter.

Resources


      Categories: General | Posted by bsstahl on 12/13/2018 2:39 PM

      I was recently interviewed by Dave Rael (@raelyard) for his Developer on Fire Podcast.  I had a great time talking with Dave about a lot of different things, both professional and personal, and got to name-drop just a few of the many people who have been a part of my journey over the years.

      I also took the opportunity to talk about a few things that have been on my mind:

      I hope you enjoy this interview and find something of value in it. If so, please let me know about it on Twitter.

      Developer On Fire

      Categories: Event | Posted by bsstahl on 11/10/2018 12:39 PM

      The slide decks for my two talks at SoCalCodeCamp USC from November 10, 2018 are below.

      Thanks to all of the organizers and attendees of this always amazing event.

      Categories: Event | Posted by bsstahl on 10/30/2018 8:10 PM

      March 8th – 10th 2019

      Mark your calendars to block out the weekend of March 8th 2019 for the next AZGiveCamp Hackathon-of-Help. More details will be coming very soon so keep an eye on AZGiveCamp.org and Meetup for all the particulars as soon as they are available.  I’m looking forward to seeing you all at our 9th event, helping those who help our community.

      Categories: Event | Posted by bsstahl on 9/26/2018 11:12 AM

      I will be speaking tonight, 9/26/2018 at the Northwest Valley .NET User Group and tomorrow, 9/27/2018 at the Southeast Valley .NET User Group. I will be speaking on the subject of WebAssembly. The talk will go into what WebAssembly programs look and act like, and how they run, then explore how we as .NET developers can write WebAssembly programs with Microsoft’s experimental platform, Blazor.

      Want to run your .NET Standard code directly in the browser on the client-side without the need for transpilers or browser plug-ins? Well, now you can with WebAssembly and Blazor.

      WebAssembly (WASM) is the W3C specification that will be used to provide the next generation of development tools for the web and beyond. Blazor is Microsoft's experiment that allows ASP.NET developers to create web pages that do much of the scripting work in C# using WASM.

      Come join us as we explore the basics of WebAssembly and how WASM can be used to run existing C# code client side in the browser. You will walk away with an understanding of what WebAssembly and Blazor can do for you and how to immediately get started running your own .NET code in the browser.

      The slide deck for these presentations can be found here: IntroToWasmAndBlazor-201809.pdf.

      Categories: Development | Posted by bsstahl on 3/15/2018 8:22 PM
      plus ça change, plus c'est la même chose. The more that things change, the more they stay the same. – Rush (and others)

      In 2013 I wrote that programmers needed to take responsibility for the output of their computer programs.  In that article, I advised developers that the output of their system, no matter how “random” or “computer generated”, was still their responsibility. I suggested that we cannot cop out by claiming  that the output of our programs is not our fault simply because we didn’t directly instruct the computer to issue that specific result.

      Today, we have a similar problem, only the stakes are much, much, higher.

      In the world of 2018, our algorithms are being used in police work and inside other government agencies to know where and when to deploy resources, and to decide who is and isn’t worthy of an opportunity. Our programs are being used in the private sector to make decisions from trading stocks to hiring, sometimes at a scale and speed that puts us all at risk of economic events. These tools are being deployed by information brokers such as Facebook and Google to make predictions about how best to steal the most precious resource we have, our time.  Perhaps scariest of all, these algorithms may be being used to make decisions that have permanent and irreversible results, such as with drone strikes.  We  simply have no way of knowing the full breadth of decisions that AIs are making on our behalf today.  If those algorithms are biased in any way, the decisions made by these programs will be biased, potentially in very serious ways and with serious results.

      All of the machines used to execute these algorithms are bias-free of course.  A computer has no prejudices and no desires of its own.  However, as we all know, decision-making tools learn what we teach them.  We cannot completely teach these algorithms free of our own biases.  It simply cannot be done since all of our data is colored by our existing biases.  Perhaps the best known example of bias in our data is in crime data used for policing. If we send police to where there is most often crime, we will be sending them to the same places we’ve sent them in the past, since generally, recording a crime involves having a police officer at the location to make an arrest. Thus, any biases we may have had in the past about where to send police officers will be represented in our data sets about crime.

      While we may never be able to eliminate biases completely, there are things that we can do to minimize the impact of the biases we are training into our algorithms.  If we take all available steps to recognize and eliminate the biases in our systems, we can minimize the likelihood of our tools producing output that we did not expect or that violates our principles.

      Know that the algorithm is biased

      We need to accept the fact that there is no way to create a completely bias-free algorithm.  Any dataset we provide to our tools will inherently have some bias in it.  This is the nature of our world.  We create our datasets based on history and our history, intentionally or not, is full of bias.  All of our perceptions and understandings are colored by our cognitive biases, and the same is true for the data we create as a result of our actions.  By knowing and accepting this fact, that our data is biased, and therefore our algorithms are biased, we take the first step toward neutralizing the impacts of those biases.

      Predict the possible biases

      We should do everything we can to predict what biases may have crept into our data and how they may impact the decisions the model is making, even if that bias is purely theoretical.  By considering what biases could potentially exist, we can watch for the results of those biases, both in an automated and manual fashion.

      Train “fairness” into the model

      If a bias is known to be present in the data, or even likely to be present, it can be accounted for by defining what an unbiased outcome might look like and making that a training feature of the algorithm.  If we can reasonably assume that an unbiased algorithm would distribute opportunities among male and female candidates at the same rate as they apply for the opportunity, then we can constrain the model with the expectation that the rate of accepted male candidates should be within a statistical tolerance of the rate of male applicants.  That is, if half of the applicants are men then men should receive roughly half of the opportunities.  Of course, it will not be nearly this simple to define fairness for most algorithms; however, every effort should be made.
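The applicant-rate expectation described above can at least be checked after the fact. The sketch below is only that: a post-hoc check on a model's decisions, not a training constraint, and the `IsWithinTolerance` name and the fixed tolerance (standing in for a proper statistical test) are assumptions made for illustration.

```csharp
using System;

public static class FairnessCheck
{
    // Hypothetical check: is a group's share of accepted candidates within
    // `tolerance` of that group's share of applicants?
    public static bool IsWithinTolerance(
        int groupApplicants, int totalApplicants,
        int groupAccepted, int totalAccepted,
        double tolerance)
    {
        if (totalApplicants <= 0 || totalAccepted <= 0)
        {
            throw new ArgumentException("Totals must be positive.");
        }

        double applicantShare = (double)groupApplicants / totalApplicants;
        double acceptedShare = (double)groupAccepted / totalAccepted;
        return Math.Abs(acceptedShare - applicantShare) <= tolerance;
    }

    public static void Main()
    {
        // Half of the applicants are men; 48 of 100 accepted candidates are men.
        Console.WriteLine(IsWithinTolerance(500, 1000, 48, 100, 0.05)); // True
        // Same applicant pool, but 70 of 100 accepted candidates are men.
        Console.WriteLine(IsWithinTolerance(500, 1000, 70, 100, 0.05)); // False
    }
}
```

A check like this could run against every batch of decisions and raise an alert when it fails, which fits the advice to watch for predicted biases in an automated fashion.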

      Be Open About What You’ve Built

      The more people understand how you’ve examined your data, and the assumptions you’ve made, the more confident they can be that anomalies in the output are not a result of systemic bias. This is most critical when these decisions have significant consequences for people’s lives.  A good example is in prison sentencing. It is unconscionable to me that we allow black-box algorithms to make sentencing decisions on our behalf.  These models should be completely transparent and subject to our analysis and correction.  That they aren’t, but are still being used by our governments, represents a huge breakdown of the system, since these decisions MUST be made with the trust and at the will of the populace.

      Build AIs that Provide Insight Into Results (when possible)

      Many types of AI models are completely opaque when it comes to how decisions are reached.  This doesn’t mean however that all of our AIs must be complete black-boxes.  It is true that most of the common machine learning methods such as Deep-Neural-Networks (DNNs) are extremely difficult to analyze.  However, there are other types of models that are much more transparent when it comes to decision making.  Some model types will not be usable on all problems, but when the options exist, transparency should be a strong consideration.

      There are also techniques that can be used to make even opaque models more transparent.  For example, a hybrid technique can be used to run opaque models iteratively.  This can allow the developer to log key details at specific points in the process, making the decisions much more transparent.  There are also techniques to manipulate the data after a decision is made, to gain insight into the reasons for the decision.

      Don’t Give the AI the Codes to the Nukes

      Computers should never be allowed to make automated decisions that cannot be reversed by a human if necessary. Decisions like when to attack a target, execute a criminal, vent radioactive waste, or ditch an aircraft are all decisions that require human verification since they cannot be undone if the model has an error or is faced with  a completely unforeseen set of conditions. There are no circumstances where machines should be making such decisions for us without the opportunity for human intervention, and it is up to us, the programmers, to make sure that we don’t give them that capability.

      Don’t Build it if it Can’t be Done Ethically

      If we are unable to come up with an algorithm that is free from bias, perhaps the situation is not appropriate for an automated decision making process.  Not every situation will warrant an AI solution, and it is very likely that there are decisions that should always be made by a human in totality.  For those situations, a decision support system may be a better solution.

      The Burden is Ours

      As the creators of automated decision making systems, we have the responsibility to make sure that the decisions they make do not violate our standards or ethics.  We cannot depend on our AIs to make fair and reasonable decisions unless we program them to do so, and programming them to avoid inherent biases requires an awareness and openness that has not always been present.  By taking the steps outlined here to be aware of the dangers and to mitigate them wherever possible, we have a chance of making decisions that we can all be proud of, and have confidence in.

      Categories: General | Posted by bsstahl on 3/10/2018 8:07 PM

      I recently gave my very first Toastmasters speech. I’m rather proud of it. It certainly didn’t go perfectly but was a good introduction to Toastmasters for me, and a good introduction of me to my Toastmasters club.

      For those who aren’t familiar with the process, everyone’s 1st Toastmaster speech is called an Icebreaker and is a way to introduce a new Toastmaster to the other members of the club. In my Icebreaker, I chose to introduce myself to my club by talking about just a few of the people who I feel made important historical contributions that paved my path to today.

      The transcript and video of this presentation can be found below.

      I like to describe myself as the kind of person who has a list of his favorite physicists and favorite mathematicians. The thought being that just knowing I have such a list tells you everything you really need to know about me. Today I'd like to tell you a little bit more about me, to go a little bit deeper, and tell you about me by telling you about just a few of the people on my list and why I find them so fascinating and so important.

      We start in ancient Greece in the 4th century BCE. Democritus of Abdera develops a theory of the composition of matter in the universe that is based on what he calls "atoms". These atoms are physically indivisible, always in motion, and have a lot of empty space in between. He is the first person to develop a theory like this, of the creation of the universe and the existence of the universe in a way that is explainable, that is predictable, that we can understand. As such, many people consider him to be the first scientist. It is this reasoning, that the universe is knowable, that has made all technological advancement that we've had since, possible.

      One such advancement came in 1842 so let's jump forward from the first scientist to the first computer programmer. Charles Babbage has created his Analytical Engine, and Ada, Countess of Lovelace, translates an article on using that machine to calculate the Bernoulli numbers which was a well known mathematical sequence. She created notes on this article that describe the inputs and instructions and the states of all the registers of the machine at each point in the process. This, deservedly so, is considered to be the first ever computer program. But more than even creating the first program, Ada Lovelace recognized the capabilities of these machines. She recognized that they could be more than just machines that analyze numbers, they could analyze anything that could be represented by numbers. She predicted that they could be used to compose music, create graphics, and even be usable in scientific experiments. This recognition of the computer as a general purpose tool, rather than just as a fancy calculator, is what made all of society's advancements that were based on computers and computer processing, possible.

      There are many other people on my list that I'd like to talk about: Nikola Tesla and Alan Turing; Grace Hopper and Albert Einstein.

      But there are really two modern physicists that played a greater role than any in my path to today.  The first of those is Carl Sagan.  Dr. Sagan had the ability to communicate in a very accessible way his almost childlike awe and wonder of the cosmos.  He combined the resources and knowledge of a respected scientist with the eloquence of a teacher and a poet, and made science and scientific education available to an entire generation as it never had been before. 

      Perhaps the most significant reason though that Carl Sagan has become important to me, especially in the last few years, is that he reminds me, quite powerfully, of number one on my list, my favorite physicist of all, my father Hal Stahl, who passed away on this very day, two years ago. Dad's specialty was optics, he loved to play with light and its properties. He also loved math and its power to explain the concepts in physics.  Like my father I love how math, especially calculus, makes the calculations of practical things feasible. So much so, that had I recognized the power of physics combined with calculus, before I learned to make computers do my bidding, my career might have taken a slightly different path.

      I hope I have given you a few insights into my worldview through the lens of those I idolize.  I like to think that my list shows the value I place on education, especially STEM. It also shows that I recognize the value of collaboration and understand how much of what we do depends on those who came before us.  Isaac Newton famously stated, "If I have seen farther [than others] it is by standing on the shoulders of giants."  My list of the giants on whose shoulders I stand can be found on Twitter @bsstahl. To me that list represents just a few of the many without whom our work and our world would not be possible.

      Categories: Event | Posted by bsstahl on 12/11/2017 10:25 PM

      The slide deck for my presentation “Building AI Solutions with Google OR-Tools”, as delivered at SoCalCodeCamp Los Angeles 2017, is available below.

      As a reminder, a video of the same session delivered at NDC Sydney in August of 2017 is available on YouTube.

      Categories: Event | Posted by bsstahl on 10/15/2017 6:37 PM

      Another great Desert Code Camp is in the books. A huge shout-out to all of the organizers, speakers & attendees for making the event so awesome.

      I was privileged to be able to deliver two talks during this event:

        • A Developer’s Survey of AI Techniques: Artificial Intelligence is far more than just machine learning. There are a variety of tools and techniques that systems use to make rational decisions on our behalf. In this survey designed specifically for software developers, we explore a variety of these methods using demo code written in C#. You will leave with an understanding of the breadth of AI methodologies as well as when and how they might be used. You will also have a library of sample code available for reference.

        • AI that can Reason "Why": One of the big problems with Artificial Intelligences is that while they are often able to give us the best possible solution to a problem, they are rarely able to reason about why that solution is the best. For those times where it is important to understand the why as well as the what, Hybrid AI systems can be used to get the best of both worlds. In this introduction to Hybrid AI systems, we'll design and build one such system that can solve a complex problem for us, and still provide information about why each decision was made so we can evaluate those decisions and learn from our AI's insights.

      Please feel free to contact me on Twitter with any questions or comments on these or any of my presentations.