Tags: api, coding practices, code sample, development, generics, presentation, services, skill, SOA, speaking, visual studio
Posted by bsstahl
10/12/2015 10:15 PM
If you are building an API for other Developers to use, you will find out two things very quickly:
- Developers don't read documentation (you probably already know this).
- If your API depends on its documentation to get developers to understand and discover its features, it is likely that it will not be used.
Fortunately, there are some simple mechanisms for wrapping complex APIs and making their functionality both easy to use and highly discoverable. An API that uses tools like IntelliSense in Visual Studio to make its features discoverable by the downstream developer is far more likely to be adopted than one that doesn't. In recent years, additions to the C# language have made it a simple process to create a Domain Specific Language with a fluent syntax for nearly any API.
Create the Context
The 1st step in simplifying any API is to provide a single starting point for the downstream developer to interact with. In most cases, the best practice is to use the façade pattern to define a context that holds our entity collections. Each collection of entities becomes a property on the context object. These properties all return an IQueryable<Entity>. For example, in the EnumerableStack demo solution on GitHub (https://github.com/bsstahl/SimpleAPI), I created an object Bss.EnumerableStack.Data.EnumerableStack to provide this functionality. It has two properties, Posts and Questions, each of which returns an IQueryable<Post>. It is these properties that will be used to access the data from our API.
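A minimal sketch of such a context might look like the code below. This is illustrative only: the PostData.LoadAll helper is a stand-in for whatever data source the real demo wraps, and the Questions filter simply follows the p.Parent == null convention used later in this post.
// Façade context: one entry point, with entity collections exposed as IQueryable<Post>
// (requires System.Linq; PostData.LoadAll() is an illustrative stand-in for the real data source)
public class EnumerableStack
{
    private readonly IEnumerable<Post> _posts;

    public EnumerableStack()
    {
        _posts = PostData.LoadAll();
    }

    // Each entity collection is a property returning IQueryable<Post>
    public IQueryable<Post> Posts
    {
        get { return _posts.AsQueryable(); }
    }

    // Questions here are posts with no parent, per the convention used later in this post
    public IQueryable<Post> Questions
    {
        get { return _posts.AsQueryable().Where(p => p.Parent == null); }
    }
}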
The context object, on top of becoming the single point of entry for downstream developers, also hides any complexities in the construction logic of the underlying data source. That is, if there is any configuration or other setup required to access the upstream data provider (such as web service access or database connections), much of the complexity of that construction can be hidden from the API user. A good example of this can be seen in the FluentStack demo solution from the same GitHub repository. There, the Bss.FluentStack.Data.OData.FluentStack context object wraps the functionality of constructing the connection to the StackOverflow OData web service.
Extend Our Language
Now that we have data to access, it's time for us to extend our domain specific language to provide tools that make accessing this data simpler for the API caller. We can use extension methods on IQueryable<Entity> to create custom filters for our data. By creating extension methods that accept IQueryable<Entity> as a parameter and return the same, we can create methods that can be chained together to form a fluent syntax that performs complex filtering. For example, in the EnumerableStack solution, the Questions, WithAcceptedAnswer and TaggedWith methods found in the Bss.EnumerableStack.Data.Extensions module can all be used to execute queries on the data exposed by the properties of our context object, as shown below:
var results = new EnumerableStack().Posts.WithAcceptedAnswer().TaggedWith("odata");
In this case, both the WithAcceptedAnswer and TaggedWith filters are applied to the data. The best part about these methods is that they are visible in IntelliSense (once the namespace has been brought into scope with a using statement), making the functionality easy to discover and use.
Another big advantage of creating these extension methods is that they can hide the complexity of the lower level API. Here, the WithAcceptedAnswer method is wrapping a where clause that filters for those posts that have an AcceptedAnswerId property that is non-null. It may not be obvious to a downstream API consumer that the definition of a post with an "accepted answer" is one where the AcceptedAnswerId has a value. Our API hides that implementation detail and allows the consumer to simply request what is needed. Similarly, the TaggedWith method hides the fact that the StackOverflow API stores tags in lower-case, within angle-brackets, and with all tags on a post joined into a single string. To search for tags, the consumer would need to know this, and take all appropriate actions when searching for a tag if we didn't hide that complexity in the TaggedWith method.
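Based on the descriptions above, sketches of these two filters might look something like the code below; the actual implementations in Bss.EnumerableStack.Data.Extensions may differ in detail, and a Post entity with AcceptedAnswerId and Tags members is assumed.
public static class PostExtensions
{
    // A post has an "accepted answer" when its AcceptedAnswerId holds a value
    public static IQueryable<Post> WithAcceptedAnswer(this IQueryable<Post> posts)
    {
        return posts.Where(p => p.AcceptedAnswerId != null);
    }

    // Tags are stored lower-cased, wrapped in angle-brackets, and joined into a
    // single string, so the filter formats the requested tag the same way
    public static IQueryable<Post> TaggedWith(this IQueryable<Post> posts, string tag)
    {
        string formattedTag = "<" + tag.ToLowerInvariant() + ">";
        return posts.Where(p => p.Tags.Contains(formattedTag));
    }
}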
Simplify Query Predicates
A predicate is a function that accepts an entity as a parameter, and returns a boolean value. These functions are often used in the Where clause of a query to indicate which objects should be included in the result set. For example, in the query below
var results = new EnumerableStack().Posts.Where(p => p.Parent == null);
the function expression p => p.Parent == null is a predicate that returns true if the Parent property of the entity is null. For each entity passed to the function, the value of that property is tested, and if null, the entity is included in the results of the query. Here we are using a Lambda Expression to provide a delegate to our function. One of the coolest things about Linq is that we can now represent this expression in a variable of type Expression<Func<Entity, bool>>, that is, a Lambda expression of a function that takes an Entity and returns a boolean. This is pretty awesome because if we can store it in a variable, we can pass it around and enable extension methods like this one, as found in the Asked class of the Bss.EnumerableStack.Data library:
public static Expression<Func<Post, bool>> InLast(TimeSpan span)
{
    return p => p.CreationDate > DateTime.UtcNow.Subtract(span);
}
This method accepts a TimeSpan object and returns a Lambda Expression usable as a predicate. The input TimeSpan is subtracted from the current UTC DateTime value and compared to the CreationDate property of a Post entity. If the creation date of the Post is more recent than that point in time, the function returns true. Since this InLast method is static on a class called Asked, we can use it like this:
var results = new EnumerableStack().Questions.Where(Asked.InLast(TimeSpan.FromDays(30)));
This will return questions that were asked in the last 30 days. It becomes even simpler to understand if we add a method extending int, called Days, that returns a TimeSpan, like this:
public static TimeSpan Days(this int value)
{
    return TimeSpan.FromDays(value);
}
allowing our expression to become:
var results = new EnumerableStack().Questions.Where(Asked.InLast(30.Days()));
Walking through the Process
In my conference sessions, Simplify Your API: Creating Maintainable and Discoverable Code, I walk through this process on the FluentStack demo code. We take a query created against the StackOverflow OData API that starts off looking like this:
var questions = new StackOverflowService.Entities(new Uri(_serviceRoot))
.Posts.Where(p => p.Parent == null && p.AcceptedAnswerId != null
&& p.CreationDate > DateTime.UtcNow.Subtract(TimeSpan.FromDays(30)));
and convert it, one step at a time, to this:
var questions = new FluentStack().Questions.WithAcceptedAnswer().Where(Asked.InLast(30.Days()));
a query that is much simpler, easier to understand, easier to create and easier to maintain. The sample code on GitHub, referenced above, and available at http://github.com/bsstahl/SimpleAPI, contains the FluentStack.sln example which shows how to simplify an API created with an OData source. It also contains the EnumerableStack.sln project which walks through the same process on a purely enumerable data source, that is, an implementation that will work with any collection.
Have you used these tools to simplify an API for downstream programmers? Do you have other techniques that you use to do the same, similar, or additional things to make your APIs better? If so, Tweet it to me @bsstahl and let's keep the conversation going.
One of the things I do to take better control of my online presence is to use a different email address for every online service that I use. I do this for 3 main reasons:
1. To Reduce Spam
If a single alias starts receiving spam, I have a number of options:
- Since the alias is only used with 1 service, I can create a new alias for that service, update my profile on their website, and delete the old alias
- If I no longer feel like I need to receive email from that service, I can simply delete the alias.
I can also determine if a company is selling my email address to spammers. If I have to recreate the alias for a single service more than once or twice due to spam, I can probably assume they are selling my address and take the appropriate steps. Finally, since I am using non-standard email addresses (not myname@, info@, etc.) they are harder for spammers to guess and therefore less susceptible to spam.
2. To Help Prevent Companies from Tracking Me Across Sites
One of the ways companies can line up data about me across multiple services or websites is by my email address. Since many people use the same email address across all services, it can become an easy way to be confident that a user of one site is the same person as the user of another site. It is common today for a single company to have many different brands and properties, and to combine data from all of them (or sell that data to others) in order to learn more about us. As a result, it can be a benefit to our privacy if we use a different email address for each.
That being said, it should be noted that there are a number of other ways companies can track us across sites, and using different email aliases is just one of several steps you should take to protect your privacy. Perhaps I will make this the subject of a future post.
3. To Help Protect Me in Case of Data Breach
Perhaps the most important reason for using a different email alias for every service is that eventually, my data at one or more of these services will be compromised. Like the companies that legally have access to my data, hackers can use their illegally obtained data to try to match up my accounts across multiple breaches, or across multiple sites. For example, a single data set can provide thousands of email/password combinations that can be tried at common sites like Twitter, or at banking, government and other key service sites. It makes sense that we do everything we reasonably can to protect our own information since we can't assume that the companies holding it will be able to protect it forever.
Pick a Method and Use It
I recommend using Outlook.com to create email aliases since that service allows you to create truly distinct aliases and tie them to the same account. Gmail can also create many aliases per account, but they all start with the same base address and just end with a plus sign and then the unique portion of the alias (e.g. myaccount+Guid1@gmail.com and myaccount+Guid2@gmail.com both deliver to myaccount@gmail.com). This is better than nothing, but the pattern is easily identifiable and can be filtered out using software.
A good pattern is to use GUIDs as the email addresses. That is, an address like B99C3900-157A-45F7-AD20-67EF83ED6776@outlook.com or B99C3900157A45F7AD2067EF83ED6776@outlook.com will almost always be available and is practically impossible to guess. If you create a number of such aliases and keep them with you, perhaps in a OneNote notebook, you will have functional email addresses to give whenever you are asked for a new one. Then you just need to associate that alias with the service in your notebook so you know not to use it again, and so you know where each alias was used.
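If you happen to live in code anyway, generating aliases in either of those formats is a one-liner; for example, in C#:
// "D" produces the dashed form, "N" the compact form shown above
string dashedAlias = Guid.NewGuid().ToString("D").ToUpper() + "@outlook.com";
string compactAlias = Guid.NewGuid().ToString("N").ToUpper() + "@outlook.com";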
Do you have a recommendation of an email service or alias pattern that has worked well for you? Sound off on Twitter using the hashtag #OneAliasPerAccount.
I am really looking forward to October because I have 3 awesome events that I’ll be speaking, and learning, at:
- The first event for the month is Code Camp NYC in Manhattan on October 10th. I have attended this event once before and loved it. I’m really looking forward to being there again.
- Next up is Atlanta Code Camp on October 24th. This will be my 1st time at this event, and my 1st time in Atlanta in many years. Hopefully, people will have some helpful suggestions for what to see and do when I am not at the Code Camp.
- Finally, I’ll be speaking at .NET Group – Southern Nevada’s .NET User Group in Las Vegas on October 29th. I’ve spoken in Las Vegas at the Code Camp there before, but have never had the privilege of attending their user group.
I have several other event possibilities in the works for November and beyond. I’ll announce them here periodically, but you can always see my schedule, as well as past events and the talks I am currently giving, using the “Speaking Engagements” link above.
While I was working on my last post, I experimented with some visualizations that I thought might help make my point a bit more clearly. I didn’t end up using them, but the whiteboard exercise that I went through in developing them helped me organize my thoughts, and, I believe, resulted in a better article.
Once I had drawn things out the way I wanted them, I did what many people do with a whiteboard: I took a photo of it for my notes. The image above shows what resulted. As you can see, it isn’t a bad rendering, although certainly not perfect. The words and structure are both clearly visible and easily readable, but there is nothing all that impressive about it on its own. After all, there are a number of apps out there which can convert a photo of a whiteboard to a similar image. The part where it becomes interesting is when you see the original source photo, shown below.
You see, I was working on the post from my hotel room, and my “whiteboard” was the hotel window. Despite all of the background clutter, I didn’t have to do anything special to get the whiteboard image. I just did what I always do: opened Office Lens, selected whiteboard, and took a picture. The app did the rest. Not only that, but once I saved the image, the app automatically uploaded it to OneNote so that, by the time I got back to my laptop, I already had a synced copy ready to be dragged into the appropriate notebook. Plus, since my phone is set to sync my photos to OneDrive, I already had a copy of both the original image and the whiteboard image in my OneDrive Camera Roll. All of this is configurable of course. If you want, Office Lens will just save the images to your phone. But for me, the OneNote integration is a huge time-saver.
Oh, and by the way, it can also function as a document and business card scanner. Magic!
Office Lens is a free app from Microsoft that is available on all major phone platforms.
Code Coverage has been the topic of a number of conversations lately, most recently after the last Southeast Valley .NET User Group meeting where Jeremy Clark presented his great talk, Unit Testing Makes Me Faster. During this presentation, Jeremy eponymized, on my behalf, something I've been saying for a while, that the part of an application that you don't need to test is the part that your users don't care about. That is, if your users care about something in your application, you should be writing tests that ensure that the users' needs are fulfilled by your code. This has never really been a controversial statement, just one that sometimes gets lost in the myriad of information about unit testing and test driven development.
Where the conversation got really interesting was when we started discussing what should happen if you decide that a piece of code really isn't important to your users. It is my assertion that code which is deemed unimportant enough to the user that it might not be tested should be removed from the project, even if it is part of a standard implementation. I will attempt to justify this assertion by using the example of a property implementation that supports the INotifyPropertyChanged interface.
In this example, we have a property getter and setter. The getter simply returns the value stored in the internal member. However, the setter holds some actual logic. In this case, the new value being set is compared to the current value of the property. If the property value is changing, the update is made and a method is called that fires a notification event indicating that the value of the property has changed. This is a fairly common implementation, especially for View-Model layer code.
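Reconstructed from that description, the property looks something like the code below (the property name and the RaisePropertyChanged helper are illustrative):
private string _name;
public string Name
{
    get { return _name; }
    set
    {
        // The conditional under discussion: skip the assignment and the
        // notification if the value is not actually changing
        if (_name != value)
        {
            _name = value;
            RaisePropertyChanged("Name");  // fires INotifyPropertyChanged.PropertyChanged
        }
    }
}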
Decision: Do my users care about this feature?
The conditional in this code is designed to skip the assignment and the change notification if the property value is not really changing. If we were to eliminate the conditional, it would impact the users of this code in the following ways:
- A few CPU cycles may be wasted on an assignment that isn't doing anything
- An event indicating the property was changed would fire incorrectly
In the vast majority of cases, the performance hit from item 1 is trivial and can be ignored. Item 2, however, is a bit more complicated. Unless I know for certain that firing the event when the property is not really changing isn't a problem, I have to assume it is a problem, since there are any number of things that could happen as a result of having an event fire. Often, when this event fires, it will cause a refresh of the data bound to the UI elements. This may or may not have a significant impact on performance. There may also be additional actions taken by the programmers consuming this event that may not be foreseeable when designing this layer. If the circumstances are such that I know there will be no problems if the event fires more often than it should, then I can probably conclude that my users don't care about this code. In all other circumstances, I should probably conclude that they do.
Decision: Should I remove this code?
If I have concluded that my users care about the code, then my path is clear, I should leave the code in place and write tests to make sure that the event fires when it should, and only when it should. However, if I have concluded that my users don't care about this particular code, then I have another decision to make. I need to decide if I should leave the code untested but in place, remove the code from my project, or leave it in and write tests for it anyway.
If the feature is not important to the users and there is no likelihood that the feature will become important to the users in the future, then the code should not be there. Period. We cannot waste time and effort supporting code that our users will not need. Scope-creep is a real danger to any project and should be avoided at all costs, even on the small stuff. Lots of small stuff adds up to big stuff, especially over the lifespan of any non-trivial application.
So, if the features are important to the users, we test them; if they are unimportant to the users, we remove them. No controversy here. The questions come in when there is a likelihood that the feature could become important in the future, or if the feature is important to someone other than the users, such as the developers.
Suppose we decide that the users are likely to request this feature in the future. Wouldn't it be easier just to implement the feature now, when we are already in the code and familiar with it? My answer to this is to fall back on YAGNI. You Ain't Gonna Need It has proven itself a valuable principle for preventing scope-creep. Even if you think it is pretty likely that you'll need something later, the reality is that you probably won’t. Based on this principle, we should not be putting features into our projects that are not needed right now.
But what about the situation where code is important to someone other than the users, for example, the developers? In this case, we have to decide if the code really is important, or if it is just another case where the YAGNI principle should be applied. Technical requirements can be legitimate, but any requirement that is not directly in support of the users' needs is a smell that should be investigated. In the case of our property setter, saying that standardization is important, and using that logic to make standardization a requirement, sounds a lot like saying "I think this feature may be important someday" and it probably falls to YAGNI to keep it out of our code. That being said, if there is a technical requirement that is truly needed, it should be tested like any other important requirement. For a little more information on this, see my earlier analysis Conflict of Interest: Yagni vs. Standardization.
How About We Leave It In and Just Not Test It?
It is important to remember that we shouldn't simply leave code untested in our production code, even if the users don't really care about it right now. If we do so, and the feature becomes important in the future, we will almost certainly end up with code that is important to our users, but is untested and therefore at-risk. We are unlikely to go back into an application and just add tests for a feature that already exists simply because that feature is now important when it wasn't earlier. We'd like to think we would, but the fact is that we won't. No, leaving the code in the application, but untested, is not an option.
The Case for 100% Code Coverage
So, we want to remove any code that is not currently required by our users, and test all code that is truly needed. If you have come along with me on this you may now realize that 100% code coverage is actually a reasonable goal, since that would be the result of removing all unneeded code and testing all needed code. This is not to say that it is reasonable to use Code Coverage as a metric with which to judge a development team, but instead it should be considered as a tool that can help identify scope-creep and missing tests. Since we are testing all code that our users care about, and not adding any code that the users don't care about, we should expect to approach 100% code coverage in order to have a good chance of producing well-tested, maintainable code that gives us the flexibility and confidence to refactor ruthlessly.
Code Coverage sometimes gets a bad reputation because it can be easy to game. That is, it is not a good metric of success for a development team. However, it is a magnificent tool to help you identify places where tests are missing. It won't tell you where your tests are not doing what they need to do, but it will tell you when you have a piece of code that is not exercised by any tests. If you are a TDD (Test-Driven-Development) practitioner, as I am, Code Coverage will tell you when you’ve gotten ahead of yourself and written code before writing a test for it. This is especially valuable for those who are just learning TDD, but never loses its value no matter how experienced you are at TDD.
Continue the Conversation
How do you feel about this logic? Did I miss something critical in this analysis? Have you found something different in your experience? Let's keep this conversation going on Twitter. Tweet me @bsstahl with your comments, or post on your blog and tweet me the link.
I've released a new Open-Source library of extension methods that can be used to create more effective unit and integration tests. This library is called TestHelperExtensions. The source code is available on GitHub (pull requests welcome), a .NET 4 package is available via NuGet, and the documentation is available here. The goal is to allow anyone to have access to the same set of test helpers I have been using, and building up, for many years.
I have been giving Test Driven Development (TDD) sessions at code camps and conferences for a number of years. During those sessions, I spend a lot of time in code, building up a test suite for a production application, and demonstrating the process I use for TDD. Part of this process is using a set of extension methods to perform common tasks, such as generating test data, and doing comparisons of DateTime values. Many people have asked for access to this library during these sessions and my answer has always been the same, "you can grab it from the sample code". Now, I've decided to make it easier for anyone to include it in their projects via NuGet, and to allow the community the opportunity to extend and modify the library on GitHub.
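As a purely illustrative example of the kinds of things those helpers do (generated test data and tolerant DateTime comparisons), a test might use them something like the snippet below; the method names and the code under test here are hypothetical, not necessarily the library's actual API, and an MSTest-style Assert is assumed.
// Illustrative only -- see the TestHelperExtensions documentation for the real API
int orderCount = 100.GetRandom();                            // hypothetical random-test-data helper
DateTime requested = DateTime.UtcNow;
var order = OrderProcessor.Create(orderCount);               // hypothetical code under test
Assert.IsTrue(requested.EqualsWithinASecond(order.CreatedOn)); // hypothetical tolerant DateTime comparison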
I still have a small backlog of features I'd like to add to this tool. After that, it's up to you what happens with it. If you have a feature suggestion, please let me know. Twitter is the best place to start a conversation about this, or any development topic, with me. You can also create an issue on GitHub, or simply submit a pull request. I'd love to hear how you are using this library, and anything that can be done to make it more effective for you.
There was some confusion last week at the SoCalCodeCamp about what the phrase “One Reason to Change” actually means. As you probably know, the Single Responsibility Principle states that every class should have one and only one responsibility within the system. A common check for adherence to this principle is that the object has only one reason to change. However, it is important to realize that this is referring to the code (the class), not the state of the object (the instance). The state of the object may have many reasons to change, but we as developers should have only one reason to change the code for our objects. For example, if the object is in the business-rules layer, we should only have to change the code if the business rules change. Likewise, if the object is in the data tier, it should only need code changes if the structure of the data changes.
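A tiny, hypothetical business-rules class may help make the distinction concrete:
// Hypothetical business-rules class. Its instance state (the properties) may
// change constantly at runtime, but the code has exactly one reason to change:
// a change to the discount business rules.
public class DiscountCalculator
{
    public decimal OrderTotal { get; set; }          // state -- changes with every order
    public bool IsPreferredCustomer { get; set; }    // state -- changes per customer

    public decimal CalculateDiscount()
    {
        // Only a business-rule change should force an edit to this logic
        return IsPreferredCustomer ? OrderTotal * 0.10m : 0m;
    }
}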
Posted by bsstahl
6/30/2015 4:45 AM
While this was an interesting and somewhat novel approach, it turned out to have a few fairly significant drawbacks:
- Using this hybrid approach meant there were two runtimes that had to be initialized and operating during execution, a costly drain on system resources, especially for mobile devices.
- Applications built using this methodology would run well on Windows 8 and 8.1 machines, as well as Windows Phone devices, but not on the web, or on Android or iDevices.
So, it seems that it is time for me to move to a more standard front-end development stack. I need one that is cross-platform, ideally providing a good deployment story for web, PC, tablet & phone, and supporting all major platforms including Android, iDevices & Windows phones and tablets. It also needs to be standards-based, and work using popular frameworks so that my apps can be kept up-to-date with the latest technology.
I’ll keep you informed of my progress and let you know if this does indeed turn out to be the best way for me to build apps. Stay tuned.
File this post under saving you some time that I spent worrying.
Recently, I started having problems with my OneNote notebooks not syncing on my primary laptop. Or at least, that’s what it seemed like was happening. I depend a lot on OneNote since I use it for all of my notes, on all of my devices, so this was a very big deal for me. The notebooks all had the little red “not-synced” icon on them, so I would request a manual sync. OneNote went through the process and looked like it was syncing, but at the end, all of the notebooks still had the little red icon on them saying they weren’t current.
The problem turned out to be that I had accidentally changed the radio-button at the top of the sync dialog from “Sync automatically whenever there are changes” to “Work offline - sync only when I click ‘Sync All’”. As a result, the notebooks always looked as if they were not up-to-date (because they might not have been) and were listed as “Not connected”. Of course, if I had looked at the last sync time, I would have seen that all notebooks had been synced as of the last manual sync. Everything was working just fine, I had just changed the setting, making it so my notebooks only synced when I forced it. By changing the radio button setting back to “Sync automatically…”, everything worked as I expected.
To allow ourselves to create the best possible services for our clients, it is important to make those services as flexible and maintainable as possible. Building services in an agile way helps us to create better services; however, it makes it more likely that our service interface will, at some point, have to change. Changing a service interface after publication is, and should be, a well-gated, well-thought-out process. By changing the interface, you are changing the contract your service has with all of your clients, and you are probably requiring every one of the service consumers to change. This should not be done lightly. However, there are a few things that can be done to minimize the impacts of these changes. Several of these things require agreements with the clients up front. As a result, these items should be included in the Service Level Agreement (SLA) between the service providers and the consumers.
Caveat: I am a solution architect, not an expert in creating service level agreements. Typically, my only involvement with SLAs is to object when I can’t get what I need in one from a service provider. My intent here is to call out a few things that all service providers should include in their SLAs to maintain the flexibility of their APIs. There are many other things that should be included in any good SLA that I will not be discussing here.
The two items that I believe should be included in all service SLAs are the requirements that the clients support both Lax Versioning and Forward Compatible Contracts. Each of these items is discussed in some detail below.
Lax Versioning
Lax Versioning allows us to add new, optional members to the data contract of the service without that change being considered a breaking change. Some modern service frameworks provide this behavior by default, and many of the changes we might make to a service fall into this category. By reducing the number of changes that are considered breaking, we can lessen the burden on our implementation teams, reduce coordination requirements with service consumers, and shorten the time to market of these changes.
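For example, in a WCF-style data contract, a new member marked as optional does not break existing consumers. The Employee contract below is illustrative only, and anticipates the address example used in the next section:
// Requires System.Runtime.Serialization. WorkAddress is the V2 addition;
// because it is optional, V1 messages that omit it still satisfy the contract.
[DataContract]
public class Employee
{
    [DataMember(IsRequired = true)]
    public string Name { get; set; }

    [DataMember(IsRequired = true)]
    public Address HomeAddress { get; set; }

    [DataMember(IsRequired = false)]
    public Address WorkAddress { get; set; }
}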
One of the major impacts that Lax Versioning has is that it requires us to either avoid schema validation altogether, or to use specially designed, versionable schemas to do our validation. I recommend avoiding schema validation wherever reasonable and possible.
Forward Compatible Contracts
Forward Compatible Contracts, also known as the Round-Tripping of Unknown Data, requires that the service round-trip any additional data it gets, but doesn’t understand, back to the client and that clients round-trip any additional data they get, but don’t understand, back to the server. This behavior reduces the coupling between client and server for changes that are covered by Lax Versioning, but need to retain the additional data throughout the call life-cycle.
For example, suppose we were to version a contract such that we added an additional address type to an employee entity (V1 only has a home address, V2 has home and work addresses). If we change the service to return the V2 employee prior to changing the client, the client will accept the additional (optional) address type because we have already required Lax Versioning, but it will not know what to do with the information. If a V1 client without round-tripping support sends that employee back to the server, the additional address type will not be included. If, however, the V1 client supports this round-tripping behavior, it will still be unable to use the data in the additional address field, but will return it to the server if the entity is sent back in a subsequent call. These behaviors with a V1 client and a V2 service are shown in the diagram below.
If the same practice is used on the server side, then we can decouple the client and server from many implementation changes. Clients would be free to implement the new versions of contracts as soon as they are ready, without having to wait for the service to roll-out. Likewise, many changes at the service side could be made knowing that data sent down to the clients will not be lost when it is returned to the server.
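In WCF, for example, this round-tripping behavior is available on both sides by implementing IExtensibleDataObject on the contract: the serializer stores members it doesn't recognize in the ExtensionData property and writes them back out when the object is serialized again. A V1 version of the illustrative Employee contract from above might look like this:
// V1 contract with round-tripping support (System.Runtime.Serialization).
// A V2-only member such as WorkAddress arriving at this client is captured
// in ExtensionData and returned intact on subsequent calls.
[DataContract]
public class Employee : IExtensibleDataObject
{
    [DataMember(IsRequired = true)]
    public string Name { get; set; }

    [DataMember(IsRequired = true)]
    public Address HomeAddress { get; set; }

    public ExtensionDataObject ExtensionData { get; set; }
}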
Making changes to the contract of existing services is a process that has risk, and requires quite a bit of coordination with clients. Some of the risks and difficulties involved in the process can be mitigated by including just 2 requirements in the Service Level Agreements of our services. By requiring clients to implement Lax Versioning and making our contracts Forward Compatible, we can reduce the impact of some changes, and decouple others such that we significantly reduce the risk involved in making these changes, and improve our time-to-market for these deployments.