You have the chance to get my book for half price tomorrow. Use code dotd0416au at http://www.manning.com/logan/
The process of collaboration and development on a project is important. The means by which you get your code into a deliverable state matters, and matters a lot. The method of delivery and what constitutes a deliverable state change from company to company, team to team and even project to project, but there is always a point at which you want to 'make something available', i.e., publish it.
The Canonical Repository is the repository we deliver code from. It serves as the central reference point, the tip of development for the project. Being the point of global collaboration, it is sacrosanct. No one pushes Work In Progress (WIP) code to this repository and no single person owns it. It exists to hold the history of the project that is currently under development and to be the point of coordination for accepted code in the team.
There is a bit of a mind hack going on here that I encourage you to keep intact: the 'sacredness' of canonical. It should never become something that a Developer pushes to as part of his daily workflow. It should always carry the sense that things that go into it are important. In my experience this vastly reduces the screw-ups, bugs, bad pushes etc. that can be so painful to a team. It also helps encourage both the Developer and the code reviewer to take their jobs seriously without forcing a lot of process onto them. That lack of strict, painful process is what makes this approach so powerful.
Developers work in Forks of the Canonical Repository. They create
branches, work on experimental changes, code to meet User Story
requirements of the system and collaborate with each other through
these Forks. They use these forks as the point of coordination with
other Developers that they are working on a User Story with. This type
of Code sharing (during the development process) should happen via
direct pulling from the forks of their peers. That is, Developers
wanting to use code from a peer should use git fetch or git pull
to gather the code they are interested in from their peers'
forks. They shouldn't use Pull Requests. Pull Requests are a
'heavyweight' construct that should really only be used to get code
into canonical.
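As a rough illustration of that peer-to-peer exchange, here is a minimal sketch that only builds the git command lines involved. The peer name, fork URL and branch name are hypothetical, and the helper is something I made up for this post rather than part of any tool.

```python
import shlex


def peer_fetch_commands(peer, fork_url, branch):
    """Return git commands that pull a peer's branch directly from their fork.

    The peer name, fork URL and branch are hypothetical inputs; nothing here
    touches canonical or opens a Pull Request.
    """
    return [
        # Register the peer's fork as a named remote (one-time setup).
        ["git", "remote", "add", peer, fork_url],
        # Download the peer's branches into remote-tracking refs (peer/<branch>).
        ["git", "fetch", peer],
        # Start a local branch from the peer's work so you can build on it.
        ["git", "checkout", "-b", f"{peer}-{branch}", f"{peer}/{branch}"],
    ]


commands = peer_fetch_commands(
    "alice", "https://example.com/alice/project.git", "user-story-42"
)
for command in commands:
    print(shlex.join(command))
```

Running the printed commands in your own fork gives you the peer's work on a local branch, with no Pull Request involved.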
There is a second big mind hack going on here: nothing in the Developer's repo matters until the Developer says it matters by creating a Pull Request. We want the Developer to be productive; we want him to use the tools that work for him and the process that is most comfortable for him. So we place no restrictions on how he edits, manages and commits code in his own repository. The only thing that matters is that the code he submits for inclusion in canonical via a Pull Request meets the standards the team has set, as verified through peer reviews, automated test suites and the like.
We want him to explore freely. We want him to be the most productive that he can be using the tools that he is comfortable with. We don't want him to worry about what impact that exploration will have on peers or if someone is going to be looking over his shoulder trying to validate the quality of code that may never actually get into canonical.
We don't impose tools or process on Developers in their own repo. They can code using any process they would like, using any editor, compiler, platform etc. It doesn't matter in the least as long as the output meets the team's standards. Code that exists in the Developer's repo is simply potential. That is, it doesn't matter until it is resolved by creating a Pull Request. It has no impact on the world (i.e. the project/team/organization) until the Developer feels that it is ready, until he explicitly converts it from Potential to Realized via a Pull Request. Think of the Developer's repository like Schrödinger's box and the code like Schrödinger's cat. No one should know or care what state the code is in until the Developer opens the box by creating a Pull Request.
In realizing potential code, there are always two parties to the process: the Developer or Developers producing the code, and the Developer who is going to review and validate that code. Let's get started with the Developer's side.
The Developer puts the completed code onto a dedicated branch in his repository and refactors that to meet the standards of the project. He should ensure that the following invariants hold.
Once the Developer creates a Pull Request, a teammate is selected to do the review. There are a ton of ways to select a reviewer. In my experience the best way is to just let the code reviewer self-select. As long as it's not always the same person stepping up, things should be fine. If you have put together a good, mature team this approach is, by far, the best. You could also randomly assign a teammate, or have the Developer producing the code pick the teammate who is going to review it. In the end, how the Reviewer is selected matters a lot less than the fact that you have a reviewer.
The final responsibility of the Developer is to make sure the code gets reviewed. The longer code sits out without review, the more likely it is that bit rot will occur, that merging into the Canonical branch will become painful etc. So the Developer needs to do what is necessary to make sure that code does not sit out in a Pull Request for more than a few days without any action. This may require him to respond quickly to comments from his reviewer, or it may require him to go stand over his reviewer's shoulder while the reviewer is going through the Pull Request. It depends on where the problem lies.
The Reviewer's job is to review the Pull Request to validate that it meets the project standard. This comprises a few things.
If any of these steps fail, the Reviewer should push back on the original Developer to fix the code. This may take several iterations. That is completely fine. This is normal development; we are Engineers and we want to create a well engineered, maintainable product. The above steps are the minimal steps that need to be done to ensure this standard. Sub-par code should never make it into Canonical simply because the Developer is annoyed with the process or the timelines are tight. You invariably pay more in the long run by giving in to these pressures than you gain in the short run by letting bad code pass.
When the change is reviewed and accepted, the reviewer merges the Pull Request.
Once the code is in Canonical and out of the Developer's repository, history revision, commit amends and the like should all stop. At that point your team is depending on the code and on the order of its commits, and changes to history become painful. If there is a bug or a fix that needs to be made, it should go into a new commit and go through the process outlined above.
This process is actually very smooth, but it does assume a few things:
If these things are true then you are golden. It may take a little bit of time to get a feel for the process and work around the little hiccups that will inevitably occur. However, the process should be flowing smoothly after a few weeks.
This is not intended to be a rigid process. The only place that real process actually comes into play is in the transition from the Developer's repository to the Canonical repository. This is true by design. Its purpose is to encourage the wild and woolly exchange and growth of ideas, creativity and productivity in the project wherever that is possible, while still providing enough rigidity and discipline where it is required. Getting that balance right means that you get the most creativity and productivity possible from your Developers and the most maintainable, well engineered code possible, while at the same time keeping the team happy. This is what this process seeks to encourage.
The big thing to watch out for is the urge to bypass the process. Sometimes when you are in a tight situation it can be tempting to push code directly to Canonical, bypassing the review. For example, you might have a bug in production; let's say your system is down and it's costing the company a million dollars a minute. Pressure is high and you may have a huge urge to tell the devs (or yourself if you are the dev) that the process would only slow you down and the code needs to get out right now. This is almost invariably a bad decision. This process should only take a few minutes, assuming the code is good. The likelihood that the code is bad in that situation is high, and it's much cheaper timewise to catch those problems in the review cycle than to deploy the code and realize in production that you fixed one bug but introduced another.
I have given a lot of talks over the last ten years or so, some good and some bad. Three talks from the last few years specifically stand out in my mind as bad. The first was an introductory talk that we call 'Essentials' that I gave at StrangeLoop in 2011; the second was the same Essentials talk, which I gave at the Indianapolis Java Users Group last year. The third and final talk, and the one that got me thinking about this topic, was a talk on release handling in A Coruña, Spain last week. Each of these talks was bad, far below what I consider an acceptable level. These three stand out because they were all bad for the same reason, and that reason was over-confidence.
What do I mean by that? For better or worse, I am an expert in Erlang. I have been writing Erlang code and studying the Erlang Virtual Machine and runtime system for the better part of ten years. I am confident in my knowledge of Erlang. I also have a huge amount of experience as a speaker. I have been getting up in front of people and giving talks about various topics since I was nine or ten years old. I have, specifically, been getting up in front of technical audiences for the better part of ten years and giving talks on various topics. When you are very comfortable in front of people and you know your material well, it is easy to get lazy. That's what happened in each of these cases. I got lazy, I assumed I knew the material well enough to give the talk, and I didn't put enough effort into preparation.
In the end, giving a good talk is less about knowing your material than about making sure your material is coherent. You do need to know the material, of course, but you also need to go through and make sure the information you are providing is in a form that your audience can consume. That is a very different thing than giving a talk off the cuff. So even when you know the material you need to order that material and massage it. You need to present it. Then you need to get familiar with that order and that presentation so you can integrate it into your approach.
I actually gave the Essentials talk again in A Coruña and did pretty well. All it took was going over the material once or twice, making sure I understood the order and form and the reason for that order and form. That small investment in time resulted in a huge improvement in the quality of the talk. I should have done the same thing for the Release Handling talk, but I fell into the old trap again. I won't say that I won't make this mistake again, because it's a trap that is so easy to fall into. However, it's something I am armed against now and will hopefully avoid.
The Dialyzer is a static analysis tool that identifies software discrepancies using a type inference system called success typing, loosely in the style of Hindley-Milner. Basically, it brings some of the benefits of Haskell or OCaml style type checking to Erlang. If you are not using it as part of the automated build for your system, you should be.
Dialyzer does its thing by going through all of your direct and transitive dependencies and pulling out lots of information about types, function calls etc. It then checks that that 'type universe' is consistent with the calls you are making in your target. However, all this pulling out of type information takes quite a long time, so Dialyzer has been given the ability to cache this information in files called PLT files.
PLT files are wonderful for saving you time and you should use them, but you must be careful about how you use them. In fact, most people are using them in ways that will cause them problems in the long run. It's common to have a 'Global' PLT file, either shipped with the Erlang distribution or in the user's home directory. Using Global PLT files is a lot like using global state in your code. It's a practice that leads to subtle and hard to debug errors.
Why is that? Well, it has a lot to do with the fact that in the
post-Rebar world version numbers don't actually mean what they
should. Huh? What does that have to do with Dialyzer? Give me a
second and I will explain. A lot of folks out there have the
dependencies in their rebar.config pointing to a branch or other
mutable ref instead of a tag. This means that every time they do a
rebar get-deps and pull down their deps (maybe after a clean) they
have slightly different code under the same version number. That is,
the version number is specified in the *.app or the *.app.src and that
doesn't change, but with every new commit the code associated with
that version number changes. That pretty much eliminates the value of
version numbers as useful identifiers. Unfortunately, Dialyzer and
several other OTP tools rely on those version numbers. So let's take
the case where you have a Global PLT file where you keep the cached
information about OTP Applications. Internally, this file is organized
by application and version. However, every single one of the projects
you are currently working on (if you are depending on branches or
head instead of tags) is working with a subtly different implicit
version of each dependency, all identified by the same invalid
explicit version that Dialyzer is using.
You may go a long time without this causing you problems, but at some point Dialyzer is going to think your code is wrong based on what it knows about an App, when that App is different from the one you have as a dependency, and so the warning is invalid. You won't know that, though, and you will spend a ton of time trying to figure out what is wrong and eventually give up and ignore the warning in your mind, or maybe even stop using Dialyzer.
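To make that failure mode concrete, here is a toy model of a shared cache keyed by application and version. It is only a sketch of the problem as I have described it, not of Dialyzer's real PLT format, and the dependency name, version and function names are all made up.

```python
# Toy model only: an assumption-laden sketch of the failure mode, not of
# Dialyzer's real PLT format.

def cached_exports(plt, app, version, exports_on_disk):
    """Return cached info for (app, version), analysing only on a cache miss."""
    key = (app, version)
    if key not in plt:
        plt[key] = frozenset(exports_on_disk)
    return plt[key]


# Hypothetical dependency "mydep", tracked by branch, always claiming "0.1.0".
old_checkout = {"start/0", "stop/0"}
new_checkout = {"start/0", "stop/0", "restart/0"}

# One global cache shared by two projects.
global_plt = {}
cached_exports(global_plt, "mydep", "0.1.0", old_checkout)              # project A
seen_by_b = cached_exports(global_plt, "mydep", "0.1.0", new_checkout)  # project B

# Project B's checkout exports restart/0, but the shared cache has never heard of it.
stale = "restart/0" not in seen_by_b
print(stale)
```

Project B gets back project A's view of the dependency, because the version string never changed even though the code did.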
How do we fix this? We could do it by relying on explicit tags
everywhere and making sure that we don't pollute our version space, but
that's hard to do. It's much simpler and safer to solve this problem by
treating the PLT file just like any other compile output. It should be
generated on a per project basis and kept in the 'build' area of the
project. Yes, that means it should be regenerated for each project,
and that takes a little while, but doing it this way solves all the
problems described above in a clean and simple way. You probably
already have a Makefile driving Rebar, so let's leverage that Makefile
to do the Dialyzer magic in the simplest and least intrusive way
possible. Basically, we will set up a couple of tasks to do what we
need. The first task is a simple rule that creates the PLT file; the
second task runs Dialyzer on the project's source.
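# Build the per-project PLT. The leading '-' tells make to carry on even if
# Dialyzer reports problems in the dependencies.
$(DEPSOLVER_PLT):
	- dialyzer --output_plt $(DEPSOLVER_PLT) --build_plt \
		--apps erts kernel stdlib crypto public_key -r deps

# Analyse the project source against the per-project PLT.
dialyzer: $(DEPSOLVER_PLT)
	dialyzer --plt $(DEPSOLVER_PLT) -Wrace_conditions --src src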
Another thing to note: Dialyzer will often fail on your dependencies. There are too many fools out there who are not using Dialyzer on their projects. These rules take that fact of life into account, but you will see error messages from the PLT file creation. You can ignore those; it's not your responsibility to fix errors in your upstream (though you can fix them and submit the fixes back; it's always good to be a contributor to your dependencies).
That's it. You will want to change the --apps to a list of the actual
dependencies for your project rather than the arbitrary list here. A
Dialyzer specific example makefile is
here. In my next post, I will put
up a simple general Makefile that you can use to run a Rebar project.
Every year Martin Logan, Jordan Wilberding and I put on a conference called ErlangCamp. We have done it for three years so far: Chicago in 2010, Boston in 2011 and now A Coruña, Spain. People come from all over the world to spend two days talking about engineering highly resilient systems with Erlang and OTP. They spend time with some of the most prominent engineers in the Erlang world and they learn a lot in a short amount of time. However, it isn't a conference in the normal sense. At ErlangCamp we have one goal and one goal only: to get people over the "hump" of learning OTP and how to build big reliable systems in just two days. It's much more of an intense two day workshop or training class than it is a normal conference. It's tough to keep up, but you learn a ton. Best of all, it's a very good value at just around $100.
Why is it so inexpensive? We price it that way so that people are able to come. For better or worse, it is still the case that most people in the Erlang world are doing Erlang on the side, in their personal projects, or as a small test project in their company. They don't have the financial backing or resources of their company and so end up paying for the Camp out of their own pocket. We are interested in seeing those folks learn even more. We want to see them become the seed of Erlang at whatever organization they are in, and we absolutely love to see the community grow. So we charge just enough to cover our expenses and we, more or less, donate our time and efforts to the cause of getting them up to speed on some of the most critical parts of Erlang. It's good for them, the attendees; it's good for the Erlang community; and, in the long run, it's good for everyone involved. It's a good time and you should be a part of it.
This year we are doing the conference in the Galician city of A Coruña on October 5th and 6th. A Coruña is a beautiful town on the northern coast of Spain. It is a nice vacation spot in its own right and we are getting unbelievable support from the Universidade da Coruña, especially the computer science department. This is going to be one of the liveliest ErlangCamps since we did the first one in Chicago in 2010.
This year we have also added a couple of interesting features. One of the biggest is the chance for attendees to have dinner with the Speakers. Basically, for the cost of your ticket plus the cost of a nice dinner you get to spend several hours on Friday night talking with Martin, Jordan and me about anything you would like. That is an opportunity that's worth the cost.
We are really proud of the work we have done with ErlangCamp and how these Camps have helped many people in the Erlang Community. This is your chance to become part of that, and it's a great excuse to spend a couple of days in Spain learning about one of the best and most interesting software platforms out there.
Very recently and after much experimentation I fundamentally changed the tools I use to do development. This is the first time I have made a change like this since I started coding in the mid 90s.
So far this non-traditional setup is working rather well and has made me much more mobile. Here is how I happened upon it.
A year or so ago I was working on a personal project. This project was a simple native application that was designed to run on multiple different OSes. Since I am a true believer when it comes to testing, I bought a large (at the time) 6 core AMD box with 16 gigs of memory, upgradable to 64, as a testing box. The goal was to have a box on which I could run multiple VMs for building and testing. It worked well for that, but shortly after I bought the box I found out that someone had beaten me to the punch on the project. They had built a good implementation of the very thing I was planning on building! That's life, though, and it's nice when someone else does the work for you anyway.
Concurrent to this, I started to use IRC a lot more and I wanted a persistent IRC client. So I did the usual thing and set up a server in the cloud running irssi and related bits under tmux. This led me to start using that wee little box (the cheap one you can actually afford to leave running 24/7) for things beyond IRC, like small builds, testing etc. After a few months of this I had a eureka moment. "I have this huge box sitting at home doing nothing and it is way bigger than anything I could ever afford to run in the cloud. Why am I not using that?" I said to myself with a well deserved slap to the forehead. So I proceeded to set up a Tomato router and Dynamic DNS through Namecheap, loaded up Arch on the server and started using that instead of the cloud based server I had been using.
Well, now I had this powerful box available that I was always connected to via ssh, and so I started using it. I ran builds, did development, everything you do on any other development box. I use emacs for nearly everything, so going to a terminal based emacs was no real change at all. After a bit of this I realized that my laptop had become redundant. I was only using it as a terminal and nothing more. I wondered what it would take to ditch the laptop and go to an ultra-portable setup, something that was truly just a thin client. I thought that the most interesting option would probably be a decent sized tablet, probably Android, but a full Linux if I could find it.
As with any time you are trying something new and out of the ordinary, there were lots of problems that would have to be solved to make this work. At the very least I figured the following problems would need to be solved.
This approach requires that I always be connected. In my world, that's already a pretty consistent requirement. Fortunately, in the modern world connectivity is ubiquitous, so this is really no problem. Even in the cases where there is no wifi connectivity, if you have a decent cell company (Ting!) tethering solves the last bit of connectivity requirement we might have.
While connectivity is ubiquitous, it is not always of high quality. That is, bandwidth may be lacking or congestion high. In these situations SSH is unusable. It needs a stable, strong connection to be anything like usable. I actually thought this was a deal breaker until I discovered Mosh. Mosh is SSH for the modern world of persistent but flaky, unstable, low bandwidth connections. With Mosh, using a low bandwidth connection as a pipe to your primary working box works really well. It solved this problem in a way that nothing else would have.
Output is pretty straightforward. Android tablets have a decent if small screen and they support an external monitor. So, as long as I have a tablet with a decent sized screen I should be ok. When I am on the go I can just use the tablet and when I am in the office I can hook up to a monitor for a bit more real estate.
Input is much harder than output. The simple solution is to use a hardware keyboard, but these have the negative feature of being heavy (comparatively) and bulky. I am already a bit of a geek for input devices and have been exploring them for a long time: alternative layouts, alternative formats, chording keyboards, etc. I pulled the trigger on Dvorak long ago; this just gave me a good reason to pull the trigger on one of these other methods of input. For ease of trucking around as well as longevity I chose the Twiddler2 by HandyKey.
This has solved the input problem quite well, but it is by no means a trivial solution. Learning a completely new keyboard, remapping long standing muscle memory and rebuilding speed is a slow, frustrating process. The win at the end is huge but the journey is hard.
Now that all the problems have been solved and the transition made, I have a powerful, ultra-portable setup that has the upside of being very, very inexpensive for the power available. I don't yet have enough time with this system to know if it will work out in the long term, but I will keep you apprised of my progress.
In the last few days I have gotten the question 'What are the differences between LFE and Joxa?' quite a few times. So instead of answering them individually I thought I would write up the differences here.
The primary and most important difference is in the Goals of the two languages. I believe that the primary goal Robert had in mind when implementing LFE was to provide a mutable and syntax extensible version of Erlang. This would allow people to change the language where they needed to. Also I suspect, very strongly, that Robert likes implementing languages and he had a lot of fun implementing LFE. I certainly did with Joxa. However, I had some other very specific goals when I sat down to create Joxa.
Each of these things could have been solved in Erlang. For example, I could have implemented each language using leex and yecc. However, my best experience with DSLs has always come from Lisp and building those DSLs via functions and macros in Lisp itself. At the same time, I have been using Erlang for a long time and I was very unwilling to give up the features of the Erlang VM to get those advantages from Lisp. The only solution seemed to be using a Lisp on the Erlang VM.
The obvious first choice was LFE. So I spent several weeks digging into the language and its internals. At the end I decided it did not suit my purposes and the only fallback was to create a language of my own (there was a bit of sanity questioning involved as well).
With that in mind, let's enumerate some of the major differences.
Simply put, LFE is a Lisp-2 while Joxa is a Lisp-1. According to Richard P. Gabriel, the Lisp-1 vs Lisp-2 distinction is defined as follows:
> Lisp-1 has a single namespace that serves a dual role as the
> function namespace and value namespace; that is, its function
> namespace and value namespace are not distinct. In Lisp-1, the
> functional position of a form and the argument positions of forms
> are evaluated according to the same rules.
>
> Lisp-2 has distinct function and value namespaces. In Lisp-2, the
> rules for evaluation in the functional position of a form are
> distinct from those for evaluation in the argument positions of the
> form. Common Lisp is a Lisp-2 dialect.
To give a practical example of the above description, let's say that you
have a function called hello-world that returns an atom
hello-world. To define and call that function in Joxa you would do:
(defn hello-world ()
  :hello-world)

(hello-world)
To do the same thing in LFE you would do the following:
(defun hello-world ()
  'hello-world)

(: hello-world)
note: In LFE the : serves the same purpose as funcall in Common Lisp.
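If the namespace distinction is still fuzzy, the following toy model may help. It is only a sketch of Gabriel's definition using dictionaries as namespaces; neither Joxa nor LFE is implemented this way, and the class and method names are invented for the illustration.

```python
# Toy model only: dictionaries standing in for Lisp namespaces.

class Lisp1Env:
    """One namespace: functions are ordinary values bound to names."""

    def __init__(self):
        self.names = {}

    def define(self, name, value):
        self.names[name] = value

    def call(self, name, *args):
        # The head of a form is looked up exactly like any argument would be.
        return self.names[name](*args)


class Lisp2Env:
    """Two namespaces: one for functions, one for values."""

    def __init__(self):
        self.functions = {}
        self.values = {}

    def defun(self, name, fn):
        self.functions[name] = fn

    def setq(self, name, value):
        self.values[name] = value

    def call(self, name, *args):
        # The head of a form is looked up in the function namespace only.
        return self.functions[name](*args)

    def funcall(self, name, *args):
        # A function held in the value namespace needs an explicit funcall.
        return self.values[name](*args)


lisp1 = Lisp1Env()
lisp1.define("hello-world", lambda: "hello-world")
lisp1_result = lisp1.call("hello-world")

lisp2 = Lisp2Env()
lisp2.defun("items", lambda *xs: list(xs))
lisp2.setq("items", [1, 2])          # does not clobber the function "items"
lisp2_result = lisp2.call("items", 3, 4)
```

In the single-namespace model, binding `items` to a list would replace the function of the same name. In the two-namespace model both bindings coexist, at the price of a separate calling rule for functions stored as values.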
The Lisp world has been arguing about which is better since the dawn of the universe. It very much depends on personal preference. For me, Lisp-2 is very unnatural and counterintuitive, nearly to the point where I won't code in it.
LFE, as its name implies, is a Lisp version of the Erlang language, and its intention is to provide a Lisp very close to Erlang and its semantics.
Joxa has no such intention. It is a unique language that happens to be targeted at the Erlang VM. It makes no effort to provide Erlang or Common Lisp semantics. The entire goal is to provide a tight, small, well understood functional language with a clean, approachable syntax that allows for the use of Lisp style macros.
As you will notice in using Joxa, there is very little in the way of Erlang syntax and even less of Common Lisp. Some syntax for declarative data structures has been pulled over but very little more. Most of the syntax comes from Clojure and Scheme.
In LFE macros are evaluated in LFE itself and not with the Erlang VM. This means that macro evaluation has different semantics than function evaluation. In LFE macros are not simply functions that run at compile time like they are in Lisp. They are special things that have their own evaluation semantics, different from those of normal functions. So not only must you keep in mind the normal compile time vs runtime semantics, you must also keep in mind the function vs macro semantics. I believe this is a hindrance to the easy use of macros.
For that reason Joxa takes an incremental approach to module compilation that allows macros to be evaluated on the VM in the exact same way as functions and there is no need at all to worry about differences in the evaluation environment between normal functions and macros. Since one of the important goals of Joxa is explicitly supporting DSLs this unified evaluation environment for functions and macros is quite important.
I am the co-founder of a startup called Afiniate and we are basing important aspects of Afiniate's business on Joxa. It is very important to us that Joxa be both well tested and stable. To that end the language has been bootstrapped on itself. That is, Joxa is actually implemented in Joxa. This allowed us to test the system and work out many problems very early on. This also serves as an important litmus test for new features and changes. This litmus test catches many types of problems before they ever leave the developer's desktop. I believe that this bootstrapping is a fundamental requirement for any language that will be used in production systems.
Joxa is also extremely well tested. As I said we are basing an important part of our business on this platform and we must ensure, as much as possible, that we can iterate on the platform quickly while retaining its stability.
It took me quite a few years to arrive at an optimal mental model for thinking about Erlang, ERTS and releases in the context of an operating system. It is a different enough method of thought that it's worth talking about.
I think most folks have not yet come to a fundamental realization when it comes to Erlang, or more specifically, ERTS. That realization is that ERTS is a Virtual Machine in the classic sense. That is, it views each release as a complete self contained 'machine'. That concept is central to the way it expects releases to be organized and managed.
Once you get your mind around this idea that each release is a self contained machine, things begin to make more sense. For example, why are there only version numbers and not names in release directories and tarballs created by systools? This is because an Erlang Release is a self contained universe, a complete Virtual Machine, sort of like an install of Linux. So asking why you can't have multiple releases in the same node is very similar to asking why Linux only has one rc.d directory. It only has one because in the context of a machine only one 'operating system/bootstrapping system' makes sense.
So how do you integrate this very different model into the Unix-y way of doing things? The simple answer is that you don't.
You can take two approaches when it comes to installing a release on a Unix box. You can go through the effort of splitting out the release into its constituent applications, installing those applications separately so that nothing is duplicated. Then you can come up with a scheme for your release metadata so that releases do not collide. After that you can shoehorn all this into whatever package management system exists on your preferred platform. The question I have with this approach is simply: why? To retain some idea of the purity of the Unix model? So you can save a few tens or hundreds of megabytes of disk space? With this approach you will be constantly fighting ERTS, coming up with ways to get around the Virtual Machine and what it is natively expecting. In the end, you will be coming up with all sorts of ways to create unanticipated pain for yourself with no real gain.
The alternative is that you treat an Erlang/OTP release as the VM expects. You could use the tools distributed with Erlang to create and handle releases. You could treat a release tarball as a single distributable thing. You could even take this to the logical extreme and, if you are targeting homogeneous hardware, include the ERTS binary in that tarball so that everything is completely self contained. The whole reason ERTS is structured in this manner is the problem it was intended to solve: providing a way to make a self contained system that could be installed trivially on target machines.
So, taking advantage of this model, you end up with a system (in either /opt or /var) that looks something like this.
/opt/erts//-*
That is, you have a separate directory structure to handle Erlang Releases somewhere on your system, and each release is in its own specific location.
Let's look at an example. Let's say that you have a release foo. On your
target systems you expect to have versions 0.1.0, 0.2.0 and
0.2.1. You also have the release bar with versions 0.1.1 and
0.1.2. Then your tree might look like:
/opt/erts
|-- bar
|   |-- 0.1.1
|   |   |-- ....
|   `-- 0.1.2
|       |-- ....
`-- foo
    |-- 0.1.0
    |   |-- ....
    |-- 0.2.0
    |   |-- ....
    `-- 0.2.1
        |-- ....
Where the ellipses identify all the files and directories relating to the release.
Your chef/puppet/management scripts end up being very trivial: they simply untar releases into that version scheme and then run the relevant ERTS startup commands for the self-contained release in each directory.
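To make the untar step concrete, here is a minimal sketch of what such a management script might do. The helper name install_release, the tarball name and the scratch directory are all made up for illustration; a real script would point at /opt/erts (or wherever you keep releases) and would follow up with whatever startup command your release provides, which I leave out here.

```shell
set -eu

# install_release ROOT NAME VERSION TARBALL
# Hypothetical helper: unpack one release tarball into ROOT/NAME/VERSION.
install_release() {
    root=$1 name=$2 version=$3 tarball=$4
    target="$root/$name/$version"
    if [ -d "$target" ]; then
        # Each version lives in its own directory, so a second install is a no-op.
        echo "already installed: $target" >&2
        return 0
    fi
    mkdir -p "$target"
    tar -xzf "$tarball" -C "$target"
    echo "$target"
}

# Demonstration against a scratch directory instead of /opt/erts.
# The tarball built here is a stand-in, not a real Erlang release.
scratch=$(mktemp -d)
mkdir -p "$scratch/build/releases/0.1.0"
echo "placeholder" > "$scratch/build/releases/0.1.0/foo.rel"
tar -czf "$scratch/foo-0.1.0.tar.gz" -C "$scratch/build" releases
install_release "$scratch/opt/erts" foo 0.1.0 "$scratch/foo-0.1.0.tar.gz"
```

Because every version gets its own directory, installing 0.2.0 next to 0.1.0 never touches the older release.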
This becomes an even more important factor when you start looking at hot code loading and live upgrades with Relups. While I don't recommend this live upgrading facility for anything but the most trivial projects, or those projects where extremely high up-time is worth the monumental cost of getting it right, it is still there and it relies on this well understood layout to function.
Unless you have a very good reason not to, I suggest you embrace the Erlang approach. It's simple, straightforward, easy to understand and even easier to manage. It has few downsides and major wins for your infrastructure management and deployment.
The process of collaboration and development on a project is important. The means by which you get that code into a deliverable state matters, and matters a lot. The method of delivery and what constitutes a deliverable state change from company to company, team to team and even project to project, but there is always a point at which you want to 'make something available'.
The process you use to get to that point should involve a good balance of time to market and quality. While you don't want to make the code so perfect that you never deliver it, or deliver it so late that it ceases to matter, you also do not want to deliver code that is so poor and has such high maintenance costs that it doesn't actually solve the problem it's designed to solve. You must strike a balance between the two, and by using good git principles along with a few sound basic engineering practices you can get there. With that in mind, I am going to discuss my preferred team development model with Git, along with a few additional changes if you are using GitHub.
The Canonical Repository is the repository we deliver code from. It serves as the central reference point, the tip of development for the project. No one pushes Work In Progress (WIP) code to this repository and no single person owns it. It exists to hold the history of the 'thing' that is currently in production or that will be in production, and it is the point that Developers rebase their work in progress code on.
There is a bit of a mind hack going on here that I encourage you to keep intact. That is the 'sacredness' of canonical. It should never become something that a developer pushes to as part of his daily workflow. It should always carry the sense that things that go into it are important. In my experience this vastly reduces the screw-ups, bugs, bad pushes etc. that can be so painful to a team. It also helps encourage both the developer and the code reviewer to take their job seriously without forcing a lot of process onto them. That lack of strict painful process is what makes this approach so powerful.
Developers work in clones of the Canonical Repository. They create branches, work on experimental changes, code to meet the requirements of the system and collaborate with each other through these clones. If they are all on the same network then they set up a Git server on their development boxes and push to and pull from their teammates' clones directly. If they are in different parts of the world or simply on different networks they may have to use some intermediary like GitHub to share their code. In that case, they replicate to their GitHub clone and collaboration goes on through GitHub. Of the two approaches I prefer addressing my peers' repositories directly. It removes an unnecessary step from the process and makes it that much easier and less error prone.
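Pulling directly from a peer looks roughly like the sketch below. Everything here is a stand-in: the repositories live in a scratch directory, the developer names and the story-42 branch are invented, and a local path plays the part of the URL your team mate's machine would actually expose.

```shell
set -eu
export GIT_AUTHOR_NAME="Alice" GIT_AUTHOR_EMAIL="alice@example.com"
export GIT_COMMITTER_NAME="Alice" GIT_COMMITTER_EMAIL="alice@example.com"

work=$(mktemp -d)
cd "$work"

# A stand-in canonical repository with one commit, cloned by two developers.
git init -q canonical
(cd canonical && echo "v1" > README && git add README && git commit -q -m "Initial commit")
git clone -q canonical alice
git clone -q canonical bob

# Alice shares work in progress on a topic branch in her own clone.
(cd alice &&
 git checkout -q -b story-42 &&
 echo "draft" > notes.txt &&
 git add notes.txt &&
 git commit -q -m "WIP: draft notes")

# Bob adds Alice's clone as a named remote and picks up her branch.
# Nothing touches canonical at any point.
cd bob
git remote add alice "$work/alice"
git fetch -q alice
git checkout -q -b story-42 alice/story-42
```

After this Bob can commit on top of Alice's work, and Alice can fetch from Bob in exactly the same way.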
There is a second big mind hack going on here. That is the fact that nothing in the developer's repo matters until the developer says it matters. We want the developer to be productive; we want him to use the tools that work for him and the process that is most comfortable for him to use to produce code. It doesn't matter if the rest of the team uses emacs and he uses vim, and it doesn't matter if he wants to use OS X and the rest of the team uses CentOS. What matters is that the code he produces meets the standards the team has set through previous reviews, automated test suites and the like.
We want him to explore freely. We want him to be the most productive that he can be using the tools that he is comfortable with. We don't want him to worry about what impact that exploration will have, or whether someone is going to be looking over his shoulder trying to validate the quality of code that literally won't matter unless and until it makes it into canonical.
We don't impose tools or process on developers in their own repo. They can code using any process they would like, using any editor, compiler, platform etc. It doesn't matter in the least as long as the output meets the team's standards. Code that exists in the developer's repo is kind of like Potential Energy. Potential Energy is impotent; it has no interaction with the world around it until it becomes Kinetic Energy. Once that Potential Energy becomes kinetic it has the potential to change the world. Code in a developer's repository is very similar. It has no impact on the world (i.e. the project/team/organization) until the Developer(s) feel that it is ready, until the developer explicitly converts it to Kinetic Energy. Let's talk a bit about the process used to convert that Potential Energy to Kinetic Energy.
There are always two parties to the process: the Developer or Developers producing the code, and the engineer that's going to review/validate that code. Let's get started with the producer side.
The Developer puts the completed code onto a dedicated branch in his repository and refactors it to meet the standards of the project. He should ensure that the following invariants hold.
It could be that you have a convention that code ready for review goes onto a branch specifically named, perhaps something like 'reviewable' or 'rv'. It could also be that the Developer lets his team mates know what branch the code is on when he makes the announcement. In either case, once code is ready he makes an announcement letting the team know that fact. Usually this is in the form of patches sent to a mailing list, or a github pull request. The mechanism doesn't actually matter so much as the fact that an intentional announcement with a short description is made to the team working on the project.
Once the Developer announces it, a team mate should be selected to do the review. There are a ton of ways to select a reviewer to handle shepherding the code into canonical. In my experience the best way to do it is just to let the code reviewer self select. As long as it's not always the same person stepping up, things should be fine. If you have put together a good mature team this approach is, by far, the best. Other ways to do it are to randomly assign a team mate, or to have the Developer producing the code pick the team mate that is going to review. In the end how the Reviewer is selected matters a lot less than the fact that one is selected.
The final responsibility of the Developer is to make sure the code gets reviewed. The longer code sits out without review, the more likely it is that bit rot will occur, that merging into the Canonical branch will become painful etc. So the Developer needs to do what is necessary to make the review happen in a reasonable time frame.
The Reviewer's job is to review the change to validate that it meets the project standard. This comprises a few things.
If any of these steps fail, the Reviewer pushes back on the original Developer to make changes and finish the work. This may take several iterations depending on the quality of the Developer writing the code. That is completely fine. This is normal development; we are engineers, not code monkeys, and we want to get the code as right as we reasonably can. The above steps are the minimal steps that need to be done to ensure this standard. Sub-par code should never make it into Canonical simply because the Developer is annoyed with the process or the timelines are tight. You invariably pay more in the long run by giving in to these pressures than you gain in the short run.
Once the change is reviewed and accepted the reviewer signs off on the code. Yes, we expect the Reviewer to put his name on it and take responsibility for the fact that that code is in Canonical. If the code breaks the build it should be embarrassing to the Reviewer.
Once the code is in Canonical and out of the Developer's repository, history revision, commit amends and the like should all stop. At that point your team has people depending on that code and changes to history become painful. If there is a bug or a fix that needs to be made, it should go into a new commit and go through the process previously outlined.
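Here is one way the hand-off from Developer to Reviewer to Canonical could look on the command line. Treat it as a sketch under my own assumptions: the names and email addresses are invented, the 'rv' branch follows the naming convention mentioned earlier, and I use git cherry-pick --signoff to apply the reviewed commits with the Reviewer's Signed-off-by line. Your team may prefer a different mechanism for recording the sign-off.

```shell
set -eu
export GIT_AUTHOR_NAME="Dev Eloper" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev Eloper" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# Canonical: a bare repository seeded with one commit.
git init -q seed
(cd seed && echo "v1" > app.txt && git add app.txt && git commit -q -m "Initial commit")
git clone -q --bare seed canonical.git
git clone -q canonical.git developer
git clone -q canonical.git reviewer

# Developer: finished, cleaned-up work goes onto the agreed 'rv' branch.
(cd developer &&
 git checkout -q -b rv &&
 echo "v2" > app.txt &&
 git commit -q -a -m "Bump app to v2")

# Reviewer: fetch the branch straight from the Developer's repository,
# apply it with a Signed-off-by line, and only then push to Canonical.
export GIT_COMMITTER_NAME="Rev Iewer" GIT_COMMITTER_EMAIL="rev@example.com"
cd reviewer
trunk=$(git symbolic-ref --short HEAD)
git fetch -q "$work/developer" rv
git cherry-pick --signoff HEAD..FETCH_HEAD >/dev/null
git push -q origin "$trunk"
```

The commit that lands in Canonical keeps the Developer as author and carries the Reviewer's name in its message, which is exactly the shared responsibility described above.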
This process is actually very smooth, but it does assume a few things:
If these things are true then you are golden. It may take a little bit of time to get a feel for the process and work around the little hiccups that will inevitably occur. However, with competent engineers the process should be flowing smoothly after a few weeks.
This is not intended to be a rigid process. The only place that real process actually comes into it is in the transition from the Developer's repository to the Canonical repository. This is true by design. Its whole purpose is to encourage the wild and woolly exchange and growth of ideas, creativity and productivity in the project wherever that is possible, while providing a bit of rigidity and discipline only when it is required. The goal is to get that balance right, so that you get both the most creativity and productivity possible and the most maintainable, well-founded code, while at the same time keeping the team happy. This is what this process seeks to encourage.
One of the ways I have seen the process founder and become rigid is with a team that does not know git well. This process takes quite a bit of comfort with Git to work. Your developers need to be comfortable with local and remote repositories, rebasing, merging, pushing to and pulling from peers, signing off on commits etc. If you have new or incompetent developers you may be tempted to wrap all of these steps in scripts that, apparently, take the need for knowledge away from your developers. However, these scripts don't actually remove the need for knowledge of git; all they really do is delay that need slightly. What they actually do is encourage the team to avoid learning git, while the team spends an inordinate amount of time fixing problems when the scripts inevitably break. At the same time they lock down and rigidify a process that should, by its nature, not be rigid.
Go ahead, bite the bullet and give the team the time and resources they need to learn git well. It will take a few weeks and you will take a productivity hit. However, in the long run you will make that back many times over. If you have team members that do not, can not, or will not pick up git for whatever reason, well, maybe it's time to consider those folks bad hires and let them go. The team will be better off for it in the long run.
The other big thing to watch out for is the urge to bypass the process. Sometimes when you are in a tight situation it can be tempting to push code directly to Canonical, bypassing the review. For example, you might have a bug in production; let's say your system is down and it's costing the company a million dollars a minute. Pressure is high and you may have a huge urge to tell the devs (or yourself if you are the dev) that the process would only slow you down and the code needs to get out right now. This is almost invariably a bad decision. This process should only take a few minutes assuming the code is good. The likelihood that the code is bad in that situation is high, and it's much cheaper timewise to catch those problems in the review cycle than to deploy the code and realize in production that you fixed one bug but introduced another.
I have a very strong standard for commits when it comes to git. In general, commits should contain one unit of change and one unit of change only. When looking at a Git log you should see a very clear, linear history of change: a history where each commit contains a single change, has a good short commit line explaining what the commit contains, and has a complete commit detail containing a description of the whys of the commit.
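As a small illustration of the short commit line plus the 'whys' in the detail, here is a sketch using two -m options on git commit, where the first becomes the subject and the second becomes the body. The file, the setting and the reasoning in the message are all invented for the example.

```shell
set -eu
export GIT_AUTHOR_NAME="Dev Eloper" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev Eloper" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# One unit of change: a single setting in a single file.
echo "timeout=30" > app.conf
git add app.conf

# First -m is the short commit line, second -m is the detail explaining why.
git commit -q \
    -m "Raise connection timeout to 30s" \
    -m "Slow upstream responses were being cut off at the old limit."

# The short line is what shows up when scanning history.
git log --format=%s
```

When every commit follows this shape, scanning the subjects alone reads like a changelog and the bodies are there when you need the reasoning.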
I get a fair amount of flak on this from many of my peers. They tend to see Git as simply a place to store code, or as an audit trail like most non-distributed version control systems. This opinion just does not serve in most cases. Commit hygiene is an important part of development using Git, in the same way that readability and factoring are important to code. It takes work to do it right and the benefits may be intangible, but in the long run it's well worth it.
It takes time and effort to clean up your patches and get them into a publishable, well-factored state. That is effort that many people don't want to spend. If that's the case, why do it?
Let's say you roll out a new set of patches and one of those patches contains an error. In the case of well-defined, well-factored patches, where related change is part of the same patch, it is fairly trivial to run git bisect against your repo, find the offending patch, then do git revert on said patch and redeploy your system.
In the alternate case, where change is spread willy-nilly across multiple commits, git bisect will not work correctly. That is, you will not be able to test each patch on its own, nor will you be able to easily identify all the patches related to the problem. You will end up digging through the commits in this specific deployment, looking for each thing that might have caused the problem and reverting that. Of course, because change is spread willy-nilly around, you will end up reverting things you do not wish to revert, and that will cause other problems. In reality what you will probably do is spend some number of hours trying to get a fix in place that will allow things to run, push those fixes and hopefully come back at some point in the future to revisit them.
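To show how little work the bisect and revert route is when each commit is a single unit of change, here is a self-contained sketch. The repository, the file names and the grep-based check are all invented; in real life the command handed to git bisect run would be your test suite.

```shell
set -eu
export GIT_AUTHOR_NAME="Dev Eloper" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev Eloper" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Five commits, each one unit of change; the third introduces the defect.
for n in 1 2 3 4 5; do
    if [ "$n" -eq 3 ]; then
        echo "BUG" > "feature-$n.txt"
    else
        echo "ok" > "feature-$n.txt"
    fi
    git add "feature-$n.txt"
    git commit -q -m "Change $n"
done

# HEAD is known bad, HEAD~4 is known good. git bisect run drives the check:
# exit status 0 marks a commit good, 1 marks it bad.
git bisect start HEAD HEAD~4 >/dev/null
git bisect run sh -c '! grep -q BUG feature-*.txt' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

# Undo only the offending commit with a new commit; history is not rewritten.
git revert --no-edit "$culprit" >/dev/null
```

The revert removes exactly the change that "Change 3" made and nothing else, which only works this cleanly because that commit contained nothing unrelated.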
A single patch focused on a single set of changes is much easier to review and comment on than a single change spread over a number of patches, or a single patch that contains a large number of unrelated changes.
Knowing what reason generated a patch (a ticket, a story, a customer request) and being able to tie the change that accomplishes that reason to the reason itself goes a long way towards making your patch comprehensible to the reviewer.
The other reason to factor your changes before they make it into production is simple hygiene. A person looking at change sets in linear temporal order should have a good idea of what is changing in each patch. That is, each patch should present a good self-contained step in the march of code over time.
This, unfortunately, is much harder to justify than the first point. Much like refactoring, it's one of those things that you can point to and say "This is a Good Idea" while being unable to give hard and fast reasons for that. There is no way to say that a clean commit history is going to save you 20% of your time on maintenance, or that it's going to help you get to your next milestone 5% more quickly.
In my opinion, it will help you in maintaining your code base and in understanding why a particular change occurred and what the steps leading up to that change were. Also, the discipline of patch hygiene will help you focus on the particular change that needs to occur. I consider all these useful results for very little cost.
What exactly is one unit of change? The answer is, of course, it depends. The rule of thumb is that if it's directly related to the change you are actively working on, that is, it can be tied directly to what you are holding in your head, then it's probably part of the same change.
However, if you see a bug or refactoring opportunity while doing other things, that's not part of the change. If you have two things in your head and are kind of working on them at the same time because they are somehow related in your mind those also are not part of the same change.
Much like with refactoring you will, over time, develop a sense for what a good patch should look like and what you should be publishing.
There is one right answer to when to practice commit hygiene. That is always 'before the code is published'. It's very painful for your consumers if the git history changes after they have pulled from your repo. To avoid this, make sure that you don't refactor history after you have pushed to the canonical repo.
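A quick way to see which commits are still safe to rework is to ask git what your branch has that canonical does not. The sketch below builds a throwaway canonical repository and clone just to have something to run against; the names are invented, and 'origin' is assumed to be the remote that points at canonical.

```shell
set -eu
export GIT_AUTHOR_NAME="Dev Eloper" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev Eloper" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
(cd seed && echo "v1" > app.txt && git add app.txt && git commit -q -m "Initial commit")
git clone -q --bare seed canonical.git
git clone -q canonical.git dev
cd dev
trunk=$(git symbolic-ref --short HEAD)

# One local commit that has not been pushed anywhere yet.
echo "v2" > app.txt
git commit -q -a -m "Local change, not yet published"

# Commits reachable from HEAD but not from canonical are still private,
# so they are the only ones that are safe to rewrite.
unpublished=$(git rev-list --count "origin/$trunk..HEAD")
echo "commits safe to rewrite: $unpublished"

# Anything that is already an ancestor of canonical must be left alone.
if git merge-base --is-ancestor HEAD~1 "origin/$trunk"; then
    parent_published=yes
else
    parent_published=no
fi
echo "parent already published: $parent_published"
```

If the count is zero there is nothing left to clean up, and any further fix belongs in a new commit.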
One thing to be aware of is that the word publish can have several meanings. In the case where you have a canonical repo where people expect to find the latest released code, any time you publish there it locks you out of further refactoring. But let's take a slightly more nebulous case. Let's say that you and a peer are working together on a task, generating commits for the purpose of sharing code. Many of these commits are going to be temp commits, or commits that only exist so that the in-process code can be shared. In this case, you also don't want to refactor while you are in the process of building. You have a consumer after all: your peer. However, once you and your peer have your work done, but before you publish it to your canonical repo or your peers', it would be a very good idea for you and/or your peer to sit down and refactor the commit stack, practicing a bit of commit hygiene before you publish your work. You should consider that the last step in your build process. As a note, once this is done you should do any future work on top of that newly refactored branch.
The key is the Interactive Rebase in git. This is the tool you will use to clean up and manipulate your git history. I won't go into too much detail here, but I will refer you to the many other good discussions on the subject.
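For a taste of what that cleanup looks like, here is a minimal sketch that folds a late correction back into the commit it belongs to. It uses git commit --fixup together with git rebase -i --autosquash. The repository and commit names are invented, and GIT_SEQUENCE_EDITOR=true is only there so the sketch runs unattended by accepting the generated todo list as is; at a real terminal you would leave it off and edit the list yourself.

```shell
set -eu
export GIT_AUTHOR_NAME="Dev Eloper" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev Eloper" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

echo "parser v1" > parser.txt
git add parser.txt
git commit -q -m "Add parser"

echo "docs" > docs.txt
git add docs.txt
git commit -q -m "Add docs"

# A later correction that really belongs to "Add parser" (HEAD~1).
echo "parser v2" > parser.txt
git commit -q -a --fixup HEAD~1

# Reorder and squash the fixup into its target. The todo list produced
# by --autosquash is accepted unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --format=%s
```

The history that comes out has three commits, with the parser correction folded into "Add parser" rather than trailing along as its own commit.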
In the end the thing that must be kept in mind is that git is a developer's tool more than anything else. Not an audit trail, not a place to stick all the crap that we produce. It is a tool, but you need to follow some guidelines for that tool to be as useful as it can be.
I have a serious personal investment in Erlang as a language and a platform. I have written a book on the subject and spent a fair amount of my professional and open source life working in it. I have done all of that because I believe it's a system that simply works for much of the type of development that I do. However, I also have a secret long term love affair with Lisp. Unfortunately, I have never really been able to use it in anything but toy projects. Recently I happened upon LFE again. I remember seeing it when it was announced and thinking it wasn't quite baked, but that seems to have changed quite a bit. It may very well scratch both my Erlang-as-an-awesome-platform and Lisp-as-an-awesome-language itches.
I decided that my very first project was going to be a Swank (Slime backend) server implementation for LFE. Think about it: you could code in Slime using Lisp on the Erlang platform. That's pretty exciting to me. Unfortunately, I immediately ran into a problem. That is that macros are not accessible after compilation. They go away when the system is compiled and can no longer be reached. Well, that doesn't work in a system like Slime, where there isn't a lot of distinction between compilation time and run time (nor is it very acceptable in Lisp in general, as the authors acknowledge).
Macros are not first class citizens of the LFE world. That is, they exist only at compile time and they are not evaluated by the Erlang virtual machine; they are evaluated by the LFE build system itself. This means that macro evaluation has different semantics than the evaluation of code compiled and evaluated by the virtual machine in the normal manner. This forces the developer to make a distinction in his mind between things that will be evaluated at compile time and things that will be evaluated at run time. This is probably best illustrated by the 'eval-when-compile' construct.
A related problem is that, because macros only exist at compile time, if you want to use macros from another file you must import the files containing those macros. Those macros are then evaluated by the expander in the namespace provided (along with its other functions). The fact that macros are always evaluated in the namespace of the macro caller provides its own set of problems.
All of these problems can be solved by compiling macros and making them first class entities in LFE.
In the end, macros are really just functions. They are functions that are evaluated at compile time, whose inputs are lists that represent the AST of the program, but they are functions just the same. So it makes sense that we could just compile them as functions and mark them somehow so that the system knows that they are macros.
So that is exactly what I am going to do. First and foremost, I am going to convert a specified macro to a function, recursively expanding the macro (of course) until a specific form is completely expanded, and then converting that form into a 'define-function' expression. I have to mark this function as a macro, and the easiest way to do that is to keep a list of macros around for that particular namespace. So when code generation occurs the macro gets compiled into a function, and we mark it as being a macro by generating a special function in LFE compiled modules. I think we can call that function 'macros' (it's simple enough after all); when called, it will return a list of all macro functions in the system. When a macro function is created we add it to the list of names returned by that special function.
As we generate each 'function', be it an actual function or a macro as function, we will go through a complete generation process. That is, we will go from a sexpr form describing a 'function' to a compiled Erlang function in the correct namespace. In this manner the full module will be available at compile time for every function compiled before the current function being compiled and/or expanded. That should give us the flexibility to do just about anything we want, at the cost of additional compilation time. A nice fringe benefit is that we should be able to get rid of the eval-when-compile special form.
There are two big implications that compiling macros in this way entails. The first is that, if you have macros that depend on functions then those functions must be defined before the macro that calls them. It will become a very simple top down evaluation process, or at least, it will appear that way from the point of view of the developer.
The second is that macros can not be defined before the module definition, since in both Erlang and LFE the module definition must be the first thing in the file. In current LFE code it seems fairly common for macros to come at the head of the file. Because macros will now be generated and compiled into functions in that module's namespace, they must be defined after the defmodule call. Macros can be called before the module definition (as long as they evaluate in such a way that the module definition is the first expanded form in the file), but macros used ahead of the module definition can only come from other modules.
This is the first of several planned changes to LFE. My next project will be to complete a Swank backend for the language. Once that's done I am actually going to try to switch over to LFE as my primary Erlang language. I suspect this alone will generate some changes as well. One thing I would really love to see is some more Clojure goodness in the language. Not a complete Clojure implementation, but Clojure does have a ton of great syntax ideas that we can borrow. However, those things are for the future. First and foremost, I need to finish macro compilation.
Note: This was taken whole cloth from the Google Closure Tutorial and translated to (hopefully) idiomatic ClojureScript. I take credit as the translator. However, I can take no authorship credit. You can find the complete source for this tutorial on github.
This tutorial gives you hands-on experience using ClojureScript and the Google Closure Library by walking you through the construction of a simple application. To do this tutorial you should have some experience with JavaScript and Clojure. You should have already gone through the ClojureScript setup process, as described on the ClojureScript Site or as I described in a previous post. This tutorial explains the different parts of these source files step by step.
This tutorial illustrates the process of building a simple application for displaying notes. The example:
creates a namespace for the application,
uses the Closure Library's goog.dom.createDom() through the ClojureScript dom-helpers functions to create the Document Object Model (DOM) structure for the list,
and uses a Closure Library class through ClojureScript in the note list to allow the user to open and close items in the list.
When you use JavaScript libraries from different sources, there's always the chance that some JavaScript code will redefine a global variable or function name that you, yourself, have defined in your code, creating a nasty bug. Clojure doesn't have this problem and, by extension, neither does ClojureScript, though it compiles to JavaScript.
In ClojureScript we define namespaces in exactly the same way we do in Clojure. For our notepad application we can define the tutorial.notepad.note namespace as follows:
(ns tutorial.notepad.note
  (:require [tutorial.dom-helpers :as dom]))
Once the tutorial.notepad.note namespace exists, the example creates the initialization function in that namespace.
(defn init
  [title content node-container]
  {:title title :content content :parent node-container})
The note init function is now in the tutorial.notepad.note namespace created with the ns macro.
First and foremost, go get dom-helpers.cljs from the twitterbuzz sample app and stick it in your project. It has the twitterbuzz namespace and you probably won't want to keep that. I changed it to the tutorial.dom-helpers namespace and refer to it as such below.
To display a Note in the HTML document, the example gives the note namespace the following method:
(defn make-note-dom
  [self]
  (let [header-element (dom/build
                        [:div {:style "background-color:#EEE"}
                         (:title self)])
        content-element (dom/build
                         [:div (:content self)])
        new-note (dom/build
                  [:div header-element content-element])]
    (dom/append (:parent self)
                new-note)
    ;; Return an updated self object with the above declarations
    (-> self
        (assoc :header-element header-element)
        (assoc :content-element content-element))))
This function uses the dom-helpers function build. The following 'require' statement includes the code for this function:
(ns tutorial.notepad.note
  (:require [tutorial.dom-helpers :as dom]))
The require directive for the namespace macro in ClojureScript works exactly the same way as the require in Clojure. There is nothing further to worry about.
The function build in dom-helpers creates a new DOM element using a syntax very similar to that used in the Hiccup library. For example, the following call to build creates a new div element.
;; Remember we imported dom-helpers as dom
(dom/build
 [:div {:style "background-color:#EEE"} (:title self)])
The map in the vector that starts with the :div keyword specifies attributes to add to the element, and the vector is terminated by the child to add to the element (in this case a string). Both the map and the child specifiers are optional.
The make-note-dom function handles just a single Note. To make a list of notes, the example includes a make-notes function that takes the note data as a vector of maps and instantiates a Note for each item, calling the make-note-dom function for each one.
(defn make-notes
  [data node-container]
  (doseq [cont data]
    (let [self
          (init (:title cont) (:content cont) node-container)]
      (make-note-dom self))))
With just a few lines of code the example makes each note a Zippy. A Zippy is an element that can be collapsed or expanded to hide or reveal content. First the example adds a new require element to the tutorial.notepad.note namespace. The namespace should now look like this:
(ns tutorial.notepad.note
  (:require [tutorial.dom-helpers :as dom]
            [goog.ui.Zippy :as zippy]))
Then it adds a line to the make-note-dom function:
(defn make-note-dom
  [self]
  (let [header-element (dom/build
                        [:div {:style "background-color:#EEE"}
                         (:title self)])
        content-element (dom/build
                         [:div (:content self)])
        new-note (dom/build
                  [:div header-element content-element])
        ;; NEW LINE
        zippy (goog.ui.Zippy. header-element content-element)]
    (dom/append (:parent self)
                new-note)
    ;; Return an updated self object with the above declarations
    (-> self
        (assoc :header-element header-element)
        (assoc :content-element content-element)
        (assoc :zippy zippy))))
The constructor call (goog.ui.Zippy. header-element content-element) attaches a behavior to the note element that will toggle the visibility of content-element when the user clicks on header-element. For more information about the Zippy class, see the Zippy API documentation.
Here is the complete ClojureScript code for this example application:
(ns tutorial.notepad.note
  (:require [tutorial.dom-helpers :as dom]
            [goog.ui.Zippy :as zippy]))

(defn init
  [title content node-container]
  {:title title :content content :parent node-container})

(defn make-note-dom
  [self]
  (let [header-element (dom/build
                        [:div {:style "background-color:#EEE"}
                         (:title self)])
        content-element (dom/build
                         [:div (:content self)])
        new-note (dom/build
                  [:div header-element content-element])
        zippy (goog.ui.Zippy. header-element content-element)]
    (dom/append (:parent self)
                new-note)
    ;; Return an updated self object with the above declarations
    (-> self
        (assoc :header-element header-element)
        (assoc :content-element content-element)
        (assoc :zippy zippy))))

(defn make-notes
  [data node-container]
  (doseq [cont data]
    (let [self
          (init (:title cont) (:content cont) node-container)]
      (make-note-dom self))))
We can't embed ClojureScript in HTML like we can with JavaScript. This is a very good thing, but I digress. This does mean that we need to do things just a bit differently. In our case we are going to create another file in the notepad namespace called core.cljs that will define a main function for us. It basically takes the data required and calls the various note functions.
(ns tutorial.notepad
  (:require [tutorial.notepad.note :as note]
            [tutorial.dom-helpers :as dom]))

(defn main
  []
  (let [note-data [{:title "Note 1" :content "Content of Note 1"}
                   {:title "Note 2" :content "Content of Note 2"}]
        note-container (dom/get-element :notes)]
    (note/make-notes note-data note-container)))

;; Take note of this, this is how we call main at startup!
(main)
With Google Closure and ClojureScript, how we call the generated JavaScript depends on whether or not we compiled with advanced optimizations. If you compiled without advanced optimizations (see getting started), then your html file should look like this:
<!doctype html>
<!--[if lt IE 7 ]> <html lang="en" class="no-js ie6"> <![endif]-->
<!--[if IE 7 ]> <html lang="en" class="no-js ie7"> <![endif]-->
<!--[if IE 8 ]> <html lang="en" class="no-js ie8"> <![endif]-->
<!--[if IE 9 ]> <html lang="en" class="no-js ie9"> <![endif]-->
<!--[if (gt IE 9)|!(IE)]><!--> <html lang="en" class="no-js"> <!--<![endif]-->
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Google Closure Tutorial for Clojurescript</title>
<meta name="description" content="Demo showing off the ClojureScript hotness">
<!--[if lt IE 9]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body>
<header>
<div id="notes"></div>
<script type="text/javascript" src="js/goog/base.js"></script>
<script type="text/javascript" src="js/deps/tutorial.js"></script>
<script type="text/javascript">
goog.require('tutorial.notepad');
</script>
</body>
</html>
If you compiled with advanced optimizations, your html changes in two ways. The first is that we no longer need to include the goog/base.js script, and the second is that we no longer need to have the explicit require in our html file.
Also note that we are including the script after the notes div has been defined. That's rather important.
Go ahead and check out the project from GitHub.
$ git clone https://github.com/ericbmerritt/google-closure-tutorial.git
$ cd google-closure-tutorial
From the project root (where you should be right now), run the following:
$ ./bin/compile.sh
This does the unoptimized build. You may inspect the compile.sh script to see the actual command that is run. There is a very simple web server included in the distro. From the root again, do the following:
$ ./bin/webserver.sh
Now you can point your browser to http://127.0.0.1:8000/index-unoptized.html to see the result.
Again from the project root, run the command:
$ ./bin/compile-optimized.sh
$ ./bin/webserver.sh
This will do the exact same thing as the unoptimized, just with optimization turned on. Again, you may inspect compile-optimized.sh to see the command details.
I came across a few realizations as I was working through this and thought I would share them with you. You should compile your ClojureScript with advanced options both on and off. Having advanced options on is going to help you catch a lot of errors at compile time that you would otherwise have to find at run time. However, the optimized output is absolutely impossible to debug. So during development you will mostly be compiling without advanced options. Make sure, though, that you compile with advanced options on a fairly frequent basis. You will be happy you did in the long run. At the very least, you are going to want to compile with advanced options before deploying.
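One way to make the habit of compiling both ways cheap is a small wrapper that runs the builds back to back and stops at the first failure. This is only a minimal sketch: it assumes the two build scripts from the tutorial repository, and the function name run_builds is made up for illustration.

```shell
#!/bin/sh
# run_builds: run each build command in order, stopping at the first failure.
# Each argument is one command line, for example ./bin/compile.sh
run_builds() {
    for cmd in "$@"; do
        echo "running: $cmd"
        if ! sh -c "$cmd"; then
            echo "build failed: $cmd" >&2
            return 1
        fi
    done
    echo "all builds passed"
}

# Usage before a deploy (script paths assumed from the tutorial repository):
#   run_builds ./bin/compile.sh ./bin/compile-optimized.sh
```

Running the unoptimized build first keeps the feedback loop short, since it is the one you can actually debug.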
Setting up Clojurescript is fully described over at the Clojurescript Quick Start. What I am doing here is just reorganizing and rewriting to make things a bit more clear to someone that thinks like I do. While I was setting up clojurescript, I missed a couple of things as I was going along, so I thought it might be useful to go through the process for others. I work on linux, and this method will work there. I suspect very strongly that it will work on OSX as well.
First and foremost download Clojurescript as below.
$ git clone https://github.com/clojure/clojurescript
Yes, you need to go ahead and do the git clone. Clojurescript is moving forward rapidly and will be for the foreseeable future. You are going to be updating the repo on a regular basis. So go ahead and clone it. I suggest you put it in whatever working directory you use for projects. I tend to keep my projects in $HOME/workspace and that is where clojurescript lives on my box. This works out rather well because I expect to be contributing back. Hopefully, you will too.
You can bootstrap everything by running the bootstrap script in the clojurescript directory.
$ ./script/bootstrap
Fetching Clojure...
Fetching Google Closure Library...
Fetching Google Closure Compiler...
Building goog.jar...
Copying closure/compiler/compiler.jar to lib/...
This will pull down all the dependencies for clojurescript. Note that clojurescript relies on clojure 1.3.0beta1 (at the time of this writing). It's very probable that you are thinking about using clojurescript as the front end to a project that uses clojure as a back end. If that is the case it's probably best to base your back end project on clojure 1.3.0beta1 or whatever happens to be the current version of clojure that clojurescript uses.
If everything went well, the next thing we need to do is to set up the CLOJURESCRIPT_HOME env variable. This should point to where you installed clojurescript. If you remember, I put my clojurescript in $HOME/workspace/clojurescript and that's where my CLOJURESCRIPT_HOME points.
$ export CLOJURESCRIPT_HOME=$HOME/
Of course, replace the path with the actual location of your clojurescript repo.
Finally we want to set up the paths. You can either set your system PATH env variable to include both $CLOJURESCRIPT_HOME/script and $CLOJURESCRIPT_HOME/bin or you can symlink the cljsc and repljs scripts into a directory that is already in your PATH. So you might do
$ export PATH=$PATH:$CLOJURESCRIPT_HOME/bin:$CLOJURESCRIPT_HOME/script
for putting those two in your path. Or you might do
$ ln -s $CLOJURESCRIPT_HOME/bin/cljsc $HOME/bin/
$ ln -s $CLOJURESCRIPT_HOME/script/repljs $HOME/bin/
As you might imagine I have $HOME/bin already in my path. Either one works just fine, so it's up to you. What you do not want to do is copy those scripts into locations in your PATH. Remember, those scripts are going to change.
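Before moving on, it can save some head scratching to confirm that CLOJURESCRIPT_HOME really points at a bootstrapped checkout. The following is a rough sketch, not part of clojurescript itself; check_cljs_home is a hypothetical helper, and it only checks for the two scripts named above.

```shell
#!/bin/sh
# check_cljs_home DIR: verify that DIR looks like a clojurescript checkout
# by testing for the two scripts mentioned above (bin/cljsc, script/repljs).
check_cljs_home() {
    home=$1
    if [ -z "$home" ]; then
        echo "CLOJURESCRIPT_HOME is not set" >&2
        return 1
    fi
    for script in bin/cljsc script/repljs; do
        if [ ! -x "$home/$script" ]; then
            echo "missing or not executable: $home/$script" >&2
            return 1
        fi
    done
    echo "ok: $home"
}

# Usage:
#   check_cljs_home "$CLOJURESCRIPT_HOME"
```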
So let's try it out. Run repljs; you should see something that looks like this.
$ repljs
#'user/jse
"Type: " :cljs/quit " to quit"
ClojureScript:cljs.user>
If you do, then WOOT! you are golden. If you don't then you probably forgot a step or did something wrong (or something fundamental has changed since I wrote this). Go back and take a look and see what went wrong.
Having a repl is pretty awesome, and you are going to use it; but what you really want to do is integrate it into your project. At the moment how you do that is going to vary a lot depending on your project. However, in general here are two ways to do that. The first is to use the repl, the second is to run cljsc. I tend to use the cljsc script because I can stick it in a script.
I am building a project in Google App Engine using appengine-magic. I would love to actually have a compiler built into leiningen. However, there are no lein plugins for clojurescript yet. So I have a little shell script that runs in the root of the project. That shell script basically does the following.
$ cljsc ./src '{:optimizations :advanced :output-dir "war/js" :output-to "war/js/myprojectname.js"}'
That will do the whole program compilation over all your cljs files. As soon as we get some lein goodness I will be migrating to that. I like an integrated build system quite a lot.
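A little script along those lines might look like the sketch below. It is an assumption about how such a wrapper could be laid out, not the script from my project: the function names cljs_opts and build are hypothetical, and the only cljsc options used are the three shown in the command above.

```shell
#!/bin/sh
# cljs_opts LEVEL OUT_DIR NAME: print the options map that is passed to cljsc.
cljs_opts() {
    printf '{:optimizations :%s :output-dir "%s" :output-to "%s/%s.js"}\n' \
        "$1" "$2" "$2" "$3"
}

# build SRC LEVEL OUT_DIR NAME: run the whole-program compile over SRC.
# Requires cljsc to be on the PATH.
build() {
    cljsc "$1" "$(cljs_opts "$2" "$3" "$4")"
}

# Usage, equivalent to the command above:
#   build ./src advanced war/js myprojectname
```

Keeping the options map in one function means the output directory and file name cannot drift apart between builds.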
As some of you may have guessed, I am a fan of Erlang. I think that it's a very interesting language with a tremendous amount of promise for the type of server side applications that I usually end up working on. I have talked a lot about various things here on Erlangish, so I thought it would finally be appropriate to spend a bit of time talking about the topic of the blog. For the most part I will be delving into the, somewhat obscure, history of Erlang. I will also spend a bit of time providing some instructions on how to get started with the language.
Erlang is a distributed, concurrent, soft real time functional programming language and runtime environment developed by Ericsson, the Telecoms Infrastructure supplier. It has built-in support for concurrency, distribution and fault tolerance. Since its open source release in 1999, Erlang has been adopted by many leading telecom and IT related companies. It is now successfully being used in other sectors including banking, finance, ecommerce and computer telephony.
OTP is a large collection of libraries for Erlang to do everything from compiling ASN.1 to providing an application embeddable http server. It also consists of implementations of several patterns that have proven useful in massively concurrent development over the years. Most production Erlang applications are actually Erlang/OTP applications. OTP is also open source and distributed with Erlang.
Although Erlang is a general purpose language, it has tried to fill a certain niche. This niche mostly consists of distributed, reliable, soft real-time concurrent systems. These types of applications are telecommunication systems, servers and database applications which require soft real-time behavior. Erlang excels at these types of systems because these are the types of systems that it was originally designed around. It contains a number of features that make it particularly useful in this arena. For example: it provides a simple and powerful model for error containment and fault tolerance; concurrency and message passing are fundamental to the language, and applications written in Erlang are often composed of hundreds or thousands of lightweight processes; context switching between Erlang processes is typically several orders of magnitude cheaper than switching between threads in other languages; its distribution mechanisms are transparent, so programs need not be aware that they are distributed; and the runtime system allows code in a running system to be updated without interrupting the program.
Given that there are things that Erlang is good at there are bound to be a few things that it's not so good at. The most common class of 'less suitable' problems is characterized by iterative performance being a prime requirement with constant-factors having a large effect on performance. Typical examples are image processing, signal processing and sorting large volumes of data.
I am firmly convinced that Erlang's history is a key ingredient to its success. I am not aware of any other language whose early development was so straightforward and pragmatic. For most of its life Erlang was developed inside Ericsson, originally for internal use only. Later on it was made available to the external world and eventually open sourced. The timeline of its development goes something like this.
1985 Ericsson identified some issues that existed with telecom languages at the time. To address these difficulties they started experiments with the programming of telecom applications using more than twenty different languages. These early experimenters came up with a few features that a useful system needed to supply. They realized that the target language must be a very high level symbolic language in order to achieve productivity gains. This new requirement vastly narrowed the language space and resulted in a very short list of languages. The languages included Lisp, Prolog, Parlog, etc.
1986 Further experiments within this narrowed list were performed. New results were generated from this round of experiments as well. They found that the theoretically ideal language also needed to contain primitives for concurrency and error recovery, and the execution model needed to exclude back-tracking. This ruled out two more of the contending languages, Lisp and Prolog. This ideal language also needed to have a granularity of concurrency such that there would be a one to one relationship between one asynchronous telephony process and one process in the language. This ruled out Parlog. At the end of this round of experiments they were left without any of the languages in the list they had started with. Being the pragmatic folks that they were, they decided to develop their own language with the desirable features of Lisp, Prolog and Parlog, but with superior concurrency and error recovery built into the language.
1987 The first experiments began with this nascent language which became Erlang.
1988 The ACS/Dunder project was started at Ericsson. This was a prototype implementation of PABX by developers external to the core Erlang developers.
1989 The ACS/Dunder project became a full fledged project with the reconstruction of about a tenth of the complete, production PABX called the MD-110 system. The preliminary results were very promising. In this early phase developer productivity was already ten times greater than during the development of the original system in the PLEX language. This reimplementation also pushed forward experiments directed at increasing the speed of Erlang.
1991 At this point the experiments directed at speeding up Erlang bore fruit. A fast implementation of Erlang was released to a growing user community. Erlang was also presented at an international telecom conference that year.
1992 During this year the user base started growing significantly. Erlang was ported to VxWorks, PC, Macintosh and other platforms. Three new, complete applications written in Erlang were presented at another conference. The first two production projects using Erlang were started.
1993 Distribution primitives were added to the language, which made it possible to run a homogeneous Erlang system on heterogeneous hardware. A corporate decision was made within Ericsson to begin selling and supporting Erlang externally. A new organizational structure was built up to maintain and support Erlang implementations and Erlang Tools. This resulted in the creation of Erlang Systems AB.
1996 OTP was formed into a separate product group within Erlang Systems AB. This represented the maturing of the OTP platform within Erlang. After nearly ten years of development the (non-Erlang) AXE/N project was closed down and pronounced a failure. This left a large hole in Ericsson's product line, and development of a new replacement product was started to fill it. It was decided that this replacement product would be written in Erlang.
1998 After two years the AXE/N replacement project, now called the AXD301, was delivered. Around this same time the CEO of Ericsson Radio became the CEO of Ericsson as a whole. This person had banned Erlang at Ericsson Radio, and though he never banned Erlang at Ericsson proper it became career suicide to propose new Erlang projects. This problem effectively killed opportunities for Erlang Systems AB to sell the language and support. The primary question potential customers asked was 'Who wants to use a language developed by Ericsson when Ericsson won't use the language itself?'. This just goes to show that corporate bureaucracy will be corporate bureaucracy no matter where you are. In any case, this turned into a bit of a blessing in disguise for the rest of us.
1998-99 In order to drive further development a decision was made within Erlang Systems AB to release Erlang as an Open Source project. This didn't imply an abandonment of Erlang by Ericsson or Erlang Systems AB. Erlang continued and continues to be supported by these two organizations. This decision was made wholly with the idea of spreading Erlang and removing the, somewhat negative, idea that it was a proprietary language. It has remained open source and supported to this day.
As you can see, Erlang didn't start out as Erlang at all. It started out as just a series of requirements backed up by experiments. A large number of experiments were done to find the language that matched those requirements. When no existing language was found Ericsson decided to create their own. Considering Ericsson's resources and the costs associated with development of their products, I think this was a very pragmatic decision. However, that conclusion is open to interpretation. In any case, after the initial development there was a constant back and forth dialog between the users and developers of the language as the language moved through its formative process. I think this fact alone is one of the reasons that Erlang is as good as it is today. Later on in its development, as Ericsson grew less resourceful, Erlang started to have political problems within the company. Even though Ericsson had several successful and profitable products in Erlang and other languages, the de-facto ban occurred. Fortunately by this time Erlang could and did stand on its own. The ban actually turned out to be fortunate for the rest of us because it led, pretty directly, to Erlang's eventual Open Sourcing.
Joe Armstrong, one of the original Erlang Developers and a productive member of the community, put together a number of tutorials that are very useful. It's worth going through these and playing with the code.
There are a couple of good editors to use with Erlang. The gold standard is the Erlang Emacs mode distributed as part of the Erlang distro. A much more up to date version is now available from the Erlware folks. You can get it here. If you go this route I suggest you also get Distel, written by Luke Gorrie. It's available from the good folks at google code. There are instructions included with both of these to get you up and going. For those of you more inclined to the IDE world, you may want to take a look at Erlide. This is a set of Eclipse plugins that add support for Erlang to Eclipse. It's still pretty beta, but it's very usable.
Learning Erlang is a fairly quick process. For an experienced developer it shouldn't take more than a few days before they can write nontrivial programs, about a week or two to feel really comfortable and a month or so before feeling ready to take on something big by themselves. It helps a lot to have someone who knows how to use Erlang around for some hand-holding.
Start off by going through the quick start part of the FAQ. Then go through the Erlang Course. You can skip the history part if you would like. I have gone over it in more detail here. Once you have done the course, play around with some of the examples. Then go read the long version of the getting started docs. This should put you on the road to being able to write some Erlang code. If you are one to worry about coding conventions then you may want to take a look at the programming rules. This has quite a number of useful and well thought out programming rules. One of the things that makes Erlang really interesting is the OTP System. If you really want to get to know something about Erlang then it makes sense to spend a bit of time learning OTP and its design principles. The OTP design principles documentation is a very good place to start.
Here is your chance to get our book Erlang and OTP in Action for half price on April 16th. Use code dotd0416au at www.manning.com/logan/
TL;DR
As I’ve mentioned before, Opa is a new web framework that introduces not only the framework itself but a whole new language. A lot has changed in Opa since I last posted about it. Now Opa has a Javascript-esque look and runs on Node.js. But it still has the amazing typing system that makes Opa a joy to code in.
The currently available Heroku buildpack for Opa only supports the old, pre-Node version of Opa. So I’ve created an all new buildpack, and here I will show both a bit of how I created that buildpack and how to use it to run your Opa apps on Heroku.
The first step was creating a tarball of Opa that would work on Heroku. For this I used the build tool vulcan. Vulcan is able to build software on Heroku in order to assure what is built will work on Heroku through your buildpacks.
vulcan build -v -s ./opalang/ -c "mkdir /app/mlstate-opa && yes '' | ./opa-1.0.7.x64.run" -p /app/mlstate-opa
This command is telling vulcan to build what is in the directory opalang with a command that creates the directory /app/mlstate-opa and then runs the Opa provided install script to unpack the system. This is much simpler than building Opa from source, but it is still necessary to use vulcan to create the tarball from the output of the install script to ensure paths are correct in the Opa generated scripts.
After this run, by vulcan’s default, we will have /tmp/opalang.tgz. I upload this to S3, so that our buildpack is able to retrieve it.
Since Opa now relies on Node.js, the new buildpack must install both Node.js and the opalang.tgz that was generated. To do this I simply copied from the Node.js buildpack.
If you look at the Opa buildpack you’ll see, as with any buildpack, it consists of three main scripts under ./bin/: compile, detect and release. There are three important parts for understanding how your Opa app must be changed to be supported by the buildpack.
First, the detect script relies on there being an opa.conf to detect that this is an Opa application. For now this is less important, since we will be specifying the buildpack to use to the heroku script. Second, in the compile script we rely on there being a Makefile in your application for building. There is no support for simply running opa to compile the code in your tree at this time. Third, since Opa relies on Node.js and Node modules from npm, you must provide a package.json file that the compile script uses to install the necessary modules.
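Since a missing file only shows up once the push reaches Heroku, a quick local check before pushing can help. The sketch below is not part of the buildpack; check_opa_app is a hypothetical helper that only looks for the three files named above.

```shell
#!/bin/sh
# check_opa_app DIR: report which of the files the buildpack relies on
# (opa.conf, Makefile, package.json) are missing from DIR.
# Returns non-zero if any of them is missing.
check_opa_app() {
    dir=${1:-.}
    missing=0
    for f in opa.conf Makefile package.json; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, from the root of the application:
#   check_opa_app . && git push heroku master
```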
To demonstrate this I converted Opa’s hello_chat example to work on Heroku, see it on Github here.
There are two necessary changes. One, add the Procfile. A Procfile defines the processes required for your application and how to run them. For hello_chat we have:
web: ./hello_chat.exe --http-port $PORT
This tells Heroku that our web process is run from the binary hello_chat.exe. We must pass the $PORT variable to the Opa binary so that it binds to the port that Heroku expects it to be listening on to route our traffic.
Lastly, a package.json file is added so that our buildpack’s compile script installs the necessary Node.js modules:
{
"name": "hello_chat",
"version": "0.0.1",
"dependencies": {
"mongodb" : "*",
"formidable" : "*",
"nodemailer" : "*",
"simplesmtp" : "*",
"imap" : "*"
},
"engines": {
"node": "0.8.7",
"npm": "1.1.x"
}
}
With these additions to hello_chat we are ready to create an Opa app on Heroku and push the code!
$ heroku create --stack cedar --buildpack https://github.com/tsloughter/heroku-buildpack-opa.git
$ git push heroku master
The output from the push will show Node.js and npm being installed, followed by Opa being unpacked and finally make being run against hello_chat. The web process in Procfile will then be run and the output will provide a link to go to our new application. I have the example running at http://mighty-garden-9304.herokuapp.com
Next time I’ll delve into database and other addon support in Heroku with Opa applications.
There is a great new Emacs plugin from Eric Merritt that, like FlyMake, builds your code and highlights any errors or warnings within Emacs but, unlike FlyMake, builds across the whole project. You can clone the mode from here: projmake-mode
After cloning the repo to your desired location add this bit to your dot emacs file, replacing <PATH> with the path to where you cloned the repo.
This Emacs code also uses add-hook to set projmake-mode to start when erlang-mode is loaded. Projmake by default knows how to handle rebar and Make based builds, so there is no setup after this, assuming your project is built this way.
Here is my Makefile for building Erlang projects with rebar, replace PROJECT with the name of your project:
Now you can load Emacs and a file from your project and if it is an Erlang file due to the add-hook function in our dot emacs file it will automatically load projmake-mode. You can add hooks for other modes or simply run M-x projmake-mode.
For more documentation and how to extend to other types of projects check out the documentation.
When writing RESTful interfaces in Erlang, JSON is the communication “language” of choice. To simplify the process of turning JSON into a model the backend can work with efficiently, I’ve created maru_models. This app decodes the JSON with jiffy and uses functions generated by a modified version of Ulf’s exprecs to create an Erlang record. The generated functions are created with type information from the record definition, and when a property is set for the record through these functions it is first passed to the convert function of maru_model_types to do any necessary processing.
I separated this application into a separate repo to simplify people trying the examples. But the real development will be done in the Maru main repo.
TLDR;
Copy and paste the following into your elisp erlang-mode configuration to get flymake working with Rebar projects.
(defun ebm-find-rebar-top-recr (dirname)
  (let* ((project-dir (locate-dominating-file dirname "rebar.config")))
    (if project-dir
        (let* ((parent-dir (file-name-directory (directory-file-name project-dir)))
               (top-project-dir (if (and parent-dir (not (string= parent-dir "/")))
                                    (ebm-find-rebar-top-recr parent-dir)
                                  nil)))
          (if top-project-dir
              top-project-dir
            project-dir))
      project-dir)))

(defun ebm-find-rebar-top ()
  (interactive)
  (let* ((dirname (file-name-directory (buffer-file-name)))
         (project-dir (ebm-find-rebar-top-recr dirname)))
    (if project-dir
        project-dir
      (erlang-flymake-get-app-dir))))

(defun ebm-directory-dirs (dir name)
  "Find all directories in DIR."
  (unless (file-directory-p dir)
    (error "Not a directory `%s'" dir))
  (let ((dir (directory-file-name dir))
        (dirs '())
        (files (directory-files dir nil nil t)))
    (dolist (file files)
      (unless (member file '("." ".."))
        (let ((absolute-path (expand-file-name (concat dir "/" file))))
          (when (file-directory-p absolute-path)
            (if (string= file name)
                (setq dirs (append (cons absolute-path
                                         (ebm-directory-dirs absolute-path name))
                                   dirs))
              (setq dirs (append
                          (ebm-directory-dirs absolute-path name)
                          dirs)))))))
    dirs))

(defun ebm-get-deps-code-path-dirs ()
  (ebm-directory-dirs (ebm-find-rebar-top) "ebin"))

(defun ebm-get-deps-include-dirs ()
  (ebm-directory-dirs (ebm-find-rebar-top) "include"))

(fset 'erlang-flymake-get-code-path-dirs 'ebm-get-deps-code-path-dirs)
(fset 'erlang-flymake-get-include-dirs 'ebm-get-deps-include-dirs)
Intro
It’s probably no great surprise to anyone that I dislike Rebar a lot. That said, there are times when I have no choice but to use it. This is always either because a company I am contracting for uses it, or an open source project I am contributing to uses it. When I am forced to use it there are a few things I don’t want to give up. Most important among these is Flymake for Erlang. The default setup for Flymake doesn’t work for Rebar projects because Flymake does not know where the code and include paths for dependencies are. Fortunately, we can fix this with a few lines of elisp.
Flymake For Erlang
First make sure you have Flymake for Erlang installed. It is easiest just to follow the directions available on the Erlang Website.
The Elisp Additions for Erlang Flymake
There are two defvars that point to functions that are used to search for the correct code paths and include paths respectively. We are going to replace those functions with our own functions. Both of these functions search upwards from the directory that contains the file visited by the current buffer, looking for the topmost ‘rebar.config’ in the directory path. They then use that as a base and search down the directory structure looking for either ‘ebin’ directories or ‘include’ directories.
There are two things to note here. The first is that you must have already run `get-deps` for rebar for this to work. The second is that if your project is truly huge, or you have way more dependencies than you probably need, this search could take a second or two. That is a second or two too long in an interactive compiler like Flymake. That said, the likelihood that you will run into this second problem is quite low.
Getting Started
The very first thing you want to do is ensure that you have required the erlang-flymake module. Most of what we do below depends on this.
(require 'erlang-flymake)
Finding the Top rebar.config
The second thing we want to do is look for the top rebar.config in the project. If a rebar project contains more than one OTP application it’s quite likely that it will contain more than one rebar.config. The very topmost `rebar.config` is the right one to serve as the root of our search. So we introduce a set of recursive functions to look for that top level dir.
(defun ebm-find-rebar-top-recr (dirname)
  (let* ((project-dir (locate-dominating-file dirname "rebar.config")))
    (if project-dir
        (let* ((parent-dir (file-name-directory (directory-file-name project-dir)))
               (top-project-dir (if (and parent-dir (not (string= parent-dir "/")))
                                    (ebm-find-rebar-top-recr parent-dir)
                                  nil)))
          (if top-project-dir
              top-project-dir
            project-dir))
      project-dir)))
ebm-find-rebar-top-recr will return either the topmost directory or nil. Our next function takes that result and does something useful with it.
(defun ebm-find-rebar-top ()
  (interactive)
  (let* ((dirname (file-name-directory (buffer-file-name)))
         (project-dir (ebm-find-rebar-top-recr dirname)))
    (if project-dir
        project-dir
      (erlang-flymake-get-app-dir))))
In this function, we get the directory containing the file pointed at by the current buffer. We then call our recr function. If it returns a directory we return that, if it returns nil however, we call the original erlang-flymake-get-app-dir function.
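To see which directory this resolves to for a given source directory without opening Emacs, the upward walk can be approximated in shell. This is a rough sketch rather than a port of the elisp: find_rebar_top is a hypothetical name, and unlike the elisp it simply fails when no rebar.config is found instead of falling back to erlang-flymake-get-app-dir.

```shell
#!/bin/sh
# find_rebar_top DIR: walk from DIR up towards /, and print the highest
# directory on the way that contains a rebar.config. Prints nothing and
# returns 1 if no rebar.config is found.
find_rebar_top() (
    dir=$(cd "$1" && pwd) || exit 1
    top=
    while [ "$dir" != "/" ]; do
        if [ -f "$dir/rebar.config" ]; then
            top=$dir
        fi
        dir=$(dirname "$dir")
    done
    if [ -n "$top" ]; then
        echo "$top"
    else
        exit 1
    fi
)

# Usage, from anywhere inside a project:
#   find_rebar_top "$(pwd)"
```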
At this point we should have our project root. Now it’s a simple matter of recursively searching down the directory tree looking for directories of a certain name. So we create a function that does just that: given a directory and a name, it will return a list of absolute paths for each subdirectory that matches the specified name.
(defun ebm-directory-dirs (dir name)
  "Find all directories in DIR."
  (unless (file-directory-p dir)
    (error "Not a directory `%s'" dir))
  (let ((dir (directory-file-name dir))
        (dirs '())
        (files (directory-files dir nil nil t)))
    (dolist (file files)
      (unless (member file '("." ".."))
        (let ((absolute-path (expand-file-name (concat dir "/" file))))
          (when (file-directory-p absolute-path)
            (if (string= file name)
                (setq dirs (append (cons absolute-path
                                         (ebm-directory-dirs absolute-path name))
                                   dirs))
              (setq dirs (append
                          (ebm-directory-dirs absolute-path name)
                          dirs)))))))
    dirs))
Now we write a couple of functions to replace the corresponding functions in `erlang-flymake`. The first looks for all `ebin` directories while the second looks for all `include` directories.
(defun ebm-get-deps-code-path-dirs ()
  (ebm-directory-dirs (ebm-find-rebar-top) "ebin"))

(defun ebm-get-deps-include-dirs ()
  (ebm-directory-dirs (ebm-find-rebar-top) "include"))
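For a quick check outside Emacs of what these two functions will pick up, the same downward search can be approximated with find. This is only a sketch in shell, not part of the elisp setup; list_named_dirs is a hypothetical helper name, and the root directory is passed in by hand.

```shell
#!/bin/sh
# list_named_dirs ROOT NAME: print every directory called NAME below ROOT,
# one absolute path per line, sorted so the output is stable.
list_named_dirs() (
    root=$(cd "$1" && pwd) || exit 1
    find "$root" -type d -name "$2" | sort
)

# Usage, from the project root after fetching dependencies:
#   list_named_dirs . ebin
#   list_named_dirs . include
```

If the list is empty or a dependency is missing from it, Flymake will not see that dependency either, which usually means the dependencies have not been fetched or built yet.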
Finally we replace the `erlang-flymake` versions of those functions with our implementations.
(fset 'erlang-flymake-get-code-path-dirs 'ebm-get-deps-code-path-dirs)
(fset 'erlang-flymake-get-include-dirs 'ebm-get-deps-include-dirs)
Conclusion
This approach is a bit of a hack: we basically use some heuristics to find a root and then just grab everything under it that looks remotely like a code or include directory. While it’s a bit hacky, it has the valuable upside that it’s flexible and robust.
Common Test is a well thought out integration testing framework for Erlang. If you
are not using it you probably should be. However, it has one fault. It
does not return a non-zero exit status to the caller when the tests
fail. This is a major oversight, and it makes it difficult to use as
part of a continuous integration scheme or in a make based build
system.
The long term fix is for the OTP folks to resolve the issue in the
ct_run command. To that end I have filed a bug report with the
Erlang folks. In the short term, though, we need this behaving
correctly. After much twiddling around with different solutions and
conversations on the erlang-questions list, this solution finally popped
out of a conversation with Lukas Larsson. Basically, we use the old
unix standby of awk.
ct_run -dir tests ... | awk "/FAILED/{exit 1;}/failed/{exit 1;}/SKIPPED/{exit 1;}"
Where ... is replaced with your additional options. It’s not the best
solution on the planet, but it is the simplest one that I found that
works consistently.
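One thing to be aware of is that the awk filter above does not print anything, so the test output is swallowed. A variant of the same idea that keeps the output visible is sketched below: awk prints every line, remembers whether one of the same three markers was seen, and sets the exit status at the end. The function name fail_on_ct_markers is made up for illustration.

```shell
#!/bin/sh
# fail_on_ct_markers: copy stdin to stdout, then exit 1 if any line
# contained FAILED, failed or SKIPPED (the same markers as above).
fail_on_ct_markers() {
    awk '/FAILED|failed|SKIPPED/ { bad = 1 } { print } END { exit (bad ? 1 : 0) }'
}

# Usage (the ... stands for your additional options, as above):
#   ct_run -dir tests ... | fail_on_ct_markers
```

Because the exit status is only decided in the END block, the whole run is read before the pipeline fails, so the full report still reaches the console or the CI log.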
Fred, of Learn You Some Erlang for Great Good, today posted on his blog about the problems around how rebar handles releases, Rebar Releases and Being Wrong. The problems he mentions, and a few others, are why, despite giving it a legitimate shot, I have found rebar unusable for a workflow that is efficient and stable while adhering to OTP standards.
I suggest first reading his post, if you already use rebar, and then continuing on with the rest of this.
I’ll start with an example that generates a project containing two applications, one of which depends on cowboy. Next, I’ll create a release (and in the process a deployable target system) to show the difference in how sinan handles this process.
TL;DR Sinan does OTP the right way, rebar does not.
First, you can download the latest version of sinan from this link. It is simply an executable escript, so run ‘chmod +x sinan’, put it in your PATH, and you are good to go.
Sinan provides a ‘gen’ command to create your project. I include the output of the steps I took to build this project. By default sinan assumes this is a multiple application project; if you answer “y” to the single application prompt instead, it will create a directory structure similar to rebar’s default structure, with a src/ directory instead of a lib/ directory.
$ sinan gen
Please specify your name
your name> Tristan Sloughter
Please specify your email address
your email> tristan@mashape.com
Please specify the copyright holder
copyright holder ("Tristan Sloughter")>
Please specify name of your project
project name> rel_example
Please specify version of your project
project version> 0.0.1
Please specify the ERTS version ("5.9")>
Is this a single application project ("n")>
Please specify the names of the OTP apps that will be developed under this project. One application to a line. Finish with a blank line.
app> app_1
app ("")> app_2
app ("")>
Would you like a build config? ("y")> y
Project was created, you should be good to go!
We now have a project named rel_example and can see the generated contents.
$ cd rel_example/
$ ls
config doc lib sinan.config
Before going further I add the line {include_erts, true}. to sinan.config so that a generated tarball of the release contains erts and can be booted on a machine without Erlang installed.
$ cat sinan.config
{project_name, rel_example}.
{project_vsn, "0.0.1"}.
{build_dir, "_build"}.
{ignore_dirs, ["_", "."]}.
{ignore_apps, []}.
{include_erts, true}.
A tree structure view of the generated project is below:
.
├── config
│   └── sys.config
├── doc
├── lib
│   ├── app_1
│   │   ├── doc
│   │   ├── ebin
│   │   │   └── overview.edoc
│   │   ├── include
│   │   └── src
│   │       ├── app_1_app.erl
│   │       ├── app_1.app.src
│   │       └── app_1_sup.erl
│   └── app_2
│       ├── doc
│       ├── ebin
│       │   └── overview.edoc
│       ├── include
│       └── src
│           ├── app_2_app.erl
│           ├── app_2.app.src
│           └── app_2_sup.erl
└── sinan.config
You’ll see we have a lib directory with two applications, each containing its source files under a src directory. Now, in order to boot the release we’ll create, we need to remove a couple of things from each supervisor. Instead of creating something for them to supervise, just remove the variable AChild and replace [AChild] with [].
Next, so we have a third party dependency in the example, add cowboy to the applications list in lib/app_1/src/app_1.app.src:
{applications, [kernel, stdlib, cowboy]},
Sinan provides a depends command to show the dependencies of the project and where they are located:
$ sinan depends -v
starting: depends
Using the following lib directories to show resolved dependencies and where it found them:
    /home/tristan/.kerl/installs/r15b/lib
    /home/tristan/Devel/rel_example/_build/rel_example/lib
compile time dependencies:
runtime dependencies:
    kernel 2.15 : /home/tristan/.kerl/installs/r15b/lib/kernel-2.15
    stdlib 1.18 : /home/tristan/.kerl/installs/r15b/lib/stdlib-1.18
    cowboy 0.5.0 : /home/tristan/.kerl/installs/r15b/lib/cowboy-0.5.0
project applications:
    app_1 0.1.0 : /home/tristan/Devel/rel_example/_build/rel_example/lib/app_1-0.1.0
    app_2 0.1.0 : /home/tristan/Devel/rel_example/_build/rel_example/lib/app_2-0.1.0
Now let’s build a release and target system.
$ sinan dist
After running the dist command we have a _build directory with the following structure. I removed the files and directories under each app to shorten the listing.
_build/
├── rel_example
│   ├── bin
│   │   ├── rel_example
│   │   └── rel_example-0.0.1
│   ├── erts-5.9
│   │   ├──
│   ├── lib
│   │   ├── app_1-0.1.0
│   │   │   ├──
│   │   ├── app_2-0.1.0
│   │   │   ├──
│   │   ├── cowboy-0.5.0
│   │   │   ├──
│   │   ├── kernel-2.15
│   │   │   ├──
│   │   └── stdlib-1.18
│   │       ├──
│   └── releases
│       └── 0.0.1
│           ├── rel_example.boot
│           ├── rel_example.rel
│           ├── rel_example.script
│           └── sys.config
└── tar
    └── rel_example-0.0.1.tar.gz
Sinan has created a lib directory containing all necessary applications for our release as well as the needed files for booting the release. Additionally the dist command creates a tar.gz for easy deployment. But if we simply want to run our release where we are we can:
$ _build/rel_example/bin/rel_example
Erlang R15B (erts-5.9) [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9 (abort with ^G)
1>
This is only the tip of the iceberg of what sinan is capable of. I can’t go into all of it here but I’ll mention that you are able to define multiple releases for a project to generate and which of your project apps to include in each. Additionally you are able to provide a custom rel file if you require tweaks.
The important part to take away from this post is the structure of what you are working with when using sinan and how it is based on OTP standards, both for the source you work on and the results of the build process under _build/.
There are a lot of Erlang web frameworks out there today. Not all follow the MVC pattern (see Nitrogen), but I think all of them are addressing the problem the wrong way. I recently gave a presentation, slides here and the code for this example here, describing my preferred method for using Erlang for web development and why I think it is the best way to go. In this post, I’ll go into more detail on how to build the Erlang backend for the TodoMVC clone I did with Batman.js. I will not spend time on Batman.js but instead only give a quick list of reasons I prefer it to other JavaScript frameworks.
Batman.js advantages:
Cowboy is a newer Erlang web server that provides a REST handler based on Webmachine. Both of these are perfect for developing a RESTful API, because they follow the HTTP standard exactly and when you are building an API based on HTTP, being able to properly reason about how the logic of the application maps to the protocol eases development and eases getting REST “right”.
Any non-dynamic content should be served by Nginx since there is no logic needed and it is something Nginx is great at, so why have Erlang do it? The snippet below configures Nginx to listen on port 80 and serve files from bcmvc’s priv directory. Each request is checked to see if it is a POST or any other method with a JSON request type. If either of those is true, the request is proxied on to a server listening on port 8080, in our case the Cowboy server.
server {
    listen 80;
    server_name localhost;
    location / {
        root <PATH TO CLONE>/bcmvc/lib/bcmvc_web/priv/;
        if ($request_method ~* POST) {
            proxy_pass http://localhost:8080;
        }
        if ($http_accept ~* application/json) {
            proxy_pass http://localhost:8080;
        }
    }
}
Batman.js knows what endpoints to use and what data to send based on the name of the model we created and the encoded variables, code here. This results in the following API:
| Method | Endpoint | Data | Return |
|---|---|---|---|
| POST | /todos | {todo : {body:"bane wants to meet, not worried",isDone:false}} | |
| PUT | /todos/33e93b30-2371-4071-afc5-2d48226d5dba | {todo : {body:"bane wants to meet, not worried",isDone:false}} | |
| GET | /todos | | [{todo : {id:"33e93b30-2371-4071-afc5-2d48226d5dba", body:"bane wants to meet, not worried", isDone:false}}] |
| DELETE | /todos/33e93b30-2371-4071-afc5-2d48226d5dba | | |
Dispatch rules are matched by Cowboy to know what handler to send the request to. Here we have two rules. One that matches just the URL /todos and one that matches the URL with an additional element which will be associated with the atom todo. Both requests will be sent to the module bcmvc_todo_handler.
Dispatch = [{'_', [{[<<"todos">>], bcmvc_todo_handler, []},
{[<<"todos">>, todo], bcmvc_todo_handler, []}]}],
Cowboy provides a useful function, child_spec, for creating a child specification to use in our supervisor. The child spec here tells Cowboy we want a TCP listener on port 8080 that handles the HTTP protocol. We additionally provide our dispatch list for it to match against and pass on requests.
ChildSpec = cowboy:child_spec(bcmvc_cowboy, 100, cowboy_tcp_transport, [{port, 8080}], cowboy_http_protocol, [{dispatch, Dispatch}]),
Now that we have a server on port 8080 that knows to send certain requests to our todo handler, we can build the module. The first required function to export is init/3. This function lets Cowboy know we have a REST protocol, which is how it knows what functions to call (some have defaults and some exist in our module) to handle the request.
init(_Transport, _Req, _Opts) -> {upgrade, protocol, cowboy_http_rest}.
Knowing that this is a REST handler, Cowboy will pass the request on to allowed_methods/2 to find out if our handler is able to handle this method. Next, the content types accepted and provided by the handler are checked against the incoming request. If any of these checks fail, the expected HTTP response status code is returned: 405 for allowed_methods, 415 for content_types_accepted and 406 for content_types_provided.
allowed_methods(Req, State) ->
    {['HEAD', 'GET', 'PUT', 'POST', 'DELETE'], Req, State}.

content_types_accepted(Req, State) ->
    {[{{<<"application">>, <<"json">>, []}, put_json}], Req, State}.

content_types_provided(Req, State) ->
    {[{{<<"application">>, <<"json">>, []}, get_json}], Req, State}.
Now the request is sent to the function that handles the HTTP method type of the request.
For a POST, a request to create a new todo item, the function process_post/2 is sent the request. Here we retrieve the body, a JSON object, from the request, convert it to a record and save the model. We’ll see how this record conversion is done when we look at the model module. To inform the frontend of the id of our new resource we set the location header to be the path with the id.
process_post(Req, State) ->
    {ok, Body, Req1} = cowboy_http_req:body(Req),
    Todo = bcmvc_model_todo:to_record(Body),
    bcmvc_model_todo:save(Todo),
    NewId = bcmvc_model_todo:get(id, Todo),
    {ok, Req2} = cowboy_http_req:set_resp_header(
                   <<"Location">>, <<"/todos/", NewId/binary>>, Req1),
    {true, Req2, State}.
For this handler we expect PUT for an update to an object. That is what Batman.js does, though a PATCH would make more sense. For a PUT the URL contains the id of the todo item to be updated, which is retrieved with the binding/2 function. The todo record is created the same way as in process_post/2, but then this id is set on the model and the update/1 function is used to save it to the database.
put_json(Req, State) ->
    {ok, Body, Req1} = cowboy_http_req:body(Req),
    {TodoId, Req2} = cowboy_http_req:binding(todo, Req1),
    Todo = bcmvc_model_todo:to_record(Body),
    Todo2 = bcmvc_model_todo:set([{id, TodoId}], Todo),
    bcmvc_model_todo:update(Todo2),
    {true, Req2, State}.
For a GET request, all todo items are retrieved from the model module; this application does not handle requests for a single todo item. Each item is passed to the model’s to_json/1 function, and the resulting JSON values are joined with commas into a single binary and placed between brackets, so the Batman.js frontend receives a proper JSON list of JSON objects.
get_json(Req, State) ->
    JsonModels = lists:foldr(fun(X, <<"">>) ->
                                     X;
                                (X, Acc) ->
                                     <<Acc/binary, ",", X/binary>>
                             end, <<"">>, [bcmvc_model_todo:to_json(Model) || Model <- bcmvc_model_todo:all()]),
    {<<"[", JsonModels/binary, "]">>, Req, State}.
And lastly, DELETE. Like in PUT the todo item’s id is retrieved from the bindings created based on the dispatch rules and this is passed to the model’s delete function.
delete_resource(Req, State) ->
    {TodoId, Req1} = cowboy_http_req:binding(todo, Req),
    bcmvc_model_todo:delete(TodoId),
    {true, Req1, State}.
Models are represented as records and must provide serialization functions to go between JSON and a record. Each model uses a parse transform that creates functions for creating and updating the record. The transform is a modified version of exprecs from Ulf Wiger that also uses the type definitions in the record to ensure that a field is set to the correct type. For example, in the todo model isDone is a boolean, so when the model is created the boolean convert function will be matched to convert the string representation to an atom:
convert(boolean, <<"false">>) ->
    false;
convert(boolean, <<"true">>) ->
    true;
So the key pieces of the bcmvc_model_todo are:
-compile({parse_transform, bcmvc_model_transform}).

-record(bcmvc_model_todo, {id = ossp_uuid:make(v1, text) :: string(),
                           body :: binary(),
                           isDone :: boolean()}).

to_json(Record) ->
    ?record_to_json(?MODULE, Record).

to_record(JSON) ->
    ?json_to_record(?MODULE, JSON).
The ?record_to_json and ?json_to_record macros are defined in jsonerl.hrl. These macros are generic and work for any record that is typed and uses the model transform.
Clearly, much of what the resource handler and model do is generic and can be abstracted out so that implementing new models and resources can be even simpler. This is the goal of my project Maru. Currently it is based on Webmachine but is now being converted to Cowboy.
In the end, using Cowboy for building a RESTful interface for your application allows you to build interfaces for the frontend entirely separated from backend development, and if you want multiple interfaces (like native mobile and web), they both talk directly to the same backend. Also, from the beginning you have the option to open up your application with an API for other developers to take your application new places, and, shameless plug here, add your API to Mashape to spread your new app!
I’ll have a complete walk-through of using Cowboy and Batman.js to build the TodoMVC clone in a few days. For now I have the slides from my talk at the Chicago Erlang User Group:
Chicago Erlang User Group April, 4th 2012
I couldn’t get iframe embedding to work with WordPress… So if anyone knows what is up with that, please comment.
How to organize Erlang/OTP releases over on my personal blog. Worth reading if you are in the process of figuring out how to manage Erlang in your organization.
Eric is a veteran entrepreneur, author and public speaker. He is an expert in the architecture, development and deployment of large-scale distributed systems on heterogeneous hardware, and the languages and platforms required to support them. His experience spans from vertically scaled manufacturing and billing systems on IBM mainframe and midrange hardware for companies like Sysco, Inc, to distributed build systems and massive fleet deployment tools on large fleets of beige boxes for companies like Amazon.com, and high-frequency trading and financial exchange systems on custom hardware for leading private brokerages. Eric is also co-author of the popular book “Erlang and OTP in Action”.