Monday 18 December 2017

Using public Docker Hub for private Golang project images


Ever wondered if you really have to either pay for a private Docker Hub repository or run your own private Docker Registry in order to keep your Docker images private? Might there be a way to use a public repository as a private one? Yes, there is - read on!

To answer this question, let's take a step back and analyse the main reasons for keeping your Docker images private. In my experience, the reasons usually boil down to one of the following:
  1. Restricting who can pull the image containing the binaries and the configuration of a closed source project.
  2. Hiding confidential/sensitive information such as passwords, tokens and private keys contained in the images.


Problem

In order to pull an image from a private Docker Hub Repository or a private Docker Registry, a Docker client needs to provide valid credentials when pulling the image, either in the form of username/password or as a valid authentication token. Assuming that the credentials are kept private, access to the image is restricted to the people/systems that are in possession of those credentials.

The only problem with this solution is that it costs money. Docker Hub provides free hosting for an unlimited number of images, as long as they're public; but for private images it charges $12 to $16.80 per image per year, depending on the plan. This might be acceptable if you're running a business that generates revenue, but if you would like to store the content of your private research project on Docker Hub, then you might feel that the price is relatively steep.

What are your options? You could run your own Docker Registry. But guess what? That registry has to run somewhere. You'd have to take care of backups, monitoring etc. You will end up paying even more, both in money and time.

Infrastructure-free solution

But wait, before you admit defeat and give your credit card details to Docker Hub, let's see if we can achieve our goals using only public images hosted on Docker Hub.

The obvious solution is to encrypt the contents of the files in the Docker image using some kind of symmetric encryption. Everyone can read the encrypted image, but no one can access the data without knowing the secret key.

In theory this works, but in practice there are a few obstacles to overcome. First of all, you have to modify your project to decrypt the data when reading it. Secondly, at least one binary in the image must be a runnable ELF executable, so that it can be used as the init command for the Docker container.
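As an illustration of the first obstacle, here is a minimal sketch of what "decrypt when reading" could look like in Go, using only the standard library. The function names (deriveKey, encrypt, decrypt) are made up for this example, and hashing the passphrase with SHA-256 is an assumption made to keep the sketch short; a real setup would more likely use a proper password-based key derivation function.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
)

// deriveKey turns a passphrase into a 32-byte AES-256 key.
// Assumption: a plain SHA-256 hash, chosen only to keep the sketch short.
func deriveKey(passphrase string) []byte {
	sum := sha256.Sum256([]byte(passphrase))
	return sum[:]
}

// newGCM builds an AES-GCM cipher for the given key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext with AES-GCM and prepends the random nonce.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and opens the ciphertext.
// It returns an error if the key is wrong or the data was tampered with.
func decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	sealed, err := encrypt(deriveKey("abc"), []byte("super-secret-text"))
	if err != nil {
		panic(err)
	}

	plain, err := decrypt(deriveKey("abc"), sealed)
	fmt.Println(string(plain), err)

	_, err = decrypt(deriveKey("wrong"), sealed)
	fmt.Println("wrong key rejected:", err != nil)
}
```

Every file shipped in the image would have to go through something like decrypt before use, which is exactly the kind of project-wide change that makes this approach awkward.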

Enter Golang

If you happen to be a Golang developer, both of these problems can be reduced to one. If the files that are distributed together with the compiled binary are small enough to fit into the heap, one can use bindata and assetfs to generate Golang source that is compiled into the binary and accessed through a filesystem interface. After doing that, the public image contains only one, self-contained binary. Now the only remaining problem is how to make sure that this one binary cannot be run and/or inspected by anyone who does not have the secret key.

Enter Midgetpack

Fortunately for our endeavour, there is already an open source project that can encrypt an ELF binary with a symmetric key: Midgetpack. In its original version, it prompts the user for the password on a TTY, which is not very useful when running the binary in Docker. In order to pass the password as an environment variable, I have modified Midgetpack slightly. I have also created a Docker image containing the modified Midgetpack binary that can be used as part of a multi-stage Docker build.

Putting everything together

Now that we have all the moving parts, we can put them all together!
For demonstration purposes, we'll use the following Golang source:

package main

import "fmt"

func main() {
 fmt.Println("super-secret-text")
}

To compile and encrypt the binary file we can use the following Dockerfile:

FROM golang:1.9.2 as build
RUN mkdir -p /go/src/github.com/draganm/midgetpack-demo
WORKDIR /go/src/github.com/draganm/midgetpack-demo
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go install .

FROM dmilhdef/midgetpack:v2.0 as encrypt
ARG KEY
COPY --from=build /go/bin/midgetpack-demo /
RUN /midgetpack  -p -P $KEY /midgetpack-demo -o /midgetpack-demo-encrypted

FROM alpine:3.6
COPY --from=encrypt /midgetpack-demo-encrypted /midgetpack-demo
CMD ["/midgetpack-demo"]

In order to build the image we need to pass the secret key as a build argument to docker build:

docker build --build-arg KEY=abc -t encrypted .

Now we can start the container, passing the same secret key in the BP environment variable:

docker run -e BP=abc encrypted

If we start the container using the wrong key, it will fail:

docker run -e BP=wrong encrypted

And we can check whether the plain text is visible in the binary (if the encryption worked, grep should find nothing):

docker run encrypted strings /midgetpack-demo | grep super-secret-text
Conclusion

It is possible to use publicly accessible Docker images to distribute closed source projects, as long as you're ready to accept some limitations and jump through a few hoops. This solution is not limited to Golang. As long as the image consists of ELF binaries, Midgetpack can be used to encrypt them. A very similar approach could be used with Java/Clojure. In that case, the class files would have to be encrypted and a custom class loader would have to be implemented to decrypt them.

Saturday 18 August 2012

Itching a scratch: Thread safe LRU Cache in Ruby

Background:
I'm a bit of a newbie to the Ruby world. Actually, I started learning Ruby only a few months ago as a kind of hobby. Since then I have been very impressed by the elegance, simplicity and productivity of Ruby.
Given that I'm a hard-core Java developer by day, I was very interested to find out how Ruby handles concurrency. After being a bit disappointed by the existence of the GIL (http://ablogaboutcode.com/2012/02/06/the-ruby-global-interpreter-lock/) in MRI, I started exploring alternative Ruby implementations - both JRuby and RBX seem to be much more concurrency friendly.

Fast forward to today:
I've been implementing a kind of storage system in Ruby. In order to improve the performance of the storage system, I've been caching data in an LRU cache. An LRU cache is not really a new idea, so I thought there must be plenty of implementations floating around. Indeed, there are quite a few open source LRU cache implementations - just to name a few:

https://github.com/kindkid/lrucache/tree/master/lib
http://cvs.savannah.gnu.org/viewvc/ruby-cache/?root=pupa
https://github.com/ahoward/lru_cache
https://github.com/jmettraux/rufus-lru
https://github.com/dcarney/lru_cache

Most of the features (such as cache timeouts) offered by many of them were not important for my task. Astonishingly, almost none of them seem to care about concurrent access to the cache. After a bit of googling, I did manage to find an implementation that does cover concurrency:

https://github.com/shadabahmed/lru-cache

At first glance it did look fine, but after going through the source code I spotted a few places with possible race conditions, especially when run in JRuby: none of the read operations on the Hash seem to be synchronized with a Mutex - this can go badly wrong if JRuby's Hash does not handle concurrent access internally. To prove my theory I've written the following RSpec example, which demonstrates the problem:

require 'lru'

module LRUCache
  describe Cache do

    describe :get do

      it "should be thread-safe" do

        cache=Cache.new 2
        threads=(0..200).map do
          Thread.new do
            (0..1000).each do |value|
              cache.put(rand(10),rand(2000))
              cache.get(rand(10))
            end
          end
        end

        threads.each{|thread| thread.join}

      end

    end

  end
end

Running the spec with JRuby 1.6.7.2 produced the following:

$ rspec
F

Failures:

  1) LRUCache::Cache get should be thread-safe
     Failure/Error: cache.put(rand(10),rand(2000))
     NoMethodError:
       undefined method `key' for nil:NilClass
     # /home/milic/git/lru-cache/lru.rb:19:in `put'
     # ./spec/lrucache_spec.rb:17:in `LRUCache'
     # ./spec/lrucache_spec.rb:16:in `LRUCache'

Finished in 6.4 seconds
1 example, 1 failure

Failed examples:

rspec ./spec/lrucache_spec.rb:11 # LRUCache::Cache get should be thread-safe

This provided me with enough motivation to implement my own LRU cache, one that behaves properly under heavy load and copes well when multiple threads are trying to produce a value for the same key. The result can be found here: https://github.com/draganm/threadsafe-lru

The basic principle is simple: the internal state of the cache is protected by one Mutex. Values are held in thread-safe Future objects (the Node class in the source code) that synchronize the creation of values (evaluating the code blocks provided to the get method). This design should allow minimal contention between threads accessing different values (the internal state Mutex is held only briefly) and synchronization of threads accessing the same value, so that the value is produced only once.
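To make the principle concrete, here is a minimal sketch of the same idea. It is written in Go, like the demo program earlier on this page, rather than Ruby, and it is not the code from threadsafe-lru: the names Cache, NewCache, Get and node are chosen for this example, and sync.Once stands in for the Future.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
	"sync/atomic"
)

// node plays the role of the Future: its value is produced at most once.
type node struct {
	key   int
	once  sync.Once
	value int
}

// Cache is a fixed-size LRU cache whose bookkeeping is guarded by one mutex.
type Cache struct {
	mu    sync.Mutex
	max   int
	order *list.List // front = most recently used
	items map[int]*list.Element
}

func NewCache(max int) *Cache {
	return &Cache{max: max, order: list.New(), items: make(map[int]*list.Element)}
}

// Get returns the value for key, calling produce only if no value has been
// produced for the currently cached entry.
func (c *Cache) Get(key int, produce func() int) int {
	// Short critical section: only the LRU bookkeeping happens under the lock.
	c.mu.Lock()
	el, ok := c.items[key]
	if ok {
		c.order.MoveToFront(el)
	} else {
		el = c.order.PushFront(&node{key: key})
		c.items[key] = el
		if c.order.Len() > c.max {
			oldest := c.order.Back()
			delete(c.items, oldest.Value.(*node).key)
			c.order.Remove(oldest)
		}
	}
	n := el.Value.(*node)
	c.mu.Unlock()

	// Outside the lock: threads asking for the same key wait here for a
	// single producer, while threads asking for other keys are not blocked.
	n.once.Do(func() { n.value = produce() })
	return n.value
}

func main() {
	cache := NewCache(2)
	var calls int32
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Get(1, func() int {
				atomic.AddInt32(&calls, 1)
				return 42
			})
		}()
	}
	wg.Wait()
	fmt.Println("producer calls:", atomic.LoadInt32(&calls))
}
```

The point of releasing the mutex before producing the value is that a slow producer for one key does not hold up lookups for other keys.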



Tuesday 12 July 2011

Xtext 2.0 Mixin How-To

Last week I was experimenting with Xtext 2.0 and wanted to have a Mixin of two DSLs (one DSL referencing model elements of another DSL). I did something similar about 8 months ago using Xtext 1.0, and of course by now I've completely forgotten the gory details and the frustration involved in figuring out how to do it.

I was hoping that by now there would be some "how-to" available online, but a Google search didn't come up with anything useful. Hence, here is a short description of how a Mixin can be done.

So, what is a Mixin? Assume that you have two Xtext grammars: A and B. As long as A and B are independent of each other, you don't need Mixins. Now, let's assume that grammar A references elements that are described in grammar B or simply extends grammar B. In that case you need to tell grammar A about grammar B.

If you consult the Xtext documentation you won't get very far. To correct that, here is what you need to do:

  1. Create Xtext projects for both grammar A and grammar B (you probably already did this)
  2. Add a plug-in dependency from A's Xtext project on B's Xtext project
  3. Edit the A.xtext file to include the following statement:
    import "platform:/resource/B/src-gen/<B's package>/B.ecore" as B
    (This must be located before the generate statement!)
  4. Edit the GenerateA.mwe2 file to add the reference to B's genmodel by inserting:
    referencedGenModels = "classpath:/<B's package>/B.genmodel" into the ecore.EcoreGeneratorFragment fragment.
  5. Regenerate A's Xtext project by running GenerateA.mwe2

And there it is ... now you can reference B's model elements within grammar A.