A Fresh Developer’s take on Cloud Foundry

Why the third platform is the new way to iterate

Coming to the Dojo a little over two months ago for my first day of work at my first job out of college, I was struck by how much I still had to learn about using the cloud platform as a development and deployment environment. After the first few weeks, we started developing a brand new application for internal use. There was no existing infrastructure, but there was a clear idea: a SQL database, and a web portal for the UI. I was thrilled! I had worked on my fair share of web applications, so this would be a cakewalk.

Great! What were the IPs? What was the endpoint? What was the downtime for our environment on updates? What was our server technology? They said it was up to me. I began thinking of asking IT for a VM to spin up our database, writing scripts and cron jobs to back up the data, registering a DNS name for our website, and requesting a duplicate set of resources for a test environment: the technical aches and pains that are the first steps of building a web application. I then asked what seemed like the obvious question of who to go to for these resources, and I was told we would do it all ourselves. Wait, I thought, we were supposed to deploy and maintain the VMs, set up database users, install the applications, and handle backups? That was easily another two or three days in the short term, and days of complications in the long term. Or so I thought.

Except… I hadn’t yet learned that we were using Cloud Foundry, or the benefits that come with it. All of the pain points I used to face in setting up a functioning web app were moot. Instead of spending time configuring and maintaining environments, the Dojo, as a team, has put together a fully functioning web application with a production environment, architecture, an extensive automated testing harness, and automated deployment. All as painlessly as we might use GitHub.

Removing Development Roadblocks

When I think of a web application, a typical architecture comes to mind.


A typical developer headache for a web application

There are plenty of steps here for our team, as developers, to maintain and care about. But with Cloud Foundry, we can remove many of these concerns.

Let’s talk about databases first. Using the service brokers provided through CF, we can simply bind a SQL service to our application. For extra sweetness, we can use a buildpack that understands service-provided databases: at runtime it connects our application’s backend to the bound SQL database, with no hand-managed environment variables necessary.
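As a rough illustration, binding a database looks something like the following; the service name (p-mysql), plan, and app names are placeholders that depend on what your marketplace actually offers:

  # See what the platform's marketplace offers
  cf marketplace

  # Create a database instance and bind it to the app
  cf create-service p-mysql 512mb my-app-db
  cf bind-service my-web-app my-app-db

  # Restage so the app (and buildpack) pick up the binding from VCAP_SERVICES
  cf restage my-web-app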

Servers and UI? No problem. Pushing an application to Cloud Foundry takes about as many commands as pushing to git, as shown below. We can also update and configure the application directly through a UI included in our environment, which means changing something we forgot to put in our three or four deployment commands doesn’t warrant another deployment. As far as firewalls go, using the proper security groups to avoid exposing sensitive endpoints takes a few minutes behind the scenes of a UI, or a couple of commands in a terminal.
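A minimal sketch of what that looks like from a terminal; the app and security group names are placeholders, and asg.json would hold rules in the application security group format that CF documents:

  # Push the app; CF stages it, runs it, and gives it a route
  cf push my-web-app

  # Control outbound traffic with an application security group
  cf create-security-group internal-services asg.json
  cf bind-running-security-group internal-services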

Alright, so our development roadblocks are covered. What does all this mean for our Customers?

Rapid Integration and Feedback Loops

With our Test-Driven Development and Continuous Integration mantras, we can rapidly iterate on our products. With Cloud Foundry, those rapid iterations allow us to take in customer feedback. Because the real product runs on a platform as quick to set up and as easily accessible as CF, we can respond to our customers’ concerns much more efficiently.

Giving our customers a way to interact directly with our product is helpful not only for them, but also for us, for a variety of reasons. Firstly, we’re able to get valuable feedback from them in a timely fashion, not after a feature has been done and untouched for two months (the less we have to touch old code, the better). Additionally, we can test against what our real runtime environment will be like for our production-level product, which means faster satisfaction for customers.


Our development cycle. Made simple with less setup behind the scenes

Finally, we don’t have to worry about maintaining servers and VMs that may go down. Cloud Foundry can sense when an app crashes and restart it for us. Same database, same DNS, and limited downtime. It doesn’t get much easier than that.

Final thoughts

Using Cloud Foundry also simplifies our development process by surrounding our application with a consistent API and a well-documented environment. When we rotate across projects, people coming into our web application rely on the same knowledge and practices they might use when deploying our Minecraft Docker containers.

It’s for the reasons in this article, and I’m sure many more to come, that the cloud platform and Cloud Foundry have made development easier for us. They allow us to create better products, faster. We can get immediate feedback from customers and get stories accepted sooner. For most developers, that’s a no-brainer.

Multi-Cloud MineCraft!

Minecraft + Docker + CloudFoundry + vSphere + RackHD + ScaleIO = Awesomeness

Peter Blum

If you didn’t know, VMworld begins next week! Here at the Dojo we thought we would show you a cool project we’ve been working on. MINECRAFT in the cloud!!

With all the clouds out there, both public and private, we wanted to bridge the gap between them. We took a public vSphere cloud and a private bare-metal cloud and layered Cloud Foundry on top. Then, with a simple cf push, we spin up a Minecraft server from a Docker image. Hmm, I don’t know if I could put more buzzwords into a sentence if I tried… anyhow, check it out!

Big shout-out to VMware, RackHD, RexRay, libStorage, and of course Cloud Foundry for helping us make it possible. Here is a link to our GitHub repo (https://github.com/EMC-Dojo/CloudFoundry-MineCraft), where you can find the Docker image that you can run inside Cloud Foundry for your own private Minecraft server.
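If you want to try it yourself, the push boils down to something like the following sketch; the app name, memory size, and image reference are placeholders, so check the repo’s README for the exact steps:

  # Docker support must be enabled by an admin first
  cf enable-feature-flag diego_docker

  # Push the Minecraft server straight from a Docker image
  cf push minecraft -o <your-minecraft-docker-image> -m 2G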

Cloud Foundry. Open Source. The Way. EMC [⛩] Dojo.

Using Docker Container in Cloud Foundry


As we all know, we can push source code to CF directly, and CF will compile it and create a container to run our application. Life is so great with CF.

But sometimes, for one reason or another (our app needs a special setup, or we want to run it on different platforms or infrastructures), we may already have a preconfigured container for our app. This won’t block our way to CF at all. This post will show you how to push Docker images to CF.

Enable docker feature for CF

We can turn on docker support with the following cf command

  cf enable-feature-flag diego_docker

We can also turn it off by

  cf disable-feature-flag diego_docker

Push a Docker image to CF

  cf push cf-docker -o golang/alpine

Unlike the normal way, CF won’t try to build our code; it just runs the image we specify. CF assumes that you have already put everything you need into your Docker image, so we have to rebuild the Docker image every time we push a change to our repository.

We also need to tell CF how to start our app inside the image by specifying a start command. We can either pass it as an argument to cf push or put it into manifest.yml, as below.

---
applications:
- name: cf-docker
  command: git clone https://github.com/kaleo211/cf-docker && cd cf-docker && mkdir -p app/tmp && go run main.go

In this example, we are using an official Docker image from Docker Hub. In the start command, we clone our demo repo from GitHub, create a working directory, and run our code.
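Equivalently, the start command can be passed on the command line instead of through the manifest; this sketch mirrors the manifest above:

  cf push cf-docker -o golang/alpine \
    -c "git clone https://github.com/kaleo211/cf-docker && cd cf-docker && mkdir -p app/tmp && go run main.go"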

Update Diego with private docker registry

If you are in the EMC network, you may not be able to use Docker Hub due to certificate issues. In this case, you need to set up a private Docker registry. The registry needs to be V2 for now. You also have to redeploy your CF or Diego deployment with the changes shown below.

properties:
  garden:
    insecure_docker_registry_list:
    - 12.34.56.78:9000
  capi:
    stager:
      insecure_docker_registry_list:
      - 12.34.56.78:9000

Replace 12.34.56.78:9000 with your own Docker registry IP and port.

Then you need to create a security group so that staging containers can reach your private Docker registry. You can put the definition of this security group into docker.json as shown below:

[
    {
        "destination": "12.34.56.78",
        "ports": "9000",
        "protocol": "tcp"
    }
]

And run

  cf create-security-group docker docker.json
  cf bind-staging-security-group docker

Now you can push to CF again with

  cf push -o 12.34.56.78:9000/your-image

Retrospectives Rock!

Our way of ensuring continuous improvement: personally, as a team, and in the industry

There is great philosophical and practical validity behind the idea that empathetic reflection positively affects people’s lives and interactions with others. Throughout history, people have shaped their actions and reactions around this idea, both subconsciously and consciously. A famous and often-used example is the Examen as originally defined by St. Ignatius: through thoughts, words, and deeds, reflecting on where you have been (and what brought you elation, sadness, stress, confusion, and so on), where you are, and aligning these things with where you want to go; in short, continuous improvement of the self.

Here at the EMC Dojo, as our name literally implies, we practice “the way.” A large portion of this form of enlightenment is no doubt found in reflection, practiced through the Retrospective, an exercise our team knows as Retro. Like the Examen above, it is a form of continuous improvement of our methodology. We use the time to align everyone’s perspective and to ensure we are on the same page in terms of action items and overall goals.

Retro is done every Friday without fail. It is about an hour long and is attended by everyone who plays a role in the Dojo’s operations. Even when members of our Dojo family are working off-site, we ensure that a Google Hangout session is available so that they are virtually in attendance, as this is arguably the most important “meeting” of the week. It is the time and space for every employee to speak his/her mind and do so with assumed positive intent, so that feedback can be constructively given and received.

The concept is simple: as a whole team, we are living out the mantra we preach. We are learning by doing; generating weekly action and reaction to our development process through empathetic honesty.    

The person running the exercise for the week (this is not assigned per week, but rather a spur-of-the-moment volunteer) will draw three columns on a whiteboard, signified by a smiley face (things that went swimmingly), a neutral face (the so-so items), and a sad face (as your teacher used to proclaim, “areas for improvement”). Then all members add topics/items for discussion from the week to each of these columns as they see fit. These items do not need to be evenly distributed by any stretch of the imagination. That is, no person or team collectively must add the same number of items to the smiley face column as they do to the neutral face column and/or the sad face column (this is part of the honesty piece).

The leader of the exercise will then address each topic/item on the board one item at a time, from left to right or right to left (we do this to ensure that we are not making the retro top-heavy in any one of the columns). By addressing the topic/item, we mean that the leader will ask the team who wrote the item on the board and why. This is where the importance of this exercise comes into play.

No matter the topic of discussion, in order for the exercise to work and for the team to reap all of its benefits, team members must be completely and empathetically honest about their contributions. By doing so, the team learns to navigate conflict by having uncomfortable but necessary discussions. As with anything, with practice comes perfection. And by perfection, I mean the unattainable kind; something that we are still reaching for 🙂 All kidding aside, it’s true – with every week there are steps back that we learn from, paired with giant steps toward our overall goals and methodology.

As each topic is addressed, the team decides on action items that are recorded on the spot. This ensures that we are offering more than an end-of-iteration wrap-up, and are instead generating real actions and change for ourselves, our team, and our organization. We are assessing our ability to break down and solve problems and make improvements: measuring, building, and learning constantly in our development process (sound familiar?).

While the exercise was nearly impossible to complete during the birth of our DevOps culture, it is now something we could not live without. Team issues are no doubt as challenging as, if not more challenging than, the technical issues we face. Luckily, this exercise allows us to face both, with our core value of empathy at its center. We are not only able to end the week (and iteration) on a high note, but also go into our weekend and the following iteration rejuvenated. Retros rock!

Road trip to Persistence on CloudFoundry

Chapter 2

Nguyen Thinh

Road trip to Persistence on CloudFoundry

Chapter 2 – Bosh

Bosh is a deployment tool that can provision VMs on different IaaSes such as AWS, OpenStack, and vSphere, and even on bare metal. It monitors the health of those VMs and keeps track of all the processes it deploys to them. In this tutorial, we will focus on how to use Bosh with vSphere; however, you can apply the same technique to your own IaaS. The reason we talk about Bosh in this road trip is that we use it to deploy Cloud Foundry.


Table of Contents


1. How it works

2. Install Bosh

  1. Create a Bosh Director Manifest
  2. Install
  3. Verify Installation

3. Configure Bosh

  1. Write your Cloud Config
  2. Upload cloud config to bosh director

4. Use Bosh

  1. Create a deployment Manifest
  2. Upload stemcell and redis releases
  3. Set deployment manifest and deploy it
  4. Interact with your new deployed redis server

1. How it works

Let’s say you want to bring up a VM that contains Redis on vSphere. You provide Bosh with a vSphere stemcell (an operating system image) and a Redis Bosh release (installation scripts), and Bosh does the job for you.


Are you excited yet? Let’s get started on how to install Bosh on your IAAS and use it.

2. Install Bosh

In order to use Bosh with vSphere, you will need to deploy a Bosh Director VM.

1. Create a Bosh Director Manifest

This manifest describes the Director VM’s specs. Copy the example manifest from the docs and place it on your machine, then modify the networks section to match your vSphere environment.
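The networks block you end up with looks roughly like this sketch; the range, gateway, DNS, and vSphere port group name are placeholders for your own environment:

networks:
- name: private
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    dns: [8.8.8.8]
    cloud_properties: {name: VM Network}  # vSphere port group name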

2. Install

After configuring your deployment manifest, install the bosh-init tool from the docs. Then run the following command to deploy a Bosh Director VM:

bosh-init deploy PATH_TO_YOUR_DIRECTOR_MANIFEST

3. Verify Installation

After completing the installation, download the Bosh CLI to interact with the Bosh Director. If you have any installation problems, please refer to the docs.

gem install bosh_cli --no-ri --no-rdoc

Then type

bosh target BOSH_DIRECTOR_URI

If the command succeeds, you now have a functioning Bosh Director.

3. Configure Bosh

To configure the Bosh Director, you pass it a configuration file called the cloud config. It lets you define resources, networks, and VM types once for all of your Bosh VM deployments.


1. Write your Cloud Config

To write your cloud config, copy the vSphere cloud config example from the tutorial and modify it accordingly.
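For orientation, here is a trimmed sketch of a vSphere cloud config. The AZ, VM type, and network names line up with the deployment manifest later in this post, while the datacenter, cluster, and subnet values are placeholders:

azs:
- name: z1
  cloud_properties:
    datacenters:
    - name: my-datacenter            # placeholder
      clusters: [{my-cluster: {}}]   # placeholder

vm_types:
- name: medium
  cloud_properties: {cpu: 2, ram: 4096, disk: 20000}

networks:
- name: private
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    dns: [8.8.8.8]
    az: z1
    cloud_properties: {name: VM Network}

compilation:
  workers: 2
  reuse_compilation_vms: true
  az: z1
  vm_type: medium
  network: private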

2. Upload cloud config to bosh director

bosh update cloud-config PATH_TO_YOUR_CLOUD_CONFIG

After defining a cloud config, you are then able to deploy VMs on Bosh.

4. Use Bosh

Let’s deploy a simple redis server vm on vSphere using Bosh.

1. Create a deployment Manifest

---
name: redis-deployment
director_uuid: cd0eb8bc-831e-447d-99c1-9658c76e7721
stemcells:
- alias: trusty
  os: ubuntu-trusty
  version: latest
releases:
- name: redis
  version: latest
jobs:
- name: redis-job
  instances: 1
  templates:
  - {name: redis, release: redis}
  vm_type: medium
  stemcell: trusty
  azs: [z1]
  networks:
  - name: private
  properties:
    redis:
      password: REDIS_PASSWORD
      port: REDIS_PORT
update:
  canaries: 1
  max_in_flight: 3
  canary_watch_time: 30000-600000
  update_watch_time: 5000-600000

In this deployment manifest, I am deploying a Redis VM using the Redis release and the Ubuntu stemcell. I want the VM type to be medium and the VM to be located in availability zone z1.

2. Upload stemcell and redis releases

bosh upload stemcell https://bosh.io/d/stemcells/bosh-vsphere-esxi-ubuntu-trusty-go_agent
bosh upload release https://bosh.io/d/github.com/cloudfoundry-community/redis-boshrelease

Alternatively, you can download them and run bosh upload locally.

3. Set deployment manifest and deploy it

bosh deployment PATH_TO_YOUR_DEPLOYMENT_MANIFEST
bosh deploy

4. Interact with your new deployed redis server

Find your vm IP using

bosh vms

Connect to your redis server

redis-cli -h YOUR_REDIS_VM_ADDRESS -a REDIS_PASSWORD -p REDIS_PORT

Test it

SET foo bar
GET foo

One Week at EMC Dojo

Brian Verkley

I write this while snacking on free banana chips I obtained from the lunchroom. A week in the Dojo is very different than a week programming at home or in a cube farm, and it has as much or more to do with culture than tools and process.

The EMC Dojo in Cambridge, MA (https://www.cloudfoundry.org/welcome-to-the-emc-dojo/) is a place that contributes code to the open source Cloud Foundry project using a specific development methodology built around pair programming, test-driven development, and lean development, known as “the way.” They also teach the way to anyone interested in contributing to Cloud Foundry, as well as to other development teams within EMC. They earn the title of Dojo by literally being a “place of the way.”

From the moment I got there, I was a member of the team. I had planned to work in the same area, quietly observing the way while working on my own projects. That didn’t work out, and I’m so thankful it didn’t. On the first day, everyone in the Dojo introduced themselves to me and was interested in what I was working on. They adopted my story into their backlog and encouraged me to attend their standup the next day and begin pair programming with them. I felt so valued before I had even added value. Before long I found myself pair programming on a PHP app (I have no experience in PHP) and doing anything I could to contribute.

They went to lunch together every day I was there, often participating in an office wide Yoga lunch or Go Programming Book Club. They cared about each other’s quality of life and personal hobbies. They struggled out loud with programming syntax and architectural design choices and often came together to discuss, solve, or vote. Every member of the team was heard and had the same weight in decision-making.

And it works. The speed and efficiency with which they are able to take a story idea and turn it into working code is amazing. And although they are all individually quite talented, there seems to be a “greater than the sum of their parts” thing happening with “the way.” They are completely unfazed by tackling something new, largely because of their confidence that as a team they will be able to solve it.

What I witnessed, no, what I was a part of for my week, was a team of developers effectively coding for hours based on shared goals and methodology in an environment that made everyone happy. I’m so thankful to everyone who paired with me, shared with me, challenged, debated, and listened to me, welcomed me, and taught me.

What an outsider might notice first is the ping-pong table next to the cafeteria with available food and no checkout register, but that would be missing the point. The culture that they have built here and the benefits of developing in “the way” have left me with a lot more than this now empty bag of banana chips.

Road trip to Persistence on CloudFoundry

Laying the framework with ScaleIO

Peter Blum

Over the past few months the Dojo has been working with all types of storage to enable persistence within Cloud Foundry. Over the next few weeks we are going to road-trip through how we enabled EMC storage on the Cloud Foundry platform. For the first leg of the journey, we lay the framework by building our motorcycle, a ScaleIO cluster, which will carry us through the rest of the trip. ScaleIO is a software-defined storage service that is flexible enough to allow dynamic scaling of storage nodes and reliable enough to warrant enterprise-level confidence.

What is ScaleIO – SDS, SDC, & MDM!?

ScaleIO, as we already pointed out, is software-defined block storage. In layman’s terms, there are two huge benefits I see in using ScaleIO. Firstly, the actual storage backing ScaleIO can be dynamically scaled up and down by adding and removing SDS (ScaleIO Data Storage) servers/nodes. Secondly, SDS nodes can run alongside the applications running on a server, utilizing any free storage your applications are not using. These two points allow for a fully automated datacenter and a terrific base for block storage in Cloud Foundry.

Throughout this article we will use the terms SDS, SDC, and MDM, so let’s define them for some deeper understanding! All three are actually services running on a node. That node can be a hypervisor (in the case of vSphere), a VM, or a bare-metal machine.

SDS – ScaleIO Data Storage

This is the base of ScaleIO. SDS nodes store information locally on storage devices specified by the admin.

SDC – ScaleIO Data Client

If you intend to use a ScaleIO volume, you are required to become an SDC. To become an SDC, you install a kernel module (.ko) compiled specifically for your operating system version; these can all be found on EMC Support. Alongside the kernel module there is also a handy binary, drv_cfg. We will use it later on, so make sure you have it!
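A quick way to sanity-check an SDC node once the module and binary are in place (the paths below are the typical install locations, so adjust if yours differ):

  # Print this SDC's unique GUID (handy when mapping volumes to the client)
  /opt/emc/scaleio/sdc/bin/drv_cfg --query_guid

  # List the MDMs this SDC is configured to talk to
  /opt/emc/scaleio/sdc/bin/drv_cfg --query_mdms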

MDM – Meta Data Manager

Think of the MDMs as the mothers of your ScaleIO deployment. They are its most important part: they grant access to the storage (by mapping volumes from SDSs to SDCs), and most importantly they keep track of where all the data lives. Without the MDMs you lose access to your data, since “Mom” isn’t there to piece together the blocks you have written! Side note: make sure you have at least 3 MDM nodes. This is the smallest number allowed, since you need one MDM each for the Master, Slave, and Tiebreaker roles.

How to Install ScaleIO

There is no shortage of ways to install ScaleIO! In the Dojo we used two, each with its ups and downs. The first, “The MVP,” is simple and fast and will get you to a minimal viable product the quickest. The second, “For the Grownups,” gives you the start of a fully production-ready environment. Either will suffice for the rest of this road-trip series.

The MVP

This process uses Vagrant to deploy a ScaleIO cluster. Using the EMC {code} ScaleIO Vagrant GitHub repository, check out the README to install ScaleIO in less than an hour (depending on your internet connection, of course :smirk: ). Make sure to read the Clusterinstall section of the README to understand the two different ways of installing the ScaleIO cluster.

For the GrownUps

This process will deploy ScaleIO on four separate Ubuntu machines/VMs.

Check out the ScaleIO 2.0 Deployment Guide for more information and help.

  • Go to EMC Support.
    • Search for ScaleIO 2.0.
    • Download the correct ScaleIO 2.0 software package for your OS/architecture type:
      • Ubuntu (we currently only support Ubuntu in Cloud Foundry)
      • RHEL 6/7
      • SLES 11 SP3/12
      • OpenStack
    • Download the ScaleIO Linux Gateway.
  • Extract the downloaded *.zip files.

Prepare Machines For Deploying ScaleIO

  • Minimal requirements:
    • At least 3 machines to start a cluster:
      • 3 MDMs
      • Any number of SDCs
    • Each can be either a virtual or a physical machine
    • The OS must be installed and configured before the cluster install, including the following:
      • SSH must be installed and available for root. Double-check that passwords are properly provided in the configuration.
      • The libaio1 package should be installed as well. On Ubuntu: apt-get install libaio1

Prepare the IM (Installation Manager)

  • On the local machine SCP the Gateway Zip file to the Ubuntu Machine.
    scp ${GATEWAY_ZIP_FILE} ${UBUNTU_USER}@${UBUNTU_MACHINE}:${UBUNTU_PATH}
    
  • SSH into Machine that you intend to install the Gateway and Installation Manager on.
  • Install Java 8.0
    sudo apt-get install python-software-properties
    sudo add-apt-repository ppa:webupd8team/java
    sudo apt-get update
    sudo apt-get install oracle-java8-installer
    
  • Install Unzip and Unzip file
    sudo apt-get install unzip
    unzip ${UBUNTU_PATH}/${GATEWAY_ZIP_FILE}
    
  • Run the Installer on the unzipped debian package
    sudo GATEWAY_ADMIN_PASSWORD=<new_GW_admin_password> dpkg -i ${GATEWAY_FILE}.deb
    
  • Access the gateway installer GUI in a web browser using the gateway machine’s IP: http://${GATEWAY_IP}
  • Log in using admin and the password you used when running the Debian package earlier.
  • Read over the install process on the Home page and click Get Started
  • Click browse and select the following packages to upload from your local machine. Then click Proceed to install
    • XCache
    • SDS
    • SDC
    • LIA
    • MDM

    Installing ScaleIO is driven by a CSV file. For our demo environment we run the minimal ScaleIO install. We built the following install CSV from the minimal template you will see on the Install page; you might need to build your own version to suit your needs.

    IPs,Password,Operating System,Is MDM/TB,Is SDS,SDS Device List,Is SDC
    10.100.3.1,PASSWORD,linux,Master,Yes,/dev/sdb,No
    10.100.3.2,PASSWORD,linux,Slave,Yes,/dev/sdb,No
    10.100.3.3,PASSWORD,linux,TB,Yes,/dev/sdb,No
    
  • To manage the ScaleIO cluster you use the MDM, so make sure that you set a password for the MDM and LIA services on the Credentials Configuration page.
  • NOTE: For our installation, we had no need to change advanced installation options or configure log server. Use these options at your own risk!
  • After submitting the installation form, a monitoring tab should become available to monitor the installation progress.
    • Once the Query Phase finishes successfully, select start upload phase. This phase uploads all the resources needed to the nodes listed in the CSV.
    • Once the Upload Phase finishes successfully, select start install phase.
    • Installation phase is hopefully self-explanatory.
  • Once all steps have completed, the ScaleIO Cluster is now deployed.

Using ScaleIO

  • To start using the cluster with the ScaleIO CLI, you can follow the steps below, which are copied from the post-installation instructions.

    To start using your storage:
    Log in to the MDM:

    scli --login --username admin --password <password>

    Add SDS devices: (unless they were already added using a CSV file containing devices)
    You must add at least one device to at least three SDSs, with a minimum of 100 GB free storage capacity per device.

    scli --add_sds_device --sds_ip <IP> --protection_domain_name default --storage_pool_name default --device_path /dev/sdX or D,E,...

    Add a volume:

    scli --add_volume --protection_domain_name default --storage_pool_name default --size_gb <SIZE> --volume_name <NAME>

    Map a volume:

    scli --map_volume_to_sdc --volume_name <NAME> --sdc_ip <IP>

Managing ScaleIO

When using ScaleIO with Cloud Foundry, we will use the ScaleIO REST gateway to manage the cluster. There are other ways to manage the cluster, such as the ScaleIO CLI and the ScaleIO GUI, but both are much harder for Cloud Foundry to communicate with.
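To give a flavor of what the gateway exposes, here is a hedged sketch of hitting its REST API with curl; the gateway address and credentials are placeholders, and the /api/login call returns a token that is then used as the password on subsequent requests:

  # Log in to the gateway and capture the session token
  TOKEN=$(curl -s -k --user admin:${SCALEIO_PASSWORD} https://${GATEWAY_IP}/api/login | tr -d '"')

  # Use the token to list the volumes the cluster knows about
  curl -s -k --user admin:${TOKEN} https://${GATEWAY_IP}/api/types/Volume/instances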

EOF

At this point you have a fully functional ScaleIO cluster that we can use with Cloud Foundry and RexRay to deploy applications backed by ScaleIO storage! Stay tuned for our next blog post, in which we will deploy a minimal Cloud Foundry instance.

Cloud Foundry. Open Source. The Way. EMC [⛩] Dojo.

 

 

Building a healthy Concourse CI pipeline for Bosh-deployed products

Julian Hjortshoj

Julian is a 12-year veteran of Dell EMC and the current PM of the Cloud Foundry Diego Persistence team.

The Cloud Foundry Diego Persistence team recently spent a fair amount of time and effort building and refactoring the CI pipeline for our Ceph filesystem, volume driver, and service broker.  The end state from this exercise, while not perfect, is nonetheless pretty darn good:  It deploys Cloud Foundry, Diego, and a Cephfs cluster, along with our volume driver and service broker.  It runs our code through unit tests, certification tests, and acceptance tests.  It keeps our deployment up to date with the latest releases of Cloud Foundry and the latest development branch changes to Diego.  It does all of this with minimal rework or delay; changes in our driver/broker bosh release typically flow through the pipeline in about 10 minutes.

But our first attempt at creating the pipeline did not work very well or very quickly, so we thought it would be worth documenting our initial assumptions, what was wrong about them, and some of what we learned while fixing them.

Our First Stab at It

We started with a set of assumptions about what we could run quickly and what would run slowly, and we tried to organize our pipeline around those assumptions to make sure that the quick stuff didn’t get blocked by the slow stuff.

Assumptions:

  • Cephfs cluster deployment is slow–it requires us to apt-get a largish list of parts and then provision a cluster.  This can take 20-30 minutes.
  • Since cluster deployment is slow, and we share a bosh release for the cephfs bosh job and our driver and broker bosh jobs, we should only trigger cephfs deployment nightly when nobody is waiting–we shouldn’t trigger it when our bosh release is updated.
  • Redeploying Cephfs is not safe–to make sure that it stays in a clean state, we should undeploy it before deploying it again.
  • CloudFoundry deployment is slow–we should not automatically pick up new CF releases because it might paralyze our pipeline during the work day.
  • The pipeline should clean up on failure–bad deployments of cephfs should get torn down automatically.

 

What We Eventually Learned

Our first pass at the pipeline (mostly) worked, but it was slow and inefficient.  Because we structured it to deploy some of the critical components nightly or on demand, and we tore down the ceph file system VM before redeploying it, any time we needed an update, we had to wait a long time.  In the case of cephfs, we also had to create a shadow pipeline just for manually triggering cephfs redeployment.  It turned out that most of the assumptions above were wrong, so let’s take another look at those:

 

Bad Assumptions:

  • Cephfs cluster deployment is slow. This is only partially true.  Because we installed cephfs using apt-get, we were doing an end-run around Bosh package management, effectively ensuring that we would re-do work in our install script whether it was necessary or not.  We switched from apt-get to Bosh managed debian packages and that sped things up a lot.  Bosh caches packages and only fetches things that have actually changed.
  • We should only trigger cephfs deployment nightly or we will repeat slow cephfs deployments whenever code changes.  This is totally untrue.  Bosh is designed to detect changes from one version to the next, so when the broker job or the driver job changes, but cephfs hasn’t changed, deploying the cephfs job will result in a no-op for bosh.  
  • Redeploying Cephfs is not safe.  This might be partially true. In theory our ceph filesystem could get corrupted in ways that would cause the pipeline to keep failing, but treating this operation as unsafe is somewhat antithetical to cloud operations.  Bosh jobs should as much as possible be safe to redeploy without removing them.
  • CloudFoundry deployment is slow.  This is usually not true.  When there are new releases of CloudFoundry, they deploy incrementally just like other bosh deployments, so only the changed jobs will result in deployment changes.  The real culprit in slow deployment times happens when there is an update to the bosh stemcell, and bosh needs to download the stemcell before it can deploy.  In order to keep that from slowing down our pipeline during the workday, we created a “nightly stemcell” task in the pipeline that doesn’t do anything, but can only run at night.  Using the latest passed stemcell from that task, and setting the stemcell as a trigger in our deploy tasks ensures that when there is a stemcell change, our pipeline will pick it up at night, and redeploy with it, and that we will never have to wait for a stemcell download during the day:
resources:
- name: nightly
  type: time
  source:
    interval: 24h
    start: 01:00 AM -0800
    stop: 1:15 AM -0800

- name: aws-stemcell
  type: bosh-io-stemcell
  source:
    name: bosh-aws-xen-hvm-ubuntu-trusty-go_agent

...

jobs:
- name: nightly-stemcell
  plan:
  - aggregate:
    - get: nightly
      trigger: true
    - get: bosh-stemcell
      resource: aws-stemcell

- name: teardown-cephfs-cluster
  serial_groups:
  - cephfs-deploy
  plan:
  - aggregate:
    - get: cephfs-bosh-release
    - get: aws-stemcell
      passed:
      - nightly-stemcell
      trigger: true
    - get: deployments-runtime
  - task: teardown
    file: cephfs-bosh-release/scripts/ci/teardown.build.yml
    params:
      BOSH_PASSWORD: <>
      BOSH_TARGET: <>
      BOSH_USERNAME: <>
      DEPLOYMENT_NAME: cephfs
  • The pipeline should clean up on failure.  This is generally a bad practice.  It means that we have no way of diagnosing failures in the pipeline.  Teardown after failure also doesn’t restore the health of the pipeline, unless the deployments in question are re-deployed afterwards, but in the case of a deployment error, that could easily result in a tight loop of deployment and undeployment, so we never did that.

Where We Ended Up


After we corrected all of our wrong assumptions, our pipeline is in much better shape:

  • Bosh deployments are incremental and frequent.  We pick up new releases as soon as they happen, and we re-test against them, so we get early warning of failures even when we didn’t make the breaking changes.
  • Our bosh job install scripts are as much as possible idempotent.  The only undeploy jobs we have in the pipeline are manually triggered.
  • We trigger slow stemcell downloads at night when nobody is working, and stick to the same stemcells during the day to avoid slow downloads.
  • Since we share the same bosh release for 3 different deployments (broker, driver, and file system) we trigger deployment of all 3 things whenever our bosh release changes.  Since Bosh is clever about not doing anything for unchanged jobs, this is a much easier approach than trying to manage separate versions of the bosh release for different jobs.
  • We use concourse serial groups to force serialization between the tasks that deploy things and the tasks that rely on those deployments.  Serial groups are far from perfect–they operate as a simple mutex with no read/write lock semantics–but for our purposes they proved to be good enough, and they are far easier than implementing our own locks.

The yaml for our current pipeline is here for reference.

Housekeeping

In addition to our nightly job to download stemcells, we also run a nightly task to clean up bosh releases by invoking bosh cleanup.  This is a very good idea–otherwise bosh keeps everything that’s been uploaded to it, which can quickly use up available disk space.
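For reference, here is a minimal sketch of what that housekeeping job can look like under the pipeline’s jobs: section, reusing the nightly time resource from the snippet above. The container image and bosh credentials are placeholders, and we assume the image has the (old) bosh CLI installed:

- name: nightly-bosh-cleanup
  plan:
  - get: nightly
    trigger: true
  - task: bosh-cleanup
    config:
      platform: linux
      image_resource:
        type: docker-image
        source: {repository: my-registry/bosh-cli}   # placeholder image with the bosh CLI
      params:
        BOSH_TARGET: <>
        BOSH_USERNAME: <>
        BOSH_PASSWORD: <>
      run:
        path: sh
        args:
        - -exc
        - |
          bosh -n target ${BOSH_TARGET}
          bosh -n login ${BOSH_USERNAME} ${BOSH_PASSWORD}
          bosh -n cleanup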

At some point in the future, we will probably want to add additional tasks to the pipeline to clean out our Amazon S3 buckets, but so far we haven’t done that.

Thanks

A special thanks to Connor Braa, who recently joined our team from the Diego team, where he did a great deal of Concourse wrangling.  Connor is responsible for providing us with most of the insights in this post.

Overview of GoLang with Xuebin He

Brian Roche

Brian Roche is a Senior Director and the leader of Dell EMC’s Cloud Platform Team. He is based in Cambridge, Massachusetts, USA, at the #EMCDojo.

Join us tonight for a special Meetup to talk about GoLang.

Creating a Cloud Foundry Service Broker

Megan Murawski

12-factor apps are cool and Cloud Foundry is cool, but we don’t just have to worry about 12-factor apps.  We also have legacy apps that we need to pay attention to.  We believe there should be a way for all of your apps to enjoy the benefits of Cloud Foundry, and we have enabled this by implementing a Service Broker that binds an external storage service.  Before we talk about creating the Service Broker, we will describe the role of a Service Broker in Cloud Foundry.

Services are integrated with Cloud Foundry by implementing a documented API for which the Cloud Controller is the client; we call this the Service Broker API.  Service brokers advertise a catalog of service offerings and service plans, and they interpret calls to provision (create), bind, unbind, and deprovision (delete).  Externalizing backend services from the application in a PaaS provides a clear separation that can improve application development: developers only need to connect to an external service to consume its APIs. In Cloud Foundry, this interface is “brokered” by the Service Broker.  Some examples of services essential to a 12-factor app are MySQL, Redis, RabbitMQ, and now ScaleIO!


A service broker sits between the Cloud Foundry runtime and the service itself. To create a service broker, you only need to implement the five Service Broker API operations: catalog, provision, deprovision, bind, and unbind. The service broker essentially translates between the Cloud Controller API and the service’s own API to orchestrate creating a service instance (maybe provisioning a new database or creating a new user), providing the credentials to connect to the service, and disconnecting and deleting the service instance.

The APIs

Catalog is used to fetch the service catalog. This describes the service plans that are available with the backend and any cost.

Provision is used to create the service in the backend. Based on a chosen service plan, this call will actually create the service. In the case of persistence, this is provisioning storage to be used by your applications.

Bind will provide Cloud Foundry runtime with the credentials and configuration to access the service. This allows the application to connect to the service.

Unbind will remove the access credentials from the application’s environment.

Deprovision is used to delete the service on the backend. This could remove an account, delete a database, or in the case of persistence, delete the volume created on provision.
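Concretely, these five operations map onto a handful of HTTP endpoints that the Cloud Controller calls on the broker (Service Broker API v2); the broker URL, credentials, and GUIDs below are placeholders:

  # Catalog
  curl http://user:pass@broker.example.com/v2/catalog -H "X-Broker-Api-Version: 2.8"

  # Provision a service instance
  curl -X PUT http://user:pass@broker.example.com/v2/service_instances/INSTANCE_GUID \
    -H "Content-Type: application/json" \
    -d '{"service_id":"SERVICE_GUID","plan_id":"PLAN_GUID","organization_guid":"ORG_GUID","space_guid":"SPACE_GUID"}'

  # Bind it to an app
  curl -X PUT http://user:pass@broker.example.com/v2/service_instances/INSTANCE_GUID/service_bindings/BINDING_GUID \
    -H "Content-Type: application/json" \
    -d '{"service_id":"SERVICE_GUID","plan_id":"PLAN_GUID","app_guid":"APP_GUID"}'

  # Unbind and deprovision are the corresponding DELETE calls on the same URLs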

 

Creating the Broker

To create the ScaleIO service broker, we made use of the EMC open source tool libStorage for communicating with the ScaleIO backend. libStorage is capable of managing many different storage backends, including ScaleIO, XtremIO, Isilon, and VMAX, in a client/server model.  By adding an API layer on top of libStorage, we were able to quickly translate between Cloud Foundry API calls and storage APIs on the backend to provide persistence as a service within CF. Like much of the work we do at the EMC Dojo, we have open-sourced the Service Broker.  If you’d like to download it or get involved, check it out on GitHub!
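Once the broker is deployed somewhere routable, wiring it into Cloud Foundry is just a few CLI calls; the broker name, credentials, URL, and service/plan names below are placeholders:

  # Register the broker and expose its plans in the marketplace
  cf create-service-broker scaleio-broker BROKER_USER BROKER_PASSWORD https://scaleio-broker.example.com
  cf enable-service-access scaleio

  # Provision a volume and bind it to an app
  cf create-service scaleio small my-volume
  cf bind-service my-app my-volume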
