Difference between revisions of "WormCloud:Overview"

From WormBaseWiki
Latest revision as of 19:41, 17 February 2016

WormCloud


Description

WormCloud is a WormBase project that aims to use cloud-based technologies with the WormBase architecture in order to reduce complexity and increase functionality. In the end, the WormBase architecture should be significantly more robust, take less effort to maintain, and be more flexible than the current architecture.

This project is broken down into three main sections: central database, curation, and website.

The key technologies that we will be using are Datomic, DynamoDB, Clojure, ClojureScript, Docker, Docker Hub, GitHub, HubFlow, and Ansible.

Mission

To use Datomic / DynamoDB as a centralized database system for the WormBase project.

Vision

To develop a system that requires less effort to maintain and use while increasing the flexibility and functionality of the WormBase architecture.

Approach

  1. Modernize the technology stack, starting with Datomic / DynamoDB
  2. Use common tools in order to maintain consistency across the project
  3. Maintain code repositories and documentation with MediaWiki and GitHub / GitHub Wiki

Technologies

Operations

Docker

Docker is a tool that allows components of a project to be isolated from one another. By isolating tools and their dependencies, we will reduce the overhead of installing applications: in theory, each dependency only has to be installed once. This is especially important when passing tools and changes from development to production, as the environments should be identical. When developing a new tool, using Docker from the beginning should not be a priority, because there is a certain amount of overhead in creating Docker containers. Individual groups should decide when they would benefit from putting their components into Docker containers.

Docker Hub

Docker Hub is a website that hosts a repository of Docker containers. Like GitHub for code, it is a good place to store the containers used within the project. This will help us by keeping our containers in one location, and it will also allow others outside the project to use them.

Ansible

Ansible is used to automate provisioning VMs, running tests, and generating Docker containers. I have started a repository called "wormbase-architecture"; feel free to add your own Ansible scripts to it. By using Ansible we will be able to automate DevOps operations going forward while also documenting how things are set up within the project.

HubFlow

HubFlow is a wrapper around Git that makes it relatively easy to follow a release cycle and provides consistency between the repositories across the project. By using this tool we will have a systematic way of creating versions of each tool and will be able to maintain the branches within a repository more easily. As a convention, each person should put their name in the name of a feature branch. This makes it clear whom to contact when investigating a branch.
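The naming convention for feature branches could be enforced with a small helper. The sketch below is only an illustration: the "feature/" prefix, the "_" separator between author and topic, and the function names are assumptions, not something the project has agreed on.

```python
import re
from typing import Optional

# Assumption: feature branches live under a "feature/" prefix.
FEATURE_PREFIX = "feature/"
# Assumption: "_" separates the author from the topic; slugs never contain it.
SEPARATOR = "_"


def _slug(text: str) -> str:
    """Lower-case the text and collapse non-alphanumeric runs into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def feature_branch_name(author: str, topic: str) -> str:
    """Build a feature branch name that carries the author's name."""
    author_slug, topic_slug = _slug(author), _slug(topic)
    if not author_slug or not topic_slug:
        raise ValueError("author and topic must contain a letter or digit")
    return f"{FEATURE_PREFIX}{author_slug}{SEPARATOR}{topic_slug}"


def branch_author(branch: str) -> Optional[str]:
    """Return the author encoded in a branch name, or None if absent."""
    if not branch.startswith(FEATURE_PREFIX):
        return None
    author, sep, _topic = branch[len(FEATURE_PREFIX):].partition(SEPARATOR)
    return author if sep and author else None
```

With these assumptions, `feature_branch_name("Alice", "Gene page API")` yields a name from which `branch_author` can recover `alice`, so anyone investigating the branch knows whom to ask.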

New Development Technologies

Datomic / DynamoDB

Datomic is the main database that we are moving to from AceDB. Datomic is a database that is queried with Datalog, and DynamoDB is an AWS database service. We are using the two together to provide our central database system. I will be writing further information on how to maintain the database and transactors on another page. Over the next year, each group will be porting its technologies to work with the new database system: the Hinxton curation tool, the CalTech curation tool, and the Datomic APIs for the website (Catalyst).
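For readers new to this model, it may help to see how a Datalog-style database thinks about data. Datomic records facts as "datoms" (entity, attribute, value, plus transaction information), and a query joins patterns over those facts. The Python sketch below only imitates that idea in memory; it is not the Datomic API, and the attribute names and values are made up for illustration rather than taken from the WormBase schema.

```python
from typing import Any, List, NamedTuple, Optional


class Datom(NamedTuple):
    """A simplified fact: (entity, attribute, value). Real datoms carry more."""
    entity: int
    attribute: str
    value: Any


# Hypothetical facts; the attribute names and values are illustrative only.
FACTS = [
    Datom(1, "gene/id", "example-id-1"),
    Datom(1, "gene/name", "example-name-1"),
    Datom(2, "gene/id", "example-id-2"),
    Datom(2, "gene/name", "example-name-2"),
]


def find(facts: List[Datom], entity: Optional[int] = None,
         attribute: Optional[str] = None, value: Any = None) -> List[Datom]:
    """Return the facts matching every given component (None is a wildcard)."""
    return [
        d for d in facts
        if (entity is None or d.entity == entity)
        and (attribute is None or d.attribute == attribute)
        and (value is None or d.value == value)
    ]


def name_for_id(facts: List[Datom], gene_id: str) -> Optional[str]:
    """Join two patterns: find the entity with this id, then read its name."""
    for hit in find(facts, attribute="gene/id", value=gene_id):
        names = find(facts, entity=hit.entity, attribute="gene/name")
        if names:
            return names[0].value
    return None
```

The join in `name_for_id` is the kind of lookup that a Datalog query expresses declaratively: both patterns share the same entity, so the id leads to the name.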

Clojure

Clojure is the main language that we are using to interact with the database. It is a functional language that, by design, works very well with Datomic. When possible we should use Clojure to interact with the database, in order to keep the project consistent. If necessary, there is a REST interface that can be used to interact with the database directly.

ClojureScript

ClojureScript is a Clojure-like language that compiles down to JavaScript. Many useful tools are written in ClojureScript, and they are especially useful when used with Datomic. When possible, we should be moving towards ClojureScript throughout the project.

GitHub

As much as possible, we should use GitHub as the primary place for maintaining our code base. This makes the project open to those outside of it and also provides tools, such as repository-specific wikis and issue trackers, that we should take advantage of.

WormCloud-specific Repositories

ace-to-datomic

This repository contains code for converting AceDB dumps to Datomic logs / databases.

datomic-to-catalyst

This repository is where the Datomic APIs for Catalyst will be created. These APIs provide page-level information in JSON format for the WormBase website.
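To make "page-level information in JSON format" concrete, here is a minimal sketch of what serializing one page's data might look like. The function name and the "class", "id", and "fields" keys are hypothetical; the actual response format is defined by the repository, not by this example.

```python
import json
from typing import Any, Dict


def page_payload(page_class: str, object_id: str,
                 fields: Dict[str, Any]) -> str:
    """Serialize the data for one page as a JSON document (hypothetical shape)."""
    document = {
        "class": page_class,   # kind of page, e.g. a gene page
        "id": object_id,       # identifier of the object the page describes
        "fields": fields,      # everything the page needs to render
    }
    # sort_keys keeps the output stable, which makes responses easy to diff.
    return json.dumps(document, sort_keys=True)
```

Bundling everything a page needs into one document means the website can render the page from a single response instead of issuing many small queries.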

CalTech-Curation

This repository will contain a version of CalTech's curation tools that runs on Datomic. The first iteration will be very similar to the current Perl CGI scripts used in production by their curators.

Hinxton-Curation

This tool has already been started and is written in ClojureScript. The intent is for it to be very similar to the tools the Hinxton curators already use in production, but to use the Datomic database instead of AceDB.