Happy 2013 from the Engine Yard Data Team! We hope the new year finds you developing exciting new applications. We’ve been quite busy, and today we are happy to introduce you to a new product. We dedicated a big part of last year to expanding and strengthening our relational database stack. We introduced PostgreSQL as a first-class citizen and made it the new default for our platform. We also released new versions of MySQL and improved MySQL provisioning. We are starting the new year by adding our first NoSQL datastore!
We are thrilled to share our early access support for Riak, Basho’s distributed, key-value, and highly available database.
What is Riak?
Riak is an open-source, distributed key/value database made by our friends at Basho. Riak provides linear scalability, operational simplicity, and fault-tolerance. It replicates and retrieves data intelligently so it’s available for read and write operations, even in failure conditions.
All nodes hold data and there is no concept of master or replica. Every node has the same responsibility, which is to hold a portion of the data stored in the cluster. With Riak, you can afford to lose access to nodes due to network partition or hardware failure and not lose data or risk application downtime. Your cluster continues to operate normally by handing over data to the nodes that are up.
Riak is also simple to operate. Growing your cluster can be done easily without incurring a large operational burden. It automatically distributes data around the cluster and yields a near-linear performance increase as you add capacity.
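To make the "every node holds a portion of the data" idea concrete, here is a minimal sketch of a partitioned hash ring. It assumes Riak's defaults of 64 partitions and 3 replicas on a 160-bit SHA-1 ring, but everything else is a simplification: the round-robin assignment, the `bucket/key` string hashing, and the function names are illustrative and are not Riak's actual claim or fallback code.

```python
import hashlib

RING_SIZE = 2 ** 160      # size of the SHA-1 hash space
NUM_PARTITIONS = 64       # Riak's default ring size
N_VAL = 3                 # Riak's default number of replicas


def build_ring(nodes):
    """Assign partitions to nodes round-robin (simplified, not Riak's claim algorithm)."""
    return [nodes[i % len(nodes)] for i in range(NUM_PARTITIONS)]


def partition_for(bucket, key):
    """Hash a bucket/key pair onto the ring and return the owning partition index."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position * NUM_PARTITIONS // RING_SIZE


def nodes_for(ring, bucket, key, down=frozenset()):
    """Walk the ring from the key's partition and pick N_VAL distinct reachable nodes."""
    start = partition_for(bucket, key)
    chosen = []
    for offset in range(NUM_PARTITIONS):
        node = ring[(start + offset) % NUM_PARTITIONS]
        if node not in down and node not in chosen:
            chosen.append(node)
        if len(chosen) == N_VAL:
            break
    return chosen


ring = build_ring(["node1", "node2", "node3", "node4", "node5"])
healthy = nodes_for(ring, "sessions", "user-42")
degraded = nodes_for(ring, "sessions", "user-42", down={healthy[0]})
```

In this sketch, losing one node simply shifts the key's replicas to the next reachable nodes on the ring, which is the intuition behind the cluster continuing to serve reads and writes during a failure.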
When to use Riak?
Riak is designed for capturing, storing, and processing data in applications where high availability and write performance are key.
It is best suited to situations where downtime is unacceptable or applications need to be distributed over multiple regions, countries, and even cloud providers! Here are some of the use cases we’ve seen Riak thrive in:
- [Storing Session Data](http://johnleach.co.uk/words/1063/riak-syslog)
- [Storing Large Quantities of Rich Media](http://basho.com/blog/technical/2012/06/27/Riak-at-Voxer/) (video, audio, etc.)
- Storing large quantities of social or unstructured data
- Using Riak as a [Distributed Caching Layer](http://basho.com/blog/technical/2012/01/30/Riak-in-Production-at-Posterous-Riak-Control-Preview/)
- Storing and managing [Ranking Data](http://devblog.seomoz.org/2011/10/using-riak-for-ranking-collection/)
- Managing [User Data](http://devblog.bu.mp/from-mongodb-to-riak-7138) for Social and [Gaming](http://labs.mochimedia.com/archive/2011/05/08/statebox/) Networks
- [Mining Social Data](http://blog.inagist.com/riak-at-inagistcom)
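As a rough illustration of the session-data use case, the sketch below builds an HTTP PUT against Riak's `/buckets/<bucket>/keys/<key>` endpoint using only the Python standard library. The host name, the `sessions` bucket, and the helper name are hypothetical, and port 8098 is Riak's default HTTP port, which may differ in your environment. The request is only constructed here, not sent.

```python
import json
import urllib.request
from urllib.parse import quote


def session_put_request(host, session_id, session, port=8098):
    """Build (but do not send) an HTTP PUT that stores a session as a JSON value."""
    # Bucket name "sessions" is an assumption for this example.
    url = f"http://{host}:{port}/buckets/sessions/keys/{quote(session_id, safe='')}"
    body = json.dumps(session).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host; replace with the address of one of your Riak nodes.
request = session_put_request("riak.example.internal", "abc123", {"user_id": 7})
```

Passing `request` to `urllib.request.urlopen` would perform the write. Because every node plays the same role, any reachable node in the cluster can accept it.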
Companies like AOL, Clipboard, Voxer, Bump, Kiip, and Airbrake are using Riak in production, and now you can too! It’s easier than ever to create a Riak cluster in your Engine Yard environment.
### When not to use Riak?
Riak was not designed to solve every data storage need. Riak shines when distribution and high availability are needed, but sometimes a relational database may be the appropriate solution to your problem.
Riak is likely not the ideal solution if your application needs a highly centralized data store with fixed, unchanging data structures and has no tolerance for eventual consistency.
We want you to be successful when using this product and we’ll do our best to advise if you are considering using Riak for your application.
Creating Riak Clusters
We have preliminary documentation to help you get started using Riak on Engine Yard. In short, you must enable the ‘Clusters’ and ‘Riak Clusters’ features in the Early access section of your account.
Once the features are enabled, you will be able to create a new cluster by clicking the ‘Add Cluster’ button in your environment (this button only appears when the right features are enabled). Note that you can create a Riak cluster only after your initial environment has been created. Please let us know if you have any problems getting the feature to show for your account.
We’ll ask you to provide a few defaults in order to create your Riak cluster. Here is a detailed explanation of the choices we ask you to make:
### Cluster Size
Basho recommends no fewer than five nodes for a production environment. A cluster of three nodes may be appropriate for a staging environment. Deployments of five nodes or more will provide the best foundation for performance, resilience, and growth as the cluster expands. For more information, see Basho’s documentation on Cluster Capacity Planning.
### Backends
Riak supports pluggable backends, which means that the database can be further tuned to the type of data you are saving by selecting an appropriate storage backend. We support all of Riak’s backends: Bitcask, LevelDB, Memory, and Multi (multiple backends per cluster).
- [Bitcask](http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Bitcask/) is the default backend type for Riak. It provides low latency request times and high throughput. Keys must fit in memory.
- [LevelDB](http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/LevelDB/) is best when keys do not fit in memory. It supports Secondary Indexes (2i).
- [Memory](http://docs.basho.com/riak/1.3.0/tutorials/choosing-a-backend/Memory/) is best when using Riak as a distributed cache. All data is stored in in-memory tables and never persisted to disk or any other storage.
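The guidance in the list above can be condensed into a small decision helper. This is only a heuristic sketch of the trade-offs described in this post: the function name and its parameters are made up for illustration, and the Multi backend is left out because it combines the others.

```python
def choose_backend(keys_fit_in_memory, needs_secondary_indexes=False, needs_persistence=True):
    """Suggest a single Riak backend from the rules of thumb in this post."""
    if not needs_persistence:
        # Cache-style workloads: data lives only in in-memory tables.
        return "memory"
    if needs_secondary_indexes or not keys_fit_in_memory:
        # LevelDB copes with key sets larger than RAM and supports 2i.
        return "leveldb"
    # Default: low latency and high throughput, as long as keys fit in memory.
    return "bitcask"


print(choose_backend(keys_fit_in_memory=True))
print(choose_backend(keys_fit_in_memory=False))
print(choose_backend(keys_fit_in_memory=True, needs_persistence=False))
```

Treat the result as a starting point for discussion rather than a rule. Real workloads often need benchmarking before a backend is settled on.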
Note that when a backend is selected, resources will be allocated to it. Choosing all backends may not be advisable unless you really know your application use case and need them.
### Storage Types
The following storage types are supported: ephemeral, regular EBS, and EBS with Provisioned IOPS.
- Ephemeral storage provides you with lower latency than EBS volumes at the cost of persistence. Any data saved on ephemeral storage will be lost when the instance is stopped or terminated. This is a very fast storage type that does not support snapshots.
- Elastic Block Store (EBS) gives you block-level persistence. These volumes support snapshots and can guarantee a specific level of service when a Provisioned IOPS (PIOPS) value is given.
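To summarize the trade-offs between the storage types, here is a small sketch that maps each option to the properties described in this post and rejects invalid combinations. The identifiers `ephemeral`, `ebs`, and `ebs_piops` and the function itself are hypothetical and are not part of any Engine Yard API. The 100 to 2000 bound is the per-volume PIOPS range quoted in this post.

```python
PIOPS_MIN, PIOPS_MAX = 100, 2000  # per-volume range quoted in this post


def describe_storage(storage_type, piops=None):
    """Return the properties of a storage choice, or raise ValueError if it is invalid."""
    if storage_type in ("ephemeral", "ebs"):
        if piops is not None:
            raise ValueError(f"{storage_type} storage does not take a PIOPS value")
        persistent = storage_type == "ebs"
        # Ephemeral data is lost on stop/terminate and cannot be snapshotted.
        return {"survives_stop": persistent, "snapshots": persistent, "guaranteed_iops": None}
    if storage_type == "ebs_piops":
        if piops is None or not PIOPS_MIN <= piops <= PIOPS_MAX:
            raise ValueError(f"PIOPS must be between {PIOPS_MIN} and {PIOPS_MAX}")
        return {"survives_stop": True, "snapshots": True, "guaranteed_iops": piops}
    raise ValueError(f"unknown storage type: {storage_type!r}")


print(describe_storage("ephemeral"))
print(describe_storage("ebs_piops", piops=1000))
```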
Note that a PIOPS value determines the number of I/O operations per second guaranteed by the infrastructure for each volume. Values range from 100 to 2000.
### Want an Extremely Fast Cluster?
We also support High I/O Quadruple Extra Large instances for Riak clusters. These instances are SSD-backed and are the most performant in terms of I/O operations. The number of instances you may have in a given environment is limited. Please contact Engine Yard Support to have your limit increased if you want to run your Riak cluster using this instance type.
A Few Recommendations
When creating your Riak clusters you want to strive for node homogeneity. Having a super fast node and four slow ones will not provide any performance gains. You want a predictable level of I/O and performance in your cluster.
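A toy model shows why one fast node does not help. Assume requests are spread evenly across nodes, which is roughly what an even key distribution gives you. The cluster then saturates as soon as its slowest node does, so usable throughput is about the node count times the slowest node's capacity. The capacity figures below are invented for illustration.

```python
def usable_throughput(node_capacities):
    """Upper bound on cluster ops/sec when every node receives an equal share of requests."""
    if not node_capacities:
        raise ValueError("cluster needs at least one node")
    return len(node_capacities) * min(node_capacities)


mixed = [10_000, 2_000, 2_000, 2_000, 2_000]   # one fast node, four slow ones
uniform = [2_000] * 5                          # five identical slow nodes

print(usable_throughput(mixed), usable_throughput(uniform))
```

Under this simplified model both clusters top out at the same rate, so the extra capacity of the fast node goes unused.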
Riak’s upgrade path is not as painful as that of other databases. If you run into capacity problems with your selected instance type, you can always add more instances (scale horizontally) or scale the nodes in your cluster vertically by replacing them with larger instance types.
Please ask us any questions you may have. We are but a ticket away!
Want to know more?
We have aggregated a few resources to help you get started with Riak. We highly recommend Basho’s online documentation. It’s a fantastic source of information. Riak meetups happen all over the country (and even in Europe and Japan!). We encourage you to attend them, as the community is extremely helpful and willing to share information.
Additionally, here are a few books, white papers, and blog posts that we’ve also enjoyed:
- Basho's From Relational to Riak: Advantages, Tradeoffs and Considerations
- Eric Redmond's [Little Riak Book](https://github.com/coderoshi/little_riak_book/)
- Mathias Meyer's [Riak Handbook](http://riakhandbook.com/)
- Adron Hall's [Riak is a Whole Big List of Things](http://compositecode.com/2013/01/11/riak-is-a-whole-big-list-of-things/)
Try Riak for free!
You can boot up a Riak cluster on the Engine Yard platform with 500 free hours.