Couchbase Keys

Following up on my previous post, Dealing with Large Data Sets in Couchbase, I want to talk a little about document keys in Couchbase. In that post I mentioned that we didn’t create keys for our documents and instead let Couchbase create them for us. Going this route makes it quick and easy to get started with Couchbase, but it has drawbacks. As I pointed out in that post, you can run into limits with views, which makes it harder to get to the documents you want.

Another issue is retrieving one specific document. A view can help you retrieve documents that have random keys, but to get a single document you either have to put very specific criteria in the view, or put in enough criteria to narrow the result set and then filter it further in your application code. Hard-coding specific criteria in a view to reach one document is a very bad idea because it isn’t reusable: to get a different document you have to either update the view or create a new one.

Time for some examples. Suppose you have the following documents:

{
   name: "Philadelphia Flyers",
   sport: "Hockey",
   city: "Philadelphia",
   state: "PA"
}

{
   name: "Philadelphia Union",
   sport: "Soccer",
   city: "Philadelphia",
   state: "PA"
}

{
   name: "Washington Capitals",
   sport: "Hockey",
   city: "Washington",
   state: "DC"
}

{
   name: "DC United",
   sport: "Soccer",
   city: "Washington",
   state: "DC"
}

Since the documents above do not have a known key, I can’t simply fetch the document with the name “Philadelphia Union”. To get that document I need a view. Here is an example of the view required to find it:

function (doc, meta) {
	if (doc.sport == "Soccer" && doc.city == "Philadelphia") {
		emit(meta.id, null);
	}
}

Below is the Ruby code needed to query the view for the document key and then use that key to fetch the actual document from Couchbase.

#!/usr/bin/env ruby

require 'rubygems'
require 'couchbase'
require 'json'

# Couchbase Server IP
ip = 'localhost'
# Bucket name
bucket_name = 'Your_Bucket_Name'
# Design Doc Name
design_doc_name = 'Your_Doc_Name'

client = Couchbase.connect("http://#{ip}:8091/pools/default/buckets/#{bucket_name}")

# Look up the design document that contains your view
design_doc = client.design_docs[design_doc_name]
# Replace "by_id" with the name of your view
view = design_doc.by_id

# Loop through all rows returned by the view
view.each do |doc|
  # get the view result document key
  doc_key = doc.key
  
  # query Couchbase to retrieve the document using the key retrieved from the view
  document = JSON.parse(client.get(doc_key, :quiet => false))
  
  # Do whatever you need to do with your document
  puts document.inspect
end

client.disconnect

While the view is simple to create and doesn’t take much time, it only returns the document key that was auto-created by Couchbase. This means I have to make a second request to Couchbase to actually get the document. You might be wondering why I didn’t just return the document from the view. That is not a best practice: it causes the entire document to be indexed, which is inefficient and takes up more disk space than necessary. Besides, making multiple requests to Couchbase, or to any NoSQL solution for that matter, is very fast. Developers who come from a SQL background often have a hard time getting used to making multiple queries against the datastore, and with good reason when dealing with SQL. Couchbase and NoSQL solutions in general, however, are built for speed and efficiency in querying data, so making multiple requests isn’t a problem.

Now back to the document key issue. Suppose I want to get the soccer team from the city of Washington. I can’t do that with the view created above; I have to either change the view or create a new one to get the new document. That isn’t very efficient.
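For comparison, the other workaround mentioned at the start of this post, pulling back a broader result set and filtering it in application code, might look something like the minimal sketch below. It uses plain in-memory hashes (with the fields trimmed down) as stand-ins for documents already fetched from Couchbase, and the `find_team` helper is a hypothetical name of my own, not part of the Couchbase gem.

```ruby
# Stand-ins for documents that have already been fetched from Couchbase.
TEAMS = [
  { "name" => "Philadelphia Flyers", "sport" => "Hockey", "city" => "Philadelphia" },
  { "name" => "Philadelphia Union",  "sport" => "Soccer", "city" => "Philadelphia" },
  { "name" => "Washington Capitals", "sport" => "Hockey", "city" => "Washington" },
  { "name" => "DC United",           "sport" => "Soccer", "city" => "Washington" }
]

# Hypothetical helper: scan every document and return the first one that
# matches both the sport and the city, or nil if none matches.
def find_team(docs, sport, city)
  docs.find { |doc| doc["sport"] == sport && doc["city"] == city }
end

team = find_team(TEAMS, "Soccer", "Washington")
puts team["name"]
```

This works, but every lookup has to fetch and scan a whole result set just to find one document, which gets worse as the dataset grows.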

The best way to solve the problem is to take some time to come up with a unique document key that you can remember or reconstruct from other data in your application. Even if you think you will never need to retrieve a specific document from your dataset, you should still give your documents a known key. At some point you will almost certainly need to retrieve a specific document, and knowing your key structure will help tremendously. Let’s look at the same examples as before, but this time with a key that I create myself so that I can retrieve each document easily.

{
	id: "Hockey::PhiladelphiaFlyers",
	name: "Philadelphia Flyers",
	sport: "Hockey",
	city: "Philadelphia",
	state: "PA"
}

{
	id: "Soccer::PhiladelphiaUnion",
	name: "Philadelphia Union",
	sport: "Soccer",
	city: "Philadelphia",
	state: "PA"
}

{
	id: "Hockey::WashingtonCapitals",
	name: "Washington Capitals",
	sport: "Hockey",
	city: "Washington",
	state: "DC"
}

{
	id: "Soccer::DCUnited",
	name: "DC United",
	sport: "Soccer",
	city: "Washington",
	state: "DC"
}

I’ve put together a quick Ruby example to retrieve the document for the soccer team in Philadelphia.

#!/usr/bin/env ruby

require 'rubygems'
require 'couchbase'

ip = 'localhost'
bucket_name = 'default'

client = Couchbase.connect("http://#{ip}:8091/pools/default/buckets/#{bucket_name}")

doc = client.get("Soccer::PhiladelphiaUnion", :quiet => false)
puts doc.inspect

client.disconnect

As you can see, the code to retrieve a document directly is much shorter and easier to write. Now, if I want the hockey team from Philadelphia, all I have to do is replace “Soccer::PhiladelphiaUnion” in the client.get call with “Hockey::PhiladelphiaFlyers” and I’ll have my new document.
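Because the key can be reconstructed from data the application already has, you don’t even need to hard-code it. Here is a minimal sketch of a helper that builds keys in the Sport::TeamName pattern used above. The `team_key` name and the rule of stripping whitespace from the team name are my own assumptions for illustration; use whatever convention fits your data, as long as it is applied consistently.

```ruby
# Hypothetical helper: builds a document key such as "Soccer::PhiladelphiaUnion"
# from a sport and a team name by removing all whitespace from the name.
def team_key(sport, name)
  "#{sport}::#{name.gsub(/\s+/, '')}"
end

puts team_key("Soccer", "Philadelphia Union")
puts team_key("Hockey", "Philadelphia Flyers")
```

The result can be passed straight to client.get in the example above in place of the literal key string.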

Hopefully the examples above have helped you see why it’s important to create a key for your documents. For more in-depth help on creating documents for Couchbase, I recommend checking out the site Couchbase Models by Jasdeep Jaitla, who works at Couchbase.