Asynchronous Couchbase Operations Using the Java SDK

If you work with Couchbase in Java, you may at times need to perform a batch of document reads, writes, updates, or deletes. If you are running these operations in a typical sequential loop, one document at a time, and if the order of the operations is not significant, then you’ve been missing out on the performance gains available in Couchbase’s support for asynchronous batch operations.

My latest article in the Couchbase SDK series for baeldung.com introduces Asynchronous Batch Operations in the Couchbase SDK by way of extending the CRUD-repository-like service that was first introduced in a previous article, Using Couchbase in a Spring Application.

Couchbase Java SDK Tutorials

I’ve been working a lot with Couchbase as of late, learning both the Spring Data Couchbase community module and the native Couchbase SDK for Java, and I want to share a couple of tutorials that I’ve written recently for baeldung.com.

The first, titled Introduction to Couchbase SDK for Java, will get you up and going quickly, covering basic techniques such as connecting to a cluster, opening data buckets, and basic persistence operations in the native Couchbase SDK, including a brief overview of working with replicas in the event of a node outage.

The second tutorial, Using Couchbase in a Spring Application, presents a framework with which to work with a Couchbase environment, cluster, and multiple buckets, and a basic persistence layer for performing CRUD operations in a Spring application without using the Spring Data module.

Future articles in the series will cover the use of Couchbase’s asynchronous API for performing bulk operations, querying data buckets with MapReduce views, and querying with N1QL, Couchbase’s superset of SQL for working with JSON documents.

I hope that you will enjoy these tutorials and that you will stay tuned as the rest of the series unfolds.

More Spring Data Couchbase Tutorials

The second article in the Spring Data Couchbase Tutorial series on baeldung.com focuses on entity validation using the JSR-303 specification, optimistic locking, and query consistency. The third article introduces the use of multiple buckets, as well as spatial views for writing queries against multi-dimensional data, such as geographic data.

You can find all articles in this series including the Introduction to Spring Data Couchbase on the persistence page at baeldung.com.

My next project will be a follow-up series dedicated to Couchbase itself and the use of the native Couchbase Java SDK to query and manipulate Couchbase data, so stay tuned for that.

Spring Data Couchbase Tutorial

If you are considering incorporating a NoSQL database into your application, you might want to consider Couchbase. And if the native Couchbase SDK seems intimidating, or if you are a big fan of using Spring Data for your persistence needs, please take a look at this tutorial on Spring Data Couchbase that I wrote for baeldung.com.

Spring Data Couchbase is a Spring community project that provides an abstraction layer for persisting and accessing Couchbase documents. In my tutorial, you will learn how to configure your project and how to write and use a basic repository interface, as well as how to use the Spring Data template abstraction, for interacting with Couchbase.

Follow-up articles will focus on other topics in Spring Data Couchbase, such as spatial-view-based queries, the use of multiple buckets, data validation, and optimistic locking.

Define Custom RAML Properties Using Annotations

The fourth article in my RAML series on Baeldung.com focuses on the use of a feature called annotations that is new in RAML 1.0. In brief, annotations provide a means for extending the metadata of an API specification, allowing you to define custom properties that are not within scope of the official language spec. I hope you’ll enjoy Define Custom RAML Properties Using Annotations.

In case you missed any of the first three articles in the RAML series, I’ve included the links below. You can also find them in the REST category of the Baeldung.com site.

Other articles in the series:

Modularization in RAML Using Includes, Libraries, Overlays and Extensions

The third article in my RAML series on Baeldung.com focuses on the modularization features of RAML. It contains a brief introduction to the use of includes (including typed fragments), libraries, overlays and extensions to make your API definitions more modular. I hope you’ll enjoy Modular RAML Using Includes, Libraries, Overlays and Extensions.

In case you missed the first two articles, I’ve included the links below. You can also find them in the REST category of the Baeldung.com site. The next article in the series will cover the use of annotations in RAML.

Other articles in the series:

Simplify RAML Using Resource Types and Traits

If you read my RAML Tutorial article, then you may have wondered whether the API definitions always have to be verbose, or if there are some shortcuts one can take in order to capture common patterns found in an API in order to simplify its definition.

If so, then you may enjoy Eliminate Redundancies in RAML Using Resource Types and Traits, the second article in the RAML series that I am currently writing for the Baeldung site.

The next article in the series will focus on modularization in RAML via the use of includes, libraries, overlays, and extensions.

Build and Deployment Patterns

One of my favorite topics in the Software Engineering realm is configuration management, and my favorite subtopic within that would be build and deployment patterns — i.e., what are the best practices around building and deploying software.

Ken Mugrage has written a fantastic post on this subject for the DevOps section of the DZone site that echoes at least three of my personal build and deployment mantras:

  • Build it once
  • Develop and Test on the same platform as Production
  • Always know which version(s) are running on each environment

I spent several years studying and developing build and deployment patterns and building tools to support them, long before it was called “DevOps,” while at the same time developing and maintaining software in a team environment, so I can attest to the fact that these patterns work and that they can save you a lot of time.

I hope you enjoy “5 Key Deployment Pipeline Patterns” by Ken Mugrage.

Modeling RESTful Services with RAML

If you’ve done much development, design, or research in the area of web services in the past ten years, you have most likely encountered the term RESTful services or at least the REST acronym, unless you’ve been living under a rock. [And let’s face it, if you have been living under a rock — not that there’s anything wrong with that —  it’s doubtful that you have had much involvement with web services, so there you go.]

RAML stands for RESTful API Modeling Language and is built upon the YAML and JSON standards. If software engineering or design is your thing, and you are faced with designing and/or developing RESTful services, then I would encourage you to learn a little about RAML. A good starting point is Introduction to RAML – The RESTful API Modeling Language, an article that I wrote recently for Baeldung.

Map Iteration and Performance: One Size Does Not Fit All

Riddle me this, Batman!

Question: Given the hypothetical types Foo and Bar, and a Java variable declared as Map<Foo, Bar> map, and a method that performs some operation involving the map’s keys, values, or both: what is the optimal method for iterating over the map?  (And newbies be like, “there’s more than one way to iterate over a map?”)

Answer: It depends.  (And yes, Virginia, there are many ways to iterate over a Map.)

If you’re still reading, my guess is that you may have never considered this question.  And that’s not surprising; the typical introductory Java textbook will more than likely show you exactly one way to iterate over a Map, and hey, if it ain’t broke, don’t fix it, right?  Hold that thought.  By the time you finish this article, you will have seen at least three ways to iterate over a Map, and when faced with a Map iteration problem of your own, you should be able to analyze the problem and discern which method is the most efficient for the problem at hand.

O Map, O Map, How Do I Iterate Over Thee?

Let us count the ways!  Here are the basic algorithms for map iteration.

  1. Iterate over the keys.
    for(Foo foo : map.keySet()) { . . . }
  2. Iterate over the values.
    for(Bar bar : map.values()) { . . . }
  3. Iterate over the entries.
    for(Map.Entry<Foo, Bar> entry : map.entrySet()) { . . . }
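The three forms above can be tried out with a small, self-contained sketch. The class name MapIterationForms and the choice of String keys and Integer values (standing in for the hypothetical Foo and Bar) are assumptions made for illustration only. A LinkedHashMap is used so that the iteration order is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: String and Integer stand in for the hypothetical Foo and Bar.
class MapIterationForms {

    // Algorithm 1: iterate over the keys.
    static List<String> keys(Map<String, Integer> map) {
        List<String> out = new ArrayList<>();
        for (String key : map.keySet()) {
            out.add(key);
        }
        return out;
    }

    // Algorithm 2: iterate over the values.
    static List<Integer> values(Map<String, Integer> map) {
        List<Integer> out = new ArrayList<>();
        for (Integer value : map.values()) {
            out.add(value);
        }
        return out;
    }

    // Algorithm 3: iterate over the entries (key and value together).
    static List<String> entries(Map<String, Integer> map) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            out.add(entry.getKey() + "=" + entry.getValue());
        }
        return out;
    }

    // A small sample map; LinkedHashMap keeps insertion order.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(keys(map));    // [one, two, three]
        System.out.println(values(map));  // [1, 2, 3]
        System.out.println(entries(map)); // [one=1, two=2, three=3]
    }
}
```

With a plain HashMap the same three loops work unchanged, but the order in which elements are visited is not guaranteed.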

What’s the Big O?

If you have studied computer science, you have probably heard of Big-O notation.  Simply put, Big O notation is a way to measure and compare the efficiency of algorithms and is often used by programmers to determine the best algorithm to use for solving a particular problem involving large datasets.

Big-O values are generally expressed in terms of the number of operations, on average, that an algorithm takes to run, given the size n of the dataset that it operates over.  For example, looping through a one-dimensional array of size n would require on the order of n operations, so we would say that its Big-O value is O(n).  Likewise, looping through a two-dimensional array having dimensions (m, n) would require on the order of m*n operations, and since Big-O ignores constant factors, when m and n are of comparable size this nested loop gets a score of O(n^2).
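To make the array examples concrete, here is a minimal sketch that simply counts loop steps. The class name OperationCount and its method names are hypothetical, and counting one step per element visited is a simplification of what "operation" means in a real analysis.

```java
// Illustrative only: counts one "operation" per element visited.
class OperationCount {

    // A single pass over a one-dimensional array of size n takes n steps: O(n).
    static long countLinear(int[] data) {
        long steps = 0;
        for (int ignored : data) {
            steps++;
        }
        return steps;
    }

    // A nested pass over an m-by-n grid takes m * n steps: O(m * n),
    // which becomes O(n^2) when m and n are of comparable size.
    static long countNested(int[][] grid) {
        long steps = 0;
        for (int[] row : grid) {
            for (int ignored : row) {
                steps++;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(countLinear(new int[1000]));       // 1000
        System.out.println(countNested(new int[200][300]));   // 60000
        System.out.println(countNested(new int[1000][1000])); // 1000000
    }
}
```

Growing n by a factor of ten multiplies the linear count by ten but the square-grid count by one hundred, which is the difference that the O(n) and O(n^2) labels are meant to capture.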

Let’s examine each of the three iteration methods listed above in terms of their Big-O time efficiencies in performing various tasks.  As a reference, we’ll use the Big-O CheatSheet by Eric Rowell in order to help us calculate the Big-O values.

Task 1: Print all the keys

This one seems like a no-brainer.  You are accessing only the keys, so let’s choose algorithm 1 from above.

public void printKeys(Map<Foo, Bar> map) {
  for(Foo foo : map.keySet()) {
    System.out.println(foo);
  }
}

Since all we are doing is iterating over the map’s keys, the Big-O for this method is O(n).

Task 2: Print all the values

This one also is a no-brainer.  You are accessing only the values, so let’s choose algorithm 2.

public void printValues(Map<Foo, Bar> map) {
  for(Bar bar : map.values()) {
    System.out.println(bar);
  }
}

Again this method completes in O(n) time.

Task 3: Print all the key-value pairs

Here’s where things get a little sticky.  If we use algorithm 1 naively, we get the following:

Naive method:

public void printValuesNaively(Map<Foo, Bar> map) {
  for(Foo foo : map.keySet()) {
    System.out.println(foo + "," + map.get(foo));
  }
}

The downside to this approach is that in addition to iterating through the map at O(n), we are calling the get method within the body of the loop, resulting in an additional n get operations.

Now, for a HashMap, this is really not all that bad in terms of Big-O, since the get operation runs in O(1) on average.  We end up with O(n) for the iteration, plus n * O(1) for the get operations, for a total of O(n) + O(n) = O(2n) = O(n).  Still, it is not as efficient as it could be, as we’ll see.

However, if we have a TreeMap, whose get operation runs in O(log n), we now end up with O(n) for the iteration, plus n * O(log n) for the get operations, for a total of:
O(n) + n*O(log n) = O(n) + O(n log n) = O(n log n).

Better method:

By using algorithm 3, we can avoid the additional n get operations that were performed in the body of the naive method.

public void printValuesMoreEfficiently(Map<Foo, Bar> map) {
  for(Map.Entry<Foo, Bar> entry : map.entrySet()) {
    System.out.println(entry.getKey() + "," + entry.getValue());
  }
}

As a result, we are back to a simple iteration, for O(n), regardless of map type.
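One way to observe the difference between the naive and the entry-based methods on a TreeMap is to count how often the map's comparator is called. The sketch below is an illustration under stated assumptions, not a benchmark: the class name TreeMapLookupCost, the Integer and String types, and the counting comparator are all hypothetical, and the exact counts depend on the TreeMap implementation in your JDK.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: uses key comparisons as a rough proxy for lookup work.
class TreeMapLookupCost {

    // Builds a TreeMap with n keys whose comparator counts every comparison.
    static TreeMap<Integer, String> build(int n, AtomicLong comparisons) {
        Comparator<Integer> counting = (a, b) -> {
            comparisons.incrementAndGet();
            return Integer.compare(a, b);
        };
        TreeMap<Integer, String> map = new TreeMap<>(counting);
        for (int i = 0; i < n; i++) {
            map.put(i, "v" + i);
        }
        comparisons.set(0); // ignore the comparisons made while populating
        return map;
    }

    // Naive method: iterate over keySet() and call get() for each key.
    static long comparisonsUsingKeySet(int n) {
        AtomicLong comparisons = new AtomicLong();
        TreeMap<Integer, String> map = build(n, comparisons);
        for (Integer key : map.keySet()) {
            map.get(key); // each lookup walks the tree from the root
        }
        return comparisons.get();
    }

    // Better method: iterate over entrySet(), so no lookups are needed.
    static long comparisonsUsingEntrySet(int n) {
        AtomicLong comparisons = new AtomicLong();
        TreeMap<Integer, String> map = build(n, comparisons);
        for (Map.Entry<Integer, String> entry : map.entrySet()) {
            entry.getValue(); // the value is already in hand
        }
        return comparisons.get();
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("keySet + get: " + comparisonsUsingKeySet(n) + " comparisons");
        System.out.println("entrySet:     " + comparisonsUsingEntrySet(n) + " comparisons");
    }
}
```

On OpenJDK, the entry-based loop should report no comparisons at all, while the keySet-plus-get loop reports several comparisons per key, growing roughly in proportion to n log n.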

O Map, Where Art Thou?

Now you have seen three basic algorithms for map iteration, and you now know that the choice of iteration algorithm depends primarily on whether you need to access just the keys, just the values, or both.

References

Big-O CheatSheet
The Idiot’s Guide to Big(O) Notation
Java Collections – Performance (Time Complexity) from Information Technology Gems