5 reasons NOT to use DynamoDB

Feb 3, 2019

DynamoDB is a NoSQL database service that Amazon launched in 2012 and has operated ever since. Although it is a fast and flexible database service at any scale, I ran into some drawbacks when I used it as the database for my application.

Instead of telling you to stay away from DynamoDB, I’d prefer to talk about a few problems I had when I used it for my application (era.li).

Costly if not well designed

DynamoDB expects you to think carefully up front about everything that might happen in the future, including business changes and traffic. I know that's hard, because the business might grow tomorrow in directions you cannot control.

DynamoDB's provisioned pricing is built around read capacity units (RCUs) and write capacity units (WCUs). One RCU covers one strongly consistent read per second (or two eventually consistent reads per second) for an item up to 4 KB, and one WCU covers one write per second for an item up to 1 KB (see more at https://aws.amazon.com/dynamodb/pricing/provisioned/). It took me a few hours to understand this pricing model when I started implementing some samples.
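To make the pricing model more concrete, here is a minimal sketch of the capacity arithmetic for provisioned mode. The function names are my own (hypothetical), and the sketch assumes standard, non-transactional reads and writes where every item has the same size.

```python
import math


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate the RCUs needed for a steady read rate (non-transactional)."""
    # Each read is billed in 4 KB steps, with a minimum of one step.
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        total = math.ceil(total / 2)
    return total


def write_capacity_units(item_size_bytes: int, writes_per_second: int) -> int:
    """Estimate the WCUs needed for a steady write rate (non-transactional)."""
    # Each write is billed in 1 KB steps, with a minimum of one step.
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return units_per_write * writes_per_second


if __name__ == "__main__":
    # Example: 6 KB items, 100 reads and 100 writes per second.
    print(read_capacity_units(6 * 1024, 100))
    print(read_capacity_units(6 * 1024, 100, strongly_consistent=False))
    print(write_capacity_units(6 * 1024, 100))
```

The point of the exercise is that item size matters as much as request rate: a slightly larger item can double the read cost, which is why the table design has such a direct effect on the bill.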

With an RDBMS, I could easily design tables with as many columns as I need, or split the main table into smaller tables (database normalization) to optimize the database. That is not good practice in DynamoDB, where a single table is preferred, because the more tables you have, the more you pay. In addition, you pay not only for the table itself but also for its secondary indexes (there are two types: global and local).

Last but not least, you are also charged for outbound data transfer, which encourages you to keep using their other services in order to save cost.

Strong consistency is not available on global secondary indexes

I had trouble with DynamoDB when I had this use case:

Write to the main table
Read on a global secondary index

The problem is that strongly consistent reads are not supported on a global secondary index; only eventually consistent reads are.

Let me explain more about this by giving you an example. I have the following table:

code: a random and unique string
href: a long URL string
hash: a unique string produced by applying a hash function to href

In this table, I had to choose hash as my partition key because I wanted to make sure each href appears only once in the table regardless of the code, which reduces duplication of href. For that reason, I needed a global secondary index (GSI) on code. The problem appeared when I tried to check whether a code already exists before saving to the database, since code must be unique. To my surprise, I received an error message saying that consistent reads on a GSI are not allowed. Suppose two requests come into the system within a very small window (or effectively at the same time) and generate two identical codes: I would then have two rows with the same code in the table. I know this is very unlikely in practice because the code is random, but it might be an issue for a high-traffic system.
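One commonly used way around this, which is not what my table does, is to let DynamoDB enforce uniqueness with a conditional write instead of a read-then-write check. The sketch below only builds the parameters for a low-level PutItem request. It assumes a separate table whose partition key is code; the table name `codes` and the helper name are hypothetical.

```python
from typing import Any, Dict


def build_unique_code_put(code: str, hash_value: str,
                          table_name: str = "codes") -> Dict[str, Any]:
    """Build PutItem parameters that only succeed if `code` is not taken.

    Assumption: `table_name` is a table whose partition key is `code`.
    """
    return {
        "TableName": table_name,
        "Item": {
            "code": {"S": code},
            "hash": {"S": hash_value},
        },
        # The write is rejected if an item with this key already exists.
        "ConditionExpression": "attribute_not_exists(#c)",
        # A placeholder avoids any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#c": "code"},
    }


if __name__ == "__main__":
    params = build_unique_code_put("abc123", "5d41402abc4b2a76")
    print(params)
    # With boto3 (not imported here) the call would look roughly like:
    #   boto3.client("dynamodb").put_item(**params)
    # and a duplicate code would surface as a conditional check failure.
```

Because the condition is evaluated on the table's own key at write time, two concurrent requests with the same code cannot both succeed, and no read on the GSI is needed.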

Sorting is not a great option

I ran into this tradeoff when I tried to sort my results by creation date. Suppose you need one of the following orderings:

ORDER BY created_at DESC
ORDER BY created_at DESC, views DESC

Neither is available in DynamoDB as such; the answer you get is that items are already sorted by the sort key. You can only sort by an attribute if you make it a sort key, and that is not a great option, because not every attribute you want to sort by should have to be a sort key.
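When created_at is the sort key, a Query can return items in descending order by setting ScanIndexForward to false. For a second ordering column such as views, one fallback is to sort on the client after fetching the items. Here is a minimal sketch of that fallback, assuming the items are already loaded as plain dictionaries and that created_at is stored as an ISO 8601 string; the function name is hypothetical.

```python
from typing import Any, Dict, List


def sort_newest_then_most_viewed(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Emulate ORDER BY created_at DESC, views DESC on already-fetched items."""
    # ISO 8601 strings sort chronologically when compared as text.
    return sorted(
        items,
        key=lambda item: (item["created_at"], item["views"]),
        reverse=True,
    )


if __name__ == "__main__":
    items = [
        {"id": 1, "created_at": "2019-02-01", "views": 5},
        {"id": 2, "created_at": "2019-02-03", "views": 1},
        {"id": 3, "created_at": "2019-02-03", "views": 9},
    ]
    for item in sort_newest_then_most_viewed(items):
        print(item)
```

This only works when the whole result set fits in memory, so it does not replace a real sort on the database side.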

Text searching is limited

Leaving aside the power of full-text search in an RDBMS, I only want to mention MySQL's LIKE operator. For example, in MySQL, I can find a movie whose title matches my condition as below:

SELECT * FROM movie WHERE title LIKE '%War%America%'

In DynamoDB, by contrast, I could only check whether a string contains War and America, regardless of their order. The match is also case-sensitive; in other words, War and war are different.
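If the candidate items are already fetched, the ordered match can be emulated on the client. The following is a rough sketch of that idea, not a DynamoDB feature; the function name is hypothetical, and the case-insensitive default mimics MySQL's usual non-binary collations.

```python
from typing import Iterable


def matches_in_order(text: str, terms: Iterable[str],
                     case_sensitive: bool = False) -> bool:
    """Return True if every term occurs in `text`, in the given order.

    Roughly equivalent to LIKE '%term1%term2%...' on a single string.
    """
    haystack = text if case_sensitive else text.lower()
    position = 0
    for term in terms:
        needle = term if case_sensitive else term.lower()
        found = haystack.find(needle, position)
        if found == -1:
            return False
        # Continue searching after the end of this match.
        position = found + len(needle)
    return True


if __name__ == "__main__":
    titles = ["War for America", "Captain America: Civil War"]
    for title in titles:
        print(title, matches_in_order(title, ["War", "America"]))
```

The obvious cost is that filtering happens after the read, so you still pay for every item you fetch and discard.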

Pagination is hard

In MySQL, pagination is straightforward and takes two steps:

Count all rows that match the condition
Select the rows in a given offset range (LIMIT and OFFSET)

In DynamoDB, pagination is much more limited, since you may need to perform many requests to reach the desired result. The documentation describes it as follows:

DynamoDB paginates the results from Query operations. With pagination, the Query results are divided into “pages” of data that are 1 MB in size (or less). An application can process the first page of results, then the second page, and so on.
A single Query will only return a result set that fits within the 1 MB size limit. To determine whether there are more results, and to retrieve them one page at a time, applications should do the following:
1. Examine the low-level Query result:
   + If the result contains a LastEvaluatedKey element, proceed to step 2.
   + If there is not a LastEvaluatedKey in the result, then there are no more items to be retrieved.
2. Construct a new Query request, with the same parameters as the previous one, but this time take the LastEvaluatedKey value from step 1 and use it as the ExclusiveStartKey parameter in the new Query request.
3. Run the new Query request.
4. Go to step 1.
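The steps above can be sketched as a small loop. To keep the example runnable without AWS, `query_fn` stands in for a real Query call, and `make_fake_query` is a hypothetical in-memory substitute that returns responses with the same Items and LastEvaluatedKey fields. Its `position` key is made up for the demo; a real LastEvaluatedKey holds the key attributes of the last item read.

```python
from typing import Any, Callable, Dict, Iterator, List


def iterate_pages(query_fn: Callable[..., Dict[str, Any]],
                  **query_params: Any) -> Iterator[List[Any]]:
    """Yield one page of items at a time until LastEvaluatedKey is absent."""
    start_key = None
    while True:
        params = dict(query_params)
        if start_key is not None:
            # Resume right after the last item of the previous page.
            params["ExclusiveStartKey"] = start_key
        response = query_fn(**params)
        yield response.get("Items", [])
        start_key = response.get("LastEvaluatedKey")
        if start_key is None:
            break


def make_fake_query(rows: List[Any], page_size: int) -> Callable[..., Dict[str, Any]]:
    """Hypothetical stand-in for a Query call, backed by a Python list."""
    def fake_query(ExclusiveStartKey: Any = None, **_ignored: Any) -> Dict[str, Any]:
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["position"]
        response: Dict[str, Any] = {"Items": rows[start:start + page_size]}
        if start + page_size < len(rows):
            response["LastEvaluatedKey"] = {"position": start + page_size}
        return response
    return fake_query


if __name__ == "__main__":
    fake_query = make_fake_query(list(range(5)), page_size=2)
    for page in iterate_pages(fake_query, TableName="links"):
        print(page)
```

Note that the loop can only move forward: there is no way to ask for an arbitrary page without walking through all the pages before it.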

For instance, to get the items on page 2, you need to do the following:

Select all items from page 1
Create a new Query with LastEvaluatedKey from the previous request

Now imagine querying for page N with N > 1000.

One workaround is to cache the LastEvaluatedKey of each page, but I don't think it is ideal for a large system.
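Here is a rough sketch of what caching LastEvaluatedKey could look like. The class name and its in-memory dictionary are hypothetical, and `query_fn` is assumed to behave like a Query call that accepts ExclusiveStartKey and returns Items plus an optional LastEvaluatedKey.

```python
from typing import Any, Callable, Dict, List


class PageCursorCache:
    """Remember the start key of every page seen so far (in memory only)."""

    def __init__(self, query_fn: Callable[..., Dict[str, Any]]) -> None:
        self._query_fn = query_fn
        # Page number -> ExclusiveStartKey; page 1 starts with no key.
        self._start_keys: Dict[int, Any] = {1: None}
        self.requests_made = 0

    def get_page(self, page_number: int, **query_params: Any) -> List[Any]:
        if page_number < 1:
            raise ValueError("page_number must be >= 1")
        # Start from the closest cached page at or before the requested one.
        current = max(p for p in self._start_keys if p <= page_number)
        while True:
            params = dict(query_params)
            start_key = self._start_keys[current]
            if start_key is not None:
                params["ExclusiveStartKey"] = start_key
            response = self._query_fn(**params)
            self.requests_made += 1
            next_key = response.get("LastEvaluatedKey")
            if next_key is not None:
                self._start_keys[current + 1] = next_key
            if current == page_number:
                return response.get("Items", [])
            if next_key is None:
                return []  # The requested page is past the end.
            current += 1
```

The first visit to page N still costs N requests; only later visits are cheap. The cached keys also go stale when items are inserted or deleted, and the cache grows with the number of pages, which is part of why I doubt this scales well.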

Conclusion

With this post, I wanted to introduce some of the drawbacks of DynamoDB that are not discussed much in its documentation or on its features page. Beyond that, I believe DynamoDB is worth trying if your business is predictable and simple.

Last but not least, this post reflects my current understanding of DynamoDB, which might change in the future. I welcome any comments and corrections that help me improve my knowledge.

Originally published at dotronglong.com on February 3, 2019.