MongoDB – Natural sort

One of the most common problems with databases are sorting. Let’s see an simple example: we use MySQL/MariaDB and have table with some data. We want to sort this data. Everything is ok if we use only values in the same type in field, for example numeric (maybe some amount?), strings (titles), dates, keys etc. But what if field is custom, user-typed data and may contain many types? In that case, we have big problem, because standard sorting is not enought, it will not work. Example, value: “11”, “1”, “2”, “3”, “2a” will be sorted in order: 1, 11, 2, 2a, 3, not as we expected i.e. 1, 2, 2a, 3, 11. The solution is to use natural sorting.

Natural sorting is sort in “human friendly” way. Many languages provides functions to sort data using natural sorting, for example in PHP you can use natsort function. Of course you can also use a lot of external libraries, that provide that sorting option. But in databases, in isn’t simple. For example from first paragraph, MySQL or MariaDB, you can create special field for this and use function to make special string only for sorting from your original string. What about MongoDB? In that case, using functions is much harder and we must look for other solution. Fortunately, it’s very simple and great for developers – MongoDB has build-in option to use natural sorting, with many options! Let’s an simple example, how to do this:

db.myCollection.insert([
    { myField: "1" },
    { myField: "11" },
    { myField: "2" },
    { myField: "2a" },
    { myField: "3" },
    { myField: "ABC" }
])

db.myCollection.find()
.sort({myField: 1})
.collation({
    locale: "en_US", 
    numericOrdering: true
})

// Returns: 1, 2, 2a, 3, 11, ABC

And… it’s all! There is one important requirement to do this right: you always should save values in myField as strings, so, if you have some data for database, always use casting. If there will be different documents with different types (ints, floats, strings), it will not work anymore. You can also use collation option with aggregation framework, but must set it after all pipelines from aggregation. Here is an example for that:

db.myCollection.aggregate([
	{ '$match': { /* conditions */ }},
	// here other pipelines 
	{ '$sort' : { myField : 1 } }
	], { collation : { 
		locale: "en_US", 
    	numericOrdering: true 
	}}  	
)

You can also use that in for example Laravel module for MongoDB (model and query builder). It provides find options, it provides aggreate, and you can pass collation settings in second argument as one row in array (options array). Using this option is much better than trying to invite wheel once again – natural sort is very good option. If you have a huge amounts of data, you will use different searching systems.